Framework Desktop ML (cogito) Hardware Specifications

Model: AMD Ryzen AI MAX+ 395 (Strix Halo)
CPU: 16 cores Zen 5
GPU: Radeon 8060S (40 RDNA 3.5 CUs)
Total RAM: 128GB unified memory
Max VRAM: 96GB (via AMD Variable Graphics Memory)
NPU: XDNA 2, 50+ peak AI TOPS
GPU Arch: gfx1151
Peak Perf: 59.4 FP16/BF16 TFLOPS @ 2.9GHz
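
The peak figure above can be reproduced from the CU count and clock with a short calculation. This is a minimal sketch, not a vendor formula: the value of 512 FP16/BF16 FLOPs per clock per CU is an assumption that is not stated in this file, and the function name `peak_tflops` is hypothetical.

```python
# Assumption (not stated in the spec sheet): each RDNA 3.5 CU retires
# 512 FP16/BF16 FLOPs per clock on its matrix path.
FLOPS_PER_CLOCK_PER_CU = 512


def peak_tflops(compute_units: int, clock_ghz: float,
                flops_per_clock_per_cu: int = FLOPS_PER_CLOCK_PER_CU) -> float:
    """Return theoretical peak throughput in TFLOPS."""
    flops_per_sec = compute_units * flops_per_clock_per_cu * clock_ghz * 1e9
    return flops_per_sec / 1e12


# 40 CUs at 2.9 GHz, as listed for the Radeon 8060S above.
peak = peak_tflops(40, 2.9)
print(f"{peak:.1f} FP16/BF16 TFLOPS")
```

This is a theoretical ceiling; sustained throughput in real workloads will be lower.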

Inference Performance (AMD testing with LM Studio 0.3.11 / llama.cpp 1.18):
- Small models (1-3B): ~100+ tokens/sec
- Medium models (7-8B): ~60-80 tokens/sec
- Large models (20B): ~58 tokens/sec
- Very large models (120B): ~38 tokens/sec
- Context support: Up to 256K tokens with Flash Attention
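
As a rough guide, the rates above translate into wall-clock generation time as follows. This is a sketch under stated assumptions: it uses the lower bound of each listed range, covers token generation only (prompt processing time is ignored), and the names `DECODE_TOKENS_PER_SEC` and `seconds_to_generate` are hypothetical.

```python
# Generation rates taken from the list above; where a range or "+" is
# given, the lower bound is used (an assumption made for a conservative estimate).
DECODE_TOKENS_PER_SEC = {
    "small (1-3B)": 100.0,
    "medium (7-8B)": 60.0,
    "large (20B)": 58.0,
    "very large (120B)": 38.0,
}


def seconds_to_generate(num_tokens: int, tokens_per_sec: float) -> float:
    """Estimate the time to generate num_tokens at a fixed rate."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


# Estimated time for a 1000-token response in each model class.
for name, rate in DECODE_TOKENS_PER_SEC.items():
    print(f"{name}: {seconds_to_generate(1000, rate):.1f} s")
```

Treat the output as an estimate only, since measured rates depend on the model, quantization, and context length.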