Framework Desktop ML (cogito)

Hardware Specifications

Model: AMD Ryzen AI MAX+ 395 (Strix Halo)
CPU: 16 cores Zen 5
GPU: Radeon 8060S (40 RDNA 3.5 CUs)
Total RAM: 128GB unified memory
Max VRAM: 96GB (via AMD Variable Graphics Memory)
NPU: XDNA 2, 50+ peak AI TOPS
GPU Arch: gfx1151
Peak Perf: 59.4 FP16/BF16 TFLOPS @ 2.9GHz

Inference Performance (AMD testing with LM Studio 0.3.11 / llama.cpp 1.18):
- Small models (1-3B): ~100+ tokens/sec
- Medium models (7-8B): ~60-80 tokens/sec
- Large models (20B): ~58 tokens/sec
- Very large models (120B): ~38 tokens/sec
- Context support: Up to 256K tokens with Flash Attention
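
Sanity check on the Peak Perf figure: the 59.4 TFLOPS number is consistent with 40 CUs at 2.9GHz if each CU is assumed to issue 512 FP16/BF16 FLOPs per clock. That per-CU rate is an assumption not stated in these notes (it is the commonly cited RDNA 3-class matrix rate), and the function name below is made up for illustration. A minimal sketch of the arithmetic:

```python
# Rough peak-throughput estimate: CUs x FLOPs per CU per clock x clock rate.
# ASSUMPTION: 512 FP16/BF16 FLOPs per CU per clock. This value is not in
# the spec list above; it is only used here as a consistency check.

def peak_tflops(compute_units: int, flops_per_cu_per_clock: int, clock_ghz: float) -> float:
    """Return theoretical peak throughput in TFLOPS."""
    gflops = compute_units * flops_per_cu_per_clock * clock_ghz
    return gflops / 1000.0

if __name__ == "__main__":
    # 40 CUs and 2.9GHz are taken from the spec list above.
    print(f"{peak_tflops(40, 512, 2.9):.1f} TFLOPS")
```

This is a theoretical ceiling only. The tokens/sec figures listed above are measured results and are not derived from this number.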