rahimnathwani on Feb 1, 2025 | on: How to Run DeepSeek R1 671B Locally on a $2000 EPY...

Wow! Only $2k with no quantization.

  hit between 4.25 to 3.5 TPS (tokens per second) on the Q4 671b full model

kristianp on Feb 2, 2025

I think it is quantised; what they actually said was no distillation.

rahimnathwani on Feb 3, 2025

I think you're right. The instructions say

  ollama pull deepseek-r1:671b

This will pull down 400GB: https://ollama.com/library/deepseek-r1:671b

But the Hugging Face repo has 163 files of ~4.3GB each, so around 700GB: https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main
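
As a rough sanity check on those sizes, here is a back-of-the-envelope sketch. It assumes the "671b" tag means about 671 billion parameters, that the sizes above are decimal gigabytes, and that the whole download is weights. All three are approximations, and the names are just for illustration.

```python
# Figures quoted in the comment above, treated as approximate decimal gigabytes.
PARAMS = 671e9       # assumed parameter count, taken from the "671b" tag
OLLAMA_GB = 400      # approximate download size of deepseek-r1:671b
HF_FILES = 163       # number of weight files in the Hugging Face repo
HF_FILE_GB = 4.3     # approximate size of each file


def bits_per_param(size_gb: float, params: float = PARAMS) -> float:
    """Average storage per parameter, in bits, for a download of size_gb."""
    return size_gb * 1e9 * 8 / params


hf_total_gb = HF_FILES * HF_FILE_GB

print(f"Hugging Face total: ~{hf_total_gb:.0f} GB")
print(f"Ollama:       ~{bits_per_param(OLLAMA_GB):.1f} bits/param")
print(f"Hugging Face: ~{bits_per_param(hf_total_gb):.1f} bits/param")
```

Under those assumptions the Ollama download works out to a bit under 5 bits per parameter, against a bit over 8 for the Hugging Face files. That is consistent with the Ollama tag being a roughly 4-bit quantisation, as the "Q4" in the quote suggests.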
