DRAM Requirement for DeepSeek-R1 Q4_K_M (OOM on DRAM) #300
-
Hello, development team & community. I'm trying to load the DeepSeek-R1 Q4_K_M GGUFs using ktransformers v0.2 (0.2.0+cu121torch24fancy). Here is the setup of my server:

And here is the startup command, which leads to an OOM of DRAM:

python ./ktransformers/local_chat.py --model_path deepseek-ai/Deepseek-R1 --gguf_path /LLM/models/unsloth/Deepseek-R1-Q4_K_M/ --cpu_infer 30 --max_new_tokens 1000 --optimize_rule_path DeepSeek-V3-Chat.yaml

While loading the GGUFs, memory usage climbed continuously past 500 GB until the process was killed, while GPU memory usage grew much more slowly. Is this due to insufficient system DRAM? What is the approximate memory requirement for Q4_K_M under such a setup? Would using four GPUs reduce the DRAM demand? Thank you for any help.
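For a rough sense of the DRAM floor, the quantized weights alone can be estimated from parameter count times effective bits per weight. A minimal sketch, assuming ~671e9 total parameters for DeepSeek-R1 and an effective ~4.85 bits/weight for Q4_K_M (both numbers are assumptions, not from this thread; runtime adds loader and KV-cache overhead on top):

```python
# Back-of-envelope DRAM estimate for a Q4_K_M GGUF.
# Assumptions: ~671e9 total params (DeepSeek-R1), ~4.85 bits/weight
# effective for Q4_K_M. Actual usage is higher due to loader,
# activations, and KV-cache overhead.

def gguf_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate in-memory size of the quantized weights in GB."""
    return n_params * bits_per_weight / 8 / 1e9

estimate = gguf_weight_gb(671e9, 4.85)
print(f"~{estimate:.0f} GB for the weights alone")  # ~407 GB
```

Under these assumptions the weights alone already approach ~400 GB, so peaking past 500 GB during loading is plausible if tensors are staged in DRAM before being placed.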
Replies: 1 comment
-
same question: #375