Bug description
When I use the distilled 32B Qwen version of DeepSeek, it hallucinates significantly, much like gpt-4o, on many mid-tier tasks. When I switch back to the original R1 model on HuggingFace, the model does not hallucinate.
Steps to reproduce
Maybe the devs should replace the deepseek-r1-distill-32b-qwen model with the original deepseek-r1 model. The latter has proven more capable at handling complex tasks and does not hallucinate the way the r1-distill-32b-qwen model does.
Please fix this as soon as possible; the current deepseek-r1-distill-32b model is unusable with this kind of error.
Screenshots
The model hallucinates every time I prompt it to write a Vietnamese essay.
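For reference, a minimal sketch of sending an equivalent prompt outside the HuggingChat UI (assuming the Hub model id deepseek-ai/DeepSeek-R1-Distill-Qwen-32B and the huggingface_hub InferenceClient; the exact prompt wording is only an example):

```python
# Minimal repro sketch, assuming the Hub model id below and the
# huggingface_hub InferenceClient; the prompt is just an example.
from huggingface_hub import InferenceClient

client = InferenceClient(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B")

response = client.chat_completion(
    messages=[
        {"role": "user", "content": "Write a short essay in Vietnamese about Hanoi."}
    ],
    max_tokens=1024,
)

# Per the report above, the distilled model's answer drifts into fabricated
# details here, while the same prompt against the original deepseek-r1 does not.
print(response.choices[0].message.content)
```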
Specs
- OS: Ubuntu 24.04 LTS
- Browser: Vivaldi
Notes
Since the current deepseek-r1-distill-32b model is based on the qwq-32b-preview model, which is already available on HuggingChat, why keep the deepseek-r1-distill-32b model at all? It would be best to replace it with the original deepseek-r1.