
HuggingChat should use the original DeepSeek R1 instead of the distilled Qwen 32B version. #1724

Open
@dagoatzmcclusky

Description

Bug description

When I use the distilled 32B Qwen version of DeepSeek R1, it hallucinates significantly, much like gpt-4o, on many mid-tier tasks. When I switched back to the original R1 model on HuggingFace, the model did not hallucinate.

Steps to reproduce

The devs should consider replacing the deepseek-r1-distill-32b-qwen model with the original deepseek-r1 model. The latter has proven more capable at handling complex tasks and does not hallucinate the way the r1-distill-32b-qwen model does.
Please fix this problem as soon as possible; the current deepseek-r1-distilled-32b model is unusable because of this kind of error.

Screenshots

Image

The model hallucinates every time I prompt it to write a Vietnamese essay.

Specs

  • OS: Ubuntu 24.04 LTS
  • Browser: Vivaldi

Notes

Since the current deepseek-r1-distill-32b model is based on the qwq-32b-preview model, which is already available on HuggingChat, why keep the deepseek-r1-distill-32b model at all? It would be best to replace it with the original deepseek-r1.


Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working)
