[Question] Why are trainable parameters forcibly cast to full precision? #4549

Closed
@LaniakeaS

Description

I was attempting full-parameter fine-tuning and ran out of GPU memory. After investigating, I found that LLaMA-Factory forcibly casts the trainable parameters to fp32. Since I am using DeepSpeed, I cannot use pure bf16 parameters.

What makes this step necessary? Could bf16 and fp16 also be supported when DeepSpeed is in use?
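For context on why the fp32 upcast matters for memory, here is a rough back-of-the-envelope calculation (illustrative only, assuming a hypothetical 7B-parameter model and plain Adam; gradients, activations, and DeepSpeed ZeRO sharding are ignored):

```python
def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Memory for one copy of the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

NUM_PARAMS = 7_000_000_000  # assumed 7B-parameter model

fp32_weights = weight_memory_gb(NUM_PARAMS, 4)  # 28.0 GB in full precision
bf16_weights = weight_memory_gb(NUM_PARAMS, 2)  # 14.0 GB in bf16

# Adam keeps two fp32 moment buffers per parameter, so with fp32 master
# weights the optimizer state alone is another 2 x 28 GB.
adam_state = 2 * fp32_weights  # 56.0 GB

print(fp32_weights, bf16_weights, adam_state)
```

Casting trainable weights from bf16 to fp32 therefore doubles the weight footprint before optimizer states are even counted, which matches the out-of-memory behavior described above.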


Labels

solved (This problem has been already solved)