Training Qwen2.5-VL with the latest LLaMA-Factory is very slow

### Reminder

- [x] I have read the above rules and searched the existing issues.

### System Info

llama-factory: 0.9.2.dev0
transformers: 4.49.0
python: 3.9

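With the latest LLaMA-Factory, full fine-tuning of Qwen2-VL (2B) runs at a normal speed. After switching to Qwen2.5-VL (3B), training becomes abnormally slow with both full fine-tuning and LoRA, roughly 5-6x slower than Qwen2-VL (2B).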



### Reproduction

```text
Put your message here.
```


### Others

_No response_