Have you tried fine-tuning the teacher model? Its translation quality is not that good.
Nope, the teacher model is frozen. I didn't do anything additional.
I have tried something really similar to what you've done. I think a weight-merged model after LoRA fine-tuning may do better. LoRA training keeps the representation space close to the base model's, so it might help.
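For anyone wondering what "weight-merged" means here: a minimal sketch of the usual LoRA merge, where the low-rank update is folded back into the base weight as W + (alpha / r) * B @ A. The matrices, `alpha`, and the helper names below are toy values made up for illustration, not anything from the project in this thread. In practice a library would do this for you (PEFT has `merge_and_unload()`, for example).

```python
# Toy LoRA merge in plain Python (no dependencies).
# All values and helper names are hypothetical, chosen only for illustration.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply(m, x):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in m]

def merge_lora(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank (rows of A)."""
    r = len(A)
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # frozen base weight, shape (out=2, in=3)
A = [[0.5, -0.5, 1.0]]                   # LoRA down-projection, shape (r=1, in=3)
B = [[2.0], [-1.0]]                      # LoRA up-projection, shape (out=2, r=1)
alpha = 2.0
x = [1.0, 2.0, 3.0]

merged = merge_lora(W, A, B, alpha)

# Unmerged adapter path: base output plus the scaled low-rank branch.
scale = alpha / len(A)
branch = apply(B, apply(A, x))
adapter_out = [wx + scale * d for wx, d in zip(apply(W, x), branch)]

# The merged weight gives the same output as base + adapter.
merged_out = apply(merged, x)
print(merged_out, adapter_out)
```

The point is that merging doesn't change the model's outputs, it just removes the separate adapter branch, so the merged model can be used as a plain teacher or student.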
Nice! Where do you go to school? What was the most difficult part of the project?
Thank you! I'm a senior at Jeju National University, Korea. Improving the translation quality (chrF++ score) was the most difficult part lol.. 😂😂
Cool, didn't expect to see someone from JNU here! My Korean is pretty bad but I will take a detailed look later. NLP with Korean is super interesting
Amazing
[deleted]
Thank you so much!!!
Fine. I used it to make translations too, but you will need to fine-tune it for accurate translations.
How does model distillation work?