Scaling laws data #42

borgr · 2024-02-29T21:37:32Z

I am researching scaling laws across models and architectures, among other things, and was wondering if you could share the logs, training losses, and validation evals of the models you ran for the scaling law experiments in DeepSeek LLM. Any other similar losses or results would also be interesting. They don't need to be well curated; anything would be helpful.
Thanks

borgr · 2024-02-29T21:50:33Z

Also, are the model losses from the figure available somewhere?
