Hello! Our team has been working on reproducing SkyThought, which we find groundbreaking and insightful. We successfully replicated the results with the Qwen2.5-32B model using the LLaMA-Factory codebase and observed significant performance improvements.
However, we ran into difficulties when applying the same training to Llama-3.3-70B. Contrary to our expectations, we did not observe a notable performance boost, and performance on the MATH500 benchmark even declined slightly.
We would greatly appreciate any guidance or insights. Thank you!
Have you tested the eval scripts? Related issue: "I can't reproduce Qwen/QwQ-32B-Preview accuracy on AIME with eval scripts". Can you reproduce Qwen/QwQ-32B-Preview accuracy on AIME with the eval scripts?
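For a quick sanity check on the eval side, a minimal sketch along these lines may help. This is not the SkyThought eval script itself: the vLLM calls are standard, but the dataset id (`Maxwell-Jia/AIME_2024`), its column names, the prompt template, and the boxed-answer extraction below are assumptions you would need to adapt to your setup.

```python
# Minimal sanity check of Qwen/QwQ-32B-Preview on AIME-style problems.
# Assumptions (not from the SkyThought repo): dataset id and column names
# ("Problem", "Answer") are placeholders; adjust tensor_parallel_size to
# your GPU count and the prompt/answer convention to match the real eval.
import re

from datasets import load_dataset
from vllm import LLM, SamplingParams

MODEL = "Qwen/QwQ-32B-Preview"


def extract_boxed(text: str) -> str | None:
    """Return the last \\boxed{...} content (no nested braces), a common math-answer convention."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None


def main() -> None:
    data = load_dataset("Maxwell-Jia/AIME_2024", split="train")  # placeholder dataset id
    llm = LLM(model=MODEL, tensor_parallel_size=8)
    params = SamplingParams(temperature=0.0, max_tokens=16384)

    prompts = [
        f"{row['Problem']}\nPlease reason step by step, and put your final answer within \\boxed{{}}."
        for row in data
    ]
    outputs = llm.generate(prompts, params)

    correct = 0
    for row, out in zip(data, outputs):
        pred = extract_boxed(out.outputs[0].text)
        if pred is not None and pred == str(row["Answer"]).strip():
            correct += 1
    print(f"AIME accuracy: {correct}/{len(data)} = {correct / len(data):.2%}")


if __name__ == "__main__":
    main()
```

If even a rough pass like this diverges noticeably from the reported numbers, the generation settings (prompt template, temperature, max tokens) are usually the first things to check before suspecting the fine-tuned checkpoint.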