
Consider reconstructing highly readable decompiled code from project-level context? #42

Open
clipsheep6 opened this issue Feb 10, 2025 · 1 comment

@clipsheep6

The current decompilation works at the file level, and its readability is not great. Would it be possible to use project-level context to reconstruct more readable decompiled code? The latest models might also do a better job at this kind of reconstruction.

@albertan017
Owner

albertan017 commented Feb 10, 2025

Current LLMs do not really have project-level code understanding (an LLM can translate a single passage easily, but translating an entire chapter clearly runs into forgetting problems), and the training and inference costs are extremely high (without optimizations, attention cost grows quadratically with input length), so training for project-level reconstruction is too expensive and too difficult.

We lean toward reconstructing functions individually and then integrating them: make full use of each function's own information to reconstruct it, then feed the reconstructed functions together into a stronger model (GPT-o1, Deepseek-R1) to refine them. llm4decompile focuses on getting single functions right, while GPT-class models are better at integrating the pieces at a higher level.
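
A minimal sketch of this two-stage idea, assuming an llm4decompile checkpoint served through HuggingFace `transformers`; the checkpoint name, prompt strings, and the `refine_project` helper are illustrative assumptions, not the project's official pipeline:

```python
# Stage 1: reconstruct each function on its own with llm4decompile.
# Stage 2: hand all reconstructed functions to a stronger reasoning model to
#          unify names, types, and call relationships across the project.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "LLM4Binary/llm4decompile-6.7b-v1.5"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def decompile_function(asm: str) -> str:
    """Reconstruct a single function from its disassembly/pseudocode."""
    inputs = tokenizer(asm, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512)
    # Strip the prompt tokens and keep only the generated source code.
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

def refine_project(functions: list[str]) -> str:
    """Hypothetical stage-2 step: build a prompt that asks a stronger model
    (an o1/R1-class API) to integrate the per-function reconstructions.
    The actual API call is omitted here."""
    prompt = "Integrate these independently decompiled functions into one coherent codebase:\n\n"
    prompt += "\n\n".join(functions)
    return prompt

# Usage: decompile each function separately, then refine them together.
per_function = [decompile_function(asm) for asm in ["<asm of func 1>", "<asm of func 2>"]]
refine_prompt = refine_project(per_function)
```
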
