details of splitting exebench for train and evaluation #28

wang-yongpan · 2024-09-29T07:04:45Z

Hello, I'm impressed by the decompile model you released.

I'd like to know the details of how you split ExeBench for training and evaluation, because I want to reproduce your evaluation results.

Thank you.

albertan017 · 2024-09-29T07:45:19Z

We use all the training samples:

train_synth_compilable
train_real_compilable
train_synth_simple_io
train_real_simple_io
train_synth_rich_io

And test on its test set:

test_synth

Good luck with your project!
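
The split selection described above could be sketched as follows. This is a minimal sketch, not the authors' training code: it assumes the Hugging Face `datasets` package and the dataset id `jordiae/exebench`, neither of which is stated in this thread. The `loader` parameter is a hypothetical hook so the helper can be exercised without downloading anything.

```python
# Split names exactly as listed in the reply above.
TRAIN_SPLITS = [
    "train_synth_compilable",
    "train_real_compilable",
    "train_synth_simple_io",
    "train_real_simple_io",
    "train_synth_rich_io",
]
TEST_SPLIT = "test_synth"


def load_splits(names, loader=None):
    """Return {split_name: dataset} for each requested split."""
    if loader is None:
        # Assumption: the dataset lives on the Hugging Face Hub under
        # "jordiae/exebench". Imported lazily so this module loads without it.
        from datasets import load_dataset

        def loader(name):
            return load_dataset("jordiae/exebench", split=name)

    return {name: loader(name) for name in names}
```

Calling `load_splits(TRAIN_SPLITS)` and `load_splits([TEST_SPLIT])` would then fetch the training and evaluation data separately; `train_not_compilable` is deliberately left out because it is not in the list above.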

wang-yongpan · 2024-09-29T07:46:50Z

OK, I get it. Thanks for your reply.

wang-yongpan · 2024-10-08T08:38:35Z

Hi,

I found that ExeBench provides assembly for three optimization options (O0, O3, Os). How can I evaluate your tool on the different optimization levels (O0, O1, O2, O3), as in your experiments?

albertan017 · 2024-10-08T11:22:37Z

Unfortunately, you will have to compile the dataset on your own as we do not have the authorization to distribute another's dataset. For more details on the issues we faced, please refer to Appendix A in our paper.
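
Compiling one C file at each optimization level could look roughly like the sketch below. It is only an illustration under stated assumptions, not the pipeline from the paper: it assumes `gcc` is on the PATH, uses the standard `-S` and `-O0` to `-O3` options, and the output naming scheme (`<stem>_<level>.s`) is hypothetical.

```python
import subprocess
from pathlib import Path

OPT_LEVELS = ["O0", "O1", "O2", "O3"]


def build_commands(src, out_dir):
    """Return {level: gcc argv} that emits assembly for src at each level."""
    src, out_dir = Path(src), Path(out_dir)
    return {
        opt: ["gcc", "-S", f"-{opt}", str(src), "-o",
              str(out_dir / f"{src.stem}_{opt}.s")]
        for opt in OPT_LEVELS
    }


def compile_all(src, out_dir):
    """Run every command; return {level: True if gcc exited with status 0}."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    results = {}
    for opt, cmd in build_commands(src, out_dir).items():
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[opt] = proc.returncode == 0
    return results
```

Keeping command construction separate from execution makes it easy to log or inspect the exact compiler invocation for rows that fail to build.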

wang-yongpan · 2024-10-09T01:56:12Z

OK, how can I obtain the source code of ExeBench? I cannot find it on Hugging Face.

albertan017 · 2024-10-09T03:11:50Z

you can find it here

wang-yongpan · 2024-10-09T03:40:27Z

I cannot find the source code at this link. I only found the files below:

train_not_compilable: 2.357M
train_synth_compilable: 2.308373M
train_real_compilable: 0.675074M
train_synth_simple_io: 0.550116M
train_real_simple_io: 0.043769M
train_synth_rich_io: 0.097250M
valid_synth: 5k
valid_real: 2.133k
test_synth: 5k
test_real: 2.134k

Do you mean the source code is contained in these?

albertan017 · 2024-10-09T04:11:21Z

that's all they provided...
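
In other words, the C source has to be recovered from the fields of each row. A minimal sketch of that step, assuming each row is a dict-like record with the `synth_deps` and `func_def` fields that appear in the `examples/basic.py` snippet later in this thread; the helper name `row_to_c_source` is hypothetical, and there is no guarantee that the result compiles for every row.

```python
def row_to_c_source(row):
    """Join a row's dependency declarations and its function definition.

    Assumes 'func_def' holds the C function and 'synth_deps' (possibly
    missing or empty) holds the declarations it depends on.
    """
    parts = [row.get("synth_deps"), row["func_def"]]
    # Drop empty pieces and trim stray newlines before joining.
    cleaned = [p.strip("\n") for p in parts if p]
    return "\n\n".join(cleaned) + "\n"
```

The returned string can be written to a `.c` file and passed to a compiler.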

wang-yongpan · 2024-10-10T09:23:35Z

OK. I noticed that your paper reports the re-executability rate on the ExeBench dataset. May I ask how you measured it?

albertan017 · 2024-10-11T03:58:01Z

In examples/basic.py, you can see:

synth_wrapper = Wrapper(
    c_deps=row['synth_deps'] + '\n' + row['synth_io_pairs']['dummy_funcs'][0] + '\n',
    func_c_signature=row['func_head_types'].replace('extern', ''),
    func_assembly=row['asm']['code'][0],
    cpp_wrapper=row['synth_exe_wrapper'])

It requires func_assembly.
So we remove func_assembly (pass None) and append func_def to c_deps:

synth_wrapper = Wrapper(
    c_deps=row['synth_deps'] + '\n' + row['synth_io_pairs']['dummy_funcs'][0] + '\n' + row['func_def'],
    func_c_signature=row['func_head_types'].replace('extern', ''),
    func_assembly=None,
    cpp_wrapper=row['synth_exe_wrapper'])

We made some additional changes to the code for our specific needs, but that's essentially how you can modify it. However, we were only able to compile half of the code with these modifications.
If you have any better solutions, we would really appreciate your insights!
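
The modification above, together with a rate computation, could be sketched as follows. This is a hedged illustration rather than the code used for the paper: `wrapper_kwargs` only rebuilds the keyword arguments shown above, the option to substitute a decompiled function for `row['func_def']` is an assumption, and `run_case` is a hypothetical callable that builds the `Wrapper`, runs it, and reports whether the observed output matched the expected one.

```python
def wrapper_kwargs(row, func_source=None):
    """Build keyword arguments for Wrapper, following the snippet above.

    func_source defaults to the reference row['func_def']; passing a
    decompiled function instead is an assumption, not stated in the thread.
    """
    if func_source is None:
        func_source = row["func_def"]
    c_deps = (
        row["synth_deps"] + "\n"
        + row["synth_io_pairs"]["dummy_funcs"][0] + "\n"
        + func_source
    )
    return {
        "c_deps": c_deps,
        "func_c_signature": row["func_head_types"].replace("extern", ""),
        "func_assembly": None,  # source is supplied through c_deps instead
        "cpp_wrapper": row["synth_exe_wrapper"],
    }


def reexecutability_rate(rows, run_case):
    """Fraction of rows for which run_case(row) returns a truthy value.

    Any exception (for example a compile failure) counts as a failed case.
    """
    total = passed = 0
    for row in rows:
        total += 1
        try:
            passed += bool(run_case(row))
        except Exception:
            pass  # treat build or runtime errors as not re-executable
    return passed / total if total else 0.0
```

Counting exceptions as failures keeps rows that do not compile in the denominator, which matters when only part of the dataset builds.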

albertan017 mentioned this issue Dec 6, 2024

How to Obtain O0-O3 Assembly Code from ExeBench Dataset? #37

Open
