Avoid stack overflows by computing tree depth and giving up before attempting OCaml translation #73

mjambon · 2024-01-19T20:48:42Z

Long sequences of items, such as long array literals or long sequences of statements, are typically found only in generated files. Such files are often not interesting to analyze (e.g. for Semgrep), so giving up on them would be acceptable. The problem is that at least the translation to the OCaml tree uses stack space proportional to the depth of the tree (including the length of lists). On some platforms this results in a segfault, while on others it raises a Stack_overflow exception. To avoid such crashes, one solution would be to calculate the depth of the tree returned by the tree-sitter parser and return an error if it exceeds some limit. This assumes the tree-sitter parser itself doesn't crash due to insufficient stack space.

Here's an example of a large generated C++ file whose parsing results in stack overflows: https://github.com/juce-framework/JUCE/blob/d054f0d14dcac387aebda44ce5d792b5e7a625b3/extras/Projucer/JuceLibraryCode/BinaryData.cpp

Tasks:

Check whether we can parse the input file above with just tree-sitter-cpp (e.g. with tree-sitter parse).
If so, add a pass to calculate/estimate the depth of the tree returned by tree-sitter before its translation to OCaml.
Add an option to fail if the tree depth exceeds a limit.
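
To make the depth-check tasks more concrete, here is a minimal sketch in Python rather than OCaml, purely for illustration. The `Node` class, `TreeTooDeep`, `tree_depth`, and `make_chain` are hypothetical names and not part of tree-sitter or of this project. The point it tries to show is that the depth pass itself should use an explicit stack, so that the check cannot overflow the call stack on the very inputs it is meant to reject.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parse tree node: a type name and children."""
    type: str
    children: List["Node"] = field(default_factory=list)


class TreeTooDeep(Exception):
    """Raised when the tree exceeds the configured depth limit."""


def tree_depth(root: Node, max_depth: Optional[int] = None) -> int:
    """Return the depth of the tree (a lone root has depth 1).

    Uses an explicit stack instead of recursion, so the check does not
    consume call-stack space proportional to the depth of the tree.
    If max_depth is given, raise TreeTooDeep as soon as a node deeper
    than the limit is encountered.
    """
    deepest = 0
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if max_depth is not None and depth > max_depth:
            raise TreeTooDeep(f"tree depth exceeds limit of {max_depth}")
        deepest = max(deepest, depth)
        for child in node.children:
            stack.append((child, depth + 1))
    return deepest


def make_chain(length: int) -> Node:
    """Build a degenerate tree that is `length` nodes deep, without recursion."""
    root = Node("leaf")
    for _ in range(length - 1):
        root = Node("wrapper", [root])
    return root
```

Calling `tree_depth(make_chain(10000), max_depth=500)` raises `TreeTooDeep` without recursing. Note that this sketch counts nesting depth only. If the translation also uses stack space proportional to the length of sibling lists, a real estimate would probably need to take list lengths into account as well.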

Ideas to avoid complete failure:

Truncate excessively deep trees/lists without aborting when possible (e.g. lists produced by tree-sitter's repeat() and repeat1() constructs).
Increase the system's stack size limit or ask the user to do so in a last-gasp error message.
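
The truncation idea could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the `Node` class, the `truncate_long_lists` function, and the `truncatable_types` parameter are hypothetical, and the set of node types whose children come from repeat() or repeat1() would have to be derived from the grammar in some way that is not shown here.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """Hypothetical stand-in for a parse tree node: a type name and children."""
    type: str
    children: List["Node"] = field(default_factory=list)


def truncate_long_lists(
    root: Node, truncatable_types: Set[str], max_children: int
) -> int:
    """Shorten overly long child lists in place.

    Only nodes whose type is in truncatable_types (assumed here to be the
    nodes whose children come from a repeat-like construct) are touched.
    Returns the number of nodes that were shortened, so the caller can
    report that the result is partial. The traversal uses an explicit
    stack rather than recursion.
    """
    truncated = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node.type in truncatable_types and len(node.children) > max_children:
            # Keep the first max_children children and drop the rest.
            del node.children[max_children:]
            truncated += 1
        stack.extend(node.children)
    return truncated
```

A caller could run this before the translation step and emit a warning when the returned count is nonzero, instead of failing on the whole file.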


mjambon added the bug, enhancement, and priority:medium labels Jan 19, 2024

mjambon changed the title ~~Avoid stack overflows by computing tree depth and giving up before OCaml translation~~ Avoid stack overflows by computing tree depth and giving up before attempting OCaml translation Jan 19, 2024
