Avoid stack overflows by computing tree depth and giving up before attempting OCaml translation #73
Labels: bug, enhancement, priority:medium (to do, not blocking users)
Long sequences of items, such as long array literals or long sequences of statements, are typically found only in generated files. Such files may not be interesting to analyze (e.g. in the case of semgrep), so it would be fine to give up on them. The problem is that at least the translation to the OCaml tree uses stack space that's proportional to the length of the tree (including lists). This results in segfaults on some platforms, while on other platforms it raises a `Stack_overflow` exception.

To avoid such a crash, a solution may be to calculate the depth of the tree returned by the tree-sitter parser and return an error if it exceeds some limit. This assumes the tree-sitter parser itself doesn't crash due to insufficient stack space.

Here's an example of a large generated C++ file whose parsing results in stack overflows: https://github.com/juce-framework/JUCE/blob/d054f0d14dcac387aebda44ce5d792b5e7a625b3/extras/Projucer/JuceLibraryCode/BinaryData.cpp
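The depth check proposed above could be sketched roughly as follows. This is a minimal sketch under stated assumptions: the `node` type is a hypothetical, simplified stand-in for the tree returned by the tree-sitter parser, and `check_tree`, `max_depth` and `max_children` are illustrative names, not part of any existing API. The traversal keeps its pending nodes in a heap-allocated work list rather than on the call stack, so the check itself should not overflow on the inputs it is meant to reject. It also bounds the number of children per node, since long lists are part of the problem.

```ocaml
(* Hypothetical, simplified stand-in for a tree-sitter node. *)
type node = { kind : string; children : node list }

(* Reasons for giving up; the payload is the offending depth or child count. *)
type error = Too_deep of int | Too_wide of int

(* Walk the tree with an explicit work list of (node, depth) pairs.
   [loop], [List.length] and [List.fold_left] are all tail-recursive,
   so stack usage stays constant regardless of the tree's shape. *)
let check_tree ~max_depth ~max_children root =
  let rec loop = function
    | [] -> Ok ()
    | (node, depth) :: rest ->
        if depth > max_depth then Error (Too_deep depth)
        else
          let n = List.length node.children in
          if n > max_children then Error (Too_wide n)
          else
            (* Push the children onto the work list, one level deeper. *)
            let rest =
              List.fold_left
                (fun acc child -> (child, depth + 1) :: acc)
                rest node.children
            in
            loop rest
  in
  loop [ (root, 1) ]
```

A caller could run such a check on the parsed tree and skip the OCaml translation whenever it returns `Error _`. Suitable limits are not specified here and would have to be determined experimentally.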
Tasks:
`tree-sitter parse`).
Ideas to avoid complete failure:
`repeat()` and `repeat1()` constructs).