[Python] Unicode partitioning has poor time complexity #165

lucasmcdonald3 · 2025-01-10T17:26:10Z

Unicode partitioning code: https://github.com/dafny-lang/libraries/blob/master/src/Unicode/UnicodeEncodingForm.dfy#L123-L128

The Dafny Python runtime implements "get subsequence" in O(k) time, where k is the length of the subsequence. (code)
The Unicode partitioning code relies on "get subsequence" being O(1) to perform reasonably in a given runtime.
As a result, the linked code runs very slowly on the Python runtime.
This makes it difficult to work with large JSON files using the Dafny JSON tools.
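
To illustrate the cost, here is a minimal Python sketch. It is not the Dafny runtime or the library code. It assumes the partitioning loop repeatedly takes a well-formed prefix and then continues on the remainder of the sequence, and it assumes valid UTF-8 input. The names `prefix_len`, `partition_by_slicing`, and `partition_by_index` are hypothetical. If every "rest of the sequence" step copies the remainder, the total work grows quadratically with input length, whereas tracking an offset keeps it linear.

```python
def prefix_len(lead: int) -> int:
    """Length of the UTF-8 sequence that starts with `lead` (assumes valid input)."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError(f"not a UTF-8 lead byte: {lead:#x}")


def partition_by_slicing(s: list[int]) -> list[list[int]]:
    # Hypothetical stand-in for a loop built on "get subsequence":
    # rest[k:] copies the whole remainder on every iteration,
    # so the total work is O(n^2) for an input of n code units.
    parts = []
    rest = s
    while rest:
        k = prefix_len(rest[0])
        parts.append(rest[:k])
        rest = rest[k:]
    return parts


def partition_by_index(s: list[int]) -> list[list[int]]:
    # Same result, but only the small prefix is copied each time.
    # The remainder is never copied, so the total work is O(n).
    parts = []
    i = 0
    while i < len(s):
        k = prefix_len(s[i])
        parts.append(s[i:i + k])
        i += k
    return parts


data = list("héllo €".encode("utf-8"))
assert partition_by_slicing(data) == partition_by_index(data)
```

Both functions return the same partition. Only the slicing version pays for copying the remainder at every step, which is the pattern that becomes slow on large inputs.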

lucasmcdonald3 · 2025-01-10T17:37:41Z

This would be addressed by dafny-lang/dafny#2313

lucasmcdonald3 closed this as completed Jan 10, 2025
