Text encode decode #1645
I'd like to benchmark this change
One should create the encoder and the decoder only once and reuse it, and use a fixed buffer for
Done
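For readers following the suggestion above, here is a minimal sketch of what "create once and reuse" could look like. It is not the code from this PR: the helper names (`encodeUtf8`, `decodeUtf8`), the scratch size, and the choice of `encodeInto` as the consumer of the fixed buffer are all assumptions made for illustration.

```javascript
// Hypothetical helpers, not the PR's code: one shared encoder/decoder pair
// created at module load instead of on every call.
const sharedEncoder = new TextEncoder();
const sharedDecoder = new TextDecoder("utf-8");

// Assumed scratch size; larger strings fall back to encode().
const SCRATCH_SIZE = 4096;
const scratch = new Uint8Array(SCRATCH_SIZE);

// Encode a JS string to UTF-8 bytes, reusing the scratch buffer when it fits.
function encodeUtf8(str) {
  // One UTF-16 code unit expands to at most 3 UTF-8 bytes.
  if (str.length * 3 <= SCRATCH_SIZE) {
    const { written } = sharedEncoder.encodeInto(str, scratch);
    // Copy out so the result survives the next call that reuses scratch.
    return scratch.slice(0, written);
  }
  return sharedEncoder.encode(str);
}

// Decode UTF-8 bytes to a JS string with the shared decoder.
function decodeUtf8(bytes) {
  return sharedDecoder.decode(bytes);
}
```

The `slice` copy keeps the sketch safe to call repeatedly; a real runtime could avoid it if the bytes are consumed before the next call.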
@vouillon, any idea on how to benchmark this other than with micro benchmarks?
9fea72f to 5499d77
5499d77 to 0e9dcba
0e9dcba to 7375672
A quick micro benchmark shows a significant slowdown.
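The benchmark behind this comment is not reproduced here. As a rough illustration of how such a micro benchmark could be set up, the sketch below times a reused `TextDecoder` against one created per call. The `timeIt` helper, the sample string, and the iteration counts are hypothetical, and no particular outcome is implied.

```javascript
// Hypothetical micro-benchmark harness; names and sizes are illustrative.
function timeIt(fn, iterations) {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  return performance.now() - start; // elapsed milliseconds
}

// A short non-ASCII sample so the decoder cannot take an ASCII-only path.
const sample = new TextEncoder().encode("héllo wörld ".repeat(8));
const shared = new TextDecoder();

// Warm up both paths before timing them.
timeIt(() => shared.decode(sample), 1000);
timeIt(() => new TextDecoder().decode(sample), 1000);

const reusedMs = timeIt(() => shared.decode(sample), 20000);
const freshMs = timeIt(() => new TextDecoder().decode(sample), 20000);

console.log(`reused decoder: ${reusedMs.toFixed(1)} ms`);
console.log(`fresh decoder per call: ${freshMs.toFixed(1)} ms`);
```

Timings from a loop like this vary with the engine, string length, and warm-up, which is one reason the thread asks for something beyond micro benchmarks.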
cae4138 to 9724287
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (2)
runtime/js/mlBytes.js:642
- The function `caml_jsstring_of_string` should handle non-ASCII strings correctly. The previous implementation used `caml_utf16_of_utf8`, which may handle edge cases differently than `TextDecoder`.
if (jsoo_is_ascii(s)) return s;
runtime/js/mlBytes.js:659
- The function `caml_string_of_jsstring` should handle non-ASCII strings correctly. The previous implementation used `caml_utf8_of_utf16`, which may handle edge cases differently than `TextEncoder`.
if (jsoo_is_ascii(s)) return caml_string_of_jsbytes(s);
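To make the "edge cases" in the two suppressed comments concrete, the sketch below shows how the standard `TextDecoder` and `TextEncoder` treat malformed input. It only demonstrates the Web API side; how `caml_utf16_of_utf8` and `caml_utf8_of_utf16` behave on the same input is not shown here and would need to be checked against the runtime.

```javascript
// Default decoder: malformed bytes become U+FFFD instead of raising.
const lenient = new TextDecoder("utf-8");
// With fatal: true the same input throws (a TypeError per the Encoding Standard).
const strict = new TextDecoder("utf-8", { fatal: true });

// 0xFF can never appear in well-formed UTF-8.
const invalid = new Uint8Array([0x61, 0xff, 0x62]);
const lenientResult = lenient.decode(invalid);

let strictThrew = false;
try {
  strict.decode(invalid);
} catch (e) {
  strictThrew = true;
}

// A lone surrogate has no UTF-8 form; TextEncoder emits U+FFFD (EF BF BD).
const encodedLoneSurrogate = new TextEncoder().encode("\uD800");

console.log(JSON.stringify(lenientResult), strictThrew, encodedLoneSurrogate);
```

If the OCaml side can hold arbitrary byte strings, this silent replacement is the behaviour most likely to differ from a hand-written converter.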
var r = this.toString();
if (this.t === 9) return r;
return caml_utf16_of_utf8(r);
if (this.t === 9) return this.c;
The method `toUtf16` should handle ASCII and non-ASCII cases consistently. The previous implementation used `caml_utf16_of_utf8` for non-ASCII strings, which may handle surrogate pairs and invalid sequences differently than `TextDecoder`.
if (this.t === 9) return this.c;
return caml_utf16_of_utf8(r);
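On the surrogate-pair point specifically, a small sketch of what `TextDecoder` does may help when comparing against the old converter. The byte sequences below are illustrative test inputs chosen for this note, not taken from the PR, and no claim is made about what `caml_utf16_of_utf8` returns for them.

```javascript
const decoder = new TextDecoder("utf-8");

// U+1F600 as a well-formed 4-byte UTF-8 sequence decodes to a surrogate pair.
const emoji = decoder.decode(new Uint8Array([0xf0, 0x9f, 0x98, 0x80]));
const emojiUnits = [emoji.charCodeAt(0), emoji.charCodeAt(1)];

// The same code point written as two 3-byte encoded surrogates (CESU-8 style)
// is not valid UTF-8, so every byte is replaced by U+FFFD.
const cesu = decoder.decode(
  new Uint8Array([0xed, 0xa0, 0xbd, 0xed, 0xb8, 0x80])
);

console.log(emoji.length, emojiUnits.map((u) => u.toString(16)), cesu.length);
```

A converter that accepts the second form and joins it into a pair would produce a different string than `TextDecoder` for such input.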
No description provided.