Character unescaping improvements #699

Open
6 tasks
ForNeVeR opened this issue Nov 6, 2024 · 0 comments
Labels
area:standard-support Related to the C standard support good-first-issue An issue considered simple enough for new contributors status:help-wanted Open for contributors

Comments

Owner

ForNeVeR commented Nov 6, 2024

Some issues with the current code in Cesium.CodeGen.Ir.Expressions.Constants.CharConstant.UnescapeCharacter and Cesium.Parser.TokenExtensions.UnwrapStringLiteral:

  • There are two of them, with different implementations. There should be only one.
  • UnescapeCharacter doesn't support \u and \U aka universal-character-name from the standard.
  • UnescapeCharacter also has a bug in handling octal and hex sequences: both are assumed to have exactly two digits, with special treatment of \0. The standard, however, defines octal sequences as one, two, or three digits long, while hex escapes are of arbitrary length.
  • \0 should not be a special case in either of the methods; it is just an octal number.
  • UnwrapStringLiteral also seems to treat octal sequences weirdly: I only see support for octal numbers starting with 0, which is not correct (UnescapeCharacter handles these better).
  • Normal compiler behavior is to report a warning on an invalid sequence (e.g. \m) and treat it as the character itself. We don't do this: we either silently accept or break on such sequences.
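
To make the intended rules concrete, here is a minimal sketch of a single decoding routine that follows the points above. It is written in C purely for illustration and is not Cesium's code: the names unescape_one and hex_value are hypothetical, and range checks on the resulting value (as well as the standard's restrictions on which universal character names are allowed) are deliberately left out.

```c
#include <stddef.h>
#include <stdio.h>

/* Returns the value of a hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper (not Cesium code). `s` points just past a backslash.
 * Stores the decoded value in *out and returns how many characters of `s`
 * were consumed. Range checks on the decoded value are omitted. */
static size_t unescape_one(const char *s, unsigned long *out)
{
    static const char names[]  = "abfnrtv\\'\"?";
    static const char values[] = "\a\b\f\n\r\t\v\\'\"?";
    unsigned long v = 0;
    size_t i = 0;

    if (*s == '\0') {               /* lone trailing backslash: keep it */
        *out = '\\';
        return 0;
    }

    /* Octal: one to three digits. \0 is simply the one-digit case. */
    if (*s >= '0' && *s <= '7') {
        while (i < 3 && s[i] >= '0' && s[i] <= '7')
            v = v * 8 + (unsigned long)(s[i++] - '0');
        *out = v;
        return i;
    }

    /* Hex (\x): any number of digits, but at least one.
     * Universal character names: \u takes exactly 4 digits, \U exactly 8. */
    if (*s == 'x' || *s == 'u' || *s == 'U') {
        size_t need = (*s == 'u') ? 4 : (*s == 'U') ? 8 : 0;
        size_t n = 0;
        while ((need == 0 || n < need) && hex_value(s[1 + n]) >= 0) {
            v = v * 16 + (unsigned long)hex_value(s[1 + n]);
            n++;
        }
        if (n > 0 && (need == 0 || n == need)) {
            *out = v;
            return 1 + n;
        }
        /* Too few digits: fall through to the invalid-sequence path. */
    }

    /* Simple escapes such as \n, \t, \\ and \". */
    for (i = 0; names[i] != '\0'; i++) {
        if (*s == names[i]) {
            *out = (unsigned char)values[i];
            return 1;
        }
    }

    /* Invalid sequence (e.g. \m): warn and use the character itself. */
    fprintf(stderr, "warning: unknown escape sequence '\\%c'\n", *s);
    *out = (unsigned char)*s;
    return 1;
}
```

Given the text after the backslash in \101, this sketch consumes three characters and yields 65; given \0 it consumes one character and yields 0 with no special-casing; given \m it prints a warning and yields the character m. Having one such routine shared by character constants and string literals would address the duplication mentioned in the first point.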

See also #295.
