-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for IDN (punycode) handles with standard sanitization #7308
Open
drash-course
wants to merge
13
commits into
bluesky-social:main
Choose a base branch
from
drash-course:idn-handles-standard-sanitization
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Support for IDN (punycode) handles with standard sanitization #7308
drash-course
wants to merge
13
commits into
bluesky-social:main
from
drash-course:idn-handles-standard-sanitization
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
* main: (58 commits) Fix tests Layout tweaks (bluesky-social#7150) Trending (Beta) (bluesky-social#7144) Fix emoji picker position (bluesky-social#7146) Tweak Follow dialog Search placeholder (bluesky-social#7147) New progress guide - 10 follows (bluesky-social#7128) Pipe statsig events to logger (bluesky-social#7141) Fix notifications borders (bluesky-social#7140) Refetch empty feed on focus (bluesky-social#7139) Read storage on window.onstorage (bluesky-social#7137) [ELI5] Tweak wording on the signup screen (bluesky-social#7136) alf error screen (bluesky-social#7135) add safe area view to profile error screen (bluesky-social#7134) Adjust gates (bluesky-social#7132) disable automaticallAdjustsScrollIndicatorInsets (bluesky-social#7131) Bump more native deps (bluesky-social#7129) Update more Expo packages (bluesky-social#7127) feat: widen recent search profile link for mobile devices (bluesky-social#7119) Fix video uploads on native (bluesky-social#7126) Fix post time localization on Android (bluesky-social#6742) ... # Conflicts: # src/view/com/profile/ProfileSubpageHeader.tsx # src/view/screens/ProfileList.tsx
…ndles-standard-sanitization
Any news from a maintainer on this? Is there an issue or concern that prevents this PR from moving forward? |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is a follow up PR to #7043.
This PR implements the mechanisms described in Unicode TR39, specifically the Restriction Level 2 "Single Script" based on the data from Unicode 16. If the IDN handle passes the checks, it's displayed in full unicode. If it fails the checks, it's displayed in punycode form like
@xn--6frz82g.tld
. This is what Firefox does as well.To do so, I wrote a TS script that reads some reference files published by Unicode and generates a "unicode map" of codepoints that the Bluesky app then uses. When a punycode handle is encountered, we decode it and check if the codepoints are not banned and if they only belong to scripts (e.g. Latin, Cyrillic, Hiragana, etc) that can be mixed according to TR39.
The required files are :
When a new Unicode version is published, you can put the updated files in
scripts/unicode/Public
and reruncompileUnicodeMaps.ts
to get the new mappings. Something to check on every few months, though there is little risk if it's done later or not at all. You can also choose which codepoints should be banned on top of what is specified in TR39. I banned the checkmark emojis for example, to be aligned with the display name rules.The resulting "unicode map" is kinda large (about 1600 ranges of codepoints) so I made sure the checking code is speed optimized. This way the app does not need to e.g. memoize the sanitized handles or other fancy tricks to feel smooth. I wrote in timing_test.txt the timing I got with
performance.now()
while typingxn--a
xn--b
xn--c
in the search bar, if you want to see exact numbers.Note that searching for IDN handles with the U-label does not work, but typing the
xn--
code works. Something to add in a future PR.Finally, I also added some positive and negative tests for the things I could think of. I've looked for the tests that Firefox may have for IDNs but couldn't find much. Let me know if you feel this needs more tests.
Feel free to give me pointers if you think something is missing.