Support for IDN (punycode) handles with standard sanitization #7308

Open: drash-course wants to merge 13 commits into main

@drash-course commented Dec 30, 2024

This is a follow-up PR to #7043.

Currently, every @handle.tld is displayed in its ASCII form. This is fine for most domains, but some TLDs accept IDN domains and it'd be nice to support them fully. I found at least one other issue, #6354, discussing this.

Of course, this opens the can of worms that is IDN homograph attacks.

This PR implements the mechanisms described in Unicode TR39, specifically Restriction Level 2 ("Single Script"), based on the data from Unicode 16. If an IDN handle passes the checks, it's displayed in full Unicode. If it fails the checks, it's displayed in punycode form, like @xn--6frz82g.tld. This is what Firefox does as well.
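To make the fallback rule concrete, here is a minimal sketch of the decode-then-decide flow. It is not the code from this PR: `punycodeDecode` is a compact RFC 3492 decoder written out only so the example is self-contained, and `displayHandle` and its `isSafe` callback are hypothetical names. It also assumes that a single failing label sends the whole handle back to its ASCII form.

```typescript
// RFC 3492 Punycode parameters.
const BASE = 36, T_MIN = 1, T_MAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Bias adaptation function from RFC 3492, section 6.1.
function adapt(delta: number, numPoints: number, firstTime: boolean): number {
  delta = firstTime ? Math.floor(delta / DAMP) : Math.floor(delta / 2);
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > ((BASE - T_MIN) * T_MAX) / 2) {
    delta = Math.floor(delta / (BASE - T_MIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - T_MIN + 1) * delta) / (delta + SKEW));
}

// Digit value of a basic code point: a-z => 0..25, 0-9 => 26..35.
function digitOf(ch: string): number {
  const c = ch.toLowerCase().charCodeAt(0);
  if (c >= 0x61 && c <= 0x7a) return c - 0x61;
  if (c >= 0x30 && c <= 0x39) return c - 0x30 + 26;
  throw new Error(`invalid punycode digit: ${ch}`);
}

// Decodes the part of a label that follows "xn--" into codepoints.
function punycodeDecode(input: string): number[] {
  const output: number[] = [];
  const lastDash = input.lastIndexOf('-');
  // Everything before the last '-' is copied through as literal ASCII.
  for (let j = 0; j < lastDash; j++) output.push(input.charCodeAt(j));
  let n = INITIAL_N, i = 0, bias = INITIAL_BIAS;
  let pos = lastDash > 0 ? lastDash + 1 : 0;
  while (pos < input.length) {
    const oldI = i;
    let w = 1;
    // Read one variable-length integer (a delta) from the digit stream.
    for (let k = BASE; ; k += BASE) {
      if (pos >= input.length) throw new Error('truncated punycode input');
      const digit = digitOf(input.charAt(pos++));
      i += digit * w;
      const t = k <= bias ? T_MIN : k >= bias + T_MAX ? T_MAX : k - bias;
      if (digit < t) break;
      w *= BASE - t;
    }
    const len = output.length + 1;
    bias = adapt(i - oldI, len, oldI === 0);
    n += Math.floor(i / len);
    i %= len;
    output.splice(i++, 0, n); // insert codepoint n at position i
  }
  return output;
}

// Hypothetical callback type: decides whether a decoded label may be shown.
type LabelCheck = (codepoints: number[]) => boolean;

// Returns the Unicode form if every "xn--" label decodes and passes `isSafe`,
// otherwise returns the ASCII handle unchanged (assumed whole-handle fallback).
function displayHandle(asciiHandle: string, isSafe: LabelCheck): string {
  const shown: string[] = [];
  for (const label of asciiHandle.split('.')) {
    if (label.slice(0, 4).toLowerCase() !== 'xn--') {
      shown.push(label);
      continue;
    }
    try {
      const codepoints = punycodeDecode(label.slice(4));
      if (!isSafe(codepoints)) return asciiHandle;
      shown.push(String.fromCodePoint(...codepoints));
    } catch {
      return asciiHandle; // malformed punycode: keep the ASCII form
    }
  }
  return shown.join('.');
}

// Usage, with trivial stand-in checks:
console.log(displayHandle('xn--mnchen-3ya.example', () => true));  // münchen.example
console.log(displayHandle('xn--mnchen-3ya.example', () => false)); // xn--mnchen-3ya.example
```

Passing the check in as a callback keeps decoding separate from policy, so the same flow works with whatever rule set ends up being applied.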

[Screenshot: iPhone 16 Pro simulator, 2024-12-30]

To do so, I wrote a TS script that reads some reference files published by Unicode and generates a "unicode map" of codepoints that the Bluesky app then uses. When a punycode handle is encountered, we decode it and check that none of its codepoints are banned and that they all belong to scripts (e.g. Latin, Cyrillic, Hiragana, etc.) that can be mixed according to TR39.
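The check described above could look roughly like the sketch below. Everything in it is an illustrative assumption rather than this PR's data or code: the script table is a tiny hand-written subset, the banned list is a guess at which checkmarks are meant, each codepoint is given a single script (real TR39 processing also uses Script_Extensions), and the one allowed mix is the Japanese combination that TR39 lists for its Highly Restrictive level.

```typescript
type Script = 'Common' | 'Latin' | 'Cyrillic' | 'Han' | 'Hiragana' | 'Katakana';

// Toy table of inclusive ranges: [first, last, script]. Not the generated map.
const SCRIPT_RANGES: Array<[number, number, Script]> = [
  [0x002d, 0x002d, 'Common'],   // hyphen
  [0x0030, 0x0039, 'Common'],   // digits 0-9
  [0x0061, 0x007a, 'Latin'],    // a-z
  [0x00e0, 0x00f6, 'Latin'],    // à-ö
  [0x0430, 0x044f, 'Cyrillic'], // а-я
  [0x3041, 0x3096, 'Hiragana'],
  [0x30a1, 0x30fa, 'Katakana'],
  [0x4e00, 0x9fff, 'Han'],
];

// Hypothetical extra bans: three checkmark characters.
const BANNED = new Set<number>([0x2705, 0x2713, 0x2714]);

// Script combinations that may appear together in one label (assumed).
const ALLOWED_MIXES: Script[][] = [['Latin', 'Han', 'Hiragana', 'Katakana']];

function scriptOf(cp: number): Script | undefined {
  for (const [first, last, script] of SCRIPT_RANGES) {
    if (cp >= first && cp <= last) return script;
  }
  return undefined; // unknown codepoints are treated as not allowed
}

function isLabelAllowed(codepoints: number[]): boolean {
  const seen = new Set<Script>();
  for (const cp of codepoints) {
    if (BANNED.has(cp)) return false;
    const script = scriptOf(cp);
    if (script === undefined) return false;
    if (script !== 'Common') seen.add(script); // Common mixes with anything
  }
  if (seen.size <= 1) return true; // single script
  const scripts = Array.from(seen);
  return ALLOWED_MIXES.some(mix => scripts.every(s => mix.indexOf(s) !== -1));
}

// Helper for the examples: string -> array of codepoints.
const toCodepoints = (s: string): number[] =>
  Array.from(s, ch => ch.codePointAt(0) as number);

console.log(isLabelAllowed(toCodepoints('paypal')));       // true  (Latin only)
console.log(isLabelAllowed(toCodepoints('p\u0430ypal')));  // false (Latin + Cyrillic а)
console.log(isLabelAllowed(toCodepoints('\u3042a')));      // true  (Hiragana + Latin)
console.log(isLabelAllowed([0x2705]));                     // false (banned)
```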

The required files are:

When a new Unicode version is published, you can put the updated files in scripts/unicode/Public and rerun compileUnicodeMaps.ts to get the new mappings. This is worth checking every few months, though there is little risk if it's done late or not at all. You can also choose which codepoints should be banned on top of what is specified in TR39. For example, I banned the checkmark emojis to align with the display name rules.
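For readers unfamiliar with the Unicode data files, here is a rough sketch of how a generator might turn one into codepoint ranges. It assumes the input uses the standard UCD property-file layout (`first..last ; Value # comment`, as in Scripts.txt). `parseScriptRanges` and `ScriptRange` are hypothetical names, and this is not the logic of compileUnicodeMaps.ts.

```typescript
interface ScriptRange {
  first: number;
  last: number;
  script: string;
}

// Parses UCD-style lines and merges adjacent ranges that share a script.
function parseScriptRanges(text: string): ScriptRange[] {
  const ranges: ScriptRange[] = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.split('#')[0].trim(); // drop trailing comments
    if (line === '') continue;
    const fields = line.split(';').map(f => f.trim());
    if (fields.length < 2) continue; // not a "codepoints ; value" line
    const [firstHex, lastHex = firstHex] = fields[0].split('..');
    ranges.push({
      first: parseInt(firstHex, 16),
      last: parseInt(lastHex, 16),
      script: fields[1],
    });
  }
  ranges.sort((a, b) => a.first - b.first);

  const merged: ScriptRange[] = [];
  for (const r of ranges) {
    const prev = merged[merged.length - 1];
    if (prev && prev.script === r.script && prev.last + 1 === r.first) {
      prev.last = r.last; // extend the previous range
    } else {
      merged.push({ ...r });
    }
  }
  return merged;
}

// Abbreviated sample in Scripts.txt style (not the full file).
const SAMPLE_SCRIPTS = `
# Sample only
0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
0061..007A    ; Latin # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
00AA          ; Latin # Lo       FEMININE ORDINAL INDICATOR
0400..0481    ; Cyrillic # L& [130] CYRILLIC CAPITAL LETTER IE WITH GRAVE..CYRILLIC SMALL LETTER KOPPA
0482          ; Cyrillic # So       CYRILLIC THOUSANDS SIGN
`;

const parsedSample = parseScriptRanges(SAMPLE_SCRIPTS);
console.log(parsedSample.length); // 4: the two Cyrillic entries merge into 0400..0482
```

Merging adjacent ranges at generation time keeps the shipped map small, which matters when the app consults it for every handle it renders.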

The resulting "unicode map" is kinda large (about 1600 ranges of codepoints), so I made sure the checking code is optimized for speed. This way the app does not need to memoize the sanitized handles or use other fancy tricks to feel smooth. If you want to see exact numbers, timing_test.txt has the timings I got with performance.now() while typing xn--a xn--b xn--c in the search bar.
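One common way to keep lookups cheap over a table of this size is a binary search on sorted, non-overlapping ranges, which needs about 11 comparisons for 1600 ranges. The sketch below illustrates that technique only. Whether this PR uses it is an assumption, and `inRanges` and the sample data are hypothetical.

```typescript
// Flat, sorted, non-overlapping inclusive ranges: [start0, end0, start1, end1, ...].
const SAMPLE_RANGES = new Uint32Array([
  0x0030, 0x0039, // digits
  0x0061, 0x007a, // a-z
  0x0430, 0x044f, // а-я
  0x4e00, 0x9fff, // CJK Unified Ideographs
]);

// Returns true if `cp` falls inside one of the ranges. O(log n) per lookup.
function inRanges(ranges: Uint32Array, cp: number): boolean {
  let lo = 0;
  let hi = ranges.length / 2 - 1; // index of the last range
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    if (cp < ranges[2 * mid]) {
      hi = mid - 1; // cp is below this range
    } else if (cp > ranges[2 * mid + 1]) {
      lo = mid + 1; // cp is above this range
    } else {
      return true;
    }
  }
  return false;
}

console.log(inRanges(SAMPLE_RANGES, 0x61)); // true
console.log(inRanges(SAMPLE_RANGES, 0x7b)); // false
```

A flat typed array avoids per-range object allocations and keeps the data contiguous in memory, which tends to help on mobile JS engines.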

Note that searching for IDN handles by their U-label (the Unicode form) does not work yet, but typing the xn-- form does. That's something to add in a future PR.

Finally, I also added some positive and negative tests for the cases I could think of. I looked for tests that Firefox may have for IDNs but couldn't find much. Let me know if you feel this needs more tests.

Feel free to give me pointers if you think something is missing.

* main: (58 commits)
  Fix tests
  Layout tweaks (bluesky-social#7150)
  Trending (Beta) (bluesky-social#7144)
  Fix emoji picker position (bluesky-social#7146)
  Tweak Follow dialog Search placeholder (bluesky-social#7147)
  New progress guide - 10 follows (bluesky-social#7128)
  Pipe statsig events to logger (bluesky-social#7141)
  Fix notifications borders (bluesky-social#7140)
  Refetch empty feed on focus (bluesky-social#7139)
  Read storage on window.onstorage (bluesky-social#7137)
  [ELI5] Tweak wording on the signup screen (bluesky-social#7136)
  alf error screen (bluesky-social#7135)
  add safe area view to profile error screen (bluesky-social#7134)
  Adjust gates (bluesky-social#7132)
  disable automaticallAdjustsScrollIndicatorInsets (bluesky-social#7131)
  Bump more native deps (bluesky-social#7129)
  Update more Expo packages (bluesky-social#7127)
  feat: widen recent search profile link for mobile devices (bluesky-social#7119)
  Fix video uploads on native (bluesky-social#7126)
  Fix post time localization on Android (bluesky-social#6742)
  ...

# Conflicts:
#	src/view/com/profile/ProfileSubpageHeader.tsx
#	src/view/screens/ProfileList.tsx
@drash-course

Any news from a maintainer on this? Is there an issue or concern that prevents this PR from moving forward?
