First pass for adding North American indigenous locales #596

Open: wants to merge 2 commits into base: master
11 changes: 9 additions & 2 deletions docs/_docs/refs/mojito-locales.md
@@ -32,9 +32,12 @@ permalink: /docs/refs/mojito-locales/
| bn-IN | Bengali (India) |
| bs-BA | Bosnian (Bosnia and Herzegovina) |
| ca-ES | Catalan (Spain) |
| chr-021 | Cherokee (Northern America) |
| cr-021 | Cree (Northern America) |
Collaborator

My current thinking is to not have a region/territory unless it is a locale that is available in CLDR. I know Mojito currently always has a region, but I'm thinking of moving away from that pattern.

The new locale list would be any locale defined in https://github.com/unicode-cldr/cldr-core/blob/master/availableLocales.json plus any language in https://github.com/unicode-cldr/cldr-localenames-full/blob/master/main/en/languages.json.

So for the few I checked, they'd be languages only.

Thoughts?
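
To make the proposal above concrete, here is a minimal sketch of how such a list could be assembled: the union of CLDR's available locales and its bare language codes. The inline JSON strings are tiny illustrative stand-ins for the two linked files, the key layout is an assumption about those files, and `candidate_locales` is a hypothetical helper, not part of Mojito.

```python
import json

# Illustrative stand-ins for the two CLDR JSON files linked above.
# Assumption: availableLocales.json nests its list under
# "availableLocales" -> "full", and languages.json nests its map under
# "main" -> "en" -> "localeDisplayNames" -> "languages".
AVAILABLE_LOCALES_JSON = """
{"availableLocales": {"full": ["chr", "en-GB", "es-419", "fr-CA"]}}
"""
LANGUAGES_JSON = """
{"main": {"en": {"localeDisplayNames": {"languages": {
  "chr": "Cherokee", "cr": "Cree", "dak": "Dakota", "mus": "Creek"
}}}}}
"""


def candidate_locales(available_json: str, languages_json: str) -> list:
    """Return the sorted union of available locales and bare language codes."""
    available = json.loads(available_json)["availableLocales"]["full"]
    languages = json.loads(languages_json)["main"]["en"]["localeDisplayNames"]["languages"]
    # A language that is not a full CLDR locale (e.g. "cr") still appears,
    # but without a region subtag.
    return sorted(set(available) | set(languages))


locales = candidate_locales(AVAILABLE_LOCALES_JSON, LANGUAGES_JSON)
print(locales)
# → ['chr', 'cr', 'dak', 'en-GB', 'es-419', 'fr-CA', 'mus']
```

Under this scheme, entries such as `cr` or `dak` would be listed as languages only, which matches the observation in the comment above.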

Author

I think that's fair. I'd love to support someone in getting the most appropriate language codes into CLDR itself through the formal registration process, rather than doing something opinionated here :)

| cs-CZ | Czech (Czech Republic) |
| cy-GB | Welsh (United Kingdom) |
| da-DK | Danish (Denmark) |
| dak-021 | Dakota (Northern America) |
| de-AT | German (Austria) |
| de-CH | German (Switzerland) |
| de-DE | German (Germany) |
@@ -56,7 +59,7 @@ permalink: /docs/refs/mojito-locales/
| en-US | English (United States) |
| en-ZA | English (South Africa) |
| en-ZW | English (Zimbabwe) |
| en-419 | Spanish (Latin America) |
| es-419 | Spanish (Latin America) |
Collaborator

Good catch! This could be a first commit that just fixes that :)
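
A typo like `en-419` labelled as Spanish can be caught mechanically. The sketch below is one hypothetical approach, not something that exists in Mojito: it groups table rows by language subtag and flags any row whose language name disagrees with the majority for that subtag. The rows are copied from this diff as they stood before the fix; the function names are made up for illustration.

```python
from collections import Counter, defaultdict

# Rows copied from the locale table in this diff (before the fix).
TABLE = """\
| en-US | English (United States) |
| en-ZA | English (South Africa) |
| en-ZW | English (Zimbabwe) |
| en-419 | Spanish (Latin America) |
| es-AR | Spanish (Argentina) |
| es-BO | Spanish (Bolivia) |
"""


def parse_rows(table):
    """Parse '| tag | Display name |' rows into (tag, display_name) pairs."""
    rows = []
    for line in table.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0]:
            rows.append((cells[0], cells[1]))
    return rows


def suspicious_tags(table):
    """Flag tags whose language name differs from the majority for that subtag."""
    by_lang = defaultdict(list)
    for tag, name in parse_rows(table):
        lang_subtag = tag.split("-")[0]   # "en-419" -> "en"
        lang_name = name.split(" (")[0]   # "Spanish (Latin America)" -> "Spanish"
        by_lang[lang_subtag].append((tag, lang_name))
    flagged = []
    for entries in by_lang.values():
        majority, _ = Counter(n for _, n in entries).most_common(1)[0]
        flagged.extend(tag for tag, n in entries if n != majority)
    return flagged


print(suspicious_tags(TABLE))
# → ['en-419']
```

This only works when a language subtag appears on several rows; a subtag with a single row has no majority to compare against.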

| es-AR | Spanish (Argentina) |
| es-BO | Spanish (Bolivia) |
| es-CL | Spanish (Chile) |
@@ -99,6 +102,7 @@ permalink: /docs/refs/mojito-locales/
| is-IS | Icelandic (Iceland) |
| it-CH | Italian (Switzerland) |
| it-IT | Italian (Italy) |
| iu-021 | Inuktitut (Northern America) |
| ja-JP | Japanese (Japan) |
| ka-GE | Georgian (Georgia) |
| kk-KZ | Kazakh (Kazakhstan) |
@@ -115,12 +119,15 @@ permalink: /docs/refs/mojito-locales/
| ms-BN | Malay (Brunei Darussalam) |
| ms-MY | Malay (Malaysia) |
| mt-MT | Maltese (Malta) |
| mus-021 | Creek (Northern America) |

Muscogee (Creek). Creek is the historical name; Muscogee (Creek) is the nation.

I think Muscogee works fine here

Author

Honestly, this is the small tedious stuff that I'd be happy to wade through the bureaucracy on -- "Creek" is what's in the CLDR dataset that's maintained internationally, so it'll keep showing up if it's wrong, since much of the field of localization seems programmatic or at least highly inclined to follow the standards:
https://github.com/unicode-cldr/cldr-localenames-full/blob/993632df2f5d6a2d33cbbf40d922474c2482eaca/main/en-001/languages.json#L390

If you can confirm that "Muscogee" is universally the more appropriate term, then I could wrangle with the bureaucracy to have that reflected in the standards. We could review the standards and collect a list of items like this to push through en masse.

Collaborator

@arecvlohe this is great info, thanks for sharing

Author

Oh hey, good news: seems someone else chimed in on this in Aug 2009, and it's in the upcoming release: https://unicode-org.atlassian.net/browse/CLDR-13193?jql=text%20~%20%22muskogee%22

Oh nice! I guess I don't have to do anything then.

| nb-NO | Norwegian (Norway) |
| nl-BE | Dutch (Belgium) |
| nl-NL | Dutch (Netherlands) |
| nn-NO | Norwegian (Nynorsk) (Norway) |
| ns-ZA | Northern Sotho (South Africa) |
| nv-003 | Navajo (North America) |

Navajo is a name given by the Spanish. Better to go with Diné

Author

@patcon, Jun 23, 2020

YES. This is great insight. To clarify, this downstream tool isn't the place to fix this, but I'd like to help with it.

Term originates from this dataset: https://github.com/unicode-cldr/cldr-localenames-full/blob/993632df2f5d6a2d33cbbf40d922474c2482eaca/main/en-001/languages.json#L426

Current CLDR is v36 (release notes), and nv was added in March 2018 for v33 (release notes) (or some time in the year prior to its official version release)

I can try to dig up the mailing lists to see the conversation when this was added. It may be that this conversation happened already. But even so, receptivity to this sort of feedback may have changed.

Author

Sorry, requires login to UNICODE ticket tracker: https://unicode-org.atlassian.net/browse/CLDR-13814?jql=text%20~%20%22navajo%22

  • CLDR-13814 Addition of core data and new locale: Navajo (nv)
    • lots of missing information in the registration request -- looks like they could use some support
    • Was submitted by Google on May 24, but they just withdrew on June 8th, as they were not able to "get vetters" in time.

I'm not sure I understand, since it seems it's in there already, but perhaps it's not an official locale yet, as it's lacking some deeper level of detail.

Author

@patcon, Jun 23, 2020

I can imagine they might naively lean on decisions like this:
https://www.indianz.com/News/2017/04/19/navajo-nation-council-rejects-bill-to-ch.asp

but if there are alternative perspectives that have been legitimized through community channels, it could perhaps be raised... maybe there's precedent for some other approach (e.g., multiple entries)

Thanks for all the information. Let's talk about this in a chat so I can get the proper perspective on it. What I would like to do is send this out through our social media and gather feedback that way. It would be nice if someone could point me to something more official.

| ny-MW | Nyanja; Chewa; Chichewa (Malawi) |
| oj-021 | Ojibwa (Northern America) |
@arecvlohe, Jun 23, 2020

Ojibway is a French term, I think. Anishinaabe is what they call themselves.

Author

Ditto above.

Quick research says this term is much more entrenched than the others. It seems to have originated in 2009, from the days when changes happened via Internet Engineering Task Force (IETF) RFC:

Interesting. I will have to read up more on this. Bureaucracy but without engaging Native peoples it seems. Is that the norm?

| pa-IN | Punjabi (India) |
| pl-PL | Polish (Poland) |
| ps-AR | Pashto (Afghanistan) |
@@ -157,11 +164,11 @@ permalink: /docs/refs/mojito-locales/
| uz-UZ | Uzbek (Uzbekistan) |
| vi-VN | Vietnamese (Viet Nam) |
| xh-ZA | Xhosa (South Africa) |
| ypk-021 | Yupik (Northern America) |
Collaborator

Yupik seems to be a group of languages: https://en.wikipedia.org/wiki/Yupik_languages. The mapping is "ypk": "ems ess esu ynk"; of these, only "esu" has a display name, so it is the only one generated by the script I wrote. All the other entries you had are covered.

I'm wondering whether you really need this entry or whether "esu" would work in your case.
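
As a rough illustration of the check described above (not the reviewer's actual script), the sketch below expands a collective code into its member codes and keeps only the members that have a display name. The member string is quoted from the comment; the display-name map is a hypothetical one-entry stand-in for CLDR's `languages.json`, and the name given for `esu` is an assumption to verify against that file.

```python
# Member list quoted from the review comment above.
COLLECTION_MEMBERS = {"ypk": "ems ess esu ynk"}

# Hypothetical stand-in for CLDR's English language display names.
# Assumption: "esu" is displayed as "Central Yupik"; verify against CLDR.
DISPLAY_NAMES = {"esu": "Central Yupik"}


def members_with_display_names(code, members, names):
    """Return {member_code: display_name} for members that have a name."""
    return {m: names[m] for m in members.get(code, "").split() if m in names}


usable = members_with_display_names("ypk", COLLECTION_MEMBERS, DISPLAY_NAMES)
print(usable)
# → {'esu': 'Central Yupik'}
```

If the result is a single member, as here, using that member's code in place of the collective code may be the simpler choice, which is what the comment is asking about.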

| zh-CN | Chinese (Simplified) |
| zh-HK | Chinese (Hong Kong) |
| zh-MO | Chinese (Macau) |
| zh-SG | Chinese (Singapore) |
| zh-TW | Chinese (Traditional) |
| zu-ZA | Zulu (South Africa) |


18 changes: 17 additions & 1 deletion webapp/src/main/resources/db/hsql/data.sql
@@ -152,6 +152,14 @@ insert into locale (id, bcp47_tag) values (781, 'zh-MO');
insert into locale (id, bcp47_tag) values (785, 'zh-SG');
insert into locale (id, bcp47_tag) values (153, 'zh-TW');
insert into locale (id, bcp47_tag) values (793, 'zu-ZA');
insert into locale (id, bcp47_tag) values (804, 'nv-003');
insert into locale (id, bcp47_tag) values (805, 'oj-021');
insert into locale (id, bcp47_tag) values (806, 'iu-021');
insert into locale (id, bcp47_tag) values (807, 'cr-021');
insert into locale (id, bcp47_tag) values (808, 'mus-021');
insert into locale (id, bcp47_tag) values (809, 'dak-021');
insert into locale (id, bcp47_tag) values (810, 'chr-021');
insert into locale (id, bcp47_tag) values (811, 'ypk-021');


insert into plural_form (id, name) values (1, 'one');
@@ -549,4 +557,12 @@ insert into plural_form_for_locale (locale_id, plural_form_id) values (802, 1);
insert into plural_form_for_locale (locale_id, plural_form_id) values (802, 5);
insert into plural_form_for_locale (locale_id, plural_form_id) values (803, 1);
insert into plural_form_for_locale (locale_id, plural_form_id) values (803, 5);

insert into plural_form_for_locale (locale_id, plural_form_id) values (804, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (805, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (806, 0);
Collaborator

Should we use a standard plural form like English (singular, plural) instead of a single one? Any idea how plurals work in those languages?

Author

(I just left the "zeros" as placeholders until I had time to figure out what these were.)

I can work on this! Might need some time. Can we add this later, and use a sane default for now, or does merging something incorrect lead to hassle?
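
One way to picture the "sane default" being discussed is to generate English-style rows (two plural forms per locale) for the new locale IDs instead of the placeholder `0`. The sketch below does that as plain string generation. It assumes plural form id 1 is 'one' (shown in `data.sql`) and id 5 is 'other' (inferred from the existing `(802, 1)` and `(802, 5)` rows, not confirmed in this diff). Whether two forms are linguistically right for these languages is exactly the open question in this thread. `plural_form_inserts` is a hypothetical helper.

```python
# Assumption: id 1 = 'one' (shown in data.sql), id 5 = 'other' (inferred
# from the existing rows for locale 802; not confirmed in this diff).
DEFAULT_PLURAL_FORM_IDS = (1, 5)

# Locale ids 804..811 are the ones added in this PR.
NEW_LOCALE_IDS = range(804, 812)


def plural_form_inserts(locale_ids, plural_form_ids=DEFAULT_PLURAL_FORM_IDS):
    """Build one insert statement per (locale, plural form) pair."""
    return [
        "insert into plural_form_for_locale (locale_id, plural_form_id) "
        f"values ({locale_id}, {form_id});"
        for locale_id in locale_ids
        for form_id in plural_form_ids
    ]


statements = plural_form_inserts(NEW_LOCALE_IDS)
print("\n".join(statements))
```

Passing a different `plural_form_ids` tuple per locale would be the natural extension once the actual plural categories for each language are known.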

insert into plural_form_for_locale (locale_id, plural_form_id) values (807, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (808, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (809, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (810, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (811, 0);

17 changes: 17 additions & 0 deletions webapp/src/main/resources/db/migration/V52__Add_Native_Locales.sql
@@ -0,0 +1,17 @@
insert into locale (id, bcp47_tag) values (804, 'nv-003');
insert into locale (id, bcp47_tag) values (805, 'oj-021');
insert into locale (id, bcp47_tag) values (806, 'iu-021');
insert into locale (id, bcp47_tag) values (807, 'cr-021');
insert into locale (id, bcp47_tag) values (808, 'mus-021');
insert into locale (id, bcp47_tag) values (809, 'dak-021');
insert into locale (id, bcp47_tag) values (810, 'chr-021');
insert into locale (id, bcp47_tag) values (811, 'ypk-021');

insert into plural_form_for_locale (locale_id, plural_form_id) values (804, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (805, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (806, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (807, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (808, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (809, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (810, 0);
insert into plural_form_for_locale (locale_id, plural_form_id) values (811, 0);