Support for converting Uyghur Latin Script (tex) to Uyghur Arabic (PDF) #482

neouyghur · 2021-03-22T00:58:54Z

Hi polyglossia team, thanks for your support for the Uyghur language. I have started editing Uyghur LaTeX files in a LaTeX editor, but I find it a little hard to edit documents in Uyghur Arabic script. So I wonder whether the polyglossia package supports Uyghur written with Latin characters. Let me explain. There are two main scripts for the Uyghur language now: Uyghur Arabic and Uyghur Latin. It is easier to edit the source file in Uyghur Latin, but Uyghur Arabic is what we use in daily life. It would be very helpful if polyglossia could render the Latin alphabet as Arabic when generating the PDF. Is that possible? I can provide Python code for converting Uyghur Latin to Arabic.

jspitz · 2021-03-22T07:03:05Z

We do support multiple scripts via the script option, but the input and output script are always the same. We do not yet have a case where we support transliteration.

Wikipedia tells me that three scripts are common in different regions: Arabic, Latin, and Cyrillic. Given this, a script option would make sense.

jspitz · 2021-03-22T07:20:08Z

Note, though, that the ArabXeTeX package supports transliteration, including for Uyghur (via TECkit font mappings). There is also ArabLuaTeX, but this doesn't support Uyghur yet.

I am not sure polyglossia and ArabXeTeX can be used together.

neouyghur · 2021-03-22T09:14:46Z

@jspitz Is there a tutorial for the script option?

jspitz · 2021-03-22T09:49:31Z

No, just look at other languages that provide this option (e.g., Kurdish).

yannis1962 · 2021-03-22T10:22:11Z

If you want to input in one script and obtain output in another script, you can use the mapping mechanism of XeTeX. You need to create a map file in TECkit syntax, as in https://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=teckit, and then convert it to the compiled .tec format using the teckit_compile tool, which you can download from SIL (at https://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=teckitdownloads). In fact TECkit takes over (without acknowledging it) ideas that were previously implemented in Omega through Omega Translation Processes…

But beware that this mapping operation is very simple (it consists of rules of the form "send this string to that string"), so your Latin input must contain some additional information you don't normally write, for example a character for the hamza in the Arabic script. The system is (probably) not intelligent enough to say "these are two vowels, let me put a hamza in between" (I say "probably" because I haven't yet tested it on Uyghur). The other inconvenience is that you will need to use macros whenever you switch from one script to another, since for both Uyghur and English (or Russian) you will use the same input script. Yannis
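To make the "send this string to that string" model concrete, here is a minimal Python sketch. It is not TECkit and does not use TECkit syntax; it only imitates context-free, longest-match replacement. The `RULES` table and the function name `apply_rules` are hypothetical, the table covers only a handful of ULY letters, and the choice of the apostrophe as the typed hamza character is an assumption made for illustration.

```python
# Hypothetical rule table: a small subset of ULY -> Uyghur Arabic pairs.
# Each rule is context-free: one input string always yields one output string.
RULES = {
    "ch": "\u0686",  # cheh
    "sh": "\u0634",  # sheen
    "gh": "\u063a",  # ghain
    "ng": "\u06ad",  # ng
    "a": "\u0627",   # alef
    "b": "\u0628",   # beh
    "i": "\u0649",   # Uyghur i
    "k": "\u0643",   # kaf
    "t": "\u062a",   # teh
    "y": "\u064a",   # yeh
    "'": "\u0626",   # hamza seat; assumed to be typed explicitly as an apostrophe
}


def apply_rules(text, rules=RULES):
    """Replace the longest matching key at each position; copy other characters unchanged."""
    keys = sorted(rules, key=len, reverse=True)  # try digraphs before single letters
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(rules[key])
                i += len(key)
                break
        else:
            # No rule matched at this position: pass the character through.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Because no rule can look at its neighbours, `apply_rules("ai")` yields the two vowel letters with nothing between them; a hamza appears only if the input itself contains the character mapped to it.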

On 22 March 2021 at 01:59, Osman Tursun wrote: So, I wonder if the polyglossia package supports Uyghur written with Latin characters.

<http://www.imt-atlantique.fr/> Yannis HARALAMBOUS Professor Computer Science Department UMR CNRS 6285 Lab-STICC <http://perso.telecom-bretagne.eu/yannisharalambous/> <https://twitter.com/y_haralambous> <https://www.linkedin.com/in/yannis-haralambous-5529073?trk=hp-identity-name>Technopôle Brest-Iroise CS 83818 29238 Brest Cedex 3, France Une école de l'IMT <http://www.imt.fr/> I have no more answers (John Cage)

yannis1962 · 2021-03-22T21:56:20Z

Dear Osman, here is support for converting Latin to Arabic script for the Uyghur language. I have used the ULY correspondence as described in Wikipedia <https://en.wikipedia.org/wiki/Uyghur_alphabets>. There remains the problem of the digraphs zh, sh and ng: how to distinguish them from standalone letters (there is no problem for ch, since there is no c in the transliteration scheme). I also accepted both ë and é, since the picture on the same page shows an é instead of an ë. I'm sending you the TeX file, the MAP file, the TEC file and the resulting PDF (the input for Latin and Arabic is strictly the same Latin-alphabet string). If you can send me a longer text in the Latin alphabet, I can typeset it and we can start debugging. You need XeTeX to run this file. Cheers, Yannis
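One possible way to approach the digraph question, sketched in Python under an explicit assumption: digraphs are read greedily, and the writer types an apostrophe wherever two letters must not be joined. Whether the attached MAP file does anything like this is not known from the thread; `DIGRAPHS` and `tokenize` are hypothetical names, and the sketch ignores any other role the apostrophe may have (such as marking a hamza).

```python
# ULY digraphs named in this thread.
DIGRAPHS = ("ch", "gh", "ng", "sh", "zh")


def tokenize(word):
    """Split a ULY string into letters, reading digraphs greedily.

    Assumption: an apostrophe is an explicit break. It prevents the
    letters on either side from forming a digraph and yields no token.
    """
    tokens = []
    i = 0
    while i < len(word):
        if word[i] == "'":
            i += 1  # separator only: consume it and emit nothing
        elif word[i:i + 2] in DIGRAPHS:
            tokens.append(word[i:i + 2])
            i += 2
        else:
            tokens.append(word[i])
            i += 1
    return tokens
```

With this convention the fragment `ng` is read as one letter, while `n'g` is read as `n` followed by `g`. The cost is that the writer has to mark every such boundary by hand.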

ivankokan · 2021-03-22T22:52:59Z


@yannis1962 We cannot see the attachments here? Are they supposed to be included in polyglossia or used separately?

Also, how hard would it be to support transliteration for Serbian Cyrillic? (I can provide bidirectional Unicode mappings.)

yannis1962 · 2021-03-22T23:11:48Z

On 22 March 2021 at 23:53, Ivan Kokan wrote: We cannot see the attachments here? Are they supposed to be included in polyglossia or used separately?

Sorry, the list server must have removed the attachments. You can fetch them from here: https://we.tl/t-bBPjBZghGB (only for a week or so). They are only test files to show Osman how easy it is to create transliterations.

Also, how hard would it be to support transliteration for Serbian Cyrillic? (I can provide bidirectional Unicode mappings.)

If you have bidirectional Unicode mappings, then it is trivial.
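As a rough illustration of why a bidirectional table makes this straightforward, here is a Python sketch for lowercase Serbian. It is not a TECkit mapping and not part of polyglossia; the names `LATIN`, `CYRILLIC`, `to_cyrillic` and `to_latin` are hypothetical. Both directions are derived from one paired list, with the Latin digraphs lj, nj and dž matched before single letters.

```python
# Lowercase Serbian alphabet in Cyrillic order, paired position by position.
LATIN = ["a", "b", "v", "g", "d", "đ", "e", "ž", "z", "i",
         "j", "k", "l", "lj", "m", "n", "nj", "o", "p", "r",
         "s", "t", "ć", "u", "f", "h", "c", "č", "dž", "š"]
CYRILLIC = "абвгдђежзијклљмнњопрстћуфхцчџш"

LAT_TO_CYR = dict(zip(LATIN, CYRILLIC))
CYR_TO_LAT = dict(zip(CYRILLIC, LATIN))


def to_cyrillic(text):
    """Latin -> Cyrillic, trying two-letter digraphs before single letters."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in LAT_TO_CYR:
            out.append(LAT_TO_CYR[pair])
            i += 2
        else:
            # Unknown characters (spaces, digits, capitals) pass through.
            out.append(LAT_TO_CYR.get(text[i], text[i]))
            i += 1
    return "".join(out)


def to_latin(text):
    """Cyrillic -> Latin; every Cyrillic letter maps to one Latin letter or digraph."""
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text)
```

The Cyrillic-to-Latin direction needs no context at all. The Latin-to-Cyrillic direction is where care is needed: in a few words a sequence such as nj or dž is two separate letters rather than a digraph (for example in "injekcija"), and a plain table like this one cannot tell the difference.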

neouyghur · 2021-03-23T00:22:55Z

@yannis1962 Thanks for your help. Overall, it is working. However, there is a bug related to " ' " (hamza). Is it because it is a one-to-one mapping ? For example, "wijdan'gha" -> "ۋىجدانئ‍غا" is not correct. It should be "ۋىجدانغا".
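The behaviour reported here is what a context-free rule for the apostrophe would produce. One possible fix, sketched in Python rather than TECkit syntax, is to make the apostrophe context-sensitive: emit a hamza only when a vowel follows, and otherwise treat the apostrophe as a silent separator between letters. That rule is an assumption for illustration, not a statement of how ULY or the attached MAP file defines it; `LETTERS`, `VOWELS` and `convert` are hypothetical names, and the table covers only the letters needed for the examples.

```python
# Subset of ULY -> Uyghur Arabic pairs, enough for the examples below.
LETTERS = {
    "gh": "\u063a", "ng": "\u06ad",
    "w": "\u06cb", "i": "\u0649", "j": "\u062c", "d": "\u062f",
    "a": "\u0627", "n": "\u0646", "h": "\u06be", "e": "\u06d5",
    "s": "\u0633", "t": "\u062a",
}
VOWELS = set("aeiouöüëé")
HAMZA = "\u0626"  # hamza on yeh


def convert(word):
    """Convert a ULY word, deciding the apostrophe's role from the next character."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch == "'":
            # Assumed rule: hamza only before a vowel; otherwise a pure separator.
            nxt = word[i + 1] if i + 1 < len(word) else ""
            if nxt in VOWELS:
                out.append(HAMZA)
            i += 1
        elif word[i:i + 2] in LETTERS:
            out.append(LETTERS[word[i:i + 2]])  # digraph such as gh or ng
            i += 2
        else:
            out.append(LETTERS.get(ch, ch))
            i += 1
    return "".join(out)
```

Under this rule `convert("wijdan'gha")` produces no hamza, because the apostrophe is followed by `g`; it only keeps `n` and `gh` from being read as `ng` plus `h`. An input such as `sa'et`, where a vowel follows the apostrophe, still gets one.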

jspitz · 2021-03-23T12:38:20Z

Maybe you can also compare your efforts with the mappings provided by ArabXeTeX. And then, we also need a LuaTeX implementation if we want to provide this.

neouyghur closed this as completed Mar 22, 2021

neouyghur reopened this Mar 22, 2021

ivankokan mentioned this issue Mar 23, 2021

Support Latin <-> Cyrillic transliteration and Latin digraphs for Serbian #483

Open
