HomeServerPro/cyrillic-transliteration

What is CyrTranslit?

A Python package for bi-directional transliteration of Cyrillic script text into Roman alphabet text and vice versa.

By default, it transliterates Serbian. Pass a language code to transliterate to and from Macedonian, Montenegrin, Tajik or Russian instead.

What is transliteration?

Transliteration is the conversion of a text from one script to another. For instance, a Roman alphabet transliteration of the Serbian phrase "Република Косово" is "Republika Kosovo".
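
As a rough illustration of the idea, the sketch below maps each Cyrillic character to a Roman one through a lookup table. The table CYR_TO_LAT and the function toy_to_latin are hypothetical names invented for this example and cover only the letters of the phrase above; they are not CyrTranslit's actual mapping or API.

```python
# Toy mapping covering only the letters needed for the example phrase.
# Hypothetical and incomplete; not CyrTranslit's real mapping table.
CYR_TO_LAT = {
    "Р": "R", "К": "K",
    "а": "a", "б": "b", "в": "v", "е": "e", "и": "i", "к": "k",
    "л": "l", "о": "o", "п": "p", "с": "s", "у": "u",
}


def toy_to_latin(text):
    """Replace each mapped character; leave everything else (spaces, digits) unchanged."""
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text)


print(toy_to_latin("Република Косово"))  # prints: Republika Kosovo
```

A one-to-one table like this is enough for this phrase, but some letters correspond to two Roman letters, so a complete mapping needs more than a character-by-character lookup.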

How do I install this?

CyrTranslit is hosted on the Python Package Index (PyPI), so it can be installed using pip:

python -m pip install cyrtranslit		# latest version
python -m pip install cyrtranslit==0.4	# specific version
python -m pip install "cyrtranslit>=0.4"	# minimum version (quoted so the shell does not treat > as a redirect)

What languages are supported?

CyrTranslit currently supports bi-directional transliteration of Montenegrin, Serbian, Macedonian, Tajik and Russian:

>>> import cyrtranslit
>>> cyrtranslit.supported()
['me', 'sr', 'mk', 'ru', 'tj']

How do I use this?

Serbian

>>> import cyrtranslit
>>> cyrtranslit.to_latin('Мој ховеркрафт је пун јегуља')
'Moj hoverkraft je pun jegulja'
>>> cyrtranslit.to_cyrillic('Moj hoverkraft je pun jegulja')
'Мој ховеркрафт је пун јегуља'

Macedonian

>>> import cyrtranslit
>>> cyrtranslit.to_latin('Моето летачко возило е полно со јагули', 'mk')
'Moeto letačko vozilo e polno so jaguli'
>>> cyrtranslit.to_cyrillic('Moeto letačko vozilo e polno so jaguli', 'mk')
'Моето летачко возило е полно со јагули'

Montenegrin

>>> import cyrtranslit
>>> cyrtranslit.to_latin('Република Косово', 'me')
'Republika Kosovo'
>>> cyrtranslit.to_cyrillic('Republika Kosovo', 'me')
'Република Косово'

Russian

>>> import cyrtranslit
>>> cyrtranslit.to_latin('Моё судно на воздушной подушке полно угрей', 'ru')
'Moyo sudno na vozdushnoj podushke polno ugrej'
>>> cyrtranslit.to_cyrillic('Moyo sudno na vozdushnoj podushke polno ugrej', 'ru')
'Моё судно на воздушной подушке полно угрей'

Tajik

>>> import cyrtranslit
>>> cyrtranslit.to_latin('Ман мактуб навишта истодам', 'tj')
'Man maktub navišta istodam'
>>> cyrtranslit.to_cyrillic('Man maktub navišta istodam', 'tj')
'Ман мактуб навишта истодам'
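
Each example above converts a phrase to the Roman alphabet and back again. A small helper along the following lines could check that property for any phrase. The name round_trips is an assumption made for this sketch and is not part of CyrTranslit; the two conversion functions are passed in as arguments so the helper does not depend on a particular version of the package.

```python
def round_trips(text, to_latin, to_cyrillic, lang_code):
    """Return True if converting text to Latin and back reproduces it exactly.

    to_latin and to_cyrillic are expected to accept (text, lang_code),
    like the calls shown in the examples above.
    """
    latin = to_latin(text, lang_code)
    return to_cyrillic(latin, lang_code) == text


# Intended use, assuming cyrtranslit is installed:
#   import cyrtranslit
#   round_trips('Ман мактуб навишта истодам',
#               cyrtranslit.to_latin, cyrtranslit.to_cyrillic, 'tj')
```

A round trip is not guaranteed for arbitrary input, for example when the text mixes scripts, so a False result is worth inspecting rather than treating as a bug by default.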

How can I contribute?

You can include support for other Cyrillic script alphabets. Follow these steps in order to do so:

  1. Create a new transliteration dictionary in the mapping.py file and reference it in the TRANSLIT_DICT dictionary.
  2. Watch out for cases where two consecutive Roman alphabet letters transliterate into a single Cyrillic script letter. These cases need to be explicitly checked for inside the to_cyrillic() function in __init__.py.
  3. Add test cases in tests.py.
  4. Update the documentation in the README.md and in the doc directory.
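
The two-letter case in step 2 can be sketched as follows. This is a minimal, self-contained illustration and not the code in __init__.py: the table LAT_TO_CYR and the function toy_to_cyrillic are hypothetical, cover only a handful of lowercase Serbian letters, and ignore capitalization.

```python
# Hypothetical, incomplete Latin-to-Cyrillic table for illustration only.
# Two-letter keys ("lj", "nj") stand for a single Cyrillic letter.
LAT_TO_CYR = {
    "a": "а", "e": "е", "g": "г", "j": "ј", "l": "л", "n": "н", "u": "у",
    "lj": "љ", "nj": "њ",
}


def toy_to_cyrillic(text, mapping=LAT_TO_CYR):
    """Convert text, trying a two-letter match before a single-letter one."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in mapping:
            # Two Roman letters map to one Cyrillic letter: consume both.
            out.append(mapping[pair])
            i += 2
        else:
            # Fall back to a single letter; unmapped characters pass through.
            out.append(mapping.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(toy_to_cyrillic("jegulja"))  # prints: јегуља
```

Checking the pair first matters: a purely letter-by-letter lookup would turn "lj" into "лј" instead of "љ".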

Consider contributing support for the following Cyrillic scripts:

  • Bulgarian
  • Ukrainian
