A Python package for bi-directional transliteration of Cyrillic script text into Roman alphabet text and vice versa.
By default, it transliterates Serbian. A language flag can be set to transliterate to and from Macedonian, Montenegrin, Tajik and Russian.
Transliteration is the conversion of a text from one script to another. For instance, a Roman alphabet transliteration of the Serbian phrase "Република Косово" is "Republika Kosovo".
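As a rough illustration of the idea (not CyrTranslit's actual implementation), transliteration can be viewed as a per-character lookup. The sketch below uses only the standard library; the name `to_latin_sketch` and the partial mapping table are hypothetical and cover just the letters in the example phrase.

```python
# Hypothetical, partial mapping: only the letters needed for this example.
# This is an illustration, not the table used by CyrTranslit.
CYRILLIC_TO_LATIN = str.maketrans({
    "Р": "R", "К": "K",
    "а": "a", "б": "b", "в": "v", "е": "e", "и": "i", "к": "k",
    "л": "l", "о": "o", "п": "p", "с": "s", "у": "u",
})


def to_latin_sketch(text):
    """Replace each mapped Cyrillic letter; leave all other characters unchanged."""
    return text.translate(CYRILLIC_TO_LATIN)


print(to_latin_sketch("Република Косово"))  # Republika Kosovo
```

A real mapping has to cover the full alphabet of each language, including letters that map to more than one Roman letter.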
CyrTranslit is hosted on the Python Package Index (PyPI), so it can be installed using pip:
python -m pip install cyrtranslit # latest version
python -m pip install cyrtranslit==0.4 # specific version
python -m pip install "cyrtranslit>=0.4" # minimum version (quoted so the shell does not treat > as a redirect)
CyrTranslit currently supports bi-directional transliteration of Montenegrin, Serbian, Macedonian, Tajik and Russian:
>>> import cyrtranslit
>>> cyrtranslit.supported()
['me', 'sr', 'mk', 'ru', 'tj']
>>> import cyrtranslit
>>> cyrtranslit.to_latin('Мој ховеркрафт је пун јегуља')
'Moj hoverkraft je pun jegulja'
>>> cyrtranslit.to_cyrillic('Moj hoverkraft je pun jegulja')
'Мој ховеркрафт је пун јегуља'
>>> import cyrtranslit
>>> cyrtranslit.to_latin('Моето летачко возило е полно со јагули', 'mk')
'Moeto letačko vozilo e polno so jaguli'
>>> cyrtranslit.to_cyrillic('Moeto letačko vozilo e polno so jaguli', 'mk')
'Моето летачко возило е полно со јагули'
>>> import cyrtranslit
>>> cyrtranslit.to_latin('Република Косово', 'me')
'Republika Kosovo'
>>> cyrtranslit.to_cyrillic('Republika Kosovo', 'me')
'Република Косово'
>>> import cyrtranslit
>>> cyrtranslit.to_latin('Моё судно на воздушной подушке полно угрей', 'ru')
'Moyo sudno na vozdushnoj podushke polno ugrej'
>>> cyrtranslit.to_cyrillic('Moyo sudno na vozdushnoj podushke polno ugrej', 'ru')
'Моё судно на воздушной подушке полно угрей'
>>> import cyrtranslit
>>> cyrtranslit.to_latin('Ман мактуб навишта истодам', 'tj')
'Man maktub navišta istodam'
>>> cyrtranslit.to_cyrillic('Man maktub navišta istodam', 'tj')
'Ман мактуб навишта истодам'
You can add support for other Cyrillic script alphabets. To do so:
- Create a new transliteration dictionary in the mapping.py file and reference it in the TRANSLIT_DICT dictionary.
- Watch out for cases where two consecutive Roman alphabet letters are meant to transliterate into a single Cyrillic script letter. These cases need to be explicitly checked for inside the to_cyrillic() function in __init__.py.
- Add test cases in tests.py.
- Update the documentation in the README.md and in the doc directory.
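The two-letter case mentioned in the steps above can be sketched as follows. This is a simplified, standalone illustration under stated assumptions: `EXAMPLE_CYR_TO_LAT`, `EXAMPLE_LAT_TO_CYR` and `to_cyrillic_sketch` are hypothetical names, the table is deliberately partial, and the greedy longest-match strategy is one possible approach rather than the logic actually used in `to_cyrillic()`.

```python
# Hypothetical, partial mapping in the spirit of an entry in mapping.py.
# The real dictionaries in the package may be structured differently.
EXAMPLE_CYR_TO_LAT = {
    "Љ": "Lj", "љ": "lj",
    "Њ": "Nj", "њ": "nj",
    "Л": "L", "л": "l",
    "Н": "N", "н": "n",
    "Ј": "J", "ј": "j",
    "а": "a", "е": "e", "г": "g", "у": "u",
}

# Reverse table: Roman sequence (one or two letters) -> single Cyrillic letter.
EXAMPLE_LAT_TO_CYR = {lat: cyr for cyr, lat in EXAMPLE_CYR_TO_LAT.items()}


def to_cyrillic_sketch(text, table=EXAMPLE_LAT_TO_CYR):
    """Greedy longest-match conversion, so 'lj' becomes 'љ' rather than 'лј'."""
    longest = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then fall back to shorter ones.
        for size in range(longest, 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            # No mapping found: keep the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(to_cyrillic_sketch("jegulja"))
```

Greedy matching is not always right: some words contain the two letters as separate sounds, which is why such cases need explicit checks and dedicated test cases.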
Consider contributing support for the following Cyrillic scripts:
- Bulgarian
- Ukrainian