This repository has been archived by the owner on Nov 10, 2024. It is now read-only.

No difference between ه [h] and ح [h] in dictionary #3

Open
1 task
Jargonautika opened this issue Sep 28, 2022 · 2 comments

Comments

@Jargonautika

Care has clearly been taken to distinguish the two pronunciations of ه (hā-ye do-češm): /h/ (word-initially and -medially) and /e/ (word-finally), as in:

  • ماهه m A h e

Examples like the one above make it clear when the grapheme should be pronounced as [h] and when as [e]. However, the dictionary does not seem to distinguish anywhere between the [h] pronunciation of ه (hā-ye do-češm) and the [h] pronunciation of ح (ḥâ-ye ḥotti / ḥâ-ye jimi). Consider the following examples:

  • [1] حادثه h A d e s e
  • [2] حوزه h o z e
  • [3] صحیح s a h i h
  • [4] آنها A n h A

If this dictionary were used in reverse, [1] could be reconstituted from "h A d e s e" as either "هادِثه" or "حادِثه". This is certainly a niche issue, but I am trying to diacritize non-diacritized text, so to reconstitute my original text with vowels included under your dictionary's scheme, I need to know which Farsi character to convert back to. There are no instances in either dictionary where ḥâ-ye jimi and hā-ye do-češm (pronounced as [h]) appear in the same word, so a simple string replace should be enough. I suggest replacing [h] with [H] for ḥâ-ye jimi to make this dictionary reversible.

It may well be that there are no words in Farsi which contain both ح and ه, but this would solve the edge use-case I describe here.
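To make the suggestion concrete, here is a minimal sketch of that string replace. It assumes each dictionary entry is available as a word plus a space-separated phoneme string, as in the examples above; the function name `mark_heh_jimi` and the safety check are my own illustration, not something that exists in the repository.

```python
# Hypothetical helper: rewrite [h] as [H] when it comes from ḥâ-ye jimi.
# Assumption: an entry is a (word, phonemes) pair with space-separated
# phoneme symbols, matching the examples quoted in this issue.

HEH_JIMI = "\u062d"  # the letter ح


def mark_heh_jimi(word: str, phonemes: str) -> str:
    """Return the phoneme string with h -> H if every h comes from ح."""
    tokens = phonemes.split()
    jimi_count = word.count(HEH_JIMI)

    # No ح in the word: any h must come from another letter, leave it alone.
    if jimi_count == 0:
        return phonemes

    # If the number of h phonemes does not match the number of ح letters,
    # some h may come from ه, so the entry is ambiguous and is left as-is.
    if tokens.count("h") != jimi_count:
        return phonemes

    return " ".join("H" if token == "h" else token for token in tokens)


for word, phonemes in [
    ("حادثه", "h A d e s e"),
    ("حوزه", "h o z e"),
    ("صحیح", "s a h i h"),
    ("آنها", "A n h A"),
]:
    print(word, mark_heh_jimi(word, phonemes))
```

The count check is only a guard for the case where a word contains both ح and an ه pronounced as [h]; such entries are left unchanged so they can be reviewed by hand.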

Thanks for your work!

@Jargonautika
Author

After using this more, I found that the same is true of a number of other pairs:

  • س and ص
  • د and ض

etc. It's just not reversible.

@b00f
Contributor

b00f commented Oct 2, 2022

@Jargonautika

If this dictionary were to be used in its reverse form

We use this dictionary to predict the pronunciation of a given word. If I am not mistaken, you are looking for the reverse function. Persian has many homophones: words with different meanings but the same pronunciation. For example:

  • h a y A t
    can be written in these forms:
    حیاط (courtyard)
    حیات (life)
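As a rough illustration of why the reverse direction is one-to-many, the sketch below inverts a few entries and lists the pronunciations that map to more than one spelling. The entry format (word plus space-separated phonemes) and the function name `build_reverse_index` are assumptions made for this example only.

```python
from collections import defaultdict

# Assumption: entries are (word, phonemes) pairs; the three below are
# taken from the examples quoted in this thread.
ENTRIES = [
    ("حیاط", "h a y A t"),  # courtyard
    ("حیات", "h a y A t"),  # life
    ("آنها", "A n h A"),
]


def build_reverse_index(entries):
    """Map each pronunciation to every spelling that produces it."""
    index = defaultdict(list)
    for word, phonemes in entries:
        index[phonemes].append(word)
    return dict(index)


reverse = build_reverse_index(ENTRIES)

# Pronunciations with more than one spelling cannot be reversed uniquely.
ambiguous = {p: words for p, words in reverse.items() if len(words) > 1}

for phonemes, words in ambiguous.items():
    print(phonemes, "->", words)
```

Both spellings here begin with ح, so marking that letter with a separate symbol would not separate them; the difference is in the final ط versus ت.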

In this repository we focused on pronunciation. However, there is another project that tries to latinize Persian: Alefbaye 2om. You may want to check it out as well.

Be happy and healthy!

2 participants