From fc1d29ea9b82066763ce76442f0b57d92e1d97a9 Mon Sep 17 00:00:00 2001
From: Jack Rueter
Date: Sat, 16 Nov 2024 13:25:52 +0200
Subject: [PATCH] Modifier Letter Apostrophe was missing

Add additional letter from dialect transcriptions.
---
 tools/tokenisers/tokeniser-disamb-gt-desc.pmscript | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/tokenisers/tokeniser-disamb-gt-desc.pmscript b/tools/tokenisers/tokeniser-disamb-gt-desc.pmscript
index 210593ce..4ffe05f5 100644
--- a/tools/tokenisers/tokeniser-disamb-gt-desc.pmscript
+++ b/tools/tokenisers/tokeniser-disamb-gt-desc.pmscript
@@ -66,6 +66,7 @@ Define alphabet
 "a-z" !! * lower-case ASCII
 |"A-Z" !! * upper-case ASCII
 |Lst({àáâãāăȧäảåǎȁȃąạḁæǽǣèéêẽēĕėëẻěȅȇẹȩęḙḛìíîĩīīĭi̇ïỉǐịįȉȋḭɨòóôõōŏȯöỏőǒȍȏơǫọɵøờớỡởợǭộǿœùúûũūŭüủůűǔȕȗưụṳųṷṵừứữửựʉỳýŷỹȳẏÿỷƴỵɏÀÁÂÃĀĂȦÄẢÅǍȀȂĄẠḀÆǼǢÈÉÊẼĒĔĖËẺĚȄȆẸȨĘḘḚÌÍÎĨĪĪĬİÏỈǏỊĮȈȊḬƗÒÓÔÕŌŎȮÖỎŐǑȌȎƠǪỌƟØỜỚỠỞỢǬỘǾŒÙÚÛŨŪŬÜỦŮŰǓȔȖƯỤṲŲṶṴỪỨỮỬỰɄỲÝŶỸȲẎŸỶƳỴɎšžčđðíŋňŧñńŠŽČĐÐÍŊŇŦÑ}) !! * select extended latin symbols
+ | Lst({ʼĺšń·e͔i͔t́śźлü‿₍š́āžƞǵñv́h́źēīūōǟd́})
 | "0-9" !! ASCII digits
 | Lst({_§°†}) !! * select symbols
 !! * Combining diacritics as individual symbols,