
Demo that the Plugin API is suitable to make something like word-o-mat? #2000

Open
davelab6 opened this issue Jan 30, 2025 · 4 comments

@davelab6 (Member) commented Jan 30, 2025

While chatting with @JeremieHornus and @chrissimpkins about #1998 and what ought to be considered part of a "v1.0" release of any font editor, I championed the inclusion of a plugin/extension API, since I believe GlyphsApp and RoboFont both had one at their v1.0 releases (and indeed that was kind of the whole point of RoboFont existing :)

I was later chatting with Chris about the uses of LLMs in type design and development, and he was talking about how effective they are at developing what I call 'sample texts' or 'proofing strings'. Chris's idea was inspired by a post on TypeDrawers where someone was looking for "African language" words that include the exemplars [x], [y], [z], in a number of African languages. Chris had prompted Gemini with that request for Hausa, asking for one word per exemplar in a provided list, and got back ten words that each included one of the exemplars. I immediately thought "that would make a great Fontra plugin" :)

So I'd like to propose that the Fontra extension APIs be implemented and documented to a level where someone on the Fontra team can make a nice little video tutorial showing how to replicate something like https://github.com/ninastoessinger/word-o-mat as a Fontra plugin.

Whether or not this falls within the actual #1998 v1.0 scope is a different question, but I thought I'd share this idea as a practical litmus test for the readiness of the Fontra Plugin API :)

@justvanrossum (Collaborator)

I believe Simon's work on #1952 will additionally bring externally hosted extensions/plugins closer to reality.

@BlackFoundry (Collaborator)

As far as I remember, plugin development had to be made simpler for offline/local testing before plugins can be hosted externally.

@davelab6 (Member, Author) commented Jan 31, 2025 via email

@tallpauley

@davelab6 I'm interested in contributing to a word-o-mat sort of plug-in. I built WordSiv, which is very alpha at this point and has poor language support, but I have a vision for it to eventually become a sort of DSL for dynamic proofs (or dynamic snippets, with varying levels of detail in word requirements). Please poke holes in it if you're interested!

I know there are a lot of cases I'm missing (at the very least Turkish dotless-i capitalization and proper grapheme support) and I'll tackle these, and many, many more, as they come up while adding language support. There are probably also some high-level API issues: my glyphs argument doesn't make a lot of sense, and it should maybe be something like unicodes (since glyphs only come into play after shaping?). I even wondered about borrowing some shaping code from @simoncozens via rustybuzz, so you could pick only words that can actually be displayed.
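Roughly what I mean by filtering on codepoints rather than glyph names, as a very rough sketch (the names here are made up for illustration, not WordSiv's actual API):

```python
# Hypothetical sketch, not WordSiv's real API: filter a wordlist to words whose
# codepoints are all covered by the font, i.e. work in terms of unicodes and
# leave glyph selection to shaping.
def coverable_words(wordlist, supported_codepoints):
    """Return the words that can be written using only the supported codepoints."""
    return [
        word for word in wordlist
        if all(ord(char) in supported_codepoints for char in word)
    ]

# Example: a tiny stand-in for a font's cmap.
cmap = {ord(c) for c in "abcdefghijklmnopqrstuvwxyzä"}
print(coverable_words(["hand", "tänka", "naïve"], cmap))  # -> ['hand', 'tänka']
```

Checking actual shaped output (via rustybuzz or similar) would replace that naive per-character check with "does this word shape without .notdef".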

I agree LLMs are great for sample text, though I'm curious if you've messed around with trying to constrain their outputs to specific glyphs. o1, at least, really struggled with it, and I don't see any reason why an existing LLM would succeed at this task, given that even a whole novel like Gadsby only excludes "e". I suppose if the LLM is big enough one could constrain the output layer, but I have my doubts that it's worth the cost?
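By "constrain the output layer" I mean something like masking the sampling step to tokens that only use allowed characters. A toy, model-free sketch of the idea (all names assumed, no real LLM involved):

```python
import math
import random

# Toy sketch of constrained decoding: before sampling the next token, drop every
# vocabulary item that contains a character outside the allowed set. A real
# implementation would apply this mask to a model's logits at each decoding step.
def masked_sample(logits, allowed_chars):
    scores = {
        token: logit for token, logit in logits.items()
        if set(token) <= allowed_chars
    }
    if not scores:
        return None  # nothing in the vocabulary fits the glyph set
    weights = [math.exp(s) for s in scores.values()]
    return random.choices(list(scores), weights=weights, k=1)[0]

logits = {"the": 1.2, "cat": 0.9, "sat": 0.8, "tree": 0.5, "dog": 0.4}
print(masked_sample(logits, allowed_chars=set("acenrst")))  # never "the" or "dog"
```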

My approach is VERY unsophisticated, with zero-context word probabilities, but I think it's useful, and I'm starting to think even adding just a tiny bit of "sentence shape" data could massively improve results. One idea is training a Markov chain, but only storing actual words for the most common vocabulary (and punctuation?), and transitioning to something like word length and capitalization as we get into less common words. Not as an optimization, but to allow for highly useful words that match our glyph set, not just word combos that happen to appear in the training data.
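A rough sketch of that hybrid (made-up names, not anything that exists in WordSiv today): literal tokens for the most frequent words, and a shape token (length plus capitalization) for everything else.

```python
from collections import Counter, defaultdict
import random

def shape(word):
    """Reduce a less common word to a shape token, e.g. 'Fontra' -> 'Cap6'."""
    prefix = "Cap" if word[:1].isupper() else "low"
    return f"{prefix}{len(word)}"

def build_chain(sentences, top_n=1000):
    """Bigram Markov chain over tokens: literal words if frequent, shapes otherwise."""
    counts = Counter(word for sentence in sentences for word in sentence)
    common = {word for word, _ in counts.most_common(top_n)}
    tokenize = lambda w: w if w in common else shape(w)
    chain = defaultdict(Counter)
    for sentence in sentences:
        tokens = [tokenize(w) for w in sentence]
        for a, b in zip(tokens, tokens[1:]):
            chain[a][b] += 1
    return chain

def next_token(chain, current):
    """Sample the next token (a literal word or a shape slot) from the chain."""
    followers = chain.get(current)
    if not followers:
        return None
    return random.choices(list(followers), weights=list(followers.values()), k=1)[0]
```

At generation time, a shape slot like low7 would get filled with any 7-letter lowercase word that fits the glyph set, which is where the "highly useful words" would come from.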

I guess there's still the question of whether proofing text should make any sense. My current thought is that it doesn't need to, but we should be seeing the most common patterns in letter, word, and sentence shapes. Of course there's also the goal of seeing as many permutations of letter pairings or trigrams as we can jam in (but that's kind of another type of proofing).
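For that second type of proofing, even something as dumb as a greedy cover over bigrams goes a long way. A sketch (illustrative only):

```python
# Greedily pick words that add the most unseen letter pairings, so a short
# proof covers as many bigrams as possible.
def bigrams(word):
    return {word[i:i + 2] for i in range(len(word) - 1)}

def greedy_bigram_cover(wordlist, max_words=50):
    chosen, seen = [], set()
    pool = list(wordlist)
    for _ in range(max_words):
        best = max(pool, key=lambda w: len(bigrams(w) - seen), default=None)
        if best is None or not (bigrams(best) - seen):
            break  # no remaining word adds a new bigram
        chosen.append(best)
        seen |= bigrams(best)
        pool.remove(best)
    return chosen

print(greedy_bigram_cover(["handgloves", "hamburgefonstiv", "minimum", "vanilla"]))
```

The same idea extends to trigrams by widening the window.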
