Add support for Pandoc style markdown #22
Comments
Definitely an interesting idea. I'm a huge fan of pandoc so I understand why it would be so enticing. Would you be open to sharing how you implemented it locally so I can check it out? If it doesn't complicate the project too much I'm open to discussing how we could integrate alternative parsers.
Was quite simple to implement.

    import pandoc


    def is_math_class(tag: Tag) -> bool:
        """Check if an HTML tag is a math oriented tag generated by pandoc"""
        try:
            return "math" in tag["class"]
        except KeyError:
            return False


    def parse_markdown(
        file: str, deck_title_prefix: str, generate_cloze_model: bool
    ) -> Deck:
        """Parse a markdown string to an anki deck."""
        metadata, markdown_string = frontmatter.parse(read_file(file))
        doc = pandoc.read(markdown_string)
        html = pandoc.write(doc, format="html", options=["--mathjax"])
        soup = BeautifulSoup(html, "html.parser")
        # Find all the math tags using filter
        math_tags = soup.find_all(is_math_class)
        for tag in math_tags:
            tag.unwrap()  # Done for cleaner html in Anki
        # ... The rest of the script is identical
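For readers unfamiliar with the unwrap step above: with `--mathjax`, pandoc wraps math in `<span class="math inline">` or `<span class="math display">` elements, and unwrapping removes the span while keeping its content. The following is only a rough stdlib sketch of that effect, using a regular expression as a stand-in for BeautifulSoup's `unwrap()`; the function name `unwrap_math_spans` is hypothetical and not part of the project.

```python
import re

# Matches the wrappers pandoc emits around math when --mathjax is used,
# e.g. <span class="math inline">\(x\)</span>. This assumes the math
# content itself contains no nested <span> elements.
MATH_SPAN = re.compile(
    r'<span class="math (?:inline|display)">(.*?)</span>', re.DOTALL
)


def unwrap_math_spans(html: str) -> str:
    """Remove the math <span> wrappers but keep the delimited math inside."""
    # A function is used as the replacement so backslashes in the math
    # are copied verbatim instead of being read as regex escapes.
    return MATH_SPAN.sub(lambda match: match.group(1), html)


sample = r'<p>Euler: <span class="math inline">\(e^{i\pi} + 1 = 0\)</span></p>'
print(unwrap_math_spans(sample))
# prints: <p>Euler: \(e^{i\pi} + 1 = 0\)</p>
```

A real HTML parser, as in the snippet above, is the more robust choice; the regex version is only meant to show what ends up in the Anki card.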
Interesting. On the one hand, we could make this a command-line flag. I would probably call it [...]. As an example, we support multiline questions, which I believe rely on python-markdown-specific syntax. So while I love how simple this is, it requires a little bit of thought and care before we can go ahead and add it.
Maybe we can add an option to receive the pandoc AST (output in JSON format) from stdin and operate on it. That would allow us to convert any input format, as long as pandoc supports it. It would also allow us to add custom pandoc filters. I'm trying to make such a change.
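As a rough sketch of what operating on the JSON AST could look like, assuming the usual pandoc JSON layout in which every element is an object with a `"t"` (type) key and an optional `"c"` (content) key: the walker below collects all math elements from a document. The function name `iter_math` and the inline sample document are illustrative only, not part of the project.

```python
import json
from typing import Any, Iterator, Tuple


def iter_math(node: Any) -> Iterator[Tuple[str, str]]:
    """Yield (math_type, tex) for every Math element found under node."""
    if isinstance(node, dict):
        if node.get("t") == "Math":
            # Content of a Math element is [math type object, TeX source].
            math_type, tex = node["c"]
            yield math_type["t"], tex
        else:
            for value in node.values():
                yield from iter_math(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_math(item)


# A hand-written stand-in for the JSON AST of the text "Let $x^2$".
sample = json.loads(
    '{"pandoc-api-version": [1, 23], "meta": {}, "blocks": ['
    '{"t": "Para", "c": [{"t": "Str", "c": "Let"}, {"t": "Space"},'
    ' {"t": "Math", "c": [{"t": "InlineMath"}, "x^2"]}]}]}'
)

for kind, tex in iter_math(sample["blocks"]):
    print(kind, tex)
# prints: InlineMath x^2
```

In the stdin scenario described above, `sample` would instead come from `json.load(sys.stdin)`, with the input produced by something like `pandoc notes.md -t json`.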
Given that we have multiple people interested in this I'll try to add support for this fairly soon.
I write all my notes using Pandoc, a powerful document converter. Using Pandoc's Python package would let us solve the backslash-escaping problem that makes math syntax clunky, and use single and double dollar signs for math instead.
There would also be a nice option to use Pandoc-style divs:
Turns into:
I've implemented this change locally and haven't had any problems. There are some cons:
There are many pros, however. It means .md files which weren't written for Anki conversion need fewer changes. In fact, we would be writing pure Pandoc markdown, which gets converted via an abstract syntax tree to HTML. Just a thought.