Output as Text #26

Open
mike-ward opened this issue Feb 15, 2015 · 14 comments

Comments

@mike-ward

I'm back with my hat in hand :)

Plain text output please, with hard line breaks at a set column.

Currently I do this with pandoc

pandoc -f markdown -t markdown --columns=80 document.md > newdocument.md

Then I could build a reformatter into Markdown Edit.
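As a rough illustration of the request (hard line breaks at a set column), here is a minimal Python sketch. It is not how pandoc or CommonMark.NET does it. The names `hard_wrap`, `is_plain_paragraph`, and `NON_PARAGRAPH_PREFIXES` are made up for this example, and it assumes blocks are separated by blank lines. It only reflows blocks that look like plain paragraphs and leaves everything else untouched.

```python
import textwrap

# Line prefixes that suggest a block is not a plain paragraph
# (heading, quote, list, setext underline, table, fence, indented code).
NON_PARAGRAPH_PREFIXES = ("#", ">", "-", "*", "+", "=", "|", "```", "~~~", "    ", "\t")


def is_plain_paragraph(block: str) -> bool:
    """Conservative check: True only if no line looks like other Markdown syntax."""
    lines = block.splitlines()
    if not lines:
        return False
    return not any(
        line.startswith(NON_PARAGRAPH_PREFIXES) or line.lstrip()[:1].isdigit()
        for line in lines
    )


def hard_wrap(text: str, columns: int = 80) -> str:
    """Re-wrap plain paragraphs at `columns`; pass every other block through as-is."""
    blocks = text.split("\n\n")
    wrapped = [
        textwrap.fill(block, width=columns, break_long_words=False, break_on_hyphens=False)
        if is_plain_paragraph(block)
        else block
        for block in blocks
    ]
    return "\n\n".join(wrapped)
```

This text-level approach loses hard line breaks written as two trailing spaces and can mangle fenced code that contains blank lines, which is one reason to do the wrapping in a writer that works from the parsed AST instead.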

@Knagis Knagis mentioned this issue Feb 16, 2015
@Knagis
Owner

Knagis commented Feb 16, 2015

To clarify - you are not looking for plain text output but rather markdown output, preserving all the little details?

One problem that I see right away - inline links and reference links - the details of how they were entered are not preserved. So the formatter has to choose the resulting formatting - but what should it be? For example, are the references added after the paragraph or at the end of the document? Does it use numeric reference labels or create some magic strings?
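To make one of these options concrete, here is a small Python sketch of the "numeric reference labels" choice. `ReferenceCollector` is a hypothetical helper, not part of CommonMark.NET. It ignores link titles and label escaping, and whether the definitions go after each paragraph or at the end of the document depends only on when the caller asks for them.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReferenceCollector:
    """Assigns numeric labels to URLs and collects their definitions."""

    labels: Dict[str, int] = field(default_factory=dict)

    def link(self, text: str, url: str) -> str:
        # Reuse the existing label when the same URL appears again.
        label = self.labels.setdefault(url, len(self.labels) + 1)
        return f"[{text}][{label}]"

    def definitions(self) -> str:
        # Emit definitions in the order the URLs were first seen.
        return "\n".join(f"[{label}]: {url}" for url, label in self.labels.items())
```

Calling `definitions()` once after the last block gives end-of-document references. Using a fresh collector per paragraph would place them after each paragraph, at the cost of restarting the numbering.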

@mike-ward
Author

Correct, I want to preserve the Markdown, although some interpretation, as you point out, will occur. I don't use references so I can't speak to the issue. Pandoc reformats # headers into === and --- headers. I imagine there are some other things it does like that, but nothing I have ever been bothered by. Hard line breaks make reading text from the console a bit nicer.

@Knagis
Owner

Knagis commented Feb 16, 2015

Header texts can be left alone - the AST records which format each header was parsed from. I would avoid converting them between formats.

For now reference links seem to be the only thing that loses significant information during parsing.

@Knagis
Owner

Knagis commented Mar 5, 2015

I have started work on this in the md-writer branch (https://github.com/Knagis/CommonMark.NET/tree/md-writer). Perhaps you could try it out and list the issues you can find.

Some I know already:

  • hard-wrapping of lines is not yet implemented
  • the blockquote prefix > is not rendered on blank lines within the quote
  • links and images lose their title attribute

@mike-ward
Author

ooooh, new shiny to play with this weekend. Thanks!


@mike-ward
Author

Hey, this is looking nice. Here's something I noticed

Header Text
=========

is formatted as

Header Text
===

I think, in the spirit of "keeping it pretty", we should either keep the number of underline characters from the original document or, better still, match the number of underline characters to the length of the text.

The parser correctly preserves # Header elements. +1

In ordered and unordered lists, the original indent is not preserved

  1. line 1
  2. line 2

is rendered as

1. line 1
2. line 2

This is not a big issue. However, when I write markdown I usually indent these items a couple of spaces to simulate how they appear in HTML.

If preserving the original indentation is difficult, I would settle for a configuration option.

Same thing goes for block quotes. Source indentation is not preserved.

Finally, the following:

  1. hello
  2. there

  > block
  > quote

renders as:

1. hello
2. there
> block quote

Should there be a blank line b/t the numbered list and block quote?

That's all for now. Great job!

@mike-ward
Author

Horizontal breaks present some interesting challenges.

----------------

is currently output as

---

Couple of options here.

  1. Keep current behavior. It's correct but not stylish
  2. Preserve the original number of dashes
  3. Pad with dashes to the column length
  4. Configurable dash length

Not sure what I want here.

@Knagis
Owner

Knagis commented Mar 7, 2015

header texts

I will change this to match the number of characters in the previous line
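A minimal Python sketch of that rule, for illustration only: `setext_heading` is a hypothetical helper, not the md-writer code, and it measures the heading with `len()`, which counts code points rather than display width.

```python
def setext_heading(text: str, level: int) -> str:
    """Render an underlined (setext) heading whose underline matches the text length."""
    if level not in (1, 2):
        raise ValueError("setext headings exist only for levels 1 and 2")
    underline_char = "=" if level == 1 else "-"
    # At least one underline character is needed for the heading to be recognized.
    return f"{text}\n{underline_char * max(len(text), 1)}"
```

Text containing wide or combining characters would need a display-width calculation for the underline to line up visually.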

In ordered and unordered lists, the original indent is not preserved

I think that the parser preserves this information so the list could be indented based on the first item.

block quote after list

That is a bug coming from the fact that the list is tight, so its paragraphs do not render the trailing newline. This should be easy to fix.
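One way to picture the fix, as a Python sketch under assumptions: `render_tight_list` and `join_blocks` are hypothetical helpers, and the idea is that tightness only controls spacing between items inside the list, while sibling blocks are always separated by a blank line.

```python
from typing import List


def render_tight_list(items: List[str]) -> str:
    """Tight list: items are separated by a single newline, with no blank lines."""
    return "\n".join(f"{n}. {item}" for n, item in enumerate(items, start=1))


def join_blocks(blocks: List[str]) -> str:
    """Sibling blocks are always separated by one blank line, tight list or not."""
    return "\n\n".join(block.strip("\n") for block in blocks) + "\n"


document = join_blocks([render_tight_list(["hello", "there"]), "> block\n> quote"])
```

Here `document` keeps the list tight but still puts a blank line before the block quote.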

horizontal breaks

I suppose that the only reasonable solution is to add a configuration option for how many dashes to output. Keeping the count from the source goes against the idea of reformatting the document according to a single style.
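Such an option could look roughly like the following Python sketch. `FormatOptions` and `thematic_break_length` are invented names for illustration, not settings that exist in CommonMark.NET. The lower bound of three comes from CommonMark, which requires at least three characters for a horizontal rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatOptions:
    """Hypothetical writer settings; only the horizontal-rule length is modelled."""

    thematic_break_length: int = 3

    def __post_init__(self) -> None:
        # Fewer than three dashes would not be parsed as a horizontal rule.
        if self.thematic_break_length < 3:
            raise ValueError("a horizontal rule needs at least three dashes")


def render_thematic_break(options: FormatOptions) -> str:
    """Every rule in the output gets the same configured length."""
    return "-" * options.thematic_break_length
```

Padding to the wrap column (option 3 in the list of choices) would then just mean setting this value to the column width.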

Thank you for the feedback!

@mike-ward
Author

I agree on all your points.

@mike-ward
Author

One other thought. You might want to include a configuration option to always force H1/H2 headers to be underlined. Or, if you're really crazy, a 3-way setting:

  1. always underlined
  2. always #
  3. keep document format
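The three choices map naturally onto an enum. Below is a Python sketch under assumptions: `HeadingStyle` and `render_heading` are hypothetical names, and `source_was_setext` stands in for whatever the AST records about how the heading was originally written.

```python
from enum import Enum


class HeadingStyle(Enum):
    SETEXT = "always underlined"
    ATX = "always #"
    PRESERVE = "keep document format"


def render_heading(text: str, level: int, source_was_setext: bool, style: HeadingStyle) -> str:
    """Render a heading according to the configured style."""
    if style is HeadingStyle.PRESERVE:
        use_setext = source_was_setext
    else:
        use_setext = style is HeadingStyle.SETEXT
    # Underlined headings exist only for H1 and H2; deeper levels fall back to '#'.
    if use_setext and level <= 2:
        underline_char = "=" if level == 1 else "-"
        return f"{text}\n{underline_char * len(text)}"
    return f"{'#' * level} {text}"
```

With `HeadingStyle.SETEXT`, an H3 still comes out as `### Title`, since there is no underlined form for it.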

@mike-ward
Author

I'm looking for a new feature to add to Markdown Edit this weekend. Hint, hint, hint 😄

@Knagis
Owner

Knagis commented Apr 10, 2015

Unfortunately I had no time to work on this so it is still in the state you last reviewed...

@kieranbenton

Any movement on this guys? Is it still living in the branch? Any idea what kind of work is outstanding and if I can contribute to get it out and into a release? Cheers.

@jazzdelightsme

Just FYI, I am interested in a markdown parser that can do full-fidelity "round trip", as well as normalization on top of that. The work described in this thread seems aimed at normalization, but being able to just do round-trip (emitting exactly the same text as the original input) is also useful when you want to do programmatic transformations without mucking about with the existing style.
