Solve Ledger file reading #1962

Open
16 of 28 tasks
simonmichael opened this issue Dec 16, 2022 · 5 comments
Labels
A-BUG Something wrong, confusing or sub-standard in the software, docs, or user experience. docs Documentation-related. journal The journal file format, and its features. ledger-compat

Comments

@simonmichael
Owner

simonmichael commented Dec 16, 2022

Capturing this from chat:

Despite all the work on supporting Ledger syntax, the vast majority of Ledger users who try to read their file with hledger fail and move on. That includes the folks who don't use value expressions, and the folks who have been told the ledger print | hledger -f- ... trick.

You'd think that last one would work: ledger print is always valid h/ledger syntax, right? I think now there are two main snags:

needing to set LANG, because everyone has non-ASCII text and Haskell programs throw up their hands if they see that without a proper LANG

needing to add commodity directives or -c options, because ledger print adds decimal zeros forcing hledger to check transaction-balancedness more precisely than ledger does

definitely time we had a better strategy here
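
A minimal sketch of the ledger print workaround with both snags handled, written in Python for illustration. It assumes `ledger` and `hledger` are on PATH, that a UTF-8 locale called `C.UTF-8` exists (some systems need `en_US.UTF-8` instead), and that `$1000.00` is a suitable commodity style for the file. The helper names are hypothetical.

```python
import os
import subprocess

def utf8_env(locale_name="C.UTF-8"):
    """Copy the current environment with LANG set to a UTF-8 locale (assumed to exist)."""
    env = dict(os.environ)
    env["LANG"] = locale_name
    return env

def hledger_args(command="balance", styles=("$1000.00",)):
    """Build an hledger command line that reads the journal from stdin."""
    args = ["hledger", command, "-f-"]
    for style in styles:
        # -c sets a commodity's display style, including its number of decimal places
        args += ["-c", style]
    return args

def ledger_to_hledger(ledger_file, command="balance", styles=("$1000.00",)):
    """Equivalent of: ledger -f FILE print | hledger COMMAND -f- -c STYLE"""
    env = utf8_env()
    printed = subprocess.run(["ledger", "-f", ledger_file, "print"], env=env,
                             capture_output=True, encoding="utf-8", check=True)
    result = subprocess.run(hledger_args(command, styles), input=printed.stdout,
                            env=env, capture_output=True, encoding="utf-8", check=True)
    return result.stdout
```

Whether a given style is enough to make every transaction balance depends on the file, so treat the `-c` value as something to adjust per commodity.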

Wishes

  • Users can easily find out the requirements and workarounds for making a Ledger file hledger-readable
  • Support providers can easily, with low effort from the user, estimate the (in)compatibility level of user's Ledger file
  • When hledger fails to read a Ledger file, the reason is clear. Ideally it detects a Ledger file and gives a custom message for this scenario, not just the usual parse errors.
  • Causes of incompatibility, and any workarounds, are collected and documented in one place.
  • hledger reads all the Ledger syntax features that correspond to our data model
  • hledger ignores all other Ledger syntax features that can be ignored
  • the syntax features it doesn't read or ignore are few in number and clearly documented
  • (Or if this complicates hledger too much, there is a separate ledger2hledger tool for it.)
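
As one possible starting point for the "estimate the (in)compatibility level" wish, here is a rough Python line scanner. It counts posting lines that look like amount expressions or lot notation, the two most frequent read failures reported later in this thread. The regular expressions are heuristic assumptions for illustration, not hledger's or Ledger's grammar, and the names are hypothetical.

```python
import re
from collections import Counter

# Heuristic patterns (assumptions for illustration, not a real parser).
PATTERNS = {
    # something in braces on a posting line, e.g. "10 AAPL {$50.00}"
    "lot notation": re.compile(r"\{[^}]*\}"),
    # an amount field that starts with "(", e.g. "Assets:Cash  ($10 * 2)"
    "amount expression": re.compile(r"^\s+\S.*?(?:\s{2,}|\t)\("),
}

def scan_ledger_text(text):
    """Count indented (posting-like) lines matching each suspicious pattern."""
    counts = Counter()
    for line in text.splitlines():
        if not line[:1].isspace():
            continue  # only look at indented lines
        body = line.split(";", 1)[0]  # drop any trailing comment
        for name, pattern in PATTERNS.items():
            if pattern.search(body):
                counts[name] += 1
    return counts
```

A user could run this on their file and paste only the counts, without sharing the journal itself.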

Actions

  • Test ledger file reading more
    • gather sources of ledger files
    • test manually
    • set up test automation
    • characterise issues
    • (gather clean examples/tests)
    • (set up some easy example contribution hub or workflow, like a pastebin or chatbot or command)
  • Improve https://hledger.org/ledger.html
    • better presentation of journal format differences
    • show support status of each feature
    • list common incompatibilities and workarounds
  • Improve Ledger file parsing
    • identify features which are supported, are ignored, should be supported, should be ignored, should be rejected
    • clarify enhancement priorities
      • support/ignore more directives (with warnings?)
      • improve performance on sample file collection
      • review & improve error messages
      • more lot notation support (ledger & beancount)
      • possibly revive amount expressions PR
    • collect test specimens
    • clean up parsers as needed
    • ignore all features which should be ignored
    • reject all features which should be rejected
    • support all features which should be supported
    • decide/implement local-precision balancing
    • design and implement some kind of Ledger file detection
    • implement desired UX, custom messages
    • as part of our tests, handle properly all files from: ledger tests, ledger2beancount tests, collected examples
  • Improve locale handling
    • catalogue common locale-related startup exceptions/messages
    • review/consolidate IO paths triggering failure
    • implement graceful failure, catching exceptions
    • review/improve tests
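
The "set up test automation" and "characterise issues" items could begin with something like this Python sketch. It assumes that `hledger -f FILE print` exits non-zero and writes its message to stderr when a file cannot be read; the function names are hypothetical, and the reader is injectable so the survey logic can be tried without hledger installed.

```python
import subprocess

def try_read(path):
    """Return (ok, stderr) from asking hledger to parse and print one file."""
    proc = subprocess.run(["hledger", "-f", str(path), "print"],
                          capture_output=True, encoding="utf-8")
    return proc.returncode == 0, proc.stderr

def survey(paths, reader=try_read):
    """Try every file; return the percentage read and the failures with their errors."""
    failures = {}
    total = 0
    for path in paths:
        total += 1
        ok, err = reader(path)
        if not ok:
            failures[str(path)] = err.strip()
    readable = total - len(failures)
    percent = 100.0 * readable / total if total else 0.0
    return percent, failures
```

The collected error texts can then be grouped by hand or by keyword to characterise the distinct causes of failure.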

Related

@simonmichael simonmichael added A-BUG Something wrong, confusing or sub-standard in the software, docs, or user experience. docs Documentation-related. journal The journal file format, and its features. ledger-compat labels Dec 16, 2022
@simonmichael simonmichael moved this to In Progress in 2023: hledger 1.29 (Mar) Dec 16, 2022
@simonmichael
Owner Author

simonmichael commented Dec 18, 2022

Documented the transaction-balancing precision issue for users at https://hledger.org/ledger.html#incompatible-balancing
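
A simplified arithmetic illustration of why extra decimal zeros matter, in Python. It assumes a model in which a transaction counts as balanced when the sum of its postings rounds to zero at a chosen number of decimal places. This is a sketch of the idea only, not hledger's or Ledger's actual algorithm, and the amounts are made up.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def is_balanced(amounts, places):
    """True if the postings sum to zero when rounded to `places` decimal places."""
    total = sum(amounts, Decimal(0))
    quantum = Decimal(1).scaleb(-places)  # e.g. places=2 gives 0.01
    return total.quantize(quantum, rounding=ROUND_HALF_EVEN) == 0

# 3 units at a unit price of 3.333, paid for with 10.00
postings = [Decimal("3") * Decimal("3.333"), Decimal("-10.00")]

two_places = is_balanced(postings, 2)    # the 0.001 difference rounds away
three_places = is_balanced(postings, 3)  # the 0.001 difference is visible
```

Under this model the same transaction passes at two decimal places and fails at three, which is the effect of checking against amounts printed as 10.000 instead of 10.00.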

@simonmichael
Owner Author

https://github.com/simonmichael/hledger/tree/master/hledger/test/ledger-compat is the start of a test suite for Ledger file compatibility. It uses Ledger's functional tests as a source of diverse sample Ledger files, and others collected manually can be added over time. Let me know if you can think of another good source.

https://gist.github.com/simonmichael/052703b1641669bfe067c68b81f707cc is the categorised results of a test run. It is easier to read in Emacs, but to summarise, we currently read about 80% of Ledger's tests' sample data files. The most frequent causes of read failure were amount expressions and lot notation. There were ~20 other distinct causes of failure as well.

@simonmichael
Owner Author

simonmichael commented Dec 22, 2022

https://hledger.org/ledger.html#journal-format is a new status table.

@alensiljak

design and implement some kind of Ledger file detection

What do you think of ledger-rs/incubator#2? The main suggestion is to specify the data format in a header at the top of the file, similar to a shebang. Handling in programs becomes easier with semantic versioning.
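
To make the header idea concrete, here is a small Python sketch. The exact header syntax proposed in ledger-rs/incubator#2 is not reproduced here: the `; format: NAME MAJOR.MINOR.PATCH` comment line is invented purely for illustration, as are the function and pattern names.

```python
import re

# Hypothetical header line, e.g. "; format: ledger 3.2.1" (invented syntax).
HEADER = re.compile(
    r"^;\s*format:\s*(?P<name>[A-Za-z][\w-]*)\s+(?P<version>\d+\.\d+\.\d+)\s*$"
)

def declared_format(text):
    """Return (name, (major, minor, patch)) from the first line, or None if absent."""
    first_line = text.split("\n", 1)[0].rstrip("\r")
    match = HEADER.match(first_line)
    if match is None:
        return None
    version = tuple(int(part) for part in match.group("version").split("."))
    return match.group("name").lower(), version
```

A reader could compare the major version against what it supports and give a targeted message instead of a generic parse error.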

@simonmichael
Owner Author

@alensiljak seems a good idea. hledger also uses the file extension as a hint for input/output format: .csv/.tsv/.ssv/.timeclock/.timedot/.journal/.hledger (/.ledger/.beancount/...)
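
A sketch of extension-based hinting in Python, using only the unparenthesised extensions listed in the comment above. The format names on the right-hand side and the fallback to "journal" are assumptions for illustration, not a description of hledger's internals.

```python
from pathlib import PurePath

# Extensions from the comment above; the format names are assumed labels.
EXTENSION_HINTS = {
    ".csv": "csv",
    ".tsv": "tsv",
    ".ssv": "ssv",
    ".timeclock": "timeclock",
    ".timedot": "timedot",
    ".journal": "journal",
    ".hledger": "journal",
}

def format_hint(filename, default="journal"):
    """Guess an input format from the file extension, falling back to `default`."""
    suffix = PurePath(filename).suffix.lower()
    return EXTENSION_HINTS.get(suffix, default)
```

An extension is only a hint, so a header declaration or content-based detection could still override it.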
