how to extend precanned csv import extract method to add csv source to __source__ metadata #14

Open
blais opened this issue Feb 14, 2020 · 3 comments
Labels
CSV Related to the CSV importer

Comments

@blais
Member

blais commented Feb 14, 2020

Original report by Jeff Mondoux.


I have a custom importer for my bank's CSV statements. This importer inherits from the beancount CSV importer, and I override the extract() method as follows:

def extract(self, file, existing_entries=None):
    mapped_account = self.file_account(file)
    entries = super().extract(file, existing_entries)
    for entry in entries:
        entry.meta['__source__'] = 'source'
    return entries

What I can’t figure out, with my limited Python abilities, is how to add the raw CSV line to the __source__ metadata field so that the Fava import GUI can display it. I want to avoid rolling my own CSV importer entirely, since the generic CSV importer provided by beancount does most of what I need. I know I could read the CSV file a second time to append the data, but is that the best or only way?

@floriskruisselbrink
Contributor

My current workaround (which involves keeping my own copy of the generic CSV importer) is to patch the signature of the categorize() method so that it accepts two parameters: transaction (as it does right now, with the generic information already filled in) and row (the current row from the CSV file, so you can parse extra columns yourself there).

I will try to create a pull request sometime this week!
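
For illustration, a two-argument categorizer along the lines described above might look like the sketch below. This is only a guess at the shape: the function name, the (txn, row) argument order, and the choice to re-serialize the row with the csv module are assumptions, not the importer's confirmed interface. It only requires that txn has a mutable meta dict.

```python
import csv
import io


def categorize(txn, row):
    """Hypothetical (txn, row) categorizer: store the raw row in __source__.

    `row` is assumed to be the list of column values for the current CSV row.
    """
    buf = io.StringIO()
    # Re-serialize the parsed row so that quoting of commas is preserved.
    csv.writer(buf).writerow(row)
    txn.meta['__source__'] = buf.getvalue().rstrip('\r\n')
    return txn
```

Note that the re-serialized text may differ from the original line in quoting or delimiter if the bank's file uses a non-default CSV dialect.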

@jpluscplusm
Contributor

jpluscplusm commented Aug 27, 2020

Hey folks - I'm considering extending this so that the CSV importer passes both the transaction's row and the CSV file's header, on each categorize() call. It seems a shame for the CSV importer to be doing all the discovery work about the file's structure, and not passing part of the benefit over to the Categorizer! And, as well, not having this info reduces the utility of more generic, cross-account Categorizers (or so I'm finding).

Before I work on this:

  • have I missed something obvious that removes the need for this change?
  • how firm are we on the row object being passed to categorize() being a list; or could I change its prototype to something more stringly-indexable? Or do I need to subclass list to ensure other folks' existing Categorizers don't break? I guess this could only affect Categorizers written or updated since Enhance csv categorizer beancount#483 was merged, 2 months ago. If it were my call, I /think/ I'd not be that concerned with the backwards compat, here. But that's just my 2 cents :-)
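
To make the second question concrete, here is a minimal sketch of the "subclass list" option: a row that still behaves like a list for existing Categorizers but can also be indexed by column name. The class name NamedRow and its constructor arguments are hypothetical, and nothing like this is claimed to exist in the importer.

```python
class NamedRow(list):
    """Hypothetical row type: a list that also accepts header names as keys."""

    def __init__(self, values, header):
        super().__init__(values)
        # Map each header name to its column position.
        self._index = {name: i for i, name in enumerate(header)}

    def __getitem__(self, key):
        if isinstance(key, str):
            # Look up by column name; raises KeyError for unknown names.
            return super().__getitem__(self._index[key])
        # Integer and slice access fall through to normal list behavior.
        return super().__getitem__(key)
```

With this shape, row[1] and row['Payee'] would return the same value when 'Payee' is the second header, so Categorizers written against the list form would keep working.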

@blais blais transferred this issue from beancount/beancount Feb 2, 2021
@dnicolodi dnicolodi added the CSV Related to the CSV importer label Feb 26, 2021
@dnicolodi
Collaborator

The CSV importer stores the origin file path and line number in the metadata 'filename' and 'lineno' fields (which are not serialized when the entries are printed). You can simply post-process the entries to move the information from these metadata entries to where fava expects them.
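
A rough sketch of that post-processing step, using only the standard library, could look like this. The helper name attach_source_lines is made up for illustration, and it assumes that 'lineno' is the 1-based physical line number of the row in the file. That assumption is worth checking against your importer version, since a header line or multi-line quoted fields could shift the numbering.

```python
import linecache


def attach_source_lines(entries):
    """Copy the raw file line referenced by 'filename'/'lineno' into __source__.

    Assumes meta['lineno'] is a 1-based physical line number (unverified).
    """
    for entry in entries:
        filename = entry.meta.get('filename')
        lineno = entry.meta.get('lineno')
        if not filename or not lineno:
            continue  # Nothing to look up for this entry.
        raw = linecache.getline(filename, lineno).rstrip('\r\n')
        if raw:
            entry.meta['__source__'] = raw
    return entries
```

In the extract() override from the original report, this would replace the loop, for example `return attach_source_lines(super().extract(file, existing_entries))`.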
