We need to establish a workflow for creating chants given a CSV file #1259

Open
jacobdgm opened this issue Jan 16, 2024 · 6 comments · May be fixed by #1770

Comments

@jacobdgm
Contributor

Debra recently gave us a .csv file with information for a bunch of chants that need to be created on cantusdatabase.org.

We need a system to create chants based on CSV files. Debra said that this used to happen on OldCantus, and will continue to need to happen from time to time in NewCantus.

We could create a fully automated system with a specification - "your file should include all these columns and exactly these columns", and so on. It might, however, make sense to have a more flexible system - perhaps a management command that can be adapted by a developer to accommodate whichever individual spreadsheets are sent to us by the musicologists. If we adopt this second approach, however, we will need to attend to these .csv files promptly.

Thoughts on how best to approach this?

@annamorphism

I think the best approach would be the first one, in an interface only accessible to admins. The second approach means a lot more work for developers to work out what's what, and I would expect such files to be mediated through Debra anyway.

@ahankinson
Member

My 2c, FWIW. I would suggest doing a combination of 1 and 2: A strict CSV format that is uploaded by admins on the command line.

My reasons:

  1. Error handling with data import is really hard. Communicating the errors with processing and uploading spreadsheets takes a lot of forethought and effort. The command line, on the other hand, is quite easy -- an exception thrown on the command line doesn't need to be reported anywhere else.
  2. My experience is that users always want to modify their spreadsheets, either intentionally ("Oh, I thought it would import this new column automatically") or unintentionally ("No, I must have deleted that header by accident."). A validation step, followed by an import, is probably the best approach for all involved.
  3. If something goes wrong ("OH, shoot -- I didn't mean to overwrite those!") it's much easier to see that happen on the command line.
  4. It's easier for devs to test a CSV upload on a staging system and then run it on the production system than it is to expect users to run it on staging first.

You might approach it by developing a sort of import module, which is initially called by the management scripts but then, once it matures, can move to a UI system.
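A minimal sketch of what "a validation step, followed by an import" could look like as a standalone module, so that the same functions can be called first from a management script and later from a UI. The column names in `EXPECTED_COLUMNS`, the function names, and the `create_chant` callback are all hypothetical placeholders, not anything from the Cantus codebase.

```python
import csv
import io

# Hypothetical column specification; the real column names would come from
# whatever format is agreed on with the musicologists.
EXPECTED_COLUMNS = ["source_id", "folio", "sequence", "incipit"]


def validate_chant_csv(text):
    """Return a list of human-readable problems; an empty list means valid."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    if header != EXPECTED_COLUMNS:
        # Strict format: exactly these columns, in exactly this order.
        return [f"header mismatch: expected {EXPECTED_COLUMNS}, got {header}"]
    errors = []
    for row in reader:
        line_no = reader.line_num  # physical line number in the file
        # DictReader stores extra fields under the key None and fills
        # missing fields with the value None.
        if None in row or None in row.values():
            errors.append(f"line {line_no}: wrong number of fields")
            continue
        for column in EXPECTED_COLUMNS:
            if not row[column].strip():
                errors.append(f"line {line_no}: empty value for {column!r}")
    return errors


def import_chants(text, create_chant):
    """Validate the whole file first; only then call create_chant(row) per row."""
    errors = validate_chant_csv(text)
    if errors:
        raise ValueError("\n".join(errors))
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        create_chant(row)
    return len(rows)
```

Because `import_chants` refuses to create anything until the whole file validates, a deleted header or an unexpected extra column is reported before any chant is written. In the real project, `create_chant` would presumably wrap the model-creation logic.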

@jacobdgm
Contributor Author

jacobdgm commented Jan 16, 2024

If we follow @ahankinson's advice, perhaps the best approach is: set up a management command that expects a CSV file with a specific format. A developer copies the CSV into the container and runs the management command on Staging, and makes any necessary changes to the CSV (reordering/renaming columns, etc.) in case of error messages. If/when the command runs to completion, upload the working CSV to Production and run it there. Unless something unforeseen arises, this would take maybe 5-10 minutes of developer time per CSV.
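The command-line side of that workflow could be sketched roughly as follows. This is plain Python with `argparse` rather than an actual Django management command, and the header in `EXPECTED_HEADER`, the `run` function, the `--dry-run` flag, and the `create_chant` callback are assumptions made for illustration only.

```python
import argparse
import csv
import sys

# Hypothetical header; stands in for whatever strict format is agreed on.
EXPECTED_HEADER = ["source_id", "folio", "sequence", "incipit"]


def run(argv, create_chant=print):
    """Check a CSV file and create chants; return 0 on success, 1 on rejection."""
    parser = argparse.ArgumentParser(description="Create chants from a CSV file")
    parser.add_argument("csv_path")
    parser.add_argument("--dry-run", action="store_true",
                        help="check the file without creating anything")
    args = parser.parse_args(argv)

    with open(args.csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        rows = list(reader)

    if header != EXPECTED_HEADER:
        print(f"error: expected header {EXPECTED_HEADER}, got {header}",
              file=sys.stderr)
        return 1

    # Row numbers count the header as row 1.
    bad_rows = [n for n, row in enumerate(rows, start=2)
                if len(row) != len(header)]
    if bad_rows:
        print(f"error: wrong number of fields in rows {bad_rows}",
              file=sys.stderr)
        return 1

    if args.dry_run:
        print(f"{len(rows)} rows look OK; nothing created")
        return 0

    # Nothing is created unless every check above has passed.
    for row in rows:
        create_chant(dict(zip(header, row)))
    return 0
```

A developer could call something like `run(["chants.csv", "--dry-run"])` on Staging, fix the CSV until it returns 0, and then run the same file without `--dry-run` on Production. In a Django project the same logic would more likely sit in a management command's `handle` method, with the creation loop wrapped in a database transaction.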

Does this make sense?

@jacobdgm
Contributor Author

jacobdgm commented Jan 16, 2024

I've begun work on this, but for the specific CSV at hand, progress is blocked until we figure out what's going on with #1261.

@jacobdgm
Contributor Author

jacobdgm commented Jan 23, 2024

If it's true that Sources (rather than Chants) should have a fragmentarium_id (see #1261 (comment)), then work on this can proceed: after creating the source and all the chants, we can add the proper value for the Fragmentarium ID on the source once the field has been created.

@annamorphism

curious if there's been any progress on this, since it came up today in passing...
