Make a multi-year EIA MECS archive #542
if int(year) >= 2006:
    table_link_pattern = re.compile(
        r"(RSE|)[Tt]able(\d{1,2}|\d{1.1})_(\d{1,2})(.xlsx|.xls)"
    )
elif int(year) == 2002:
    table_link_pattern = re.compile(
        r"(RSE|)[Tt]able(\d{1,2}).(\d{1,2})_\d{1,2}(.xlsx|.xls)"
    )
elif int(year) == 1998:
    table_link_pattern = re.compile(
        r"(d|e)\d{2}([a-z]\d{1,2})_(\d{1,2})(.xlsx|.xls)"
    )
elif int(year) == 1994:
    # These earlier years the pattern is functional but not actually that informative.
    # so we will just use the original name by making the whole pattern a match
    table_link_pattern = re.compile(
        r"((rse|)m\d{2}_(\d{2})([a-d]|)(.xlsx|.xls))"
    )
elif int(year) == 1991:
    table_link_pattern = re.compile(r"((rse|)mecs(\d{2})([a-z]|)(.xlsx|.xls))")
I don't love this whole situation, but I didn't know what else to do. I thought about making a dict with year as key and pattern as value, but then it wouldn't naturally pick up the next year if the newer pattern holds.
You could make the latest year be the default pattern and update with a dict-key if it's one of the other years? I agree it's a bit verbose.
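A minimal sketch of that suggestion, assuming the patterns from the diff above are reused unchanged. The names `DEFAULT_PATTERN`, `LEGACY_PATTERNS`, and `get_table_link_pattern` are hypothetical and do not come from the PR:

```python
import re

# Pattern for 2006 and later, copied verbatim from the diff; used as the default.
DEFAULT_PATTERN = re.compile(
    r"(RSE|)[Tt]able(\d{1,2}|\d{1.1})_(\d{1,2})(.xlsx|.xls)"
)

# Year-specific overrides for the older archives, also copied verbatim.
LEGACY_PATTERNS = {
    2002: re.compile(r"(RSE|)[Tt]able(\d{1,2}).(\d{1,2})_\d{1,2}(.xlsx|.xls)"),
    1998: re.compile(r"(d|e)\d{2}([a-z]\d{1,2})_(\d{1,2})(.xlsx|.xls)"),
    1994: re.compile(r"((rse|)m\d{2}_(\d{2})([a-d]|)(.xlsx|.xls))"),
    1991: re.compile(r"((rse|)mecs(\d{2})([a-z]|)(.xlsx|.xls))"),
}


def get_table_link_pattern(year):
    """Return the override for a legacy year, else the newest pattern."""
    return LEGACY_PATTERNS.get(int(year), DEFAULT_PATTERN)
```

With this shape a future survey year picks up the newest pattern without a code change. One difference from the if/elif chain: any year that is not listed as a key, including one earlier than 2006, also falls through to the default instead of leaving `table_link_pattern` unset.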
When I run this, 1998 is full of files called 'd' and 'e'. All other years look just fine!
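One possible explanation, sketched under the assumption that the archiver builds file names from the first capture group: in the 1998 pattern that group is `(d|e)`, whereas the 1994 and 1991 patterns wrap the whole file name in their first group. The file names below are hypothetical, chosen only because they match the patterns from the diff:

```python
import re

# Patterns copied verbatim from the diff.
PATTERN_2006_ON = re.compile(
    r"(RSE|)[Tt]able(\d{1,2}|\d{1.1})_(\d{1,2})(.xlsx|.xls)"
)
PATTERN_1998 = re.compile(r"(d|e)\d{2}([a-z]\d{1,2})_(\d{1,2})(.xlsx|.xls)")

# Hypothetical file names, not taken from the EIA site.
new_groups = PATTERN_2006_ON.search("Table1_1.xlsx").groups()
old_groups = PATTERN_1998.search("d98n1_1.xls").groups()

print(new_groups)  # ('', '1', '1', '.xlsx')
print(old_groups)  # ('d', 'n1', '1', '.xls')
```

If that assumption holds, wrapping the 1998 pattern in an outer group, as the 1994 and 1991 patterns already do, would keep the original file name in group 1.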
Looks good to go! Publish that thing!
published archives:
Overview
Closes #516.
Converted the 2018 MECS archive into a multi-year archive. The link patterns for each of the years before 2006 are pretty different. The info about the major and minor table numbers is not in the original file names for the 1998 and 1994 archives, so I didn't attempt to rename those.
Testing
How did you make sure this worked? How can a reviewer verify this?
Unpublished archives:
https://zenodo.org/uploads/14749820
https://sandbox.zenodo.org/uploads/158873