Update to uv and airflow #129
- Update to the latest uv (v0.5.28), which changes some things in `pyproject.toml`.
- Update docs to use `uv run` where appropriate.
- Update CI to use uv to run tests.
- To simplify things, keep just one `requirements.txt` file, which Docker uses to install dependencies with `uv pip`.
- Change tests to accommodate changes in the OpenAlex API.
@@ -1,4 +1,4 @@
-FROM apache/airflow:2.9.3-python3.12
+FROM apache/airflow:2.10.4-python3.12
Updating to the latest version of Airflow. I've run the harvest DAG locally but not in our "production" environment.
@@ -67,28 +67,10 @@ https://sul-rialto-dev.stanford.edu/authors?action=index&commit=Search&controlle
### Set-up

1. Install `uv` for dependency management as described in [the uv docs](https://github.com/astral-sh/uv?tab=readme-ov-file#getting-started).
uv creates the virtual environment as needed when you run `uv run`.
@@ -92,7 +91,7 @@ def publications_from_dois(dois: list):
 # TODO: do we need this to stay within 100,000 requests / day API quota?
 time.sleep(1)

-doi_list = quote("|".join([doi for doi in doi_batch]))
+doi_list = "|".join([doi for doi in doi_batch])
The problem with `&` appearing in the DOI has now been fixed in pyalex, so the joined DOIs no longer need to be passed through `quote` here.
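As a rough illustration of why an explicit `quote` call can become harmful once the client handles encoding, here is a minimal sketch using only the standard library. It assumes the client percent-encodes the filter value itself when it builds the request URL; that is an assumption for the sake of the example, not a description of pyalex internals. `join_dois` is a hypothetical helper and the DOIs are made up.

```python
from urllib.parse import quote


def join_dois(doi_batch: list[str]) -> str:
    """Join a batch of DOIs with "|", the separator used in the diff above."""
    return "|".join(doi_batch)


# Made-up DOIs; the first one contains an ampersand.
dois = ["10.1000/a&b", "10.1000/c"]

raw = join_dois(dois)

# Encoding once escapes "&" and "|" so they survive inside a query string.
encoded_once = quote(raw)

# Encoding an already-encoded value escapes the "%" signs again, which is
# what would happen if both the caller and the client applied quote().
encoded_twice = quote(encoded_once)

print(raw)            # 10.1000/a&b|10.1000/c
print(encoded_once)   # 10.1000/a%26b%7C10.1000/c
print(encoded_twice)  # 10.1000/a%2526b%257C10.1000/c
```

If the client already performs the first encoding step, passing it the raw joined string avoids the double-encoded form shown on the last line.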
-assert len(pubs[0].keys()) == 53, "first publication has 53 columns"
-assert len(pubs[1].keys()) == 53, "second publication has 53 columns"
+assert len(pubs[0].keys()) == 54, "first publication has 54 columns"
+assert len(pubs[1].keys()) == 54, "second publication has 54 columns"
There's a new column coming back from OpenAlex: `abstract_inverted_index_v3`.
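An exact-count assertion like the one above has to be bumped each time OpenAlex adds a field. One possible alternative, sketched here, is to check that a set of required columns is present and ignore any extras. `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names, and the sample record is invented for illustration rather than taken from the test suite.

```python
# Hypothetical subset of columns the harvest code is assumed to rely on.
REQUIRED_COLUMNS = {"id", "doi", "title"}


def missing_columns(pub: dict, required: set[str] = REQUIRED_COLUMNS) -> set[str]:
    """Return the required column names that are absent from a publication."""
    return required - set(pub)


# Invented record that includes the newly added OpenAlex column.
pub = {
    "id": "W1",
    "doi": "10.1000/c",
    "title": "Example",
    "abstract_inverted_index_v3": None,
}

# Extra columns do not cause a failure; only missing ones are reported.
assert missing_columns(pub) == set(), "publication has all required columns"
assert missing_columns({"id": "W2"}) == {"doi", "title"}
```

The trade-off is that a check like this would not notice when a new column appears, which the exact count does catch.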
 - name: Run tests
-  run: pytest
+  run: uv run pytest
uv installs dependencies if needed when you run `uv run`.