Table of Contents generated with DocToc
- Running the CI Jobs locally
- Getting the CI image from failing job
- Options and environment variables used
The main goal of the CI philosophy we have that no matter how complex the test and integration infrastructure, as a developer you should be able to reproduce and re-run any of the failed checks locally. One part of it are pre-commit checks, that allow you to run the same static checks in CI and locally, but another part is the CI environment which is replicated locally with Breeze.
You can read more about Breeze in README.rst but in essence it is a python wrapper around docker commands that allows you (among others) to re-create CI environment in your local development instance and interact with it. In its basic form, when you do development you can run all the same tests that will be run in CI - but locally, before you submit them as PR. Another use case where Breeze is useful is when tests fail on CI.
All our CI jobs are executed via breeze
commands. You can replicate
exactly what our CI is doing by running the sequence of corresponding
breeze
command. Make sure however that you look at both:
- flags passed to
breeze
commands - environment variables used when
breeze
command is run - this is useful when we want to set a common flag for allbreeze
commands in the same job or even the whole workflow. For exampleVERBOSE
variable is set totrue
for all our workflows so that more detailed information about internal commands executed in CI is printed.
In the output of the CI jobs, you will find both - the flags passed and environment variables set.
Every contributor can also pull and run images being result of a specific CI run in GitHub Actions. This is a powerful tool that allows to reproduce CI failures locally, enter the images and fix them much faster.
Note that this currently only works for AMD machines, not for ARM machines, but this will change soon.
To load the image from specific PR, you can use the following command:
breeze ci-image load --from-pr 12345 --python 3.9 --github-token <your_github_token>
To load the image from specific run (for example 12538475388), you can use the following command, find the run id from github action runs.
breeze ci-image load --from-run 12538475388 --python 3.9 --github-token <your_github_token>
After you load the image, you can reproduce the very exact environment that was used in the CI run by entering breeze container without mounting your local sources:
breeze shell --mount-sources skip [OPTIONS]
And you should be able to run any tests and commands interactively in the very exact environment that was used in the failing CI run even without checking out sources of the failing PR. This is a powerful tool to debug and fix CI issues.
You can also build the image locally by checking-out the branch of the PR that was used and running:
breeze ci-image build
You have to be aware that some of the PRs and canary builds use the --upgrade-to-newer-dependencies
flag
(UPGRADE_TO_NEWER_DEPENDENCIES
environment variable set to true
) and they are not using constraints
to build the image so if you want to build it locally, you should pass the --upgrade-to-newer-dependencies
flag when you are building the image.
Note however, that if constraints changed for regulare builds and if someone released a new package in PyPI
since the build was run (which is very likely - we have many packages released a day), the image you
build locally might be different than the one in CI, that's why loading image using breeze ci-image load
is more reliable way to reproduce the CI build.
If you check-out the branch of the PR that was used, regular breeze
commands will
also reproduce the CI environment without having to rebuild the image - for example when dependencies
changed or when new dependencies were released and used in the CI job - and you will
be able to edit source files locally as usual and use your IDE and tools you usually use to develop Airflow.
In order to reproduce the exact job you also need to set the "[OPTIONS]" corresponding to the particular
job you want to reproduce within the run. You can find those in the logs of the CI job. Note that some
of the options can be passed by --flags
and some via environment variables, for convenience, so you should
take a look at both if you want to be sure to reproduce the exact job configuration. See the next chapter
for summary of the most important environment variables and options used in the CI jobs.
You can read more about it in Breeze and Testing
Depending whether the scripts are run locally via Breeze or whether they are run in
Build Images
or Tests
workflows can behave differently.
You can use those variables when you try to reproduce the build locally - alternatively you can pass
those via corresponding command line flag passed to breeze shell
command.
Those variables are controlling basic configuration and behaviour of the breeze command.
Variable | Option | Local dev | CI | Comment |
---|---|---|---|---|
PYTHON_MAJOR_MINOR_VERSION | --python | Major/Minor version of Python used. | ||
BACKEND | --backend | Backend used in the tests. | ||
INTEGRATION | --integration | Integration used in tests. | ||
DB_RESET | --db-reset/--no-db-reset | false | true | Determines whether database should be reset at the container entry. |
ANSWER | --answer | yes | This variable determines if answer to questions should be automatically set. |
Those variables are used to control the test execution.
Variable | Option | Local dev | CI | Comment |
---|---|---|---|---|
RUN_DB_TESTS_ONLY | --run-db-tests-only | true in db tests | Whether only db tests should be executed. | |
SKIP_DB_TESTS | --skip-db-tests | true in non-db tests | Whether db tests should be skipped. |
Those variables are used to control the initialization of the environment in the container.
Variable | Option | Local dev | CI | Comment |
---|---|---|---|---|
MOUNT_SOURCES | --mount-sources | skip | Whether to mount the local sources into the container. | |
SKIP_ENVIRONMENT_INITIALIZATION | --skip-enviromnment-initialization | false (*) | false (*) | Skip initialization of test environment (*) set to true in pre-commits. |
SKIP_IMAGE_UPGRADE_CHECK | --skip-image-upgrade-check | false (*) | false (*) | Skip checking if image should be upgraded (*) set to true in pre-commits. |
SKIP_PROVIDERS_TESTS | false | false | Skip running provider integration tests (in non-main branch). | |
SKIP_SSH_SETUP | false | false (*) | Skip setting up SSH server for tests. (*) set to true in GitHub CodeSpaces. | |
VERBOSE_COMMANDS | false | false | Whether every command executed in docker should be printed. |
Those variables are automatically set by Breeze when running the commands locally, but you can override them if you want to run the commands in a different environment.
Variable | Local dev | CI | Comment |
---|---|---|---|
HOST_USER_ID | Host UID | User id of the host user. | |
HOST_GROUP_ID | Host GID | Group id of the host user. | |
HOST_OS | linux | OS of the Host (darwin/linux/windows). | |
COMMIT_SHA | GITHUB_SHA | SHA of the commit of the build is run |
Thank you for reading this far. We hope that you have learned a lot about Reproducing Airlfow's CI job locally and CI in general.