forked from aws/amazon-mwaa-docker-images
Early code for Airflow v2.8.0 Docker image
Showing 18 changed files with 448 additions and 14 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
# Coding Guidelines | ||
|
||
_This is still a work in progress and is likely to be updated during the early phases of this repository's development._ | ||
|
||
This document contains the coding guidelines we follow in this repository. We follow the guidelines here strictly, so make sure your Pull Requests abide by them. | ||
|
||
To make it easier for developers to find the guidelines that apply to their contribution, this document is split into multiple sections. Use the list below to jump to the section related to the code in your Pull Request. | ||
|
||
## Table of Contents | ||
|
||
- [Bash Scripts](#bash-scripts) | ||
- [Python Scripts](#python-scripts) | ||
- [Docker](#docker) | ||
|
||
## Bash Scripts | ||
|
||
For Bash scripts, we use [ShellCheck](https://www.shellcheck.net/) to help developers catch common bugs and bad practices. We have GitHub workflows that run ShellCheck on every Bash script in the repository and fail if the code breaks any of its rules. To make it easier for developers to test their code before publishing a PR, we have pre-commit hooks that automatically check your code. However, you need to set up `pre-commit` for the hooks to run. Check the README files for instructions. | ||
|
||
## Python Scripts | ||
|
||
For Python scripts, we follow [PEP8](https://peps.python.org/pep-0008/). Additionally, we use Flake8 rules. Failure to comply with these will result in your PR failing our GitHub workflows. As with Bash scripts, to make it easier for developers to test their code before publishing a PR, we have pre-commit hooks that automatically check your code. However, you need to set up `pre-commit` for the hooks to run. Check the README files for instructions. | ||
|
||
## Docker | ||
|
||
For Dockerfile bootstrapping, don't add your code to the Dockerfile directly. Instead, create a Bash script under the bootstrap/ folder. Follow these rules when creating a new bootstrapping file: | ||
|
||
1. Make sure the file name starts with a 3-digit number that indicates its order of execution. | ||
2. Keep your files as small as possible (but not smaller!). This way you better employ Docker caching and reduce the number of unnecessary rebuilds. | ||
3. If you need a system package in your bootstrap file, install it at the beginning and remove it at the end. For example, if you need to download a file using `wget`, do a `dnf install` at the beginning and a `dnf remove` at the end. This keeps the bootstrap files self-contained and avoids leaving unnecessary system packages in the final Docker image. | ||
- Don't worry about removing a package that is actually needed in the final image. There is a step at the end that installs the packages the final image needs. | ||
- Don't worry about a certain DNF package being installed and removed multiple times during the bootstrapping process. Keeping bootstrapping files self-contained and avoiding leaving unnecessary packages is more important than the couple of seconds you will save by optimizing the installation of system packages, especially since Docker caching means that steps are rarely repeated (assuming a well-written Dockerfile). |
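As an illustration of rule 1, the following Python sketch checks that bootstrap file names start with a 3-digit prefix and returns them in execution order. The regular expression and the function name `ordered_bootstrap_scripts` are assumptions made for this example; they are not taken from the repository's own tooling.

```python
import re
from pathlib import Path

# Hypothetical naming pattern: three digits, a dash, a descriptive name, ".sh".
BOOTSTRAP_NAME = re.compile(r"^(\d{3})-[a-z0-9-]+\.sh$")


def ordered_bootstrap_scripts(names):
    """Return script names sorted by their 3-digit prefix.

    Raises ValueError for any name that does not follow the convention.
    """
    keyed = []
    for name in names:
        match = BOOTSTRAP_NAME.match(Path(name).name)
        if match is None:
            raise ValueError(f"bad bootstrap file name: {name}")
        keyed.append((int(match.group(1)), name))
    # Sort by numeric prefix first, then by name to break ties.
    return [name for _, name in sorted(keyed)]


print(ordered_bootstrap_scripts(
    ["999-install-needed-dnf-packages.sh", "001-init.sh", "002-install-python.sh"]
))
```

A check like this could run in a pre-commit hook so that a misnamed script is rejected before it reaches a Pull Request.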
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
# | ||
# WARNING: Don't change this file manually. This file is auto-generated from the | ||
# Jinja2-templated Dockerfile.j2 file, so you need to change that file instead. | ||
# | ||
# This file was generated on 2024-01-17 21:57:12.451896 | ||
# | ||
|
||
FROM public.ecr.aws/amazonlinux/amazonlinux:2023 | ||
|
||
# Versions | ||
ENV AIRFLOW_VERSION=2.8.0 | ||
ENV AIRFLOW_AMAZON_PROVIDERS_VERSION=8.13.0 | ||
ENV PYTHON_VERSION=3.11 | ||
ENV AIRFLOW_USER_HOME=/usr/local/airflow | ||
ENV AIRFLOW_HOME=${AIRFLOW_USER_HOME} | ||
|
||
ENV PATH_DEFAULT=${PATH} | ||
ENV PATH_AIRFLOW_USER=${AIRFLOW_USER_HOME}/.local/bin:${PATH_DEFAULT} | ||
|
||
# Bootstrapping steps (root user - first pass) | ||
|
||
COPY ./bootstrap/01-root-firstpass/001-init.sh /001-init.sh | ||
RUN chmod +x /001-init.sh && /001-init.sh | ||
RUN rm /001-init.sh | ||
|
||
COPY ./bootstrap/01-root-firstpass/002-install-python.sh /002-install-python.sh | ||
RUN chmod +x /002-install-python.sh && /002-install-python.sh | ||
RUN rm /002-install-python.sh | ||
|
||
COPY ./bootstrap/01-root-firstpass/003-install-mariadb.sh /003-install-mariadb.sh | ||
RUN chmod +x /003-install-mariadb.sh && /003-install-mariadb.sh | ||
RUN rm /003-install-mariadb.sh | ||
|
||
COPY ./bootstrap/01-root-firstpass/004-create-airflow-user.sh /004-create-airflow-user.sh | ||
RUN chmod +x /004-create-airflow-user.sh && /004-create-airflow-user.sh | ||
RUN rm /004-create-airflow-user.sh | ||
|
||
COPY ./bootstrap/01-root-firstpass/005-install-aws-cli.sh /005-install-aws-cli.sh | ||
RUN chmod +x /005-install-aws-cli.sh && /005-install-aws-cli.sh | ||
RUN rm /005-install-aws-cli.sh | ||
|
||
COPY ./bootstrap/01-root-firstpass/999-install-needed-dnf-packages.sh /999-install-needed-dnf-packages.sh | ||
RUN chmod +x /999-install-needed-dnf-packages.sh && /999-install-needed-dnf-packages.sh | ||
RUN rm /999-install-needed-dnf-packages.sh | ||
|
||
|
||
# Bootstrapping steps (airflow user) | ||
|
||
USER root | ||
COPY ./bootstrap/02-airflow/001-install-airflow.sh /001-install-airflow.sh | ||
RUN chmod +x /001-install-airflow.sh | ||
ENV PATH=${PATH_AIRFLOW_USER} | ||
USER airflow | ||
RUN /001-install-airflow.sh | ||
ENV PATH=${PATH_DEFAULT} | ||
USER root | ||
RUN rm /001-install-airflow.sh | ||
|
||
|
||
# Bootstrapping steps (root user - second pass) | ||
# Put steps here that need to run as the root user and that rely on | ||
# the successful execution of the bootstrapping steps of the 'airflow' | ||
# user. For example, giving ownership of the Airflow home folder to | ||
# the 'airflow' user, which requires all of its files to be in place. | ||
|
||
COPY ./bootstrap/03-root-secondpass/001-create-mwaa-dir.sh /001-create-mwaa-dir.sh | ||
RUN chmod +x /001-create-mwaa-dir.sh && /001-create-mwaa-dir.sh | ||
RUN rm /001-create-mwaa-dir.sh | ||
|
||
COPY ./bootstrap/03-root-secondpass/999-chown-airflow-folder.sh /999-chown-airflow-folder.sh | ||
RUN chmod +x /999-chown-airflow-folder.sh && /999-chown-airflow-folder.sh | ||
RUN rm /999-chown-airflow-folder.sh | ||
|
||
|
||
# Create a volume for syncing files with the sidecar. The actual folder | ||
# is created by the `001-create-mwaa-dir.sh` script. | ||
VOLUME ["/usr/local/mwaa"] | ||
|
||
# TODO We should only expose this port if the command is 'webserver'. | ||
EXPOSE 8080 | ||
|
||
ENV PATH=${PATH_AIRFLOW_USER} | ||
RUN unset PATH_DEFAULT | ||
RUN unset PATH_AIRFLOW_USER | ||
|
||
WORKDIR ${AIRFLOW_USER_HOME} | ||
|
||
COPY entrypoint.py /entrypoint.py | ||
COPY entrypoint.sh /entrypoint.sh | ||
RUN chmod +x /entrypoint.sh | ||
|
||
USER airflow | ||
|
||
ENTRYPOINT ["/entrypoint.sh"] | ||
|
||
CMD /bin/bash |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
FROM public.ecr.aws/amazonlinux/amazonlinux:2023 | ||
|
||
# Versions | ||
ENV AIRFLOW_VERSION=2.8.0 | ||
ENV AIRFLOW_AMAZON_PROVIDERS_VERSION=8.13.0 | ||
ENV PYTHON_VERSION=3.11 | ||
ENV AIRFLOW_USER_HOME=/usr/local/airflow | ||
ENV AIRFLOW_HOME=${AIRFLOW_USER_HOME} | ||
|
||
ENV PATH_DEFAULT=${PATH} | ||
ENV PATH_AIRFLOW_USER=${AIRFLOW_USER_HOME}/.local/bin:${PATH_DEFAULT} | ||
|
||
# Bootstrapping steps (root user - first pass) | ||
{% for filename, filepath in bootstrapping_scripts_root_firstpass %} | ||
COPY {{ filepath }} /{{ filename }} | ||
RUN chmod +x /{{ filename }} && /{{ filename }} | ||
RUN rm /{{ filename }} | ||
{% endfor %} | ||
|
||
# Bootstrapping steps (airflow user) | ||
{% for filename, filepath in bootstrapping_scripts_airflow %} | ||
USER root | ||
COPY {{ filepath }} /{{ filename }} | ||
RUN chmod +x /{{ filename }} | ||
ENV PATH=${PATH_AIRFLOW_USER} | ||
USER airflow | ||
RUN /{{ filename }} | ||
ENV PATH=${PATH_DEFAULT} | ||
USER root | ||
RUN rm /{{ filename }} | ||
{% endfor %} | ||
|
||
# Bootstrapping steps (root user - second pass) | ||
# Put steps here that need to run as the root user and that rely on | ||
# the successful execution of the bootstrapping steps of the 'airflow' | ||
# user. For example, giving ownership of the Airflow home folder to | ||
# the 'airflow' user, which requires all of its files to be in place. | ||
{% for filename, filepath in bootstrapping_scripts_root_secondpass %} | ||
COPY {{ filepath }} /{{ filename }} | ||
RUN chmod +x /{{ filename }} && /{{ filename }} | ||
RUN rm /{{ filename }} | ||
{% endfor %} | ||
|
||
# Create a volume for syncing files with the sidecar. The actual folder | ||
# is created by the `001-create-mwaa-dir.sh` script. | ||
VOLUME ["/usr/local/mwaa"] | ||
|
||
# TODO We should only expose this port if the command is 'webserver'. | ||
EXPOSE 8080 | ||
|
||
ENV PATH=${PATH_AIRFLOW_USER} | ||
RUN unset PATH_DEFAULT | ||
RUN unset PATH_AIRFLOW_USER | ||
|
||
WORKDIR ${AIRFLOW_USER_HOME} | ||
|
||
COPY entrypoint.py /entrypoint.py | ||
COPY entrypoint.sh /entrypoint.sh | ||
RUN chmod +x /entrypoint.sh | ||
|
||
USER airflow | ||
|
||
ENTRYPOINT ["/entrypoint.sh"] | ||
|
||
CMD /bin/bash |
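The root-user loops in the template above expand each `(filename, filepath)` pair into a COPY/RUN/RUN triple. The generator script itself is not shown in this diff, so the following is only a plain-Python stand-in (no Jinja2) that reproduces the expansion; the function name `render_root_steps` is an assumption made for this example.

```python
def render_root_steps(scripts):
    """Expand (filename, filepath) pairs into COPY/RUN/RUN instruction triples."""
    lines = []
    for filename, filepath in scripts:
        lines.append(f"COPY {filepath} /{filename}")
        lines.append(f"RUN chmod +x /{filename} && /{filename}")
        lines.append(f"RUN rm /{filename}")
        lines.append("")  # blank line between steps
    return "\n".join(lines)


scripts = [
    ("001-init.sh", "./bootstrap/01-root-firstpass/001-init.sh"),
    ("002-install-python.sh", "./bootstrap/01-root-firstpass/002-install-python.sh"),
]
print(render_root_steps(scripts))
```

Giving each script its own COPY and RUN instructions means a change to one script only invalidates the Docker cache from that step onward.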
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
dnf update -y |
38 changes: 38 additions & 0 deletions
images/airflow/2.8.0/bootstrap/01-root-firstpass/002-install-python.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
dnf install -y wget xz tar | ||
|
||
PYTHON_VERSION=3.11.7 | ||
PYTHON_MD5_CHECKSUM=d96c7e134c35a8c46236f8a0e566b69c | ||
|
||
mkdir python_install | ||
python_file=Python-$PYTHON_VERSION | ||
python_tar=$python_file.tar | ||
python_tar_xz=$python_tar.xz | ||
|
||
# Download Python's source code archive. | ||
mkdir python_source | ||
wget "https://www.python.org/ftp/python/$PYTHON_VERSION/$python_tar_xz" -P /python_source | ||
|
||
# Verify the checksum | ||
echo "$PYTHON_MD5_CHECKSUM /python_source/$python_tar_xz" | md5sum --check - | grep --basic-regex "^/python_source/${python_tar_xz}: OK$" | ||
|
||
cp /python_source/$python_tar_xz /python_install/$python_tar_xz | ||
unxz ./python_install/$python_tar_xz | ||
tar -xf ./python_install/$python_tar -C ./python_install | ||
|
||
dnf install -y dnf-plugins-core | ||
dnf builddep -y python3 | ||
|
||
pushd /python_install/$python_file | ||
./configure | ||
make install -s -j "$(nproc)" # use -j to set the cores for the build | ||
popd | ||
|
||
# Upgrade pip | ||
pip3 install --upgrade pip | ||
|
||
rm -rf /python_source /python_install | ||
|
||
dnf remove -y wget xz tar |
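The checksum step in the script above pipes an expected digest into `md5sum --check`. For readers more comfortable with Python, the same idea can be sketched with `hashlib`; the function name `md5_matches` and the temporary demo file are assumptions for illustration and are not part of the commit.

```python
import hashlib
import os
import tempfile


def md5_matches(path, expected_hex, chunk_size=1 << 20):
    """Return True if the MD5 digest of the file at `path` equals `expected_hex`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


# Demo with a throwaway file instead of a real Python source archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
print(md5_matches(tmp.name, hashlib.md5(b"hello").hexdigest()))
os.remove(tmp.name)
```

MD5 detects accidental corruption of a download, but it is not collision-resistant, so a stronger digest such as SHA-256 may be preferable where the upstream project publishes one.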
31 changes: 31 additions & 0 deletions
images/airflow/2.8.0/bootstrap/01-root-firstpass/003-install-mariadb.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
dnf install -y wget | ||
|
||
MARIADB_RPM_COMMON_CHECKSUM=e87371d558efa97724f3728fb214cf19 | ||
MARIADB_RPM_SHARED_CHECKSUM=ed82ad5bc5b35cb2719a9471a71c6cdb | ||
MARIADB_RPM_DEVEL_CHECKSUM=cfce6e9b53f4e4fb1cb14f1ed720c92c | ||
|
||
# Installing mariadb-devel dependency for apache-airflow-providers-mysql. | ||
MARIADB_RPM_COMMON=MariaDB-common-11.1.2-1.fc38.x86_64.rpm | ||
MARIADB_RPM_SHARED=MariaDB-shared-11.1.2-1.fc38.x86_64.rpm | ||
MARIADB_RPM_DEVEL=MariaDB-devel-11.1.2-1.fc38.x86_64.rpm | ||
|
||
# Download the necessary RPMs. | ||
mkdir /mariadb_rpm | ||
wget https://mirror.mariadb.org/yum/11.1/fedora38-amd64/rpms/$MARIADB_RPM_COMMON -P /mariadb_rpm | ||
wget https://mirror.mariadb.org/yum/11.1/fedora38-amd64/rpms/$MARIADB_RPM_SHARED -P /mariadb_rpm | ||
wget https://mirror.mariadb.org/yum/11.1/fedora38-amd64/rpms/$MARIADB_RPM_DEVEL -P /mariadb_rpm | ||
|
||
# Verify their checksums | ||
echo "$MARIADB_RPM_COMMON_CHECKSUM /mariadb_rpm/$MARIADB_RPM_COMMON" | md5sum --check - | grep --basic-regex "^/mariadb_rpm/$MARIADB_RPM_COMMON: OK$" | ||
echo "$MARIADB_RPM_SHARED_CHECKSUM /mariadb_rpm/$MARIADB_RPM_SHARED" | md5sum --check - | grep --basic-regex "^/mariadb_rpm/$MARIADB_RPM_SHARED: OK$" | ||
echo "$MARIADB_RPM_DEVEL_CHECKSUM /mariadb_rpm/$MARIADB_RPM_DEVEL" | md5sum --check - | grep --basic-regex "^/mariadb_rpm/$MARIADB_RPM_DEVEL: OK$" | ||
|
||
# Install the RPMs. | ||
rpm -ivh /mariadb_rpm/* | ||
|
||
rm -rf /mariadb_rpm | ||
|
||
dnf remove -y wget |
9 changes: 9 additions & 0 deletions
images/airflow/2.8.0/bootstrap/01-root-firstpass/004-create-airflow-user.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
dnf install -y shadow-utils | ||
|
||
# AIRFLOW_USER_HOME is defined in the Dockerfile. | ||
adduser -s /bin/bash -d "${AIRFLOW_USER_HOME}" airflow | ||
|
||
dnf remove -y shadow-utils |
4 changes: 4 additions & 0 deletions
images/airflow/2.8.0/bootstrap/01-root-firstpass/005-install-aws-cli.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
dnf install -y awscli-2 |
7 changes: 7 additions & 0 deletions
images/airflow/2.8.0/bootstrap/01-root-firstpass/999-install-needed-dnf-packages.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
dnf install -y java-17-amazon-corretto # For Java lovers. | ||
dnf install -y libcurl-devel # For pycurl | ||
dnf install -y postgresql-devel # For psycopg2 | ||
dnf install -y procps # For 'ps' command, which is used for monitoring. |
27 changes: 27 additions & 0 deletions
images/airflow/2.8.0/bootstrap/02-airflow/001-install-airflow.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
# List of required environment variables | ||
required_vars=("AIRFLOW_VERSION" "AIRFLOW_AMAZON_PROVIDERS_VERSION" "PYTHON_VERSION") | ||
|
||
# Function to check if environment variables are set | ||
check_env_vars() { | ||
for var in "${required_vars[@]}"; do | ||
if [[ -z ${!var} ]]; then | ||
echo "Error: Environment variable ${var} is not set." | ||
exit 1 | ||
fi | ||
done | ||
} | ||
|
||
# Check required environment variables | ||
check_env_vars | ||
|
||
CONSTRAINT_FILE="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt" | ||
pip3 install --constraint "${CONSTRAINT_FILE}" \ | ||
pycurl \ | ||
psycopg2 \ | ||
"celery[sqs]" \ | ||
"apache-airflow[celery,statsd]==${AIRFLOW_VERSION}" \ | ||
"apache-airflow-providers-amazon[aiobotocore]==${AIRFLOW_AMAZON_PROVIDERS_VERSION}" \ | ||
watchtower |
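The script above first checks that the required environment variables are set and then builds the constraints URL from two of them. A rough Python equivalent of that logic, with the hypothetical function name `constraint_url` and an explicit dictionary standing in for the process environment, might look like this:

```python
REQUIRED_VARS = (
    "AIRFLOW_VERSION",
    "AIRFLOW_AMAZON_PROVIDERS_VERSION",
    "PYTHON_VERSION",
)


def constraint_url(env):
    """Build the Airflow constraints URL, failing if a required variable is unset."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError(f"missing environment variables: {', '.join(missing)}")
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{env['AIRFLOW_VERSION']}/"
        f"constraints-{env['PYTHON_VERSION']}.txt"
    )


# Values copied from the ENV instructions in the Dockerfile of this commit.
example_env = {
    "AIRFLOW_VERSION": "2.8.0",
    "AIRFLOW_AMAZON_PROVIDERS_VERSION": "8.13.0",
    "PYTHON_VERSION": "3.11",
}
print(constraint_url(example_env))
```

Passing `os.environ` instead of `example_env` would read the real environment, which is what the Bash version does through `${!var}`.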
4 changes: 4 additions & 0 deletions
images/airflow/2.8.0/bootstrap/03-root-secondpass/001-create-mwaa-dir.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
#!/bin/bash | ||
|
||
mkdir -p /usr/local/mwaa | ||
chown -R airflow: /usr/local/mwaa |
4 changes: 4 additions & 0 deletions
images/airflow/2.8.0/bootstrap/03-root-secondpass/999-chown-airflow-folder.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
chown -R airflow: "${AIRFLOW_USER_HOME}" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
python3 generate-dockerfile.py | ||
|
||
docker build ./ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
""" | ||
This is the entrypoint of the Docker image when running Airflow components. | ||
The script gets called with the Airflow component name, e.g. scheduler, as the | ||
first and only argument. It accordingly runs the requested Airflow component | ||
after setting up the necessary configurations. | ||
""" | ||
|
||
import sys | ||
|
||
|
||
def main() -> None: | ||
"""Entrypoint of the script.""" | ||
print("Warming the Docker container.") | ||
print(sys.argv) | ||
# TODO Not yet implemented | ||
|
||
|
||
if __name__ == '__main__': | ||
main() | ||
else: | ||
print('This module cannot be imported.') | ||
sys.exit(1) |
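The entrypoint above is still a stub. Since its docstring says the script receives the Airflow component name as its first and only argument, one possible way to validate that argument is sketched below. The function name `parse_component` and the set `KNOWN_COMPONENTS` are assumptions; the commit only mentions `scheduler` and `webserver` by name.

```python
# Hypothetical set of component names; only these two appear in the commit.
KNOWN_COMPONENTS = {"scheduler", "webserver"}


def parse_component(argv):
    """Return the component name from an argv-style list, or raise ValueError."""
    if len(argv) != 2:
        raise ValueError("expected exactly one argument: the Airflow component name")
    component = argv[1]
    if component not in KNOWN_COMPONENTS:
        raise ValueError(f"unknown Airflow component: {component}")
    return component


print(parse_component(["entrypoint.py", "scheduler"]))
```

Raising an exception rather than exiting inside the helper keeps it easy to test; a real `main()` could catch the error, print it, and exit with a non-zero status.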