Addressing feedback
- Minor changes/fixes to CODING_GUIDELINES.md and CONTRIBUTING.md
- Reduce the number of layers generated by the Dockerfile.
- Extract constants from `002-install-python.sh` and
  `003-install-mariadb.sh` into the Dockerfile to make the files as
  reusable as possible.
- Some refactoring
rafidka committed Jan 18, 2024
1 parent 4841c6c commit 370be7a
Showing 9 changed files with 228 additions and 124 deletions.
4 changes: 2 additions & 2 deletions CODING_GUIDELINES.md
@@ -4,7 +4,7 @@ _This is a still work-in-progress and is likely to be updated during the early p

This document contains the coding guidelines we follow in this repository. We follow the guidelines here strictly, so make sure your Pull Requests abide by them.

To make it easier for developers to know the guidelines for what they are contributing, this document has multiple sections. Use the list below to jump to the section related to the code iny our Pull Request.
To make it easier for developers to know the guidelines for the type of their contribution, this document has multiple sections. Use the list below to jump to the section related to the code you plan to submit.

## Table of Contents

@@ -14,7 +14,7 @@ To make it easier for developers to know the guidelines for what they are contri

## Bash Scripts

For Bash scripts, we use [ShellCheck](https://www.shellcheck.net/) to help the developers common bugs and bad practices related to Bash scripts. We have GitHub workflows that execute ShellCheck on every Bash script in the repository, and fails if the code breaks any of its rules. To make it easier for the developer to test their code before publishing a PR, we have pre-commit hooks that automatically test your code. However, you need to setup `pre-commit` for the hooks to run. Check the README files for instructions.
For Bash scripts, we use [ShellCheck](https://www.shellcheck.net/) to help developers discover common bugs and bad practices related to Bash scripts. We have GitHub workflows that execute ShellCheck on every Bash script in the repository, and fail if the code breaks any of its rules. To make it easier for developers to test their code before publishing a PR, we have pre-commit hooks that automatically test your code. However, you need to set up `pre-commit` for the hooks to run. Check the README files for instructions.

## Python Scripts

3 changes: 2 additions & 1 deletion CONTRIBUTING.md
@@ -27,7 +27,8 @@ Contributions via pull requests are much appreciated. Before sending us a pull r
1. You are working against the latest source on the _main_ branch.
2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
4. You have configured [`pre-commit`](https://pre-commit.com/). If you don't configure `pre-commit`, then you are considerably more likely to be pushing code that fails our GitHub workflows. To configure it, you can execute the following commands on the repository root:
4. You have read our [CODING_GUIDELINES.md](CODING_GUIDELINES.md).
5. You have configured [`pre-commit`](https://pre-commit.com/). If you don't configure `pre-commit`, then you are considerably more likely to be pushing code that fails our GitHub workflows. To configure it, you can execute the following commands on the repository root:

```
pip install pre-commit
135 changes: 86 additions & 49 deletions images/airflow/2.8.0/Dockerfile
@@ -1,77 +1,103 @@
#
# WARNING: Don't change this file manually. This file is auto-generated from the
# Jinja2-templated Dockerfile.j2 file, so you need to change that file instead.
# WARNING: Don't change this file manually. This file is auto-generated from
# the Jinja2-templated Dockerfile.j2 file, so you need to change that file
# instead.
#
# This file was generated on 2024-01-17 21:57:12.451896
# This file was generated on 2024-01-18 19:40:10.545188
#

FROM public.ecr.aws/amazonlinux/amazonlinux:2023

# Versions
ENV AIRFLOW_VERSION=2.8.0
# Environment variables
ENV AIRFLOW_AMAZON_PROVIDERS_VERSION=8.13.0
ENV PYTHON_VERSION=3.11
ENV AIRFLOW_USER_HOME=/usr/local/airflow
ENV AIRFLOW_HOME=${AIRFLOW_USER_HOME}
ENV AIRFLOW_VERSION=2.8.0
ENV PYTHON_VERSION=3.11.7

ENV PATH_DEFAULT=${PATH}
ENV PATH_AIRFLOW_USER=${AIRFLOW_USER_HOME}/.local/bin:${PATH_DEFAULT}
# Temporary environment variables (removed at the end of the Dockerfile)
ARG PATH_DEFAULT=${PATH}
ARG PATH_AIRFLOW_USER=${AIRFLOW_USER_HOME}/.local/bin:${PATH_DEFAULT}
ARG PYTHON_MD5_CHECKSUM=d96c7e134c35a8c46236f8a0e566b69c
ARG MARIADB_DOWNLOAD_BASE_URL=https://mirror.mariadb.org/yum/11.1/fedora38-amd64/rpms
ARG MARIADB_RPM_COMMON=MariaDB-common-11.1.2-1.fc38.x86_64.rpm
ARG MARIADB_RPM_COMMON_CHECKSUM=e87371d558efa97724f3728fb214cf19
ARG MARIADB_RPM_DEVEL=MariaDB-devel-11.1.2-1.fc38.x86_64.rpm
ARG MARIADB_RPM_DEVEL_CHECKSUM=cfce6e9b53f4e4fb1cb14f1ed720c92c
ARG MARIADB_RPM_SHARED=MariaDB-shared-11.1.2-1.fc38.x86_64.rpm
ARG MARIADB_RPM_SHARED_CHECKSUM=ed82ad5bc5b35cb2719a9471a71c6cdb

# Bootstrapping steps (root user - first pass)
# Copy bootstrapping files.
COPY ./bootstrap /bootstrap
RUN chmod -R +x /bootstrap

COPY ./bootstrap/01-root-firstpass/001-init.sh /001-init.sh
RUN chmod +x /001-init.sh && /001-init.sh
RUN rm /001-init.sh
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# BEGINNING marker for root user, first pass bootstrapping steps.
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

COPY ./bootstrap/01-root-firstpass/002-install-python.sh /002-install-python.sh
RUN chmod +x /002-install-python.sh && /002-install-python.sh
RUN rm /002-install-python.sh

COPY ./bootstrap/01-root-firstpass/003-install-mariadb.sh /003-install-mariadb.sh
RUN chmod +x /003-install-mariadb.sh && /003-install-mariadb.sh
RUN rm /003-install-mariadb.sh
RUN /bootstrap/01-root-firstpass/001-init.sh

COPY ./bootstrap/01-root-firstpass/004-create-airflow-user.sh /004-create-airflow-user.sh
RUN chmod +x /004-create-airflow-user.sh && /004-create-airflow-user.sh
RUN rm /004-create-airflow-user.sh
RUN /bootstrap/01-root-firstpass/002-install-python.sh

COPY ./bootstrap/01-root-firstpass/005-install-aws-cli.sh /005-install-aws-cli.sh
RUN chmod +x /005-install-aws-cli.sh && /005-install-aws-cli.sh
RUN rm /005-install-aws-cli.sh
RUN /bootstrap/01-root-firstpass/003-install-mariadb.sh

COPY ./bootstrap/01-root-firstpass/999-install-needed-dnf-packages.sh /999-install-needed-dnf-packages.sh
RUN chmod +x /999-install-needed-dnf-packages.sh && /999-install-needed-dnf-packages.sh
RUN rm /999-install-needed-dnf-packages.sh
RUN /bootstrap/01-root-firstpass/004-create-airflow-user.sh

RUN /bootstrap/01-root-firstpass/005-install-aws-cli.sh

# Bootstrapping steps (airflow user)
RUN /bootstrap/01-root-firstpass/999-install-needed-dnf-packages.sh

USER root
COPY ./bootstrap/02-airflow/001-install-airflow.sh /001-install-airflow.sh
RUN chmod +x /001-install-airflow.sh
ENV PATH=${PATH_AIRFLOW_USER}}

#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
# END marker for root user, first pass bootstrapping steps.
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<



#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# > BEGINNING marker for airflow user bootstrapping steps.
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

# Switch to 'airflow' user and update the PATH environment variable.
USER airflow
RUN /001-install-airflow.sh
ENV PATH=${PATH_AIRFLOW_USER}


RUN /bootstrap/02-airflow/001-install-airflow.sh


# Revert the PATH and user.
ENV PATH=${PATH_DEFAULT}
USER root
RUN rm /001-install-airflow.sh

#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
# > END marker for airflow user bootstrapping steps.
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<



#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# BEGINNING marker for root user, second pass bootstrapping steps.
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

# These steps are also executed as the root user. However, they rely on the
# successful execution of the airflow user bootstrapping steps. For example,
# giving ownership of the Airflow home user to the 'airflow' user requires
# the folder to be fully set up first.


RUN /bootstrap/03-root-secondpass/001-create-mwaa-dir.sh

# Bootstrapping steps (root user - second pass)
# Put in these steps stuff that you want to execute as a root user
# and also relies on the successful execution of the bootstrapping
# steps of the 'airflow' user. For example, giving ownership of the
# Airflow home user to the 'airflow' user, which requires having all
# files.
RUN /bootstrap/03-root-secondpass/999-chown-airflow-folder.sh

COPY ./bootstrap/03-root-secondpass/001-create-mwaa-dir.sh /001-create-mwaa-dir.sh
RUN chmod +x /001-create-mwaa-dir.sh && /001-create-mwaa-dir.sh
RUN rm /001-create-mwaa-dir.sh

COPY ./bootstrap/03-root-secondpass/999-chown-airflow-folder.sh /999-chown-airflow-folder.sh
RUN chmod +x /999-chown-airflow-folder.sh && /999-chown-airflow-folder.sh
RUN rm /999-chown-airflow-folder.sh
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
# END marker for root user, second pass bootstrapping steps.
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

# remove bootstrapping files.
RUN rm -rf /bootstrap

# Create a volume for syncing files with the sidecar. The actual folder
# is created by the `001-create-mwaa-dir.sh` script.
@@ -80,9 +106,20 @@ VOLUME ["/usr/local/mwaa"]
# TODO We should only expose this port if the command is 'webserver'.
EXPOSE 8080

ENV PATH=${PATH_AIRFLOW_USER}}
RUN unset PATH_DEFAULT
RUN unset PATH_AIRFLOW_USER
ENV PATH=${PATH_AIRFLOW_USER}

# # Unset temporary environment variables.
# RUN unset \
# PATH_DEFAULT \
# PATH_AIRFLOW_USER \
# PYTHON_MD5_CHECKSUM \
# MARIADB_DOWNLOAD_BASE_URL \
# MARIADB_RPM_COMMON \
# MARIADB_RPM_COMMON_CHECKSUM \
# MARIADB_RPM_DEVEL \
# MARIADB_RPM_DEVEL_CHECKSUM \
# MARIADB_RPM_SHARED \
# MARIADB_RPM_SHARED_CHECKSUM

WORKDIR ${AIRFLOW_USER_HOME}

99 changes: 67 additions & 32 deletions images/airflow/2.8.0/Dockerfile.j2
@@ -1,57 +1,92 @@
FROM public.ecr.aws/amazonlinux/amazonlinux:2023

# Versions
ENV AIRFLOW_VERSION=2.8.0
# Environment variables
ENV AIRFLOW_AMAZON_PROVIDERS_VERSION=8.13.0
ENV PYTHON_VERSION=3.11
ENV AIRFLOW_USER_HOME=/usr/local/airflow
ENV AIRFLOW_HOME=${AIRFLOW_USER_HOME}
ENV AIRFLOW_VERSION=2.8.0
ENV PYTHON_VERSION=3.11.7

# We don't want those variables to stay in the final image, so we use ARG instead of ENV.
ARG PATH_DEFAULT=${PATH}
ARG PATH_AIRFLOW_USER=${AIRFLOW_USER_HOME}/.local/bin:${PATH_DEFAULT}
ARG PYTHON_MD5_CHECKSUM=d96c7e134c35a8c46236f8a0e566b69c
ARG MARIADB_DOWNLOAD_BASE_URL=https://mirror.mariadb.org/yum/11.1/fedora38-amd64/rpms
ARG MARIADB_RPM_COMMON=MariaDB-common-11.1.2-1.fc38.x86_64.rpm
ARG MARIADB_RPM_COMMON_CHECKSUM=e87371d558efa97724f3728fb214cf19
ARG MARIADB_RPM_DEVEL=MariaDB-devel-11.1.2-1.fc38.x86_64.rpm
ARG MARIADB_RPM_DEVEL_CHECKSUM=cfce6e9b53f4e4fb1cb14f1ed720c92c
ARG MARIADB_RPM_SHARED=MariaDB-shared-11.1.2-1.fc38.x86_64.rpm
ARG MARIADB_RPM_SHARED_CHECKSUM=ed82ad5bc5b35cb2719a9471a71c6cdb

ENV PATH_DEFAULT=${PATH}
ENV PATH_AIRFLOW_USER=${AIRFLOW_USER_HOME}/.local/bin:${PATH_DEFAULT}
# Copy bootstrapping files.
COPY ./bootstrap /bootstrap
RUN chmod -R +x /bootstrap

# Bootstrapping steps (root user - first pass)
{% for filename, filepath in bootstrapping_scripts_root_firstpass %}
COPY {{ filepath }} /{{ filename }}
RUN chmod +x /{{ filename }} && /{{ filename }}
RUN rm /{{ filename }}
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# BEGINNING marker for root user, first pass bootstrapping steps.
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

{% for filepath in bootstrapping_scripts_root_firstpass %}
RUN {{ filepath }}
{% endfor %}

# Bootstrapping steps (airflow user)
{% for filename, filepath in bootstrapping_scripts_airflow %}
USER root
COPY {{ filepath }} /{{ filename }}
RUN chmod +x /{{ filename }}
ENV PATH=${PATH_AIRFLOW_USER}}
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
# END marker for root user, first pass bootstrapping steps.
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<



#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# > BEGINNING marker for airflow user bootstrapping steps.
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

# Switch to 'airflow' user and update the PATH environment variable.
USER airflow
RUN /{{ filename }}
ENV PATH=${PATH_AIRFLOW_USER}

{% for filepath in bootstrapping_scripts_airflow %}
RUN {{ filepath }}
{% endfor %}

# Revert the PATH and user.
ENV PATH=${PATH_DEFAULT}
USER root
RUN rm /{{ filename }}
{% endfor %}

# Bootstrapping steps (root user - second pass)
# Put in these steps stuff that you want to execute as a root user
# and also relies on the successful execution of the bootstrapping
# steps of the 'airflow' user. For example, giving ownership of the
# Airflow home user to the 'airflow' user, which requires having all
# files.
{% for filename, filepath in bootstrapping_scripts_root_secondpass %}
COPY {{ filepath }} /{{ filename }}
RUN chmod +x /{{ filename }} && /{{ filename }}
RUN rm /{{ filename }}
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
# > END marker for airflow user bootstrapping steps.
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<



#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# BEGINNING marker for root user, second pass bootstrapping steps.
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

# These steps are also executed as the root user. However, they rely on the
# successful execution of the airflow user bootstrapping steps. For example,
# giving ownership of the Airflow home user to the 'airflow' user requires
# the folder to be fully set up first.

{% for filepath in bootstrapping_scripts_root_secondpass %}
RUN {{ filepath }}
{% endfor %}

#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
# END marker for root user, second pass bootstrapping steps.
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

# remove bootstrapping files.
RUN rm -rf /bootstrap

# Create a volume for syncing files with the sidecar. The actual folder
# is created by the `001-create-mwaa-dir.sh` script.
VOLUME ["/usr/local/mwaa"]

# TODO We should only expose this port if the command is 'webserver'.
EXPOSE 8080

ENV PATH=${PATH_AIRFLOW_USER}}
RUN unset PATH_DEFAULT
RUN unset PATH_AIRFLOW_USER
ENV PATH=${PATH_AIRFLOW_USER}

WORKDIR ${AIRFLOW_USER_HOME}
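
A note on why the `RUN unset ...` lines give way to `ARG` in this diff: Docker runs each `RUN` instruction in its own shell process, so an `unset` issued there cannot change the environment recorded in the image. The sketch below models that behavior with a plain child shell. It is an illustration only, not part of this commit, and the value assigned to `PATH_DEFAULT` is a placeholder.

```shell
#!/bin/bash
# Stand-in for a value defined earlier in the build (placeholder value).
export PATH_DEFAULT="/usr/local/bin:/usr/bin"

# Model `RUN unset PATH_DEFAULT`: the unset happens inside a child shell
# that exits as soon as the command finishes.
sh -c 'unset PATH_DEFAULT'

# Model the next instruction: the parent environment is unchanged.
value_after_unset="${PATH_DEFAULT:-<unset>}"
echo "PATH_DEFAULT after child unset: ${value_after_unset}"
```

Because the child shell's changes are discarded, the variable is still visible afterwards. Declaring such values with `ARG` avoids the problem, since build arguments are not persisted as environment variables in the final image.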

@@ -1,31 +1,35 @@
#!/bin/bash
set -e

dnf install -y wget xz tar
# shellcheck source=images/airflow/2.8.0/bootstrap/common.sh
source /bootstrap/common.sh

verify_env_vars_exist \
PYTHON_VERSION \
PYTHON_MD5_CHECKSUM

PYTHON_VERSION=3.11.7
PYTHON_MD5_CHECKSUM=d96c7e134c35a8c46236f8a0e566b69c
dnf install -y wget xz tar

mkdir python_install
python_file=Python-$PYTHON_VERSION
python_tar=$python_file.tar
python_tar_xz=$python_tar.xz
python_tar=${python_file}.tar
python_tar_xz=${python_tar}.xz

# Download Python's source code archive.
mkdir python_source
wget "https://www.python.org/ftp/python/$PYTHON_VERSION/$python_tar_xz" -P /python_source
wget "https://www.python.org/ftp/python/${PYTHON_VERSION}/${python_tar_xz}" -P /python_source

# Verify the checksum
echo "$PYTHON_MD5_CHECKSUM /python_source/$python_tar_xz" | md5sum --check - | grep --basic-regex "^/python_source/${python_tar_xz}: OK$"

cp /python_source/$python_tar_xz /python_install/$python_tar_xz
unxz ./python_install/$python_tar_xz
tar -xf ./python_install/$python_tar -C ./python_install
cp "/python_source/${python_tar_xz}" "/python_install/${python_tar_xz}"
unxz "./python_install/${python_tar_xz}"
tar -xf "./python_install/${python_tar}" -C ./python_install

dnf install -y dnf-plugins-core
dnf builddep -y python3

pushd /python_install/$python_file
pushd "/python_install/${python_file}"
./configure
make install -s -j "$(nproc)" # use -j to set the cores for the build
popd
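
The script above calls `verify_env_vars_exist` from `/bootstrap/common.sh`, which is not shown in this diff. As a rough idea of what such a helper might look like, here is a minimal sketch. The function body, the error message, and the `DEMO_PYTHON_VERSION` variable are assumptions made for illustration, not the repository's actual implementation.

```shell
#!/bin/bash
# Hypothetical sketch of a helper like verify_env_vars_exist; the real one
# lives in /bootstrap/common.sh and may differ.
verify_env_vars_exist() {
    local missing=0
    local var_name
    for var_name in "$@"; do
        # printenv exits non-zero when the named variable is not present in
        # the environment (build ARGs are exposed to RUN as environment
        # variables, so they are visible here).
        if ! printenv "${var_name}" > /dev/null; then
            echo "Missing required environment variable: ${var_name}" >&2
            missing=1
        fi
    done
    # Non-zero return makes a caller running under `set -e` abort.
    return "${missing}"
}

# Usage with a placeholder variable.
export DEMO_PYTHON_VERSION=3.11.7
verify_env_vars_exist DEMO_PYTHON_VERSION && echo "all required variables are set"
```

Reporting every missing name before returning, rather than stopping at the first one, is a design choice of this sketch. It lets a failed build list all absent variables in a single run.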