Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert to OCI builder #125

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

LaurentGoderre
Copy link
Member

Addresses #124

Copy link
Member

@tianon tianon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start! There are a lot of little nitpicks, but I tried to stay away from them and focus on the higher-level things first so we can hash out what the end state needs to look like before we get too buried in the nuts and bolts / implementation (although it's definitely useful to have an explicit implementation to point at for articulating and discussing those 👍).

I think at a really high level, this probably needs to be integrated deeper into the existing build process, especially so we can have it be fully reproducible (by building it inside that container/image). 🤔


created="2025-01-21T23:32:32Z"

arches=( *"/$image/hello" )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I noted in #124, we should probably do this and #122 together, which will simplify parts of this script (and other scripts in the repository).

I'm thinking something as straightforward as removing the hello-world subdirectory in all the architecture directories.

schemaVersion: 2,
mediaType: "application/vnd.oci.image.index.v1+json",
manifests: $manifests
}' > "$ociroot/index.json"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lumping all the architectures together in one big index.json won't actually work in the oci-import builder -- it requires exactly/only one image in index.json or (as I noted in #124) that we specify File: pointing to a JSON file that contains a single image descriptor.

As much as I'd like to lump all these into a single OCI layout here (it's very cute!), I think we'll actually have a much better time downstream if we instead do a separate OCI layout per architecture (amd64/oci/index.json, i386/oci/index.json, etc), especially since the sourceId for the image/build will currently be calculated based on the full contents of the OCI layout, not just the one architecture, so any update to any binaries will result in "rebuilding" all of them (which, to be fair, is something we should try to fix downstream, but implementing that correctly is very complex).

} | tojson'
}

created="2025-01-21T23:32:32Z"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for this specific value? 😅

(long-term we probably need to scrape an appropriate value per-architecture from the latest Git commit to touch the relevant hello binary or something like that, but that might get recursive if we want to update the binary and the OCI layouts in one step 🤔 there's also an argument we could make that committing the binaries and the tarballs/OCI layouts is overkill and we should only commit the latter)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hahaha, that was s just the value from the current one in registry and forgot to change it


mkdir -p $blobs

toBlob() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +33 to +39
"com.docker.official-images.bashbrew.arch": $arch,
"org.opencontainers.image.base.name": "scratch",
"org.opencontainers.image.created": $timestamp,
"org.opencontainers.image.revision": $revision,
"org.opencontainers.image.source": "https://github.com/docker-library/hello-world.git#\($revision):\($arch)/hello-world",
"org.opencontainers.image.url": "https://hub.docker.com/_/hello-world",
"org.opencontainers.image.version": $os
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these annotations are added automatically in the index by meta-scripts, so we shouldn't need them at all here. 🙈

local arch="$1"; shift

path="./$arch/hello-world"
tar -C $path -cf - hello > tmp.tar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to commit these tarballs (and even if we're not, really), we should go out of our way to make sure they are generated reproducibly like we do in BusyBox, which is probably going to involve/require we build them inside a container or image (such as the Dockerfile.build we already have 😅): https://github.com/docker-library/busybox/blob/69136da0a606fde9f84b593f975060819960f42b/build.sh#L44-L53

Comment on lines +141 to +164
platform=""

case "$arch" in
arm64v8)
platform=$(jq -r -n --arg os $os --arg arch "arm64" --arg variant "v8" '{
architecture: $arch,
os: $os,
variant: $variant
} | tojson')
;;
arm32v*)
platform=$(jq -r -n --arg os $os --arg arch "arm" --arg variant "${arch/arm32/}" '{
architecture: $arch,
os: $os,
variant: $variant
} | tojson')
;;
*)
platform=$(jq -r -n --arg os $os --arg arch $arch '{
architecture: $arch,
os: $os
} | tojson')
;;
esac
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should let bashbrew interpret these for us instead, like we do in BusyBox: https://github.com/docker-library/busybox/blob/69136da0a606fde9f84b593f975060819960f42b/build.sh#L11-L13 😅

(however, we don't actually need this for index.json - only for the config blob)

"created": $timestamp,
} | tojson')"

toBlob $content
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having all these functions go straight into blobs/sha256/xxx is valid, but it makes the end result hard for us humans to navigate and make sense out of, which is why BusyBox takes the approach of human-readable names in the OCI directory like rootfs.tar.gz, image-config.json, and image-manifest.json, and then makes symlinks into blobs/sha256/xxx instead: https://github.com/docker-library/busybox/blob/69136da0a606fde9f84b593f975060819960f42b/build.sh#L88

https://github.com/docker-library/busybox/blob/69136da0a606fde9f84b593f975060819960f42b/latest/musl/amd64/image-config.json

https://github.com/docker-library/busybox/blob/69136da0a606fde9f84b593f975060819960f42b/latest/musl/amd64/image-manifest.json

https://github.com/docker-library/busybox/tree/69136da0a606fde9f84b593f975060819960f42b/latest/musl/amd64/blobs/sha256

manifests=$(jq -r -n --argjson manifests $manifests --argjson manifest $indexdescriptor '$manifests | . += [$manifest] | tojson')
done

# TODO: Windows
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, we won't be able to use OCI import for Windows (the only remotely tenable means would be foreign layers for the Windows layers, but I believe the oci-import builder explicitly blocks/clears those), so they'll need to stay as Dockerfile-based builds for the foreseeable future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants