merge dev
Signed-off-by: David Wood <[email protected]>
daw3rd committed Sep 26, 2024
2 parents 3d2de8c + 7241d6a commit 0666491
Showing 8 changed files with 82 additions and 31 deletions.
46 changes: 27 additions & 19 deletions RELEASE.md
@@ -1,12 +1,12 @@
# Release Management

## Overview
Release are created from the main repository branch using the version
Releases are created from the main repository branch using the version
numbers, including an intermediate version suffix,
defined in `.make.versions`.
The following points are important:

1. In general, common a version number is used for all published pypi wheels and docker images.
1. In general, a common version number is used for all published pypi wheels and docker images.
1. `.make.versions` contains the version to be used when publishing the **next** release.
1. Whenever `.make.versions` is changed, `make set-versions` should be run from the top of the repo.
1. Corollary: `make set-versions` should ONLY be used from the top of the repo when `.make.versions` changes.
@@ -20,29 +20,35 @@ allows intermediate publishing from the main branch using version X.Y.Z.dev\<N\>
## Cutting the release
Creating the release involves

1. Creating a release branch and tag and updating the main branch versions.
1. Creating a github release from the release branch and tag.
1. Edit the `release-notes.md` to list major/minor changes
1. Creating a release branch and updating the main branch versions (using `release-branch.sh`).
1. Creating a github release and tag from the release branch.
1. Building and publishing pypi library wheels and docker registry image.

Each is discussed below.

### Creating release branch and tag
### Editing release-notes.md
Make a dummy release on github (see below) to get a listing of all commits.
Use this to come up with the items.
Commit this to the main branch so it is ready for including in the release branch.

### Creating release branch
The `scripts/release-branch.sh` is currently run manually to create the branch and tags as follows:

1. Creates the `releases/vX.Y.Z` from the main branch where `X.Y.Z` are defined in .make.versions
1. Creates the `vX.Y.Z` branch for PR'ing back into the `releases/vX.Y.Z` branch.
1. In the new `vX.Y.Z` branch
1. Nulls out the version suffix in the new branch's `.make.version` file.
1. Applies the unsuffixed versions to the artifacts published from the repo using `make set-versions`.
1. Commits and pushes branch and tag
1. Commits and pushes branch
1. Creates the `pending-version-change/vX.Y.Z` branch for PR'ing back into the main branch.
1. In the `pending-version-change/vX.Y.Z` branch
1. Increments the minor version (i.e. Z+1) and resets the suffix to `dev0` in `.make.versions`.
1. Commits and pushes branch

To double-check the version that will be published from the release,
```
git checkout releasing/vX.Y.Z
git checkout vX.Y.Z
make show-version
```
This will print for example, 1.2.3.
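
The branch names and version strings described in the steps above can be sketched as follows. This is a minimal illustration, not the contents of `scripts/release-branch.sh`: the function name `release_plan` and its dictionary keys are hypothetical, and it assumes `.make.versions` supplies a plain `X.Y.Z` version plus a separate suffix such as `dev0`.

```python
import re


def release_plan(version: str, suffix: str) -> dict:
    """Derive the names the release steps refer to from X.Y.Z and a suffix.

    Hypothetical helper for illustration only; the real work is done by
    scripts/release-branch.sh and `make set-versions`.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"expected X.Y.Z, got {version!r}")
    x, y, z = (int(part) for part in match.groups())
    return {
        # Version published from main between releases (X.Y.Z.dev<N>).
        "intermediate": f"{version}.{suffix}" if suffix else version,
        # Version published from the release branch: the suffix is nulled out.
        "release": version,
        # Branch that the GitHub release and tag are created from.
        "release_branch": f"releases/v{version}",
        # Branch PR'd into the release branch.
        "pr_branch": f"v{version}",
        # Branch PR'd back into main with the bumped version.
        "pending_branch": f"pending-version-change/v{version}",
        # Main moves to Z+1 with the suffix reset to dev0.
        "next_main": f"{x}.{y}.{z + 1}.dev0",
    }


plan = release_plan("0.2.1", "dev0")
for key, value in plan.items():
    print(f"{key}: {value}")
```

Under these assumptions, releasing 0.2.1 leaves main at 0.2.2.dev0, which matches the `data-prep-toolkit==0.2.2.dev0` pins elsewhere in this commit.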
@@ -58,20 +64,22 @@ After running the script, you should
2. Use the github web UI to create a git release and tag of the `releases/vX.Y.Z` branch
3. Create a pull request from branch `pending-version-change/vX.Y.Z` into the main branch, and merge.

### Github release
### Creating the Github Release
After running the `release-branch.sh` script, to create tag `vX.Y.Z` and branch `releases/vX.Y.Z`
and PRing/merging `vX.Y.Z` into `releases/vX.Y.Z`.
1. Go to the [releases page](https://github.com/IBM/data-prep-kit/releases).
2. Select `Draft a new release`
3. Select `Choose a tag -> vX.Y.Z`
4. Press `Generate release notes`
5. Add a title (e.g., Release X.Y.Z)
6. Add any additional release notes.
7. Press `Publish release`

### Publishing wheels and images
After creating the release branch and tag using the `scripts/release-branch.sh` script:

1. Switch to a release branch (e.g. releases/v1.2.3) created by the `release-branch.sh` script
1. Select `Draft a new release`
1. Select target branch `releases/vX.Y.Z`
1. Select `Choose a tag`, type in vX.Y.Z, click `Create tag`
1. Press `Generate release notes`
1. Add a title (e.g., Release X.Y.Z)
1. Add any additional release notes.
1. Press `Publish release`

### Building and Publishing Wheels and Images
After creating the release and tag on github:

1. Switch to a release branch (e.g. releases/v1.2.3).
1. Be sure you're at the top of the repository (`.../data-prep-kit`)
1. Optionally, `make show-version` to see the version that will be published
1. Running the following, either manually or in a git action
31 changes: 31 additions & 0 deletions release-notes.md
@@ -1,5 +1,36 @@
# Data Prep Kit Release notes

## Release 0.2.1 - 9/24/2024

### General
1. Bug fixes across the repo
1. Added AI Alliance RAG demo, tutorials and notebooks and tips for running on google colab
1. Added new transforms and single package for transforms published to pypi
1. Improved CI/CD with targeted workflow triggered on specific changes to specific modules
1. New enhancements for cutting a release


### data-prep-toolkit libraries (python, ray, spark)

1. Restructure the repository to distinguish/separate runtime libraries
1. Split data-processing-lib/ray into python and ray
1. Spark runtime
1. Updated pyarrow version
1. Define required transform() method as abstract to AbstractTableTransform
1. Enables configuration of makefile to use src or pypi for data-prep-kit library dependencies


### KFP Workloads

1. Add a configurable timeout before destroying the deployed Ray cluster.

### Transforms

1. Added 7 new transforms including: language identification, profiler, repo level ordering, doc quality, pdf2parquet, HTML2Parquet and PII Transform
1. Added ededup python implementation and incremental ededup
1. Added fuzzy floating point comparison


## Release 0.2.0 - 6/27/2024

### General
1 change: 1 addition & 0 deletions transforms/language/pdf2parquet/python/Dockerfile
@@ -26,6 +26,7 @@ RUN cd data-processing-lib-python && pip install --no-cache-dir -e .
# END OF STEPS destined for a data-prep-kit base image

COPY --chown=dpk:root pyproject.toml pyproject.toml
COPY --chown=dpk:root requirements.txt requirements.txt
RUN pip install ${PIP_INSTALL_EXTRA_ARGS} --no-cache-dir -e .

# Download models
12 changes: 4 additions & 8 deletions transforms/language/pdf2parquet/python/pyproject.toml
@@ -9,19 +9,15 @@ authors = [
{ name = "Michele Dolfi", email = "[email protected]" },
{ name = "Christoph Auer", email = "[email protected]" },
]
dependencies = [
"data-prep-toolkit==0.2.2.dev0",
"docling-core==1.2.0",
"docling-ibm-models==1.1.7",
"deepsearch-glm==0.21.0",
"docling==1.11.0",
"filetype >=1.2.0, <2.0.0",
]
dynamic = ["dependencies"]

[build-system]
requires = ["setuptools>=68.0.0", "wheel", "setuptools_scm[toml]>=7.1.0"]
build-backend = "setuptools.build_meta"

[tool.setuptools.dynamic]
dependencies = {file = ["requirements.txt"]}

[project.optional-dependencies]
dev = [
"twine",
6 changes: 6 additions & 0 deletions transforms/language/pdf2parquet/python/requirements.txt
@@ -0,0 +1,6 @@
data-prep-toolkit==0.2.2.dev0
docling-core==1.3.0
docling-ibm-models==1.1.7
deepsearch-glm==0.21.0
docling==1.11.0
filetype >=1.2.0, <2.0.0
1 change: 1 addition & 0 deletions transforms/language/pdf2parquet/ray/Dockerfile
@@ -27,6 +27,7 @@ RUN cd python-transform && pip install ${PIP_INSTALL_EXTRA_ARGS} --no-cache-dir


COPY --chown=ray:users pyproject.toml pyproject.toml
COPY --chown=ray:users requirements.txt requirements.txt
RUN pip install ${PIP_INSTALL_EXTRA_ARGS} --no-cache-dir -e .

# Download models
9 changes: 5 additions & 4 deletions transforms/language/pdf2parquet/ray/pyproject.toml
@@ -9,15 +9,16 @@ authors = [
{ name = "Michele Dolfi", email = "[email protected]" },
{ name = "Christoph Auer", email = "[email protected]" },
]
dependencies = [
"dpk-pdf2parquet-transform-python==0.2.2.dev0",
"data-prep-toolkit-ray==0.2.2.dev0",
]

dynamic = ["dependencies"]

[build-system]
requires = ["setuptools>=68.0.0", "wheel", "setuptools_scm[toml]>=7.1.0"]
build-backend = "setuptools.build_meta"

[tool.setuptools.dynamic]
dependencies = {file = ["requirements.txt"]}

[project.optional-dependencies]
dev = [
"twine",
7 changes: 7 additions & 0 deletions transforms/language/pdf2parquet/ray/requirements.txt
@@ -0,0 +1,7 @@
dpk-pdf2parquet-transform-python==0.2.2.dev0
data-prep-toolkit-ray==0.2.2.dev0
docling-core==1.3.0
docling-ibm-models==1.1.7
deepsearch-glm==0.21.0
docling==1.11.0
filetype >=1.2.0, <2.0.0
