Commit: merge kfp 2.0.3
Tomcli committed Oct 27, 2023
2 parents e735b67 + 58ce09e commit c697471
Showing 124 changed files with 6,666 additions and 1,037 deletions.
42 changes: 42 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,47 @@
# Changelog

### [2.0.3](https://github.com/kubeflow/pipelines/compare/2.0.2...2.0.3) (2023-10-27)


### Features

* **backend:** Support consuming parent DAG input artifact ([\#10162](https://github.com/kubeflow/pipelines/issues/10162)) ([52f5cf5](https://github.com/kubeflow/pipelines/commit/52f5cf51c4a6c233aae57125561c0fc95c4fd20f))
* **backend:** Update driver and launcher images ([\#10164](https://github.com/kubeflow/pipelines/issues/10164)) ([c0093ec](https://github.com/kubeflow/pipelines/commit/c0093ecef6bc5f056efa135d019267327115d79d))
* **components:** [endpoint_batch_predict] Initialize component ([0d75611](https://github.com/kubeflow/pipelines/commit/0d7561199751e83b4d7e1603c3d32d4088a7e208))
* **components:** [text2sql] Generate column names by model batch predict ([1bee8be](https://github.com/kubeflow/pipelines/commit/1bee8be071a91f44c0129837c381863327cb337d))
* **components:** [text2sql] Generate table names by model batch prediction ([ebb4245](https://github.com/kubeflow/pipelines/commit/ebb42450d0b07eaa8de35a3f6b70eacb5f26f0d8))
* **components:** [text2sql] Implement preprocess component logic ([21079b5](https://github.com/kubeflow/pipelines/commit/21079b5910e597a38b67853f3ecfb3929344371e))
* **components:** [text2sql] Initialize preprocess component and integrate with text2sql pipeline ([9aa750e](https://github.com/kubeflow/pipelines/commit/9aa750e62f6e225d037ecdda9bf7cab95f05675d))
* **components:** [text2sql] Initialize evaluation component ([ea93979](https://github.com/kubeflow/pipelines/commit/ea93979eed02e131bd20180da149b9465670dfe1))
* **components:** [text2sql] Initialize validate and process component ([633ddeb](https://github.com/kubeflow/pipelines/commit/633ddeb07e9212d2e373dba8d20a0f6d67ab037d))
* **components:** Add ability to preprocess chat llama datasets to `_implementation.llm.chat_dataset_preprocessor` ([99fd201](https://github.com/kubeflow/pipelines/commit/99fd2017a76660f30d0a04b71542cbef45783633))
* **components:** Add question_answer support for AutoSxS default instructions ([412216f](https://github.com/kubeflow/pipelines/commit/412216f832a848bfc61ce289aed819d7f2860fdd))
* **components:** Add sliced evaluation metrics support for custom and unstructured AutoML models in evaluation feature attribution pipeline ([d8a0660](https://github.com/kubeflow/pipelines/commit/d8a0660df525f5695015e507e981bceff836dd3d))
* **components:** Add sliced evaluation metrics support for custom and unstructured AutoML models in evaluation pipeline ([0487f9a](https://github.com/kubeflow/pipelines/commit/0487f9a8b1d8ab0d96d757bd4b598ffd353ecc81))
* **components:** add support for customizing model_parameters in LLM eval text generation and LLM eval text classification pipelines ([d53ddda](https://github.com/kubeflow/pipelines/commit/d53dddab1c8a042e58e06ff6eb38be82fefddb0a))
* **components:** Make `model_checkpoint` optional for `preview.llm.infer_pipeline` ([e8fb699](https://github.com/kubeflow/pipelines/commit/e8fb6990dfdf036c941c522f9b384ff679b38ca6))
* **components:** migrate `DataflowFlexTemplateJobOp` to GA namespace (now `v1.dataflow.DataflowFlexTemplateJobOp`) ([faba922](https://github.com/kubeflow/pipelines/commit/faba9223ee846d459f7bb497a6faa3c153dcf430))
* **components:** Set display names for SFT, RLHF and LLM inference pipelines ([1386a82](https://github.com/kubeflow/pipelines/commit/1386a826ba2bcdbc19eb2007ca43f6acd1031e4d))
* **components:** Support service account in kubeflow model_batch_predict component ([1682ce8](https://github.com/kubeflow/pipelines/commit/1682ce8adeb2c55a155588eae7492b2f0a8b783a))
* **components:** Update image tag used by llm pipelines ([4d71fda](https://github.com/kubeflow/pipelines/commit/4d71fdac3fc92dd4d54c6be3a28725667b8f3c5e))
* **sdk:** support a Pythonic artifact authoring style ([\#9932](https://github.com/kubeflow/pipelines/issues/9932)) ([8d00d0e](https://github.com/kubeflow/pipelines/commit/8d00d0eb9a1442ed994b6a90acea88604efc6423)); sketched after this list
* **sdk:** support collecting outputs from conditional branches using `dsl.OneOf` ([\#10067](https://github.com/kubeflow/pipelines/issues/10067)) ([2d3171c](https://github.com/kubeflow/pipelines/commit/2d3171cbfec626055e59b8a58ce83fb54ecad113)); sketched after this list
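
For illustration, a minimal sketch of the Pythonic artifact authoring style from #9932: components accept and return artifact instances directly instead of using `Input[...]`/`Output[...]` annotations. The component names and metadata keys are invented for the example:

```python
from kfp import dsl
from kfp.dsl import Dataset


@dsl.component
def make_dataset(text: str) -> Dataset:
    # dsl.get_uri() allocates a URI under the pipeline root for this
    # task's output; the artifact is constructed and returned directly.
    dataset = Dataset(uri=dsl.get_uri(), metadata={'num_chars': len(text)})
    with open(dataset.path, 'w') as f:
        f.write(text)
    return dataset


@dsl.component
def read_dataset(dataset: Dataset) -> str:
    # In the Pythonic style, input artifacts use the bare artifact type.
    with open(dataset.path) as f:
        return f.read()


@dsl.pipeline
def artifact_pipeline(text: str = 'hello') -> str:
    dataset_task = make_dataset(text=text)
    return read_dataset(dataset=dataset_task.output).output
```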
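
And a hedged sketch of `dsl.OneOf` from #10067, which resolves to the output of whichever mutually exclusive branch actually ran; the components here are stand-ins:

```python
from kfp import dsl


@dsl.component
def flip_coin() -> str:
    import random
    return random.choice(['heads', 'tails'])


@dsl.component
def announce(text: str) -> str:
    print(text)
    return text


@dsl.pipeline
def flip_coin_pipeline() -> str:
    flip_task = flip_coin()
    with dsl.If(flip_task.output == 'heads'):
        heads_task = announce(text='Got heads!')
    with dsl.Else():
        tails_task = announce(text='Got tails!')
    # Exactly one branch executes; OneOf collects that branch's output.
    return dsl.OneOf(heads_task.output, tails_task.output)
```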


### Bug Fixes

* **components:** [text2sql] Rename `model_inference_results_path` to `model_inference_results_directory` and remove a duplicate comment ([570e56d](https://github.com/kubeflow/pipelines/commit/570e56dd09af32e173cf041eed7497e4533ec186))
* **frontend:** Replace twitter artifactory endpoint with npm endpoint. ([\#10099](https://github.com/kubeflow/pipelines/issues/10099)) ([da6a360](https://github.com/kubeflow/pipelines/commit/da6a3601468282c0592eae8e89a3d97b982e2d43))
* **sdk:** fix bug when `dsl.importer` argument is provided by loop variable ([\#10116](https://github.com/kubeflow/pipelines/issues/10116)) ([73d51c8](https://github.com/kubeflow/pipelines/commit/73d51c8a23afad97efb6d7e7436c081fa22ce24d)); pattern sketched after this list
* **sdk:** Fix OOB for IPython and refactor. Closes [\#10075](https://github.com/kubeflow/pipelines/issues/10075). ([\#10094](https://github.com/kubeflow/pipelines/issues/10094)) ([c903271](https://github.com/kubeflow/pipelines/commit/c9032716ab2013df56cb1078a703d48ed8e36fb4))
* **sdk:** type annotation for client credentials ([\#10158](https://github.com/kubeflow/pipelines/issues/10158)) ([02e00e8](https://github.com/kubeflow/pipelines/commit/02e00e8439e9753dbf82856ac9c5a7cec8ce3243))
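
The `dsl.importer` fix in #10116 concerns the pattern below (a sketch with hypothetical URIs), where `artifact_uri` comes from a `dsl.ParallelFor` loop variable:

```python
from kfp import dsl
from kfp.dsl import Dataset


@dsl.pipeline
def import_many(uris: list = ['gs://bucket/a.csv', 'gs://bucket/b.csv']):
    with dsl.ParallelFor(items=uris) as uri:
        # Passing the loop variable here previously triggered the bug.
        dsl.importer(artifact_uri=uri, artifact_class=Dataset, reimport=False)
```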


### Other Pull Requests

* feat(components): Extend kserve component ([\#10136](https://github.com/kubeflow/pipelines/issues/10136)) ([2054b7c](https://github.com/kubeflow/pipelines/commit/2054b7c45d4831c787115563c8be0048abcb9be1))
* No public description ([0e240db](https://github.com/kubeflow/pipelines/commit/0e240db39799cb0afbd8c7f982ffdd4f9eb58121))

### [2.0.2](https://github.com/kubeflow/pipelines/compare/2.0.0...2.0.2) (2023-10-11)


2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-2.0.2
+2.0.3
4 changes: 2 additions & 2 deletions backend/api/v1beta1/python_http_client/README.md
@@ -3,8 +3,8 @@ This file contains REST API specification for Kubeflow Pipelines. The file is autogenerated

This Python package is automatically generated by the [OpenAPI Generator](https://openapi-generator.tech) project:

-- API version: 2.0.2
-- Package version: 2.0.2
+- API version: 2.0.3
+- Package version: 2.0.3
- Build package: org.openapitools.codegen.languages.PythonClientCodegen
For more information, please visit [https://www.google.com](https://www.google.com)

@@ -14,7 +14,7 @@

from __future__ import absolute_import

__version__ = "2.0.2"
__version__ = "2.0.3"

# import apis into sdk package
from kfp_server_api.api.experiment_service_api import ExperimentServiceApi
@@ -78,7 +78,7 @@ def __init__(self, configuration=None, header_name=None, header_value=None,
self.default_headers[header_name] = header_value
self.cookie = cookie
# Set default User-Agent.
-self.user_agent = 'OpenAPI-Generator/2.0.2/python'
+self.user_agent = 'OpenAPI-Generator/2.0.3/python'
self.client_side_validation = configuration.client_side_validation

def __enter__(self):
@@ -351,8 +351,8 @@ def to_debug_report(self):
return "Python SDK Debug Report:\n"\
"OS: {env}\n"\
"Python Version: {pyversion}\n"\
"Version of the API: 2.0.2\n"\
"SDK Package Version: 2.0.2".\
"Version of the API: 2.0.3\n"\
"SDK Package Version: 2.0.3".\
format(env=sys.platform, pyversion=sys.version)

def get_host_settings(self):
2 changes: 1 addition & 1 deletion backend/api/v1beta1/python_http_client/setup.py
@@ -13,7 +13,7 @@
from setuptools import setup, find_packages # noqa: H301

NAME = "kfp-server-api"
VERSION = "2.0.2"
VERSION = "2.0.3"
# To install the library, run the following
#
# python setup.py install
@@ -2,7 +2,7 @@
"swagger": "2.0",
"info": {
"title": "Kubeflow Pipelines API",
"version": "2.0.2",
"version": "2.0.3",
"description": "This file contains REST API specification for Kubeflow Pipelines. The file is autogenerated from the swagger definition.",
"contact": {
"name": "google",
4 changes: 2 additions & 2 deletions backend/api/v2beta1/python_http_client/README.md
@@ -3,8 +3,8 @@ This file contains REST API specification for Kubeflow Pipelines. The file is autogenerated

This Python package is automatically generated by the [OpenAPI Generator](https://openapi-generator.tech) project:

-- API version: 2.0.2
-- Package version: 2.0.2
+- API version: 2.0.3
+- Package version: 2.0.3
- Build package: org.openapitools.codegen.languages.PythonClientCodegen
For more information, please visit [https://www.google.com](https://www.google.com)

@@ -14,7 +14,7 @@

from __future__ import absolute_import

__version__ = "2.0.2"
__version__ = "2.0.3"

# import apis into sdk package
from kfp_server_api.api.auth_service_api import AuthServiceApi
@@ -78,7 +78,7 @@ def __init__(self, configuration=None, header_name=None, header_value=None,
self.default_headers[header_name] = header_value
self.cookie = cookie
# Set default User-Agent.
-self.user_agent = 'OpenAPI-Generator/2.0.2/python'
+self.user_agent = 'OpenAPI-Generator/2.0.3/python'
self.client_side_validation = configuration.client_side_validation

def __enter__(self):
@@ -351,8 +351,8 @@ def to_debug_report(self):
return "Python SDK Debug Report:\n"\
"OS: {env}\n"\
"Python Version: {pyversion}\n"\
"Version of the API: 2.0.2\n"\
"SDK Package Version: 2.0.2".\
"Version of the API: 2.0.3\n"\
"SDK Package Version: 2.0.3".\
format(env=sys.platform, pyversion=sys.version)

def get_host_settings(self):
2 changes: 1 addition & 1 deletion backend/api/v2beta1/python_http_client/setup.py
@@ -13,7 +13,7 @@
from setuptools import setup, find_packages # noqa: H301

NAME = "kfp-server-api"
VERSION = "2.0.2"
VERSION = "2.0.3"
# To install the library, run the following
#
# python setup.py install
@@ -2,7 +2,7 @@
"swagger": "2.0",
"info": {
"title": "Kubeflow Pipelines API",
"version": "2.0.2",
"version": "2.0.3",
"description": "This file contains REST API specification for Kubeflow Pipelines. The file is autogenerated from the swagger definition.",
"contact": {
"name": "google",
4 changes: 2 additions & 2 deletions backend/src/v2/compiler/argocompiler/argo.go
@@ -116,8 +116,8 @@ func Compile(jobArg *pipelinespec.PipelineJob, kubernetesSpecArg *pipelinespec.S
wf: wf,
templates: make(map[string]*wfapi.Template),
// TODO(chensun): release process and update the images.
-driverImage: "gcr.io/ml-pipeline/kfp-driver@sha256:fa68f52639b4f4683c9f8f468502867c9663823af0fbcff1cbe7847d5374bf5c",
-launcherImage: "gcr.io/ml-pipeline/kfp-launcher@sha256:6641bf94acaeec03ee7e231241800fce2f0ad92eee25371bd5248ca800a086d7",
+driverImage: "gcr.io/ml-pipeline/kfp-driver@sha256:8e60086b04d92b657898a310ca9757631d58547e76bbbb8bfc376d654bef1707",
+launcherImage: "gcr.io/ml-pipeline/kfp-launcher@sha256:50151a8615c8d6907aa627902dce50a2619fd231f25d1e5c2a72737a2ea4001e",
job: job,
spec: spec,
executors: deploy.GetExecutors(),
16 changes: 14 additions & 2 deletions backend/src/v2/driver/driver.go
@@ -768,7 +768,11 @@ func resolveInputs(ctx context.Context, dag *metadata.DAG, iterationIndex *int,
if err != nil {
return nil, err
}
glog.Infof("parent DAG input parameters %+v", inputParams)
inputArtifacts, err := mlmd.GetInputArtifactsByExecutionID(ctx, dag.Execution.GetID())
if err != nil {
return nil, err
}
glog.Infof("parent DAG input parameters: %+v, artifacts: %+v", inputParams, inputArtifacts)
inputs = &pipelinespec.ExecutorInput_Inputs{
ParameterValues: make(map[string]*structpb.Value),
Artifacts: make(map[string]*pipelinespec.ArtifactList),
@@ -998,7 +1002,15 @@ func resolveInputs(ctx context.Context, dag *metadata.DAG, iterationIndex *int,
}
switch t := artifactSpec.Kind.(type) {
case *pipelinespec.TaskInputsSpec_InputArtifactSpec_ComponentInputArtifact:
return nil, artifactError(fmt.Errorf("component input artifact not implemented yet"))
inputArtifactName := artifactSpec.GetComponentInputArtifact()
if inputArtifactName == "" {
return nil, artifactError(fmt.Errorf("component input artifact key is empty"))
}
v, ok := inputArtifacts[inputArtifactName]
if !ok {
return nil, artifactError(fmt.Errorf("parent DAG does not have input artifact %s", inputArtifactName))
}
inputs.Artifacts[name] = v

case *pipelinespec.TaskInputsSpec_InputArtifactSpec_TaskOutputArtifact:
taskOutput := artifactSpec.GetTaskOutputArtifact()
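At the SDK level, the driver change above roughly corresponds to pipelines like the following hedged sketch (URIs hypothetical): a component inside a nested pipeline consumes an artifact that arrives as an input of its parent DAG.

```python
from kfp import dsl
from kfp.dsl import Dataset, Input


@dsl.component
def consume(dataset: Input[Dataset]):
    print(dataset.uri)


@dsl.pipeline
def inner_pipeline(dataset: Input[Dataset]):
    # `dataset` is an input artifact of this inner DAG; the driver now
    # resolves it from the parent DAG's inputs instead of returning
    # "component input artifact not implemented yet".
    consume(dataset=dataset)


@dsl.pipeline
def outer_pipeline():
    importer = dsl.importer(
        artifact_uri='gs://bucket/data.csv',  # hypothetical
        artifact_class=Dataset,
    )
    inner_pipeline(dataset=importer.output)
```
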
11 changes: 9 additions & 2 deletions components/google-cloud/Dockerfile
@@ -28,7 +28,14 @@ RUN pip3 install -U google-cloud-storage
RUN pip3 install -U google-api-python-client

# Required by dataflow_launcher
RUN pip3 install -U "apache_beam[gcp]"
# Pin to `2.50.0` for compatibility with `google-cloud-aiplatform`, which
# depends on `shapely<3.0.0dev`.
# Prefer an exact pin, since GCPC's apache_beam version must match the
# version the in custom Dataflow worker images for the Dataflow job to succeed.
# Inexact pins risk that the apache_beam in GCPC drifts away from a
# user-specified version in the image.
# From docs: """When running your pipeline, launch the pipeline using the Apache Beam SDK with the same version and language version as the SDK on your custom container image. This step avoids unexpected errors from incompatible dependencies or SDKs.""" https://cloud.google.com/dataflow/docs/guides/using-custom-containers#before_you_begin_2
RUN pip3 install -U "apache_beam[gcp]==2.50.0"

# Required for sklearn/train_test_split_jsonl
RUN pip3 install -U "fsspec>=0.7.4" "gcsfs>=0.6.0" "pandas<=1.3.5" "scikit-learn<=1.0.2"
@@ -37,7 +44,7 @@ RUN pip3 install -U "fsspec>=0.7.4" "gcsfs>=0.6.0" "pandas<=1.3.5" "scikit-learn<=1.0.2"
RUN pip3 install -U google-cloud-notebooks

# Install main package
RUN pip3 install "git+https://github.com/kubeflow/pipelines.git@google-cloud-pipeline-components-2.4.1#egg=google-cloud-pipeline-components&subdirectory=components/google-cloud"
RUN pip3 install "git+https://github.com/kubeflow/pipelines.git@google-cloud-pipeline-components-2.5.0#egg=google-cloud-pipeline-components&subdirectory=components/google-cloud"

# Note that components can override the container entry point.
ENTRYPOINT ["python3","-m","google_cloud_pipeline_components.container.v1.aiplatform.remote_runner"]
12 changes: 12 additions & 0 deletions components/google-cloud/RELEASE.md
@@ -1,7 +1,19 @@
## Upcoming release

## Release 2.5.0
* Upload tensorboard metrics from `preview.llm.rlhf_pipeline` if a `tensorboard_resource_id` is provided at runtime.
* Support `incremental_train_base_model`, `parent_model`, `is_default_version`, `model_version_aliases`, `model_version_description` in `AutoMLImageTrainingJobRunOp`.
* Add `preview.automl.vision` and `DataConverterJobOp`.
* Set display names for `preview.llm` pipelines.
* Add sliced evaluation metrics support for custom and unstructured AutoML models in evaluation pipeline and evaluation pipeline with feature attribution.
* Support `service_account` in `ModelBatchPredictOp` (see the sketch after this diff).
* Release `DataflowFlexTemplateJobOp` to GA namespace (`v1.dataflow.DataflowFlexTemplateJobOp`).
* Make `model_checkpoint` optional for `preview.llm.infer_pipeline`. If not provided, the base model associated with the `large_model_reference` will be used.
* Bump `apache_beam[gcp]` version in GCPC container image from `<2.34.0` to `==2.50.0` for compatibility with `google-cloud-aiplatform`, which depends on `shapely<3.0.0dev`. Note: upgrading to `google-cloud-pipeline-components>=2.5.0` may require using a Dataflow worker image with `apache_beam==2.50.0`.
* Apply the latest GCPC image vulnerability resolutions (base OS and software updates).
* Add support for customizing `model_parameters` (`maxOutputTokens`, `topK`, `topP`, and `temperature`) in LLM eval text generation and LLM eval text classification pipelines.

## Release 2.4.1
* Disable caching for LLM pipeline tasks that store temporary artifacts.
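The `service_account` support in `ModelBatchPredictOp` noted above might be used as in this sketch; the project, bucket, and account values are hypothetical, and the remaining parameters follow the component's existing v1 interface:

```python
from google_cloud_pipeline_components.types import artifact_types
from google_cloud_pipeline_components.v1.batch_predict_job import ModelBatchPredictOp
from kfp import dsl


@dsl.pipeline
def batch_predict_pipeline():
    # Import an existing Vertex AI model by resource name (hypothetical).
    model = dsl.importer(
        artifact_uri='https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1/models/123',
        artifact_class=artifact_types.VertexModel,
        metadata={'resourceName': 'projects/my-project/locations/us-central1/models/123'},
    )
    ModelBatchPredictOp(
        project='my-project',
        location='us-central1',
        job_display_name='batch-predict',
        model=model.output,
        gcs_source_uris=['gs://my-bucket/instances.jsonl'],
        instances_format='jsonl',
        gcs_destination_output_uri_prefix='gs://my-bucket/predictions',
        predictions_format='jsonl',
        # New in GCPC 2.5.0: run the batch prediction job as this account.
        service_account='runner@my-project.iam.gserviceaccount.com',
    )
```
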
5 changes: 5 additions & 0 deletions components/google-cloud/docs/source/versions.json
@@ -1,4 +1,9 @@
[
{
"version": "https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.5.0",
"title": "2.5.0",
"aliases": []
},
{
"version": "https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.4.1",
"title": "2.4.1",
Expand Down
@@ -12,4 +12,18 @@
# See the License for the specific language governing permissions and
# limitations under the License.
"""Google Cloud Pipeline Components."""
-from google_cloud_pipeline_components.version import __version__
+import sys
+import warnings
+
+if sys.version_info < (3, 8):
+  warnings.warn(
+      (
+          'Python 3.7 has reached end-of-life. Google Cloud Pipeline Components'
+          ' will drop support for Python 3.7 on April 23, 2024. To use new'
+          ' versions of the KFP SDK after that date, you will need to upgrade'
+          ' to Python >= 3.8. See https://devguide.python.org/versions/ for'
+          ' more details.'
+      ),
+      FutureWarning,
+      stacklevel=2,
+  )
@@ -16,7 +16,7 @@


def get_private_image_tag() -> str:
-  return os.getenv('PRIVATE_IMAGE_TAG', '20230918_1327_RC00')
+  return os.getenv('PRIVATE_IMAGE_TAG', '20231010_1107_RC00')


def get_use_test_machine_spec() -> bool:
@@ -268,6 +268,15 @@ def resolve_reference_model_metadata(
        reward_model_path='gs://vertex-rlhf-restricted/pretrained_models/palm/t5x_otter_pretrain/',
        is_supported=True,
    ),
+    'chat-bison@001': reference_model_metadata(
+        large_model_reference='BISON',
+        reference_model_path=(
+            'gs://vertex-rlhf-restricted/pretrained_models/palm/t5x_bison/'
+        ),
+        reward_model_reference='OTTER',
+        reward_model_path='gs://vertex-rlhf-restricted/pretrained_models/palm/t5x_otter_pretrain/',
+        is_supported=True,
+    ),
    'elephant': reference_model_metadata(
        large_model_reference='ELEPHANT',
        reference_model_path=(
@@ -356,9 +365,14 @@ def generate_default_instruction(
  task = task.lower()
  if task == 'summarization':
    return f'Summarize in less than {target_sequence_length} words.'
+
+  elif task == 'question_answer':
+    return f'Answer the question in less than {target_sequence_length} words.'
+
  else:
    raise ValueError(
-        f'Task not recognized: {task}. Supported tasks are: summarization.'
+        f'Task not recognized: {task}. Supported tasks are: "summarization",'
+        ' "question_answer".'
    )


@@ -456,3 +470,22 @@ def resolve_upload_model(large_model_reference: str) -> bool:
  if large_model_reference in supported_models:
    return True
  return False


@dsl.component(base_image=_image.GCPC_IMAGE_TAG, install_kfp_package=False)
def resolve_instruction(
    large_model_reference: str, instruction: Optional[str] = None
) -> str:
  """Resolves the instruction to use for a given reference model.

  Args:
    large_model_reference: Base model tuned by the pipeline.
    instruction: Instruction provided at runtime.

  Returns:
    Instruction to use during tokenization based on model type. Returns an
    empty string for chat models because the instruction is prepended as the
    default context. Otherwise the original instruction is returned.
  """
  instruction = instruction or ''
  return instruction if 'chat' not in large_model_reference.lower() else ''
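
Presumably the new component is wired into the tuning pipelines roughly as follows (a hedged fragment reusing `resolve_instruction` from above):

```python
# Inside a pipeline definition (illustrative only):
instruction_task = resolve_instruction(
    large_model_reference='chat-bison@001',
    instruction='Answer politely.',
)
# For chat models the task resolves to '', because the instruction is
# instead prepended as default context during dataset preprocessing.
```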