components documentation
-
action_analyzer_correlation_test
Perform correlation test on different groups to generate actions.
-
action_analyzer_identify_problem_traffic
Separate bad queries into different groups.
-
action_analyzer_metrics_calculation
Calculate further metrics for generating actions.
-
action_analyzer_output_actions
Merge and output actions.
-
Pipeline component for proxy fine-tuning with AOAI
-
Upload data to Azure OpenAI resource, finetune model and delete data
-
Component that kicks off an AutoML job to train a classification model within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
Component that kicks off an AutoML job to train a forecasting model within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
Inference component for AutoML Forecasting.
-
automl_hts_automl_training_step
-
automl_hts_data_aggregation_step
-
Enables inference for hts components.
-
automl_hts_inference_collect_step
-
automl_hts_inference_setup_step
-
Enables AutoML Training for hts components.
-
automl_hts_training_collect_step
-
automl_hts_training_setup_step
-
Component that kicks off an AutoML job to train an image classification model within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
automl_image_classification_multilabel
Component that kicks off an AutoML job to train a multilabel image classification model within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
automl_image_instance_segmentation
Component that kicks off an AutoML job to train an image instance segmentation model within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
Component that kicks off an AutoML job to train an image object detection model within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
Inference components for AutoML many model.
-
automl_many_models_inference_collect_step
-
automl_many_models_inference_setup_step
-
automl_many_models_inference_step
-
Enables AutoML many models training.
-
automl_many_models_training_collection_step
-
automl_many_models_training_setup_step
-
automl_many_models_training_step
-
Component that kicks off an AutoML job to train a regression model within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
automl_tabular_data_partitioning
Enables dataset partitioning for AutoML many models and hierarchical timeseries solution accelerators using spark.
-
Component that kicks off an AutoML job to train an NLP text classification model within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
automl_text_classification_multilabel
Component that kicks off an AutoML job to train an NLP multilabel text classification model within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
Component that kicks off an AutoML job to train an NLP NER (Named Entity Recognition) model within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
batch_benchmark_config_generator
Generates the config for the batch score component.
-
Components for batch endpoint inference
-
batch_benchmark_inference_claude
Components for batch endpoint inference
-
batch_benchmark_inference_with_inference_compute
Components for batch endpoint inference with inference compute support.
-
Batch deploy a model to a workspace. The component works on compute with MSI attached.
-
Prepare the jsonl file and endpoint for batch inference component.
-
Output Formatter for batch inference output
-
Resource Manager for batch inference.
-
Component for benchmarking an embedding model via MTEB.
-
Aggregate quality metrics, performance metrics and all of the metadata from the pipeline. Also add them to the root run.
-
chat_completion_datapreprocess
Component to preprocess data for chat completion task. See docs to learn more.
-
Component to finetune Hugging Face pretrained models for chat completion task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
Component to import PyTorch / MLFlow model. See docs to learn more.
-
Pipeline Component to finetune Hugging Face pretrained models for chat completion task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
Calculate model performance metrics, given ground truth and prediction data.
-
Performs performance metric post processing using data from a model inference run.
-
Component converts models from supported frameworks to MLflow model packaging format
-
Delete data file from Azure OpenAI resource
-
Compute data drift metrics given a baseline and a deployment's model data input.
-
Computes the data drift between a baseline and production data assets.
-
Compute data quality metrics leveraged by the data quality monitor.
-
Compute data statistics leveraged by the data quality monitor.
-
Join baseline and target data quality metrics into a single output.
-
Computes the data quality of a target dataset with reference to a baseline.
-
Component to upload user's data from AzureML workspace to Azure OpenAI resource
-
Downloads the dataset onto blob store.
-
Dataset Preprocessor
-
Samples a dataset containing JSONL file(s).
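The listing does not say how the sampling is done. As a minimal sketch, assuming uniform random sampling without replacement and a fixed seed for reproducibility, sampling JSONL records could look like the following. The function name `sample_jsonl` and its parameters are hypothetical and are not taken from the component.

```python
import json
import random


def sample_jsonl(lines, n, seed=0):
    """Return up to n records drawn at random from an iterable of JSONL lines.

    Assumption: uniform sampling without replacement; the real component
    may support other strategies.
    """
    # Parse every non-blank line as one JSON record.
    records = [json.loads(line) for line in lines if line.strip()]
    if n >= len(records):
        return records
    # A seeded generator keeps the sample reproducible across runs.
    return random.Random(seed).sample(records, n)
```

Passing the same `seed` twice returns the same sample, which makes downstream evaluation runs comparable.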
-
Deletes an endpoint resource.
-
Deploy a model to a workspace. The component works on compute with MSI attached.
-
diffusers_text_to_image_dreambooth_pipeline
Pipeline component for text to image dreambooth training using diffusers library and transformers models.
-
diffusers_text_to_image_finetune
Component to finetune stable diffusion models using diffusers for text to image.
-
diffusers_text_to_image_model_import
Import PyTorch / MLflow model
-
Downloads a publicly available model
-
Evaluate MLFlow models for supported task types.
-
Component that exports data from a uri_file data asset to a database within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
feature_attribution_drift_compute_metrics
Feature attribution drift using model monitoring.
-
feature_attribution_drift_signal_monitor
Computes the feature attribution between a baseline and production data assets.
-
Feature importance for model monitoring.
-
Retrieval component to be used to retrieve offline features from feature store.
-
Component to validate the finetune job against Validation Service
-
Component to submit FT job to Azure OpenAI resource
-
Component to validate the finetune job against Validation Service
-
Component to convert the finetune job output to pytorch and mlflow model
-
Filters the raw span log based on the window provided, and aggregates it to trace level.
-
genai_token_statistics_compute_metrics
Compute token statistics metrics.
-
genai_token_statistics_signal_monitor
Computes the token and cost metrics over LLM outputs.
-
generation_safety_quality_signal_monitor
Computes the content generation safety metrics over LLM outputs.
-
gsq_annotation_compute_histogram
Compute annotation histogram given a deployment's model data input.
-
gsq_annotation_compute_metrics
Compute annotation metrics given a deployment's model data input.
-
Adapt data to fit into GSQ component.
-
Command Component that takes in a string input message and prints it out.
-
Pipeline Component that takes in a string input message and passes it to the Hello World Command Component to be printed out.
-
Pipeline component for image classification.
-
Framework selector control flow component for image tasks
-
image_instance_segmentation_pipeline
Pipeline component for image instance segmentation.
-
Model output selector control flow component for image tasks
-
image_object_detection_pipeline
Pipeline component for image object detection.
-
Component that imports data from a database as an mltable data asset within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
Component that imports data from an external file_system as a uri_folder data asset within an Azure Machine Learning pipeline. For more details, you can look at the component documentation here (Preview).
-
Import a model into a workspace or a registry
-
Inference Postprocessor
-
llm_dbcopilot_create_promptflow
-
llm_dbcopilot_grounding_ground_samples
-
llm_ingest_dataset_to_acs_basic
Single job pipeline to chunk data from AzureML data asset, and create ACS embeddings index
-
llm_ingest_dataset_to_acs_user_id
Single job pipeline to chunk data from AzureML data asset, and create ACS embeddings index
-
llm_ingest_dataset_to_faiss_basic
Single job pipeline to chunk data from AzureML data asset, and create FAISS embeddings index
-
llm_ingest_dataset_to_faiss_user_id
Single job pipeline to chunk data from AzureML data asset, and create FAISS embeddings index
-
Single job pipeline to chunk data from AzureML sql data store, and create ACS embeddings index
-
Single job pipeline to chunk data from AzureML sql data store, and create FAISS embeddings index
-
Single job pipeline to chunk data from AzureML DB Datastore and create acs embeddings index
-
llm_ingest_dbcopilot_faiss_e2e
Single job pipeline to chunk data from AzureML DB Datastore and create faiss embeddings index
-
Creates chunks no larger than chunk_size from input_data; extracted document titles are prepended to each chunk.
LLMs have token limits for the prompts passed to them. This is a limiting factor at embedding time and even more limiting at prompt completion time as only so much context ca...
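The chunking idea described here can be illustrated with a small sketch. It assumes that chunk_size counts whitespace-separated words (the component most likely counts tokens) and that the limit applies to the body before the title is prepended. The function name `chunk_document` and the `Title:` prefix format are hypothetical.

```python
def chunk_document(title, text, chunk_size):
    """Split text into chunks of at most chunk_size words, each prefixed by the title.

    Assumption: words stand in for tokens, and the title is not counted
    against chunk_size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    chunks = []
    for start in range(0, len(words), chunk_size):
        body = " ".join(words[start:start + chunk_size])
        # Prepending the title keeps document context attached to every chunk.
        chunks.append(f"Title: {title}\n\n{body}")
    return chunks
```

Keeping the title on every chunk means a retrieved chunk still identifies its source document, at the cost of a few extra tokens per chunk.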
-
llm_rag_crack_and_chunk_and_embed
Creates chunks no larger than chunk_size from input_data; extracted document titles are prepended to each chunk.
LLMs have token limits for the prompts passed to them. This is a limiting factor at embedding time and even more limiting at prompt completion time as only so much context ca...
-
llm_rag_crack_chunk_embed_index_and_register
Creates chunks no larger than chunk_size from input_data; extracted document titles are prepended to each chunk.
LLMs have token limits for the prompts passed to them. This is a limiting factor at embedding time and even more limiting at prompt completion time as only so much contex...
-
Crawls the given URL and nested links to max_crawl_depth. Data is stored to output_path.
-
Creates a FAISS index from embeddings. The index will be saved to the output folder. The index will be registered as a Data Asset named asset_name if register_output is set to True.
-
This component is used to create a RAG flow based on your mlindex data and best prompts. The flow will look into your indexed data and give answers based on your own data context. The flow also provides the capability to bulk test with any built-in or custom evaluation flows.
-
Collects documents from Azure Cognitive Search Index, extracts their contents, saves them to a uri folder, and creates an MLIndex yaml file to represent the search index.
Documents collected can then be used in other components without having to query the ACS index again, allowing for a consiste...
-
Generates embedding vectors for data chunks read from chunks_source.
chunks_source is expected to contain csv files containing two columns:
- "Chunk" - Chunk of text to be embedded
- "Metadata" - JSON object containing metadata for the chunk
If embeddings_container is supplied, input c...
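The two-column layout described above can be read and written with the Python standard library. This is only a sketch of the file format as stated in the description; the helper names `write_chunks` and `read_chunks` are hypothetical and do not come from the component.

```python
import csv
import io
import json


def write_chunks(chunks):
    """Serialize (text, metadata) pairs to CSV text with Chunk and Metadata columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["Chunk", "Metadata"])
    writer.writeheader()
    for text, metadata in chunks:
        # Metadata is stored as a JSON object encoded into a single CSV field.
        writer.writerow({"Chunk": text, "Metadata": json.dumps(metadata)})
    return buffer.getvalue()


def read_chunks(csv_text):
    """Parse CSV text back into (text, metadata) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["Chunk"], json.loads(row["Metadata"])) for row in reader]
```

Letting the csv module handle quoting matters here, because JSON metadata contains commas and double quotes that would break a hand-rolled split.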
-
llm_rag_generate_embeddings_parallel
Generates embedding vectors for data chunks read from chunks_source.
chunks_source is expected to contain csv files containing two columns:
- "Chunk" - Chunk of text to be embedded
- "Metadata" - JSON object containing metadata for the chunk
If previous_embeddings is supplied, input ch...
-
Clones a git repository to output_data path
-
Embeds input images and stores them in an Azure Cognitive Search index with metadata using Florence embedding resource. MLIndex is stored to output_path.
-
Generates a test dataset of questions and answers based on the input documents.
A chunk of text is read from each input document and sent to the specified LLM with a prompt to create a question and answer based on that text. These question, answer, and context sets are saved as either a csv or j...
-
llm_rag_register_mlindex_asset
Registers a MLIndex yaml and supporting files as an AzureML data asset
-
llm_rag_register_qa_data_asset
Registers a QA data csv or json and supporting files as an AzureML data asset
-
Uploads embeddings into the Azure Cognitive Search instance specified in acs_config. The Index will be created if it doesn't exist.
The Index will have the following fields populated:
- "id", String, key=True
- "content", String
- "contentVector", Collection(Single)
- "category", String
- "url",...
-
llm_rag_update_cosmos_mongo_vcore_index
Uploads embeddings into the Azure Cosmos Mongo vCore collection/index specified in azure_cosmos_mongo_vcore_config. The collection/index will be created if it doesn't exist.
The collection/index will have the following fields populated:
- "_id", String, key=True
- "content", String
- "contentVec...
-
Uploads embeddings into the Milvus collection/index specified in milvus_config. The collection/index will be created if it doesn't exist.
The collection/index will have the following fields populated:
- "id", String, key=True
- "content", String
- "contentVector", Collection(Single)
- "url", Str...
-
Uploads embeddings into the Pinecone index specified in pinecone_config. The Index will be created if it doesn't exist.
Each record in the Index will have the following metadata populated:
- "id", String
- "content", String
- "url", String
- "filepath", String
- "title", String
- "metadata_json_...
-
Validates that the completion model, embedding model, and Azure Cognitive Search resource deployments are successful and that the connections work. For default AOAI, it attempts to create the deployments if not valid or present. This validation is done only if customer is using Azure Open AI models or creatin...
-
microsoft_azureml_rai_tabular_causal
Add Causal to RAI Insights Dashboard Learn More
-
microsoft_azureml_rai_tabular_counterfactual
Add Counterfactuals to RAI Insights Dashboard Learn More
-
microsoft_azureml_rai_tabular_erroranalysis
Add Error Analysis to RAI Insights Dashboard Learn More
-
microsoft_azureml_rai_tabular_explanation
Add Explanation to RAI Insights Dashboard Learn More
-
microsoft_azureml_rai_tabular_insight_constructor
RAI Insights Dashboard Constructor Learn More
-
microsoft_azureml_rai_tabular_insight_gather
Gather RAI Insights Dashboard Learn More
-
microsoft_azureml_rai_tabular_score_card
Generate rai insight score card Learn More
-
Validates if a MLFLow model can be loaded on a compute and is usable for inferencing.
-
mmdetection_image_objectdetection_instancesegmentation_finetune
Component to finetune MMDetection models for image object detection and instance segmentation.
-
mmdetection_image_objectdetection_instancesegmentation_model_import
Import PyTorch / MLflow model
-
mmdetection_image_objectdetection_instancesegmentation_pipeline
Pipeline component for image object detection and instance segmentation using MMDetection models.
-
mmtracking_video_multi_object_tracking_finetune
Component to finetune MMTracking models for video multi-object tracking task.
-
mmtracking_video_multi_object_tracking_model_import
Import PyTorch / MLflow model
-
mmtracking_video_multi_object_tracking_pipeline
Pipeline component for multi-object tracking using MMTracking models.
-
model_data_collector_preprocessor
Filters the data based on the window provided.
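As a rough illustration of window filtering, the sketch below keeps rows whose timestamp falls inside a half-open interval. The half-open convention, the ISO 8601 timestamp strings, the `timestamp` key, and the function name `filter_window` are all assumptions made for this example.

```python
from datetime import datetime


def filter_window(rows, start, end, time_key="timestamp"):
    """Keep rows whose ISO 8601 timestamp t satisfies start <= t < end.

    Assumption: the window is half-open, so a row stamped exactly at
    `end` belongs to the next window and is not counted twice.
    """
    return [
        row for row in rows
        if start <= datetime.fromisoformat(row[time_key]) < end
    ]
```

A half-open window lets consecutive monitoring runs tile the timeline without overlapping or dropping rows at the boundaries.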
-
Pipeline component for model evaluation for supported tasks. Generates predictions on a given model, followed by computing model performance metrics to score the model quality for supported tasks.
-
Generate and output actions to the default datastore.
-
Generate and output actions
-
model_monitor_azmon_metric_publisher
Azure Monitor Publisher for the computed model monitor metrics.
-
model_monitor_compute_histogram
Compute a histogram given an input data and associated histogram buckets.
-
model_monitor_compute_histogram_buckets
Compute histogram buckets given up to two datasets.
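To make the two histogram components concrete, here is a minimal sketch that derives shared bucket edges from up to two datasets and then counts values per bucket. Equal-width buckets over the combined range are an assumption, and the names `bucket_edges`, `histogram`, and `num_buckets` are hypothetical; the components may use a different bucketing scheme.

```python
from bisect import bisect_right


def bucket_edges(baseline, target=None, num_buckets=10):
    """Return num_buckets + 1 equal-width edges spanning both datasets."""
    values = list(baseline) + list(target or [])
    if not values:
        raise ValueError("at least one value is required")
    low, high = min(values), max(values)
    if low == high:
        high = low + 1.0  # avoid zero-width buckets for constant data
    width = (high - low) / num_buckets
    return [low + i * width for i in range(num_buckets + 1)]


def histogram(values, edges):
    """Count values per bucket; the last bucket includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        if value < edges[0] or value > edges[-1]:
            continue  # outside the bucketed range
        index = min(bisect_right(edges, value) - 1, len(counts) - 1)
        counts[index] += 1
    return counts
```

Sharing one set of edges between the baseline and target data is what makes the two histograms directly comparable bucket by bucket.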
-
Creates the model monitor metric manifest.
-
Joins two data assets on the given columns for model monitor.
-
model_monitor_evaluate_metrics_threshold
Evaluate signal metrics against the threshold provided in the monitoring signal.
-
model_monitor_feature_selector
Selects features to compute signal metrics on.
-
model_monitor_metric_outputter
Output the computed model monitor metrics.
-
Output the computed model monitor metrics to the default datastore.
-
model_performance_compute_metrics
Compute model performance metrics leveraged by the model performance monitor.
-
model_performance_signal_monitor
Computes the model performance
-
Generate predictions on a given mlflow model for supported tasks.
-
model_prediction_with_container
Optimized Distributed inference component for LLMs.
-
multimodal_classification_datapreprocessing
Component to preprocess data for multimodal classification task
-
multimodal_classification_finetune
Component to finetune multimodal models for classification using MMEFT
-
multimodal_classification_model_import
Import PyTorch / MLflow model
-
multimodal_classification_pipeline
Pipeline component for multimodal classification models.
-
nlp_multiclass_datapreprocessing
Component to preprocess data for automl nlp multiclass classification task
-
nlp_multilabel_datapreprocessing
Component to preprocess data for automl nlp multilabel classification task
-
Component to preprocess data for automl nlp ner task
-
nlp_textclassification_multiclass
Pipeline component for AutoML NLP Multiclass Text classification
-
nlp_textclassification_multilabel
Pipeline component for AutoML NLP Multilabel Text classification
-
Pipeline component for AutoML NLP NER
-
Finetune your own OAI model. Visit https://learn.microsoft.com/en-us/azure/cognitive-services/openai/ for more info.
-
openai_completions_finetune_pipeline
Finetune your own OAI model. Visit https://learn.microsoft.com/en-us/azure/cognitive-services/openai/ for more info.
-
FTaaS component to finetune model for Chat Completion task
-
FTaaS Pipeline component for chat completion
-
oss_distillation_batchscoring_datagen_pipeline
Component to generate data from teacher model endpoint by invoking it in batch.
-
oss_distillation_data_generation_batch_scoring_selector
Component to select the Batch Scoring Selector based on the task type
-
oss_distillation_data_generation_file_selector
Component to select the Batch Scoring Selector based on the task type
-
oss_distillation_data_generation_validation_file_checker
Component to check whether the validation file is present
-
oss_distillation_generate_data
Component to generate data from teacher model endpoint
-
oss_distillation_generate_data_batch_postprocess
Component to prepare data returned from teacher model endpoint in batch
-
oss_distillation_generate_data_batch_preprocess
Component to prepare data to invoke teacher model endpoint in batch
-
Component to generate data from teacher model endpoint and finetune student model on generated dataset
-
oss_distillation_seq_scoring_pipeline
Component to generate data from teacher model endpoint (sequentially) and finetune student model on generated dataset
-
oss_distillation_validate_pipeline
Component to validate inputs to the distillation pipeline
-
oss_text_generation_data_import
FTaaS component to copy user training data to output
-
FTaaS component to finetune model for Text Generation task
-
FTaaS Pipeline component for text generation
-
prediction_drift_signal_monitor
Computes the prediction drift between baseline and target data assets.
-
This component is used to create prompts from a given dataset. From a given jinja prompt template, it will generate prompts. It can also create few-shot prompts given a few-shot dataset and the number of shots.
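The few-shot construction described above can be sketched as follows. To stay dependency-free, this example swaps in Python's `string.Template` for Jinja, so the placeholder syntax is `$question` rather than `{{ question }}`. The function name `build_prompts`, the `answer` field, and the seeded random choice of shots are assumptions, not the component's actual behavior.

```python
import random
from string import Template


def build_prompts(rows, template, few_shot_rows=(), n_shots=0, seed=0):
    """Render one prompt per row, optionally prefixed by n_shots solved examples.

    Assumption: each few-shot row carries an "answer" field that is
    appended after its rendered template.
    """
    rng = random.Random(seed)
    tmpl = Template(template)
    pool = list(few_shot_rows)
    prompts = []
    for row in rows:
        # Draw the requested number of examples, capped by the pool size.
        shots = rng.sample(pool, min(n_shots, len(pool)))
        prefix = "".join(
            f"{tmpl.substitute(shot)} {shot['answer']}\n\n" for shot in shots
        )
        prompts.append(prefix + tmpl.substitute(row))
    return prompts
```

For example, with the template `"Q: $question\nA:"`, one few-shot row, and `n_shots=1`, each prompt starts with the solved example followed by the unanswered question.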
-
question_answering_datapreprocess
Component to preprocess data for question answering task. See docs to learn more.
-
Component to finetune Hugging Face pretrained models for extractive question answering task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
question_answering_model_import
Component to import PyTorch / MLFlow model. See docs to learn more.
-
Pipeline Component to finetune Hugging Face pretrained models for extractive question answering task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
Register a model to a workspace or a registry. The component works on compute with MSI attached.
-
Component to preprocess data for summarization task. See docs to learn more.
-
Component to finetune Hugging Face pretrained models for summarization task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
Component to import PyTorch / MLFlow model. See docs to learn more.
-
Pipeline Component to finetune Hugging Face pretrained models for summarization task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
text_classification_datapreprocess
Component to preprocess data for single label classification task. See docs to learn more.
-
Component to finetune Hugging Face pretrained models for text classification task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
text_classification_model_import
Component to import PyTorch / MLFlow model. See docs to learn more.
-
Pipeline component to finetune Hugging Face pretrained models for text classification task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
text_generation_datapreprocess
Component to preprocess data for text generation task
-
Component to finetune model for Text Generation task
-
Import PyTorch / MLFlow model
-
Pipeline component for text generation
-
text_generation_pipeline_singularity_basic_high
Pipeline component for text generation
-
text_generation_pipeline_singularity_basic_low
Pipeline component for text generation
-
text_generation_pipeline_singularity_basic_medium
Pipeline component for text generation
-
text_generation_pipeline_singularity_premium_high
Pipeline component for text generation
-
text_generation_pipeline_singularity_premium_low
Pipeline component for text generation
-
text_generation_pipeline_singularity_premium_medium
Pipeline component for text generation
-
text_generation_pipeline_singularity_standard_high
Pipeline component for text generation
-
text_generation_pipeline_singularity_standard_low
Pipeline component for text generation
-
text_generation_pipeline_singularity_standard_medium
Pipeline component for text generation
-
token_classification_datapreprocess
Component to preprocess data for token classification task. See docs to learn more.
-
Component to finetune Hugging Face pretrained models for token classification task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
token_classification_model_import
Component to import PyTorch / MLFlow model. See docs to learn more.
-
Pipeline component to finetune Hugging Face pretrained models for token classification task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
token_statistics_compute_metrics
Compute token statistics metrics.
-
train_image_classification_model
Component to finetune AutoML legacy models for image classification.
-
train_instance_segmentation_model
Component to finetune AutoML legacy models for instance segmentation.
-
Component to finetune AutoML legacy models for object detection.
-
transformers_image_classification_finetune
Component to finetune HuggingFace transformers models for image classification.
-
transformers_image_classification_model_import
Import PyTorch / MLflow model
-
transformers_image_classification_pipeline
Pipeline component for image classification using HuggingFace transformers models.
-
Component to preprocess data for translation task. See docs to learn more.
-
Component to finetune Hugging Face pretrained models for translation task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
Component to import PyTorch / MLFlow model. See docs to learn more.
-
Pipeline component to finetune Hugging Face pretrained models for translation task. The component supports optimizations such as LoRA, Deepspeed and ONNXRuntime for performance enhancement. See docs to learn more.
-
Component for enabling validation of import pipeline.
-
validation_trigger_model_evaluation
Component for enabling validation of model evaluation pipeline.