This README provides instructions on how to set up and execute the pipeline for E&I's Vertex AI, along with configuring and deploying REST API Cloud Functions.
![vertex-ai-image](https://private-user-images.githubusercontent.com/62001944/307623517-2417c1a9-b413-4018-93c8-cfc651106f7e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk2NzEyOTUsIm5iZiI6MTczOTY3MDk5NSwicGF0aCI6Ii82MjAwMTk0NC8zMDc2MjM1MTctMjQxN2MxYTktYjQxMy00MDE4LTkzYzgtY2ZjNjUxMTA2ZjdlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE2VDAxNTYzNVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWE0MzZhYWM4OWZkZjExZTcyMDg2ODQ3ZDUwM2RiNTFlMWI1NTEyOGM1ZjgxZTFjMDk4ZThiMzExZjQ3YTU4NmEmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.If_FnIknbBrRo2se_u0z-DryyMXBWNn563EHZeA0LHs)
This figure outlines how a PaLM 2 large language model (LLM) is tuned and deployed with Google's Vertex AI. The process starts with training data in JSONL format, tunes the model on Vertex AI, and ends with the tuned model integrated into a real-world software application.
- JSONL Training Data: This represents the training data. The JSONL format is a text file where each line is a JSON object. Each object contains an input_text and output_text field, providing data for the model to train on.
- Vertex AI: Part of the Google Cloud Platform, this is a service for building, deploying, and managing machine learning models at scale. Here it is used to train and optimize the model on the training data you provide.
- PaLM 2 LLM: This refers to Google's large language model, which is used as the base model for "adapter-based fine-tuning". Adapter-based fine-tuning is a technique for fine-tuning a model for a specific task by inserting small additional networks into an existing model.
- Software Component: This represents the software component where the AI model built using Vertex AI and PaLM 2 LLM will actually be implemented. This component can be an AI application, service, client, or subsystem of a system.
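As a rough illustration of the JSONL training data described above, the following sketch builds a file body where each line is one JSON object with `input_text` and `output_text` fields. The helper name `to_jsonl` and the two sample records are hypothetical and are not taken from this repository's dataset.

```python
import json

# Field names come from the description above; everything else is illustrative.
REQUIRED_FIELDS = ("input_text", "output_text")


def to_jsonl(examples):
    """Serialize training examples to JSONL, one JSON object per line."""
    lines = []
    for index, example in enumerate(examples):
        # Reject records whose required fields are absent, non-string, or blank.
        missing = [
            field
            for field in REQUIRED_FIELDS
            if not isinstance(example.get(field), str) or not example[field].strip()
        ]
        if missing:
            raise ValueError(f"example {index} has missing or empty fields: {missing}")
        record = {field: example[field] for field in REQUIRED_FIELDS}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


# Hypothetical sample records, for illustration only.
examples = [
    {"input_text": "I took the bus instead of driving today.", "output_text": "positive"},
    {"input_text": "I left the lights on all night.", "output_text": "negative"},
]

jsonl_text = to_jsonl(examples)
print(jsonl_text, end="")
```

Validating each record before writing the file makes malformed examples fail locally rather than after the data has been handed to a tuning job.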
![vertex-ai-text](https://private-user-images.githubusercontent.com/62001944/307623516-c86b427a-a569-49aa-baf5-91f3b9f5188e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk2NzEyOTUsIm5iZiI6MTczOTY3MDk5NSwicGF0aCI6Ii82MjAwMTk0NC8zMDc2MjM1MTYtYzg2YjQyN2EtYTU2OS00OWFhLWJhZjUtOTFmM2I5ZjUxODhlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE2VDAxNTYzNVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY4ZWRkNmVlMWNjY2U3YzQ4NTZmYjgzN2VhNmMxOGY1N2MxZjc4ZTI2MGU1NDdjMTAyYzVjNzcwNWVlODFhNjgmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.mylwJ8tyCI1Ij88rpKwgTNb_u-xLfktY1PhSSbMb-PU)
The figure shows a pipeline for fine-tuning Google's Pathways Language Model 2 (PaLM 2) using Vertex AI on Google Cloud.
- validate_pipeline: This step validates the pipeline to ensure that all required components are set up correctly.
- tuning_graph: This represents the process of tuning the model to a specific task or dataset. This process typically involves tuning hyperparameters to optimize the performance of the model.
- export_managed_dataset: This step exports the tuning dataset as a managed dataset. This dataset is used for training or evaluation.
- dataset_encoder: Encodes the dataset and converts it into a format that the model can understand.
- evaluation-dataset-encoder: This is the process of encoding an evaluation dataset, which is used to evaluate how well the model performs.
- vertex-pipelines-prompt: Sets the pipeline prompts for Vertex AI. This can be a step to specify components or parameters for pipeline execution.
- compose-params-for-model: This is the step to configure parameters for the model, determining what settings or hyperparameters the model needs before it can start training.
- large_language_model_tuning: This is the actual tuning of the large language model. In this step, the model is trained for a specific task.
- tensorboard-uploader: Uploads the data generated during training to TensorBoard so the training process can be visualized and monitored.
- deployment_graph: A pipeline for deploying the model. In this step, models are deployed to endpoints that users can access.
- Upload-LLM-Model: The process of uploading the tuned large language model to Vertex AI.
- create-endpoint-and-deploy: Creates an endpoint where the model is available and deploys the model.
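The steps above can be read as a validation step followed by two sub-graphs. The sketch below writes that outline down as plain data and flattens it back into the listed order. The grouping of steps under `tuning_graph` and `deployment_graph` is an assumption inferred from the order of the list, not something confirmed by the pipeline definition.

```python
# Outline of the pipeline steps listed above.
# Assumption: the steps between tuning_graph and deployment_graph belong to
# tuning_graph, and the last two steps belong to deployment_graph.
PIPELINE_OUTLINE = [
    ("validate_pipeline", []),
    (
        "tuning_graph",
        [
            "export_managed_dataset",
            "dataset_encoder",
            "evaluation-dataset-encoder",
            "vertex-pipelines-prompt",
            "compose-params-for-model",
            "large_language_model_tuning",
            "tensorboard-uploader",
        ],
    ),
    ("deployment_graph", ["Upload-LLM-Model", "create-endpoint-and-deploy"]),
]


def flatten(outline):
    """Return every stage and step name in the order they are listed."""
    order = []
    for stage, steps in outline:
        order.append(stage)
        order.extend(steps)
    return order


for stage, steps in PIPELINE_OUTLINE:
    print(stage)
    for step in steps:
        print("  " + step)
```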
Before running this function, make sure you have the following:
- Node.js 18 installed
- Firebase CLI installed
- Firebase project set up
- Firestore set up
  - Create the collections `users`, `follows`, and `notification_logs`
  - Change the security rules (policy) to the following:
service cloud.firestore {
match /databases/{database}/documents {
match /users/{userId} {
allow read: if true;
allow write: if request.auth != null && request.auth.uid == userId;
}
match /follows/{userId} {
allow read: if true;
allow write: if request.auth != null && request.auth.uid == userId;
}
match /notification_logs/{document=**} {
allow read, write: if false;
}
}
}
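To make the intent of these rules easier to check, here is a minimal Python sketch that mirrors their access decisions for client requests. It is only a model of the rules for reasoning and testing, not part of Firebase; the function name `is_allowed` and its parameters are hypothetical.

```python
from typing import Optional


def is_allowed(collection: str, doc_id: str, operation: str, auth_uid: Optional[str]) -> bool:
    """Approximate the client access decisions of the security rules above.

    operation is "read" or "write"; auth_uid is None for unauthenticated requests.
    """
    if collection in ("users", "follows"):
        if operation == "read":
            # allow read: if true
            return True
        if operation == "write":
            # allow write: if request.auth != null && request.auth.uid == userId
            return auth_uid is not None and auth_uid == doc_id
        return False
    # notification_logs is closed to clients, and unmatched paths are denied by default.
    return False


print(is_allowed("users", "alice", "read", None))
print(is_allowed("users", "alice", "write", "bob"))
print(is_allowed("notification_logs", "log-1", "read", "alice"))
```

Note that server-side code using the Firebase Admin SDK bypasses security rules, so Cloud Functions can still write to `notification_logs` even though clients cannot.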
- Clone this repository
git clone https://github.com/GDSC-DGU/2024-SolutionChallenge-EarthAndI.git
- Install the npm dependencies
# Move to the `trigger_and_notification_api` directory under `server`.
cd ./2024-SolutionChallenge-EarthAndI/server/trigger_and_notification_api
# Move to the `functions` subdirectory.
cd functions
# Install all of the dependencies of the cloud functions.
npm install
# Move back to the parent directory.
cd ../
- Select the Firebase project you have created.
firebase use --add
- Start the emulators
firebase emulators:start --only functions
- Deploy the project
firebase deploy