Added the Application:codegen based on the LLM-on-Ray Service (#191)
* Added the Application:codegen based on the LLM-on-Ray Service
* Refine imports and remove some unnecessary code
* Merge main and add the application path in the CI config
* Remove duplicate file
* [Habana] update the Habana docker image (#196)
* [Inference] integrate deepseek-coder-33b-instruct (#190): support the new model, update the Action config, and use lowercase letters for device names
* [TEST] Add better query test (#109): add query_http and OpenAI test cases, move tests to GitHub CI with Docker, improve error detection, fix IPEX and multi-serve issues, and organize the code
* Add a readme for the application Gluten UDF converter AI chatbot
* Refine the text2sql application (embedding mode, FastAPI)
* Separate front-end and back-end code for the application Gluten_Coder_Chatbot_V2
* Renamed folder for the application Gluten_Coder_Chatbot_V2
* Format files, add license headers, and fix the proxy for CI

---------

Signed-off-by: Yao Qing <[email protected]>
Co-authored-by: harborn <[email protected]>
Co-authored-by: yutianchen <[email protected]>
Co-authored-by: tianyil1 <[email protected]>
1 parent a61d89f · commit db68b65
Showing 16 changed files with 1,091 additions and 76 deletions.
@@ -7,7 +7,6 @@ on:
  paths:
    - '**'
    - '!*.md'

jobs:

  Lint:
@@ -0,0 +1,156 @@
#
# Copyright 2023 The LLM-on-Ray Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import json

import requests
import streamlit as st
from code_editor import code_editor

st.set_page_config(page_title="Gluten_Coder_Chatbot_V2", page_icon="💬")
st.header("Gluten Coder Chatbot")
st.write("Convert code to a Gluten/Velox UDF with the LLM")

code_editor_btns_config = [
    {
        "name": "Copy",
        "feather": "Copy",
        "hasText": True,
        "alwaysOn": True,
        "commands": [
            "copyAll",
            [
                "infoMessage",
                {"text": "Copied to clipboard!", "timeout": 2500, "classToggle": "show"},
            ],
        ],
        "style": {"top": "0rem", "right": "0.4rem"},
    },
    {
        "name": "Run",
        "feather": "Play",
        "primary": True,
        "hasText": True,
        "showWithIcon": True,
        "commands": ["submit"],
        "style": {"bottom": "0.44rem", "right": "0.4rem"},
    },
]

info_bar = {
    "name": "input code",
    "css": "\nbackground-color: #bee1e5;\n\nbody > #root .ace-streamlit-dark~& {\n background-color: #444455;\n}\n\n.ace-streamlit-dark~& span {\n color: #fff;\n opacity: 0.6;\n}\n\nspan {\n color: #000;\n opacity: 0.5;\n}\n\n.code_editor-info.message {\n width: inherit;\n margin-right: 75px;\n order: 2;\n text-align: center;\n opacity: 0;\n transition: opacity 0.7s ease-out;\n}\n\n.code_editor-info.message.show {\n opacity: 0.6;\n}\n\n.ace-streamlit-dark~& .code_editor-info.message.show {\n opacity: 0.5;\n}\n",
    "style": {
        "order": "1",
        "display": "flex",
        "flexDirection": "row",
        "alignItems": "center",
        "width": "100%",
        "height": "2.5rem",
        "padding": "0rem 0.6rem",
        "padding-bottom": "0.2rem",
        "margin-bottom": "-1px",
        "borderRadius": "8px 8px 0px 0px",
        "zIndex": "9993",
    },
    "info": [{"name": "Your code", "style": {"width": "800px"}}],
}


class Basic:
    def __init__(self):
        self.server_url = "http://127.0.0.1:8000"

    def _post_parse_response(self, response):
        # Return the parsed JSON body on success, or None on an HTTP error.
        if response.status_code == 200:
            return json.loads(response.text)
        print("Error Code: ", response.status_code)
        return None

    def main(self):
        step = 1

        response_dict = code_editor(
            "",
            height=(8, 20),
            lang="scala",
            theme="dark",
            shortcuts="vscode",
            focus=False,
            buttons=code_editor_btns_config,
            info=info_bar,
            props={"style": {"borderRadius": "0px 0px 8px 8px"}},
            options={"wrap": True},
        )
        code_to_convert = response_dict["text"]

        if code_to_convert:
            print(code_to_convert)

            with st.chat_message(name="assistant", avatar="🧑‍💻"):
                st.write(f"Step {step}: Convert the code into C++")
                step += 1
            with st.spinner("Converting your code to C++..."):
                data = {"code": code_to_convert}
                response = requests.post(self.server_url + "/v1/convert_to_cpp", json=data)
                json_data = self._post_parse_response(response)
                cpp_code_res = json_data["answer"]
                cpp_code = json_data["cpp_code"]
            with st.chat_message("ai"):
                st.markdown(cpp_code_res)

            with st.chat_message(name="assistant", avatar="🧑‍💻"):
                st.write(f"Step {step}: Analyze the keywords that may need to be queried")
                step += 1
            with st.spinner("Analyzing the code..."):
                data = {"cpp_code": cpp_code}
                response = requests.post(self.server_url + "/v1/generate_keywords", json=data)
                json_data = self._post_parse_response(response)
                keywords = json_data["velox_keywords"]
            with st.chat_message("ai"):
                st.markdown("\n".join(keywords))

            with st.chat_message(name="assistant", avatar="🧑‍💻"):
                st.write(f"Step {step}: Retrieve related knowledge from the Velox documentation")
                step += 1
            with st.spinner("Retrieving references from the Velox documentation and code..."):
                data = {"velox_keywords": keywords}
                response = requests.post(self.server_url + "/v1/retrieve_doc", json=data)
                json_data = self._post_parse_response(response)
                related_docs = json_data["related_docs"]
            with st.chat_message("ai"):
                st.write(related_docs)

            with st.chat_message(name="assistant", avatar="🧑‍💻"):
                st.write(f"Step {step}: Based on the previous analysis, rewrite as a Velox-based UDF")
                step += 1
            with st.spinner("Converting the C++ code to a Velox-based UDF..."):
                data = {
                    "velox_keywords": keywords,
                    "code": code_to_convert,
                    "related_docs": related_docs,
                }
                response = requests.post(self.server_url + "/v1/get_gluten_udf", json=data)
                json_data = self._post_parse_response(response)
                udf_answer = json_data["udf_answer"]
            with st.chat_message("ai"):
                st.markdown(udf_answer)


if __name__ == "__main__":
    obj = Basic()
    obj.main()
@@ -0,0 +1,49 @@
# Gluten UDF converter AI chatbot

[](https://langchain-chatbot.streamlit.app/)

## Introduction

### Gluten
[Gluten](https://github.com/apache/incubator-gluten) is a new middle layer that offloads Spark SQL queries to native engines. Gluten benefits from both the high scalability of the Spark SQL framework and the high performance of native libraries.

The basic rule of Gluten's design is to reuse Spark's whole control flow and as much JVM code as possible, while offloading the compute-intensive data processing to native code. Here is what Gluten does:

- Transform Spark's whole-stage physical plan to a Substrait plan and send it to the native side
- Offload performance-critical data processing to a native library
- Define clear JNI interfaces for native libraries
- Switch between available native backends easily
- Reuse Spark's distributed control flow
- Manage data sharing between the JVM and native code
- Extensible to support more native accelerators
### Ability of this chatbot
The objective of this chatbot application is to assist users by transforming their user-defined functions (UDFs), originally written for Vanilla Spark, into C++ code that adheres to the code standards of Gluten and Velox. A large language model (LLM) automates the conversion, ensuring compatibility and improving performance within these execution frameworks.

The conversion process is streamlined into the following steps:

1. The chatbot identifies and comprehends the logic of the original Spark UDF code, then translates it into an initial C++ draft.
2. Using the preliminary C++ code, the LLM identifies key terms related to Velox's existing function implementations and data types, and outputs them in JSON format.
3. With these keywords, the chatbot retrieves relevant information from the Velox documentation stored in a vector database (FAISS).
4. Drawing on the retrieved documentation, the chatbot generates the final C++ code tailored to the specifications of Velox UDFs.
### Configuration

Currently, we use the LLM [deepseek-coder-33b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct).
Deployment can be done using LLM-on-Ray with the following command:
```
llm_on_ray-serve --config_file llm_on_ray/inference/models/deepseek-coder-33b-instruct.yaml
```

Before launching the Streamlit application, update the config file located at application/pages/codegen/config.py with the necessary configuration details:

```
# Specify the directory where the model 'deepseek-coder-33b-instruct' is stored.
model_base_path = ""
# Provide the path to the FAISS index for the Velox documentation.
vs_path = ""
```
@@ -0,0 +1,15 @@
#
# Copyright 2023 The LLM-on-Ray Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#