From d0404cdeec460919fd6b4641dacaa601a992c1bf Mon Sep 17 00:00:00 2001
From: Kyo Lee
Date: Fri, 15 Mar 2024 21:02:21 +0000
Subject: [PATCH 01/12] [docs-agent] Clean up Docs Agent's README.md files.

---
 examples/gemini/python/docs-agent/README.md   | 85 +++----------------
 .../docs_agent/interfaces/README.md           | 12 ++-
 2 files changed, 18 insertions(+), 79 deletions(-)

diff --git a/examples/gemini/python/docs-agent/README.md b/examples/gemini/python/docs-agent/README.md
index 4b30e9644..8b2feee0f 100644
--- a/examples/gemini/python/docs-agent/README.md
+++ b/examples/gemini/python/docs-agent/README.md
@@ -67,8 +67,8 @@ The following list summarizes the tasks and features supported by Docs Agent:
   to create, populate, update and delete online corpora using the Semantic
   Retrieval AI. For the list of all available Docs Agent command lines, see the
   [Docs Agent CLI reference][cli-reference] page.
-- **Run the Docs Agent CLI from anywhere on a terminal**: You can set up the
-  Docs Agent CLI to ask questions to the Gemini model from anywhere on a terminal.
+- **Run the Docs Agent CLI from anywhere in a terminal**: You can set up the
+  Docs Agent CLI to ask questions to the Gemini model from anywhere in a terminal.
   For more information, see the [Set up Docs Agent CLI][cli-readme] page.
 
 For more information on Docs Agent's architecture and features,
@@ -174,41 +174,32 @@ Authorize Google Cloud credentials on your host machine:
    (`application_default_credentials.json`) in the `$HOME/.config/gcloud/`
    directory of your host machine.
 
-### 4. Clone the Docs Agent project repository
+### 4. Clone the Docs Agent project
 
 **Note**: This guide assumes that you're creating a new project directory
 from your `$HOME` directory.
 
-Clone the Docs Agent repository and install dependencies:
+Clone the Docs Agent project and install dependencies:
 
-1. Clone the following internal repo:
+1. Clone the following repo:
 
    ```posix-terminal
-   git clone sso://doc-llm-internal/docs-agent
+   git clone https://github.com/google/generative-ai-docs.git
    ```
 
-2. Go to the project directory:
+2. Go to the Docs Agent project directory:
 
    ```posix-terminal
-   cd docs-agent
+   cd generative-ai-docs/examples/gemini/python/docs-agent
    ```
 
-3. (**Optional**) If you plan on contributing to the `docs-agent` repo,
-   run the following command to set up your commit hook:
-
-   ```
-   curl -Lo `git rev-parse --git-dir`/hooks/commit-msg https://gerrit-review.googlesource.com/tools/hooks/commit-msg ; chmod +x `git rev-parse --git-dir`/hooks/commit-msg
-   ```
-
-4. Install dependencies using `poetry`:
+3. Install dependencies using `poetry`:
 
    ```posix-terminal
    poetry install
    ```
 
-   This may take some time to complete.
-
-5. Enter the `poetry` shell environment:
+4. Enter the `poetry` shell environment:
 
    ```posix-terminal
    poetry shell
@@ -234,7 +225,7 @@ Update settings in the Docs Agent project to use your custom dataset:
 1. Go to the Docs Agent project home directory, for example:
 
    ```
-   cd $HOME/docs-agent
+   cd $HOME/generative-ai-docs/examples/gemini/python/docs-agent
    ```
 
 2. Open the [`config.yaml`][config-yaml] file using a text editor, for example:
@@ -315,7 +306,7 @@ To populate a new vector database:
 1. Go to the Docs Agent project home directory, for example:
 
    ```
-   cd $HOME/docs-agent
+   cd $HOME/generative-ai-docs/examples/gemini/python/docs-agent
    ```
 
 2. Process Markdown files into small text chunks:
@@ -349,7 +340,7 @@ To start the Docs Agent chat app:
 1. Go to the Docs Agent project home directory, for example:
 
    ```
-   cd $HOME/docs-agent
+   cd $HOME/generative-ai-docs/examples/gemini/python/docs-agent
    ```
 
 2. Launch the Docs Agent chat app:
@@ -391,56 +382,6 @@ To start the Docs Agent chat app:
 
 **The Docs Agent chat app is all set!**
 
-## Contribute to Docs Agent
-
-The section provides instructions on how to set up your account with the Docs Agent
-repository so that you can start contributing to the Docs Agent project.
-
-To set up your account for the Docs Agent repository, do the following:
-
-1. To create an account on Gerrit Code Review, open the following page
-   on a browser:
-
-   ```
-   https://doc-llm-internal-review.git.corp.google.com/
-   ```
-
-1. Click **Create account**.
-
-1. Clone the `docs-agent` repository on your host machine:
-
-   ```
-   git clone sso://doc-llm-internal/docs-agent
-   ```
-
-1. Go to the project directory:
-
-   ```
-   cd docs-agent
-   ```
-
-1. To set up your commit hook, run the following command:
-
-   ```
-   curl -Lo `git rev-parse --git-dir`/hooks/commit-msg https://gerrit-review.googlesource.com/tools/hooks/commit-msg ; chmod +x `git rev-parse --git-dir`/hooks/commit-msg
-   ```
-
-1. Create a new Gerrit change, for example:
-
-   ```
-   git add
-   ```
-
-   ```
-   git commit [--amend]
-   ```
-
-1. Upload the change for review:
-
-   ```
-   git push origin HEAD:refs/for/main
-   ```
-
 ## Contributors
 
 Nick Van der Auwermeulen (`@nickvander`), Rundong Du (`@rundong08`),
diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/README.md b/examples/gemini/python/docs-agent/docs_agent/interfaces/README.md
index bd383e5ae..27e2ae9ae 100644
--- a/examples/gemini/python/docs-agent/docs_agent/interfaces/README.md
+++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/README.md
@@ -116,21 +116,21 @@ credentials (via `gcloud`) stored on your host machine.
    (`application_default_credentials.json`) in the `$HOME/.config/gcloud/`
    directory of your host machine.
 
-## 4. Clone the Docs Agent project repository
+## 4. Clone the Docs Agent project
 
 **Note**: This guide assumes that you're creating a new project directory
 from your `$HOME` directory.
 
-1. Clone the following internal repo:
+1. Clone the following repo:
 
    ```posix-terminal
-   git clone sso://doc-llm-internal/docs-agent
+   git clone https://github.com/google/generative-ai-docs.git
    ```
 
-2. Go to the project directory:
+2. Go to the Docs Agent project directory:
 
    ```posix-terminal
-   cd docs-agent
+   cd generative-ai-docs/examples/gemini/python/docs-agent
    ```
 
 3. Install dependencies using `poetry`:
@@ -139,8 +139,6 @@ from your `$HOME` directory.
    poetry install
    ```
 
-   This may take some time to complete.
-
 ## 5. Set up an alias to the gemini command
 
 **Note**: If your Docs Agent project is not cloned in the `$HOME` directory,

From 61758684020f0d2cae9f2959d062d7fa70f3dcff Mon Sep 17 00:00:00 2001
From: Kyo Lee
Date: Fri, 29 Mar 2024 20:00:49 +0000
Subject: [PATCH 02/12] [Release] Docs Agent version 0.3.1

What's changed:

- Bug fixes in the Docs Agent web app UI.
- Added more templates for the Docs Agent web app: widget and experimental
- A new custom splitter added: FIDL (.fidl) file specific splitter.
---
 .../python/docs-agent/apps_script/exportmd.gs | 1309 ++++++++++++++++-
 .../docs_agent/interfaces/README.md           |   12 +-
 .../docs_agent/interfaces/chatbot/__init__.py |    4 +-
 .../docs_agent/interfaces/chatbot/chatui.py   |   56 +-
 .../chatbot/static/css/style-chatui.css       |   30 +-
 .../css/{style-old.css => style-widget.css}   |  198 ++-
 .../chatbot/static/javascript/app.js          |   41 +
 .../chatbot/templates/chat-widget/base.html   |   25 +
 .../chatbot/templates/chat-widget/index.html  |   16 +
 .../chatbot/templates/chat-widget/result.html |  168 +++
 .../templates/chatui-experimental/base.html   |   27 +
 .../index.html}                               |    2 +-
 .../result.html}                              |    0
 .../chatbot/templates/chatui/index.html       |    6 +-
 .../chatbot/templates/chatui/result.html      |   16 +-
 .../docs-agent/docs_agent/interfaces/cli.py   |   14 +-
 .../preprocess/files_to_plain_text.py         |   58 +-
 .../preprocess/splitters/fidl_splitter.py     |   64 +
 .../preprocess/splitters/markdown_splitter.py |    4 +-
 .../docs-agent/docs_agent/utilities/config.py |   15 +-
 .../docs_agent/utilities/helpers.py           |    6 +-
 .../gemini/python/docs-agent/pyproject.toml   |    2 +-
 22 files changed, 1983 insertions(+), 90 deletions(-)
 mode change 120000 => 100644 examples/gemini/python/docs-agent/apps_script/exportmd.gs
 rename examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/{style-old.css => style-widget.css} (68%)
 create mode 100644 examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/base.html
 create mode 100644 examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/index.html
 create mode 100644 examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/result.html
 create mode 100644 examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui-experimental/base.html
 rename examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/{chatui/index_experimental.html => chatui-experimental/index.html} (90%)
 rename examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/{chatui/result_experimental.html => chatui-experimental/result.html} (100%)
 create mode 100644 examples/gemini/python/docs-agent/docs_agent/preprocess/splitters/fidl_splitter.py

diff --git a/examples/gemini/python/docs-agent/apps_script/exportmd.gs b/examples/gemini/python/docs-agent/apps_script/exportmd.gs
deleted file mode 120000
index 203251435..000000000
--- a/examples/gemini/python/docs-agent/apps_script/exportmd.gs
+++ /dev/null
@@ -1 +0,0 @@
-../third_party/g2docsmd-html/exportmd.gs
\ No newline at end of file
diff --git a/examples/gemini/python/docs-agent/apps_script/exportmd.gs b/examples/gemini/python/docs-agent/apps_script/exportmd.gs
new file mode 100644
index 000000000..283872c79
--- /dev/null
+++ b/examples/gemini/python/docs-agent/apps_script/exportmd.gs
@@ -0,0 +1,1308 @@
+/**
+ * Copyright 2023 Google LLC
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/* Original script is from:
+https://github.com/lmmx/gdocs2md-html/blob/master/exportmd.gs
+and commit: 0d86cfa
+Parsing from mangini/gdocs2md.
+Modified by clearf to add files to the google directory structure.
+Modified by lmmx to write Markdown, going back to HTML-incorporation.
+
+Usage:
+  NB: don't use on top-level doc (in root Drive folder) See comment in setupScript function.
+  Adding this script to your doc:
+    - Tools > Script Manager > New
+    - Select "Blank Project", then paste this code in and save.
+ Running the script: + - Tools > Script Manager + - Select "convertDocumentToMarkdown" function. + - Click Run button. + - Converted doc will be added to a "Markdown" folder in the source document's directories. + - Images will be added to a subfolder of the "Markdown" folder. +*/ + +function onInstall(e) { + onOpen(e); +} + +function onOpen() { + // Add a menu with some items, some separators, and a sub-menu. + setupScript(); +// In future: +// DocumentApp.getUi().createAddonMenu(); + DocumentApp.getUi().createMenu('Markdown') + .addItem('View as markdown', 'markdownPopup') + .addSubMenu(DocumentApp.getUi().createMenu('Export \u2192 markdown') + .addItem('Export to local file', 'convertSingleDoc') + .addItem('Export entire folder to local file', 'convertFolder') + .addItem('Customise markdown conversion', 'changeDefaults')) + .addSeparator() + .addSubMenu(DocumentApp.getUi().createMenu('Toggle comment visibility') + .addItem('Image source URLs', 'toggleImageSourceStatus') + .addItem('All comments', 'toggleCommentStatus')) + .addItem("Add comment", 'addCommentDummy') + .addToUi(); +} + +function changeDefaults() { + var ui = DocumentApp.getUi(); + var default_settings = '{ use your imagination... }'; + var greeting = ui.alert('This should be set up to display defaults from variables passed to getDocComments etc., e.g. something like:\n\nDefault settings are:' + + '\ncomments - not checking deleted comments.\nDocument - this document (alternatively specify a document ID).' + + '\n\nClick OK to edit these, or cancel.', + ui.ButtonSet.OK_CANCEL); + ui.alert("There's not really need for this yet, so this won't proceed, regardless of what you just pressed."); + return; + + // Future: + if (greeting == ui.Button.CANCEL) { + ui.alert("Alright, never mind!"); + return; + } + // otherwise user clicked OK + // user clicked OK, to proceed with editing these defaults. 
Ask case by case whether to edit + + var response = ui.prompt('What is x (default y)?', ui.ButtonSet.YES_NO_CANCEL); + + // Example code from docs at https://developers.google.com/apps-script/reference/base/button-set + // Process the user's response. + if (response.getSelectedButton() == ui.Button.YES) { + Logger.log('The user\'s name is %s.', response.getResponseText()); + } else if (response.getSelectedButton() == ui.Button.NO) { + Logger.log('The user didn\'t want to provide a name.'); + } else { + Logger.log('The user clicked the close button in the dialog\'s title bar.'); + } +} + +function setupScript() { + var script_properties = PropertiesService.getScriptProperties(); + script_properties.setProperty("user_email", Drive.About.get().user.emailAddress); + + // manual way to do the following: + // script_properties.setProperty("folder_id", "INSERT_FOLDER_ID_HERE"); + // script_properties.setProperty("document_id", "INSERT_FILE_ID_HERE"); + + var doc_id = DocumentApp.getActiveDocument().getId(); + script_properties.setProperty("document_id", doc_id); + var doc_parents = DriveApp.getFileById(doc_id).getParents(); + var folders = doc_parents; + while (folders.hasNext()) { + var folder = folders.next(); + var folder_id = folder.getId(); + } + script_properties.setProperty("folder_id", folder_id); + script_properties.setProperty("image_folder_prefix", ""); // add if modifying image location +} + +function addCommentDummy() { + // Dummy function to be switched during development for addComment + DocumentApp.getUi() + .alert('Cancelling comment entry', + "There's not currently a readable anchor for Google Docs - you need to write your own!" 
+ + + "\n\nThe infrastructure for using such an anchoring schema is sketched out in" + + " the exportmd.gs script's addComment function, for an anchor defined in anchor_props" + + + "\n\nSee github.com/lmmx/devnotes/wiki/Custom-Google-Docs-comment-anchoring-schema", + DocumentApp.getUi().ButtonSet.OK + ); + return; +} + +function addComment() { + + var doc_id = PropertiesService.getScriptProperties().getProperty('document_id'); + var user_email = PropertiesService.getScriptProperties().getProperty('email'); +/* Drive.Comments.insert({content: "hello world", + context: { + type: 'text/html', + value: 'hinges' + } + }, document_id); */ + var revision_list = Drive.Revisions.list(doc_id).items; + var recent_revision_id = revision_list[revision_list.length - 1].id; + var anchor_props = { + revision_id: recent_revision_id, + starting_offset: '', + offset_length: '', + total_chars: '' + } + insertComment(doc_id, 'hinges', 'Hello world!', my_email, anchor_props); +} + +function insertComment(fileId, selected_text, content, user_email, anchor_props) { + + // NB Deal with handling missing args + + /* + anchor_props is an object with 4 properties: + - revision_id, + - starting_offset, + - offset_length, + - total_chars + */ + + var context = Drive.newCommentContext(); + context.value = selected_text; + context.type = 'text/html'; + var comment = Drive.newComment(); + comment.kind = 'drive#comment'; + var author = Drive.newUser(); + author.kind = 'drive#user'; + author.displayName = user_email; + author.isAuthenticatedUser = true; + comment.author = author; + comment.content = type; + comment.context = context; + comment.status = 'open'; + comment.anchor = "{'r':" + + anchor_props.revision_id + + ",'a':[{'txt':{'o':" + + anchor_props.starting_offset + + ",'l':" + + anchor_props.offset_length + + ",'ml':" + + anchor_props.total_chars + + "}}]}"; + comment.fileId = fileId; + Drive.Comments.insert(comment, fileId); +} + +function decodeScriptSwitches(optional_storage_name) { + 
var property_name = (typeof(optional_storage_name) == 'string') ? optional_storage_name : 'switch_settings'; + var script_properties = PropertiesService.getScriptProperties(); + return script_properties + .getProperty(property_name) + .replace(/{|}/g,'') // Get the statements out of brackets... + .replace(',', ';'); // ...swap the separator for a semi-colon... + // ...evaluate the stored object string as statements upon string return and voila, switches interpreted +} + + +function getDocComments(comment_list_settings) { + var possible_settings = ['images', 'include_deleted']; + + // switches are processed and set on a script-wide property called "comment_switches" + var property_name = 'comment_switches'; + switchHandler(comment_list_settings, possible_settings, property_name); + + var script_properties = PropertiesService.getScriptProperties(); + var comment_switches = decodeScriptSwitches(property_name); + eval(comment_switches); + + var document_id = script_properties.getProperty("document_id"); + var comments_list = Drive.Comments.list(document_id, + {includeDeleted: include_deleted, + maxResults: 100 }); // 0 to 100, default 20 + // See https://developers.google.com/drive/v2/reference/comments/list for all options + var comment_array = []; + var image_sources = []; + // To collect all comments' image URLs to match against inlineImage class elements LINK_URL attribute + + for (var i = 0; i < comments_list.items.length; i++) { + var comment = comments_list.items[i]; + var comment_text = comment.content; + var comment_status = comment.status; + /* + images is a generic parameter passed in as a switch to + return image URL-containing comments only. + + If the parameter is provided, it's no longer undefined. 
+ */ + var img_url_regex = /(https?:\/\/.+?\.(png|gif|jpe?g))/; + var has_img_url = img_url_regex.test(comment_text); + + if (images && !has_img_url) continue; // no image URL, don't store comment + if (has_img_url) image_sources.push(RegExp.$1); + comment_array.push(comment); + } + script_properties.setProperty('image_source_URLs', image_sources) + return comment_array; +} + +function isValidAttrib(attribute) { // Sanity check function, called per element in array + + // Possible list of attributes to check against (leaving out unchanging ones like kind) + possible_attrs = [ + 'selfLink', + 'commentId', + 'createdDate', + 'modifiedDate', + 'author', + 'htmlContent', + 'content', + 'deleted', + 'status', + 'context', + 'anchor', + 'fileId', + 'fileTitle', + 'replies', + 'author' + ]; + + // Check if attribute(s) provided can be used to match/filter comments: + + if (typeof(attribute) == 'string' || typeof(attribute) == 'object') { + // Either a string/object (1-tuple) + + // Generated with Javascript, gist: https://gist.github.com/lmmx/451b301e1d78ed2c10b4 + + // Return false from the function if any of the attributes specified are not in the above list + + // If an object, the name is the key, otherwise it's just the string + if (attribute.constructor === Object) { + var att_keys = []; + for (var att_key in attribute) { + if (attribute.hasOwnProperty(att_key)) { + att_keys.push(att_key); + } + } + for (var n=0; n < att_keys.length; n++) { + var attribute_name = att_keys[n]; + var is_valid_attrib = (possible_attrs.indexOf(attribute_name) > -1); + + // The attribute needs to be one of the possible attributes listed above, match its given value(s), + // else returning false will throw an error from onAttribError when within getCommentAttributes + return is_valid_attrib; + } + } else if (typeof(attribute) == 'string') { + var attribute_name = attribute; + var is_valid_attrib = (possible_attrs.indexOf(attribute_name) > -1); + return is_valid_attrib; + // Otherwise is a 
valid (string) attribute + } else if (attribute.constructor === Array) { + return false; // Again, if within getCommentAttributes this will cause an error - shouldn't pass an array + } else { + // Wouldn't expect this to happen, so give a custom error message + Logger.log('Unknown type (assumed impossible) passed to isValidAttrib: ', attribute, attribute.constructor); + throw new TypeError('Unknown passed to isValidAttrib - this should be receiving 1-tuples only, see logs for details.'); + } + } else return false; // Neither string/object / array of strings &/or objects - not a valid attribute +} + +function getCommentAttributes(attributes, comment_list_settings) { + + // A filter function built on Comments.list, for a given list of attributes + // Objects' values are ignored here, only their property titles are used to filter comments. + + + /* + - attributes: array of attributes to filter/match on + - comment_list_settings: (optional) object with properties corresponding to switches in getDocComments + + This function outputs an array of the same length as the comment list, containing + values for all fields matched/filtered on. + */ + + + /* + * All possible comment attributes are listed at: + * https://developers.google.com/drive/v2/reference/comments#properties + */ + + // Firstly, describe the type in a message to be thrown in case of TypeError: + + var attrib_def_message = "'attributes' should be a string (the attribute to get for each comment), " + + "an object (a key-value pair for attribute and desired value), " + + "or an array of objects (each with key-value pairs)"; + + function onAttribError(message) { + Logger.log(message); + throw new TypeError(message); + } + + // If (optional) comment_list_settings isn't set, make a getDocComments call with switches left blank. 
+ if (typeof(comment_list_settings) == 'undefined') var comment_list_settings = {}; + if (typeof(attributes) == 'undefined') onAttribError(attrib_def_message); // no variables specified + + if (isValidAttrib(attributes)) { // This will be true if there's only one attribute, not provided in an array + + /* + Make a 1-tuple (array of 1) from either an object or a string, + i.e. a single attribute, with or without a defined value respectively. + */ + + var attributes = Array(attributes); + + } else if (attributes.constructor === Array) { + + // Check each item in the array is a valid attribute specification + for (var l = 0; l < attributes.length; l++) { + if (! isValidAttrib(attributes[l]) ) { + onAttribError('Error in attribute ' + + (l+1) + ' of ' + attributes.length + + '\n\n' + + attrib_def_message); + } + } + + } else { // Neither attribute nor array of attributes + throw new TypeError(attrib_def_message); + } + + // Attributes now holds an array of string and/or objects specifying a comment match and/or filter query + + var comment_list = getDocComments(comment_list_settings); + var comment_attrib_lists = []; + for (var i in comment_list) { + var comment = comment_list[i]; + var comment_attrib_list = []; + for (var j in attributes) { + var comment_attribute = comment_list[i][attributes[j]]; + comment_attrib_list.push(comment_attribute); + } + comment_attrib_lists.push(comment_attrib_list); + } + // The array comment_attrib_lists is now full of the requested attributes, + // of length equal to that of attributes + return comment_attrib_lists; +} + +// Example function to use getCommentAttributes: + +function filterComments(attributes, comment_list_settings) { + var comment_attributes = getCommentAttributes(attributes, comment_list_settings); + var m = attribs.indexOf('commentId') // no need to keep track of commentID array position + comm_attribs.map(function(attrib_pair) { + if (attrib_pair[1]); + }) +} + +function toggleCommentStatus(comment_switches){ + // 
Technically just image URL-containing comments, not sources just yet + var attribs = ['commentId', 'status']; + var comm_attribs = getCommentAttributes(attribs, comment_switches); + var rearrangement = []; + comm_attribs.map( + function(attrib_pair) { // for every comment return with the images_only / images: true comments.list setting, + switch (attrib_pair[1]){ // check the status of each + case 'open': + rearrangement.push([attrib_pair[0],'resolved']); + break; + case 'resolved': + rearrangement.push([attrib_pair[0],'open']); + break; + } + } + ); + var script_properties = PropertiesService.getScriptProperties(); + var doc_id = script_properties.getProperty("document_id"); + rearrangement.map( + function(new_attrib_pair) { // for every comment ID with flipped status + Drive.Comments.patch('{"status": "' + + new_attrib_pair[1] + + '"}', doc_id, new_attrib_pair[0]) + } + ); + return; +} + +function toggleImageSourceStatus(){ + toggleCommentStatus({images: true}); +} + +function flipResolved() { + // Flip the status of resolved comments to open, and open comments to resolved (respectful = true) + // I.e. 
make resolved URL-containing comments visible, without losing track of normal comments' status + + // To force all comments' statuses to switch between resolved and open en masse set respectful to false + + var switch_settings = {}; + switch_settings.respectful = true; + switch_settings.images_only = false; // If true, only switch status of comments with an image URL + switch_settings.switch_deleted_comments = false; // If true, also switch status of deleted comments + + var comments_list = getDocComments( + { images: switch_settings.images_only, + include_deleted: switch_settings.switch_deleted_comments }); + + // Note: these parameters are unnecessary if both false (in their absence assumed false) + // but included for ease of later reuse + + if (switch_settings.respectful) { + // flip between + } else { + // flip all based on status of first in list + } +} + +function markdownPopup() { + var css_style = ''; + + // The above was written with js since doesn't work: + // https://gist.github.com/lmmx/ec084fc351528395f2bb + + var mdstring = stringMiddleMan(); + + var htmlstring = + '' + + css_style + + '
'; + + var html5 = HtmlService.createHtmlOutput(htmlstring) + .setSandboxMode(HtmlService.SandboxMode.IFRAME) + .setWidth(800) + .setHeight(500); + + DocumentApp.getUi() + .showModalDialog(html5, 'Markdown output'); +} + +function stringMiddleMan() { + var returned_string; + convertSingleDoc({"return_string": true}); // for some reason needs the scope to be already set... + // could probably rework to use mdstring rather than returned_string, cut out middle man function + return this.returned_string; +} + +function convertSingleDoc(optional_switches) { + var script_properties = PropertiesService.getScriptProperties(); + // renew comments list on every export + var doc_comments = getDocComments(); + var image_urls = getDocComments({images: true}); // NB assumed false - any value will do + script_properties.setProperty("comments", doc_comments); + script_properties.setProperty("image_srcs", image_urls); + var folder_id = script_properties.getProperty("folder_id"); + var document_id = script_properties.getProperty("document_id"); + var source_folder = DriveApp.getFolderById(folder_id); + var markdown_folders = source_folder.getFoldersByName("Markdown"); + + var markdown_folder; + if (markdown_folders.hasNext()) { + markdown_folder = markdown_folders.next(); + } else { + // Create a Markdown folder if it doesn't exist. + markdown_folder = source_folder.createFolder("Markdown") + } + + convertDocumentToMarkdown(DocumentApp.openById(document_id), markdown_folder, optional_switches); +} + +function convertFolder() { + var script_properties = PropertiesService.getScriptProperties(); + var folder_id = script_properties.getProperty("folder_id"); + var source_folder = DriveApp.getFolderById(folder_id); + var markdown_folders = source_folder.getFoldersByName("Markdown"); + + + var markdown_folder; + if (markdown_folders.hasNext()) { + markdown_folder = markdown_folders.next(); + } else { + // Create a Markdown folder if it doesn't exist. 
+ markdown_folder = source_folder.createFolder("Markdown"); + } + + // Only try to convert google docs files. + var gdoc_files = source_folder.getFilesByType("application/vnd.google-apps.document"); + + // For every file in this directory + while(gdoc_files.hasNext()) { + var gdoc_file = gdoc_files.next() + + var filename = gdoc_file.getName(); + var md_files = markdown_folder.getFilesByName(filename + ".md"); + var update_file = false; + + if (md_files.hasNext()) { + var md_file = md_files.next(); + + if (md_files.hasNext()){ // There are multiple markdown files; delete and rerun + update_file = true; + } else if (md_file.getLastUpdated() < gdoc_file.getLastUpdated()) { + update_file = true; + } + } else { + // There is no folder and the conversion needs to be rerun + update_file = true; + } + + if (update_file) { + convertDocumentToMarkdown(DocumentApp.openById(gdoc_file.getId()), markdown_folder); + } + } +} + +function switchHandler(input_switches, potential_switches, optional_storage_name) { + + // Firstly, if no input switches were set, make an empty input object + if (typeof(input_switches) == 'undefined') input_switches = {}; + + // Use optional storage name if it's defined (must be a string), else use default variable name "switch_settings" + var property_name = (typeof(optional_storage_name) == 'string') ? optional_storage_name : 'switch_settings'; + + // Make a blank object to be populated and stored as the script-wide property named after property_name + var switch_settings = {}; + + for (var i in potential_switches) { + var potential_switch = potential_switches[i]; + + // If each switch has been set (in input_switches), evaluate it, else assume it's switched off (false): + + if (input_switches.propertyIsEnumerable(potential_switch)) { + + // Evaluates a string representing a statement which sets switch_settings properties from input_switches + // e.g. "switch_settings.images = true" when input_switches = {images: true} + + eval('switch_settings.' 
+ potential_switch + " = " + input_switches[potential_switch]); + + } else { + + // Alternatively, the evaluated statement sets anything absent from the input_switches object as false + // e.g. "switch_settings.images = false" when input_switches = {} and potential_switches = ['images'] + + eval('switch_settings.' + potential_switch + " = false"); + } + } + + PropertiesService.getScriptProperties().setProperty(property_name, switch_settings); + + /* + Looks bad but more sensible than repeatedly checking if arg undefined. + + Sets every variable named in the potential_switches array to false if + it wasn't passed into the input_switches object, otherwise evaluates. + + Any arguments not passed in are false, but so are any explicitly passed in as false: + all parameters are therefore Boolean until otherwise specified. + */ + +} + +function convertDocumentToMarkdown(document, destination_folder, frontmatter_input, optional_switches) { + // if returning a string, force_save_images will make the script continue - experimental + var possible_switches = ['return_string', 'force_save_images']; + var property_name = 'conversion_switches'; + switchHandler(optional_switches, possible_switches, property_name); + + // TODO switch off image storage if force_save_images is true - not necessary for normal behaviour + var script_properties = PropertiesService.getScriptProperties(); + var comment_switches = decodeScriptSwitches(property_name); + eval(comment_switches); + + var image_prefix = script_properties.getProperty("image_folder_prefix"); + var numChildren = document.getActiveSection().getNumChildren(); + if (frontmatter_input != "") { + var text = frontmatter_input; + } + else { + var text = "" + } + var md_filename = sanitizeFileName(document.getName()) + ".md"; + var image_foldername = document.getName()+"_images"; + var inSrc = false; + var inClass = false; + var globalImageCounter = 0; + var globalListCounters = {}; + // edbacher: added a variable for indent in src
<pre> block. Let style sheet do margin.
+  var srcIndent = "";
+
+  var postHasImages = false;
+
+  var files = [];
+
+  // Walk through all the child elements of the doc.
+  for (var i = 0; i < numChildren; i++) {
+    var child = document.getActiveSection().getChild(i);
+    var result = processParagraph(i, child, inSrc, globalImageCounter, globalListCounters, image_prefix + image_foldername);
+    globalImageCounter += (result && result.images) ? result.images.length : 0;
+    if (result!==null) {
+      if (result.sourceGlossary==="start" && !inSrc) {
+        inSrc=true;
+        text+="
\n";
+      } else if (result.sourceGlossary==="end" && inSrc) {
+        inSrc=false;
+        text+="
\n\n"; + } else if (result.sourceFigCap==="start" && !inSrc) { + inSrc=true; + text+="
\n";
+      } else if (result.sourceFigCap==="end" && inSrc) {
+        inSrc=false;
+        text+="
\n\n"; + } else if (result.source==="start" && !inSrc) { + inSrc=true; + text+="
\n";
+      } else if (result.source==="end" && inSrc) {
+        inSrc=false;
+        text+="
\n\n"; + } else if (result.inClass==="start" && !inClass) { + inClass=true; + text+="
\n";
+      } else if (result.inClass==="end" && inClass) {
+        inClass=false;
+        text+="
\n\n"; + } else if (inClass) { + text+=result.text+"\n\n"; + } else if (inSrc) { + text+=(srcIndent+escapeHTML(result.text)+"\n"); + } else if (result.text && result.text.length>0) { + text+=result.text+"\n\n"; + } + + if (result.images && result.images.length>0) { + for (var j=0; j/g, '>'); +} + +function standardQMarks(text) { + return text.replace(/\u2018|\u8216|\u2019|\u8217/g,"'").replace(/\u201c|\u8220|\u201d|\u8221/g, '"') +} + +// Process each child element (not just paragraphs). +function processParagraph(index, element, inSrc, imageCounter, listCounters, image_path) { + // First, check for things that require no processing. + if (element.getType() === DocumentApp.ElementType.UNSUPPORTED) { + return null; + } + if (element.getNumChildren()==0) { + return null; + } + // Skip on TOC. + if (element.getType() === DocumentApp.ElementType.TABLE_OF_CONTENTS) { + return {"text": "[[TOC]]"}; + } + + // Set up for real results. + var result = {}; + var pOut = ""; + var textElements = []; + var imagePrefix = "image_"; + + // Handle Table elements. Pretty simple-minded now, but works for simple tables. + // Note that Markdown does not process within block-level HTML, so it probably + // doesn't make sense to add markup within tables. + if (element.getType() === DocumentApp.ElementType.TABLE) { + textElements.push("\n"); + var nCols = element.getChild(0).getNumCells(); + for (var i = 0; i < element.getNumChildren(); i++) { + textElements.push(" \n"); + // process this row + for (var j = 0; j < nCols; j++) { + textElements.push(" \n"); + } + textElements.push(" \n"); + } + textElements.push("
" + element.getChild(i).getChild(j).getText() + "
\n"); + } + + // Need to handle this element type, return null for now + if (element.getType() === DocumentApp.ElementType.CODE_SNIPPET) { + return null + } + + // Process various types (ElementType). + for (var i = 0; i < element.getNumChildren(); i++) { + var t = element.getChild(i).getType(); + + if (t === DocumentApp.ElementType.TABLE_ROW) { + // do nothing: already handled TABLE_ROW + } else if (t === DocumentApp.ElementType.TEXT) { + var txt = element.getChild(i); + pOut += txt.getText(); + textElements.push(txt); + } else if (t === DocumentApp.ElementType.INLINE_IMAGE) { + var imglink = element.getChild(i).getLinkUrl(); + result.images = result.images || []; + var blob = element.getChild(i).getBlob() + var contentType = blob.getContentType(); + var extension = ""; + if (/\/png$/.test(contentType)) { + extension = ".png"; + } else if (/\/gif$/.test(contentType)) { + extension = ".gif"; + } else if (/\/jpe?g$/.test(contentType)) { + extension = ".jpg"; + } else { + throw "Unsupported image type: "+contentType; + } + + var name = imagePrefix + imageCounter + extension; + blob.setName(name); + + imageCounter++; + if (!return_string || force_save_images) { + textElements.push('![](' + image_path + '/' + name + ')'); + } else { + textElements.push('![](' + imglink + ')'); + } + //result.images.push( { + // "bytes": blob.getBytes(), + // "type": contentType, + // "name": name}); + + result.images.push({ "blob" : blob } ) + + // Need to fix this case TODO + } else if (t === DocumentApp.ElementType.INLINE_DRAWING) { + + imageCounter++; + if (!return_string || force_save_images) { + textElements.push('![](' + "drawing" + '/' + " name" + ')'); + } else { + textElements.push('![](' + "drawing" + ')'); + } + //result.images.push( { + // "bytes": blob.getBytes(), + // "type": contentType, + // "name": name}); + + // result.images.push({ "blob" : blob } ) + + } + else if (t === DocumentApp.ElementType.PAGE_BREAK) { + // ignore + } else if (t === 
DocumentApp.ElementType.HORIZONTAL_RULE) { + textElements.push('* * *\n'); + } else if (t === DocumentApp.ElementType.FOOTNOTE) { + textElements.push(' ('+element.getChild(i).getFootnoteContents().getText()+')'); + // Fixes for new elements + } else if (t === DocumentApp.ElementType.DATE) { + textElements.push(' ('+element.getChild(i)+')'); + } else if (t === DocumentApp.ElementType.RICH_LINK) { + textElements.push(' ('+element.getChild(i).getUrl()+')'); + } else if (t === DocumentApp.ElementType.PERSON) { + textElements.push(element.getChild(i).getName() + ', '); + } else if (t === DocumentApp.ElementType.UNSUPPORTED) { + textElements.push(' '); + } else { + throw "Paragraph "+index+" of type "+element.getType()+" has an unsupported child: " + +t+" "+(element.getChild(i)["getText"] ? element.getChild(i).getText():'')+" index="+index; + } + } + + if (textElements.length==0) { + // Isn't result empty now? + return result; + } + +// Fix for unrecognized command getIndentFirstLine + var ind_f = 0; + var ind_s = 0; + var ind_e = 0; + if (t === DocumentApp.ElementType.PARAGRAPH) { + + var ind_f = element.getIndentFirstLine(); + var ind_s = element.getIndentStart(); + var ind_e = element.getIndentEnd(); + } + var i_fse = [ind_f,ind_s,ind_e]; + var indents = {}; + for (indt=0;indt 0) indents[indname] = eval(indname); + // lazy test, null (no indent) is not greater than zero, but becomes set if indent 'undone' + } + var inIndent = (Object.keys(indents).length > 0); + + // evb: Add glossary and figure caption too. (And abbreviations: gloss and fig-cap.) 
+ // process source code block: + if (/^\s*---\s+gloss\s*$/.test(pOut) || /^\s*---\s+source glossary\s*$/.test(pOut)) { + result.sourceGlossary = "start"; + } else if (/^\s*---\s+fig-cap\s*$/.test(pOut) || /^\s*---\s+source fig-cap\s*$/.test(pOut)) { + result.sourceFigCap = "start"; + } else if (/^\s*---\s+src\s*$/.test(pOut) || /^\s*---\s+source code\s*$/.test(pOut)) { + result.source = "start"; + } else if (/^\s*---\s+class\s+([^ ]+)\s*$/.test(pOut)) { + result.inClass = "start"; + result.className = RegExp.$1.replace(/\./g,' '); + } else if (/^\s*---\s*$/.test(pOut)) { + result.source = "end"; + result.sourceGlossary = "end"; + result.sourceFigCap = "end"; + result.inClass = "end"; + } else if (/^\s*---\s+jsperf\s*([^ ]+)\s*$/.test(pOut)) { + result.text = ''; + } else { + + prefix = findPrefix(inSrc, element, listCounters); + + var pOut = ""; + for (var i=0; i): + if (gt === DocumentApp.GlyphType.BULLET + || gt === DocumentApp.GlyphType.HOLLOW_BULLET + || gt === DocumentApp.GlyphType.SQUARE_BULLET) { + prefix += "* "; + } else { + // Ordered list (
    ): + var key = listItem.getListId() + '.' + listItem.getNestingLevel(); + var counter = listCounters[key] || 0; + counter++; + listCounters[key] = counter; + prefix += counter+". "; + } + } + } + return prefix; +} + +function processTextElement(inSrc, txt) { + if (typeof(txt) === 'string') { + return txt; + } + + var pOut = txt.getText(); + if (! txt.getTextAttributeIndices) { + return pOut; + } + +// Logger.log("Initial String: " + pOut) + + // CRC introducing reformatted_txt to let us apply rational formatting that we can actually parse + var reformatted_txt = txt.copy(); + reformatted_txt.deleteText(0,pOut.length-1); + reformatted_txt = reformatted_txt.setText(pOut); + + var attrs = txt.getTextAttributeIndices(); + var lastOff = pOut.length; + // We will run through this loop multiple times for the things we care about. + // Font + // URL + // Then for alignment + // Then for bold + // Then for italic. + + // FONTs + var lastOff = pOut.length; // loop goes backwards, so this holds + for (var i=attrs.length-1; i>=0; i--) { + var off=attrs[i]; + var font=txt.getFontFamily(off) + if (font) { + while (i>=1 && txt.getFontFamily(attrs[i-1])==font) { + // detect fonts that are in multiple pieces because of errors on formatting: + i-=1; + off=attrs[i]; + } + reformatted_txt.setFontFamily(off, lastOff-1, font); + } + lastOff=off; + } + + // URL + // XXX TODO actually convert to URL text here. 
+ var lastOff=pOut.length; + for (var i=attrs.length-1; i>=0; i--) { + var off=attrs[i]; + var url=txt.getLinkUrl(off); + if (url) { + while (i>=1 && txt.getLinkUrl(attrs[i-1]) == url) { + // detect urls that are in multiple pieces because of errors on formatting: + i-=1; + off=attrs[i]; + } + reformatted_txt.setLinkUrl(off, lastOff-1, url); + } + lastOff=off; + } + + // alignment + var lastOff=pOut.length; + for (var i=attrs.length-1; i>=0; i--) { + var off=attrs[i]; + var alignment=txt.getTextAlignment(off); + if (alignment) { // + while (i>=1 && txt.getTextAlignment(attrs[i-1]) == alignment) { + i-=1; + off=attrs[i]; + } + reformatted_txt.setTextAlignment(off, lastOff-1, alignment); + } + lastOff=off; + } + + // strike + var lastOff=pOut.length; + for (var i=attrs.length-1; i>=0; i--) { + var off=attrs[i]; + var strike=txt.isStrikethrough(off); + if (strike) { + while (i>=1 && txt.isStrikethrough(attrs[i-1])) { + i-=1; + off=attrs[i]; + } + reformatted_txt.setStrikethrough(off, lastOff-1, strike); + } + lastOff=off; + } + + // bold + var lastOff=pOut.length; + for (var i=attrs.length-1; i>=0; i--) { + var off=attrs[i]; + var bold=txt.isBold(off); + if (bold) { + while (i>=1 && txt.isBold(attrs[i-1])) { + i-=1; + off=attrs[i]; + } + reformatted_txt.setBold(off, lastOff-1, bold); + } + lastOff=off; + } + + // italics + var lastOff=pOut.length; + for (var i=attrs.length-1; i>=0; i--) { + var off=attrs[i]; + var italic=txt.isItalic(off); + if (italic) { + while (i>=1 && txt.isItalic(attrs[i-1])) { + i-=1; + off=attrs[i]; + } + reformatted_txt.setItalic(off, lastOff-1, italic); + } + lastOff=off; + } + + + var mOut=""; // Modified out string + var harmonized_attrs = reformatted_txt.getTextAttributeIndices(); + reformatted_txt.getTextAttributeIndices(); // @lmmx: is this a typo...? + pOut = reformatted_txt.getText(); + + + // Markdown is farily picky about how it will let you intersperse spaces around words and strong/italics chars. 
This regex (hopefully) clears this up + // Match any number of \*, followed by spaces/word boundaries against anything that is not the \*, followed by boundaries, spaces and * again. + // Test case at http://jsfiddle.net/ovqLv0s9/2/ + + var reAlignStars = /(\*+)(\s*\b)([^\*]+)(\b\s*)(\*+)/g; + + var lastOff=pOut.length; + for (var i=harmonized_attrs.length-1; i>=0; i--) { + var off=harmonized_attrs[i]; + + var raw_text = pOut.substring(off, lastOff) + + var d1 = ""; // @lmmx: build up a modifier prefix + var d2 = ""; // @lmmx: ...and suffix + + var end_font; + + var mark_bold = false; + var mark_italic = false; + var mark_code = false; + var mark_sup = false; + var mark_sub = false; + var mark_strike = false; + + // The end of the text block is a special case. + if (lastOff == pOut.length) { + end_font = reformatted_txt.getFontFamily(lastOff - 1) + if (end_font) { + if (!inSrc && end_font===end_font.COURIER_NEW) { + mark_code = true; + } + } + if (reformatted_txt.isBold(lastOff -1)) { + mark_bold = true; + } + if (reformatted_txt.isItalic(lastOff - 1)) { + // edbacher: changed this to handle bold italic properly. 
+ mark_italic = true; + } + if (reformatted_txt.isStrikethrough(lastOff - 1)) { + mark_strike = true; + } + if (reformatted_txt.getTextAlignment(lastOff - 1)===DocumentApp.TextAlignment.SUPERSCRIPT) { + mark_sup = true; + } + if (reformatted_txt.getTextAlignment(lastOff - 1)===DocumentApp.TextAlignment.SUBSCRIPT) { + mark_sub = true; + } + } else { + end_font = reformatted_txt.getFontFamily(lastOff -1 ) + if (end_font) { + if (!inSrc && end_font===end_font.COURIER_NEW && reformatted_txt.getFontFamily(lastOff) != end_font) { + mark_code=true; + } + } + if (reformatted_txt.isBold(lastOff - 1) && !reformatted_txt.isBold(lastOff) ) { + mark_bold=true; + } + if (reformatted_txt.isStrikethrough(lastOff - 1) && !reformatted_txt.isStrikethrough(lastOff)) { + mark_strike=true; + } + if (reformatted_txt.isItalic(lastOff - 1) && !reformatted_txt.isItalic(lastOff)) { + mark_italic=true; + } + if (reformatted_txt.getTextAlignment(lastOff - 1)===DocumentApp.TextAlignment.SUPERSCRIPT) { + if (reformatted_txt.getTextAlignment(lastOff)!==DocumentApp.TextAlignment.SUPERSCRIPT) { + mark_sup = true; + } + } + if (reformatted_txt.getTextAlignment(lastOff - 1)===DocumentApp.TextAlignment.SUBSCRIPT) { + if (reformatted_txt.getTextAlignment(lastOff)!==DocumentApp.TextAlignment.SUBSCRIPT) { + mark_sub = true; + } + } + } + + if (mark_code) { + d2 = '`'; // shouldn't these go last? or will it interfere w/ reAlignStars? 
+ } + if (mark_bold) { + d2 = "**" + d2; + } + if (mark_italic) { + d2 = "*" + d2; + } + if (mark_strike) { + d2 = "" + d2; + } + if (mark_sup) { + d2 = '' + d2; + } + if (mark_sub) { + d2 = '' + d2; + } + + mark_bold = mark_italic = mark_code = mark_sup = mark_sub = mark_strike = false; + + var font=reformatted_txt.getFontFamily(off); + if (off == 0) { + if (font) { + if (!inSrc && font===font.COURIER_NEW) { + mark_code = true; + } + } + if (reformatted_txt.isBold(off)) { + mark_bold = true; + } + if (reformatted_txt.isItalic(off)) { + mark_italic = true; + } + if (reformatted_txt.isStrikethrough(off)) { + mark_strike = true; + } + if (reformatted_txt.getTextAlignment(off)===DocumentApp.TextAlignment.SUPERSCRIPT) { + mark_sup = true; + } + if (reformatted_txt.getTextAlignment(off)===DocumentApp.TextAlignment.SUBSCRIPT) { + mark_sub = true; + } + } else { + if (font) { + if (!inSrc && font===font.COURIER_NEW && reformatted_txt.getFontFamily(off - 1) != font) { + mark_code=true; + } + } + if (reformatted_txt.isBold(off) && !reformatted_txt.isBold(off -1) ) { + mark_bold=true; + } + if (reformatted_txt.isItalic(off) && !reformatted_txt.isItalic(off - 1)) { + mark_italic=true; + } + if (reformatted_txt.isStrikethrough(off) && !reformatted_txt.isStrikethrough(off - 1)) { + mark_strike=true; + } + if (reformatted_txt.getTextAlignment(off)===DocumentApp.TextAlignment.SUPERSCRIPT) { + if (reformatted_txt.getTextAlignment(off - 1)!==DocumentApp.TextAlignment.SUPERSCRIPT) { + mark_sup = true; + } + } + if (reformatted_txt.getTextAlignment(off)===DocumentApp.TextAlignment.SUBSCRIPT) { + if (reformatted_txt.getTextAlignment(off - 1)!==DocumentApp.TextAlignment.SUBSCRIPT) { + mark_sub = true; + } + } + } + + + if (mark_code) { + d1 = '`'; + } + + if (mark_bold) { + d1 = d1 + "**"; + } + + if (mark_italic) { + d1 = d1 + "*"; + } + + if (mark_sup) { + d1 = d1 + ''; + } + + if (mark_sub) { + d1 = d1 + ''; + } + + if (mark_strike) { + d1 = d1 + ''; + } + + var 
url=reformatted_txt.getLinkUrl(off); + if (url) { + mOut = d1 + '['+ raw_text +']('+url+')' + d2 + mOut; + } else { + var new_text = d1 + raw_text + d2; + new_text = new_text.replace(reAlignStars, "$2$1$3$5$4"); + mOut = new_text + mOut; + } + + lastOff=off; +// Logger.log("Modified String: " + mOut) + } + + mOut = pOut.substring(0, off) + mOut; + return mOut; +} \ No newline at end of file diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/README.md b/examples/gemini/python/docs-agent/docs_agent/interfaces/README.md index 27e2ae9ae..bd383e5ae 100644 --- a/examples/gemini/python/docs-agent/docs_agent/interfaces/README.md +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/README.md @@ -116,21 +116,21 @@ credentials (via `gcloud`) stored on your host machine. (`application_default_credentials.json`) in the `$HOME/.config/gcloud/` directory of your host machine. -## 4. Clone the Docs Agent project +## 4. Clone the Docs Agent project repository **Note**: This guide assumes that you're creating a new project directory from your `$HOME` directory. -1. Clone the following repo: +1. Clone the following internal repo: ```posix-terminal - git clone https://github.com/google/generative-ai-docs.git + git clone sso://doc-llm-internal/docs-agent ``` -2. Go to the Docs Agent project directory: +2. Go to the project directory: ```posix-terminal - cd generative-ai-docs/examples/gemini/python/docs-agent + cd docs-agent ``` 3. Install dependencies using `poetry`: @@ -139,6 +139,8 @@ from your `$HOME` directory. poetry install ``` + This may take some time to complete. + ## 5. 
Set up an alias to the gemini command **Note**: If your Docs Agent project is not cloned in the `$HOME` directory, diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/__init__.py b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/__init__.py index 629ffe29f..58c4c46d9 100644 --- a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/__init__.py +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/__init__.py @@ -19,7 +19,7 @@ from docs_agent.utilities import config -def create_app(product: config.ProductConfig): +def create_app(product: config.ProductConfig, app_mode: str = "web"): app = Flask(__name__) - app.register_blueprint(chatui.construct_blueprint(product_config=product)) + app.register_blueprint(chatui.construct_blueprint(product_config=product, app_mode=app_mode)) return app diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/chatui.py b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/chatui.py index 084dfa884..3e0b351ed 100644 --- a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/chatui.py +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/chatui.py @@ -45,19 +45,32 @@ # This is used to define the app blueprint using a productConfig -def construct_blueprint(product_config: config.ProductConfig): +def construct_blueprint(product_config: config.ProductConfig, app_mode: str = None): bp = Blueprint("chatui", __name__) if product_config.db_type == "google_semantic_retriever": docs_agent = DocsAgent(config=product_config, init_chroma=False) else: docs_agent = DocsAgent(config=product_config) - logging.info(f"Launching the flask app for product: {product_config.product_name}") + logging.info(f"Launching the Flask app for product: {product_config.product_name} with app_mode: {app_mode}") + # Assign templates and redirects + if app_mode == "web": + app_template = "chatui/index.html" + redirect_index = "chatui.index" + elif 
app_mode == "experimental": + app_template = "chatui-experimental/index.html" + redirect_index = "chatui-experimental.index" + elif app_mode == "widget": + app_template = "chat-widget/index.html" + redirect_index = "chat-widget.index" + else: + app_template = "chatui/index.html" + redirect_index = "chatui.index" @bp.route("/", methods=["GET", "POST"]) def index(): server_url = request.url_root.replace("http", "https") return render_template( - "chatui/index.html", + app_template, product=product_config.product_name, server_url=server_url, ) @@ -73,7 +86,7 @@ def api(): context, sources_ref, plain_token, - ) = ask_model_2_with_sources(input["question"], agent=docs_agent) + ) = ask_model_with_sources(input["question"], agent=docs_agent) source_array = [] for source in sources_ref: source_array.append(source.returnDictionary()) @@ -100,7 +113,7 @@ def like(): log_like(is_like, str(uuid_found).strip()) return "OK" else: - return redirect(url_for("chatui.index")) + return redirect(url_for(redirect_index)) @bp.route("/rewrite", methods=["GET", "POST"]) def rewrite(): @@ -149,10 +162,7 @@ def rewrite(): file.close() return "OK" else: - if product_config.docs_agent_config == "experimental": - return redirect(url_for("chatui.index_experimental")) - elif product_config.docs_agent_config == "normal": - return redirect(url_for("chatui.index")) + return redirect(url_for(redirect_index)) # Render a response page when the user asks a question # using input text box. 
@@ -160,17 +170,9 @@ def rewrite(): def result(): if request.method == "POST": question = request.form["question"] - if product_config.docs_agent_config == "experimental": - return ask_model2(question, agent=docs_agent, template="chatui/index_experimental.html") - elif product_config.docs_agent_config == "normal": - return ask_model2( - question, agent=docs_agent, template="chatui/index.html" - ) + return ask_model(question, agent=docs_agent, template=app_template) else: - if product_config.docs_agent_config == "experimental": - return redirect(url_for("chatui.index_experimental")) - elif product_config.docs_agent_config == "normal": - return redirect(url_for("chatui.index")) + return redirect(url_for(redirect_index)) # Render a response page when the user clicks a question # from the related questions list. @@ -178,17 +180,9 @@ def result(): def question(ask): if request.method == "GET": question = urllib.parse.unquote_plus(ask) - if product_config.docs_agent_config == "experimental": - return ask_model2(question, agent=docs_agent, template="chatui/index_experimental.html") - elif product_config.docs_agent_config == "normal": - return ask_model2( - question, agent=docs_agent, template="chatui/index.html" - ) + return ask_model(question, agent=docs_agent, template=app_template) else: - if product_config.docs_agent_config == "experimental": - return redirect(url_for("chatui.index_expiremental")) - elif product_config.docs_agent_config == "normal": - return redirect(url_for("chatui.index")) + return redirect(url_for(redirect_index)) return bp @@ -196,7 +190,7 @@ def question(ask): # Construct a set of prompts using the user question, send the prompts to # the lanaguage model, receive responses, and present them into a page. 
# Use template to specify a custom template for the classic web UI -def ask_model2(question, agent, template: str = "chatui/index.html"): +def ask_model(question, agent, template: str = "chatui/index.html"): # Returns a built context, a total token count of the context and an array # of sourceOBJ full_prompt = "" @@ -307,7 +301,7 @@ def ask_model2(question, agent, template: str = "chatui/index.html"): # Not fully implemented # This method is used for the API endpoint, so it returns values that can be # packaged as JSON -def ask_model_2_with_sources(question, agent): +def ask_model_with_sources(question, agent): docs_agent = agent full_prompt = "" context, plain_token, sources_ref = docs_agent.query_vector_store_to_build( diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/style-chatui.css b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/style-chatui.css index fb67d4b4b..7155e750d 100644 --- a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/style-chatui.css +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/style-chatui.css @@ -93,18 +93,20 @@ body { font-size: 0.9em; font-family: system-ui; line-height: 150%; + word-break: break-word; padding: 4px; } #response-box { font-size: 1.0em; font-family: sans-serif; - line-height: 100%; + line-height: 140%; margin-top: 10px; } #suggested-questions { font-family: sans-serif; + word-break: break-word; } #context-content{ @@ -176,6 +178,13 @@ body { margin-top: 12px; } + #answerable-span { + font-size: small; + font-family: system-ui; + float: right; + padding: 10px; + } + /* ======= Style by class ======= */ .hidden { @@ -240,6 +249,7 @@ body { .related-questions { margin-bottom: 20px; font-size: 0.9em; + line-height: 140%; } /* ======= Style buttons by ID ======= */ @@ -291,7 +301,7 @@ body { #edit-text-area { font: 13px/1.5em Overpass, "Open Sans", Helvetica, sans-serif; max-height: 500px; - max-width: 
650px; + max-width: -webkit-fill-available; height: 300px; width: 650px; padding: 8px; @@ -340,7 +350,7 @@ body { .search input[type="text"] { border: 0; - width: 91%; + width: calc(100% - 65px); padding: 10px; } @@ -505,3 +515,17 @@ body { margin-bottom: 0; } +/* Loader animation */ +/* Source: https://css-loaders.com/classic/ */ +.loader { + width: fit-content; + font-family: monospace; + font-size: 14px; + margin-left: 13px; + clip-path: inset(0 3ch 0 0); + animation: animation 1s steps(4) infinite; +} +.loader:before { + content:"Generating a response..." +} +@keyframes animation {to{clip-path: inset(0 -1ch 0 0)}} diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/style-old.css b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/style-widget.css similarity index 68% rename from examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/style-old.css rename to examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/style-widget.css index b1c0c1b3c..a0732ba9f 100644 --- a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/style-old.css +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/css/style-widget.css @@ -22,13 +22,13 @@ body { font-weight: 300; max-width: 960px; margin: auto; - background-color: #d9d9d9; + background-color: white; padding-top: 15px; padding-bottom: 15px; } a { - color: #22578c; + color: #0a619a; } p { @@ -39,15 +39,18 @@ body { h1 { margin: 0 0 0.5em; font-weight: 500; - font-size: 2.0em; - margin-left: 0.8em; - margin-top: 0.3em; + font-size: 1.3em; + margin-top: 0.1em; + margin-left: 0.1em; + margin-bottom: 0.9em; } h2 { margin: 0; - margin-top: 17px; - margin-bottom: 15px; + margin-top: 15px; + margin-bottom: 10px; + font-weight: normal; + font-size: 1.4em; } h3 { @@ -58,22 +61,32 @@ body { h4 { color: #505050; - margin: 0; margin-top: 3px; - margin-bottom: 10px; + margin-left: 5px; + 
margin-bottom: 8px; } li { margin: 0 0 0.3em; } - /* code { - font-family: math; - color: darkgreen; - } */ + code { + font-family: math; + color: darkgreen; + text-wrap: pretty; + } /* ======= Style layout by ID ======= */ + #iframe-box { + margin: 0px; + max-width: 760px; + font: 15px arial, sans-serif; + background-color: white; + padding-bottom: 0px; + padding-left: 0px; + } + #callout-box { margin: auto; max-width: 800px; @@ -86,6 +99,108 @@ body { border-radius: 15px; } + #important-box { + font-size: 0.9em; + font-family: system-ui; + word-break: break-word; + line-height: 150%; + word-break: break-word; + padding: 4px; + } + + #response-box { + font-size: 1.0em; + font-family: sans-serif; + line-height: 140%; + margin-top: 10px; + } + + #suggested-questions { + font-family: sans-serif; + word-break: break-word; + } + + #source-pages { + font-family: sans-serif; + word-break: break-all; + } + + #context-content{ + background: #d7dbd7; + font-family: sans-serif; + word-break: break-all; + } + + #context-pre { + font-size: small; + font-family: monospace; + text-wrap: pretty; + margin-top: 0.3px; + } + + #probability-box { + font-size: small; + padding: 4px; + margin-bottom: 10px; + } + + #grounding-box { + font-size: small; + padding: 4px; + word-break: break-all; + } + + #grounding-pre { + font-size: small; + font-family: monospace; + text-wrap: pretty; + } + + #reference-box { + font-size: 0.9em; + font-family: system-ui; + text-wrap: pretty; + word-break: break-all; + margin-bottom: 12px; + line-height: 1.5em; + } + + #reference-box-no-aqa { + font-size: 0.9em; + font-family: system-ui; + text-wrap: pretty; + word-break: break-all; + line-height: 1.5em; + } + + #aqa-content{ + background: #9fc7db; + font-family: math; + } + + #aqa-label{ + background: #49a5d2; + } + + #aqa-json { + font-family: system-ui; + font-size: small; + text-wrap: pretty; + word-break: break-all; + margin: 0; + } + + #rewrite-buttons-box { + margin-top: 12px; + } + + 
#answerable-span { + font-size: small; + font-family: system-ui; + float: right; + padding: 10px; + } + /* ======= Style by class ======= */ .hidden { @@ -137,7 +252,7 @@ body { border-radius: 15px; } - .question, .response, .response-text, .fact-checked-text, .related-questions { + .question, .response, .response-text, .fact-checked-text { max-width: 700px; margin-left: 3px; } @@ -147,6 +262,18 @@ body { margin-left: 10px; } + .related-questions { + margin-bottom: 20px; + font-size: 0.9em; + line-height: 140%; + } + + .relevant-sources { + margin-bottom: 20px; + font-size: 0.9em; + line-height: 140%; + } + /* ======= Style buttons by ID ======= */ #rewrite-button { @@ -156,6 +283,7 @@ body { padding: 7px; border-radius: 5px; cursor:pointer; + margin-top: 0.3em; } #rewrite-button:hover { @@ -189,14 +317,15 @@ body { #submit-result { color: #027f02d6; + font-family: fantasy; } #edit-text-area { font: 13px/1.5em Overpass, "Open Sans", Helvetica, sans-serif; max-height: 500px; - max-width: 650px; + max-width: -webkit-fill-available; height: 300px; - width: 650px; + width: 580px; padding: 8px; } @@ -234,7 +363,7 @@ body { .search { border: 2px solid #CF5C3F; overflow: auto; - max-width: 700px; + max-width: 600px; margin-top: 15px; margin-left: 10px; margin-bottom: 10px; @@ -243,7 +372,7 @@ body { .search input[type="text"] { border: 0; - width: 91%; + width: calc(100% - 65px); padding: 10px; } @@ -269,7 +398,7 @@ body { .accordion { max-width: 65em; - margin-bottom: 1em; + #margin-bottom: 1em; } .accordion > input[type="checkbox"] { @@ -284,7 +413,8 @@ body { } .accordion .reference-content { - font-size: 13px; + font-size: 15px; + font-family: serif; } .accordion > input[type="checkbox"]:checked ~ .content { @@ -298,17 +428,19 @@ body { .accordion .handle { margin: 0; - font-size: 1.125em; - line-height: 1.2em; + font-size: 1.0em; } .accordion label { display: block; font-weight: normal; border: 2px solid #000; - padding: 12px; + #padding: 12px; background: #4490b8ab; 
- border-radius: 15px; + #border-radius: 15px; + padding: 5px; + #background: #027f023b; + border-radius: 10px; } .accordion label:hover, @@ -403,4 +535,20 @@ body { .accordion-source p:last-child { margin-bottom: 0; - } \ No newline at end of file + } + +/* Loader animation */ +/* Source: https://css-loaders.com/classic/ */ +.loader { + width: fit-content; + font-family: monospace; + font-size: 14px; + margin-left: 13px; + clip-path: inset(0 3ch 0 0); + animation: animation 1s steps(4) infinite; +} +.loader:before { + content:"Generating a response..." +} +@keyframes animation {to{clip-path: inset(0 -1ch 0 0)}} + diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/javascript/app.js b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/javascript/app.js index cf5065e9e..32908b566 100644 --- a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/javascript/app.js +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/static/javascript/app.js @@ -26,6 +26,20 @@ if (askButton != null){ }); } +// Display the "loading" message when a related question is clicked. 
+let relatedQuestions = document.getElementById('suggested-questions'); + +if (relatedQuestions != null){ + questions = relatedQuestions.getElementsByTagName('a'); + for(i=0; i { + for (const entry of entries) { + const width = entry.contentRect.width; + console.log('Element width changed to:', width); + if (width < 520) { + answerableSpan.classList.add("hidden"); + }else{ + if (answerableSpan.classList.contains("hidden")) { + answerableSpan.classList.remove("hidden"); + } + } + if (width < 300) { + rewriteButton.classList.add("hidden"); + }else{ + if (rewriteButton.classList.contains("hidden")) { + rewriteButton.classList.remove("hidden"); + } + } + } +}); + +resizeObserver.observe(element); diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/base.html b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/base.html new file mode 100644 index 000000000..42987dee4 --- /dev/null +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/base.html @@ -0,0 +1,25 @@ + + + + + + + + {{ product }} ChatBot + + +
    +

    {{ product }} Docs Agent

    +
    +
    + {% block content %} {% endblock %} +
    +
    +
    + + + + diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/index.html b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/index.html new file mode 100644 index 000000000..437b9bfd2 --- /dev/null +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/index.html @@ -0,0 +1,16 @@ +{% extends 'chat-widget/base.html' %} + +{% block content %} + + + {% if response %} +
    + {% include 'chat-widget/result.html' %} +
    + {% endif %} +{% endblock %} diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/result.html b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/result.html new file mode 100644 index 000000000..e3c65c759 --- /dev/null +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chat-widget/result.html @@ -0,0 +1,168 @@ + +{% if search_result[0] %} + {% set check_url = search_result[0].section.url %} + {% set probability = search_result[0].probability %} +{% else %} + {% set check_url = "None" %} + {% set probability = "0.0" %} +{% endif %} + +{% if aqa_response_in_html != "" %} + +
    +

    {{ question | replace("+", " ") | replace("%3F", "?")}}

    +
    + {% if probability|float > 0.7 %} +
    + Important: The answer below is generated by Gemini. Please + verify this answer by visiting the source pages. +
    +
    + + {{ md_to_html(response) | safe }} + +
    +
    +

    Source pages

    + +
      + {% for source in search_result %} + {% set section_url = named_link_html(source.section.url, source.section.url) %} +
    1. {{ section_url | safe}}
    2. + {% endfor %} +
        + +
    + + {% else %} +
    + + I'm sorry, but I cannot answer this question at the moment. + The answer to your question does not seem to be available in the documentation.
    +
    + However, based on the documentation available in the corpus, you could ask questions like: +
    +
    + + {% endif %} + +{% else %} + +
    +

    {{ question | replace("+", " ") | replace("%3F", "?")}}

    +
    +
    + + Important: The answer below is generated by the Gemini Pro model. + To verify this answer, please visit: {{named_link_html(check_url, check_url) | safe}} +
    +
    + + {{ md_to_html(response) | safe }} + +
    + +
    + +

    + +

    +
    + + {% for source in search_result %} +
    {{md_to_html(source.section.content) | safe}}Reference [{{loop.index}}]
    + {% endfor %} +
    + {% for source in search_result %} + {% set section_url = named_link_html(source.section.url, source.section.url) %} + [{{loop.index}}]{{ section_url | safe}} ({{ source.distance }}) +
    + {% endfor %} +
    +
    +
    +{% endif %} + + +
    + + + {% if aqa_response_in_html != "" %} + + (Answerable probability: {{"%.3f"|format(probability|float)}}) + + {% endif %} +
    + + diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui-experimental/base.html b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui-experimental/base.html new file mode 100644 index 000000000..944984182 --- /dev/null +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui-experimental/base.html @@ -0,0 +1,27 @@ + + + + + + + + {{ product }} ChatBot + + +
    +

    {{ product }} Docs Agent

    +
    +
    + {% block content %} {% endblock %} +
    +
    +
    +
    +
    + + + + diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/index_experimental.html b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui-experimental/index.html similarity index 90% rename from examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/index_experimental.html rename to examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui-experimental/index.html index 088fb364b..97376ac8a 100644 --- a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/index_experimental.html +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui-experimental/index.html @@ -12,7 +12,7 @@ {% if response %}
    - {% include 'chatui/result_experimental.html' %} + {% include 'chatui-experimental/result.html' %}
    {% endif %} {% endblock %} \ No newline at end of file diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/result_experimental.html b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui-experimental/result.html similarity index 100% rename from examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/result_experimental.html rename to examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui-experimental/result.html diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/index.html b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/index.html index 5d00e10b5..042f18d02 100644 --- a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/index.html +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/index.html @@ -7,12 +7,10 @@ - + {% if response %}
    {% include 'chatui/result.html' %}
    {% endif %} -{% endblock %} \ No newline at end of file +{% endblock %} diff --git a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/result.html b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/result.html index 37e695141..4bc14b304 100644 --- a/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/result.html +++ b/examples/gemini/python/docs-agent/docs_agent/interfaces/chatbot/templates/chatui/result.html @@ -1,10 +1,18 @@ + +{% if search_result[0] %} + {% set check_url = search_result[0].section.url %} + {% set probability = search_result[0].probability %} +{% else %} + {% set check_url = "None" %} + {% set probability = "0.0" %} +{% endif %} + {% if aqa_response_in_html != "" %}

    {{ question | replace("+", " ") | replace("%3F", "?")}}

    - {% set check_url = search_result[0].section.url %} Important: The answer below is generated by the Gemini AQA model. To verify this answer, please visit: {{named_link_html(check_url, check_url) | safe}}
    @@ -26,7 +34,6 @@

    - {% set probability = search_result[0].probability %} Answerable probability: {{"%.6f"|format(probability|float)}}
    @@ -60,7 +67,6 @@

    {{ question | replace("+", " ") | replace("%3F", "?")}}

    - {% set check_url = search_result[0].section.url %} Important: The answer below is generated by the Gemini Pro model. To verify this answer, please visit: {{named_link_html(check_url, check_url) | safe}}
    @@ -100,6 +106,9 @@

    + + (Answerable probability: {{"%.3f"|format(probability|float)}}) +