Vocab Mapper: release plan #144

Open
josephjclark opened this issue Jan 24, 2025 · 0 comments

josephjclark commented Jan 24, 2025

We currently have a working prototype of the vocab mapper, running as an OpenFn workflow with Google Sheets. This issue is a high-level design for how we want to enable users to use the vocab mapper themselves.

The short version is:

  • Create a Sheets extension (or script) that's shareable and exposes the settings we need
  • Create a public "free" version for testing by anyone
  • Enable users to create and own their own mapper workflow

Credentials

A word on credentials: there are currently two credentials which users have to provide to the vocab mapper:

  • Google Sheets OAuth, needed to read from and write to the target spreadsheet. Presumably this isn't technically needed when using a public sheet (I'll bet the adaptor blows up though)
  • Anthropic - used by the vocab mapper

We have options for how to manage these:

  • We can safely store the credentials in the OpenFn workflow (this is what the prototype does today)
  • We can demand credentials from the Sheets extension. Easy for Anthropic but maybe harder for Sheets (can we take the user's OAuth token from within the extension and post it out to OpenFn? Surely that's not secure?)

There are also two hidden credentials which Apollo is currently owning:

  • Pinecone: used to store embedded terms (i.e. LOINC and SNOMED)
  • OpenAI: used to convert the user's inputs into embeddings on-the-fly

The Plan

Users use Google Sheets via a script or extension. We have the ability to prompt users for options (like API keys) and save them to the sheet for later use.

Users wanting to use the vocab mapper should set up their own workflow on OpenFn.

This comes with several advantages:

  • Users can manage whatever credentials they want in OpenFn. It is secure and safe.
  • If users want to use a private sheet for their mappings (and why not?!) they can just use their Sheets OAuth token and that'll handle their permissions
  • Users can customise the workflow however they like. They can send different options to the mapper (when we provide them anyway). They can process the data before sending it back into the sheet. They can add a step to generate a JSON lookup object. Whatever!
  • It brings new users into the platform
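
As an illustration of the "JSON lookup object" customisation mentioned above, here is a minimal sketch of what such a step's logic might look like. The row shape (`source`, `code`, `system`) and the function name `buildLookup` are assumptions made for the example; the real sheet columns and mapper output are not specified in this issue.

```typescript
// Hypothetical row shape: one mapped term per row. Field names are assumptions.
interface MappingRow {
  source: string; // the user's original term
  code: string;   // the mapped code (empty if no mapping was found)
  system: string; // the target vocabulary, e.g. "loinc" or "snomed"
}

type Lookup = Record<string, { code: string; system: string }>;

// Build a lookup keyed by the normalised source term.
// Rows with no source term or no mapped code are skipped.
function buildLookup(rows: MappingRow[]): Lookup {
  const lookup: Lookup = {};
  rows.forEach((row) => {
    const key = row.source.trim().toLowerCase();
    if (key.length === 0 || row.code.length === 0) {
      return;
    }
    lookup[key] = { code: row.code, system: row.system };
  });
  return lookup;
}

// Usage with placeholder codes (not real vocabulary codes)
const exampleLookup = buildLookup([
  { source: " Fever ", code: "CODE-1", system: "loinc" },
  { source: "cough", code: "", system: "snomed" },
]);
console.log(JSON.stringify(exampleLookup));
```

Normalising the key (trim plus lower-case) is a design choice for the sketch, so that later lookups don't depend on how the term was typed into the sheet.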

But this only works if we have a really slick process to let them "import" this workflow (or select from a template)

We should also leave our current prototype workflow running as a public demo. This means that users can use the Sheets extension to run the mapper with default settings. It just magically works and gives instant gratification.

We need to decide if we want to pay for the AI usage or ask users to submit an API key. We should be fine for fair usage but if we go totally public we may be open to abuse here. See below.

Public Demo

We should enable a totally free public demo, which requires no settings or credentials from users. They can just set up their spreadsheet, add the script or extension, and call our public workflow endpoint. And boom! Magic.

But the demo should be limited to maybe 10 input values at a time - a limitation imposed in the workflow. Maybe we also need to consider some kind of rate limit per sheet as well (tracked via collections I guess)
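
To make the two demo limits concrete, here is a rough sketch of the check the public workflow might run before calling the mapper. Everything here is an assumption for illustration: the function name, the per-sheet quota of 5 runs per hour, and the in-memory object standing in for whatever persistent store (e.g. collections) we end up using. Only the "10 inputs" figure comes from the text above.

```typescript
// All names and numbers below are hypothetical, apart from the 10-input cap.
const MAX_DEMO_INPUTS = 10;       // "maybe 10 input values at a time"
const MAX_RUNS_PER_WINDOW = 5;    // assumed per-sheet quota
const WINDOW_MS = 60 * 60 * 1000; // assumed one-hour window

type DemoCheck = { ok: true } | { ok: false; reason: string };

// Stand-in for a persistent store of run timestamps, keyed by sheet id.
const runsBySheet: Record<string, number[]> = {};

function checkDemoRequest(sheetId: string, inputs: string[], now: number): DemoCheck {
  // Limit 1: cap the number of input values per run.
  if (inputs.length > MAX_DEMO_INPUTS) {
    return { ok: false, reason: "Demo is limited to " + MAX_DEMO_INPUTS + " inputs per run" };
  }
  // Limit 2: sliding-window rate limit per sheet.
  const recent = (runsBySheet[sheetId] || []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= MAX_RUNS_PER_WINDOW) {
    runsBySheet[sheetId] = recent;
    return { ok: false, reason: "Rate limit reached for this sheet" };
  }
  // Accepted: record this run against the sheet.
  recent.push(now);
  runsBySheet[sheetId] = recent;
  return { ok: true };
}
```

Rejected requests are not counted against the quota in this sketch; whether they should be is an open question.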

Work needed

Sheets extension

The sheets extension MAY need to capture an Anthropic API key

The sheets extension MAY need to capture additional options - like what datasets to map to. But we can probably

The sheets extension probably needs to take as an option the webhook URL to call.

The sheets extension needs to be distributable, i.e. how do we share it?
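
To picture how the options above might hang together, here is a small sketch of the settings the extension could store and how they might be turned into a webhook request. The interface names, field names and payload shape are all assumptions; nothing here reflects the prototype's actual request format.

```typescript
// Hypothetical settings saved by the Sheets extension. All field names are assumptions.
interface MapperSettings {
  webhookUrl: string;       // the workflow webhook to call
  anthropicApiKey?: string; // only if the extension ends up capturing a key
  datasets?: string[];      // which datasets to map to, if we expose that option
}

interface MapperOptions {
  datasets?: string[];
  anthropicApiKey?: string;
}

interface MapperRequest {
  url: string;
  body: { inputs: string[]; options: MapperOptions };
}

// Turn the saved settings plus the selected cell values into a request description.
function buildMapperRequest(settings: MapperSettings, cellValues: string[]): MapperRequest {
  if (!/^https:\/\//.test(settings.webhookUrl)) {
    throw new Error("webhookUrl must be an https URL");
  }
  // Drop empty cells and surrounding whitespace before sending.
  const inputs = cellValues
    .map((v) => v.trim())
    .filter((v) => v.length > 0);
  // Only include optional settings that were actually provided.
  const options: MapperOptions = {};
  if (settings.datasets && settings.datasets.length > 0) {
    options.datasets = settings.datasets;
  }
  if (settings.anthropicApiKey) {
    options.anthropicApiKey = settings.anthropicApiKey;
  }
  return { url: settings.webhookUrl, body: { inputs: inputs, options: options } };
}
```

With no optional settings saved, the request carries only the inputs, which matches the "default settings" behaviour we want for the public demo.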

Workflow Repo

We will need to save the workflow as a repo on GitHub, nicely packaged, with a full suite of docs explaining how to get it into Lightning.

I think the process would be:

  • clone or fork the repo
  • If you fork the repo, you should be able to GitHub-sync it into your project (right??)
  • If you clone it, you can push it to your project with the CLI (right??)

We do need to ensure that users cannot accidentally sync back to the core/template repo and overwrite it for others. Maybe that means the instruction is always to fork the repo.

@josephjclark josephjclark changed the title WIP Vocab Mapper: release plan Vocab Mapper: release plan Jan 24, 2025