
Polish on books #916

Merged · 36 commits merged into main on Jan 17, 2024
Conversation

@epugh epugh (Member) commented Jan 3, 2024

Description

Moving from a few hundred queries at most to supporting thousands of queries requires rethinking a number of the interaction models. The amount of processing you can complete within the couple of seconds of a web request/response cycle is limited. This large PR does a number of things:

  • Combining ActiveJob for background processing with ActiveStorage for managing temporary large files lets us move processing work to a background job, freeing up the web request/response cycle. This has been done for populating a book from a case and for importing a book from JSON (see the job sketch after this list).

  • In testing populating a book with 5,000 queries, each with 10 docs (50,000 objects in total), the web frontend easily posts an 87 MB JSON payload to the backend. It's the processing of all that data that takes a long time. With compression, we can take the 87 MB of JSON down to a 21 MB zipped file and store that. Once the background job finishes, it deletes that file, so the fact that we are storing a big binary in our database for a period of time is probably okay.

  • In the vein of being more cognizant of what lots of queries means, closing the Judgements modal now re-runs the queries only if you have imported new ratings from the book. Just opening it and then closing it won't prompt a re-run.

  • We've also made the integration from a Case to creating a Book a bit better by passing along the teams that a Case is already shared with to pre-populate the Book creation screen.

  • We're starting to see more demand to interact with Quepid via its APIs, so we've documented them more, specifically the ones for interacting with a book, query doc pairs, and judgements. So check out /apipie for more info!

  • You can now round-trip all the attributes of a query to a book and back to a new query in a new case. We preserve the information_need, notes, and any options defined on a query through that lifecycle.

  • We now expose a proper screen for importing an exported book's JSON file, plus better handling of the exported JSON file itself.
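
To make the first two bullets concrete, here is a minimal sketch of the background side of the flow, assuming a hypothetical PopulateBookJob and an import_queries helper (the actual job and method names in the PR may differ):

require 'zlib'
require 'json'

class PopulateBookJob < ApplicationJob
  queue_as :default

  def perform(book)
    # Download the compressed payload the controller attached via ActiveStorage.
    compressed_data = book.populate_file.download

    # Reverse the controller's Zlib::Deflate to recover the original JSON.
    serialized_data = Zlib::Inflate.inflate(compressed_data)
    payload = JSON.parse(serialized_data)

    # Hypothetical helper that creates the queries and docs described by the payload.
    import_queries(book, payload)

    # The blob is only needed while the job runs, so clean it up afterwards.
    book.populate_file.purge
  end
end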

Motivation and Context

How Has This Been Tested?

Screenshots or GIFs (if appropriate):

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] Improvement (non-breaking change which improves existing functionality)
  • [ ] New feature (non-breaking change which adds new functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • [ ] My code follows the code style of this project.
  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the documentation accordingly.
  • [ ] I have read the CONTRIBUTING document.
  • [ ] I have added tests to cover my changes.
  • [ ] All new and existing tests passed.

@epugh epugh merged commit 56c1c48 into main Jan 17, 2024
4 checks passed
puts "[PopulateController] the size of the serialized data is #{number_to_human_size(serialized_data.bytesize)}"
compressed_data = Zlib::Deflate.deflate(serialized_data)
puts "[PopulateController] the size of the compressed data is #{number_to_human_size(compressed_data.bytesize)}"
@book.populate_file.attach(io: StringIO.new(compressed_data), filename: "book_populate_#{@book.id}.bin.zip",
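
The attach call in the excerpt is cut off; for reference, a complete call would typically also pass a content type, something like this (the exact arguments in the PR may differ):

@book.populate_file.attach(io: StringIO.new(compressed_data),
                           filename: "book_populate_#{@book.id}.bin.zip",
                           content_type: 'application/zip')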
@dacox dacox commented Jan 17, 2024
@epugh I am not versed in ActiveJob, or Rails generally.

Storing compressed blobs in the database is probably fine considering what I imagine are relatively small deployments of Quepid.

For the hosted Quepid service, it is a bit more worrying.

My web experience is mostly Django and Celery. I would typically save a large file to a blob store like S3, and then download it in the async task.

Or, if I could guarantee a file was only required by one async task, or that all related tasks would run on the same machine, I would save to local disk.

Btw, the extra documentation of the API is awesome.

@epugh epugh (Member, Author)

Thanks for the feedback. Yeah, generally with ActiveStorage you back it with S3. I went with the DB for now just to avoid adding "yet another Quepid setup task," and because the blobs shouldn't last a long time. Running app.quepid.com on Heroku means local disk doesn't work, because each web worker has its own local disk :-(. We'll see!
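
For the record, switching ActiveStorage over to S3 later is mostly configuration; a minimal sketch, assuming a standard Rails setup where an "amazon" service is declared in config/storage.yml (the bucket name and credential keys below are placeholders):

# config/storage.yml would declare the service, e.g.:
#   amazon:
#     service: S3
#     access_key_id: <%= Rails.application.credentials.dig(:aws, :access_key_id) %>
#     secret_access_key: <%= Rails.application.credentials.dig(:aws, :secret_access_key) %>
#     region: us-east-1
#     bucket: quepid-active-storage   # placeholder bucket name
#
# config/environments/production.rb then points ActiveStorage at it:
Rails.application.configure do
  config.active_storage.service = :amazon
end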
