-
Notifications
You must be signed in to change notification settings - Fork 37
Incantations
Here are some collected command lines and SQL snippets that have proven useful during development of Overview
Afrer running ./dev
to start the development server, run auto/psql.sh overview-dev
to get a PSQL prompt connected to the development database
./sbt "testOnly *DocumentSetFileControllerSpec"
This will work for the main server (aka overview-web). To do this for another project, e.g. worker, common, do
./sbt "; project common; testOnly *ContentDispositionSpec"
More.
Run the auto/clean-fully.sh
script. Cleans up "everything SBT ever did."
For an overview-local installation, make sure the server is running, then:
To check postgres:
docker exec -it overview-database du -h -d 1 /var/lib/postgresql/data
To check blob storage (raw uploads, and PDFs for display):
docker exec -it overview-worker du -h -d 1 /var/lib/overview/blob-storage/
To check elastic search:
docker exec -it overview-searchindex du -h -d 1 /usr/share/elasticsearch/data
Postgres should use substantially less disk than blob storage. If this is not the case, Postgres may need its large object table vacuumed. This is where uploads are stored, temporarily, before processing, and large uploads may fragment the table and make it unnecessarily large. To fix this,
docker exec -it overview-database psql -h localhost -U overview -c "vacuum full verbose pg_largeobject"
First, run ./sbt worker/stage
once.
Then something like: DATABASE_PORT=9010 DATABASE_NAME=overview-dev worker/target/universal/stage/bin/worker -main com.overviewdocs.clustering.DumpDocuments 1 > dump
integration-test/run spec/file_upload_spec.rb
Notes:
-
ls integration-test/spec
to see the full list of test files - It's always
spec/
. The path is relative torun
, not relative to the current working directory. - Use
run-browser
to watch the test occur. (You must havevncviewer
in yourPATH
.) - See overview-integration-tester docs for a more complete explanation
For overview-local, run dump-logs or tail-logs
For Overview on AWS, do overview-manage show-logs
then browse to http://localhost:9292
. To view document set conversion times click the "Load" icon in the upper-right, then choose "Conversion". You should see how many conversions we've had recently, and whether they've been quick, slow, very slow or ridiculously slow.
SELECT node.parent_id AS id, node.id AS child_id, node.description AS child_description
FROM node
WHERE parent_id IN (2,91,171,799)
ORDER BY node.id;
Notes:
- Replace
1,91,171,799
with the node IDs in the above query. - The
ORDER BY
keeps the tree ordering consistent, but it's not strictly necessary either, if the caller is going to sort the lists by other criteria.
SELECT
node_id,
document_count,
document_id
FROM (
SELECT
nd.node_id,
COUNT(nd.document_id) OVER (PARTITION BY nd.node_id) AS document_count,
nd.document_id,
RANK() OVER (PARTITION BY nd.node_id ORDER BY nd.document_id) AS pos
FROM node_document nd
WHERE nd.node_id IN (1,2,4,5,6,10,12,17,18,19,24,30,...)
ORDER BY nd.document_id
) ss
WHERE ss.pos < 11
node_id | document_count | document_ids
---------+----------------+-----------------------------------------
1 | 2200 | 1
1 | 2200 | 2
1 | 2200 | 3
1 | 2200 | 4
1 | 2200 | 5
1 | 2200 | 6
1 | 2200 | 7
1 | 2200 | 8
1 | 2200 | 9
1 | 2200 | 10
2 | 172 | 16
2 | 172 | 17
2 | 172 | 42
2 | 172 | 53
2 | 172 | 85
2 | 172 | 117
...
Notes:
- Replace
1,2,...
with the node IDs. - This iterates over all documents in all nodes. If the root node is included in the query (which is common), that means it iterates over all documents in the document set. It also sorts all document IDs from all nodes; we may want to change this to sort by document title, which would mean another table join and a more complex sort. We can avoid all these problems by pre-computing values and storing them in the
node
table.
SELECT
tag_id,
tag_name,
document_count,
document_id
FROM (
SELECT
t.id AS tag_id,
t.name AS tag_name,
COUNT(dt.document_id) OVER (PARTITION BY dt.tag_id) AS document_count,
dt.document_id,
RANK() OVER (PARTITION BY dt.tag_id ORDER BY dt.document_id) AS pos
FROM tag t
INNER JOIN document_tag dt ON t.id = dt.tag_id
WHERE t.document_set_id = {document_set_id}
ORDER BY t.name, dt.document_id
) ss
WHERE ss.pos < 11
Notes:
- When a Tag has no Documents, this query will return one row for it.
tag_id
andtag_name
will be what they should be;document_count
will be0
anddocument_id
will beNULL
.
SELECT id, title, text_url, view_url FROM document
WHERE document.id IN (SELECT document_id FROM node_document WHERE node_id IN ({nodeIdList}) AND
WHERE document.id IN (SELECT document_id FROM document_tag WHERE tag_id IN ({tagList}) AND
WHERE document.id IN ({documentIdList})
ORDER BY id
LIMIT {end - start} OFFSET {start}
Notes:
- If nodes, tags, or documents are not specified in selection, associated WHERE clause is removed.
- Removed sorting by title
- 0 <= start <= end
- start == 0 means first row is included
SELECT COUNT(*) FROM document
WHERE document.id IN (SELECT document_id FROM node_document WHERE node_id IN ({nodeIdList}) AND
WHERE document.id IN (SELECT document_id FROM document_tag WHERE tag_id IN ({tagList}) AND
WHERE document.id IN ({documentIdList})
SELECT nd.node_id, dt.tag_id, COUNT(*)
FROM node_document nd
INNER JOIN document_tag dt ON nd.document_id = dt.document_id
WHERE ...
Notes:
- This query will be slow with large document sets. Perhaps we should cache the results in a new node_tag table?
- If no documents in a node match a given tag, that tag will not show up in the results.
SELECT nd.node_id, COUNT(dt.tag_id)
FROM node_document nd
LEFT JOIN document_tag dt ON nd.document_id = dt.document_id AND dt.tag_id = {tagid}
WHERE nd.node_id IN (...)
GROUP BY nd.node_id
Notes:
- Returns 0 for all nodes which do not include the given tag
- Omits nodes which do not contain documents (which should never happen)
SELECT dt.document_id, dt.tag_id
FROM document_tag dt
INNER JOIN tag t ON dt.tag_id = t.id
WHERE dt.document_id IN ([subselect])
ORDER BY dt.document_id, t.name
Notes:
- the
tag
table is only included to give the properORDER
.
UPDATE "user" SET tree_tooltips_enabled = TRUE WHERE email = '[email protected]';