Skip to content

Commit

Permalink
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Start moving taggings to jsonb
Browse files Browse the repository at this point in the history
peter-hank committed Jun 2, 2022
1 parent adde8f3 commit f85b11e
Showing 5 changed files with 72 additions and 1 deletion.
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,9 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/). It uses [CalVer](https://calver.org/) as of May 2019.

## [22.06](https://github.com/berkmancenter/lumendatabase/releases/tag/2022.06) - 2022-06-02
### Changed
* Started moving exisiting taggings to new `jsonb` fields in the `notice` table.

## [22.05c](https://github.com/berkmancenter/lumendatabase/releases/tag/2022.05c) - 2022-05-31
### Changed
* Added a db constraint to avoid inserting wrong works json data.
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
class AddJsonFieldsForTaggingsToNotices < ActiveRecord::Migration[6.1]
def change
add_column :notices, :tags_json, :jsonb
add_column :notices, :jurisdictions_json, :jsonb
add_column :notices, :regulations_json, :jsonb
end
end
5 changes: 4 additions & 1 deletion db/schema.rb
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
#
# It's strongly recommended that you check this file into your version control system.

ActiveRecord::Schema.define(version: 2022_05_23_184945) do
ActiveRecord::Schema.define(version: 2022_06_01_165646) do

# These are extensions that must be enabled in order to support this database
enable_extension "plpgsql"
@@ -405,6 +405,9 @@
t.text "local_jurisdiction_laws"
t.jsonb "works_json", null: false
t.integer "case_id_number"
t.jsonb "tags_json"
t.jsonb "jurisdictions_json"
t.jsonb "regulations_json"
t.index ["created_at"], name: "index_notices_on_created_at"
t.index ["original_notice_id"], name: "index_notices_on_original_notice_id"
t.index ["published"], name: "index_notices_on_published"
57 changes: 57 additions & 0 deletions script/copy_taggings_to_notices_as_jsonb.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
#!/bin/sh

set -e

beg=0
inc=1000
max=20000000

while getopts b:i:m: flag
do
case "${flag}" in
b) beg=${OPTARG};;
i) inc=${OPTARG};;
m) max=${OPTARG};;
esac
done

for i in $(seq $beg $inc $max); do
j=$(expr $i + $inc)

cat << EOM
PROCESSING $i UNTIL $j ($(expr 100 '*' $i / $max)%)
EOM

psql -v ON_ERROR_STOP=1 << EOM
SET enable_hashjoin = false;
SET enable_mergejoin = false;
WITH
notice_tags AS (
SELECT
tg.name AS tag_name,
tgg.context AS context,
tgg.taggable_id
FROM taggings tgg
JOIN tags tg ON tg.id = tgg.tag_id
WHERE tgg.taggable_id >= $i AND tgg.taggable_id < $j
),
f AS (
SELECT
notice_tags.taggable_id AS notice_id,
jsonb_agg(tag_name) FILTER (WHERE context = 'tags' AND tag_name IS NOT NULL) AS tags,
jsonb_agg(tag_name) FILTER (WHERE context = 'jurisdictions' AND tag_name IS NOT NULL) AS jurisdictions,
jsonb_agg(tag_name) FILTER (WHERE context = 'regulations' AND tag_name IS NOT NULL) AS regulations
FROM notice_tags
GROUP BY notice_tags.taggable_id
)
UPDATE notices n
SET tags_json = COALESCE((SELECT f.tags FROM f WHERE f.notice_id = n.id), '[]'),
jurisdictions_json = COALESCE((SELECT f.jurisdictions FROM f WHERE f.notice_id = n.id), '[]'),
regulations_json = COALESCE((SELECT f.regulations FROM f WHERE f.notice_id = n.id), '[]')
WHERE n.id >= $i AND n.id < $j;
VACUUM notices;
EOM

done
File renamed without changes.

0 comments on commit f85b11e

Please sign in to comment.