Add jurisdiction layer to helper #1651
Conversation
select 1
from geo.atd_jurisdictions
where JURISDICTION_LABEL = 'AUSTIN FULL PURPOSE' AND ST_Intersects(geometry, new.position)
);
here's the updated code. we're filtering on the jurisdiction label rather than the arbitrary ID sequence
👋 goodbye magic number - thank you!
jurisdictions: {
  service_name: "BOUNDARIES_jurisdictions_planning",
  layer_id: 0,
  query_params: { ...DEFAULT_ESRI_QUERY_PARAMS, where: "JURISDICTION_LABEL = 'AUSTIN FULL PURPOSE'" },
we are filtering out all features that are not AFP. there are thousands of features in this layer, and a single jurisdiction label can have multiple rows. E.g. there are currently two AUSTIN FULL PURPOSE features which, when combined, represent the entire AFP.
thanks for pointing this out because I was curious about the 2 rows in the DB. Makes sense! 🙏
toolbox/load_agol_layer/README.md
before this change, our geospatial docs were a lie. The atd_jurisdiction layer was not an "ArcGIS Online authoritative layer owned by CTM GIS". It was a custom layer. Now it is true.
I just want to point out that as the author of those docs, you have all the authority one could have to call out that they were in error, but for all other readers, I want to push back on calling these a lie. That feels a little too strong and doesn't capture the nuance here.
While this work brings them fully into compliance as written in the README, they most certainly were previously derived from the CTM data and used in good faith. This data is much better in its provenance, but they were far from fabricated or intentionally misleading previously.
thank you, Frank. "lie" was a bit harsh, but i was surprised to find that I hadn't correctly documented the provenance of that layer. glad it is square now 😊
Just to confirm -- you do want this into main and not the collector branch? That sounds right to me, but I wanted to double check that you're not maybe wanting to rebase this.
The testing worked out great, and I think this is ship-shape for the big moving parts. I do have a pair of observations that I'd like to put forward enough to request changes, but I am happy to switch this to a ✅ after some consideration of them.
Of my observations, I'm most interested in not using the deprecated atd_ acronym, and also in letting the query engine short-circuit that st_intersects() where possible via the usage of &&.
When testing the SQL backfill commands, I get these counts of updated rows, in order of the queries presented:
Updated Rows 29
Updated Rows 86
Updated Rows 9
Updated Rows 18
Updated Rows 29
Updated Rows 36
@@ -0,0 +1,143 @@
alter table geo.atd_jurisdictions add column jurisdictions_id integer;
Is there any way at all we can drop the atd_ out of this table name? coa_ is appropriate here, or my true preference: just jurisdictions.
atd is no longer, and while tpw is marginally better, these jurisdictions belong to the larger city. A similar thing happened with the new VZ S3 bucket name, and it'd be great to avoid this usage here because these names are harder to change the longer they remain in place.
thank you for nudging me on this! i was a bit worried about blowing up the scope of this, but you're totally right: might as well get this cleaned up once and for all now. i checked with Milad and Xavier and they are ok with this change. i'll push it up!
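For reference, a minimal sketch of what that rename could look like in a migration, assuming the table lives in the geo schema as shown in the diffs (the actual migration John pushes may differ):
-- hedged sketch: drop the atd_ prefix from the table name
alter table geo.atd_jurisdictions rename to jurisdictions;
-- the matching down migration would reverse it:
-- alter table geo.jurisdictions rename to atd_jurisdictions;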
new.in_austin_full_purpose = EXISTS (
    select 1
    from geo.atd_jurisdictions
    where JURISDICTION_LABEL = 'AUSTIN FULL PURPOSE' AND ST_Intersects(geometry, new.position)
I think you can get some meaningful speed-up on this where clause by doing:
where
JURISDICTION_LABEL = 'AUSTIN FULL PURPOSE'
AND geometry && new.position
AND ST_Intersects(geometry, new.position)
&& is the "has intersecting bounding boxes" operator, and it's a zillion times faster to see if a point and a box overlap than a point and an arbitrary polygon. By doing this blazing fast operation first, you can skip over checking the st_intersects() on all the pairs of geometries that don't have overlapping bounding boxes, knowing that they couldn't intersect no matter what.
Also PG's GIST indexing on geometry is aimed at exactly this -- it stores the bounding boxes of geometries as its index data, so && can get really lightning fast.
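As an aside, here is a minimal sketch of the kind of GIST index being described, assuming the geometry column is named geometry as in the diffs (whether this exact index already exists in the schema isn't shown in this PR):
-- hedged sketch: a spatial index that stores bounding boxes, which is what && can take advantage of
create index if not exists atd_jurisdictions_geometry_idx
    on geo.atd_jurisdictions
    using gist (geometry);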
Thanks @frankhereford for suggesting this. I had it in my mind that the GIST index would ensure that a bbox is used to speed up the spatial operations automatically, and that does seem to track from the docs.
Most of the commonly used functions in PostGIS (ST_Contains, ST_Intersects, ST_DWithin, etc) include an index filter automatically. But some functions (e.g., ST_Relate) do not include an index filter.
but i've also been reading that it's a best practice to be explicit about the use of bboxes, so there should be no harm in adding the && for good measure. that seems reasonable! weirdly, though, when i test my update commands with and without the bbox directly, the operation with && is consistently slower.
here's one example. curious if you're seeing this.
With bbox
-- Execution Time: 1474.878 ms
explain analyze UPDATE crashes
SET
in_austin_full_purpose = FALSE,
updated_by = 'dts_automation'
WHERE
crashes.in_austin_full_purpose IS TRUE
AND crashes.position IS NOT NULL
AND NOT EXISTS (
SELECT 1
FROM
geo.atd_jurisdictions
WHERE
atd_jurisdictions.geometry && crashes.position
AND atd_jurisdictions.jurisdiction_label = 'AUSTIN FULL PURPOSE'
AND ST_INTERSECTS(atd_jurisdictions.geometry, crashes.position)
);
Without bbox
-- Execution Time: 1069.798 ms
explain analyze UPDATE crashes
SET
in_austin_full_purpose = FALSE,
updated_by = 'dts_automation'
WHERE
crashes.in_austin_full_purpose IS TRUE
AND crashes.position IS NOT NULL
AND NOT EXISTS (
SELECT 1
FROM
geo.atd_jurisdictions
WHERE
atd_jurisdictions.jurisdiction_label = 'AUSTIN FULL PURPOSE'
AND ST_INTERSECTS(atd_jurisdictions.geometry, crashes.position)
);
weirdly, the explain analyze output for both of these statements doesn't mention an index scan at all. i also wonder if my local machine's resources are influencing the query planner and if this may look completely different on staging. i'll test that here momentarily...
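One hedged way to check whether the planner is even willing to use the spatial index here is to discourage sequential scans for the session and re-run the explain; the literal point below is made up for illustration, and SRID 4326 for the position column is an assumption:
-- hedged sketch: discourage seq scans, then look for an index scan in the plan
set enable_seqscan = off;
explain analyze
select 1
from geo.atd_jurisdictions
where jurisdiction_label = 'AUSTIN FULL PURPOSE'
    and ST_Intersects(geometry, ST_SetSRID(ST_MakePoint(-97.74, 30.27), 4326));
reset enable_seqscan;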
i ran those two statements on staging and the results are consistent: removing the && check results in better performance by nearly 50% 😵
John - What a twist! Running those two queries on my machine comes up with a less dramatic difference in timing (765 ms vs 515 ms), but the extra clause I'm suggesting really does seem to be the difference that is slowing it down. I'm fully confused about why it's doing it this way, but the numbers don't lie -- thank you for testing this out and providing such a clear-cut set of data. 🙏
really truly a twist! i'd love to spend some more time understanding what's going on here and to re-test this scenario against a table with more rows, like atd_txdot_locations...
I got the same updated row counts as Frank did when running the backfill commands ✅
I really appreciate the comments in the migration; they help me understand what is going on.
HALP! I'm stuck on Step 3 in the testing steps. What directory should that script be run from? And why are we using a local.env file when all our other .env files follow the pattern of .env.name_of_environment instead of flipped around as your copy/paste implies.
Perhaps we need to add to the testing instructions that a .env.local (or, if you insist, local.env) should be created from /toolbox/load_agol_layer/env_template instead of assuming it's there. But this is all just a guess on my part. I've tried a few variations after creating an env file from the template and this is the type of error I see:
(venv) (base) ➜ load_agol_layer git:(john/19441-jurisdiction-layer) ✗ node --env-file=local.env src/load_agol_layer.js --layer jurisdictions
node: bad option: --env-file=local.env
I could also use some guidance on what AGOL credentials should be added to that .env file. For now, I'm just guessing AGOL VZ Publisher is the correct one to use.
UPDATE:
With @chiaberry's help, I was able to figure out how to get the load_agol_layer.js script running with the node script in the testing instructions. Here are the extra things I had to do that were hard for me to know without more explicit instructions.
- Make a local.env file from the env_template file. (Still confused why it's named this way.)
- Use the "AGOL Scripts Publisher" credentials in 1pw.
- Change to the directory two levels up from the script: cd toolbox/load_agol_layer
- Run nvm use in order to be able to use the --env-file option
- Run npm install
- Run node --env-file=local.env src/load_agol_layer.js --layer jurisdictions
After the fact, I realized there is a README.md that could've helped me understand what was going wrong. myb
This worked as described in the testing steps (once I asked for help/learned how to read 🫠)
I ran through the test steps and see the in- and out-of-AFP flags calculating as expected when using the editor. I also think nixing the atd_ prefix in the table name would be nice like Frank mentioned (probably because everything else is so nice and tidy in here ✨).
Cool to see yet another AGOL layer covered with this tooling! 😎 🙌
Thank you John for picking off that atd_ prefix and also for digging into my misbelief that the bounding box operator would speed things up! Can't wait to see this go live!
this all looks great john! i reviewed these update statements and want to make sure you also update the afd_incidents that are now AFP as well, if that's necessary, cuz i didn't see that command there. i agree on dropping atd_; i don't see that update pushed up yet, but i think that's a great change to clean things up! 🧹 ✨
folks, i requested review because the table name change impacted quite a lot of code, including dropping a dormant function from the pre-LDM days 🙏
John - this looks good! I have run the DB up through the migrations and metadata presented in this PR as well as gone over the code. This is on top of my previous review, so I am feeling really great about this.
Thank you
Thanks for taking on the entirely out-of-scope request to rename that field and drop the deprecated department name. That was no small thing, and I am really grateful to you. 🙏
@@ -12,4 +12,3 @@
retries: 1
use_prepared_statements: true
tables: "!include default/tables/tables.yaml"
functions: "!include default/functions/functions.yaml"
Interesting to see here that this is the removal of the very last function. I'll be curious how long we stay in that function-less state, and if it's a long time, that's pretty great.
retested, it looks great!! 🚢
I tested this with the latest updates, and I remembered to run the SQL statements this time around too. Everything looks and works great! 🙌
headed to
Associated issues
This PR brings the jurisdiction layer into our normal geo layer maintenance process. It replaces the old atd_jurisdictions layer that was created through an ad-hoc process that merged all features by jurisdiction label. Going forward, the table will resemble the authoritative CTM layer, where there may be multiple rows (aka features) for a single jurisdiction type. Also, whereas the old DB table had rows for the various ETJs and surrounding jurisdictions, I elected to exclude these since we don't currently make any use of them.
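A quick, hedged way to see that multiple-rows-per-label shape, assuming the column names from the diffs (and noting the table may end up renamed per the review discussion):
-- hedged sketch: count features per jurisdiction label in the new table
select jurisdiction_label, count(*) as feature_count
from geo.atd_jurisdictions
group by jurisdiction_label
order by feature_count desc;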
Before
After
Testing
hasura migrate apply and hasura metadata apply
Start the ole VZE. Move some crashes into and out of the AFP jurisdiction. Moving a crash out of AFP will cause the alert banner to appear above the crash map.
The VZ team pointed out that 20271128 should not be assigned to AFP. You can test that by moving it 1 pixel and observing that in_austin_full_purpose changes to false.
I have prepared these sql commands to backfill the various AFP flags that we have for crashes, EMS, and AFD records. If you would please review and run each of those and confirm that they look good 🙏
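For context, a hedged sketch of the general shape those backfill commands take, mirroring the crashes example quoted earlier in the review thread (the real commands also cover EMS and AFD records and may differ in detail):
-- hedged sketch: flag crashes that now fall inside AFP
-- (the earlier example clears the flag for crashes that fall outside)
UPDATE crashes
SET
    in_austin_full_purpose = TRUE,
    updated_by = 'dts_automation'
WHERE
    crashes.in_austin_full_purpose IS NOT TRUE
    AND crashes.position IS NOT NULL
    AND EXISTS (
        SELECT 1
        FROM
            geo.atd_jurisdictions
        WHERE
            atd_jurisdictions.jurisdiction_label = 'AUSTIN FULL PURPOSE'
            AND ST_INTERSECTS(atd_jurisdictions.geometry, crashes.position)
    );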
Ship list
main branch