"load from url" fails with Granicus Website #1813

Open
vevetron opened this issue Sep 9, 2024 · 14 comments
Labels
bug Something isn't working (crash, a rule has a problem)

Comments

@vevetron

vevetron commented Sep 9, 2024

Describe the bug

Attempting to validate the Calabasas feed returns "Error Processing Report".

I don't see anything in the inspection that suggests an error.

My guess is that Granicus blocks requests from cloud servers: we enter the URL, the MobilityData server requests the file, the request is blocked, and we get an error.

Validation works if you download the file and upload it directly.

Steps/Code to Reproduce

Go here:
https://gtfs-validator.mobilitydata.org/

Enter this URL under "Load from a URL": Calabasas

Expected Results

The validator should process the GTFS feed.

Actual Results

"Error Processing Result"

Screenshots

No response

Files used

No response

Validator version

Can't tell - 9/9/2024 version

Operating system

Windows - Chrome

Java version

No response

Additional notes

No response

@vevetron vevetron added bug Something isn't working (crash, a rule has a problem) status: Needs triage Applied to all new issues labels Sep 9, 2024
@github-project-automation github-project-automation bot moved this to Requires investigation in Bug triage Sep 9, 2024

welcome bot commented Sep 9, 2024

Thanks for opening your first issue in this project! If you haven't already, you can join our slack and join the #gtfs-validators channel to meet our awesome community. Come say hi 👋!

Welcome to the community and thank you for your engagement in open source! 🎉

@emmambd
Contributor

emmambd commented Sep 9, 2024

Hi @vevetron - thanks for flagging this! We tested this and saw this notice from the Granicus website in our logs:

Access Denied
You don't have permission to access "http://www.cityofcalabasas.com/home/showpublisheddocument/31620/638611519891730000" on this server.
Reference #18.9369dc17.1725907184.2f7ad1eb
https://errors.edgesuite.net/18.9369dc17.1725907184.2f7ad1eb

We suspect this may be because our user agent is blocked by the website. The user agent we provide is shared here.

We'd suggest troubleshooting this on the Granicus website to verify if this is the issue.

Let us know if there's anything else we can do to support with this problem.
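
The Access Denied notice above suggests the download is rejected before any validation runs. Below is a minimal sketch of how a URL loader could turn such a response into a clearer message than a generic processing error. The function name, the message wording, and the assumption that the block arrives as HTTP 401/403 or with an "Access Denied" body are illustrative only, not the validator's actual code.

```python
def describe_download_failure(status_code: int, body: str = "") -> str:
    """Map a failed HTTP response from a feed host to a user-facing message.

    Hypothetical helper for illustration; not part of the GTFS validator.
    """
    # Assumption: a host-side block shows up as 401/403 or an "Access Denied" page.
    if status_code in (401, 403) or "Access Denied" in body:
        return ("The feed host refused the request (it may block automated "
                "clients). Download the file and upload it directly.")
    if status_code == 404:
        return "The feed URL was not found (404). Check that the link is current."
    if status_code >= 500:
        return "The feed host returned a server error. Try again later."
    return f"Unexpected response from the feed host (HTTP {status_code})."


if __name__ == "__main__":
    print(describe_download_failure(403, "Access Denied"))
```

Separating "the host refused us" from "the feed is invalid" would tell users directly that uploading the file is the workaround.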

@vevetron
Author

We saw something similar when making requests from Google Cloud. Options:

  • Ask Granicus to whitelist the MobilityData IPs, if they are stable.

  • Ask Granicus for a custom user-agent code that gets through their firewall.

  • Granicus might require the transit agency to confirm that MobilityData should not be blocked, which can be difficult to obtain.

@qcdyx qcdyx self-assigned this Oct 21, 2024
@qcdyx qcdyx removed the status: Needs triage Applied to all new issues label Oct 21, 2024
@qcdyx
Contributor

qcdyx commented Oct 22, 2024

I tested the URL, and it works in a browser, but the curl command fails to download the ZIP file because the required headers, including User-Agent and sec-ch-ua, are missing. @emmambd We can do better error handling as part of the solution to this bug and ask the engagement team to contact Granicus.

curl 'https://www.cityofcalabasas.com/home/showpublisheddocument/31620/638611519891730000' \
  -H 'accept-language: en-US,en;q=0.9' \
  -H 'priority: u=0, i' \
  -H 'sec-ch-ua: "Chromium";v="130", "Google Chrome";v="130", "Not?A_Brand";v="99"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "macOS"' \
  -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36' \
  -O
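
For anyone reproducing this outside a shell, here is a rough Python equivalent that sends the same browser-like headers as the curl command above, using only the standard library. The helper names `build_request` and `download` are made up for this sketch. Whether the host accepts the request from a given network has not been verified here.

```python
import urllib.request

FEED_URL = ("https://www.cityofcalabasas.com/home/showpublisheddocument/"
            "31620/638611519891730000")

# Headers copied from the curl command in this comment.
BROWSER_HEADERS = {
    "accept-language": "en-US,en;q=0.9",
    "priority": "u=0, i",
    "sec-ch-ua": '"Chromium";v="130", "Google Chrome";v="130", "Not?A_Brand";v="99"',
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": '"macOS"',
    "user-agent": ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/130.0.0.0 Safari/537.36"),
}


def build_request(url: str, headers: dict = BROWSER_HEADERS) -> urllib.request.Request:
    """Create a request object carrying the given headers (no network access)."""
    return urllib.request.Request(url, headers=headers)


def download(url: str, dest: str) -> None:
    """Fetch the URL with the browser-like headers and write the body to dest."""
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        with open(dest, "wb") as out:
            out.write(resp.read())


if __name__ == "__main__":
    download(FEED_URL, "calabasas-gtfs.zip")
```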

@qcdyx
Contributor

qcdyx commented Oct 23, 2024

Hi @vevetron, could you please reach out to Granicus and ask them to whitelist MobilityData's user-agent header for GTFS validation? The header to whitelist is: user-agent: MobilityData GTFS-Validator/6.0.0 (Java 17.0.6). Please make sure it matches the correct GTFS validator version (you can check the version in the validation report).
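
Because the user-agent string embeds both the validator version and the Java version, an exact-match rule would need updating on every release, whereas a prefix match would not. A small sketch of the difference follows. The format is inferred from the single example string above, and both function names are hypothetical.

```python
# Prefix shared by every version of the string, inferred from the example above.
UA_PREFIX = "MobilityData GTFS-Validator/"


def validator_user_agent(validator_version: str, java_version: str) -> str:
    """Build a user-agent string in the format shown in this comment."""
    return f"{UA_PREFIX}{validator_version} (Java {java_version})"


def is_validator_user_agent(user_agent: str) -> bool:
    """Prefix match: stays valid when the validator or Java version changes."""
    return user_agent.startswith(UA_PREFIX)


if __name__ == "__main__":
    for version in ("5.0.1", "6.0.0"):
        ua = validator_user_agent(version, "17.0.6")
        print(ua, is_validator_user_agent(ua))
```

A prefix rule is also easy to spoof, so a firewall operator may still prefer a different mechanism.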

@vevetron
Author

I sent a first email to Granicus and cc'd @qcdyx on it. We can try. Y'all would probably need to set up a custom header that we keep secret. Also, the header would change each time Java or the validator's version changes.

I had emailed the Calabasas website team previously and they never got back to me, so we might want to find an easier target. I'm guessing all the Granicus websites will block MobilityData.
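
To make the "custom header that we keep secret" idea concrete, here is a minimal sketch of attaching a secret value read from the environment, so it never appears in source code. The environment variable name, the header name, and the helper function are all hypothetical. Nothing in this thread specifies what Granicus would actually require.

```python
import os
import urllib.request

# Hypothetical names, for illustration only.
TOKEN_ENV_VAR = "FEED_HOST_ACCESS_TOKEN"
TOKEN_HEADER = "X-Feed-Access-Token"


def build_authorized_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a request that carries the secret header when a token is configured."""
    headers = {"User-Agent": user_agent}
    token = os.environ.get(TOKEN_ENV_VAR)
    if token:
        # The secret comes from the environment, not from the repository.
        headers[TOKEN_HEADER] = token
    return urllib.request.Request(url, headers=headers)
```

Unlike the user-agent string, a header like this would not change with each validator or Java release.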

@qcdyx
Contributor

qcdyx commented Oct 23, 2024

Hey @vevetron, thanks for the update! Please also note that the current version is 5.0.1, which can be found in the validation report at https://gtfs-validator.mobilitydata.org/.
Please also cc @emmambd and @davidgamez on your emails for visibility. I agree that Granicus websites may be blocking MobilityData. I'll explore this further with the team and get back to you.

@vevetron
Author

I wonder what the 6.0.0 means.

Here are some of the CAS agencies we had trouble with:
City of Tracy | Granicus
City of West Hollywood | Granicus
City of Torrance | Likely Granicus
City of Glendale | Granicus
City of Lompoc | Granicus
City of Glendora | Granicus

City of Inglewood | Civic Plus

I tested Glendora through MobilityData validator and it also failed, so I'm guessing the rest will as well.

@qcdyx
Contributor

qcdyx commented Oct 24, 2024

Hey @vevetron Thanks for pointing that out! The "6.0.0" is actually a placeholder for the version we're currently working on for the GTFS Validator's next release. The current public version is 5.0.1, as I mentioned. I included 6.0.0 in the whitelist request to future-proof it for when the new release goes live. For now, we can proceed with the request using 5.0.1, and once 6.0.0 is released, we can update it if needed.

I tested the City of Tracy's URL (http://data.trilliumtransit.com/gtfs/tracy-ca-us/tracy-ca-us.zip) from the MobilityDatabase (https://mobilitydatabase.org/feeds/mdb-877), and it works.

The City of Glendora URL (https://raw.githubusercontent.com/LACMTA/los-angeles-regional-gtfs/main/glendora-ca-us/glendora-ca-us.zip) listed on the MobilityDatabase (https://mobilitydatabase.org/feeds/mdb-609) returns a 404 when I try it in a browser.

Please continue testing the other cities using the URLs on the MobilityDatabase (https://mobilitydatabase.org/).

@vevetron
Author

Looks like Tracy hosts their GTFS in two places; this is the new one we have been using, which fails without a firewall exception.

@vevetron
Author

Question: does MobilityData use a stable IP address when downloading GTFS feeds?

  1. A user goes to https://gtfs-validator.mobilitydata.org/ and adds a URL under "Load from a URL".
  2. MobilityData servers fetch that GTFS feed from, say, https://www.cityoftracy.org/home/showpublisheddocument/16626/638342536313270000
    --- Is MobilityData's server IP static, or does it change with each request?

If the IP address is static, it would be easier to get a firewall passthrough approved by Granicus than to get the user agent whitelisted. (For CAL-ITP, our IPs in this case are ephemeral.)

@vevetron
Author

David says the servers run in the cloud and don't have stable IP addresses.

@davidgamez
Member

Hi @vevetron, yes, unfortunately we don't have a static IP that producers can rely on. As a follow-up, we will work on the different branches of this issue.

We are looking at having this implemented by the next release.

@vevetron
Author

Hi!

Can someone from your team test the new Granicus auth keys from your cloud server? Refer to the emails for the code.

@qcdyx qcdyx assigned qcdyx and unassigned qcdyx Jan 6, 2025
Projects
Status: Requires investigation

4 participants