Spike: Research alternatives to "allowed list" of users for production #1281

exalate-issue-sync · 2025-01-14T19:39:21Z

As a developer, I want to implement a means so that a user account that is negatively impacting site performance (such as one who creates an account and loads maliciously-crafted data that causes expensive queries to run) can be manually throttled or blocked.

As a developer, I want to replace the allow list with a more effective and scalable solution so that security extends beyond account creation and developers are not on the hook to keep an allow list updated.

Considerations:

The ability to block IP addresses through NGINX for the API and web app is already in place, but IP origin can be spoofed.
Allowed list functionality is already in place but is only checked on account creation.
Manually maintaining a list of people allowed to create accounts is not scalable.

The throttle/deny list should be checked on authentication and for each API call.

Automated blocking is not in scope for this story. The failure condition of automated blocking is blocking users who should not be blocked until they are unblocked; the failure condition of manual blocking is allowing a malicious user to degrade the site performance until such time as they are manually blocked. The latter is deemed more acceptable than the former.

This story covers researching possible solutions/means of implementation, not selecting and implementing one.

Task:

Enter a follow-up implementation ticket, tag it sprint 56

QA Notes

null

DEV Notes

null

Design

null

See full ticket and images here: FECFILE-1948

exalate-issue-sync · 2025-01-22T17:12:35Z

Dan Fowlkes commented: [NOTE: This comment was written prior to the update/clarification of the ticket description.]

It appears that cloud.gov already offers [certain protections|https://cloud.gov/docs/technology/platform-protections/], including rate limiting. The [AWS WAF|https://docs.aws.amazon.com/waf/latest/developerguide/waf-captcha-and-challenge-actions.html] imposes a rate limit and when a requester exceeds it the WAF responds with a [CHALLENGE action|https://docs.aws.amazon.com/waf/latest/APIReference/API_ChallengeAction.html] that must be answered before that and subsequent requests are allowed to proceed to the application. Additionally, AWS WAF mitigates the threat of malicious actors spoofing their IP origin to get around rate limiting by analyzing traffic and identifying proxies and suspicious IP addresses based on [their global threat intelligence|https://www.securityweek.com/inside-awss-crusade-against-ip-spoofing-and-ddos-attacks/]. That’s a great baseline from which to start.

The Django REST Framework also handles [throttling|https://www.django-rest-framework.org/api-guide/throttling/]. This, on its own, cannot provide reliable DDoS protection but could allow us to add an additional layer of protection against over-zealous users or automated security scanners run amok.
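For reference, DRF's built-in throttles are enabled through the {{REST_FRAMEWORK}} setting. The fragment below is a minimal sketch; the rates are placeholders chosen for illustration, not values proposed for this project.

```python
# settings.py fragment. The throttle classes are DRF built-ins; the rate
# values are illustrative assumptions, not recommendations.
REST_FRAMEWORK = {
    "DEFAULT_THROTTLE_CLASSES": [
        "rest_framework.throttling.AnonRateThrottle",  # keyed on client IP
        "rest_framework.throttling.UserRateThrottle",  # keyed on user id
    ],
    "DEFAULT_THROTTLE_RATES": {
        "anon": "20/minute",
        "user": "120/minute",
    },
}
```

These rates apply to every user of a given type, which is why they suit broad traffic shaping rather than singling out one account.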

We could also use something like [fail2ban|https://www.fail2ban.org/], which is a tool that monitors your log files looking for malicious traffic and then updates your firewall to block requesters for a set period (or even permanently). [Here’s an example write-up|https://rogs.me/2020/04/secure-your-django-api-from-ddos-attacks-with-nginx-and-fail2ban/] from someone using it with nginx and django. In looking through its configuration documentation, it appears that we could configure it to not just block a malicious IP but to relate that malicious IP to a user, by parsing the authentication log lines, and so [block future authentication attempts from that user|https://github.com/fail2ban/fail2ban/wiki/How-to-ban-something-other-as-host-(IP-address),-like-user-or-mail,-etc.] at the firewall level. Caveat: fail2ban is used by a lot of people running AWS EC2 instances who aren’t paying for AWS WAF; given that [cloud.gov|http://cloud.gov] is using the AWS WAF this might be unnecessary/overkill.

In a similar vein, [django-banish|https://github.com/yourabi/django-banish] is Django middleware that can be configured to automatically ban users that exceed a set rate limit. We could use or fork this if we want to be able to disable users who appear malicious at the Django-level rather than at the server firewall.

exalate-issue-sync · 2025-01-23T22:36:22Z

Dan Fowlkes commented: A blacklist (block/throttle list) is more scalable than a whitelist (allowed list) given a large number of users of which the vast majority are not bad actors.

One route is to leverage as much existing django functionality as possible.

Users can be blocked from authenticating by setting the [is_active|https://docs.djangoproject.com/en/5.1/ref/contrib/auth/#django.contrib.auth.models.User.is_active] flag (and ensuring that our authentication checks it).
Users can be selectively throttled by leveraging a [DRF custom throttle|https://www.django-rest-framework.org/api-guide/throttling/#custom-throttles], overriding allow_request with a function that checks the blacklist and returns False (as SimpleRateThrottle.throttle_failure() does) if the user is found there.
** DRF then responds with an HTTP 429 (“Too Many Requests”), plus a Retry-After header if the throttle's wait() method returns a value.
** Our use case is not preventing DoS-like traffic patterns but rather cases where a single query execution from a user is debilitating. While [DRF throttling|https://www.django-rest-framework.org/api-guide/throttling/] is normally used for the former and applied broadly to users by type, a more targeted and more restrictive use could address the latter.
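A targeted throttle of this kind could look roughly like the sketch below. The class name, the in-memory {{THROTTLED_USERNAMES}} set, and the one-hour retry value are all hypothetical. In the project the class would subclass {{rest_framework.throttling.BaseThrottle}}; DRF only calls {{allow_request()}} and {{wait()}}, so the base class is left out here to keep the sketch runnable without Django.

```python
from typing import Optional

# Hypothetical in-memory block list. A real one would more likely live in
# the database or cache so it can be edited without a deploy.
THROTTLED_USERNAMES = {"example-abusive-user"}


class BlocklistThrottle:
    """Deny every request from users on the block list.

    Assumption: in the project this subclasses
    rest_framework.throttling.BaseThrottle.
    """

    # Assumption: ask blocked clients to wait an hour before retrying.
    retry_after_seconds: Optional[int] = 3600

    def allow_request(self, request, view) -> bool:
        user = getattr(request, "user", None)
        username = getattr(user, "username", None)
        # Returning False makes DRF raise Throttled, i.e. an HTTP 429.
        return username not in THROTTLED_USERNAMES

    def wait(self) -> Optional[int]:
        # A non-None value is sent to the client as the Retry-After header.
        return self.retry_after_seconds
```

It could be applied globally through {{DEFAULT_THROTTLE_CLASSES}} or per view through {{throttle_classes}}, depending on whether every endpoint or only the expensive ones need the check.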

Alternatively, if we want greater control of the response code and headers we could roll our own API shim to validate all requests against a blacklist and respond in a custom way.

My assumption is that calling the API requires authenticating with a user, in which case I would expect that the is_active flag allows us to prevent authentication and therefore prevent returning responses to API calls. If our API does not use authentication, or uses custom authentication, then entirely custom blocking/throttling would have to be implemented.

Pivoting from the current whitelist to a blacklist will require the following actions (in approximate order of most to least important):

Implement a blacklist check on API calls.
Implement a blacklist check on app authentication.
Update the web app to handle throttled/blocked responses from API and communicate them clearly to the user.
Implement a means of maintaining the blacklist.
Remove the existing whitelist check on account creation.
Update API documentation (if anything beyond swagger?) to include new response options.

exalate-issue-sync · 2025-01-28T23:33:40Z

Dan Fowlkes commented: I have tested the first bullet in the previous comment: using the is_active flag. When the flag is set to false for a user:

API requests from an authenticated user return an HTTP 403 Forbidden
an authenticated user is logged out of the UI
the user is unable to authenticate

In the latter two cases, the user is redirected to the api/v1/oidc/authenticate endpoint, which displays a blank page.

h2. The Testing.

I {{docker exec}}’d into the fecfile-api container and used the django shell to retrieve the users, select one, confirm that I’d retrieved it, and then flip the is_active flag back and forth as follows:

{noformat}$ python manage.py shell

>>> from django.contrib.auth.models import User
>>> # List all usernames to find the account to test with
>>> for user in User.objects.all():
...     print(user.username)
...
>>> test_user = User.objects.get(username='')
>>> print(test_user.username)
>>> # Toggle the flag on the selected user, not on the loop variable
>>> test_user.is_active = not test_user.is_active
>>> test_user.save()
[repeat the last two lines to flip it back and forth as needed]{noformat}

Results:

I logged into the web app with an enabled user, then disabled that user. On next click, I was bounced to the authentication API and got no further.
I made an API call with an enabled user (successfully), then disabled that user. On next API call it returned an HTTP 403 Forbidden.
I attempted to log into the web app with a disabled user. I reached the authentication API and got no further.
I made an API call with a disabled user and it returned an HTTP 403 Forbidden.

exalate-issue-sync · 2025-01-28T23:44:20Z

Dan Fowlkes commented: h3. Implementation Notes.

We can break this into two pieces:

blocking: Flipping the is_active flag is simple and effective. We don’t have any sort of user management dashboard/console currently. Until such time as one is necessary/implemented, the ability to flip the flag could be encapsulated in a custom django command.
throttling: Throttling is achievable as described in an earlier comment. Because authentication itself goes through an API endpoint, throttling API calls could also prevent a user from authenticating too often, not just from calling APIs too often. To establish the priority of implementing the throttling piece, we should weigh the level of effort against the expected frequency of the use case outlined in that comment. Unless we expect to have a regular need to throttle a user’s API usage (as opposed to throttling excessive traffic at the WAF), this piece is likely low priority.

exalate-issue-sync · 2025-01-29T13:54:56Z

Todd Lees commented: Follow up created. Ready for QA

exalate-issue-sync · 2025-01-29T14:01:16Z

Shelly Wise commented: No QA review needed on this ticket.

Moved to Stage Ready.

exalate-issue-sync · 2025-02-03T18:09:50Z

Laura Beaufort commented: Follow up above in “linked issues” but here’s a link in the comments: [https://fecgov.atlassian.net/browse/FECFILE-1992|https://fecgov.atlassian.net/browse/FECFILE-1992|smart-link]

exalate-issue-sync bot assigned lbeaufort and danguyf Jan 14, 2025

exalate-issue-sync bot unassigned lbeaufort Jan 22, 2025

exalate-issue-sync bot assigned toddlees Jan 28, 2025
