Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Feeds API. Closes #97, #98 and #429. #419

Open
wants to merge 24 commits into
base: develop
Choose a base branch
from

Conversation

regulartim
Copy link
Collaborator

@regulartim regulartim commented Jan 3, 2025

Description

This PR introduces a new API endpoint for feeds and adds some data to the existing ones. Changes in detail:

  1. There is a new API endpoint for feeds (/api/feeds/v2) that provides highly configurable and more detailed feeds. This endpoint is only accessible for authenticated users.
  2. The old API endpoints for feeds now provide some more information about IOCs:
    1. the interaction_count (see Add interaction count to the IOC model #405)
    2. the reputation of an IP according to listbot (see Filter IP addresses from known scanners #97)
    3. the ASN of an IP
    4. the number of attempted logins from an IP
    5. the number of destination ports that were attacked by an IP
  3. The retrieval of valid feed types is now a separate function.

Related issues

Type of change

  • Bug fix (non-breaking change which fixes an issue).
  • New feature (non-breaking change which adds functionality).

Checklist

  • I have read and understood the rules about how to Contribute to this project.
  • The pull request is for the branch develop.
  • I have added documentation of the new features.
  • Linters (Black, Flake, Isort) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.
  • I have added tests for the feature/bug I solved. All the tests (new and old ones) gave 0 errors.
  • If changes were made to an existing model/serializer/view, the docs were updated and regenerated (check CONTRIBUTE.md).
  • If the GUI has been modified:
    • I have a provided a screenshot of the result in the PR.
    • I have created new frontend tests for the new component or updated existing ones.

Important Rules

  • If you miss to compile the Checklist properly, your PR won't be reviewed by the maintainers.
  • If your changes decrease the overall tests coverage (you will know after the Codecov CI job is done), you should add the required tests to fix the problem
  • Everytime you make changes to the PR and you think the work is done, you should explicitly ask for a review. After being reviewed and received a "change request", you should explicitly ask for a review again once you have made the requested changes.

@regulartim
Copy link
Collaborator Author

@mlodic : One thing I forgot: As the new API is only accessible by authenticated users, should we add the option to exclude mass scanners to the old APIs?

@mlodic
Copy link
Member

mlodic commented Jan 7, 2025

As the new API is only accessible by authenticated users, should we add the option to exclude mass scanners to the old APIs?

for "mass scanners" you mean legit scanners like Censys, right? if so, yes

@regulartim regulartim marked this pull request as ready for review January 7, 2025 12:35
@regulartim regulartim requested a review from mlodic January 7, 2025 12:35
Comment on lines 59 to 69
@cache
def ordering_validation(ordering: str) -> str:
if not ordering:
raise serializers.ValidationError("Invalid ordering: <empty string>")
if ordering[0] == "-":
ordering = ordering[1:]
try:
IOC._meta.get_field(ordering)
except FieldDoesNotExist as exc:
raise serializers.ValidationError(f"Invalid ordering: {ordering}") from exc
return ordering
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not completely sure to understand what happens here. You remove the - if exists and then return the field value without it. But isn't it needed to properly order the query afterwards?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, this is indeed confusing. I will make the function clearer.

@@ -54,25 +56,54 @@ def feed_type_validation(feed_type: str, valid_feed_types: frozenset) -> str:
return feed_type


class FeedsSerializer(serializers.Serializer):
@cache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this useful? I think I am missing the point of this function. Could you clarify it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @cache decorator? I assume that the function will very often be called with the same ordering argument, e.g. -last_seen which is the default ordering. If the @cache decorator is used and the function has been called at least once with a given argument, the result of the function will be obtained by a dictionary lookup instead of executing the function. This should be much quicker.

Comment on lines 36 to 93
class FeedRequestParams:
"""A class to handle and validate feed request parameters.
It processes and stores query parameters for feed requests,
providing default values.

Attributes:
feed_type (str): Type of feed to retrieve (default: "all")
attack_type (str): Type of attack to filter (default: "all")
max_age (str): Maximum number of days since last occurrence (default: "3")
min_days_seen (str): Minimum number of days on which an IOC must have been seen (default: "1")
include_reputation (list): List of reputation values to include (default: [])
exclude_reputation (list): List of reputation values to exclude (default: [])
feed_size (int): Number of items to return in feed (default: "5000")
ordering (str): Field to order results by (default: "-last_seen")
verbose (str): Whether to include IOC properties that contain a lot of data (default: "false")
paginate (str): Whether to paginate results (default: "false")
format (str): Response format type (default: "json")
"""

def __init__(self, query_params: dict):
"""Initialize a new FeedRequestParams instance.

Parameters:
query_params (dict): Dictionary containing query parameters for feed configuration.
"""
self.feed_type = query_params.get("feed_type", "all").lower()
self.attack_type = query_params.get("attack_type", "all").lower()
self.max_age = query_params.get("max_age", "3")
self.min_days_seen = query_params.get("min_days_seen", "1")
self.include_reputation = query_params["include_reputation"].split(";") if "include_reputation" in query_params else []
self.exclude_reputation = query_params["exclude_reputation"].split(";") if "exclude_reputation" in query_params else []
self.feed_size = query_params.get("feed_size", "5000")
self.ordering = query_params.get("ordering", "-last_seen").lower().replace("value", "name")
self.verbose = query_params.get("verbose", "false").lower()
self.paginate = query_params.get("paginate", "false").lower()
self.format = query_params.get("format_", "json").lower()

def exclude_mass_scanners(self):
self.exclude_reputation.append("mass scanner")

def set_legacy_age(self, age: str):
"""Translates legacy age specification into max_age and min_days_seen attributes
and sets ordering accordingly.

Parameters:
age (str): Age of the data to filter (recent or persistent).
"""
match age:
case "recent":
self.max_age = "3"
self.min_days_seen = "1"
if "feed_type" in self.ordering:
self.ordering = "-last_seen"
case "persistent":
self.max_age = "14"
self.min_days_seen: "10"
if "feed_type" in self.ordering:
self.ordering = "-attack_count"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at this closely, this could be all converted inside the Django Rest Framework but don't worry, I won't ask you to change it 🤣 . The important thing is that you don't find any particular perfomance issues, as I said today. I'll check that in the Honeynet production servers too

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok :D If there's the option to do this with a built-in class from Django Rest Framework I would like to use it. Would you show me where to find it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I would like to change that in a separate PR.

@regulartim regulartim requested a review from mlodic January 7, 2025 21:50
- remove logging the number of returned IOCs in get_queryset() as this forces early QuerySet evaluation
- only pass filter for number_of_days_seen if it is > 1
- build list of general honeypots, that have seen an IOC, directly on the database
- rewrite feeds_response() function
@regulartim regulartim changed the title New Feeds API. Closes #97 and #98. New Feeds API. Closes #97, #98 and #429. Jan 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants