contrib eventdb: new to_json script #2552

Merged: 1 commit, Feb 12, 2025
14 changes: 12 additions & 2 deletions contrib/eventdb/README.md
@@ -10,8 +10,8 @@ EventDB Utilities
- Apply Malware Name Mapping: Applies the malware name mapping to the eventdb. Source and destination columns can be specified, as can a local mapping file. If no local file is present, the mapping can be downloaded on demand.
  It queries the database for all distinct malware names with the taxonomy "malicious-code" and sets another column to the malware family name.
- Apply Domain Suffix: Writes the public domain suffix to the `source.domain_suffix` / `destination.domain_suffix` columns, extracted from `source.fqdn` / `destination.fqdn`.
- PostgreSQL trigger keeping track of the oldest inserted/updated "time.source" data. This can be useful to (re-)generate statistics or aggregation data.
- SQL queries to set up a separate `raws` table, described in https://docs.intelmq.org/latest/admin/database/postgresql/#separating-raw-values-in-postgresql-using-view-and-trigger
- `trigger_oldest_time.source.sql`: PostgreSQL trigger keeping track of the oldest inserted/updated "time.source" data. This can be useful to (re-)generate statistics or aggregation data.
- `to_json.py`: Export EventDB data to JSON, to use it in IntelMQ again.

Usage
-----
@@ -22,6 +22,16 @@ See `--help` for more information:
```
apply_mapping_eventdb.py -h
apply_domain_suffix.py -h
to_json.py -h
```

The SQL script can be executed in the database directly.

### `to_json.py`


- Get an event by ID: `~intevation/to_json.py --id $id`
  - You can give multiple IDs
- Pretty printed: Add `--pretty`
- Inject the data into an IntelMQ bot (dry run):
  - `intelmqctl run $botid process --dry-run --show-sent --msg '$jsondata'`
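
Review note: the README says multiple IDs can be given, while the script as shown in this diff declares `--id` without `nargs` and compares with `id = %s`, so it accepts a single value. A minimal sketch of one way to accept several IDs follows. It assumes the `id` column is an integer type; `build_parser` and `build_query` are hypothetical helper names, not part of the PR.

```python
from argparse import ArgumentParser


def build_parser():
    """Hypothetical parser variant that accepts one or more event IDs."""
    parser = ArgumentParser(description='Extract data from the IntelMQ EventDB')
    # nargs='+' collects one or more values into a list;
    # type=int assumes the id column is an integer type.
    parser.add_argument('-i', '--id', type=int, nargs='+', required=True,
                        help='Get events by ID (one or more)')
    return parser


def build_query(ids):
    """Return the SQL text and parameter tuple for a list of IDs."""
    # psycopg2 adapts a Python list to a PostgreSQL array,
    # so "= ANY(%s)" matches every ID in the list.
    return 'SELECT * FROM events WHERE id = ANY(%s)', (ids,)


# Parse a sample command line instead of sys.argv for illustration.
args = build_parser().parse_args(['--id', '17', '42'])
query, params = build_query(args.id)
print(query, params)
```

The resulting `query` and `params` would be passed to `cur.execute(query, params)` in place of the single-ID statement.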
52 changes: 52 additions & 0 deletions contrib/eventdb/to_json.py
@@ -0,0 +1,52 @@
#!/usr/bin/python3

# SPDX-FileCopyrightText: 2025 Institute for Common Good Technology
#
# SPDX-License-Identifier: AGPL-3.0-or-later

from argparse import ArgumentParser
from datetime import datetime
import json
from sys import exit, stderr

from psycopg2 import connect
from psycopg2.extras import RealDictCursor

parser = ArgumentParser(
    prog='EventDB to JSON',
    description='Extract data from the IntelMQ EventDB')
parser.add_argument('-v', '--verbose', action='store_true')
parser.add_argument('-i', '--id', help='Get events by ID')
parser.add_argument('-p', '--pretty', action='store_true', help='Pretty print JSON output')
parser.add_argument('--dsn', help='A complete libpg conninfo string. If not given, it will be loaded from /etc/intelmq/eventdb-serve.conf')
args = parser.parse_args()

if args.dsn:
    conninfo = args.dsn
else:
    try:
        with open('/etc/intelmq/eventdb-serve.conf') as fody_config:
            conninfo = json.load(fody_config)['libpg conninfo']
    except FileNotFoundError as exc:
        print(f'Could not load database configuration. {exc}', file=stderr)
        exit(2)

if args.verbose:
    print(f'Using DSN {conninfo!r}.')
db = connect(dsn=conninfo)
cur = db.cursor(cursor_factory=RealDictCursor)
cur.execute('SELECT * FROM events WHERE id = %s', (args.id, ))

for row in cur.fetchall():
    del row['id']
    for key in list(row.keys()):
        if isinstance(row[key], datetime):
            # data from the database has TZ information already included
            row[key] = row[key].isoformat()
        elif row[key] is None:
            del row[key]
    if args.pretty:
        print(json.dumps(row, indent=2))
    else:
        print(json.dumps(row))
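
Review note: the row clean-up in the loop above can be exercised without a database. The sketch below factors the same three rules (drop `id`, drop `None` values, ISO-format datetimes) into a hypothetical `row_to_event` helper and runs it on a made-up row; the column values are illustrative assumptions, not data from the EventDB.

```python
from datetime import datetime, timezone
import json


def row_to_event(row):
    """Return a JSON-serialisable copy of a database row.

    Mirrors the loop in to_json.py: the database id is dropped,
    NULL columns are omitted and datetimes become ISO 8601 strings.
    """
    event = {}
    for key, value in row.items():
        if key == 'id' or value is None:
            continue
        if isinstance(value, datetime):
            # Timezone-aware datetimes keep their UTC offset in isoformat()
            value = value.isoformat()
        event[key] = value
    return event


# Made-up sample row; a RealDictCursor row behaves like a dict.
sample = {
    'id': 17,
    'time.source': datetime(2025, 2, 12, 10, 0, tzinfo=timezone.utc),
    'source.fqdn': 'example.com',
    'source.domain_suffix': None,
}
event = row_to_event(sample)
print(json.dumps(event, indent=2))
```

Unlike the in-place version, this variant builds a new dict, so the original row stays untouched and the function is easy to unit-test.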