Fix auth documentation, and a frontend regex (#423)
msm-cert authored Oct 16, 2024
1 parent 33cd906 commit 0861024
Showing 7 changed files with 83 additions and 75 deletions.
3 changes: 2 additions & 1 deletion docs/README.md
@@ -49,6 +49,7 @@ Relevant for people who want to run mquery in production or on a bigger scale.
- [On-disk format](./ondiskformat.md): Read if you want to understand ursadb's on
disk format (spoiler: many files are just JSON and can be inspected with vim).
- [Plugin system](./plugins.md): For filtering, processing and tagging files.
- [Database format](./redis.md): Information about the data stored in redis.
- [Database format](./database.md): Information about the data stored in the database.
- [Redis applications](./redis.md): Of historical interest, redis is used only for [rq](https://python-rq.org/) now.
- [User management](./users.md): Control and manage access to your mquery instance.
- [API](./api.md): Mquery exposes a simple API that you may use for your automation.
54 changes: 54 additions & 0 deletions docs/database.md
@@ -0,0 +1,54 @@
# How the data is stored in the database

Currently, a Postgres database is used to store the entities used by mquery.

With the default docker configuration, you can connect to the database
using the following one-liner:

```
sudo docker compose exec postgres psql -U postgres --dbname mquery
```

The following tables are defined:

### Job table (`job`)

Jobs are stored in the `job` table.

Every job has an ID, which is a random 12-character string like 2OV8UP4DUOWK (the
same string that is visible in URLs like http://mquery.net/query/2OV8UP4DUOWK).

Possible job statuses are:

* "new" - Completely new job.
* "inprogress" - Job that is in progress.
* "done" - Job that was finished.
* "cancelled" - Job that was cancelled by the user or failed.
* "removed" - Job is hidden in the UI (TODO: remove this status in the future)
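
The job ID format and the status set described above can be sketched in Python as follows. This is only an illustration, not mquery's code: the alphabet (uppercase ASCII letters and digits) is an assumption inferred from the example ID, and `JobStatus`, `make_job_id`, and `looks_like_job_id` are hypothetical names.

```python
import secrets
import string
from enum import Enum

# Assumption: IDs use uppercase ASCII letters and digits, as in "2OV8UP4DUOWK".
JOB_ID_ALPHABET = string.ascii_uppercase + string.digits
JOB_ID_LENGTH = 12


class JobStatus(str, Enum):
    """The job statuses listed in the documentation above."""

    NEW = "new"
    INPROGRESS = "inprogress"
    DONE = "done"
    CANCELLED = "cancelled"
    REMOVED = "removed"


def make_job_id() -> str:
    """Return a random 12-character job ID (hypothetical helper)."""
    return "".join(secrets.choice(JOB_ID_ALPHABET) for _ in range(JOB_ID_LENGTH))


def looks_like_job_id(value: str) -> bool:
    """Check the length and alphabet of a candidate job ID."""
    return len(value) == JOB_ID_LENGTH and all(c in JOB_ID_ALPHABET for c in value)
```

A check like `looks_like_job_id` can be handy when validating IDs taken from URLs before using them in a database query.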

### Job agent table (`jobagent`)

This is a simple mapping between `job_id` and `agent_id`. Additionally, it keeps track
of how many tasks are still in progress for a given agent assigned to this job.

### Match table (`match`)

Matches represent files matched to a job.

Every match represents a single yara rule match (along with optional attributes
from plugins).

### AgentGroup table (`agentgroup`)

When scheduling jobs, mquery needs to know how many agent groups are
waiting for tasks. In most cases there is only one, but in a distributed environment
there may be more.

### Configuration table (`configentry`)

Represented by the `models.configentry.ConfigEntry` class.

For example, `plugin:TestPlugin` will store configuration for `TestPlugin` as a
dictionary. All plugins can expose their own arbitrary config options.

As a special case, `plugin:Mquery` keeps the configuration of mquery itself.
71 changes: 5 additions & 66 deletions docs/redis.md
@@ -1,77 +1,16 @@
# How the data is stored in redis

In older mquery versions, data used to be stored in Redis. In mquery
version 1.4.0 the data was migrated to PostgreSQL - see [database](./database.md).

Please note that all this is 100% internal, and shouldn't be relied on.
Data format in redis can and does change between mquery releases.

Right now mquery is in the process of migrating internal storage to Postgres.

### Why redis?

Because very early daemon was a trivial piece of code, and Redis as a job
queue was the easiest solution. Since then mquery got extended with (in
no particular order) batching, users, jobs, commands, task cancellations,
distributed agents, configuration, and more.

I have thus learned the hard way that Redis is not a good database.

Nevertheless, that ship has sailed. There are no plans of migrating mquery
to another database. What we can do is to document the current data format.

### Redis quickstart

To connect to redis use `redis-cli`. For docker compose use
`docker compose exec redis redis-cli`.
You can use `redis-cli` to connect to redis. With the default docker compose configuration,
use `docker compose exec redis redis-cli`.

Redis command documentation is pretty good and available at https://redis.io/commands/.

### Job table (`job`)

Jobs are stored in the `job` table.

Every job has ID, which is a random 12 character string like 2OV8UP4DUOWK (the
same string that is visible in urls like http://mquery.net/query/2OV8UP4DUOWK).

Possible job statuses are:

* "new" - Completely new job.
* "inprogress" - Job that is in progress.
* "done" - Job that was finished
* "cancelled" - Job was cancelled by the user or failed
* "removed" - Job is hidden in the UI (TODO: remove this status in the future)

### Match table (`match`)

Matches represent files matched to a job.

Every match represents a single yara rule match (along with optional attributes
from plugins).

### Agentjob objects (`agentjob:*`)

Agentjob is a simple String (but only used as an integer).

In distributed environment it's sometimes hard to say when exactly agent's job
is finished. To work around this, each agent keeps a number of pending tasks
using agentjob key. For example, for job `123456123456` and agent `default`, redis key
`agentjob:default:123456123456` will contain the number of pending tasks.

This only matters during the task execution and can be discarded after task is done.

### AgentGroup table (`agentgroup`)

When scheduling jobs, mquery needs to know how many agent groups are
waiting for tasks. In most cases there is only one, but in distributed environment
there may be more.

### Configuration table (`configentry`)

Represented by models.configentry.ConfigEntry class.

For example, `plugin:TestPlugin` will store configuration for `TestPlugin` as a
dictionary. All plugins can expose their own arbitrary config options.

As a special case `plugin:Mquery` keeps configuration of the mquery itself.

### Rq objects (`rq:*`)

Objects used internally by https://python-rq.org/, task scheduler used by mquery.
23 changes: 18 additions & 5 deletions docs/users.md
@@ -9,7 +9,7 @@ Optional user management in mquery is role-based, and handled by OIDC.

## Role-based permissions

There are two predefined permission sets that can be assigned to users:
There are three predefined permission sets that can be assigned to users:

- `admin`: has access to everything, including management features.
Can change the service configuration, manage datasets, etc.
@@ -18,6 +18,8 @@ There are two predefined permission sets that can be assigned to users:
create new search jobs, see and cancel every job, and download
matched files. In current version, users can see and browse
all jobs in the system.
- `nobody`: empty role that gives no access to anything. Useful
for anonymous users.

Role names are considered stable, and will continue to work in the future.
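
How roles imply one another can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the actual `expand_role` function from `src/app.py`: the implication map is deliberately shortened (the real one lists further permissions), and `ROLE_IMPLICATIONS` and `expand_roles` are hypothetical names.

```python
from typing import Dict, List, Set

# Hypothetical, shortened implication map. Only the role names come from the
# documentation; the real mquery map contains additional permissions.
ROLE_IMPLICATIONS: Dict[str, List[str]] = {
    "nobody": [],
    "admin": ["user"],
}


def expand_roles(roles: List[str]) -> Set[str]:
    """Return the given roles plus everything they imply, transitively."""
    expanded: Set[str] = set()
    pending = list(roles)
    while pending:
        role = pending.pop()
        if role in expanded:
            continue  # already handled, avoids loops in the map
        expanded.add(role)
        pending.extend(ROLE_IMPLICATIONS.get(role, []))
    return expanded
```

In this sketch `nobody` expands only to itself, which matches its description as an empty role that grants no access.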

@@ -35,6 +37,17 @@ may change in some future new version.
(**Note**: in the current version there is no isolation between users, and
users can view/stop/delete each other's queries. This may change in the future.)

## OIDC quickstart

In the `/config` section, set:
* `auth_default_roles` to "nobody" or "user" (this is the role for anonymous users)
* `openid_client_id`, `openid_url`, `openid_secret` as required for your OIDC server
(the secret should be an RS256 key)
* `auth_enabled` to "true" (do this last, to avoid locking yourself out).

If something goes wrong, you need to manually fix the config in the database
(to disable auth: `delete from configentry where key='auth_enabled'`).

## OIDC integration

Mquery doesn't implement user management. Instead, this is delegated
Expand Down Expand Up @@ -97,8 +110,8 @@ as necessary for your deployment.
**Warning** the proces is tricky, and right now it's missing a proper validation.
It's possible to lock yourself out (by enabling auth before configuring it
correctly). If you do this, you have to disable auth manually, by running
`redis-cli` (`sudo docker compose exec redis redis-cli` for docker) and
executing `HMSET plugin:Mquery auth_enabled ""`.
`psql` (`sudo docker compose exec postgres psql -U postgres --dbname mquery` for docker) and
executing `delete from configentry where key='auth_enabled';`.

**Step 0 (optional): enable auth in non-enforcing mode**

@@ -146,8 +159,8 @@ Get it from `http://localhost:8080/auth/admin/master/console/#/realms/myrealm/ke

**Step 3: enable auth in enforcing mode**

- Go to the `config` page in mquery. Ensure `auth_default_roles` is
an empty string.
- Go to the `config` page in mquery. Change `auth_default_roles` to "user" or "nobody", depending on your needs.
- **Don't leave `auth_default_roles` empty**: for compatibility reasons, an empty value gives admin permissions to every user.
- Set `auth_enabled` to `true`
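
The empty-value pitfall from the steps above can be illustrated with a small Python sketch. It is not mquery's parsing code: the fallback to `["admin"]` simply mirrors the compatibility behaviour described in this step, and `KNOWN_ROLES` and `parse_default_roles` are hypothetical names.

```python
from typing import List

KNOWN_ROLES = {"admin", "user", "nobody"}


def parse_default_roles(raw: str) -> List[str]:
    """Parse a comma-separated `auth_default_roles` value.

    Assumption: an empty value falls back to ["admin"], mirroring the
    compatibility behaviour described in the documentation.
    """
    if raw.strip() == "":
        return ["admin"]
    roles = [part.strip() for part in raw.split(",")]
    unknown = [role for role in roles if role not in KNOWN_ROLES]
    if unknown:
        raise ValueError(f"unknown roles: {unknown}")
    return roles
```

Setting the value explicitly to "nobody" or "user" avoids the fallback branch entirely.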

Final result:
3 changes: 2 additions & 1 deletion src/app.py
@@ -188,7 +188,8 @@ def expand_role(role: str) -> List[str]:
"""Some roles imply other roles or permissions. For example, admin role
also gives permissions for all user permissions.
"""
role_implications = {
role_implications: Dict = {
"nobody": [],
"admin": [
"user",
"can_list_all_queries",
2 changes: 1 addition & 1 deletion src/db.py
@@ -339,7 +339,7 @@ def get_core_config(self) -> Dict[str, str]:
return {
# Authentication-related config
"auth_enabled": "Enable and force authentication for all users ('true' or 'false')",
"auth_default_roles": "Comma separated list of roles available to everyone (available roles: admin, user)",
"auth_default_roles": "Roles assigned to everyone - including anonymous users (available roles: admin, user, nobody)",
# OpenID Authentication config
"openid_url": "OpenID Connect base url",
"openid_client_id": "OpenID client ID",
2 changes: 1 addition & 1 deletion src/mqueryfront/src/config/ConfigEntries.js
@@ -4,7 +4,7 @@ import api from "../api";

const R_BOOL = /^(|true|false)$/;
const R_URL = /^(https?:\/\/.*)$/;
const R_ROLES = /^((admin|user)(,(admin|user))*)?$/;
const R_ROLES = /^((admin|user|nobody)(,(admin|user|nobody))*)?$/;

const KNOWN_RULES = {
openid_url: R_URL,
Expand Down
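
To see which values the updated `R_ROLES` pattern accepts, here is a small Python sketch that uses the same alternation. It is only an illustration: the frontend uses the JavaScript regex shown in the diff, and `is_valid_roles_value` is a hypothetical helper.

```python
import re

# Same alternation as the updated R_ROLES in ConfigEntries.js. The ^ and $
# anchors are dropped because fullmatch() already requires a whole-string match.
R_ROLES = re.compile(r"((admin|user|nobody)(,(admin|user|nobody))*)?")


def is_valid_roles_value(value: str) -> bool:
    """Return True if value is empty or a comma-separated list of known roles."""
    return R_ROLES.fullmatch(value) is not None
```

Note that the pattern still accepts the empty string and does not allow spaces after commas.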
