
W0 - 232 websites main dev branch #283

Closed
wants to merge 14 commits into from
159 changes: 148 additions & 11 deletions docs/docs/concepts/findings.md
Expand Up @@ -14,15 +14,16 @@ To produce a finding, the job must create an object containing the necessary inf

The finding object must contain the `type` field. Here is a list of available types.

| Type | Description |
| ----------------------------------------- | -------------------------------------------------- |
| [HostnameFinding](#hostnamefinding) | Creates a new domain. |
| [IpFinding](#ipfinding) | Creates a new host. |
| [IpRangeFinding](#iprangefinding) | Creates a new IP range. |
| [HostnameIpFinding](#hostnameipfinding) | Creates a new host, attaches it to a given domain. |
| [PortFinding](#portfinding) | Creates a new port, attaches it to the given host. |
| [CustomFinding](#customfinding) | Attaches custom finding data to a given entity. |
| [PortServiceFinding](#portservicefinding) | Fills the `service` field of a port. |
| Type | Description |
| ----------------------------------------- | ------------------------------------------------------------ |
| [HostnameFinding](#hostnamefinding) | Creates a new domain. |
| [IpFinding](#ipfinding) | Creates a new host. |
| [IpRangeFinding](#iprangefinding) | Creates a new IP range. |
| [HostnameIpFinding](#hostnameipfinding) | Creates a new host, attaches it to a given domain. |
| [PortFinding](#portfinding) | Creates a new port, attaches it to the given host. |
| [WebsiteFinding](#websitefinding) | Creates a new website with its host, domain and port. |
| [CustomFinding](#customfinding) | Attaches custom finding data to a given entity. |
| [PortServiceFinding](#portservicefinding) | Fills the `service` field of a port. |

## HostnameFinding

Expand Down Expand Up @@ -186,7 +187,7 @@ Using the python SDK, you can emit this finding with the following code:
```python
from stalker_job_sdk import PortFinding, log_finding
port = 80
ip = "0.0.0.0"
ip = "1.2.3.4"
log_finding(
PortFinding(
"PortFinding",
Expand All @@ -200,6 +201,64 @@ log_finding(
)
```

## WebsiteFinding

The `WebsiteFinding` creates a website resource. A website is defined by four characteristics: an IP address, a domain name, a port number and a path. Only the IP address and the port are mandatory. The domain can be empty and the path defaults to `/`.

A website must reference an existing port of a project. If it also references a domain, that domain must already be known to Stalker.

The finding signals that an http(s) service has been found on a `tcp` port of the host specified through the `ip` value. The `ip` must already be known to Stalker as a valid host. A website finding creates a website and attaches it to the given port, host and domain.

> Emitting a `PortServiceFinding` with a `serviceName` of `http` or `https` will result in creating a `WebsiteFinding` per domain linked to the host, and one with an empty domain. [Learn more about PortServiceFinding and websites](#portservicefinding-and-websites)

| Field | Description |
| -------- | ------------------------------------------------------- |
| `ip` | The host's IP address |
| `port` | The port number |
| `domainName` | The domain on which the website is hosted, can be empty |
| `path` | The folder path, defaults to `/` |
| `ssl` | True if the website is protected by encryption |

Example:

```json
{
"type": "WebsiteFinding",
"key": "WebsiteFinding",
"ip": "1.2.3.4",
"port": 80,
"domainName": "example.com",
"path": "/",
"ssl": false
}
```

Using the python SDK, you can emit this finding with the following code:

```python
from stalker_job_sdk import WebsiteFinding, log_finding
port = 80
ip = "1.2.3.4"
domain = "example.com"
path = "/"
ssl = False

log_finding(
WebsiteFinding(
"WebsiteFinding",
ip,
port,
domain,
path,
ssl,
"New website",
[],
"WebsiteFinding",
)
)
```
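
As a rough illustration of the defaults described above, the following sketch builds the JSON payload by hand instead of going through the SDK. The `build_website_finding` helper is hypothetical and is not part of `stalker_job_sdk`; it only assumes the field names shown in the JSON example.

```python
import json


def build_website_finding(
    ip: str, port: int, ssl: bool, domain_name: str = "", path: str = "/"
) -> dict:
    """Hypothetical helper mirroring the JSON example; not the SDK's API."""
    return {
        "type": "WebsiteFinding",
        "key": "WebsiteFinding",
        "ip": ip,
        "port": port,
        # The domain may be empty, for a website reached by IP address only.
        "domainName": domain_name,
        # An empty path falls back to the documented default of "/".
        "path": path or "/",
        "ssl": ssl,
    }


if __name__ == "__main__":
    print(json.dumps(build_website_finding("1.2.3.4", 80, False)))
```

Only `ip` and `port` identify the target here; `ssl` is kept as an explicit argument because the text does not state a default for it.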

## CustomFinding

Dynamic findings allow jobs to attach custom data to core entities.
Expand Down Expand Up @@ -250,7 +309,7 @@ port = 80
ip = "0.0.0.0"
log_finding(
PortFinding(
"HttpServerCheck", ip, port, "tcp", "This port runs an HTTP server"
"PortFunFact", ip, port, "tcp", "This is a fun fact about a port"
)
)
```
Expand Down Expand Up @@ -343,3 +402,81 @@ log_finding(
```

Upon receiving this finding, the backend will set the service database field of the TCP port 22 for the `0.0.0.0` IP to `ssh`.

### PortServiceFinding and websites

When publishing a `PortServiceFinding` with the service name of `http` or `https`, the `Jobs Manager` will understand that a website is located on that port.

The `Jobs Manager` will therefore create and publish several `WebsiteFinding`s, one for each of the host's linked domain names, and one for the IP address alone.

These website findings will allow further investigation of the http(s) port with the different domain names, in case the port supports multiple virtual hosts.

For instance, imagine a host with the IP address `1.2.3.4`. This host has the linked domains `example.com` and `dev.example.com`.

Then, with the following code publishing the results for an https port:

```python
from stalker_job_sdk import PortFinding, log_finding, TextField

ip = '1.2.3.4'
port = 443
protocol = 'tcp'
service_name = 'https'

fields = [
TextField("serviceName", "Service name", service_name)
]

log_finding(
PortFinding(
"PortServiceFinding", ip, port, protocol, f"Found service {service_name}", fields
)
)
```

We would create the following three websites:

| domain | host | port | path |
| --------------- | ------- | ---- | ---- |
| N/A | 1.2.3.4 | 443 | `/` |
| example.com | 1.2.3.4 | 443 | `/` |
| dev.example.com | 1.2.3.4 | 443 | `/` |

That way, a website at `dev.example.com`, which may be different from the one at `example.com`, will be found. The same goes for the website reached through direct IP access.
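
The expansion described in this section can be sketched in a few lines. This is only an illustration of the rule (one website per linked domain, plus one for the IP address alone); `expand_websites` is a hypothetical name and does not reflect the actual `Jobs Manager` code.

```python
def expand_websites(ip: str, port: int, linked_domains: list[str]) -> list[dict]:
    """Hypothetical sketch of the expansion; not the Jobs Manager's real code."""
    # One website for direct IP access (empty domain), then one per linked domain.
    domains = [""] + list(linked_domains)
    return [
        {"domainName": domain, "ip": ip, "port": port, "path": "/"}
        for domain in domains
    ]


if __name__ == "__main__":
    for website in expand_websites("1.2.3.4", 443, ["example.com", "dev.example.com"]):
        print(website)
```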

## WebsitePathFinding

A `WebsitePathFinding` is a type of `CustomFinding` that fills a website's `paths` database field with the data of the `endpoint` text field. It will then be shown in the interface as part of the website's site map.

| Field | Description |
| -------- | ------------------------------------------------------------- |
| `domain` | The website's domain |
| `ip` | The website's ip |
| `port` | The website's port number |
| `path` | The website's path |
| `fields` | A list of [fields](#dynamic-fields). Must include `endpoint`. |

Using the python SDK, you can emit this finding with the following code.

```python
from stalker_job_sdk import WebsiteFinding, log_finding, TextField

ip = '1.2.3.4'
domain = 'example.com'
port = 443
path = '/'
ssl = True
endpoint = '/example/endpoint.html'

fields = [
    TextField("endpoint", "Endpoint", endpoint)
]

log_finding(
    WebsiteFinding(
        "WebsitePathFinding", ip, port, domain, path, ssl, "Website path", fields
    )
)
```

Upon receiving this finding, the backend will populate the proper website's path with the `endpoint` data.
29 changes: 27 additions & 2 deletions docs/docs/concepts/jobs.md
Expand Up @@ -25,7 +25,8 @@ subscription would need to be adapted to the new name.
| [DomainNameResolvingJob](#domainnameresolvingjob) | Resolves a domain name to an IP address |
| [TcpPortScanningJob](#tcpportscanningjob) | Scans the tcp ports of an IP address |
| [TcpIpRangeScanningJob](#tcpiprangescanningjob) | Scan an IP range for open ports |
| [BannerGrabbingJob](#httpservercheckjob) | Identifies the service running on a port |
| [BannerGrabbingJob](#bannergrabbingjob) | Identifies the service running on a port |
| [WebsiteCrawlingJob](#websitecrawlingjob) | Crawls a website for its valid endpoints |

### DomainNameResolvingJob

Expand Down Expand Up @@ -90,10 +91,34 @@ Identifies the service running on a port and grabs the banner. It may occasional
| ------------- | -------- | -------------------------------------------------- |
| targetIp | string | The IP address to check |
| ports | number[] | The ports to check for a service |
| nmapOptions | | A long string containing the options given to nmap |
| nmapOptions | string | A long string containing the options given to nmap |

**Possible findings generated :**

- PortServiceFinding
- HostnameIpFinding
- OperatingSystemFinding

### WebsiteCrawlingJob

Crawls a website for its different valid endpoints. It can also find website technology information.

**Input variables:**

| Variable Name | Type | Value description |
| -------------------- | ------ | --------------------------------------------- |
| targetIp | string | The website's IP address |
| port | number | The website's port |
| domainName | string | The website's domain name |
| path | string | The website's base path |
| ssl | bool | Whether the website uses https |
| maxDepth | number | The depth to crawl |
| crawlDurationSeconds | number | The max amount of time to crawl in seconds |
| fetcherConcurrency | number | The number of concurrent fetchers to get data |
| inputParallelism | number | The number of concurrent input processors |
| extraOptions | string | Katana extra options to adapt execution |

**Possible findings generated :**

- WebsitePathFinding
- WebsiteTechnologyFinding
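
The input variables listed above could be read and typed along these lines. This is a sketch under assumptions: it supposes the variables arrive as string key/value pairs (such as environment variables), that booleans are encoded as `true`/`false`, and that an empty path falls back to `/`. The `parse_crawl_args` function is hypothetical and is not the job's actual code.

```python
from typing import Mapping


def parse_crawl_args(env: Mapping[str, str]) -> dict:
    """Hypothetical parser for the WebsiteCrawlingJob input variables."""
    return {
        "targetIp": env["targetIp"],
        "port": int(env["port"]),
        # The domain may be empty for a website reached by IP address only.
        "domainName": env.get("domainName", ""),
        # Assumption: an empty or missing path falls back to "/".
        "path": env.get("path") or "/",
        # Assumption: booleans are passed as the strings "true" or "false".
        "ssl": env.get("ssl", "false").strip().lower() == "true",
        "maxDepth": int(env["maxDepth"]),
        "crawlDurationSeconds": int(env["crawlDurationSeconds"]),
        "fetcherConcurrency": int(env["fetcherConcurrency"]),
        "inputParallelism": int(env["inputParallelism"]),
        "extraOptions": env.get("extraOptions", ""),
    }
```

To try it against real environment variables, one could call `parse_crawl_args(os.environ)` after importing `os`.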
31 changes: 31 additions & 0 deletions docs/docs/concepts/resources.md
Expand Up @@ -58,6 +58,33 @@ A port can be created with a `PortFinding`. A port finding is a combination of a

The combination of a port's number and host identifier is unique in the database.

### Websites

A website represents a `tcp` port running an http(s) server.

A website is the combination of a port, a host, a domain and a path. The path, when not specified, defaults to `/`. The domain can also be empty, as not all websites have domains that resolve to them.

A website is usually created for each http(s) port for each domain linked to a host.

Therefore, if a host runs two http(s) ports with two domains, a total of 6 websites on the `/` path are possible. Let's take the domains `dev.example.com` and `example.com`, the IP `1.2.3.4` and the ports `80` and `443` for the `/` path.

The following 6 values are possible:

| domain | host | port | path |
| --------------- | ------- | ---- | ---- |
| | 1.2.3.4 | 80 | / |
| example.com | 1.2.3.4 | 80 | / |
| dev.example.com | 1.2.3.4 | 80 | / |
| | 1.2.3.4 | 443 | / |
| example.com | 1.2.3.4 | 443 | / |
| dev.example.com | 1.2.3.4 | 443 | / |

Websites are used, combined with a port and a *host*'s IP address, to represent an http(s) service. Every website is linked to a port and a *host*.

A website can be created with a `WebsiteFinding`. A website finding is a combination of an IP, a port, a domain and a path. Therefore, when a website is created, it is automatically linked to the given port, host and domain.

The combination of a website's port identifier, domain identifier and path is unique in the database.
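
The uniqueness rule can be illustrated with a small in-memory sketch. It is not the database schema: the `(ip, port)` pair stands in for the port identifier and the domain string stands in for the domain identifier, and both `website_key` and `register_website` are hypothetical names.

```python
from itertools import product


def website_key(ip: str, port: int, domain: str, path: str = "/") -> tuple:
    """Stand-in for the unique (port identifier, domain identifier, path) key."""
    return (ip, port, domain, path)


def register_website(known: set, ip: str, port: int, domain: str, path: str = "/") -> bool:
    """Adds the website if its key is new; returns False for a duplicate."""
    key = website_key(ip, port, domain, path)
    if key in known:
        return False
    known.add(key)
    return True


if __name__ == "__main__":
    known: set = set()
    # Two ports and three domain values (including the empty domain).
    for port, domain in product([80, 443], ["", "example.com", "dev.example.com"]):
        register_website(known, "1.2.3.4", port, domain)
    print(len(known))
```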

## Interacting with resources

### Tagging a resource
Expand All @@ -81,3 +108,7 @@ While deleted resources are removed from the database, blocked resources will st
Blocked resources can be seen in the user interface by removing the default filter `-is: blocked`. Every resource will be shown that way, blocked or not. If you wish to only see the blocked resources, use the `is: blocked` filter.

Blocking a resource is useful if, through automation, Stalker found a resource that does not belong in the project. Deleting it would likely result in it reappearing later and jobs being run on it. Blocking it will ensure that jobs are not automatically run on the resource by remembering its existence.

### Merging websites

> This feature is not yet available, but is coming soon.
8 changes: 2 additions & 6 deletions docs/docs/concepts/subscriptions.md
Expand Up @@ -215,16 +215,12 @@ An event subscription can contain these main elements :
- `operator` : The operator to compare the two operands.
- `rhs` : The right-hand side operand.





#### Event Subscription Dynamic Input

You can add dynamic input to an event subscription either by referencing a finding's fields, or by injecting a secret.

You can reference a Finding's output variable by name in a Job parameter's value or in a condition's operand using the following syntax:
`${parameterName}`. The variable name is case insensitive.
You can reference a Finding's output variable by name in a Job parameter's value or in a condition's operand using the following syntax:
`${parameterName}`. The variable name is case insensitive.

In a finding, you can find [dynamic fields](/docs/concepts/findings#dynamic-fields) in the `fields` array. The text based dynamic fields' values can be injected in the same way as a regular field, with the `${parameterName}` syntax. Simply reference the `key` part of a dynamic field as the variable name, and its `data` will be injected.
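
To make the `${parameterName}` syntax concrete, here is a minimal sketch of case-insensitive injection. It is an illustration only: the `inject` function is hypothetical, the finding is modelled as a plain dictionary whose dynamic fields carry `key` and `data`, and leaving unknown variables untouched is an assumption, not documented behaviour.

```python
import re

# Matches ${name} and captures the variable name.
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def inject(template: str, finding: dict) -> str:
    """Hypothetical sketch of ${parameterName} injection from a finding."""
    # Regular fields, looked up case-insensitively.
    lookup = {k.lower(): v for k, v in finding.items() if k != "fields"}
    # Dynamic fields: the `key` is the variable name, the `data` is the value.
    for field in finding.get("fields", []):
        if "key" in field and "data" in field:
            lookup[field["key"].lower()] = field["data"]

    def replace(match: re.Match) -> str:
        name = match.group(1).strip().lower()
        # Assumption: unknown variables are left as written.
        return str(lookup[name]) if name in lookup else match.group(0)

    return _VARIABLE.sub(replace, template)
```

For example, `inject("${IP}:${port}", {"ip": "1.2.3.4", "port": 443})` resolves both variables regardless of the casing used in the template.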

Expand Down
43 changes: 31 additions & 12 deletions jobs/job-base-images/python/Dockerfile
@@ -1,20 +1,39 @@
FROM python:3.11.0-slim-bullseye
FROM python:3.12.4-slim-bullseye AS base

RUN python -m pip install httpx[http2]
RUN python -m pip install "poetry==1.3.2"

COPY stalker_job_sdk /usr/src/stalker_job_sdk

RUN python -m pip install -e /usr/src/stalker_job_sdk \
&& apt-get update \
&& apt-get install -y nmap git make gcc libpcap0.8 \
&& mkdir -p /tools/masscan/ \
&& git clone https://github.com/robertdavidgraham/masscan /tools/masscan/ \
&& cd /tools/masscan/ \
&& make -j \
&& make install \
&& apt-get remove -y git make gcc \
&& apt-get autoremove -y \
&& rm -rf /tools/
RUN python -m pip install -e /usr/src/stalker_job_sdk
RUN apt-get update && apt-get install -y nmap libpcap0.8 wget gnupg libc6

FROM base AS build
RUN apt-get install -y git make gcc zip curl

# Masscan
RUN mkdir -p /tools/masscan/
RUN git clone https://github.com/robertdavidgraham/masscan /tools/masscan/
WORKDIR /tools/masscan
RUN make -j && make install

# Katana
FROM golang:1.22.4-bullseye AS katana

RUN apt-get update && apt-get install -y git gcc musl-dev
RUN mkdir -p /tools/katana/
RUN git clone https://github.com/projectdiscovery/katana.git /tools/katana
WORKDIR /tools/katana
RUN go mod download
RUN go build ./cmd/katana

FROM base AS final

COPY --from=build /usr/bin/masscan /usr/bin/masscan
COPY --from=katana /tools/katana/katana /usr/bin/katana

RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& sh -c 'echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
&& apt update && apt install -y google-chrome-stable

WORKDIR /usr/src/stalker-job
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,26 @@ def __init__(
        self.port = port
        self.protocol = protocol

class WebsiteFinding(Finding):
    def __init__(
        self,
        key: str,
        ip: str,
        port: int,
        domainName: str,
        path: str,
        ssl: bool = None,
        name: str = None,
        fields: list[Field] = [],
        type: str = "CustomFinding",
    ):
        super().__init__(key, type, name, fields)
        self.ip = ip
        self.port = port
        self.domainName = domainName
        self.protocol = 'tcp'
        self.path = path
        self.ssl = ssl

class DomainFinding(Finding):
    def __init__(
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,4 @@
import os
import random

import httpx
from stalker_job_sdk import (JobStatus, PortFinding, TextField, is_valid_ip,
Expand All @@ -9,10 +8,10 @@

def get_args():
"""Gets the arguments from environment variables"""
target_ip: str = os.environ["targetIp"]
port = int(os.environ["port"])
domain = os.environ["domainName"] # DOMAIN should resolve to TARGET_IP
path = os.environ["path"] # Http server file path to GET
target_ip: str = os.environ.get("targetIp")
port = int(os.environ.get("port"))
domain = os.environ.get("domainName") # DOMAIN should resolve to TARGET_IP
path = os.environ.get("path") # Http server file path to GET

if not path or len(path) == 0:
path = '/'
Expand Down