![chronos](./images/chronos.png)
# Chronos

Wayback Machine OSINT Framework

Chronos (previously known as WaybackUnifier) extracts pieces of data from a web page's history. It can be used to create custom wordlists, search for secrets, find old endpoints, etc.

---

- [Installation](#installation)
- [Example Usage](#example-usage)
  - [Extract endpoints and URLs from archived JavaScript code](#extract-endpoints-and-urls-from-archived-javascript-code)
  - [Calculate archived favicon hashes](#calculate-archived-favicon-hashes)
  - [Extract archived page titles](#extract-archived-page-titles)
  - [Extract paths from archived robots.txt files](#extract-paths-from-archived-robotstxt-files)
  - [Extract URLs from archived sitemap.xml files](#extract-urls-from-archived-sitemapxml-files)
  - [Extract endpoints from archived API documentation](#extract-endpoints-from-archived-api-documentation)
  - [Find S3 buckets in archived pages](#find-s3-buckets-in-archived-pages)
- [Modules](#modules)
- [Command-line Options](#command-line-options)

## Installation
### From binary
Download a prebuilt binary from the [releases page](https://github.com/mhmdiaa/chronos/releases/latest).

### From source
Use `go install` to download and install the latest version
```
go install github.com/mhmdiaa/chronos@latest
```
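
If `$(go env GOPATH)/bin` is on your `PATH` (an assumption about your Go setup), you can verify the installation by listing the available modules:
```
# lists the available modules
chronos -list-modules
```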

---


## Example Usage
### Extract endpoints and URLs from archived JavaScript code
```
chronos -target "denali-static.grammarly.com/*" -module jsluice -output js_endpoints.json
```
[![asciicast](https://asciinema.org/a/lm8hSxIMWYk8f3wolSeNYJewO.svg)](https://asciinema.org/a/lm8hSxIMWYk8f3wolSeNYJewO)
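
When the target is a broad wildcard rather than a single script URL, the MIME filters described under [Command-line Options](#command-line-options) can narrow processing to JavaScript responses. A hypothetical variation (the target is a placeholder):
```
# placeholder target; restricts processing to JavaScript MIME types
chronos -target "example.com/*" -module jsluice -match-mime "application/javascript,text/javascript" -output js_endpoints.json
```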

### Calculate archived favicon hashes
```
chronos -target "netflix.com/favicon.ico" -module favicon -output favicon_hashes.json
```
[![asciicast](https://asciinema.org/a/sNKQA7XXnAFmOSKYUQyoph1vJ.svg)](https://asciinema.org/a/sNKQA7XXnAFmOSKYUQyoph1vJ)

### Extract archived page titles
```
chronos -target "github.com" -module html -module-config "html.title=//title" -snapshot-interval y -output titles.json
```
[![asciicast](https://asciinema.org/a/avNyaQoPN8WQ2vZkyrql7aYf0.svg)](https://asciinema.org/a/avNyaQoPN8WQ2vZkyrql7aYf0)

### Extract paths from archived robots.txt files
```
chronos -target "tripadvisor.com/robots.txt" -module regex -module-config 'regex.paths=/[^\s]+' -output robots_paths.json
```
[![asciicast](https://asciinema.org/a/zXw1XvhyNOIKPd41HWjWQvoIx.svg)](https://asciinema.org/a/zXw1XvhyNOIKPd41HWjWQvoIx)

### Extract URLs from archived sitemap.xml files
```
chronos -target "apple.com/sitemap.xml" -module xml -module-config "xml.urls=//urlset/url/loc" -limit -5 -output sitemap_urls.json
```
[![asciicast](https://asciinema.org/a/tJAWMuDx6z8G0pQqRCZR3WJiU.svg)](https://asciinema.org/a/tJAWMuDx6z8G0pQqRCZR3WJiU)

### Extract endpoints from archived API documentation
```
chronos -target "https://docs.gitlab.com/ee/api/api_resources.html" -module html -module-config 'html.endpoint=//code' -output api_docs_endpoints.json
```
[![asciicast](https://asciinema.org/a/5yrrAnt46CHJqlhja4T48ym8u.svg)](https://asciinema.org/a/5yrrAnt46CHJqlhja4T48ym8u)

### Find S3 buckets in archived pages
```
chronos -target "github.com" -module regex -module-config 'regex.s3=[a-zA-Z0-9-\.\_]+\.s3(?:-[-a-zA-Z0-9]+)?\.amazonaws\.com' -limit -snapshot-interval y -output s3_buckets.json
```
[![asciicast](https://asciinema.org/a/HKma8ycDMgHO6RPThjBXrOlrp.svg)](https://asciinema.org/a/HKma8ycDMgHO6RPThjBXrOlrp)

## Modules
| Module | Description |
|-------------|---------------------------------------------------------------|
| regex | Extract regex matches |
| jsluice | Extract URLs and endpoints from JavaScript code using jsluice |
| html | Query HTML documents using XPath expressions |
| xml | Query XML documents using XPath expressions |
| favicon | Calculate favicon hashes |
| full | Get the full content of snapshots |
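
Modules can be combined in a single run. The sketch below assumes the comma-separated `-module` list and repeated `-module-config` flags documented under [Command-line Options](#command-line-options), with a hypothetical `regex.emails` key and a placeholder target:
```
# hypothetical config keys and placeholder target/output names
chronos -target "example.com" -module html,regex \
  -module-config "html.title=//title" \
  -module-config 'regex.emails=[a-zA-Z0-9._%+-]+@example\.com' \
  -output combined.json
```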

## Command-line Options
```
Usage of chronos:
  -target string
        Specify the target URL or domain (supports wildcards)
  -list-modules
        List available modules
  -module string
        Comma-separated list of modules to run
  -module-config value
        Module configuration in the format: module.key=value
  -module-config-file string
        Path to the module configuration file
  -match-mime string
        Comma-separated list of MIME types to match
  -filter-mime string
        Comma-separated list of MIME types to filter out
  -match-status string
        Comma-separated list of status codes to match (default "200")
  -filter-status string
        Comma-separated list of status codes to filter out
  -from string
        Filter snapshots from a specific date (Format: yyyyMMddhhmmss)
  -to string
        Filter snapshots to a specific date (Format: yyyyMMddhhmmss)
  -limit string
        Limit the number of snapshots to process (use negative numbers for the newest N snapshots, positive numbers for the oldest N results) (default "-50")
  -snapshot-interval string
        The interval for getting at most one snapshot (possible values: h, d, m, y)
  -one-per-url
        Fetch one snapshot only per URL
  -threads int
        Number of concurrent threads to use (default 10)
  -output string
        Path to the output file
```
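
As an illustration of how the snapshot filters combine, the hypothetical command below keeps only HTML snapshots captured during 2020 and processes the 20 oldest matches (the target and output names are placeholders):
```
# placeholder target/output; combines the documented MIME, date, and limit filters
chronos -target "example.com/*" -module full \
  -match-mime "text/html" \
  -from 20200101000000 -to 20201231235959 \
  -limit 20 \
  -output snapshots_2020.json
```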

---

## Contributing
Find a bug? Got a feature request? Have an interesting preset in mind? Issues and pull requests are always welcome :)