A TypeScript application that parses HTTP access logs and provides analytics about IP addresses and URL access patterns.
- Counts unique IP addresses in the log file
- Identifies top 3 most visited URLs
- Identifies top 3 most active IP addresses
- Handles standard HTTP access log format
- Written in TypeScript with full type safety
- Includes unit tests
- Node.js (v14 or higher)
- npm (v6 or higher)
- Clone the repository:
git clone https://github.com/cryptoleek-eth/http-log-analyzer.git
cd http-log-analyzer
- Install dependencies:
npm install
- Build the project:
npm run build
- Run the analyzer:
npm start
The analyzer will process the included sample log file (programming-task-example-data.log) and output:
- Number of unique IP addresses
- Top 3 most visited URLs with visit counts
- Top 3 most active IP addresses with request counts
├── src/
│ ├── __tests__/ # Test files
│ ├── types/ # TypeScript interfaces
│ ├── utils/ # Utility classes
│ ├── services/ # Business logic
│ └── index.ts # Application entry point
├── dist/ # Compiled JavaScript files
└── programming-task-example-data.log # Sample log file
- npm run build - Compiles TypeScript to JavaScript
- npm start - Runs the compiled application
- npm test - Runs the test suite
- npm run dev - Runs the application in development mode using ts-node
- npx ts-node src/index.ts - Runs the application in development mode without building
To run the unit tests:
npm test
The application expects log files in the following format:
Example:
177.71.128.21 - - [10/Jul/2018:22:21:28 +0200] "GET /url1 HTTP/1.1" 200 3574
Defines the structure for parsed log entries:
- ipAddress: Client IP address
- timestamp: Request timestamp
- method: HTTP method
- url: Requested URL
- protocol: HTTP protocol version
- statusCode: Response status code
- responseSize: Response size in bytes
- userAgent: Client user agent string
Handles parsing of individual log lines using regex pattern matching.
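As a rough illustration of regex-based line parsing, the sketch below maps one log line onto the fields listed above. The names LogEntrySketch, LOG_LINE and parseLine are hypothetical, and so are the field types (for example, keeping the timestamp as a raw string); the regular expression is one plausible pattern for this log format and may differ from the one in the repository.

```typescript
// Hypothetical shape mirroring the documented fields; the types are assumptions.
interface LogEntrySketch {
  ipAddress: string;
  timestamp: string; // kept as the raw bracketed text, not converted to a Date
  method: string;
  url: string;
  protocol: string;
  statusCode: number;
  responseSize: number;
  userAgent: string;
}

// Assumed pattern: ip, two ignored fields, [timestamp], "METHOD url protocol",
// status, size, then an optional "referrer" "user agent" pair.
const LOG_LINE =
  /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) ([^"]+)" (\d{3}) (\d+|-)(?: "([^"]*)" "([^"]*)")?/;

// Returns null for lines that do not match, so callers can skip them.
function parseLine(line: string): LogEntrySketch | null {
  const m = LOG_LINE.exec(line);
  if (m === null) {
    return null;
  }
  return {
    ipAddress: m[1],
    timestamp: m[2],
    method: m[3],
    url: m[4],
    protocol: m[5],
    statusCode: Number(m[6]),
    responseSize: m[7] === "-" ? 0 : Number(m[7]), // "-" means no body was sent
    userAgent: m[9] ?? "", // empty when the line carries no user agent
  };
}
```

Applied to the example line shown earlier, parseLine would yield ipAddress "177.71.128.21", url "/url1" and statusCode 200, with an empty userAgent because that line ends after the response size.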
Provides analytics functionality:
- getUniqueIPCount(): Returns count of unique IP addresses
- getTopUrls(): Returns most frequently accessed URLs
- getTopIPs(): Returns most active IP addresses
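The three analytics methods can be sketched with a Set for uniqueness and a Map for frequency counts. This is a minimal sketch, not the repository's implementation: the class name LogAnalyzerSketch, the helper topN, the optional limit parameter (defaulting to 3) and the alphabetical tie-break are all assumptions made for illustration.

```typescript
interface CountedItem {
  value: string;
  count: number;
}

// Counts occurrences and returns the n most frequent values.
// Ties are broken alphabetically; this ordering is an assumption.
function topN(values: string[], n: number): CountedItem[] {
  const counts = new Map<string, number>();
  for (const v of values) {
    counts.set(v, (counts.get(v) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .map(([value, count]) => ({ value, count }))
    .sort((a, b) => b.count - a.count || (a.value < b.value ? -1 : a.value > b.value ? 1 : 0))
    .slice(0, n);
}

// Hypothetical analyzer over already-parsed entries.
class LogAnalyzerSketch {
  private readonly entries: { ipAddress: string; url: string }[];

  constructor(entries: { ipAddress: string; url: string }[]) {
    this.entries = entries;
  }

  getUniqueIPCount(): number {
    return new Set(this.entries.map((e) => e.ipAddress)).size;
  }

  getTopUrls(limit: number = 3): CountedItem[] {
    return topN(this.entries.map((e) => e.url), limit);
  }

  getTopIPs(limit: number = 3): CountedItem[] {
    return topN(this.entries.map((e) => e.ipAddress), limit);
  }
}
```

Counting with a Map takes a single pass over the entries. The sort then runs over distinct values only, which is usually far fewer than the number of log lines.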
- Invalid log entries are skipped during parsing
- File read errors are caught and reported
- Type safety is enforced throughout the application
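The first two behaviours above could look roughly like the sketch below. The function names readLogFile and keepValidLines, the validity pattern, and the decision to ignore blank lines rather than count them as invalid are assumptions for illustration; the project may structure this differently.

```typescript
import { readFileSync } from "fs";

// Assumed validity check: the line must have the overall shape of an access log entry.
const LOOKS_VALID = /^\S+ \S+ \S+ \[[^\]]+\] "\S+ \S+ [^"]+" \d{3} (\d+|-)/;

// Reads the whole file as UTF-8; reports the error and returns null on failure.
function readLogFile(path: string): string | null {
  try {
    return readFileSync(path, "utf-8");
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    console.error(`Could not read log file "${path}": ${message}`);
    return null;
  }
}

// Keeps lines that look valid and counts the ones that were skipped.
function keepValidLines(content: string): { valid: string[]; skipped: number } {
  const valid: string[] = [];
  let skipped = 0;
  for (const line of content.split(/\r?\n/)) {
    if (line.trim() === "") {
      continue; // blank lines are ignored, not counted as invalid
    }
    if (LOOKS_VALID.test(line)) {
      valid.push(line);
    } else {
      skipped++;
    }
  }
  return { valid, skipped };
}
```

Returning null instead of rethrowing lets the caller decide whether a missing file should end the run or just print a message.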
- Log file format follows the standard pattern
- Log file is UTF-8 encoded
- Memory usage is not a constraint (file is read entirely into memory)
- URLs are case-sensitive
- IP addresses are well-formed
This project is licensed under the MIT License - see the LICENSE file for details.