🕸️ Web Scraping Tool in TypeScript with Puppeteer: A versatile web scraping and data extraction tool built with TypeScript and Puppeteer for efficient web data harvesting.

ojmarte/web-data-extractor

🕸️ Web Scraping Project

Project Description

This project is a web scraping application that uses Puppeteer and Crawler to extract data from websites. Its modular structure keeps the scraping logic for each website separate, so support for a new site can be added or customized without changing the rest of the application.
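As a rough illustration of what a per-website modular structure could look like, here is a minimal sketch. Every name in it (PageSource, SiteScraper, exampleTitleScraper, pickScraper) is hypothetical and does not come from the repository. It assumes the page HTML has already been loaded, for instance with Puppeteer's page.content().

```typescript
// Hypothetical sketch: these names are illustrative and are not taken from the repository.

/** Minimal view of a loaded page, e.g. built from Puppeteer's page.url() and page.content(). */
interface PageSource {
  url: string;
  html: string;
}

/** One module per website: decides whether it handles a URL and how to extract data from it. */
interface SiteScraper<T> {
  matches(url: string): boolean;
  extract(page: PageSource): T;
}

interface TitleResult {
  title: string | null;
}

/** Example module for an assumed site, example.com, that only reads the <title> element. */
const exampleTitleScraper: SiteScraper<TitleResult> = {
  matches: (url) => url.indexOf("https://example.com/") === 0,
  extract: (page) => {
    const match = /<title>([^<]*)<\/title>/i.exec(page.html);
    return { title: match ? match[1].trim() : null };
  },
};

/** Returns the first registered scraper that claims the URL, or undefined if none does. */
function pickScraper<T>(scrapers: SiteScraper<T>[], url: string): SiteScraper<T> | undefined {
  for (const scraper of scrapers) {
    if (scraper.matches(url)) {
      return scraper;
    }
  }
  return undefined;
}
```

Under this assumed design, supporting another website means adding one more SiteScraper object to the list passed to pickScraper. The regular expression is used only to keep the sketch free of dependencies; a real module would more likely query the DOM through Puppeteer.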

🌐 Environment Setup

Before getting started with the project, make sure Node.js and npm are installed. The steps under Cloning the Repository set up Node.js using nvm.

🚀 Installation

To install project dependencies, run the following command in your project directory:

npm install

📦 Cloning the Repository

After cloning the project repository, set up Node.js using nvm by following these steps:

  1. Install nvm (Node Version Manager) if you haven't already. You can use the following command:

    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.38.0/install.sh | bash
  2. Close and reopen your terminal or run:

    source ~/.nvm/nvm.sh
  3. Navigate to the project directory where you cloned the repository.

  4. Run the following command to install the required Node.js version. With no version argument, nvm reads the version from the .nvmrc file in the current directory:

    nvm install
  5. Activate that Node.js version in the current shell session:

    nvm use

Now, you have Node.js installed and configured for this project using nvm.

🏃 Running the Application

To run the web scraping application, use the following command:

npm start

🧪 Running Tests

To run tests, use the following command:

npm test

🧹 Linting

To run the linter and check your code for style issues, use the following command:

npm run lint
