Architecture diagram

Sohom Datta edited this page May 22, 2024 · 1 revision

The architecture of the VisibleV8 crawler is depicted below.

  • PostgreSQL is used as the primary database backend, and MongoDB is used as the primary archive storage medium.
  • Task statuses are stored in Celery.
  • The crawling instance exposes two VNC sessions, at localhost:5901 and localhost:6901, which can be used to monitor non-headless crawls.
  • The databases can be swapped out for instances not managed by Docker by selecting the connect mode during the setup process.
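
As a quick way to confirm that the two VNC sessions listed above are reachable before attaching a viewer, a minimal sketch such as the following may help. It is not part of the crawler: the helper names `port_is_open` and `vnc_status` are hypothetical, and the only values taken from this page are the host `localhost` and the ports 5901 and 6901.

```python
import socket

# Ports taken from the list above. Everything else in this sketch
# (function names, timeout value) is an illustrative assumption.
VNC_PORTS = (5901, 6901)


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def vnc_status(host: str = "localhost", ports=VNC_PORTS) -> dict:
    """Map each port to whether something is accepting TCP connections on it."""
    return {port: port_is_open(host, port) for port in ports}


if __name__ == "__main__":
    for port, is_open in vnc_status().items():
        state = "reachable" if is_open else "not reachable"
        print(f"localhost:{port} is {state}")
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the listener speaks the VNC protocol or that a crawl is currently running.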