PaulinaPacyna/otodom_scraper_and_information_retrieval

Otodom scraper and information retrieval

Otodom.pl is a Polish online real estate marketplace. It provides many useful filters (price, size, number of rooms), but I ran into situations where those filters were not enough: sometimes the relevant information appears only in the offer description. That is why I created this tool.

This tool does the following:

  • Scrapes all offers from a given listing link. The link can, and should, include the built-in Otodom filters.
  • Extracts the offer description and other offer details and saves them into a database.
  • Asks GPT (OpenAI) to answer the questions from questions.json in natural language (using langchain):
Question: Is the land regulated?
Answer: The land is being regulated.

Question: Is a land and mortgage register established?
Answer: No, the description contains information that a land and mortgage register has not been established.

Question: How much is the rent?
Answer: The rent is PLN 1,000.

Question: Is the apartment two-sided?
Answer: Yes, the description includes information that the apartment is two-sided: the windows face two directions, south and north.
  • Parses the answers from the previous step from natural language into JSON composed of integers and enums (yes/no/no information, encoded as "true"/"false"/"no_information"):
{
"lands_regulated": "false",
"mortgage_register": "no_information", 
"rent_administration_fee": 1000, 
"two_sided": "true"
}
  • Displays the offers back to the user and lets them filter the offers based on the answers.
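
The last two steps (turning the parsed JSON into typed values and filtering on them) can be illustrated with a minimal sketch. This is not the code from retrieve_offer_information.py: the names `TriState`, `OfferAnswers`, `parse_answers`, and `matches` are hypothetical, and the field names are simply taken from the example JSON above.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TriState(str, Enum):
    """Hypothetical enum mirroring the yes/no/no-information answers."""
    TRUE = "true"
    FALSE = "false"
    NO_INFORMATION = "no_information"


@dataclass
class OfferAnswers:
    """Typed view of the example JSON; field names are assumed from it."""
    lands_regulated: TriState
    mortgage_register: TriState
    rent_administration_fee: Optional[int]
    two_sided: TriState


def _to_int(value) -> Optional[int]:
    # Treat anything that is not a clean integer as "no information".
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_answers(raw_json: str) -> OfferAnswers:
    """Validate the model's JSON output and coerce it to typed fields."""
    data = json.loads(raw_json)
    default = TriState.NO_INFORMATION.value
    return OfferAnswers(
        lands_regulated=TriState(data.get("lands_regulated", default)),
        mortgage_register=TriState(data.get("mortgage_register", default)),
        rent_administration_fee=_to_int(data.get("rent_administration_fee")),
        two_sided=TriState(data.get("two_sided", default)),
    )


def matches(answers: OfferAnswers,
            max_fee: Optional[int] = None,
            require_two_sided: bool = False) -> bool:
    """Return True if an offer passes the user's answer-based filters."""
    if require_two_sided and answers.two_sided is not TriState.TRUE:
        return False
    if max_fee is not None:
        fee = answers.rent_administration_fee
        # Offers with an unknown fee are excluded when a limit is set.
        if fee is None or fee > max_fee:
            return False
    return True
```

Calling `parse_answers` on the example JSON above and then `matches(answers, max_fee=1200, require_two_sided=True)` would keep that offer. Excluding offers with an unknown fee is a design choice of this sketch; a real filter might instead show them with a "no information" label.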

Interesting files:

  • retrieve_offer_information.py - information retrieval and parsing to json
  • otodom_parser.py

Roadmap:

  • Retrieve the street name from the description and call the Google Maps API to compute the distance to "metro centrum" by public transportation.
  • Display images and the original description alongside the information retrieved by this tool.
