This project was light on technology, but it had a real impact on the research and taught me a great deal. It was part of a social sciences digitization initiative in which I worked with university professors and research assistants to collect restaurant menus as a source for studying the culinary history of the Gulf region.
My role was to collect and digitize these menus. I proposed to the lead professor that we scrape a website listing over 5,000 restaurants. The task built on my full-stack web development experience and let me quickly pick up Python web scraping with BeautifulSoup. The main challenge was working out the structure of the site: the listings were spread across many pages rather than available on a single one.
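To illustrate the multi-page problem, here is a minimal sketch of walking a paginated listing with BeautifulSoup. It is not the code in dynamic_webscraper.py. The URL pattern `LISTING_URL`, the `a.restaurant-link` selector, and both function names are assumptions made for the example, since the real site and its markup are not described here.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical URL pattern; the real site and its query parameters are assumptions.
LISTING_URL = "https://example.com/restaurants?page={page}"


def listing_page_urls(last_page: int) -> list[str]:
    """Build the URL of every listing page, from page 1 to last_page."""
    return [LISTING_URL.format(page=n) for n in range(1, last_page + 1)]


def extract_restaurant_links(html: str, base_url: str) -> list[str]:
    """Return absolute URLs of the restaurant pages linked from one listing page."""
    soup = BeautifulSoup(html, "html.parser")
    # "a.restaurant-link" is an assumed selector; inspect the real page to find the right one.
    anchors = soup.select("a.restaurant-link[href]")
    # Relative hrefs are resolved against the URL of the page they came from.
    return [urljoin(base_url, a["href"]) for a in anchors]
```

In practice each URL from `listing_page_urls` would be fetched with an HTTP client and its HTML passed to `extract_restaurant_links`, so the full set of restaurant pages is built up one listing page at a time.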
A significant hurdle arose when only the first menu image would download. I eventually discovered that the images were loaded dynamically. BeautifulSoup only parses the HTML it is given and does not run JavaScript, so it could not see them. After extensive debugging, I switched to Selenium, which drives a real browser, and successfully scraped the data I needed. The result was a comprehensive collection of menu images for the research team.
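The sketch below shows one way to handle dynamically loaded images with Selenium: open the page in a browser, wait for the image elements to appear, then read their URLs. It is an illustration under stated assumptions, not the contents of dynamic_webscraper.py. The `img.menu-image` selector, the function names, and the fallback file name are all hypothetical.

```python
from urllib.parse import urlparse


def image_filename(url: str, index: int) -> str:
    """Derive a local file name from an image URL, or a numbered fallback if the path has none."""
    name = urlparse(url).path.rsplit("/", 1)[-1]
    return name or f"menu_{index:03d}.jpg"


def collect_menu_image_urls(page_url: str, timeout: float = 10.0) -> list[str]:
    """Open page_url in a browser and return the src of each menu image found after rendering."""
    # Imported here so image_filename can be used without Selenium or a browser driver installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(page_url)
        # Block until at least one matching image is in the DOM (or raise after `timeout` seconds).
        # "img.menu-image" is an assumed selector.
        images = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, "img.menu-image"))
        )
        # Skip images whose src attribute is missing or empty.
        return [src for img in images if (src := img.get_attribute("src"))]
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

The wait returns as soon as the first matching image is present, so a page that adds images one at a time (for example on scroll or click) would likely need extra scrolling or further waits before all of them can be collected.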
Building on this success, I am now helping to develop a machine-learning model that analyzes these images and turns them into structured data more efficiently. This ongoing work reflects my commitment to continuous learning and to applying my technical skills to diverse, real-world problems.
The main program is dynamic_webscraper.py. I have also uploaded part of the results under Abu Dhabi Menus.