Python Web Scraper | Automation, Backend

James Musser

Automation Engineer
Backend Engineer
Web Developer
Python
Selenium

A client in the ski resort industry wanted a web scraper that could pull 180 days' worth of ski resort ticket pricing from a handful of websites. The goal was to automate as much of the work as possible, ensure that the API cookie data was always correct, and ultimately output a CSV file with all of the relevant data.

Some requirements for the development of this application:

- The web scraper needed to automatically connect to a VPN (in a randomly selected country) at the start of the process. A custom module was created for this, working with NordVPN

- The application would open a browser window and navigate to the desired website via Selenium

- A CAPTCHA page needed to be bypassed. A custom module was created for this as well (using the CapSolver API)

- The API cookie needed to be grabbed from Chrome DevTools. A custom module was created here too

- Next, Selenium would close the browser window, and the initial VPN connection would be terminated. Then a second VPN connection (in another randomly selected country, different from the first) would be established, and the browser window would be reopened with Selenium

- Finally, the API cookie data captured earlier would be used to pull the ski resort ticket price data over a customizable range of dates and age groups. The data was exported to an output CSV file
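
The pure-Python parts of the steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the project: the function names, the CSV column names, and the `fetch_price` callback are all hypothetical, and the cookie format is assumed to be the list of dicts that Selenium's `driver.get_cookies()` returns (the project itself reads the cookie through Chrome DevTools). The VPN, CAPTCHA, and browser steps are left out because they depend on NordVPN, CapSolver, and a live browser.

```python
import csv
import io
import random
from datetime import date, timedelta
from itertools import product


def pick_two_countries(countries, rng=random):
    """Pick two different countries: one per VPN session (hypothetical helper)."""
    first, second = rng.sample(countries, 2)  # sample() never repeats an element
    return first, second


def cookie_header(selenium_cookies):
    """Build a Cookie header value from Selenium-style cookie dicts (assumed format)."""
    return "; ".join(f"{c['name']}={c['value']}" for c in selenium_cookies)


def date_range(start, days):
    """Return `days` consecutive dates beginning at `start`."""
    return [start + timedelta(days=offset) for offset in range(days)]


def write_prices(out, start, days, age_groups, fetch_price):
    """Write one CSV row per (date, age group) pair and return the row count.

    `fetch_price(day, group)` stands in for the real API request, which would
    send the cookie header captured earlier.
    """
    writer = csv.writer(out)
    writer.writerow(["date", "age_group", "price"])  # assumed column names
    rows = 0
    for day, group in product(date_range(start, days), age_groups):
        writer.writerow([day.isoformat(), group, fetch_price(day, group)])
        rows += 1
    return rows


# Example usage with placeholder data instead of a real site
vpn_one, vpn_two = pick_two_countries(["Canada", "Japan", "Norway"])
header = cookie_header([{"name": "session", "value": "abc123"}])
buffer = io.StringIO()
write_prices(buffer, date(2024, 1, 1), 180, ["adult", "child"], lambda d, g: 0)
```

Passing the price lookup in as a callback keeps the date and age-group loop separate from the network code, so the CSV logic can be checked without a VPN connection or a browser session.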



A video showing this process can be viewed here:

https://youtu.be/UJE38xQqJAU?si=ZqeqyR1JIPKi168d

And the GitHub repo for the project can be found here:

https://github.com/musser004/Ski_Resort_Web_Scraper


