🚀 Just built my new Python automation project: Greenhouse Job Scraper
This tool automates the process of extracting job listings from multiple Greenhouse job boards, making job data collection faster, cleaner, and more scalable.
What it does:
✅ Scrapes jobs from multiple companies
✅ Validates company slugs before scraping
✅ Supports remote-only filtering
✅ Handles request retries and failures
✅ Removes duplicate entries efficiently
✅ Exports structured datasets into JSON and CSV
Extracted fields:
• Job title
• Company name
• Location
• Published date
• Updated date
• Application deadline
• Direct application URL
Tech Stack:
🐍 Python
🌐 Requests
📦 JSON / CSV
⚙️ REST API Integration
🖥️ CLI Interface
This project helped me deepen my understanding of:
API-based scraping
Data pipeline structuring
Error handling and retries
Efficient deduplication strategies
Building production-style CLI tools
A solid step forward in my journey through automation, backend engineering, and data extraction systems.
#Python #WebScraping #Automation #APIs #DataEngineering #BackendDevelopment #SoftwareEngineering #OpenToWork #Freelance #JobScraper
0
11
I recently completed a web scraping project
targeting OuedKniss, Algeria's largest
real estate marketplace.
The tool automates the extraction of structured
property listings via their GraphQL API and
delivers clean, analysis-ready data in CSV
and JSON formats.
Key technical implementations:
→ GraphQL API integration with dynamic queries
→ Multilingual data processing (Arabic & French)
→ Unicode & encoding normalization
→ Retry logic with configurable error handling
→ Automated data cleaning pipeline
Tech stack: Python • Requests • GraphQL •
Regex • Unicodedata • CSV • JSON
This project is part of my growing freelance
portfolio as a Python developer specialized
in web scraping and data automation.
Open to freelance opportunities in data
extraction, automation, and API integration.
🔗 GitHub: github.com/GuMouhssin
(https://www.linkedin.com/safety/go/?url=http%3A%2F%2Fgithub%2Ecom%2FGuMouhssin&urlhash=U4mD&mt=pXN2XKbtKWo5WK9s0zVnl0QRgJTgq9fOlsERISTSLLRl-e5X20WoDzYM2uBuy4yZILNNv6jcLvUp9LhMKdMFEdJpijU--0xFcBm8huL6mVBr15rLMB4Bomfxhz8&isSdui=true)#Python (https://www.linkedin.com/search/results/all/?keywords=%23python&origin=HASH_TAG_FROM_FEED) #WebScraping (https://www.linkedin.com/search/results/all/?keywords=%23webscraping&origin=HASH_TAG_FROM_FEED) #DataAutomation
(https://www.linkedin.com/search/results/all/?keywords=%23dataautomation&origin=HASH_TAG_FROM_FEED)#GraphQL (https://www.linkedin.com/search/results/all/?keywords=%23graphql&origin=HASH_TAG_FROM_FEED) #DataExtraction (https://www.linkedin.com/search/results/all/?keywords=%23dataextraction&origin=HASH_TAG_FROM_FEED) #Freelance (https://www.linkedin.com/search/results/all/?keywords=%23freelance&origin=HASH_TAG_FROM_FEED) #Algeria (https://www.linkedin.com/search/results/all/?keywords=%23algeria&origin=HASH_TAG_FROM_FEED)
0
24
Built a Python-based web scraping system that automates the extraction of book data from a multi-page online catalog.
The scraper efficiently navigates paginated pages, collects product links, extracts structured information, and exports clean datasets in both CSV and JSON formats.
Key Features:
• Automatic pagination handling • Category-based filtering • Keyword search functionality • Retry mechanisms for failed requests • Robust error handling • Structured data export for analysis and integration
Extracted Data:
• Book title • Price • Rating • Availability status • Product description • Category • Product URL
Tech Stack:
Python | Requests | BeautifulSoup | CSV | JSON
This project strengthened my understanding of web scraping architecture, data extraction pipelines, HTML parsing, and building resilient automation systems.
#Python #WebScraping #DataExtraction #Automation #BeautifulSoup #Requests #DataEngineering #SoftwareDevelopment #OpenToWork #FreelanceDeveloper