Re-architecting Facebook crawlers

Omar ElSafwany

While working with a social media company, we ran into scalability issues crawling data from Facebook: lots of duplicates, rate-limit errors, and missing data. Around 20% of our API calls were wasted.
I wrote an algorithm that schedules crawling jobs across a 24-hour window, assigning each account a crawling frequency based on how often it posts. This eliminated rate-limit errors and duplicate crawls and, most importantly, left no missing data.
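A minimal sketch of that scheduling idea, assuming a fixed daily API-call budget and a known per-account posting rate. The names here (Account, posts_per_day, daily_call_budget) and the staggering formula are illustrative assumptions, not the production implementation.

```python
from dataclasses import dataclass

WINDOW_HOURS = 24.0  # the scheduling window from the post


@dataclass
class Account:
    id: str
    posts_per_day: float  # observed posting frequency (assumed known)


def build_schedule(accounts: list[Account],
                   daily_call_budget: int) -> dict[str, list[float]]:
    """Return crawl times (hours into the 24h window) per account.

    Each account gets a share of the call budget proportional to its
    posting rate: busy pages are polled often enough that no post is
    missed, quiet pages rarely enough that calls aren't wasted on
    duplicate data. A per-account phase offset staggers the jobs so
    they spread across the window instead of bursting into the rate
    limit. Rounding may land slightly over budget; a real scheduler
    would reconcile that.
    """
    total_rate = sum(a.posts_per_day for a in accounts) or 1.0
    schedule: dict[str, list[float]] = {}
    for i, acct in enumerate(accounts):
        # Proportional share of the budget, at least one crawl per day.
        calls = max(1, round(daily_call_budget * acct.posts_per_day / total_rate))
        interval = WINDOW_HOURS / calls
        # Phase offset so accounts never all fire at the same instant.
        offset = (i * interval / max(len(accounts), 1)) % interval
        schedule[acct.id] = [offset + k * interval for k in range(calls)]
    return schedule


if __name__ == "__main__":
    accounts = [Account("busy_page", 48), Account("quiet_page", 2)]
    for acct_id, times in build_schedule(accounts, daily_call_budget=50).items():
        print(acct_id, [f"{t:.1f}h" for t in times[:4]], f"({len(times)} crawls/day)")
```

Under these assumptions, the busy page is crawled roughly every half hour while the quiet one is visited only twice a day, offset by six hours, which is how the scheduler avoids both duplicates and missed posts within the same call budget.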

Posted Jul 4, 2024

Re-architecting how we crawl data from Facebook to use API calls efficiently and lose no data.
