Cloud Architecture for Data Lakes and Analytics Solutions

Kedir Omer

Cloud Security Engineer
Data Analyst
Data Engineer
AWS
Python

𝐏𝐫𝐨𝐣𝐞𝐜𝐭 𝐁𝐚𝐜𝐤𝐠𝐫𝐨𝐮𝐧𝐝

I’m working at a company that plans to use Amazon Simple Storage Service (Amazon S3) as the storage layer for its data lake solution. Initially, the data ingested into the data lake will come from three locations:

- Internet of Things (IoT) sensors that send real-time data
- A database with historical records
- Supplemental data from third-party entities for enriching internally generated data

These components currently run in the data center on physical servers. If a power outage occurred in the data center, all systems would go offline. Because of this issue (in addition to other benefits of the cloud), my customer wants to migrate all components to the cloud and, when possible, use AWS services to replace on-premises components.

𝐖𝐡𝐚𝐭 𝐰𝐢𝐥𝐥 𝐈 𝐝𝐨 𝐚𝐬 𝐚 𝐂𝐥𝐨𝐮𝐝 𝐒𝐨𝐥𝐮𝐭𝐢𝐨𝐧𝐬 𝐀𝐫𝐜𝐡𝐢𝐭𝐞𝐜𝐭?

The company has tasked me with designing solutions for ingesting this data into their data lake, and each location (IoT sensors, database, and third party) will need its own ingestion solution.

From there, I will also need to design a solution for cleaning or transforming the data so that it can be analyzed. The company currently uses Apache Hadoop-based software. When possible, they prefer to use similar technologies in the cloud so that they don’t need to retrain their analytics team on too many new technologies at one time.

The company also has a requirement to create dashboards that show visual representations of the insights they derive from the data.
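
Because each of the three sources gets its own ingestion path into the same S3-backed data lake, one thing worth settling early is how objects from each source are laid out. The following is only a minimal sketch under stated assumptions: the `raw/` prefix, the source names in `SOURCES`, and the helper `raw_object_key` are hypothetical and do not come from the project itself. It uses `key=value` date partitions, a layout that Hadoop-ecosystem tools commonly recognize.

```python
from datetime import datetime, timezone

# Hypothetical source names, one per ingestion path described above.
SOURCES = ("iot_sensors", "historical_db", "third_party")


def raw_object_key(source: str, event_time: datetime, filename: str) -> str:
    """Build a date-partitioned S3 object key under an assumed 'raw/' prefix."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    # Normalize to UTC so partitions do not depend on the producer's time zone.
    t = event_time.astimezone(timezone.utc)
    return f"raw/{source}/year={t:%Y}/month={t:%m}/day={t:%d}/{filename}"


if __name__ == "__main__":
    ts = datetime(2024, 3, 5, 12, 0, tzinfo=timezone.utc)
    print(raw_object_key("iot_sensors", ts, "batch-0001.json"))
```

Keeping each source under its own prefix would let the later cleaning and transformation step process one source at a time and read only the date partitions it needs.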
