Big Data Cluster Architecture

Business Analyst

Data Analyst

Data Engineer

Power BI

Python

Bikanervala

◦ Objective – To develop an efficient big data solution.

◦ Challenge – To handle large data volumes (3+ TB) from multiple data sources.

◦ Approach – To deploy a cluster architecture with a single data source.

◦ Solution – Implemented Hadoop (HDFS), Spark (PySpark), and Airflow (scheduler) on a Linux-based cluster, with a compressed data warehouse serving as the single data source. The entire stack was open source.

◦ Result – ETL process time reduced by 10x.
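
To make the "multiple sources into one compressed data source" idea concrete, here is a minimal plain-Python sketch at toy scale. It is not the project's actual pipeline, which ran PySpark on HDFS under Airflow; it uses only the standard library, and the source names (`POS_CSV`, `ONLINE_JSON`), the fields, and the output file name are all hypothetical.

```python
import csv
import gzip
import io
import json
import os
import tempfile

# Hypothetical raw inputs standing in for two separate source systems.
POS_CSV = "order_id,outlet,amount\n1,Delhi,250.0\n2,Noida,120.5\n"
ONLINE_JSON = '[{"order_id": 3, "outlet": "Delhi", "amount": 99.0}]'


def extract():
    """Yield records from every source in one common schema."""
    for row in csv.DictReader(io.StringIO(POS_CSV)):
        yield {"order_id": int(row["order_id"]), "outlet": row["outlet"],
               "amount": float(row["amount"]), "source": "pos"}
    for row in json.loads(ONLINE_JSON):
        yield {"order_id": int(row["order_id"]), "outlet": row["outlet"],
               "amount": float(row["amount"]), "source": "online"}


def load(records, path):
    """Write all records to a single gzip-compressed JSON Lines file."""
    count = 0
    with gzip.open(path, "wt", encoding="utf-8") as out:
        for rec in records:
            out.write(json.dumps(rec) + "\n")
            count += 1
    return count


def read_all(path):
    """Read the consolidated file back as a list of dicts."""
    with gzip.open(path, "rt", encoding="utf-8") as src:
        return [json.loads(line) for line in src]


# Consolidate both sources into one compressed file, then query only that file.
warehouse_path = os.path.join(tempfile.mkdtemp(), "orders.jsonl.gz")
written = load(extract(), warehouse_path)
rows = read_all(warehouse_path)

total_by_outlet = {}
for r in rows:
    total_by_outlet[r["outlet"]] = total_by_outlet.get(r["outlet"], 0.0) + r["amount"]

print(written, total_by_outlet)
```

Once the records share one schema and one location, downstream analysis reads a single file instead of querying each source separately. In a cluster setup the same pattern would typically be expressed as a scheduled Spark job writing a compressed columnar format, which is an assumption here rather than a detail stated in the project summary.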
