This project tests your knowledge of the batch-processing tools covered throughout this course. It revolves mainly around Apache Sqoop, Apache Spark (through PySpark), Amazon S3 and Amazon Redshift, all of which are widely used in the industry.
In this project, Spar Nord Bank wants to study ATM withdrawal behavior and the factors it depends on, so that it can manage the refill frequency of its ATMs optimally. Other insights also have to be drawn from the data.
For the analysis part, you are asked to carry out the calculations needed to answer the following analytical queries:
Top 10 ATMs where most transactions are in the 'inactive' state
Number of ATM failures corresponding to the different weather conditions recorded at the time of the transactions
Top 10 ATMs with the highest number of transactions throughout the year
Number of overall ATM transactions going inactive in each month
Top 10 ATMs with the highest total amount withdrawn throughout the year
Number of failed ATM transactions across various card types
Top 10 records by number of transactions, on weekdays and on weekends throughout the year, ordered by ATM_number, ATM_manufacturer, location, weekend_flag and then total_transaction_count
Most active day for each ATM at the location "Vejgaard"
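As an illustration, the first two queries in the list above can be sketched as plain SQL aggregations. The sketch below is not the project's actual schema or Redshift setup: it uses Python's built-in sqlite3 module as a stand-in, and the table name atm_transactions, the columns atm_number, atm_status, weather_main and amount, the sample rows, and the reading of "failure" as a transaction with status 'Inactive' are all assumptions made for the example.

```python
import sqlite3

# Hypothetical, simplified schema; the real project tables and columns may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE atm_transactions (
        atm_number   TEXT,
        atm_status   TEXT,     -- assumed values: 'Active' / 'Inactive'
        weather_main TEXT,     -- assumed weather label at transaction time
        amount       INTEGER
    )
    """
)

# Made-up sample rows, only to make the queries runnable.
rows = [
    ("ATM-1", "Inactive", "Rain", 0),
    ("ATM-1", "Inactive", "Clear", 0),
    ("ATM-1", "Active", "Clear", 500),
    ("ATM-2", "Inactive", "Rain", 0),
    ("ATM-2", "Active", "Clouds", 200),
    ("ATM-3", "Active", "Clear", 300),
]
conn.executemany("INSERT INTO atm_transactions VALUES (?, ?, ?, ?)", rows)


def top_inactive_atms(conn, limit=10):
    """Return (atm_number, inactive_count) for the ATMs with the most inactive transactions."""
    query = """
        SELECT atm_number, COUNT(*) AS inactive_count
        FROM atm_transactions
        WHERE atm_status = 'Inactive'
        GROUP BY atm_number
        ORDER BY inactive_count DESC, atm_number
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


def failures_by_weather(conn):
    """Return (weather_main, failure_count), treating 'Inactive' rows as failures (an assumption)."""
    query = """
        SELECT weather_main, COUNT(*) AS failure_count
        FROM atm_transactions
        WHERE atm_status = 'Inactive'
        GROUP BY weather_main
        ORDER BY failure_count DESC, weather_main
    """
    return conn.execute(query).fetchall()


print(top_inactive_atms(conn))
print(failures_by_weather(conn))
```

The remaining queries follow the same pattern of filtering, grouping and ordering, with the grouping key changed to the month, card type or weekend flag; in the project itself the equivalent SQL would presumably run against the tables loaded into Amazon Redshift.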
Posted Oct 6, 2024
Contribute to Murtaza6547/ETL_Project development by creating an account on GitHub.