Data Modelling Analyst
Data Analyst
Data Engineer
Apache Airflow
Google Cloud Platform
Microsoft Power BI
health_etl_data
download_google_files: Downloads files from Google Drive.
export_sqltables_to_csv: Exports data from SQLite tables to CSV files.
extract_icd_data: Extracts ICD (International Classification of Diseases) data in JSON format.
transform_json_to_csv: Converts JSON files to CSV format.
process_inpatient_outpatient_files: Processes inpatient and outpatient data files using configuration settings.
hospitalinfo_clean_task: Cleans hospital information data.
gcs_to_raw tasks: Load CSV files from GCS into BigQuery tables in the healthcare_raw dataset.
healthcare_transformed dataset in BigQuery and import relevant tables.
google-cloud, astro, and cosmos
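The export_sqltables_to_csv task listed above is described only as exporting SQLite tables to CSV files. A minimal sketch of that kind of step, using only the Python standard library, might look like the following. The function name export_tables_to_csv, its parameters, and the one-file-per-table layout are assumptions for illustration, not the project's actual code.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path, out_dir):
    """Write every user table in a SQLite database to <out_dir>/<table>.csv.

    Hypothetical helper: the name and the output layout are assumptions.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists all schema objects; keep tables only and
        # skip names reserved for SQLite's internal use.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        tables = [name for name in names if not name.startswith("sqlite_")]
        for table in tables:
            # Table names cannot be bound as parameters, so quote the identifier.
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            target = out_dir / f"{table}.csv"
            with target.open("w", newline="", encoding="utf-8") as handle:
                writer = csv.writer(handle)
                # cursor.description holds one tuple per column; item 0 is the name.
                writer.writerow([col[0] for col in cursor.description])
                writer.writerows(cursor)
            written.append(target)
    finally:
        conn.close()
    return written
```

A function like this could be wrapped in an Airflow task; it returns the list of CSV paths it wrote so that a downstream step can pick them up.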
gcp connection
gcp connection is properly set up.
/usr/local/airflow/include/ directory for scripts, datasets, and config files.
dags/health_etl_data.py
/usr/local/airflow/include/scripts/config.txt
/usr/local/airflow/include/sources/
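For the transform_json_to_csv step, which converts JSON files to CSV format, a standard-library sketch could look like this. It assumes each JSON file holds a top-level array of flat objects; the real shape of the ICD data is not stated in the source, and the function name json_to_csv is hypothetical.

```python
import csv
import json
from pathlib import Path


def json_to_csv(json_path, csv_path):
    """Convert a JSON array of flat objects into a CSV file; return the row count.

    Hypothetical helper: assumes a top-level list of flat dictionaries.
    """
    records = json.loads(Path(json_path).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array of objects")
    # Collect column names in first-seen order so records with
    # differing keys still fit under one header row.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    with Path(csv_path).open("w", newline="", encoding="utf-8") as handle:
        # restval fills in an empty cell when a record lacks a column.
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(records)
    return len(records)
```

Nested JSON would need flattening before this step; the sketch deliberately leaves that out because the source does not describe the input structure.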
healthcare_transformed dataset
healthcare_dashboard.pbix: Main Power BI dashboard file.
report file: Power BI report file.
Posted Jan 23, 2025
Contribute to Sir-Muguna/healthcare_data_pipeline development by creating an account on GitHub.