Predicting Earthquake Magnitude: Insights from 28 Years of Seism

Balaj Khalid


Data Scientist

Data Analyst

Python

scikit-learn

seaborn

Introduction

Earthquakes are among the most devastating natural disasters, causing widespread destruction and loss of life. Understanding their patterns can help mitigate damage and improve early warning systems. Using a dataset of 994 recorded earthquakes from January 1995 to January 2023, I conducted an in-depth analysis to explore trends, relationships, and predictive modeling of earthquake magnitudes.

Dataset and Preprocessing

The dataset, sourced from Kaggle, includes key attributes such as magnitude, depth, intensity levels, tsunami occurrences, and location details. Before analysis, I performed the following preprocessing steps:
Imputed missing values: Missing countries were derived from the location field, and continents were then assigned from the country.
Data Cleaning: Removed the sig column (it is derived in part from magnitude, the prediction target) and the alert column (a high percentage of missing values). Remaining missing values were imputed, and consistency checks were run to verify data integrity.
Feature Engineering: Encoded the categorical columns magType, country, and continent as numerical features so that the regression models could use them.
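
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the rows are made up, the column names follow the attributes mentioned in this post, the `CONTINENT_BY_COUNTRY` lookup and the `preprocess` helper are hypothetical, and one-hot encoding is an assumption, since the post does not say which encoding was used.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the Kaggle dataset.
raw = pd.DataFrame({
    "magnitude": [7.0, 6.9, 6.5, 7.3],
    "depth": [14.0, 25.0, 579.0, 37.0],
    "sig": [768, 735, 650, 833],
    "alert": ["green", None, None, "yellow"],
    "magType": ["mww", "mww", "mwb", "mww"],
    "location": ["Malango, Solomon Islands", "Bengkulu, Indonesia", None, "Suva, Fiji"],
    "country": [None, "Indonesia", None, "Fiji"],
})

# Hypothetical lookup; the post does not describe how continents were assigned.
CONTINENT_BY_COUNTRY = {
    "Solomon Islands": "Oceania",
    "Indonesia": "Asia",
    "Fiji": "Oceania",
}


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    # Drop the columns the post removes: `sig` and `alert`.
    out = df.drop(columns=["sig", "alert"])
    # Fill a missing country with the text after the last comma in `location`.
    from_location = out["location"].str.rsplit(",", n=1).str[-1].str.strip()
    out["country"] = out["country"].fillna(from_location).fillna("Unknown")
    # Assign a continent from the country, falling back to "Unknown".
    out["continent"] = out["country"].map(CONTINENT_BY_COUNTRY).fillna("Unknown")
    out = out.drop(columns=["location"])
    # One-hot encode the categorical columns (an assumed encoding choice).
    return pd.get_dummies(out, columns=["magType", "country", "continent"])


features = preprocess(raw)
print(features.columns.tolist())
```

One-hot encoding keeps tree-based models from reading an artificial order into country or continent codes, at the cost of a wider feature matrix.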

Exploratory Data Analysis (EDA)

To uncover meaningful patterns, I visualized the data using various plots:
Global Earthquake Distribution: A heatmap of earthquake occurrences by continent revealed that Asia and North America experienced the highest number of earthquakes. Countries like Japan, Indonesia, and the United States were among the most affected.
Figure: Countries affected by earthquakes, 1995–2023
Earthquake Magnitude Trends: Looking at the data year by year, I observed no clear increase or decrease in earthquake frequency, and significant earthquakes (magnitude > 6) were recorded consistently across the whole period. Box plots showed variation in magnitude across continents, with notable differences between Europe/Asia and the other continents.
Figure: Magnitude of earthquakes by continent, 1995–2023
Depth and Magnitude Correlation: A scatter plot analysis indicated that deeper earthquakes tend to have lower magnitudes, while shallow earthquakes (depth < 70 km) were more likely to have higher magnitudes and cause significant damage.
Figure: Correlation between earthquake depth and magnitude
Tsunami and Earthquake Relationship: A strong correlation was found between earthquake magnitude and tsunami occurrences. Most tsunamis resulted from earthquakes with magnitudes above 6.5, emphasizing the need for oceanic region monitoring.
Figure: Correlation between earthquake magnitude and tsunami occurrence
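
The depth and tsunami relationships described above can also be checked numerically. The sketch below is an illustration under assumptions, not the analysis from the notebook: the rows are made up, the column names (`magnitude`, `depth`, `tsunami`) are assumed, and `summarize` is a hypothetical helper. It uses the 70 km shallow-depth cutoff mentioned above and Pearson correlation; with a 0/1 tsunami flag, that is the point-biserial correlation.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the Kaggle dataset.
quakes = pd.DataFrame({
    "magnitude": [6.5, 6.6, 6.8, 7.0, 7.4, 7.8, 8.1, 6.7],
    "depth": [10.0, 35.0, 600.0, 20.0, 15.0, 30.0, 25.0, 150.0],
    "tsunami": [0, 0, 0, 1, 1, 1, 1, 0],
})


def summarize(df: pd.DataFrame, shallow_km: float = 70.0) -> dict:
    # Split events into shallow and deep using the depth cutoff.
    shallow = df["depth"] < shallow_km
    return {
        # Pearson correlation between depth and magnitude.
        "depth_magnitude_corr": df["depth"].corr(df["magnitude"]),
        "mean_magnitude_shallow": df.loc[shallow, "magnitude"].mean(),
        "mean_magnitude_deep": df.loc[~shallow, "magnitude"].mean(),
        # Pearson correlation with a binary flag (point-biserial).
        "magnitude_tsunami_corr": df["magnitude"].corr(df["tsunami"]),
    }


summary = summarize(quakes)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A correlation coefficient alongside each plot makes it easier to judge how strong a visual pattern really is, especially with fewer than a thousand events.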

Machine Learning Model for Magnitude Prediction

To predict earthquake magnitudes, I trained several regression models and tuned their hyperparameters with scikit-learn's GridSearchCV. The models tested were:
Linear Regression
Decision Trees
Random Forest
XGBoost (Extreme Gradient Boosting)
LightGBM
The best-performing model was Random Forest, which achieved the lowest Mean Squared Error (MSE) of the models tested, 0.1365. Its ability to capture nonlinear relationships, combined with averaging over an ensemble of decision trees, likely explains why it predicted magnitudes more accurately than the other models.
You can access the full Jupyter Notebook and code in my GitHub repository.
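
For readers who want a feel for the tuning workflow without opening the notebook, here is a minimal sketch of a grid search over a Random Forest regressor scored by MSE. It is not the notebook's code: the data is synthetic, and the parameter grid, the 80/20 split, and the 3-fold cross-validation are assumptions, since the post does not list them.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: three numeric features and a noisy nonlinear target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 3))
y = 6.5 + np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0.0, 0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Assumed grid; the post does not say which hyperparameters were searched.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_squared_error",  # scikit-learn maximizes scores, so MSE is negated
    cv=3,
)
search.fit(X_train, y_train)

# Evaluate the refit best model on the held-out test split.
test_mse = mean_squared_error(y_test, search.predict(X_test))
# Reference point: always predicting the training mean.
baseline_mse = mean_squared_error(y_test, np.full_like(y_test, y_train.mean()))

print("best params:", search.best_params_)
print("test MSE:", test_mse, "baseline MSE:", baseline_mse)
```

Reporting a mean-prediction baseline next to the model's MSE helps put a figure such as 0.1365 in context, because the same MSE can be strong or weak depending on how much the target varies.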

Key Takeaways

Asia and Oceania experience the most earthquakes, with Papua New Guinea and Indonesia being the most affected countries.
Higher-magnitude earthquakes often trigger tsunamis, highlighting the importance of early warnings in oceanic regions.
Depth is associated with magnitude: shallow earthquakes (depth < 70 km) tended to have higher magnitudes and to be more destructive.
Machine learning models can predict earthquake magnitudes from the recorded attributes, with Random Forest outperforming the other models tested (MSE 0.1365), likely because it can model complex nonlinear patterns.

Future Directions

While predictive modeling is promising, real-time earthquake forecasting remains a challenge. Future work could incorporate real-time seismic data, deep learning models, and additional geological features for better accuracy.

Conclusion

Understanding earthquake patterns through data-driven analysis is crucial for disaster preparedness. By leveraging machine learning, we can take a step closer to enhancing early warning systems and mitigating risks.

Posted Feb 8, 2025

Analyzed 994 earthquakes (1995–2023), explored trends, and built ML models. Random Forest achieved the best magnitude prediction with MSE 0.1365.
