WORLD SUICIDES RATES EXPLORATION

Cynthia Undisa

Data Visualizer

Data Analyst

MySQL

###Introduction

Death by suicide is commonplace and occurs almost everywhere. However, the suicide rates and numbers differ from country to country. This project is a dataset of suicide cases from 1985-2020. The project aims to find and explore relationships between the suicide rate and age group, generation, country, HDI for the year, GDP for the year, and GDP per capita.

###Project description

The data set has 12 variables: age, country, country-year, GDP for_year, gdp_per_capita, generation, HDI for year, population, sex, suicides_no, suicides/100k pop, year. The data is between the years 1985 and 2020. It was downloaded from the Kaggle open dataset.

The following data-wrangling steps were carried out to clean the data set

HDI for year was removed since it had lots of nulls.

2. An ID column was added to uniquely identify the countries.

Summary of Findings

I observed that over time, GDP per capita has increased and suicide rates have decreased.

2. The number of suicides among males is more than twice that of females.

3. The number of suicides in the Russian Federation is high.

4. The age group 35-54 has high suicide rates.

5. The Baby Boomer generation has high suicide rates.

CREATE database portfolioproject

USE portfolioproject

SHOW tables;

-- Select data we are going to use `suicide m` table

SELECT *

FROM `suicide m`;

-- Data cleaning of suicide dataset

-- Adding id column

ALTER TABLE `suicide m`

ADD COLUMN id int NOT NULL auto_increment Primary Key;

-- checking for distinct number of users

SELECT COUNT(DISTINCT(id))

FROM `suicide m`;

-- `suicide m` has 30556 users

-- checking missing data

SELECT *

FROM `suicide m`

WHERE id IS NULL;

-- deleting column HDI for year which will not use in analysis

ALTER TABLE `suicide m`

DROP COLUMN `HDI for year`

-- Preview of the cleaned data

SELECT *

FROM `suicide m`;

-- ANALYZING OF DATA

-- suicide no in russian federation

SELECT country,SUM(suicides_no)

FROM `suicide m`

WHERE country LIKE'r%'

GROUP BY 1

ORDER BY 2 DESC;

-- age with highest suicide cases

SELECT age,SUM(suicides_no)

FROM `suicide m`

GROUP BY 1

ORDER BY 2 DESC;

-- age with highest rate of suicide case is 35-54

-- gender with highest suicide cases

SELECT sex,SUM(suicides_no)

FROM `suicide m`

GROUP BY sex

ORDER BY 2 DESC;

-- males have a high rate of suicide than female

-- country with highest suicide numbers

SELECT country,`gdp_for_year ($)`,SUM(suicides_no)

FROM `suicide m`

GROUP BY 1,2

ORDER BY 3 DESC;

-- Russian federation has highest suicide cases

-- Looking at total suicides vs population

-- shows what percentage of population has committed suicide

SELECT country,year,population,SUM(suicides_no) AS total_suicidecases,(suicides_no/population)*100 AS suicide_percent

FROM `suicide m`

GROUP BY 1,2,3,5

ORDER BY 5 DESC;

-- Looking at countries with highest suicide rate as compared to the population

-- Using CTE

WITH t1 AS(

SELECT country,population,SUM(suicides_no) AS sum_scases,(SUM(suicides_no)/population)*100 AS suicide_percent

FROM `suicide m`

GROUP BY 1,2)

SELECT country,population,MAX(sum_scases),MAX(suicide_percent)

FROM t1

GROUP BY 1,2

ORDER BY 4 DESC;

-- Republic of Korea has the highest suicide rate as compared to the population

-- Looking at population vs suicide_no

SELECT SUM(population),SUM(suicides_no),SUM(suicides_no)/SUM(population)*100 AS suicide_percent

FROM `suicide m`

ORDER BY 3;

-- world suicide percent is 0.0048

-- suicides vs GDP per capita

SELECT country,SUM(suicides_no),SUM(`gdp_per_capita ($)`)

FROM `suicide m`

GROUP BY 1

ORDER BY 3 DESC;

-- Luxembourg has the highest sum per capita

-- suicides vs generation

SELECT SUM(suicides_no),generation

FROM `suicide m`

GROUP BY 2

ORDER BY 1 DESC;

-- generation boomers have high suicide numbers

-- age group with high suicide numbers

SELECT SUM(suicides_no),age

FROM `suicide m`

GROUP BY 2

ORDER BY 1 DESC;

-- 35-54 age group have high suicide_no

-- Creating View to store data for later visualization

CREATE VIEW countrysuicides_no AS

SELECT country,SUM(suicides_no)

FROM `suicide m`

WHERE country LIKE'r%'

-- GROUP BY 1

-- ORDER BY 2 DESC;

Like this project

Posted Jun 6, 2024

I analyze the world suicide rate to find out if there is a relationship between suicide rates, HDI per year, and GDP per capita.

Likes

Views

WORLD SUICIDES RATES EXPLORATION

Join 50k+ companies and 1M+ independents