MOHAMMAD ULLAH
Introduction
Dataset Overview
Let's proceed to the analysis.
Import the necessary libraries
Loading the dataset with pandas
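As a minimal sketch of this step, the snippet below reads CSV data with `read_csv()` and previews it with `head()`. The three rows held in `io.StringIO` are made-up stand-ins for the real file, whose path is not given here; only the column names are taken from the dataset description.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the real CSV file (values are made up).
csv_text = """Date,Name,Open,High,Low,Close,Volume
02-01-2022,AAA,10.0,10.5,9.8,10.2,1000
02-01-2022,BBB,20.0,20.4,19.9,20.1,500
03-01-2022,AAA,10.2,10.9,10.1,10.8,1500
"""

# With the real data, the file path would be passed to read_csv() instead.
df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first 5 rows by default (this sample only has 3).
print(df.head())
print(df.shape)
```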
Check the data types
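The type check and the date conversion might look roughly like the sketch below. The sample strings and their `-` separator are assumptions made for illustration; the point is only that `dayfirst=True` makes pandas read the first number as the day.

```python
import pandas as pd

# Made-up day-first date strings; the separator used in the real file is an assumption.
df = pd.DataFrame({
    "Date": ["03-02-2022", "13-01-2022", "30-06-2022"],
    "Close": [10.2, 10.8, 11.1],
})

print(df.dtypes)  # Date is not yet a datetime column

# dayfirst=True reads "03-02-2022" as 3 February rather than 2 March.
df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)

print(df.dtypes)  # Date is now a datetime64 column
```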
Calculate basic summary statistics for each column (mean, median, standard deviation, etc.)
Get the top 5 companies with the highest total volume
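One way to get this ranking is to sum `Volume` per company and keep the five largest totals. The sketch below uses made-up names and volumes; only the `Name` and `Volume` column names come from the dataset.

```python
import pandas as pd

# Made-up trading volumes for six hypothetical companies.
df = pd.DataFrame({
    "Name": ["A", "A", "B", "B", "C", "D", "E", "F"],
    "Volume": [100, 200, 50, 25, 500, 10, 350, 400],
})

# Total volume per company, then the five largest totals in descending order.
total_volume = df.groupby("Name")["Volume"].sum()
top5 = total_volume.nlargest(5)
print(top5)
```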
Explore the distribution of the 'Close' prices over time
Identify and analyze any outliers in the 'Volume' column
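The outlier rule is not specified here, so the sketch below assumes the common 1.5 × IQR rule: values more than 1.5 interquartile ranges below the first quartile or above the third quartile are flagged. The volumes are made up.

```python
import pandas as pd

# Made-up volumes with one obviously extreme value.
volume = pd.Series([100, 120, 110, 130, 90, 105, 5000], name="Volume")

# Assumed rule: flag values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
q1 = volume.quantile(0.25)
q3 = volume.quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = volume[(volume < lower) | (volume > upper)]
print(outliers)
```

Other rules, such as a z-score threshold, would flag a different set of points; the choice depends on how skewed the volume distribution is.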
Create a line chart to visualize the 'Close' prices over time
Calculate and plot the daily percentage change in closing prices
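A minimal sketch of the calculation, assuming the change should be computed separately for each stock so that one company's last price is never compared with another's first. The prices and the `Pct_Change` column name are made up, and the plotting step is left out.

```python
import pandas as pd

# Made-up closing prices for two hypothetical stocks over three days.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2022-01-02", "2022-01-03", "2022-01-04"] * 2),
    "Name": ["A"] * 3 + ["B"] * 3,
    "Close": [100.0, 110.0, 99.0, 50.0, 50.0, 55.0],
})

# Sort so that each stock's rows are in date order before differencing.
df = df.sort_values(["Name", "Date"]).reset_index(drop=True)

# pct_change() gives a fraction; multiply by 100 for a percentage.
# The first row of each stock has no previous day, so it is NaN.
df["Pct_Change"] = df.groupby("Name")["Close"].pct_change() * 100
print(df)
```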
Investigate the presence of any trends or seasonality in the stock prices
Apply 15-day and 30-day moving averages to smooth the time series and plot them against the original series
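The smoothing can be sketched with `rolling().mean()`. The series below is a made-up, steadily rising price for a single stock, and the windows count rows (trading days in the data), not calendar days; plotting is omitted.

```python
import numpy as np
import pandas as pd

# Made-up closing prices for one stock: 1.0, 2.0, ..., 60.0 on consecutive dates.
dates = pd.date_range("2022-01-01", periods=60, freq="D")
close = pd.Series(np.arange(1, 61, dtype=float), index=dates, name="Close")

# Each moving average is NaN until its window is full.
ma15 = close.rolling(window=15).mean()
ma30 = close.rolling(window=30).mean()

smoothed = pd.DataFrame({"Close": close, "MA15": ma15, "MA30": ma30})
print(smoothed.tail())
```

Calling `smoothed.plot()` would then draw the original series and both averages on one chart, provided matplotlib is installed.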
Calculate the average closing price for each stock
Identify the top 5 and bottom 5 stocks based on average closing price
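Building on the per-stock average from `groupby()`, the ranking could be sketched as below. The seven stock names and their prices are made up for illustration.

```python
import pandas as pd

# Made-up closing prices: two rows for each of seven hypothetical stocks.
df = pd.DataFrame({
    "Name": list("AABBCCDDEEFFGG"),
    "Close": [10, 12, 20, 22, 30, 32, 40, 42, 50, 52, 60, 62, 70, 72],
})

# Average closing price per stock.
avg_close = df.groupby("Name")["Close"].mean()

# Five highest and five lowest averages.
top5 = avg_close.nlargest(5)
bottom5 = avg_close.nsmallest(5)
print(top5)
print(bottom5)
```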
Calculate and plot the rolling standard deviation of the 'Close' prices
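The rolling standard deviation is a simple volatility measure: it is large where prices swing and near zero where they are flat. The sketch below assumes a 3-row window purely to keep the made-up example small; the window used on the real data is not stated here.

```python
import pandas as pd

# Made-up closing prices for one stock: a rise followed by a flat stretch.
close = pd.Series([10.0, 12.0, 14.0, 14.0, 14.0], name="Close")

# Sample standard deviation over a sliding 3-row window (assumed window size).
rolling_std = close.rolling(window=3).std()
print(rolling_std)
```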
Create a new column for daily price change
Analyze the distribution of daily price changes
Identify days with the largest price increases and decreases
Identify stocks with unusually high trading volume on certain days
Explore the relationship between trading volume and volatility
Correlation matrix between the 'Open' & 'High', 'Low' & 'Close' prices
Create heatmaps to visualize the correlations using the seaborn package.
Acknowledgments
The data is provided as a CSV file containing a substantial dataset for exploration. The dataset comprises 49,159 rows and 7 columns of Dhaka stock market data. It spans the period from January 2022 to June 2022 and covers 412 companies. The columns are `Date`, `Name`, `Open`, `High`, `Low`, `Close`, and `Volume`. Together they give a broad view of the dynamics of the Dhaka stock market and allow for in-depth analysis of this sector.

The dataset is loaded with the `read_csv()` function from the `pandas` library within a `.py` file in VS Code. Following this, the `head()` function is used to show the first 5 rows of the dataset, providing an initial overview of the structure and contents of the Dhaka stock market data.

The data types reveal an issue with the `Date` column: it appears as an `object` type when it should be in a datetime format. Therefore, we need to convert the `object` type to datetime. Let's proceed with that conversion.

Using `pd.to_datetime()`, we perform the conversion. Additionally, we set `dayfirst=True` since the date format is `%d%m%y`, with the day first. Let's recheck the data types to confirm the successful conversion.

Summary statistics are computed for the `df` DataFrame using the `describe()` function. This function reports the count, mean, standard deviation, minimum, maximum, and quartiles for each numerical column in the dataset; the 50% quartile is the median. These statistics give an overview of the data's central tendency and variability.

The average closing price for each stock is calculated with the `groupby()` function. The output displays the average closing price for all the stocks available in the dataset.

In the correlation matrix, a value near `1` signifies a strong positive correlation, while a value near `-1` indicates a strong negative correlation. Values closer to `0` suggest a weaker or no linear relationship.
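To make the reading of correlation values concrete, here is a minimal sketch that builds the matrix with `DataFrame.corr()`. The four rows of prices are made up; only the column names come from the dataset.

```python
import pandas as pd

# Made-up prices for one hypothetical stock over four days.
df = pd.DataFrame({
    "Open": [10.0, 11.0, 12.0, 13.0],
    "High": [10.6, 11.4, 12.7, 13.5],
    "Low": [9.8, 10.7, 11.6, 12.8],
    "Close": [10.4, 11.2, 12.5, 13.1],
})

# Pearson correlation between every pair of price columns.
# The diagonal is 1 because each column correlates perfectly with itself.
corr = df[["Open", "High", "Low", "Close"]].corr()
print(corr.round(3))
```

Passing `corr` to seaborn's `heatmap()` function, for example with `annot=True`, would draw the same matrix as an annotated heatmap.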