MOHAMMAD ULLAH
Welcome to a journey through the Dhaka stock market of 2022!
This blog is your gateway to decoding market trends, deciphering data patterns, and discovering the specifics of Bangladesh's financial landscape. Through Python-powered analysis and visualizations, we'll unveil the underlying narratives behind stock movements. Join me on this data-driven journey as we unravel the Dhaka stock market's story, offering valuable insights for traders, investors, and enthusiasts seeking a deeper understanding of market dynamics.
Within this analysis, we used a CSV file containing a substantial dataset for exploration. The dataset comprises 49,159 rows and 7 columns, encapsulating a wealth of information related to the Dhaka stock market. It spans the time period from January 2022 to June 2022, encompassing a diverse array of 412 companies. Key features in this dataset include columns such as Date, Name, Open, High, Low, Close, and Volume. This dataset offers a comprehensive perspective on the dynamics of the Dhaka stock market, allowing for in-depth analysis and insights into this sector.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
These libraries are essential components for our analysis.
In this section, the stock market data is imported using the read_csv() function from the pandas library within a .py file in VS Code. Following this, the head() function is employed to showcase the first 5 rows of the dataset, providing an initial overview of the structure and contents of the Dhaka stock market data.
df = pd.read_csv('Stock_Market_Data.csv')
print(df.head())
Output:
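The row, column, and company counts quoted earlier can be checked right after loading. The snippet below is only a sketch: it reads a hypothetical three-row stand-in for Stock_Market_Data.csv (the values and the company names AAA and BBB are made up) so that it runs without the real file.

```python
import io
import pandas as pd

# Hypothetical three-row stand-in for Stock_Market_Data.csv (values are made up)
sample_csv = io.StringIO(
    "Date,Name,Open,High,Low,Close,Volume\n"
    "03-01-2022,AAA,10.0,10.5,9.8,10.2,1000\n"
    "03-01-2022,BBB,20.0,20.4,19.5,19.9,500\n"
    "04-01-2022,AAA,10.2,10.9,10.1,10.8,1500\n"
)
sample = pd.read_csv(sample_csv)

n_rows, n_cols = sample.shape            # (number of rows, number of columns)
n_companies = sample["Name"].nunique()   # number of distinct company names
print(n_rows, n_cols, n_companies)
print(sample.columns.tolist())
```

Run against the real file, the same three calls should report the figures given in the dataset description.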
Verify the data types of all columns using the following command. Ensuring correct data formatting is crucial before analysis. Analyzing data in the wrong format can lead to inaccurate insights or results.
print(df.dtypes)
Output:
We notice an issue with the data type of the Date column: it appears as an object type when it should be in a datetime format. Therefore, we need to convert the object type to datetime. Let's proceed with that conversion.
# Convert 'Date' column to datetime format
df['Date'] = pd.to_datetime(df['Date'],dayfirst=True)
Using pd.to_datetime(), we perform the conversion. We also set dayfirst=True because the dates in the file are written with the day first (%d%m%y). Let's recheck the data types to confirm the successful conversion.
print(df.dtypes)
Output:
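To see why dayfirst=True matters, here is a small sketch on two made-up date strings. The dash-separated layout and the explicit format string "%d-%m-%Y" are assumptions for illustration; the separator used in the real file may differ.

```python
import pandas as pd

raw = pd.Series(["03-01-2022", "04-01-2022"])   # hypothetical day-first strings

month_first = pd.to_datetime(raw)                   # default: read as month-day-year
day_first = pd.to_datetime(raw, dayfirst=True)      # read as day-month-year
explicit = pd.to_datetime(raw, format="%d-%m-%Y")   # assumed layout, stated explicitly

print(month_first.dt.month.tolist())   # months when parsed month-first
print(day_first.dt.month.tolist())     # months when parsed day-first
```

Without dayfirst=True, ambiguous dates such as 03-01-2022 are silently read as March instead of January, which would scramble the time axis of every later chart.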
The code generates essential statistical measures for the df DataFrame using the describe() function. This function delivers crucial metrics including count, mean, standard deviation, as well as minimum and maximum values for each numerical column in the dataset. These statistics offer a comprehensive overview of the data's central tendencies and variability.
print(df.describe())
Output:
Handpicking key companies from the dataset allows for a deeper analysis of their market impact. The code snippet identifies the top 5 companies by their total trading volume.
# Calculate total volume for each company
volume_per_company = df.groupby('Name')['Volume'].sum()
# Get the top 5 companies with the highest total volume
top_5_companies = volume_per_company.nlargest(5).index
print(top_5_companies.to_list())
Output:
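The group, sum, and rank steps are easier to follow on a tiny example. This is a sketch on made-up volumes for three hypothetical companies, using the same calls as the code above.

```python
import pandas as pd

toy = pd.DataFrame({
    "Name": ["AAA", "BBB", "AAA", "CCC", "BBB", "CCC"],
    "Volume": [100, 300, 200, 50, 400, 25],
})

# Total volume per company: AAA=300, BBB=700, CCC=75
totals = toy.groupby("Name")["Volume"].sum()

# Keep the two largest totals and read the company names off the index
top_2 = totals.nlargest(2).index.to_list()
print(top_2)   # ['BBB', 'AAA']
```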
Analyzing the 'Close' price distribution over time for the top 5 companies, this code generates histograms for each company. The visualizations provide insights into the variations in closing prices, facilitating a comparative assessment of their stock performance.
for name in top_5_companies:
    company_data = df[df['Name'] == name]
    plt.figure(figsize=(15, 5))
    sns.histplot(data=company_data, x="Close", bins=30, label=name)
    plt.xlabel("Closing Price Distribution")
    plt.ylabel("Frequency")
    plt.title("Distribution of Close Prices Over Time of {}".format(name))
    plt.legend()
    plt.xticks(rotation=45)
    plt.show()
Outliers beyond the upper whisker in the boxplots of the Volume column indicate days on which trading volume was unusually high compared with the rest of the period.
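The outlier points in a boxplot follow a simple rule. Here is a minimal sketch, assuming the default whisker length of 1.5 times the interquartile range (IQR) used by matplotlib and seaborn, and using made-up volumes with one obvious spike.

```python
import pandas as pd

volume = pd.Series([100, 120, 110, 130, 125, 115, 900])   # made-up volumes, one spike

q1 = volume.quantile(0.25)          # lower quartile
q3 = volume.quantile(0.75)          # upper quartile
iqr = q3 - q1                       # interquartile range
upper_fence = q3 + 1.5 * iqr        # points above this are drawn as outliers

outliers = volume[volume > upper_fence]
print(upper_fence, outliers.tolist())
```

Applied to one company's Volume column, the same filter lists the exact trading days behind the points drawn above the upper whisker.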
Generate line charts depicting the 'Close' prices over time for each selected company, unraveling the historical trends and patterns in their stock performance.
# Loop through the top 5 companies based on volume
for name in top_5_companies:
    # Filter data for each specific company
    company_data = df[df['Name'] == name]
    # Create a separate line chart for each company's 'Close' prices over time
    plt.figure(figsize=(10, 4))
    plt.plot(company_data['Date'], company_data['Close'])
    # Set labels and title for the plot
    plt.xlabel('Date')
    plt.ylabel('Closing Price')
    plt.title(f'Close Prices Over Time for {name}')
    plt.xticks(rotation=45)
    # Show the plot for each company
    plt.show()
These observations can provide insights into the performance of these companies within the given timeframe. It's important to analyze the reasons behind these trends—whether they are influenced by company-specific factors (like earnings reports, market news, or company performance) or broader market conditions—to better understand the drivers behind the stock price movements.
The following code computes the daily percentage change in closing prices and plots it for each of the top 5 companies:
# Calculate daily percentage change for each company and plot individually
for name in top_5_companies:
    plt.figure(figsize=(15, 4))
    # Copy the filtered rows so that adding a column does not raise SettingWithCopyWarning
    company_data = df[df['Name'] == name].copy()
    company_data['Daily_PCT_Change'] = company_data['Close'].pct_change() * 100
    # Plot the daily percentage change for each company
    plt.plot(company_data['Date'], company_data['Daily_PCT_Change'], label=name)
    # Set labels and title for the plot
    plt.xlabel('Date')
    plt.ylabel('Daily Percentage Change')
    plt.title(f'Daily Percentage Change in Closing Prices of {name}')
    plt.legend()
    plt.xticks(rotation=45)
    plt.show()
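As a quick check of what pct_change() returns, here is a sketch on made-up prices for two hypothetical companies. It uses groupby() so that the first row of each company is NaN rather than being compared with the previous company's last price. It assumes the rows are already sorted by date within each company.

```python
import pandas as pd

toy = pd.DataFrame({
    "Name": ["AAA", "AAA", "AAA", "BBB", "BBB"],
    "Close": [100.0, 110.0, 99.0, 50.0, 55.0],
})

# pct_change() gives (current / previous - 1); multiply by 100 for percent.
# Grouping by Name keeps each company's series separate.
toy["Daily_PCT_Change"] = toy.groupby("Name")["Close"].pct_change() * 100
print(toy.round(2))
```

The loop above reaches the same result by filtering one company at a time; the grouped version computes the column for the whole DataFrame in one step.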
The next code snippet creates line charts displaying the stock prices over time for each of the top 5 companies, along with a 30-day rolling average trend line for a smoother representation of the trends.
Showing the actual closing prices together with the smoothed trend line helps identify the general direction of price movements over time for each company. The rolling window size and other visualization settings can be adjusted as needed for better analysis and presentation.
for name in top_5_companies:
    company_data = df[df['Name'] == name]
    plt.plot(company_data['Date'], company_data['Close'], label=name)
    # Plotting a rolling average (e.g., 30 days) for trend visualization
    rolling_avg = company_data['Close'].rolling(window=30).mean()
    plt.plot(company_data['Date'], rolling_avg, label=f'{name} - Trend Line', linestyle='--')
    plt.title('Stock Prices Trend Line Over Time')
    plt.xlabel('Date')
    plt.ylabel('Closing Price')
    plt.legend()
    plt.show()
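One detail worth knowing about rolling(): with window=30 the first 29 values of the trend line are NaN, so the dashed line starts later than the price line. The sketch below shows the same effect with a window of 3 on made-up prices, along with the min_periods option, which is one possible way to fill the gap.

```python
import pandas as pd

close = pd.Series([10.0, 12.0, 14.0, 16.0, 18.0])   # made-up closing prices

strict = close.rolling(window=3).mean()                   # NaN until 3 values are available
lenient = close.rolling(window=3, min_periods=1).mean()   # averages whatever is available

print(strict.tolist())    # [nan, nan, 12.0, 14.0, 16.0]
print(lenient.tolist())   # [10.0, 11.0, 12.0, 14.0, 16.0]
```

Whether to use min_periods is a judgment call: it avoids the blank start, but the early averages are based on fewer observations and are therefore noisier.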
The following code plots the original closing prices alongside 15-day and 30-day moving averages for each of the top 5 companies:
# Loop through the top 5 companies based on volume
for name in top_5_companies:
    plt.figure(figsize=(12, 6))
    company_data = df[df['Name'] == name]
    # Plotting original closing prices
    plt.plot(company_data['Date'], company_data['Close'], label=name)
    # Calculate and plot moving averages (15-day and 30-day)
    moving_avg_15 = company_data['Close'].rolling(window=15).mean()
    moving_avg_30 = company_data['Close'].rolling(window=30).mean()
    plt.plot(company_data['Date'], moving_avg_15, label=f'{name} - 15-day MA', linestyle='--')
    plt.plot(company_data['Date'], moving_avg_30, label=f'{name} - 30-day MA', linestyle='-.')
    # Set labels, title, and legend
    plt.title('Stock Prices with Moving Averages Over Time')
    plt.xlabel('Date')
    plt.ylabel('Closing Price')
    plt.legend()
    plt.xticks(rotation=45)
    # Show the plot
    plt.show()
This code groups the DataFrame by the 'Name' column and then calculates the mean of the 'Close' prices for each group, resulting in the average closing price for each stock using the groupby() function. The output will display the average closing price for all the stocks available in the dataset.
# Calculate average closing price for each stock
average_closing_price = df.groupby('Name')['Close'].mean()
# Display the average closing prices
print(average_closing_price)
Output:
# Calculate average closing price for each stock
average_closing_price = df.groupby('Name')['Close'].mean()
# Sort stocks based on average closing price
sorted_stocks = average_closing_price.sort_values()
# Display top and bottom stocks
print("Top 5 Stocks based on Average Closing Price:")
print(sorted_stocks.tail(5))
print("\nBottom 5 Stocks based on Average Closing Price:")
print(sorted_stocks.head(5))
Output:
Analyzing the rolling standard deviation provides insights into how the variability of each company's closing price changes over time.
# Calculate and plot rolling standard deviation for each of the top 5 companies
for name in top_5_companies:
    company_data = df[df['Name'] == name]
    rolling_std = company_data['Close'].rolling(window=30).std()
    plt.figure(figsize=(12, 6))
    plt.plot(company_data['Date'], rolling_std, label=f'{name} - Rolling Std (30-day)', color='orange')
    plt.title(f'Rolling Standard Deviation of Close Prices for {name} (30-day Window)')
    plt.xlabel('Date')
    plt.ylabel('Standard Deviation')
    plt.legend()
    plt.xticks(rotation=45)
    plt.show()
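For readers who want to verify what rolling(...).std() computes, here is a small sketch on made-up prices. It checks the last window by hand with NumPy, using the sample standard deviation (ddof=1), which is the pandas default.

```python
import numpy as np
import pandas as pd

close = pd.Series([10.0, 12.0, 11.0, 15.0])   # made-up closing prices

# Rolling sample standard deviation over 3 observations (ddof=1 by default)
rolling_std = close.rolling(window=3).std()

# Manual check for the last window: [12, 11, 15]
manual = np.std([12.0, 11.0, 15.0], ddof=1)
print(rolling_std.iloc[-1], manual)
```

Because this measure is in price units, a high-priced stock will tend to show a larger value than a low-priced one, so the charts are best compared within a company rather than across companies.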
# Create a new column for daily price change
df['Daily_Price_Change'] = df['Close'] - df['Open']
# Display the updated DataFrame with the new column
print(df.head())
Output:
# Analyze distribution of daily price changes for top 5 companies
for name in top_5_companies:
    company_data = df[df['Name'] == name]
    plt.figure(figsize=(8, 6))
    plt.hist(company_data['Daily_Price_Change'], bins=30, edgecolor='black')
    plt.title(f'Distribution of Daily Price Changes for {name}')
    plt.xlabel('Daily Price Change')
    plt.ylabel('Frequency')
    plt.grid(axis='y', alpha=0.5)
    plt.show()
# Identify days with the largest price increases
largest_increase = df.nlargest(5, 'Daily_Price_Change')
print("Days with the largest price increases:")
print(largest_increase[['Date', 'Daily_Price_Change']])
# Identify days with the largest price decreases
largest_decrease = df.nsmallest(5, 'Daily_Price_Change')
print("\nDays with the largest price decreases:")
print(largest_decrease[['Date', 'Daily_Price_Change']])
Output:
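Since the DataFrame holds many companies, a date alone does not say which stock moved. The sketch below, on made-up rows for two hypothetical companies, shows one way to include the Name column in the result.

```python
import pandas as pd

toy = pd.DataFrame({
    "Date": pd.to_datetime(["2022-01-03", "2022-01-03", "2022-01-04", "2022-01-04"]),
    "Name": ["AAA", "BBB", "AAA", "BBB"],
    "Open": [10.0, 20.0, 10.5, 19.0],
    "Close": [10.5, 19.0, 13.5, 19.2],
})
toy["Daily_Price_Change"] = toy["Close"] - toy["Open"]

# Keep the company name next to the date so each row is identifiable
cols = ["Date", "Name", "Daily_Price_Change"]
biggest_gain = toy.nlargest(1, "Daily_Price_Change")[cols]
biggest_drop = toy.nsmallest(1, "Daily_Price_Change")[cols]
print(biggest_gain)
print(biggest_drop)
```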
This analysis helps spot peaks in trading activity, indicating potential market events or investor interest. It shows patterns, events, or irregularities that might impact specific stocks or the market as a whole.
for name in top_5_companies:
    company_data = df[df['Name'] == name]
    plt.plot(company_data['Date'], company_data['Volume'], label=name)
    threshold = company_data['Volume'].quantile(0.95)
    high_volume_data = company_data[company_data['Volume'] > threshold]
    plt.scatter(high_volume_data['Date'], high_volume_data['Volume'], color="red", marker='o', label="{} - High Volume Days".format(name))
    plt.title('Trading Volume Over Time with Emphasis on Unusually High Volume Days')
    plt.xlabel('Date')
    plt.ylabel('Trading Volume')
    plt.legend()
    plt.show()
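The 0.95 quantile threshold flags roughly the top 5% of trading days for each company. Here is a sketch on made-up volumes from 1 to 100, where the effect is easy to count.

```python
import pandas as pd

volume = pd.Series(range(1, 101))           # made-up volumes 1..100

threshold = volume.quantile(0.95)           # 95th percentile (linear interpolation)
high_volume = volume[volume > threshold]    # strictly above the threshold

print(threshold, len(high_volume))
```

Raising the quantile (for example to 0.99) marks fewer, more extreme days; lowering it marks more. The 0.95 value is a choice, not a rule.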
# Calculate volatility (daily price range) for each company
df['Volatility'] = df['High'] - df['Low']
# Plot individual correlation heatmaps for each company
for name in top_5_companies:
    company_data = df[df['Name'] == name]
    correlation_matrix = company_data[['Volume', 'Volatility']].corr()
    plt.figure(figsize=(4, 4))
    sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f')
    plt.title(f'Correlation Heatmap of {name}')
    plt.show()
The correlation heatmaps show how trading volume relates to price volatility: a value near 1 signifies a strong positive correlation, while a value near -1 indicates a strong negative correlation. Values closer to 0 suggest a weaker or no linear relationship.
Correlation among the 'Open' & 'High', 'Low' & 'Close' prices
# Iterate over each top company
for company in top_5_companies:
    # Filter data for the current company
    company_data = df[df['Name'] == company]
    # Select columns for correlation analysis
    price_data = company_data[['Open', 'High', 'Low', 'Close']]
    # Calculate correlation matrix
    price_correlation = price_data.corr()
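To make the correlation scale concrete, here is a sketch on made-up columns that sit at the two extremes. The column names x, up, and down are hypothetical and chosen only for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "up": [2.0, 4.0, 6.0, 8.0],      # rises exactly in step with x
    "down": [8.0, 6.0, 4.0, 2.0],    # falls exactly as x rises
})

corr = toy.corr()   # Pearson correlation is the default method
print(corr.round(2))
```

The Open, High, Low, and Close prices of one stock typically move together from day to day, so values close to 1 in the price correlation matrix would not be surprising.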
The resulting correlation matrix can be displayed as a heatmap using the seaborn package.
I express my sincere appreciation to Bohubrihi for their invaluable contributions to this analysis, offering guidance and resources that significantly enriched the project.