Query a Digital Music Store Database

Youssef Maher

Data Analyst
Data Engineer
Data Visualizer
Microsoft Excel
PostgreSQL

Table of Contents

Project Details

This project is mainly about querying a music store an then use the output of the queries and transform it into a CSV file and take that sheet into excel to get some visually appealing results from some very basic charts yet very meaningful.

So much important concepts from SQL where used in the project.

I used PostgreSQL and used the Entity Relationship Diagram (ERD) of the database to complete the project.

Entity Relationship Diagram (ERD) of the database

Queries and Visuals

Query #1

SELECT Track.Name AS Track_Name,
       Genre.Name AS Genre_Name,
       Milliseconds/60000 AS Minutes
FROM Track
JOIN Genre ON Genre.GenreId = Track.GenreId
WHERE Milliseconds >
    (SELECT AVG(Milliseconds)
     FROM Track)
ORDER BY 3 DESC
total number of minutes of sold genres in the store

We can notice that the longest (by total minutes) genres are related to movies and TV shows like Science Fiction, Sci-Fi & Fantasy, Drama, and TV Shows.

Another thing that can be noticed is that even though a genre like Rock has a much greater number of tracks than a genre like Sci-Fi & Fantasy and Drama, the total minutes for them is higher as their tracks tend to be longer than Rock tracks.

Query #2

SELECT T1.Album_Name,
       Sum(T1.Track_price) AS Total_Money
FROM
  (SELECT T.AlbumId,
          A.Title AS ALbum_Name,
          IL.TrackId,
          SUM(IL.UnitPrice * IL.Quantity) AS Track_price
   FROM InvoiceLine IL
   JOIN Track T ON T.TrackId = IL.TrackId
   JOIN Album A ON A.AlbumId = T.AlbumId
   GROUP BY 3) T1
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10
Most sold albums from the store

Using Query 2 to get the most sold albums by the store and then limiting the results to only 10 raws to just see clearly by a bar chart the 10 most sold.

We can clearly notice that the “Battlestar Galactica (Classic), Season 1“ album is the most sold by the store with a total earning of 35.82 $.

Query #3

SELECT M.Name AS Media_Type,
       Count(T.TrackId) AS Track_Counts
FROM Track T
JOIN MediaType M ON M.MediaTypeId = T.MediaTypeId
GROUP BY 1
ORDER BY 2 DESC

Media Type of Sold Tracks

From this Pie Chart, we notice two things

  1. MPEG audio file is the most popular media type used by more than 85 % of the customers.
  2. The AAC audio file and Purchased AAC audio files are rarely ordered from the store and they can be considered outliers.

Query #4

SELECT T1.Genre_Name,
       Sum(T1.Track_price) AS Total_Money
FROM
  (SELECT T.AlbumId,
          G.Name AS Genre_Name,
          IL.TrackId,
          SUM(IL.UnitPrice * IL.Quantity) AS Track_price
   FROM InvoiceLine IL
   JOIN Track T ON T.TrackId = IL.TrackId
   JOIN Genre G ON G.GenreId = T.GenreId
   GROUP BY 3) T1
GROUP BY 1
ORDER BY 2 DESC
Best-Selling Genres

We can see that Rock is the most sold genre and hence the most popular one too. We can also notice that more than 60 % of the earnings of the store come from the top 4 genres (Rock, Latin, Metal, and Alternative & Punk).



Partner With Youssef
View Services

More Projects by Youssef