This project carried out an exploratory data analysis (EDA) of red wine quality data. The main findings were:
Understanding Wine Quality: The project aimed to understand the factors influencing wine quality using various physicochemical attributes.
Dataset Characteristics: The dataset initially comprised 1,599 rows and 12 columns, with "quality" as the outcome variable, treated as categorical. Most wines were of average quality (scores of 5 or 6 on the 0 to 10 scale).
Exploring Relationships: Analysis revealed that alcohol, volatile acidity, sulphates, and citric acid were significantly correlated with wine quality.
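As a rough illustration of how such a correlation can be checked, the sketch below computes Pearson's correlation coefficient in plain Python. The function name pearson_r and the sample values are assumptions made for this example; the numbers are toy data, not rows from the wine dataset, and the project itself may have used a different tool.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up toy values for illustration only (not taken from the dataset).
alcohol = [9.4, 9.8, 10.5, 11.2, 12.0, 12.8]
quality = [5, 5, 6, 6, 7, 7]

r = pearson_r(alcohol, quality)
print(f"Pearson r between alcohol and quality (toy data): {r:.3f}")
```

A value near +1 or -1 indicates a strong linear relationship and a value near 0 a weak one; the sign shows whether quality tends to rise or fall as the attribute increases.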
Linear Modeling: Linear regression models were built. Alcohol alone explained only about 22% of the variance in wine quality, so other variables also contributed substantially.
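To make the "share of variance explained" concrete, here is a minimal sketch of a one-predictor least-squares fit and its R² value in plain Python. The helper names fit_simple_ols and r_squared and the sample values are hypothetical, chosen for this example; the numbers are toy data, not the project's data or results.

```python
def fit_simple_ols(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up toy values for illustration only (not taken from the dataset).
alcohol = [9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5]
quality = [5, 6, 5, 5, 6, 7, 5, 7]

slope, intercept = fit_simple_ols(alcohol, quality)
r2 = r_squared(alcohol, quality, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.3f}")
```

With a single predictor, R² equals the square of the Pearson correlation, so an R² of about 0.22 corresponds to a correlation of roughly 0.47 in magnitude.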
Multivariate Analysis: When multiple variables were considered together, higher alcohol and sulphate concentrations were most strongly associated with better wine quality.
Conclusion and Future Scope: While the models showed reasonable predictive power, further improvements and explorations were suggested, including alternative modeling techniques, feature selection, and deployment in real-world scenarios.
References: The dataset used was provided by Cortez et al. (2009), and proper citation guidelines were emphasized.
This project provided valuable insights into red wine quality determinants, highlighting the complexity of wine quality assessment and suggesting future avenues