Chhavi Verma
This project focused on conducting an exploratory data analysis (EDA) on red wine quality data. Through extensive analysis, several insights were gained:
Understanding Wine Quality: The project aimed to understand the factors influencing wine quality using various physicochemical attributes.
Dataset Characteristics: The dataset initially comprised 1,599 rows and 12 columns, with "quality" treated as the categorical variable. Most wines were of average quality.
Exploring Relationships: Analysis revealed that alcohol, volatile acidity, sulphates, and citric acid were significantly correlated with wine quality.
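As a rough illustration of how such correlations can be computed, the sketch below implements the Pearson coefficient in plain Python. The rows in `toy` are hypothetical values made up for this example, not records from the actual dataset, and the helper name `pearson` is an assumption of this sketch rather than something used in the project.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy rows (NOT taken from the real wine dataset)
toy = {
    "alcohol": [9.4, 9.8, 10.5, 11.2, 12.0, 12.8],
    "volatile acidity": [0.70, 0.88, 0.60, 0.45, 0.40, 0.30],
    "quality": [5, 5, 6, 6, 7, 7],
}

# Correlation of each attribute with the quality score
correlations = {
    name: pearson(values, toy["quality"])
    for name, values in toy.items()
    if name != "quality"
}

for name, r in correlations.items():
    print(f"{name}: r = {r:+.2f}")
```

A positive coefficient means the attribute tends to rise with quality, and a negative one means it tends to fall. Whether a correlation is statistically significant would need a separate test on the real data.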
Linear Modeling: Linear regression models were built. Alcohol alone explained only 22% of the variance in wine quality, so other variables also played significant roles.
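To make the "share of variance explained" figure concrete, here is a minimal sketch of a single-predictor least-squares fit and its R² value. The data points are hypothetical and the function names `fit_line` and `r_squared` are assumptions of this example, not the project's code. For a one-variable linear model, R² equals the squared Pearson correlation, so an R² of 0.22 corresponds to a correlation of roughly 0.47 in magnitude.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical toy values (NOT taken from the real wine dataset)
alcohol = [9.4, 9.8, 10.5, 11.2, 12.0, 12.8]
quality = [5, 6, 5, 7, 6, 7]

slope, intercept = fit_line(alcohol, quality)
r2 = r_squared(alcohol, quality, slope, intercept)
print(f"quality = {intercept:.2f} + {slope:.2f} * alcohol, R^2 = {r2:.2f}")
```

An R² well below 1 means the single predictor leaves most of the variation unexplained, which is the motivation for adding further variables to the model.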
Multivariate Analysis: When multiple variables were considered simultaneously, alcohol and sulphate concentrations proved crucial for better wine quality.
Conclusion and Future Scope: While the models showed reasonable predictive power, further improvements and explorations were suggested, including alternative modeling techniques, feature selection, and deployment in real-world scenarios.
References: The dataset used was provided by Cortez et al. (2009), and proper citation guidelines were emphasized.