Machine Learning Algorithm - Autoencoder-Based Anomaly Detection

Javier Aquique

Data Scientist

Jupyter

Python

TensorFlow

Autoencoder-Based Anomaly Detection for Pills

Project Overview

This project detects anomalies in pill images with an autoencoder-based deep learning model. It uses the MVTec Anomaly Detection (MVTec AD) dataset, which provides defect-free training images and test images of both good and defective pills. The objective was to compare several anomaly detection algorithms and determine the most effective one. The autoencoder performed best among the approaches tried, making it the final choice.

Dataset

The pill images in the MVTec Anomaly Detection dataset are categorized into:
Good (normal pills)
Anomalous (pills with defects)

Preprocessing Steps

Resizing images to (128,128) pixels
Normalization of pixel values (scaling to [0,1])
Augmentation using random brightness, contrast, flips, and rotations
Dataset Splitting:
Training data: only good pills
Testing data: both good and anomalous pills
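
The preprocessing steps above could look roughly like the following sketch. It assumes RGB images loaded as arrays or tensors with values in 0-255; the function names, the augmentation ranges, and the choice of 90-degree rotations are illustrative assumptions, not the project's actual settings.

```python
import tensorflow as tf

IMG_SIZE = (128, 128)  # target size stated in the write-up

def preprocess(image):
    """Resize to 128x128 and scale pixel values from [0, 255] to [0, 1]."""
    image = tf.image.resize(tf.cast(image, tf.float32), IMG_SIZE)
    return image / 255.0

def augment(image):
    """Random brightness, contrast, flips, and rotations.

    All ranges are assumed values chosen for illustration. Rotations are
    restricted to multiples of 90 degrees to keep the sketch simple.
    """
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.image.random_contrast(image, lower=0.9, upper=1.1)
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    k = tf.random.uniform([], minval=0, maxval=4, dtype=tf.int32)
    image = tf.image.rot90(image, k=k)
    # Brightness and contrast changes can leave [0, 1], so clip back.
    return tf.clip_by_value(image, 0.0, 1.0)
```

Augmentation would typically be applied only to the training images (good pills), while test images go through `preprocess` alone.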

Model Architecture

A convolutional autoencoder was built with the following structure:
Encoder:
Four convolutional layers with increasing filters (32, 64, 128, 256)
Max pooling layers for downsampling
Bottleneck:
A convolutional layer with 256 filters
Decoder:
Transposed convolutional layers mirroring the encoder
UpSampling layers for reconstruction
Skip connections to preserve spatial information
Final output with a sigmoid activation to reconstruct images
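
A minimal Keras sketch of an architecture matching this description is shown below. The filter counts, pooling, bottleneck, skip connections, and sigmoid output follow the list above; the 3x3 kernels, ReLU activations, three input channels, and the use of concatenation for the skip connections are assumptions, since the write-up does not specify them.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_autoencoder(input_shape=(128, 128, 3)):
    inputs = layers.Input(shape=input_shape)

    # Encoder: four conv layers (32, 64, 128, 256 filters), each followed
    # by max pooling. Spatial size: 128 -> 64 -> 32 -> 16 -> 8.
    skips = []
    x = inputs
    for filters in (32, 64, 128, 256):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        skips.append(x)  # kept for the decoder's skip connections
        x = layers.MaxPooling2D(2)(x)

    # Bottleneck: a convolutional layer with 256 filters.
    x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)

    # Decoder: upsampling + transposed conv, mirroring the encoder, with
    # the matching encoder feature map concatenated at each resolution.
    for filters, skip in zip((256, 128, 64, 32), reversed(skips)):
        x = layers.UpSampling2D(2)(x)
        x = layers.Conv2DTranspose(filters, 3, padding="same", activation="relu")(x)
        x = layers.Concatenate()([x, skip])

    # Sigmoid keeps reconstructed pixel values in [0, 1].
    outputs = layers.Conv2D(input_shape[-1], 3, padding="same", activation="sigmoid")(x)
    return Model(inputs, outputs, name="pill_autoencoder")
```

Calling `build_autoencoder().summary()` prints the layer shapes, which is a quick way to confirm that the decoder restores the 128x128 input resolution.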

Training

Loss Function: A combination of Mean Absolute Error (MAE) and Structural Similarity Index (SSIM)
Optimizer: Adam
Epochs: 20
Validation on Test Data during training
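
One plausible way to combine MAE and SSIM into a single loss is a weighted sum of the MAE and the SSIM dissimilarity (1 - SSIM). The sketch below assumes images scaled to [0, 1]; the weight `ALPHA` and the function name are hypothetical, as the write-up does not state how the two terms were balanced.

```python
import tensorflow as tf

ALPHA = 0.5  # assumed weighting between the MAE and SSIM terms

def mae_ssim_loss(y_true, y_pred):
    """Weighted sum of per-image MAE and (1 - SSIM), averaged over the batch."""
    mae = tf.reduce_mean(tf.abs(y_true - y_pred), axis=[1, 2, 3])
    # tf.image.ssim returns one similarity value per image in the batch.
    ssim = tf.image.ssim(y_true, y_pred, max_val=1.0)
    per_image = ALPHA * mae + (1.0 - ALPHA) * (1.0 - ssim)
    return tf.reduce_mean(per_image)
```

With a Keras model, this function can be passed as `loss=mae_ssim_loss` to `model.compile` together with `optimizer="adam"`, and training for 20 epochs then corresponds to `model.fit(..., epochs=20)` with the input images also serving as targets.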

Anomaly Detection Methodology

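Reconstruction Error Calculation: The trained autoencoder reconstructs each image, and the reconstruction error is derived from the Structural Similarity Index (SSIM) between the input and its reconstruction; a lower SSIM means a larger error.
Threshold Selection: The ROC curve, computed from the test-set errors and their labels, is used to determine the optimal threshold for anomaly detection.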
Classification:
If the reconstruction error exceeds the threshold, the pill is classified as anomalous.
Otherwise, it is classified as good.
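
The scoring and thresholding steps can be sketched with NumPy as follows. The write-up does not say how the "optimal" point on the ROC curve was chosen, so this sketch assumes Youden's J statistic (the threshold maximizing TPR - FPR); it also assumes the error is defined as 1 - SSIM. All function names are illustrative.

```python
import numpy as np

def anomaly_scores(ssim_values):
    """Turn per-image SSIM (higher = more similar) into an error score."""
    return 1.0 - np.asarray(ssim_values, dtype=float)

def best_threshold(labels, scores):
    """Pick the threshold maximizing TPR - FPR (Youden's J, an assumption).

    labels: 1 = anomalous, 0 = good. Candidate thresholds are midpoints
    between consecutive distinct scores, so no score equals a candidate.
    """
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    distinct = np.unique(scores)  # sorted ascending
    candidates = (distinct[:-1] + distinct[1:]) / 2.0
    pred = scores[None, :] > candidates[:, None]  # (n_candidates, n_samples)
    tpr = (pred & labels).sum(axis=1) / labels.sum()
    fpr = (pred & ~labels).sum(axis=1) / (~labels).sum()
    return candidates[np.argmax(tpr - fpr)]

def classify(scores, threshold):
    """1 (anomalous) if the error exceeds the threshold, else 0 (good)."""
    return (np.asarray(scores) > threshold).astype(int)
```

For example, `classify(scores, best_threshold(labels, scores))` yields the per-image predictions that feed the evaluation in the next section.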

Performance Evaluation

Classification Report

              precision    recall  f1-score   support

        Good       0.37      0.73      0.49        26
   Anomalous       0.94      0.77      0.84       141

    accuracy                           0.76       167
   macro avg       0.65      0.75      0.67       167
weighted avg       0.85      0.76      0.79       167

Confusion Matrix

A confusion matrix was plotted to visualize the model's performance in distinguishing good and anomalous pills.
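
As a rough illustration of how the confusion matrix relates to the classification report, the sketch below computes both from labels and predictions using NumPy only. The function names are illustrative, and the counts in the usage note are inferred from the rounded report values, not read from the project's plot.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=2):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def per_class_metrics(cm):
    """Precision, recall, and F1 per class from a confusion matrix.

    Assumes every class is predicted at least once and has at least one
    true sample; otherwise the divisions below produce NaN.
    """
    tp = np.diag(cm).astype(float)
    precision = tp / cm.sum(axis=0)  # column sums = predicted counts
    recall = tp / cm.sum(axis=1)     # row sums = true counts (support)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

One matrix consistent with the report above, with rows and columns ordered (Good, Anomalous), is `[[19, 7], [33, 108]]`: passing it to `per_class_metrics` reproduces the reported precision, recall, and F1 values after rounding, and (19 + 108) / 167 gives the reported 0.76 accuracy.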

Visualization

Original vs. Reconstructed Images: Samples were displayed to analyze how well the autoencoder reconstructs images.
Confusion Matrix: Provided insights into model predictions.

Conclusion

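The autoencoder identified most anomalies, reaching 76% accuracy on the 167 test images.
For the anomalous class, the model achieved 0.94 precision and 0.77 recall, which makes it a reasonable starting point for defect detection. Precision on the good class was low (0.37), so a sizeable share of the pills predicted as good were actually defective.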
Potential improvements include refining the architecture, using more advanced loss functions, and combining with other anomaly detection techniques.

Future Work

Experiment with different autoencoder architectures (e.g., Variational Autoencoders)
Use additional datasets to improve generalization
Deploy the model as a real-time anomaly detection system

This project serves as a proof of concept, demonstrating how deep learning can be leveraged for automated quality inspection in pharmaceutical manufacturing.
Posted Feb 9, 2025

Autoencoder detects pill anomalies with 76% accuracy using SSIM-based reconstruction errors. Optimized via ROC curve, future work aims for real-time deployment.
