Machine Learning Algorithm - Autoencoder-Based Anomaly Detection

Javier Aquique

Data Scientist

Jupyter

Python

TensorFlow

Autoencoder-Based Anomaly Detection for Pills

Project Overview

This project focuses on detecting anomalies in pill images using an autoencoder-based deep learning model. The dataset used is the MVTec Anomaly Detection Dataset, which provides defect-free pill images for training and a test set containing both good and defective pills. The objective was to explore different anomaly detection algorithms and determine the most effective one. The autoencoder model demonstrated the best performance, making it the final choice.

Dataset

The pill category of the MVTec Anomaly Detection Dataset contains images labeled as:

Good (normal pills)

Anomalous (pills with defects)

Preprocessing Steps

Resizing images to (128,128) pixels

Normalization of pixel values (scaling to [0,1])

Augmentation using random brightness, contrast, flips, and rotations

Dataset Splitting:

Training data: only good pills

Testing data: both good and anomalous pills
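The preprocessing steps above could be sketched with tf.image as follows. This is a minimal sketch, not the project's actual pipeline: the function names preprocess and augment, the brightness and contrast ranges, and the restriction to 90-degree rotations are assumptions made for illustration.

```python
import tensorflow as tf

IMG_SIZE = (128, 128)  # target size stated in the write-up


def preprocess(image):
    """Resize an 8-bit image to 128x128 and scale pixel values to [0, 1]."""
    image = tf.image.resize(image, IMG_SIZE)  # bilinear resize, returns float32
    return tf.clip_by_value(image / 255.0, 0.0, 1.0)


def augment(image):
    """Random brightness, contrast, flips and rotation (ranges are assumed)."""
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.image.random_contrast(image, lower=0.9, upper=1.1)
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    # Rotate by a random multiple of 90 degrees (assumed; the source does not
    # say which rotation angles were used).
    k = tf.random.uniform([], minval=0, maxval=4, dtype=tf.int32)
    image = tf.image.rot90(image, k=k)
    # Brightness and contrast changes can leave [0, 1], so clip again.
    return tf.clip_by_value(image, 0.0, 1.0)
```

In a setup like this, augment would be applied only to the good training images, while the test images would pass through preprocess alone.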

Model Architecture

A convolutional autoencoder was built with the following structure:

Encoder:

Four convolutional layers with increasing filters (32, 64, 128, 256)

Max pooling layers for downsampling

Bottleneck:

A convolutional layer with 256 filters

Decoder:

Transposed convolutional layers mirroring the encoder

UpSampling layers for reconstruction

Skip connections to preserve spatial information

Final output with a sigmoid activation to reconstruct images
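One way the structure described above might look in Keras is sketched below. The filter counts, pooling, upsampling, transposed convolutions, skip connections, and sigmoid output come from the description; the 3x3 kernels, ReLU activations, RGB input, one pooling step per encoder layer, and the use of concatenation for the skip connections are assumptions, and build_autoencoder is a hypothetical name.

```python
import tensorflow as tf
from tensorflow.keras import Model, layers


def build_autoencoder(input_shape=(128, 128, 3)):
    """Convolutional autoencoder with skip connections (illustrative sketch)."""
    inputs = layers.Input(shape=input_shape)
    x = inputs
    skips = []

    # Encoder: four conv layers with increasing filters, each followed by pooling.
    for filters in (32, 64, 128, 256):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        skips.append(x)                 # keep pre-pooling features for the decoder
        x = layers.MaxPooling2D(2)(x)   # 128 -> 64 -> 32 -> 16 -> 8

    # Bottleneck: a convolutional layer with 256 filters.
    x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)

    # Decoder: mirror the encoder with upsampling and transposed convolutions.
    for filters, skip in zip((256, 128, 64, 32), reversed(skips)):
        x = layers.UpSampling2D(2)(x)
        x = layers.Conv2DTranspose(filters, 3, padding="same", activation="relu")(x)
        x = layers.Concatenate()([x, skip])  # skip connection (assumed: concat)

    # Sigmoid keeps the reconstruction in [0, 1], matching the normalized input.
    outputs = layers.Conv2D(input_shape[-1], 3, padding="same", activation="sigmoid")(x)
    return Model(inputs, outputs, name="conv_autoencoder")
```

Calling build_autoencoder().summary() prints the layer shapes, which is a quick way to confirm that the output has the same 128 x 128 size as the input.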

Training

Loss Function: A combination of Mean Absolute Error (MAE) and Structural Similarity Index (SSIM)

Optimizer: Adam

Epochs: 20

Validation: the test data was used as the validation set during training
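A combined MAE and SSIM loss could be written roughly as below. The source does not state how the two terms were weighted or how SSIM was turned into a loss, so the equal weighting (alpha=0.5), the 1 - SSIM form, and the name mae_ssim_loss are assumptions; max_val=1.0 follows from the [0, 1] pixel scaling.

```python
import tensorflow as tf


def mae_ssim_loss(y_true, y_pred, alpha=0.5):
    """Weighted sum of MAE and (1 - SSIM); the weighting alpha is assumed."""
    mae = tf.reduce_mean(tf.abs(y_true - y_pred))
    # tf.image.ssim returns one similarity value per image in the batch.
    ssim = tf.reduce_mean(tf.image.ssim(y_true, y_pred, max_val=1.0))
    # SSIM is 1 for identical images, so 1 - SSIM acts as a dissimilarity term.
    return alpha * mae + (1.0 - alpha) * (1.0 - ssim)
```

With a loss like this, training as described would amount to model.compile(optimizer="adam", loss=mae_ssim_loss) followed by model.fit(...) with epochs=20.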

Anomaly Detection Methodology

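Reconstruction Error Calculation: The trained autoencoder reconstructs each image, and the reconstruction error is derived from the Structural Similarity Index (SSIM) between the original and its reconstruction. Because SSIM measures similarity, a lower SSIM corresponds to a higher error.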

Threshold Selection: The ROC Curve is used to determine the optimal threshold for anomaly detection.

Classification:

If the reconstruction error exceeds the threshold, the pill is classified as anomalous.

Otherwise, it is classified as good.
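The thresholding and classification steps above can be illustrated with a small NumPy sketch. It assumes each image already has an anomaly score where higher means more anomalous (for example 1 - SSIM), and it picks the threshold that maximizes Youden's J (true positive rate minus false positive rate). The source only says the ROC curve was used, so that criterion, the function names, and the example scores are all assumptions.

```python
import numpy as np


def youden_threshold(scores, labels):
    """Return the score threshold that maximizes TPR - FPR (Youden's J).

    scores: anomaly scores, higher means more anomalous.
    labels: 1 for anomalous, 0 for good.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_t, best_j = None, -1.0
    for t in np.unique(scores):          # each observed score is a candidate
        pred = scores > t                # "exceeds the threshold" -> anomalous
        tpr = np.sum(pred & (labels == 1)) / np.sum(labels == 1)
        fpr = np.sum(pred & (labels == 0)) / np.sum(labels == 0)
        if tpr - fpr > best_j:
            best_t, best_j = float(t), tpr - fpr
    return best_t


def classify(scores, threshold):
    """Label a pill anomalous if its score exceeds the threshold, else good."""
    return ["anomalous" if s > threshold else "good" for s in scores]


# Made-up scores for illustration only (not results from the project).
scores = [0.05, 0.10, 0.12, 0.40, 0.55, 0.60]
labels = [0, 0, 0, 1, 1, 1]
threshold = youden_threshold(scores, labels)
predictions = classify(scores, threshold)
```

Other criteria, such as fixing a maximum false positive rate, would select a different point on the same ROC curve.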

Performance Evaluation

Classification Report

               precision    recall  f1-score   support

        Good       0.37      0.73      0.49        26
   Anomalous       0.94      0.77      0.84       141

    accuracy                           0.76       167
   macro avg       0.65      0.75      0.67       167
weighted avg       0.85      0.76      0.79       167

Confusion Matrix

A confusion matrix was plotted to visualize the model's performance in distinguishing good and anomalous pills.
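The plotted matrix is not reproduced here, but the arithmetic behind the classification report can be shown with a short worked example. The four counts below are not taken from the project: they are inferred from the reported support (26 good, 141 anomalous) and the rounded scores, and should be read as an assumption that happens to reproduce the reported figures.

```python
# (true label, predicted label) -> count; inferred from the report, not sourced.
counts = {
    ("good", "good"): 19,
    ("good", "anomalous"): 7,
    ("anomalous", "good"): 33,
    ("anomalous", "anomalous"): 108,
}


def per_class(label):
    """Precision, recall and F1 for one class, computed from the counts."""
    tp = counts[(label, label)]
    predicted = sum(n for (_, pred), n in counts.items() if pred == label)
    actual = sum(n for (true, _), n in counts.items() if true == label)
    precision = tp / predicted
    recall = tp / actual
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


total = sum(counts.values())
accuracy = (counts[("good", "good")] + counts[("anomalous", "anomalous")]) / total

for label in ("good", "anomalous"):
    p, r, f1 = per_class(label)
    print(f"{label:>9}  precision={p:.2f}  recall={r:.2f}  f1={f1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Under these assumed counts, 33 defective pills are predicted as good, which is what pulls precision for the good class down to 0.37 even though its recall is 0.73.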

Visualization

Original vs. Reconstructed Images: Samples were displayed to analyze how well the autoencoder reconstructs images.

Confusion Matrix: Showed how many good and anomalous pills were classified correctly and how many were confused with the other class.

Conclusion

The autoencoder identified anomalies with 76% overall accuracy on the 167 test images.

The model achieved 77% recall and 94% precision on anomalous pills, supporting its use for defect detection, although precision on good pills was low (37%).

Potential improvements include refining the architecture, using more advanced loss functions, and combining with other anomaly detection techniques.

Future Work

Experiment with different autoencoder architectures (e.g., Variational Autoencoders)

Use additional datasets to improve generalization

Deploy the model as a real-time anomaly detection system

This project serves as a proof of concept, demonstrating how deep learning can be leveraged for automated quality inspection in pharmaceutical manufacturing.


Posted Feb 9, 2025

An autoencoder detects pill anomalies with 76% accuracy using SSIM-based reconstruction errors. The decision threshold is chosen via the ROC curve, and future work aims for real-time deployment.
