Intelligent Memory-Based Obfuscated Malware Detector


Divyansh Singh Parihar


This repository contains the implementation and documentation for the Intelligent Memory-Based Obfuscated Malware Detector, a project developed as part of the Bachelor of Technology degree in Computer Science Engineering at Jaypee Institute of Information Technology, Noida.

Overview

Modern malware frequently uses obfuscation to evade traditional, signature-based detection. This project addresses that problem with a Memory-Based Explainable Obfuscated Malware Detector: machine learning models classify features extracted from memory dumps, and explainable AI methods show which features drove each prediction.
The system is designed to be lightweight (it relies on only five selected features) and transparent (every prediction comes with a feature-level explanation), while still aiming for high detection accuracy.

Features

Memory-Based Analysis: Utilizes memory dumps to detect obfuscated malware.
Explainable AI: Explains decisions using SHAP (SHapley Additive exPlanations).
Lightweight Design: Employs Recursive Feature Elimination (RFE) to select only five key features for detection.
User-Friendly Interface: Built with Python's Streamlit for real-time user interaction.
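
To make the Explainable AI feature more concrete, the sketch below computes exact Shapley values, the quantity that SHAP estimates, for a single prediction. It is a pure-Python illustration, not the project's code: the function toy_score, its weights, and the instance and baseline values are hypothetical, and the real system would call the SHAP library on a trained model instead.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `instance` relative to `baseline`.

    Features outside a coalition keep their baseline value.
    The cost grows exponentially with the number of features,
    so this is only practical for a handful of inputs.
    """
    n = len(instance)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [
                    instance[j] if j in coalition else baseline[j]
                    for j in range(n)
                ]
                with_i = list(without_i)
                with_i[i] = instance[i]
                # Marginal contribution of feature i to this coalition
                values[i] += weight * (predict(with_i) - predict(without_i))
    return values


def toy_score(x):
    """Hypothetical linear score over three memory features."""
    nproc, avg_threads, ndlls = x
    return 0.5 * nproc + 2.0 * avg_threads + 0.01 * ndlls


instance = [40.0, 12.0, 1500.0]   # made-up sample to explain
baseline = [30.0, 10.0, 1000.0]   # made-up reference point
phi = shapley_values(toy_score, instance, baseline)

for name, value in zip(["processes", "avg threads", "DLLs"], phi):
    print(f"{name}: {value:+.2f}")
```

The contributions always add up to the difference between the score of the instance and the score of the baseline, which is what makes this kind of explanation easy to read: each feature receives a share of the prediction.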

Dataset

The system is evaluated on the MalMem2022 dataset, which contains 58,596 instances split evenly between benign and malware samples (29,298 each). Features are extracted with the Volatility Framework and describe memory-specific characteristics such as:
Number of running processes.
Average threads per process.
Number of DLLs loaded.
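
A first step with a dataset like this is usually to confirm the class balance and separate the features from the label. The sketch below does that with Pandas on a tiny synthetic table. The column names (pslist.nproc, pslist.avg_threads, dlllist.ndlls, Class) are assumed to follow the Volatility plugin naming used in the dataset's CSV, and every value is made up for illustration.

```python
import pandas as pd


def class_balance(df, label_col="Class"):
    """Return the number of rows per class label as a plain dict."""
    return df[label_col].value_counts().to_dict()


# Tiny synthetic stand-in for the real CSV; all values are made up.
# With the real file this would be a pd.read_csv(...) call instead.
sample = pd.DataFrame({
    "pslist.nproc": [45, 41, 38, 52],               # running processes
    "pslist.avg_threads": [10.5, 11.2, 9.8, 12.1],  # average threads per process
    "dlllist.ndlls": [1694, 1502, 1388, 2010],      # DLLs loaded
    "Class": ["Benign", "Malware", "Benign", "Malware"],
})

balance = class_balance(sample)
features = sample.drop(columns=["Class"])  # model inputs only

print(balance)
print(features.describe().loc[["mean", "min", "max"]])
```

On the full dataset, the same check should report equal counts for the two classes, matching the even split described above.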

Technology Stack

Programming Language: Python
Libraries Used:
NumPy: Numerical computations.
Pandas: Data manipulation and analysis.
Matplotlib: Visualization.
Scikit-Learn: Machine learning and evaluation tools.
XGBoost: Gradient boosting algorithms.
SHAP: Explainability of model predictions.
Framework: Streamlit (for UI development).

Algorithms Used

Recursive Feature Elimination (RFE): For feature selection.
Machine Learning Models:
Random Forest
Decision Tree
Gaussian Naive Bayes
Extreme Gradient Boost (XGBoost)
10-Fold Cross-Validation: To validate model generalization.
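
The pieces listed above can be combined roughly as follows. This is a minimal sketch with Scikit-Learn, not the project's actual training script: it uses synthetic data from make_classification in place of the memory features, a decision tree as the RFE ranking estimator (an assumption), and small model settings chosen to keep it fast. XGBoost is left out so the example depends on Scikit-Learn only; xgboost.XGBClassifier follows the same estimator interface and could be added to the models dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the memory-dump features (not the real data)
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5, n_redundant=2,
    random_state=0,
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=25, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gaussian Naive Bayes": GaussianNB(),
}

# 10 stratified folds keep the class ratio the same in every split
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)


def evaluate(model):
    """Run RFE (keep five features) and the classifier inside each fold."""
    pipeline = Pipeline([
        ("select", RFE(DecisionTreeClassifier(random_state=0),
                       n_features_to_select=5)),
        ("classify", model),
    ])
    return cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")


scores = {name: evaluate(model) for name, model in models.items()}
for name, fold_scores in scores.items():
    print(f"{name}: mean accuracy {fold_scores.mean():.3f} "
          f"over {len(fold_scores)} folds")

# Which five columns RFE keeps when fitted on all of the data
selector = RFE(DecisionTreeClassifier(random_state=0),
               n_features_to_select=5).fit(X, y)
selected = [i for i, keep in enumerate(selector.support_) if keep]
print("Selected feature indices:", selected)
```

Placing RFE inside the pipeline means feature selection is repeated on the training part of every fold, so the held-out fold never influences which features are kept. Whether the original project structured it this way is not stated.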

Contributors

Divyansh Singh Parihar
Ayush Kumar
Suraj Prakash
Nishant Singh
Yash Sharma

Posted Jun 1, 2025

Developed a Memory-Based Explainable Obfuscated Malware Detector using machine learning and explainable AI.

Timeline

Dec 20, 2024 - Dec 22, 2024

Clients

Jaypee Institute of Information Technology