Intelligent Memory-Based Obfuscated Malware Detector by Divyansh Singh Parihar

Intelligent Memory-Based Obfuscated Malware Detector

This repository contains the implementation and documentation for the Intelligent Memory-Based Obfuscated Malware Detector, a project developed as part of the Bachelor of Technology degree in Computer Science Engineering at Jaypee Institute of Information Technology, Noida.

Overview

Modern malware frequently uses obfuscation to evade traditional detection systems. This project addresses that problem with a memory-based, explainable detector for obfuscated malware: classifiers trained on features extracted from memory dumps, with SHAP used to explain their decisions.

The system is designed to be lightweight and transparent. It classifies samples from only five RFE-selected features and explains its predictions with SHAP, aiming for high accuracy without sacrificing interpretability.

Features

Memory-Based Analysis: Utilizes memory dumps to detect obfuscated malware.

Explainable AI: Explains decisions using SHAP (SHapley Additive exPlanations).

Lightweight Design: Employs Recursive Feature Elimination (RFE) to select only five key features for detection.

User-Friendly Interface: Built with Streamlit, a Python web app framework, for real-time user interaction.
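
To make the Explainable AI feature more concrete, here is a minimal sketch of the idea behind SHAP: exact Shapley values computed by brute force for a tiny model. It does not use the SHAP library, which relies on faster model-specific algorithms and a background dataset. The scoring function `toy_score`, its coefficients, and the sample and baseline values are all hypothetical and are not taken from the project.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features, so this is
    only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(features):
    # Hypothetical score: linear terms plus one interaction term.
    nproc, avg_threads, ndlls = features
    return 0.02 * nproc + 0.1 * avg_threads + 0.001 * ndlls + 0.0005 * nproc * avg_threads


sample = [60.0, 12.0, 2000.0]    # invented sample to explain
baseline = [40.0, 10.0, 1500.0]  # invented reference point
phi = shapley_values(toy_score, sample, baseline)
```

The contributions in `phi` add up to the difference between the score of the sample and the score of the baseline. That additivity is what lets a SHAP plot split one prediction into per-feature contributions.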

Dataset

The system is tested on the MalMem2022 dataset (CIC-MalMem-2022), which includes 58,596 instances split evenly between benign and malware samples (29,298 each). Features are extracted from memory dumps using the Volatility Framework and cover memory-specific characteristics such as:

Number of running processes.

Average threads per process.

Number of DLLs loaded.
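
A first look at such a table might be sketched with pandas as follows. The column names `pslist.nproc`, `pslist.avg_threads`, `dlllist.ndlls`, and `Class` are assumed to match the published CSV, and the four rows are invented stand-ins for the real 58,596 records; in practice the frame would come from `pd.read_csv` on the downloaded file.

```python
import pandas as pd

# Tiny stand-in for the real dataset; all values are invented for illustration.
df = pd.DataFrame({
    "pslist.nproc": [45, 41, 38, 52],              # number of running processes
    "pslist.avg_threads": [10.5, 11.2, 9.8, 12.1],  # average threads per process
    "dlllist.ndlls": [1694, 1580, 1410, 2011],      # number of DLLs loaded
    "Class": ["Benign", "Malware", "Benign", "Malware"],
})

# Check that the two classes are evenly represented.
class_counts = df["Class"].value_counts()
is_balanced = class_counts.nunique() == 1

# Encode the label as 0 (benign) / 1 (malware) and separate the features.
y = (df["Class"] == "Malware").astype(int)
X = df.drop(columns=["Class"])

# Per-class feature means give a quick sense of which features differ.
feature_means = X.groupby(y).mean()
```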

Technology Stack

Programming Language: Python

Libraries Used:

NumPy: Numerical computations.

Pandas: Data manipulation and analysis.

Matplotlib: Visualization.

Scikit-Learn: Machine learning and evaluation tools.

XGBoost: Gradient boosting algorithms.

SHAP: Explainability of model predictions.

Framework: Streamlit (for UI development).

Algorithms Used

Recursive Feature Elimination (RFE): For feature selection.

Machine Learning Models:

Random Forest

Decision Tree

Gaussian Naive Bayes

Extreme Gradient Boosting (XGBoost)

10-Fold Cross-Validation: To validate model generalization.
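
The combination of RFE and 10-fold cross-validation listed above could look roughly like the following scikit-learn sketch. It is not the project's actual training code. It runs on synthetic data from `make_classification`, assumes a decision tree as the RFE ranking estimator (the source does not say which estimator was used), and leaves out XGBoost to stay within scikit-learn. Placing RFE inside the pipeline means features are re-selected on each training fold, so the held-out fold does not influence the selection.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the memory features (not the real dataset).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, n_redundant=2, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=25, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gaussian Naive Bayes": GaussianNB(),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

mean_accuracy = {}
for name, model in models.items():
    pipeline = Pipeline([
        # Keep only five features, chosen by recursively dropping the weakest one.
        ("rfe", RFE(DecisionTreeClassifier(random_state=0), n_features_to_select=5)),
        ("clf", model),
    ])
    scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
    mean_accuracy[name] = scores.mean()

# Fit RFE once on all data to inspect which five columns survive.
selector = RFE(DecisionTreeClassifier(random_state=0), n_features_to_select=5).fit(X, y)
n_selected = int(selector.support_.sum())
```

`mean_accuracy` then holds one averaged 10-fold score per model, which is the kind of figure used to compare the classifiers.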

Contributors

Divyansh Singh Parihar Ayush Kumar Suraj Prakash Nishant Singh Yash Sharma

Posted Jun 1, 2025

Developed a Memory-Based Explainable Obfuscated Malware Detector using machine learning and explainable AI.

Timeline

Dec 20, 2024 - Dec 22, 2024

Clients

Jaypee Institute of Information Technology