Data generation System by Lukmon Abdulsalam

Data generation System

Lukmon Abdulsalam

Backend Engineer

Fullstack Engineer

Data Engineer

Python

A synthetic data generation tool for testing software applications and data-intensive systems.

Stacks/Tools Used

Python, Streamlit

How I got involved

I was working with a team on a consent management system. Before launch, the product needed load testing and beta testing, so I decided to build a tool to generate synthetic data for that purpose. A friend of mine also needed to train an ML model but could not find data to use, so I showed him this tool, which proved valuable for his model.

Why was the project complex

The project was moderately complex because data requirements vary by application. This gave rise to different generation options, although the first version used some pre-defined data models.

What I did to drive success

The complexity and technical limitations did not deter me. Instead, I focused on breaking down the required data models into generic models.
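
To illustrate what breaking data models into generic models can look like, here is a minimal sketch in Python. It is not the tool's actual code: the names FIELD_GENERATORS, generate_rows and consent_schema, the set of supported field types, and the value ranges are all assumptions made for this example. The idea is that an application-specific model is described as a mapping from field names to a small set of generic types, and one generator handles any such mapping.

```python
import random
import string

# Hypothetical generic field types; each generator takes a random.Random
# instance and returns one value. The ranges here are arbitrary assumptions.
FIELD_GENERATORS = {
    "int": lambda rng: rng.randint(0, 1000),
    "float": lambda rng: round(rng.uniform(0, 1), 4),
    "bool": lambda rng: rng.random() < 0.5,
    "string": lambda rng: "".join(rng.choices(string.ascii_lowercase, k=8)),
}


def generate_rows(schema, n, seed=None):
    """Return n synthetic rows for a schema mapping field name -> generic type."""
    unknown = set(schema.values()) - set(FIELD_GENERATORS)
    if unknown:
        raise ValueError(f"unsupported field types: {sorted(unknown)}")
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    return [
        {name: FIELD_GENERATORS[kind](rng) for name, kind in schema.items()}
        for _ in range(n)
    ]


# Hypothetical data model for a consent record, expressed with generic types.
consent_schema = {
    "user_id": "int",
    "email_hash": "string",
    "consent_given": "bool",
}

rows = generate_rows(consent_schema, 3, seed=42)
for row in rows:
    print(row)
```

With this shape, supporting a new application means writing a new schema dictionary rather than new generation code, and passing a seed lets a load test be repeated on identical data.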

The driving factor for success

The driving factor for the project was getting an actual user of the tool: a postgraduate student at an Ivy League university who needed synthetic data to train a machine learning model for software intrusion detection.