Lukmon Abdulsalam
A synthetic data generation tool for testing software applications and supplying data to data-intensive systems.
Stacks/Tools Used
Python, Streamlit
How I got involved
I was working with a team on a consent management system. Before launch, the product needed load testing and beta testing, so I decided to build a tool that generates synthetic data for that purpose. Around the same time, a friend needed to train an ML model but could not find suitable data, so I showed him the tool, and it proved valuable for his model.
Why the project was complex
The project was moderately complex because data requirements vary by application. That variation led me to build different generation options, although the first version used a set of pre-defined data models.
What I did to drive success
The complexity and technical limitations did not deter me. Instead, I focused on breaking the required data models down into generic models.
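As a rough illustration of what breaking data models into generic models can look like, here is a minimal sketch in Python. It is not the tool's actual code: the field types, the function names (make_int, generate_records, and so on), and the consent_schema example are all hypothetical, chosen only to show how one generic schema format can describe many application-specific data models.

```python
import random
import string
from datetime import date, timedelta

# Hypothetical generic field generators (illustrative names, not the tool's API).
def make_int(rng, low=0, high=100):
    return rng.randint(low, high)

def make_choice(rng, options):
    return rng.choice(options)

def make_string(rng, length=8):
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

def make_date(rng, start=date(2020, 1, 1), days=365):
    # Pick a day offset in [0, days) and return an ISO-formatted date string.
    return (start + timedelta(days=rng.randrange(days))).isoformat()

# Map each generic field type to the function that produces its values.
GENERATORS = {
    "int": make_int,
    "choice": make_choice,
    "string": make_string,
    "date": make_date,
}

def generate_records(schema, count, seed=None):
    """Build `count` synthetic records from a schema of generic field specs."""
    rng = random.Random(seed)  # seeded so test data can be reproduced
    records = []
    for _ in range(count):
        record = {}
        for field, spec in schema.items():
            options = dict(spec)        # copy so the schema is not mutated
            kind = options.pop("type")  # remaining keys are generator arguments
            record[field] = GENERATORS[kind](rng, **options)
        records.append(record)
    return records

# Assumed example: a consent-management data model expressed with generic types.
consent_schema = {
    "user_id": {"type": "int", "low": 1, "high": 10_000},
    "username": {"type": "string", "length": 10},
    "consent_status": {"type": "choice", "options": ["granted", "denied", "withdrawn"]},
    "updated_on": {"type": "date"},
}

rows = generate_records(consent_schema, count=5, seed=42)
```

With this approach, supporting a new application means writing a new schema rather than new generation code, and a fixed seed makes a load-test dataset repeatable.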
The driving factor for success
The driving factor was getting an actual user for the tool: a postgraduate student at an Ivy League university who needed synthetic data for a machine learning model for software intrusion detection.
Conflict that arose
Some conflict arose as the project's implementation and features began to go out of scope.
Lessons learned
Don’t focus the discussion too much on the problem; focus more on the solution.
How I have grown
I am now learning how to incorporate no-code systems into applications.
Project Assets
URL: Project Url
Repository: GitHub Repo