Freelancers using XGBoost


Sanket Sabharwal, PhD

Genoa, Italy

Senior Software & ML Engineer | Zero to One Product Builder

$50k+: Earned

6x: Hired

5.0: Rating

45: Followers

Top

Expert


Machine Learning for Sports Betting - NCAA College Basketball

1

10

Web Scraping Systems - Large-Scale Data Extraction Pipelines

3

44

BI Dashboards - Retail Analytics & Forecasting

1

33

Computer Vision for Manufacturing - Defect Detection & QA

2

37

Dubai, United Arab Emirates

Automation Engineer - n8n + Supabase + Codex, OpenClaw 🚀

$10k+: Earned

6x: Hired

5.0: Rating

44: Followers

expert


ML Evaluation Infrastructure for Fraud Detection

0

8

IOS Risk Data Foundry — Domain-Specific Financial Risk AI

1

9

OpenClaw Setup: AI-Driven Automation System for Credit Startup

1

13

High-Volume Call Routing & Reporting System

Designed and implemented an end-to-end calling infrastructure that integrates call data sources, automates daily performance reporting, and powers real-time visualization dashboards. Built a phone-number rotation system capable of cycling through thousands of Twilio numbers to support 10,000–50,000 outbound calls per day while staying within per-number limits, ensuring scalable, compliant, and reliable high-volume calling operations.
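The rotation idea described above can be illustrated with a minimal sketch. This is not the author's implementation: the class name `NumberRotator`, the `daily_cap` value, and the sample numbers are hypothetical, a fixed daily cap per number is assumed, and no Twilio API call is made.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Optional


class NumberRotator:
    """Round-robin over caller IDs, skipping numbers that hit a daily cap."""

    def __init__(self, numbers: Iterable[str], daily_cap: int) -> None:
        self._queue: Deque[str] = deque(numbers)
        self._daily_cap = daily_cap  # assumed per-number limit
        self._used: Dict[str, int] = {n: 0 for n in self._queue}

    def next_number(self) -> Optional[str]:
        """Return the next number with remaining capacity, or None if all are exhausted."""
        for _ in range(len(self._queue)):
            number = self._queue[0]
            self._queue.rotate(-1)  # move the head to the back for the next call
            if self._used[number] < self._daily_cap:
                self._used[number] += 1
                return number
        return None

    def reset_day(self) -> None:
        """Clear usage counters at the start of a new day."""
        for number in self._used:
            self._used[number] = 0


# Hypothetical pool of three numbers, each allowed two calls per day.
rotator = NumberRotator(["+15550000001", "+15550000002", "+15550000003"], daily_cap=2)
assigned = [rotator.next_number() for _ in range(7)]
```

With three numbers and a cap of two, the first six requests cycle through the pool twice and the seventh returns `None`, which a caller could treat as a signal to pause dialing or add numbers.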

0

215

I’m an AI & Machine Learning engineer with expertise in deve

I’m excited to share AuditFlow AI, an AI-powered continuous auditing platform built specifically for Chartered Accountants and audit firms. CA practices today are drowning in manual sampling, 40–60 hour audit cycles, talent shortages, and rising client pressure for faster delivery with lower fees. Most frauds and GST/TDS errors go undetected until the assessment stage because traditional methods check only 2–5% of transactions. AuditFlow AI changes that completely: upload any ledger, Excel, or CSV file and in under 10 seconds it scans 100% of transactions and flags duplicates, round-figure entries, weekend fraud, high-value anomalies, and vendor loops, with plain-English AI explanations for every red flag. Tech stack: Python, Flask, XGBoost, Isolation Forest, scikit-learn, and Bootstrap 5, trained on 5,000+ synthetic and real-world patterns.
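Three of the rule-based checks named above (duplicates, round-figure entries, weekend postings) can be sketched with the standard library alone. This is an illustration, not AuditFlow AI's code: the `Txn` fields, the `round_multiple` default, and the sample ledger are assumptions, and the XGBoost and Isolation Forest parts are not shown.

```python
from collections import Counter
from datetime import date
from typing import Dict, List, NamedTuple


class Txn(NamedTuple):
    txn_id: str
    posted: date
    vendor: str
    amount: float


def flag_transactions(txns: List[Txn], round_multiple: int = 1000) -> Dict[str, List[str]]:
    """Return a mapping of transaction id -> names of the rules that fired."""
    # Assumed definition of a duplicate: same date, vendor, and amount seen more than once.
    key_counts = Counter((t.posted, t.vendor, t.amount) for t in txns)
    flags: Dict[str, List[str]] = {}
    for t in txns:
        reasons = []
        if key_counts[(t.posted, t.vendor, t.amount)] > 1:
            reasons.append("duplicate")
        if t.amount > 0 and t.amount % round_multiple == 0:
            reasons.append("round_figure")
        if t.posted.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            reasons.append("weekend")
        if reasons:
            flags[t.txn_id] = reasons
    return flags


# Made-up ledger rows for illustration only.
ledger = [
    Txn("T1", date(2024, 3, 4), "Acme", 1250.50),
    Txn("T2", date(2024, 3, 4), "Acme", 1250.50),
    Txn("T3", date(2024, 3, 9), "Globex", 50000.0),
    Txn("T4", date(2024, 3, 5), "Initech", 799.99),
]
flags = flag_transactions(ledger)
```

Returning the rule names per transaction keeps each flag explainable, which is the property the post emphasizes.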

1

74

Everyone's building AR filters and calling it "computer vision magic." Almost nobody's asking what's actually happening underneath — that most of these effects are just clever masking, not detection. Here's proof. I built an invisibility cloak that runs entirely in the browser, no green screen, no chroma key, no model training. https://github.com/AnuragNagare/Ghost-frame

4

2

191

What your attention heatmap isn't telling you

Everyone's staring at attention heatmaps and calling it "interpretability." Almost nobody's asking whether a single attention map actually tells you what the model used to make its decision. It doesn't. Not on its own.

A raw attention map from layer 8 shows you what layer 8 attended to. It says nothing about how that signal got mixed, diluted, or overwritten by every layer before and after it. Attention rollout fixes this, and I built a walkthrough to show why it matters.

Here's what makes it more than a "pretty heatmap" demo: instead of visualizing one layer's attention, I traced how information actually flows through the full transformer stack.

→ Every layer's attention matrix is extracted, per head, per token
→ Multi-head attention is averaged, then combined with the residual connection (identity + attention). This is the step most tutorials skip, and it's the one that actually matters
→ The combined matrices are matrix-multiplied layer by layer, rolling attention forward from input to output
→ The result: a single map showing genuine token-to-token influence across the entire network, not just one layer's snapshot

The overlay shows you everything:

→ Per-layer attention vs. rolled-out attention, side by side
→ Token importance scores overlaid directly on the input text
→ A comparison view: which tokens raw attention says "matter" vs. which ones rollout says actually matter
→ Head-level breakdown so you can see which heads specialize vs. which are noise

No black box. No "trust me, the model looked here." Just linear algebra, applied honestly across every layer instead of cherry-picking one.

Built with PyTorch + HuggingFace Transformers + Matplotlib. Runs on any pretrained transformer, fully offline.

⚠️ Important: attention rollout is an approximation, not ground truth. It assumes attention is the primary information pathway, which ignores MLP layers and can still mislead for very deep models. Treat it as a debugging lens, not proof of causality.
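The three steps listed in the post (average the heads, add the identity for the residual connection, multiply layer by layer) can be sketched in plain Python. This is not the author's PyTorch walkthrough: the row re-normalisation after adding the identity is an assumption taken from the common formulation of rollout, and the attention values are made up.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def attention_rollout(layers: List[List[Matrix]]) -> Matrix:
    """layers[l][h] is the (tokens x tokens) attention matrix of head h in layer l."""
    n = len(layers[0][0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for heads in layers:
        # 1) Average the heads of this layer.
        avg = [[sum(h[i][j] for h in heads) / len(heads) for j in range(n)] for i in range(n)]
        # 2) Add the identity for the residual connection, then re-normalise each row
        #    (assumed step, so every row still sums to 1).
        mixed = [[avg[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
        mixed = [[v / sum(row) for v in row] for row in mixed]
        # 3) Roll forward: this layer's matrix multiplies the rollout accumulated so far.
        rollout = matmul(mixed, rollout)
    return rollout


# Two layers, two heads, three tokens (made-up numbers; each row sums to 1).
layer_1 = [
    [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.1, 0.1, 0.8]],
    [[0.6, 0.3, 0.1], [0.4, 0.4, 0.2], [0.3, 0.3, 0.4]],
]
layer_2 = [
    [[0.5, 0.25, 0.25], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]],
    [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.2, 0.4, 0.4]],
]
result = attention_rollout([layer_1, layer_2])
```

Row `i` of `result` can be read as how much each input token contributes to position `i` after both layers, under the same approximation the post warns about.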

0

59

Everyone's racing to add biometrics to logins. Almost nobody's asking what happens when you can't — or shouldn't — touch the sensor. Shared kiosks, clinical settings, accessibility needs, hygiene-sensitive environments. Fingerprint readers and face unlock assume contact or a stored faceprint. Sometimes you want authentication that touches nothing and stores no biometric image of you at all. So I built GestureAuth — a contactless authentication system where your "password" is a sequence of hand gestures performed in front of a standard webcam.

1

51

Lahore, Pakistan

AI/ML & Data Solutions Engineer

New to Contra


Built an intelligent inventory management system that automates stock ordering using machine learning and AI agents. Leveraging an XGBoost-based forecasting model, the system predicts future inventory demand and proactively places purchase orders when shortages are detected. The backend is powered by Django, integrated with AWS-hosted datasets for scalability and real-time data access. AI agents handle autonomous procurement decisions, reducing manual oversight and streamlining supply chain operations for greater efficiency and accuracy.

0

93

Retail Knowledge Graph

In this project, we built a semantic knowledge graph tailored to the retail industry. The pipeline involved developing AI agents to transform heterogeneous data into standardized formats. Ontologies were created to represent domain knowledge accurately. Using Gemini models and LangChain, user queries were converted into Cypher queries to retrieve insights from a Neo4j database. We utilized an MCP server for orchestration and LangSmith for secure login and audit trails. This system enhances complex data exploration for non-technical users.

2

2

116

Student Medical Chatbot

Built a chatbot to assist MBBS students in navigating medical literature. Leveraged LlamaIndex and fine-tuned language models to ensure accuracy. Embeddings were stored in OpenSearch, hosted on AWS. The Django backend included secure authentication and session management for a robust user experience.

0

67

Prompt Engineering Mini-Academy is a digital learning product built using Kajabi. It helps users learn how to write better AI prompts and use AI tools for daily tasks such as writing, research, summarization, and productivity. The problem it solves is that many people use AI tools without a proper structure, which leads to weak or generic results. This product gives users a clear learning path, practical prompt templates, and workflow examples to improve the quality of their AI outputs. I used Kajabi to create the landing page, email capture form, downloadable prompt resource, product offer, checkout page, and course structure. A sample video is attached to demonstrate the product flow and user experience.

1

81

Full-Stack Developer building AI Agents & RAG Systems

New to Contra


FinSense - AI-Powered NPA Prevention Platform for Banking

AI-powered, post-disbursement loan health monitoring system built for SBI-scale banking use cases. Continuously monitors borrower financial behavior after loan disbursement, scores risk on a 0–100 scale across 4 tiers using XGBoost with SHAP-based explainability, and deploys a conversational AI agent to proactively engage at-risk borrowers 60–90 days before default, instead of only scoring creditworthiness at application time like traditional systems. Built the FastAPI backend (loan, risk, agent, and dashboard routers), automated risk re-scoring via Celery + Redis, and real-time dashboard updates via WebSockets, with a React + Tailwind frontend for officer-facing risk visualization.

0

20

Financial Decision Intelligence Platform - Multi-Agent Risk Analysis

Multi-agent platform (LangGraph) that ingests SEC 10-K filings via XBRL extraction, runs Altman Z-Score and Piotroski F-Score risk models, and generates investment committee reports. Pipeline includes RAG-based risk retrieval (FAISS + Sentence Transformers), XGBoost ML inference, and Gemini API report generation, served through a FastAPI backend with /analyze, /compare, and /report endpoints for multi-company investment ranking.

Stack: Python, LangChain, LangGraph, FAISS, XGBoost, FastAPI
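For readers unfamiliar with the Altman Z-Score mentioned above, here is a minimal sketch of the original 1968 form for publicly traded manufacturing firms. Which variant the platform actually uses is not stated, so treat this as an assumption; the `Financials` field names and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Financials:
    working_capital: float
    retained_earnings: float
    ebit: float
    market_value_equity: float
    total_liabilities: float
    sales: float
    total_assets: float


def altman_z(f: Financials) -> float:
    """Original Altman Z-Score: 1.2*X1 + 1.4*X2 + 3.3*X3 + 0.6*X4 + 1.0*X5."""
    x1 = f.working_capital / f.total_assets
    x2 = f.retained_earnings / f.total_assets
    x3 = f.ebit / f.total_assets
    x4 = f.market_value_equity / f.total_liabilities
    x5 = f.sales / f.total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def z_zone(z: float) -> str:
    """Conventional cut-offs for the original model."""
    if z > 2.99:
        return "safe"
    if z >= 1.81:
        return "grey"
    return "distress"


# Made-up balance-sheet figures, all in the same currency unit.
example = Financials(
    working_capital=200.0,
    retained_earnings=300.0,
    ebit=150.0,
    market_value_equity=900.0,
    total_liabilities=600.0,
    sales=1200.0,
    total_assets=1000.0,
)
score = altman_z(example)
zone = z_zone(score)
```

In a pipeline like the one described, the inputs would come from the XBRL extraction step rather than being typed in by hand.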

0

24

RepoSphere - GitHub-Inspired Version Control Platform

Full-stack, GitHub-inspired platform with JWT authentication, repository management, issue tracking, and real-time search. Built a RESTful API with user dashboards, and reduced code redundancy by 35% through modular architecture. Deployed on AWS Amplify.

0

14

SENTINEL - Edge AI Driver Assistance System

Backend Lead for an edge AI ADAS (driver assistance) system built for Indian road conditions. Multi-stage ML pipeline covering sensor fusion, scene classification, risk scoring, GradCAM/XAI explainability, and federated learning. Built the backend and deployment pipeline, integrating ONNX and PyTorch runtimes to serve model inference in production.

0

23

Senior Data Analyst / Data Scientist

Loan Charge-Off Forecasting Analytics

1

0

New Member Onboarding & Growth Analytics for Credit Unions

1

1

Automated Underwriting & Risk Scoring Analytics

1

1

Fraud Analytics & Predictive Monitoring for Credit Unions

1

2

Christians Steven Zoe

Denpasar, Indonesia

Data Scientist | Solving Business Problems with Data & ML


The reports folder contains model evaluation outputs generated during the machine learning workflow. These reports provide insights into model performance, feature importance, and predictive capabilities, helping stakeholders understand both the effectiveness and business implications of the solution.

1. Feature Importance Report: Feature importance analysis was performed to identify the variables that contributed most to customer churn predictions, providing valuable business insights.
2. ROC Curve Report: ROC-AUC analysis was used to compare multiple machine learning models and identify the model with the strongest predictive performance.
3. Confusion Matrix Report: A confusion matrix was generated to evaluate classification outcomes and understand the strengths and limitations of the predictive model.

1

84

# 📊 Dashboard for Small Businesses (UMKM)

A simple and user-friendly Excel dashboard designed to help small business owners monitor their business performance and make better decisions through data.

---

## 🚀 Project Overview

This project demonstrates how sales data can be transformed into meaningful business insights using Microsoft Excel.

The dashboard provides a clear overview of:

- Monthly revenue
- Monthly expenses
- Profit tracking
- Cashflow trends
- Business performance visualization

---

## ✨ Features

✅ Sales Dashboard
✅ Cashflow Monitoring
✅ Profit & Loss Summary
✅ Interactive Charts
✅ Clean and Easy-to-Understand Layout

## 📸 Dashboard Preview

Dashboard screenshots are available in the `screenshots` folder.

---

## 🛠 Tools Used

- Microsoft Excel
- Google Sheets
- Git & GitHub

---

## 💡 Business Value

Small business owners often struggle to understand their financial performance because their data is scattered and difficult to interpret.

This dashboard simplifies business reporting and helps users:

- Track revenue growth
- Monitor expenses
- Identify profit trends
- Make data-driven decisions

---

## 👨‍💻 Created By

Christians Steven Zoe
Aspiring Data Analyst & Freelance Data Specialist
GitHub: https://github.com/stevendsml01-blockchain

1

28

Data Cleaning and Sales Analysis

## Project Overview

This project demonstrates an end-to-end data cleaning and exploratory data analysis (EDA) workflow using Python. The dataset was intentionally generated with multiple data quality issues to simulate real-world business scenarios commonly encountered by Data Analysts and Data Scientists.

---

## Objectives

- Identify data quality issues.
- Handle missing values.
- Remove duplicate records.
- Standardize mixed date formats.
- Perform exploratory data analysis.
- Generate business insights.
- Create visualizations for decision-making.

---

## Dataset Issues

The raw dataset contained several intentional problems:

- Missing values in `Qty`
- Missing values in `Harga`
- Duplicate transactions
- Mixed date formats
- Inconsistent category naming

---

## Data Cleaning Process

The following steps were performed:

1. Loaded and profiled the raw dataset.
2. Identified missing values and duplicate records.
3. Removed duplicate transactions.
4. Filled missing values using median imputation.
5. Investigated mixed date formats.
6. Built a custom date parser to standardize dates.
7. Saved the cleaned dataset.

---

## Results

### Before Cleaning

| Metric | Value |
|----------|---------|
| Total Records | 1009 |
| Missing Qty | 8 |
| Missing Harga | 5 |
| Duplicate Records | 10 |

### After Cleaning

| Metric | Value |
|----------|---------|
| Total Records | 999 |
| Missing Qty | 0 |
| Missing Harga | 0 |
| Duplicate Records | 0 |
| Failed Date Parsing | 0 |

---

## Business Insights

### Best-Selling Products

Kopi Arabica was the top-selling product, followed by Teh Hijau and Mouse.

### Sales by City

Bandung generated the highest sales volume, indicating strong market potential compared to Surabaya and Jakarta.

### Category Performance

Electronics dominated sales performance. An inconsistency between `Makanan` and `makanan` was discovered, highlighting the importance of data standardization before analysis.

### Revenue

The total revenue generated was: Rp 13,593,130,000

## Technologies Used

- Python
- Pandas
- NumPy
- Matplotlib
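The duplicate removal, median imputation, and custom date parser steps from the cleaning process can be sketched as follows. The project itself used Pandas; this illustration swaps in the Python standard library so it runs without dependencies, and the list of accepted date formats, the helper names, and the sample values are all assumptions rather than the project's actual code.

```python
from datetime import date, datetime
from statistics import median
from typing import List, Optional

# Assumed formats; the project's actual list is not given. Day-first is assumed
# for slash- and dash-separated dates.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y", "%d %B %Y")


def parse_mixed_date(raw: str) -> Optional[date]:
    """Try each known format in order; return None if none matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    return None


def fill_with_median(values: List[Optional[float]]) -> List[float]:
    """Replace missing entries with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def drop_duplicates(rows: List[tuple]) -> List[tuple]:
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Made-up inputs for illustration.
dates = [parse_mixed_date(s) for s in ["2024-01-15", "15/01/2024", "15 January 2024", "n/a"]]
qty = fill_with_median([2, None, 4, 10])
rows = drop_duplicates([("A", 1), ("B", 2), ("A", 1)])
```

Returning `None` for unparseable dates, instead of raising, makes it easy to count failures, which corresponds to the "Failed Date Parsing" row in the results table.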

1

28

Image 1 – Project Overview & Dataset Information

Customer Churn Prediction Using Random Forest

This project focuses on predicting customer churn using machine learning techniques to help businesses proactively identify customers who are likely to discontinue their services. The predictive solution was developed using a structured approach involving Random Forest classification, SMOTE oversampling for handling class imbalance, GridSearchCV for hyperparameter optimization, and threshold tuning to improve recall performance.

The dataset contains customer demographic and behavioral attributes, including: Age, Membership Years, Lifetime Value, Total Purchases, Days Since Last Purchase, Average Order Value, Returns Rate, Cart Abandonment Rate.

The target variable is customer churn status, where: 0 = Active Customer, 1 = Churned Customer.

Business Objective: The primary objective of this project is to identify customers at risk of churn so businesses can implement preventive retention strategies and reduce customer attrition.

Image 2 – Machine Learning Pipeline

End-to-End Machine Learning Workflow: A comprehensive machine learning pipeline was designed to ensure robustness, reproducibility, and business relevance throughout the modeling process. The workflow consisted of:

1. Data Cleaning: Prepared and validated the dataset by handling inconsistencies and ensuring data quality.
2. Exploratory Data Analysis (EDA): Investigated customer behavior patterns and feature distributions to understand underlying trends.
3. Baseline Random Forest Modeling: Established an initial benchmark using Random Forest classification.
4. SMOTE Oversampling: Addressed class imbalance to improve the model's ability to detect churned customers.
5. Hyperparameter Tuning: Optimized model performance using GridSearchCV.
6. Threshold Tuning: Adjusted classification thresholds to maximize business-oriented objectives, particularly recall.
7. Model Evaluation: Assessed predictive performance using multiple evaluation metrics.

Professional Value: This structured workflow demonstrates adherence to industry best practices rather than relying solely on default machine learning configurations.

Image 3 – Correlation Heatmap

Exploratory Correlation Analysis: A correlation heatmap was generated to identify relationships between customer attributes and churn behavior. The analysis revealed several noteworthy insights:

- Customers with longer periods since their last purchase exhibited a stronger tendency to churn.
- Higher cart abandonment rates were moderately associated with increased churn risk.
- Demographic variables such as age showed minimal correlation with churn outcomes.

Key Insight: The strongest relationship with churn was observed in Days Since Last Purchase (correlation = 0.312), suggesting that customer inactivity is a meaningful indicator of potential attrition.

Business Relevance: Understanding these relationships enables organizations to focus their retention initiatives on the factors most strongly associated with customer loss.

Image 4 – Key Insights, Recommendations & Technologies Used

Key Insights: Several actionable findings emerged from the analysis:

1. Customers with extended inactivity periods are more likely to churn.
2. Elevated cart abandonment behavior may signal disengagement.
3. Improving recall is critical because accurately identifying potential churners aligns directly with the business objective.

Business Recommendations: Based on the findings, the following strategies are recommended:

- Target High-Risk Customers: Deploy retention campaigns aimed at customers identified as likely to churn.
- Personalize Customer Communication: Develop personalized email and promotional initiatives to improve engagement.
- Strengthen Loyalty Programs: Offer incentives and rewards to reactivate inactive customers.
- Monitor Behavioral Indicators: Continuously track customer activity metrics to detect early warning signs of churn.

Technologies Used: The project was implemented using Python, Pandas, NumPy, Matplotlib, Seaborn, Scikit-Learn, and Imbalanced-Learn.
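The threshold tuning step in the workflow above can be illustrated with a small dependency-free sketch. The selection rule used here (pick the highest threshold whose recall still meets a target) is one reasonable choice and an assumption, not necessarily what the project did; the labels, scores, and function names are made up.

```python
from typing import List, Tuple


def recall_precision(y_true: List[int], scores: List[float], threshold: float) -> Tuple[float, float]:
    """Recall and precision when predicting churn for every score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


def pick_threshold(y_true: List[int], scores: List[float], target_recall: float) -> float:
    """Highest candidate threshold whose recall meets the target."""
    # Recall can only grow as the threshold drops, so scan from high to low.
    for t in sorted(set(scores), reverse=True):
        recall, _ = recall_precision(y_true, scores, t)
        if recall >= target_recall:
            return t
    return min(scores)


# Made-up labels (1 = churned) and predicted churn probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]

default_recall, default_precision = recall_precision(y_true, scores, 0.5)
threshold = pick_threshold(y_true, scores, target_recall=0.75)
tuned_recall, tuned_precision = recall_precision(y_true, scores, threshold)
```

On this toy data, lowering the threshold from the default 0.5 raises recall, which is the trade the project describes making in favor of catching more churners. In practice the threshold should be chosen on a validation split, not on the test set.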

1

63

Kristóf Németh

Budapest, Hungary

Data Analysis & Science Services


Customer Churn Prediction with Machine Learning

0

2

Predicting NO₂ Levels Using Machine Learning

0

2

New York Taxi Fare Prediction Model

0

2
