Image Segmentation for Object Recognition

Shahryar A

Data Scientist
Data Analyst
ML Engineer

Technical Description:

1. Project Overview:

The project performs image segmentation on map images using YOLO, a real-time object detection system, to identify, localize, and count elements such as plots, roads, and other map features.

2. Technologies Used:

YOLO (You Only Look Once) for object detection.
Flask for creating the web-based front end.
Python as the primary programming language.
OpenCV for image processing.

3. Workflow:

a. YOLO Model:

The YOLO model is trained on a dataset containing annotated maps. The training involves optimizing the model to recognize and classify different features within the images. The trained model is then used for inference.
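Since the inference snippet later on this page loads Darknet-format YOLOv3 weights, training a custom map detector would typically go through the Darknet toolchain. A hedged sketch of the dataset configuration and training command (all file names, paths, and the class list are illustrative, not taken from the project):

```shell
# obj.data -- tells Darknet where the annotated map dataset lives (paths illustrative)
cat > data/obj.data <<'EOF'
classes = 3
train   = data/train.txt
valid   = data/valid.txt
names   = data/obj.names
backup  = backup/
EOF

# obj.names -- one class label per line
printf 'plot\nroad\nother\n' > data/obj.names

# Train YOLOv3 on the custom classes, starting from pretrained convolutional weights
./darknet detector train data/obj.data cfg/yolov3-custom.cfg darknet53.conv.74
```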

b. Flask Front End:

The front end is built using Flask, a web framework in Python. The user interacts with the system through a web interface, where they can upload map images for segmentation.
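The upload flow described above can be sketched as a minimal Flask route. This is an illustrative skeleton, not the project's actual code: the `/segment` endpoint, the `map_image` field name, and the validation logic are assumptions.

```python
from flask import Flask, request

app = Flask(__name__)

# Accept only common image formats for uploaded maps
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


@app.route("/segment", methods=["POST"])
def segment():
    file = request.files.get("map_image")
    if file is None or not any(
        file.filename.lower().endswith(ext) for ext in ALLOWED_EXTENSIONS
    ):
        return {"error": "please upload a PNG or JPEG map image"}, 400
    image_bytes = file.read()
    # In the real app, the bytes would be decoded with OpenCV and passed to
    # the YOLO model; here we simply acknowledge the upload.
    return {"received_bytes": len(image_bytes)}, 200
```

In the full application, the handler would decode the bytes (e.g. with `cv2.imdecode`), run inference, and render the annotated result back to the user.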

c. Image Segmentation:

When a user uploads a map image, the Flask application sends the image to the YOLO model for inference. The YOLO model processes the image and returns bounding boxes around detected objects along with their class labels.

d. Result Visualization:

The Flask application displays the segmented map to the user with bounding boxes around identified plots, roads, and other elements. The total number of each category is also calculated and presented on the front end.
Code Snippet (YOLO inference with OpenCV's DNN module):

import cv2
import numpy as np

# Load the YOLOv3 network and the class names
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
with open("coco.names", "r") as f:
    classes = [line.strip() for line in f]


def perform_yolo_inference(image_path):
    """Run YOLO on an image and return raw boxes, confidences, and class ids."""
    image = cv2.imread(image_path)
    height, width, _ = image.shape

    # Build the input blob: scale pixels to [0, 1], resize to 416x416, swap BGR->RGB
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), (0, 0, 0),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    output_layers = net.getUnconnectedOutLayersNames()
    detections = net.forward(output_layers)

    class_ids, confidences, boxes = [], [], []

    # Each detection row is [cx, cy, w, h, objectness, class scores...]
    for out in detections:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]

            if confidence > 0.5:  # confidence threshold
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)

                # Convert center coordinates to top-left corner
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)

                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    return boxes, confidences, class_ids

