Image Segmentation for Object Recognition

Shahryar A

Data Scientist
Data Analyst
ML Engineer

Technical Description:

1. Project Overview:

The project performs image segmentation on map images using YOLO, a real-time object detection system, to identify, localize, and count elements such as plots, roads, and other map features.

2. Technologies Used:

YOLO (You Only Look Once) for object detection.
Flask for creating the web-based front end.
Python as the primary programming language.
OpenCV for image processing.

3. Workflow:

a. YOLO Model:

The YOLO model is trained on a dataset containing annotated maps. The training involves optimizing the model to recognize and classify different features within the images. The trained model is then used for inference.
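Since the inference snippet later on this page loads Darknet-format YOLOv3 weights, training a custom map detector would typically go through the Darknet toolchain. A hedged sketch of the dataset configuration and training command (all file names, paths, and the class list are illustrative, not taken from the project):

```shell
# obj.data -- tells Darknet where the annotated map dataset lives (paths illustrative)
cat > data/obj.data <<'EOF'
classes = 3
train   = data/train.txt
valid   = data/valid.txt
names   = data/obj.names
backup  = backup/
EOF

# obj.names -- one class label per line
printf 'plot\nroad\nother\n' > data/obj.names

# Train YOLOv3 on the custom classes, starting from pretrained convolutional weights
./darknet detector train data/obj.data cfg/yolov3-custom.cfg darknet53.conv.74
```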

b. Flask Front End:

The front end is built using Flask, a web framework in Python. The user interacts with the system through a web interface, where they can upload map images for segmentation.
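The upload flow described above can be sketched as a minimal Flask route. This is an illustrative skeleton, not the project's actual code: the `/segment` endpoint, the `map_image` field name, and the validation logic are assumptions.

```python
from flask import Flask, request

app = Flask(__name__)

# Accept only common image formats for uploaded maps
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


@app.route("/segment", methods=["POST"])
def segment():
    file = request.files.get("map_image")
    if file is None or not any(
        file.filename.lower().endswith(ext) for ext in ALLOWED_EXTENSIONS
    ):
        return {"error": "please upload a PNG or JPEG map image"}, 400
    image_bytes = file.read()
    # In the real app, the bytes would be decoded with OpenCV and passed to
    # the YOLO model; here we simply acknowledge the upload.
    return {"received_bytes": len(image_bytes)}, 200
```

In the full application, the handler would decode the bytes (e.g. with `cv2.imdecode`), run inference, and render the annotated result back to the user.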

c. Image Segmentation:

When a user uploads a map image, the Flask application sends the image to the YOLO model for inference. The YOLO model processes the image and returns bounding boxes around detected objects along with their class labels.

d. Result Visualization:

The Flask application displays the segmented map to the user with bounding boxes around identified plots, roads, and other elements. The total number of each category is also calculated and presented on the front end.
Code Snippet (YOLO inference with OpenCV's DNN module):

import cv2
import numpy as np

# Load the YOLOv3 network and the class names
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
with open("coco.names", "r") as f:
    classes = [line.strip() for line in f]


def perform_yolo_inference(image_path):
    """Run YOLO on an image and return raw boxes, confidences, and class ids."""
    image = cv2.imread(image_path)
    height, width, _ = image.shape

    # Build the input blob: scale pixels to [0, 1], resize to 416x416, swap BGR->RGB
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), (0, 0, 0),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    output_layers = net.getUnconnectedOutLayersNames()
    detections = net.forward(output_layers)

    class_ids, confidences, boxes = [], [], []

    # Each detection row is [cx, cy, w, h, objectness, class scores...]
    for out in detections:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]

            if confidence > 0.5:  # confidence threshold
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)

                # Convert center coordinates to top-left corner
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)

                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    return boxes, confidences, class_ids

