VinBrain Internship

Bao Dinh

Data Visualizer
ML Engineer
Data Analyst
Notion
Python
PyTorch

Preliminaries

I joined VinBrain as an AI engineer intern in the applied scientist department.
During the 3-month internship, I was tasked with learning medical image segmentation. Under supervision, I completed assignments and took part in the company's projects, with my work validated by mentors. For this project, my assignment was multi-class image segmentation of polyps.

Introduction

Requirements

My mentor asked me to build a full pipeline (training, inference and evaluation) for multi-class image segmentation to achieve greater than 85% accuracy on this leaderboard.
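The evaluation step of such a pipeline needs a per-class overlap metric. Results later in this write-up are quoted as Dice score, so the following is a minimal sketch of per-class Dice on integer label maps. The function name `dice_per_class`, the nested-list mask representation, class 0 as background, and the convention of scoring 1.0 when a class is absent from both masks are all assumptions made for illustration; the leaderboard's exact metric definition is not stated here.

```python
def dice_per_class(pred, target, num_classes):
    """Return {class_id: Dice score} for every non-background class.

    `pred` and `target` are 2D lists of integer class labels with the
    same shape; class 0 is assumed to be background and is skipped.
    """
    scores = {}
    for cls in range(1, num_classes):
        intersection = pred_count = target_count = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                pred_count += p == cls
                target_count += t == cls
                intersection += p == cls and t == cls
        denominator = pred_count + target_count
        # Assumed convention: a class missing from both masks scores 1.0.
        scores[cls] = 2 * intersection / denominator if denominator else 1.0
    return scores


pred = [[0, 1, 1],
        [2, 2, 0]]
target = [[0, 1, 0],
          [2, 2, 2]]
scores = dice_per_class(pred, target, num_classes=3)
```

Averaging the per-class values gives a single number that can be tracked across experiments.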

Dataset

A visualized example from the data

Solution

This section highlights my main contributions toward meeting the requirements set by my mentor. For additional technical details, including data pre-processing/post-processing and the training/inference/evaluation pipeline, please refer to my public GitHub repository (https://github.com/giaabaoo/vinbrain_internship). During the project, I conducted an extensive literature review and applied approximately 15 state-of-the-art (SOTA) models to the dataset. Some of these models required modifications, which I implemented myself. For example, I adapted SANet from binary segmentation to multi-class segmentation.
As the results show, I steadily improved my Dice score on the private test set from 0 to 80% (with the PraNet method). I then came across this paper, which presents a method I believed was worth trying. The architecture described in the paper is well suited to my problem, particularly because its experiments show it to be effective at detecting small polyps.
Moreover, from my observation of the dataset, I noticed some important characteristics:
Non-neoplastic lesions often have colors quite similar to the background.
Neoplastic and non-neoplastic lesions differ in size, texture, and shape.
Neoplastic and non-neoplastic lesions sometimes share similar colors.
Therefore, a method that better captures the shape of the lesions is needed.
Following this idea, I first tried augmenting the training data with recolored images, as this was the easiest and quickest approach. The intuition is to push the model to rely on discriminative features such as shape, rather than color, when classifying pixels. To keep the images realistic, I adapted a color transfer algorithm (which converts the color channels to Lab). I then created several subsets of 400 samples each (randomly selected and color-transferred) and added the best-performing subset to the training set.
Color-transferred image (left) VS Original image (right)
The SANet idea improved my score to 81%, while the PraNet model combined with color transfer improved it to 84%.
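The color transfer augmentation described above can be illustrated with a small sketch. The text only says the algorithm works in Lab space, so this assumes the common mean and standard deviation matching per channel; it is not necessarily the exact variant used in the project. The names `match_channel_stats` and `transfer_color`, and the representation of an image as a dict of flat channel lists, are hypothetical and chosen to keep the example within the standard library. A real pipeline would operate on image arrays already converted to Lab (for example with OpenCV's `cv2.cvtColor`).

```python
from statistics import fmean, pstdev


def match_channel_stats(source, reference, eps=1e-6):
    """Shift and scale `source` so its mean and std match `reference`."""
    src_mean, src_std = fmean(source), pstdev(source)
    ref_mean, ref_std = fmean(reference), pstdev(reference)
    # eps guards against division by zero for a constant channel.
    scale = ref_std / max(src_std, eps)
    return [(value - src_mean) * scale + ref_mean for value in source]


def transfer_color(source_lab, reference_lab):
    """Match the L, a and b channels independently.

    Both arguments are assumed to be dicts mapping "L", "a", "b" to flat
    lists of channel values that are already in Lab space.
    """
    return {
        channel: match_channel_stats(source_lab[channel], reference_lab[channel])
        for channel in ("L", "a", "b")
    }


src = {"L": [10.0, 20.0, 30.0], "a": [0.0, 2.0, 4.0], "b": [5.0, 5.0, 5.0]}
ref = {"L": [50.0, 60.0, 70.0, 80.0], "a": [1.0, 1.0, 3.0, 3.0], "b": [7.0, 9.0, 7.0, 9.0]}
out = transfer_color(src, ref)
```

Because only the channel statistics change, lesion shapes and textures are preserved while the overall color shifts toward the reference image. The sketch does not clip results to the valid Lab range, which a full implementation would need to do before converting back.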
After that, I noticed two problems in the predicted masks: some segmented regions were not fully filled in, and some regions contained a mix of classes.
The holes should be filled
The masks are mixed
This led me to an idea for post-processing the predicted masks. The algorithm is as follows:
Convert the predicted mask image to grayscale and binarize it.
Find the coordinates of the connected components in the resulting binary mask.
Using those coordinates, extract each component from the predicted mask and count the pixels belonging to each class. Assign the whole component to the class with the highest count.
The accuracy increased to 85.6%.
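The post-processing steps above can be sketched as follows. This is a minimal illustration, not the code from the repository: it assumes the prediction is available as a 2D list of integer labels with 0 as background (so binarizing reduces to testing `label != 0`), uses 4-connectivity, and the name `unify_component_classes` is hypothetical. It covers the majority-vote step only and does not fill holes.

```python
from collections import Counter, deque


def unify_component_classes(label_map):
    """Give every connected foreground component a single class label.

    Each component of non-zero pixels is relabeled with the class that
    covers most of its pixels.
    """
    height = len(label_map)
    width = len(label_map[0]) if height else 0
    result = [row[:] for row in label_map]
    visited = [[False] * width for _ in range(height)]

    for row in range(height):
        for col in range(width):
            if label_map[row][col] == 0 or visited[row][col]:
                continue
            # Breadth-first search collects one foreground component.
            queue = deque([(row, col)])
            visited[row][col] = True
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not visited[ny][nx] and label_map[ny][nx] != 0):
                        visited[ny][nx] = True
                        queue.append((ny, nx))
            # Majority vote over the classes predicted inside the component.
            counts = Counter(label_map[y][x] for y, x in component)
            majority_class = counts.most_common(1)[0][0]
            for y, x in component:
                result[y][x] = majority_class
    return result


mixed = [[0, 1, 1, 0, 0],
         [0, 1, 2, 0, 2],
         [0, 1, 1, 0, 2]]
cleaned = unify_component_classes(mixed)
```

In the example, the stray class-2 pixel inside the left lesion is reassigned to class 1, while the separate class-2 lesion on the right is left unchanged.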

Summary

My contributions include:
Building a complete pre-processing/training/inference/evaluation pipeline for multi-class medical image segmentation.
Conducting extensive experiments with state-of-the-art (SOTA) architectures, losses, and models in medical polyp segmentation.
Enhancing performance by incorporating ideas from SANet (manually adapting the architecture from binary to multi-class segmentation) and PraNet architectures.
Implementing a color transfer technique for data augmentation.
Developing a post-processing algorithm to improve the Dice score.

Results

I achieved 85.6% on the leaderboard (ranked 8th at the time) with a non-ensemble method.
For detailed experimental results, please refer to this link.
For the experimental graphs, please refer to this link. I used wandb to track the training loss and score.
My mentor approved me to officially join the company's project.