Masker. We feed the output of the Masker into the second UNet, which predicts the next mask eleven times until reaching the 22nd image; this process is discussed further below. Note that this model takes only masks as input and outputs a single mask; no color images are processed at this point. We call this second UNet the Predictor, as it predicts the subsequent positions of the objects. Naturally, we dub the combination of these two UNets the WNet.
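As a rough sketch of how this rollout fits together (the module names, tensor shapes, and the Predictor's input format below are illustrative assumptions, not the notebooks' actual code):

```python
import torch

@torch.no_grad()
def wnet_rollout(masker, predictor, frames):
    # frames: (B, 11, 3, H, W) -- the 11 observed color images per video
    B, T = frames.shape[:2]
    # Masker: color image -> per-frame segmentation mask
    masks = [masker(frames[:, t]) for t in range(T)]   # each (B, K, H, W)
    # Predictor: masks only -> one next mask; apply it 11 times to reach frame 22
    for _ in range(11):
        history = torch.cat(masks[-T:], dim=1)         # (B, T*K, H, W), assumed input format
        masks.append(predictor(history))               # (B, K, H, W)
    return masks[-1]                                   # predicted mask for the 22nd image
```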
The data preparation step produces the imgs.pt, val_imgs.pt, masks.pt, val_masks.pt, and unlabeled_imgs.pt files (no masks are provided for the unlabeled set). These all live in ./WNet/data. Note: although we have provided the code for generating unlabeled_imgs.pt, we encourage the use of our lazy loading implementation, which directly converts the raw images into a segmentation mask tensor. This is covered in the following step.
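A minimal sketch of what such a lazy loader could look like (the directory layout, file extension, and the use of a trained Masker here are assumptions for illustration, not the repository's actual implementation):

```python
import os
import torch
from torch.utils.data import Dataset
from torchvision.io import read_image

class LazyUnlabeledMasks(Dataset):
    """Hypothetical lazy loader: raw image files are read on demand and
    converted to mask tensors by a trained Masker, so no large
    unlabeled_imgs.pt tensor ever needs to be materialized."""

    def __init__(self, root, masker):
        self.paths = sorted(
            os.path.join(root, f) for f in os.listdir(root) if f.endswith(".png")
        )
        self.masker = masker.eval()   # assumed trained Masker module

    def __len__(self):
        return len(self.paths)

    @torch.no_grad()
    def __getitem__(self, idx):
        img = read_image(self.paths[idx]).float() / 255.0  # (3, H, W)
        return self.masker(img.unsqueeze(0)).squeeze(0)    # mask tensor
```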
The Masker is trained on imgs.pt and masks.pt, along with our validation tensors for verification of performance. The best Masker model weights will be saved into ./WNet/masker_models as best_masker.pth. We recommend renaming this file to something unique to avoid overwriting it in a future run; we use masker.pth.
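For example, keeping a copy of the checkpoint under a stable name could be done like this:

```python
import shutil

# Keep a uniquely named copy so the next training run's best_masker.pth
# cannot overwrite these weights.
shutil.copyfile("./WNet/masker_models/best_masker.pth",
                "./WNet/masker_models/masker.pth")
```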
The Predictor is trained on the masks.pt data alone, with val_masks.pt for validation; this is the default implementation in the notebook. However, with some simple changes, the unlabeled data can be incorporated as well. Two approaches are explained below (the changes must be made at the indicated location under the Load Data header in Predictor.ipynb); a generic sketch of the underlying idea is shown after this paragraph.
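One plausible shape for such a change (a generic pseudo-labeling sketch under assumed tensor layouts, assuming a trained `masker` is in scope — not necessarily either of the notebook's two approaches) is to segment unlabeled_imgs.pt with the trained Masker and append the resulting mask sequences to the labeled training masks:

```python
import torch

@torch.no_grad()
def pseudo_label(masker, unlabeled_imgs, batch_size=64):
    # unlabeled_imgs: assumed (N, T, 3, H, W); segment every frame with the
    # trained Masker and return integer masks of shape (N, T, H, W).
    N, T, C, H, W = unlabeled_imgs.shape
    flat = unlabeled_imgs.reshape(N * T, C, H, W)
    out = []
    for i in range(0, flat.shape[0], batch_size):   # batch to bound memory use
        out.append(masker(flat[i:i + batch_size]).argmax(dim=1))
    return torch.cat(out).reshape(N, T, H, W)

masks = torch.load("./WNet/data/masks.pt")          # labeled masks
unlabeled = torch.load("./WNet/data/unlabeled_imgs.pt")
masks = torch.cat([masks, pseudo_label(masker, unlabeled)], dim=0)
```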
The best Predictor model weights will be saved as best_predictor.pth. We recommend renaming this file to something unique to avoid overwriting it in a future run; we use predictor.pth.
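With both checkpoints in hand, assembling the full WNet for inference might look like the following sketch, reusing the `wnet_rollout` function from the earlier example (`UNet` is a stand-in for the actual Masker/Predictor classes defined in the notebooks, and the Predictor checkpoint path and tensor shapes are hypothetical):

```python
import torch

# Assemble the full WNet from the renamed checkpoints (sketch).
masker, predictor = UNet(), UNet()  # stand-in classes
masker.load_state_dict(torch.load("./WNet/masker_models/masker.pth"))
predictor.load_state_dict(torch.load("predictor.pth"))  # path is hypothetical
masker.eval(); predictor.eval()

# First validation video's 11 observed frames (shape assumed).
frames = torch.load("./WNet/data/val_imgs.pt")[:1, :11]
final_mask = wnet_rollout(masker, predictor, frames)
```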