Cover Courtesy: Wikimedia

What is image segmentation in machine learning?

In computer vision, an image can be broken into segments by learning a mask for each segment. Image segmentation is used to locate objects and their boundaries (lines, curves, etc.) and to assign a label to each pixel.
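To make "a label for each pixel" concrete, here is a minimal sketch in plain Python. The tiny grid, the class ids `BACKGROUND` and `COIN`, and the helper names are all hypothetical and chosen only for illustration; a real mask would come from a trained model or an algorithm, not be typed by hand.

```python
# Toy 4x6 "image" that has already been labelled pixel by pixel.
# The class ids below are illustrative assumptions, not a standard.
BACKGROUND, COIN = 0, 1

label_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]


def binary_mask(label_map, class_id):
    """Return a 0/1 mask that is 1 wherever the pixel carries class_id."""
    return [[1 if pixel == class_id else 0 for pixel in row] for row in label_map]


def pixel_count(mask):
    """Count how many pixels are switched on in a 0/1 mask."""
    return sum(sum(row) for row in mask)


coin_mask = binary_mask(label_map, COIN)
background_mask = binary_mask(label_map, BACKGROUND)

print("coin pixels:", pixel_count(coin_mask))
print("background pixels:", pixel_count(background_mask))
```

Every pixel belongs to exactly one mask here, which is the sense in which segmentation is classification at the pixel level.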

For example, the following comes from the OpenCV documentation on the watershed algorithm, where the boundaries of a set of coins are found:

Original Image

Coins

Segmentation Mask Obtained from Original Image

Coins Boundary

Sudo Exam Tip: Think of image segmentation whenever a question mentions classifying objects in an image at the pixel level. When a segmentation mask is learnt, each pixel is assigned a label. The only further decision, when more specificity is needed, is whether to use semantic or instance segmentation, which you’ll learn shortly.

Semantic and Instance Segmentation Compared

Semantic Segmentation | Instance Segmentation
--- | ---
Objects in the image are grouped based on defined categories | Refined version of semantic segmentation that detects each instance of a category
Example: A street scene would be segmented into pedestrians, bikes, vehicles, sidewalks, etc. | Example: Within a category like “vehicles”, each car, bus, or truck is segmented as a separate object

Sudo Exam Tip: Think of semantic segmentation as image classification at the pixel level: it associates each pixel with a class label. Like image classification, semantic segmentation does not distinguish individual objects in the image, only the category each pixel belongs to.
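The difference can be sketched with a toy example. Below, a hypothetical semantic mask marks every coin pixel with the same class id, so two coins are indistinguishable. A simple connected-component pass (a stand-in used here for illustration, not how learned instance segmentation models work) then gives each separate blob its own id. This shortcut only works when objects of the same class do not touch; the function name and the grid are assumptions made for this sketch.

```python
from collections import deque


def label_instances(semantic_mask, class_id):
    """Give each 4-connected blob of class_id pixels its own id (1, 2, ...)."""
    height, width = len(semantic_mask), len(semantic_mask[0])
    instances = [[0] * width for _ in range(height)]
    next_id = 0
    for r in range(height):
        for c in range(width):
            # Skip pixels of other classes and pixels already assigned.
            if semantic_mask[r][c] != class_id or instances[r][c]:
                continue
            next_id += 1
            instances[r][c] = next_id
            queue = deque([(r, c)])
            # Breadth-first flood fill over up/down/left/right neighbours.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and semantic_mask[ny][nx] == class_id
                            and not instances[ny][nx]):
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
    return instances, next_id


# Semantic mask: 1 = "coin", 0 = background. Both coins share one label.
semantic_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]

instance_mask, num_coins = label_instances(semantic_mask, class_id=1)
print("coins found:", num_coins)
for row in instance_mask:
    print(row)
```

The semantic mask answers "which pixels are coin?", while the instance mask also answers "which coin?".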

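Sudo Exam Tip: When given a choice between semantic and instance segmentation for an autonomous vehicle use case, choose instance segmentation, because the vehicle needs to tell individual pedestrians and vehicles apart rather than only know their category.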