Performing Image Segmentation using TensorFlow

Feb 21, 2023

Image segmentation is the process of dividing an image into different regions or segments based on certain criteria, such as color, texture, or intensity. It is a fundamental task in computer vision and is used in various applications, such as object detection, medical image analysis, and autonomous driving.

TensorFlow is a popular open-source machine learning framework that provides powerful tools for building and training deep learning models, including those for image segmentation.

Objective

In this project we will use the TensorFlow machine learning framework to train and evaluate an image segmentation neural network on a medical imagery dataset. We will perform semantic segmentation, classifying each pixel in a cardiac MRI image as either part of the left ventricle (LV) or not.

Dataset

To obtain the data, register on the Cardiac MR Left Ventricle Segmentation Challenge website; download links to the dataset will then be emailed to you. The dataset is a series of cardiac images (specifically MRI short-axis (SAX) scans) that have been expertly labeled.

A representative example of the data is shown above. On the left are the MRI images, and on the right are the expertly segmented regions (often called contours). The portions of the images that are part of the LV are denoted in white. Extracting the data from the raw images and preparing it for training are not covered in this article.

1. Visualizing the Dataset

We will define a function, ‘display’, that shows an image next to its label, and then run it on samples from our training set.

import matplotlib.pyplot as plt
import tensorflow as tf

# function to display an image and its label side by side
def display(display_list):
    plt.figure(figsize=(10, 10))
    title = ['Input Image', 'Label']

    for i in range(len(display_list)):
        display_resized = tf.reshape(display_list[i], [256, 256])
        plt.subplot(1, len(display_list), i+1)
        plt.title(title[i])
        plt.imshow(display_resized)
        plt.axis('off')
    plt.show()

# display the first 3 images and labels from the training set
for image, label in train.take(3):
    sample_image, sample_label = image, label
    display([sample_image, sample_label])

Output:

# an image and label from validation data
for image, label in val.take(1):
    sample_image, sample_label = image, label
    display([sample_image, sample_label])

Output:

2. Building the Model

The input will be the value of each pixel, and since the images are grayscale we’ll use 1 color channel. A Dense layer expects a vector rather than a matrix, so we will flatten the matrix representation of each image. The hidden Dense layer has a size that you can adjust to any positive integer.

Each pixel belongs to one of two classes, Left Ventricle (LV) or not, so the output holds two values per pixel. We will then reshape the output vector so that it can be viewed as an image.

from tensorflow.keras.layers import Flatten, Dense, Reshape

tf.keras.backend.clear_session()

# set up the model architecture
model = tf.keras.models.Sequential([
    Flatten(input_shape=[256, 256, 1]),
    Dense(64, activation='relu'),
    Dense(256*256*2, activation='softmax'),
    Reshape((256, 256, 2))
])

# specify the optimizer, the loss function and the metrics used for training
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# plot the model architecture, including the output shape of each layer
tf.keras.utils.plot_model(model, show_shapes=True)

Output:
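
The layer sizes in this architecture also determine how many trainable parameters the model has, which can be worked out by hand. The following is a minimal sketch in plain Python, assuming the standard count for a fully connected layer (one weight per input-unit pair plus one bias per unit); the helper name `dense_params` is introduced here for illustration only.

```python
# Hand count of trainable parameters for the Flatten -> Dense -> Dense -> Reshape
# model, using the layer sizes from the model definition in this section.

def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

flat_size = 256 * 256 * 1      # Flatten output: one value per pixel
hidden_units = 64              # size of the hidden Dense layer
output_units = 256 * 256 * 2   # two class scores per pixel

hidden_params = dense_params(flat_size, hidden_units)
output_params = dense_params(hidden_units, output_units)
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)
```

Flatten and Reshape add no parameters, so almost all of the model's capacity sits in the two Dense layers, and the count grows quickly if the hidden layer is made larger.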

3. Training the Model

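# train_dataset, test_dataset, parsed_train_data and tensorboard_callback are
# assumed to be defined in the preparation steps not shown in this article
EPOCHS = 20
STEPS_PER_EPOCH = len(list(parsed_train_data))  # number of elements in the parsed training data
VALIDATION_STEPS = 26

model_history = model.fit(train_dataset, epochs=EPOCHS,
                          steps_per_epoch=STEPS_PER_EPOCH,
                          validation_steps=VALIDATION_STEPS,
                          validation_data=test_dataset,
                          callbacks=[tensorboard_callback])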

Output:

For each epoch, the code prints the training and validation loss and accuracy.

4. Model Evaluation

We now evaluate the performance of the model, starting with a plot of the training and validation loss per epoch.

# output model statistics
loss = model_history.history['loss']
val_loss = model_history.history['val_loss']
accuracy = model_history.history['accuracy']
val_accuracy = model_history.history['val_accuracy']

epochs = range(EPOCHS)

plt.figure()
plt.plot(epochs, loss, 'r', label='Training loss')
plt.plot(epochs, val_loss, 'bo', label='Validation loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss Value')
plt.ylim([0, 1])
plt.legend()
plt.show()

Output:

#model evaluation on the test dataset
model.evaluate(test_dataset)

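The model had a loss of 0.6931 and an accuracy of 0.986. Despite the high accuracy, the model did not perform well: the loss is high, and because most pixels in each image are not part of the LV, a model can reach a high pixel accuracy while still segmenting the LV poorly.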
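
The loss value of 0.6931 is worth a closer look, because it is approximately ln 2, the cross-entropy of a two-class prediction that assigns equal scores to both classes. One plausible explanation, offered here as an assumption rather than a confirmed diagnosis, is that the last Dense layer applies a softmax across all 256*256*2 outputs while the loss is configured with from_logits=True and applies a second softmax per pixel. The sketch below reproduces that arithmetic in plain Python; the raw values are arbitrary stand-ins, not outputs of the trained model.

```python
import math

def softmax(values):
    """Numerically stable softmax over a flat list of numbers."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, true_class):
    """Sparse categorical cross-entropy for one pixel, given raw scores."""
    probs = softmax(logits)
    return -math.log(probs[true_class])

# Hypothetical raw outputs of the last Dense layer (256*256*2 values).
n_outputs = 256 * 256 * 2
raw = [math.sin(i) for i in range(n_outputs)]  # arbitrary stand-in values

# A softmax over ALL outputs at once makes every value tiny,
# because the 131072 entries have to sum to 1.
squashed = softmax(raw)

# With from_logits=True, the loss applies another softmax over the
# two values that belong to each pixel after the reshape.
pixel_values = squashed[0:2]  # the two class values of the first pixel
loss = cross_entropy_from_logits(pixel_values, true_class=0)

print(max(squashed))          # largest value after the first softmax
print(loss, math.log(2))      # per-pixel loss next to ln 2
```

Because the two values per pixel are nearly identical after the first softmax, the per-pixel loss stays close to ln 2 whatever the raw scores are. Accuracy is unaffected by this, since it depends only on which of the two values is larger.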

5. CNN with Dice Metric Loss

One metric we can use to more accurately determine how well our network is segmenting LV is called the Dice metric or Sorensen-Dice coefficient, among other names. This is a metric to compare the similarity of two samples. In our case we’ll use it to compare the two areas of interest, i.e., the area of the expertly-labelled contour and the area of our predicted contour.
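
For reference, the coefficient can be written as follows, where A stands for the set of pixels in the expertly labelled contour and B for the set of pixels in the predicted contour (the symbols A, B and s are notation chosen here, not taken from the dataset documentation). The second form adds a smoothing constant s to the numerator and denominator, which matches the `smooth` argument of the function defined next and avoids a division by zero when both regions are empty.

```latex
\mathrm{Dice}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|},
\qquad
\mathrm{Dice}_{s}(A, B) = \frac{2\,|A \cap B| + s}{|A| + |B| + s}
```

Without smoothing, the value ranges from 0 (no overlap) to 1 (the two regions coincide).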

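from tensorflow.keras import backend as K

#dice coef function
def dice_coef(y_true, y_pred, smooth=1):
    # predicted class index per pixel, reshaped to match the label
    indices = K.argmax(y_pred, 3)
    indices = K.reshape(indices, [-1, 256, 256, 1])

    # cast both masks to float32 so they can be multiplied together
    true_cast = K.cast(y_true, dtype='float32')
    indices_cast = K.cast(indices, dtype='float32')

    axis = [1, 2, 3]
    intersection = K.sum(true_cast * indices_cast, axis=axis)
    # sum of the two areas (the Dice denominator), not a set union
    union = K.sum(true_cast, axis=axis) + K.sum(indices_cast, axis=axis)
    dice = K.mean((2. * intersection + smooth)/(union + smooth), axis=0)

    return dice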


#clear the backend session to free up memory
tf.keras.backend.clear_session()

from tensorflow.keras.layers import Conv2D, MaxPool2D, Conv2DTranspose

#define the layers of the model architecture
layers = [
    Conv2D(input_shape=[256, 256, 1],
           filters=100,
           kernel_size=5,
           strides=2,
           padding="same",
           activation=tf.nn.relu,
           name="Conv1"),
    MaxPool2D(pool_size=2, strides=2, padding="same"),
    Conv2D(filters=200,
           kernel_size=5,
           strides=2,
           padding="same",
           activation=tf.nn.relu),
    MaxPool2D(pool_size=2, strides=2, padding="same"),
    Conv2D(filters=300,
           kernel_size=3,
           strides=1,
           padding="same",
           activation=tf.nn.relu),
    Conv2D(filters=300,
           kernel_size=3,
           strides=1,
           padding="same",
           activation=tf.nn.relu),
    Conv2D(filters=2,
           kernel_size=1,
           strides=1,
           padding="same",
           activation=tf.nn.relu),
    Conv2DTranspose(filters=2, kernel_size=31, strides=16, padding="same")
]

#create a sequential model with the defined layers
model = tf.keras.models.Sequential(layers)

#compiling the model
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

#setting up and running the model
EPOCHS = 20
STEPS_PER_EPOCH = len(list(parsed_train_data))
VALIDATION_STEPS = 26

model_history = model.fit(train_dataset, epochs=EPOCHS,
                          steps_per_epoch=STEPS_PER_EPOCH,
                          validation_steps=VALIDATION_STEPS,
                          validation_data=test_dataset)

Output:

The model had a loss of 0.0871 and accuracy of 0.9830.
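
To see how this convolutional stack returns a prediction at the original 256x256 resolution, it helps to trace the spatial size through each layer. The sketch below is plain Python and assumes the usual behaviour of padding="same": a convolution or pooling layer produces ceil(size / stride), and a transposed convolution produces size * stride. The helper names are introduced here for illustration.

```python
import math

def same_padding_out(size, stride):
    """Spatial output size of a conv or pooling layer with padding='same'."""
    return math.ceil(size / stride)

def transpose_same_out(size, stride):
    """Spatial output size of a transposed conv with padding='same'."""
    return size * stride

# (layer description, stride) in the order used by the CNN in this section
downsampling = [
    ("Conv1 5x5", 2),
    ("MaxPool", 2),
    ("Conv 5x5", 2),
    ("MaxPool", 2),
    ("Conv 3x3", 1),
    ("Conv 3x3", 1),
    ("Conv 1x1", 1),
]

size = 256
trace = [size]
for name, stride in downsampling:
    size = same_padding_out(size, stride)
    trace.append(size)
    print(f"{name:10s} -> {size}x{size}")

# Conv2DTranspose with strides=16 undoes the four factor-2 reductions
size = transpose_same_out(size, 16)
trace.append(size)
print(f"Transpose  -> {size}x{size}")
```

The four stride-2 layers shrink the image by a factor of 16 in each dimension, which is why the final Conv2DTranspose uses strides=16 to bring the two-channel output back to the input resolution.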

We will run the model again, this time for 30 epochs, and track the Dice coefficient as an additional metric.

tf.keras.backend.clear_session()

# reuse the layers defined above; these are the same layer objects as before,
# so they are expected to keep the weights learned in the previous run
model = tf.keras.models.Sequential(layers) # the model

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[dice_coef,'accuracy'])

# setting up and running the model using 30 passes
EPOCHS = 30
STEPS_PER_EPOCH = len(list(parsed_train_data))

model_history = model.fit(train_dataset, epochs=EPOCHS,
                          steps_per_epoch=STEPS_PER_EPOCH,
                          validation_data=test_dataset)

#evaluating the model using the test dataset
model.evaluate(test_dataset)

Output:

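The loss improved only slightly, to 0.0863, and the accuracy stayed at 0.983, while the Dice coefficient was 0.049. Since the Dice coefficient is close to 1 when the predicted and labelled regions coincide, a value this low indicates that the predicted LV region barely overlaps the expert contour, which the accuracy figure alone does not reveal.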
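
The gap between the accuracy and the Dice coefficient reported here is easier to interpret with a toy case. The sketch below uses plain Python and a made-up 256x256 label with a small LV region (not data from the challenge dataset) to show that a prediction which labels every pixel as background can still score a high pixel accuracy while its Dice coefficient stays near zero.

```python
SIZE = 256

# Hypothetical label mask: a 30x30 block of LV pixels (1) on background (0).
label = [[1 if 100 <= r < 130 and 100 <= c < 130 else 0 for c in range(SIZE)]
         for r in range(SIZE)]

# A degenerate prediction that calls every pixel background.
prediction = [[0] * SIZE for _ in range(SIZE)]

def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted class matches the label."""
    pairs = [(t, p) for t_row, p_row in zip(y_true, y_pred)
             for t, p in zip(t_row, p_row)]
    return sum(t == p for t, p in pairs) / len(pairs)

def dice(y_true, y_pred, smooth=1):
    """Smoothed Dice: (2 * overlap + smooth) / (area_true + area_pred + smooth)."""
    overlap = sum(t * p for t_row, p_row in zip(y_true, y_pred)
                  for t, p in zip(t_row, p_row))
    area_true = sum(map(sum, y_true))
    area_pred = sum(map(sum, y_pred))
    return (2 * overlap + smooth) / (area_true + area_pred + smooth)

accuracy = pixel_accuracy(label, prediction)
dice_score = dice(label, prediction)
print(f"accuracy = {accuracy:.4f}, dice = {dice_score:.4f}")
```

In this toy case only 900 of the 65536 pixels belong to the LV, so missing all of them costs little accuracy but leaves no overlap for the Dice coefficient to reward.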

Written by Kevin Kibe