
Skin Cancer AI

Added to IoTplaybook or last updated on: 08/26/2019

Story

Why did we build Skin Cancer AI?

According to the Skin Cancer Foundation, half of the population in the United States is diagnosed with some form of skin cancer by age 65. The survival rate for early detection is almost 98%, but it falls to 62% when the cancer reaches the lymph nodes and 18% when it metastasizes to distant organs. With Skin Cancer AI, we want to use the power of Artificial Intelligence to make early detection as widely available as possible.

We originally built this project at the TechCrunch Disrupt Hackathon. After the hackathon and demos, we received thousands of emails asking us to take the project further, which motivated us to keep developing it.

Things used in this project

Hardware components

  • Intel Movidius Neural Compute Stick × 1
  • UP² Grove IoT Development Kit × 1
  • Thundercomm AI Kit × 1

Software apps and online services

  • Snappy Ubuntu Core

What is AI and how to use it

Deep learning has been a big trend in machine learning lately, and its recent success has paved the way for projects like this one. We are going to focus specifically on computer vision and image classification. To do this, we will build a nevus, melanoma, and seborrheic keratosis image classifier using a deep learning algorithm, the Convolutional Neural Network (CNN), through the Caffe framework.

In this article we will focus on supervised learning, which requires training on a server as well as deploying on the edge. Our goal is to build a machine learning model that can detect cancer images in real time, so you can build your own AI-based skin cancer classification device.

Our application includes two parts. The first part is training, where we use sets of cancer images and their corresponding labels to train a machine learning model. The second part is deployment on the edge, where we run the same trained model on an edge device, in this case the Movidius Neural Compute Stick.

Training on the server and deploying on the edge

Traditional Machine Learning (ML) vs. Deep Learning

This is probably the most frequently asked question in AI. To understand the difference, we first have to look at how machine learning image classification works.

Traditional machine learning requires feature extraction followed by model training. We first have to use domain knowledge to extract features that the ML model can use; common examples include SIFT and HoG. After that, we train the model on a dataset containing all the image features and their labels.
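As a toy illustration, below is a minimal sketch of that manual step using OpenCV's built-in HoG descriptor (the file name is just a placeholder):

import cv2

# hand-engineered gradient features: the manual step that deep learning
# replaces with learned convolutional filters
img = cv2.imread('nevus-00.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (64, 128))    # default HoG window size (width, height)
hog = cv2.HOGDescriptor()
features = hog.compute(img)         # fixed-length feature vector
print(features.shape)               # these features would feed an SVM or similar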

The major difference between traditional ML and deep learning is feature engineering. Traditional ML uses manually programmed features, whereas deep learning learns them automatically. Feature engineering is relatively difficult because it requires domain expertise and is very time-consuming. Deep learning requires no manual feature engineering and can be more accurate.

Traditional Machine Learning vs. Deep Learning

Artificial Neural Networks (ANNs)

According to Techopedia, "An artificial neural network (ANN) is a computational model based on the structure and functions of biological neural networks." An artificial neuron emulates how a biological neuron works: it has a finite number of inputs, weights associated with them, and an activation function. The activation function of a node defines the output of that node given an input or set of inputs; it is non-linear so that the network can encode complex patterns in the data. When input comes in, the activation function is applied to the weighted sum of the inputs to generate the output. The artificial neurons are connected to one another to form a network, hence the name artificial neural network (ANN).
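As a toy sketch (all numbers arbitrary), a single artificial neuron is just a weighted sum passed through a non-linear activation:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # non-linear activation function

x = np.array([0.5, -1.2, 3.0])        # inputs
w = np.array([0.4, 0.7, -0.2])        # one weight per input
b = 0.1                               # bias
print(sigmoid(np.dot(w, x) + b))      # the neuron's output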

Biological neuron vs. Artificial neuron. Source (Wikipedia)

A feedforward neural network is an artificial neural network wherein connections between the nodes do not form a cycle; it is the simplest form of ANN. It has three layers: input, hidden, and output. Data comes in through the input layer and flows through the hidden layer to the output nodes, as in the figure below. We can have multiple hidden layers, and the complexity of the model is correlated with the size of the hidden layers.
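Continuing the toy sketch, a full forward pass through one hidden layer is just a chain of those weighted sums (random weights here, purely to show the shapes):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.random.rand(4)             # input layer: 4 features
W1 = np.random.rand(5, 4)         # weights into a 5-neuron hidden layer
W2 = np.random.rand(3, 5)         # weights into a 3-neuron output layer
hidden = sigmoid(W1.dot(x))       # input -> hidden
output = sigmoid(W2.dot(hidden))  # hidden -> output
print(output)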

Feedforward neural network. Source (Wikipedia)

Training data and a loss function are the two elements used in training a neural network. The training data is composed of images and their corresponding labels, and the loss function measures the inaccuracy of the classification. Once we have those two elements, we use the backpropagation algorithm and gradient descent to train the ANN.
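As a minimal sketch of the idea, here is gradient descent on a single linear neuron with a squared-error loss (toy values; real training backpropagates the gradient through every layer):

import numpy as np

x, y = np.array([1.0, 2.0]), 1.0   # one training example and its label
w = np.zeros(2)                    # weights to learn
lr = 0.1                           # learning rate

for step in range(10):
    pred = np.dot(w, x)            # forward pass
    loss = (pred - y) ** 2         # squared-error loss
    grad = 2 * (pred - y) * x      # gradient of the loss w.r.t. the weights
    w -= lr * grad                 # step downhill along the gradient
print(loss, w)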

Convolutional Neural Networks (CNNs)

A convolutional neural network is a class of deep, feedforward artificial neural networks most commonly applied to analyzing visual imagery, because it is designed to emulate the behavior of the animal visual cortex. It consists of convolutional layers and pooling layers so that the network can encode image properties.

Typical CNN network. Source (Wikipedia)

The convolutional layer's parameters consist of a set of learnable filters (or kernels) that have a small receptive field. Each filter is convolved across the width and height of the input, computing dot products between the entries of the filter and the input and producing a 2-dimensional activation map for that filter. In this way the network learns filters that activate when they detect specific spatial features in the input image.
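Below is a minimal sketch of that operation on a single channel (toy sizes, no padding, stride 1):

import numpy as np

image = np.random.rand(6, 6)     # toy single-channel input
kernel = np.random.rand(3, 3)    # one learnable filter
activation = np.zeros((4, 4))    # output map: (6 - 3 + 1) x (6 - 3 + 1)

for i in range(4):
    for j in range(4):
        patch = image[i:i + 3, j:j + 3]            # the filter's receptive field
        activation[i, j] = np.sum(patch * kernel)  # dot product at this position
print(activation.shape)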

Neurons of a convolutional layer (blue), connected to their receptive field (red). Source (Wikipedia)

The pooling layer is a form of non-linear down-sampling. It partitions the input image into a set of non-overlapping rectangles and, for each sub-region, outputs the maximum. The idea is to progressively reduce the spatial size of the representation, which reduces the number of parameters and the amount of computation in the network and also helps control overfitting. Max pooling is the most common type of non-linear pooling. According to Wikipedia, "Pooling is often applied with filters of size 2x2 applied with a stride of 2 at every depth slice. A pooling layer of size 2x2 with stride of 2 shrinks the input image to a 1/4 of its original size."
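A minimal sketch of 2x2, stride-2 max pooling on a toy 4x4 feature map:

import numpy as np

x = np.arange(16.0).reshape(4, 4)   # toy 4x4 feature map
# split into non-overlapping 2x2 blocks and keep each block's maximum
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)                       # 2x2 output: 1/4 of the original size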

Max pooling. Source (Wikipedia)

Skin Cancer AI Components

The equipment needed for this project is very simple. You can either build it with your computer and a USB Movidius Neural Compute Stick, or use embedded hardware such as the IoT devices below.

  • Up2 Board
  • Endoscope Camera
  • Movidius PCIe Add-on (Or USB Neural Computing Stick)
  • A screen or monitor

Skin Cancer AI Components

Step 1: Gather Image Dataset for Skin Cancers

We first need skin cancer datasets. Although there are many places to get them, the ISIC Archive is the easiest. For this part we need about 500 images each of nevus, melanoma, and seborrheic keratosis, as well as 500 random images of anything else. To get higher accuracy we would have to use more data; this is just enough to get training started. The easiest way to get the data is from https://isic-archive.com/#images

ISIC Archive

Afterward, we can gather the images into one folder, naming them nevus-00.jpg, melanoma-00.jpg, and so on, so we can build our LMDB easily.
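How you organize the downloads is up to you; as one possible sketch, assuming the images were first sorted into per-class folders under a hypothetical ./downloads/ directory, a few lines of Python can flatten them into that naming scheme:

import glob
import shutil

# hypothetical layout: ./downloads/<class>/*.jpg, flattened into ./input/train/
for cls in ['none', 'nevus', 'melanoma', 'seborrheic']:
    paths = sorted(glob.glob('./downloads/%s/*.jpg' % cls))
    for idx, path in enumerate(paths):
        shutil.copy(path, './input/train/%s-%02d.jpg' % (cls, idx))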

Getting the data from ISIC

Step 2: Setting Up Server for Training

Machine learning training uses a lot of processing power and can therefore be expensive. In this article we use the Intel® AI DevCloud, which is free for Intel® AI Academy members. It runs on Intel® Xeon® Scalable processors and provides Intel-optimized Caffe as well as other ML frameworks. You can get access at https://software.intel.com/en-us/ai-academy/tools/devcloud.

To build Doctor Hazel we are using the Caffe framework, mainly because of the Intel Movidius NCS's edge support. Caffe is a deep learning framework developed by the Berkeley Vision and Learning Center (BVLC). There are four steps to training our model with Caffe:

  • Data preparation: we clean the images and store them in LMDB.
  • Model definition (train.prototxt): define the parameters and choose a CNN architecture.
  • Solver definition (solver.prototxt): define the solver parameters for model optimization.
  • Model training: we execute the caffe command to get our .caffemodel algorithm file.

On Intel DevCloud, we can check that these tools are available simply by going to

cd /glob/deep-learning/py-faster-rcnn/caffe-fast-rcnn/build/tools

Step 3: Preparing LMDB for Training

Once we have the Intel DevCloud set up, we can create the directory structure:

mkdir doctorhazel
cd doctorhazel
mkdir input
cd input
mkdir train

From our local machine, we can then copy all the data we've previously gathered into that folder:

scp ./* colfax:/home/[your_username]/doctorhazel/input/train/

After that, we can build our LMDB with the script below. It works through the following steps:

  • 5/6 of the dataset is used for training and 1/6 for validation, so we can measure the accuracy of the model.
  • All images are resized to 227x227 to follow the same standard as BVLC.
  • Histogram equalization is applied to all training images to adjust the contrast.
  • The results are stored in train_lmdb and validation_lmdb.
  • make_datum is used to label every image inside the LMDB.

import os
import glob
import random
import numpy as np
import cv2
import caffe
from caffe.proto import caffe_pb2
import lmdb
#We use 227x227 from BVLC
IMAGE_WIDTH = 227
IMAGE_HEIGHT = 227
def transform_img(img, img_width=IMAGE_WIDTH, img_height=IMAGE_HEIGHT):
   img[:, :, 0] = cv2.equalizeHist(img[:, :, 0])
   img[:, :, 1] = cv2.equalizeHist(img[:, :, 1])
   img[:, :, 2] = cv2.equalizeHist(img[:, :, 2])
   img = cv2.resize(img, (img_width, img_height), interpolation = cv2.INTER_CUBIC)
   return img
def make_datum(img, label):
   return caffe_pb2.Datum(
       channels=3,
       width=IMAGE_WIDTH,
       height=IMAGE_HEIGHT,
       label=label,
       data=np.rollaxis(img, 2).tostring())
train_lmdb = '/home/[your_username]/doctorhazel/input/train_lmdb'
validation_lmdb = '/home/[your_username]/doctorhazel/input/validation_lmdb'
os.system('rm -rf  ' + train_lmdb)
os.system('rm -rf  ' + validation_lmdb)
train_data = [img for img in glob.glob("./input/train/*jpg")]
random.shuffle(train_data)
print 'Creating train_lmdb'
in_db = lmdb.open(train_lmdb, map_size=int(1e12))
with in_db.begin(write=True) as in_txn:
   for in_idx, img_path in enumerate(train_data):
       # every 6th image is held out for the validation set; skip it here
       if in_idx % 6 == 0:
           continue
       img = cv2.imread(img_path, cv2.IMREAD_COLOR)
       img = transform_img(img, img_width=IMAGE_WIDTH, img_height=IMAGE_HEIGHT)
       # label by filename prefix: 0 = none, 1 = nevus, 2 = melanoma,
       # 3 = seborrheic keratosis (any other name)
       if 'none' in img_path:
           label = 0
       elif 'nevus' in img_path:
           label = 1
       elif 'melanoma' in img_path:
           label = 2
       else:
           label = 3
       datum = make_datum(img, label)
       in_txn.put('{:0>5d}'.format(in_idx), datum.SerializeToString())
       print '{:0>5d}'.format(in_idx) + ':' + img_path
in_db.close()
print '\nCreating validation_lmdb'
in_db = lmdb.open(validation_lmdb, map_size=int(1e12))
with in_db.begin(write=True) as in_txn:
   for in_idx, img_path in enumerate(train_data):
       # keep only every 6th image: this is the validation split
       if in_idx % 6 != 0:
           continue
       img = cv2.imread(img_path, cv2.IMREAD_COLOR)
       img = transform_img(img, img_width=IMAGE_WIDTH, img_height=IMAGE_HEIGHT)
       if 'none' in img_path:
           label = 0
       elif 'nevus' in img_path:
           label = 1
       elif 'melanoma' in img_path:
           label = 2
       else:
           label = 3
       datum = make_datum(img, label)
       in_txn.put('{:0>5d}'.format(in_idx), datum.SerializeToString())
       print '{:0>5d}'.format(in_idx) + ':' + img_path
in_db.close()
print '\nFinished processing all images'

Afterwards, we run the script

python2 create_lmdb.py

to build the LMDBs. Once that is done, we need the mean image of the training data. As part of Caffe, we can compute it with

cd /glob/deep-learning/py-faster-rcnn/caffe-fast-rcnn/build/tools
compute_image_mean -backend=lmdb /home/[your_user]/doctorhazel/input/train_lmdb /home/[your_user]/doctorhazel/input/mean.binaryproto

The command above generates the mean image of the training data. Each input image has the mean image subtracted from it so that every feature pixel has zero mean. This is a commonly used preprocessing step in supervised machine learning.
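Caffe's Data layer applies this automatically through the mean_file setting we use in the next step, but as a sketch of what happens under the hood:

import numpy as np
import caffe
from caffe.proto import caffe_pb2

# load the mean image generated by compute_image_mean
blob = caffe_pb2.BlobProto()
with open('./input/mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())
mean = caffe.io.blobproto_to_array(blob)[0]  # shape: (channels, height, width)

img = np.random.rand(*mean.shape) * 255      # stand-in for a training image
zero_centered = img - mean                   # every feature pixel is now zero-mean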

Step 4: Set Up Model Definition and Solver Definition

We now need to set up the model definition and the solver definition. In this article we use bvlc_reference_caffenet, which can be found at https://github.com/BVLC/caffe/tree/master/models/bvlc_reference_caffenet

Below is the modified version of train.prototxt

name: "CaffeNet"
layer {
 name: "data"
 type: "Data"
 top: "data"
 top: "label"
 include {
   phase: TRAIN
 }
 transform_param {
   mirror: true
   crop_size: 227
   mean_file: "/home/[your_username]/doctorhazel/input/mean.binaryproto"
 }
 data_param {
   source: "/home/[your_username]/doctorhazel/input/train_lmdb"
   batch_size: 128
   backend: LMDB
 }
}
layer {
 name: "data"
 type: "Data"
 top: "data"
 top: "label"
 include {
   phase: TEST
 }
 transform_param {
   mirror: false
   crop_size: 227
   mean_file: "/home/[your_username]/doctorhazel/input/mean.binaryproto"
 }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: true
#  }
 data_param {
    source: "/home/[your_username]/doctorhazel/input/validation_lmdb"
   batch_size: 36
   backend: LMDB
 }
}
layer {
 name: "conv1"
 type: "Convolution"
 bottom: "data"
 top: "conv1"
 param {
   lr_mult: 1
   decay_mult: 1
 }
 param {
   lr_mult: 2
   decay_mult: 0
 }
 convolution_param {
   num_output: 96
   kernel_size: 11
   stride: 4
   weight_filler {
     type: "gaussian"
     std: 0.01
   }
   bias_filler {
     type: "constant"
     value: 0
   }
 }
}
layer {
 name: "relu1"
 type: "ReLU"
 bottom: "conv1"
 top: "conv1"
}
layer {
 name: "pool1"
 type: "Pooling"
 bottom: "conv1"
 top: "pool1"
 pooling_param {
   pool: MAX
   kernel_size: 3
   stride: 2
 }
}
layer {
 name: "norm1"
 type: "LRN"
 bottom: "pool1"
 top: "norm1"
 lrn_param {
   local_size: 5
   alpha: 0.0001
   beta: 0.75
 }
}
layer {
 name: "conv2"
 type: "Convolution"
 bottom: "norm1"
 top: "conv2"
 param {
   lr_mult: 1
   decay_mult: 1
 }
 param {
   lr_mult: 2
   decay_mult: 0
 }
 convolution_param {
   num_output: 256
   pad: 2
   kernel_size: 5
   group: 2
   weight_filler {
     type: "gaussian"
     std: 0.01
   }
   bias_filler {
     type: "constant"
     value: 1
   }
 }
}
layer {
 name: "relu2"
 type: "ReLU"
 bottom: "conv2"
 top: "conv2"
}
layer {
 name: "pool2"
 type: "Pooling"
 bottom: "conv2"
 top: "pool2"
 pooling_param {
   pool: MAX
   kernel_size: 3
   stride: 2
 }
}
layer {
 name: "norm2"
 type: "LRN"
 bottom: "pool2"
 top: "norm2"
 lrn_param {
   local_size: 5
   alpha: 0.0001
   beta: 0.75
 }
}
layer {
 name: "conv3"
 type: "Convolution"
 bottom: "norm2"
 top: "conv3"
 param {
   lr_mult: 1
   decay_mult: 1
 }
 param {
   lr_mult: 2
   decay_mult: 0
 }
 convolution_param {
   num_output: 384
   pad: 1
   kernel_size: 3
   weight_filler {
     type: "gaussian"
     std: 0.01
   }
   bias_filler {
     type: "constant"
     value: 0
   }
 }
}
layer {
 name: "relu3"
 type: "ReLU"
 bottom: "conv3"
 top: "conv3"
}
layer {
 name: "conv4"
 type: "Convolution"
 bottom: "conv3"
 top: "conv4"
 param {
   lr_mult: 1
   decay_mult: 1
 }
 param {
   lr_mult: 2
   decay_mult: 0
 }
 convolution_param {
   num_output: 384
   pad: 1
   kernel_size: 3
   group: 2
   weight_filler {
     type: "gaussian"
     std: 0.01
   }
   bias_filler {
     type: "constant"
     value: 1
   }
 }
}
layer {
 name: "relu4"
 type: "ReLU"
 bottom: "conv4"
 top: "conv4"
}
layer {
 name: "conv5"
 type: "Convolution"
 bottom: "conv4"
 top: "conv5"
 param {
   lr_mult: 1
   decay_mult: 1
 }
 param {
   lr_mult: 2
   decay_mult: 0
 }
 convolution_param {
   num_output: 256
   pad: 1
   kernel_size: 3
   group: 2
   weight_filler {
     type: "gaussian"
     std: 0.01
   }
   bias_filler {
     type: "constant"
     value: 1
   }
 }
}
layer {
 name: "relu5"
 type: "ReLU"
 bottom: "conv5"
 top: "conv5"
}
layer {
 name: "pool5"
 type: "Pooling"
 bottom: "conv5"
 top: "pool5"
 pooling_param {
   pool: MAX
   kernel_size: 3
   stride: 2
 }
}
layer {
 name: "fc6"
 type: "InnerProduct"
 bottom: "pool5"
 top: "fc6"
 param {
   lr_mult: 1
   decay_mult: 1
 }
 param {
   lr_mult: 2
   decay_mult: 0
 }
 inner_product_param {
   num_output: 4096
   weight_filler {
     type: "gaussian"
     std: 0.005
   }
   bias_filler {
     type: "constant"
     value: 1
   }
 }
}
layer {
 name: "relu6"
 type: "ReLU"
 bottom: "fc6"
 top: "fc6"
}
layer {
 name: "drop6"
 type: "Dropout"
 bottom: "fc6"
 top: "fc6"
 dropout_param {
   dropout_ratio: 0.5
 }
}
layer {
 name: "fc7"
 type: "InnerProduct"
 bottom: "fc6"
 top: "fc7"
 param {
   lr_mult: 1
   decay_mult: 1
 }
 param {
   lr_mult: 2
   decay_mult: 0
 }
 inner_product_param {
   num_output: 4096
   weight_filler {
     type: "gaussian"
     std: 0.005
   }
   bias_filler {
     type: "constant"
     value: 1
   }
 }
}
layer {
 name: "relu7"
 type: "ReLU"
 bottom: "fc7"
 top: "fc7"
}
layer {
 name: "drop7"
 type: "Dropout"
 bottom: "fc7"
 top: "fc7"
 dropout_param {
   dropout_ratio: 0.5
 }
}
layer {
 name: "fc8"
 type: "InnerProduct"
 bottom: "fc7"
 top: "fc8"
 param {
   lr_mult: 1
   decay_mult: 1
 }
 param {
   lr_mult: 2
   decay_mult: 0
 }
 inner_product_param {
   num_output: 4
   weight_filler {
     type: "gaussian"
     std: 0.01
   }
   bias_filler {
     type: "constant"
     value: 0
   }
 }
}
layer {
 name: "accuracy"
 type: "Accuracy"
 bottom: "fc8"
 bottom: "label"
 top: "accuracy"
 include {
   phase: TEST
 }
}
layer {
 name: "loss"
 type: "SoftmaxWithLoss"
 bottom: "fc8"
 bottom: "label"
 top: "loss"
}

Visual representation of the Caffe model

At the same time, we can create deploy.prototxt, which is built off train.prototxt; it can be seen in the GitHub repo. We also create the label.txt file with the same classes we used when creating the LMDB:

classes
None
Nevus
Melanoma 
Seborrheic Keratosis

After that, we need the solver definition in solver.prototxt, which is used to optimize the training model. Because we are relying on the CPUs offered by Intel, we need to make some modifications to the solver definition below.

net: "/home/[your_username]/doctorhazel/model/train.prototxt"
test_iter: 50
test_interval: 50
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 50
display: 50
max_iter: 5000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
snapshot_prefix: "/home/[your_username]/doctorhazel/model"
solver_mode: CPU

Because we are dealing with a small amount of data, we can shorten the test interval and get our model as quickly as possible. With this configuration, the solver computes the accuracy of the model every 50 iterations using the validation set. Since we don't have a lot of data, the optimization process takes a snapshot every 1000 iterations and runs for a maximum of 5000 iterations. The configuration of lr_policy: "step", stepsize: 50, base_lr: 0.001 and gamma: 0.1 is fairly standard; other policies are described in the BVLC solver documentation.

Step 5: Training the Model

Since we are using the free Intel® AI DevCloud and have everything set up, we can use Intel Caffe, which is optimized for the Intel CPUs installed on the cluster. Because this is a cluster, we submit training as a job using the commands below.

cd /glob/deep-learning/py-faster-rcnn/caffe-fast-rcnn/build/tools
echo caffe train --solver ~/doctorhazel/model/solver.prototxt | qsub -o ~/doctorhazel/model/output.txt -e ~/doctorhazel/model/train.log

The trained models will be saved as model_iter_1000.caffemodel, model_iter_2000.caffemodel, and so on. With the data from ISIC you should obtain around 70 to 80% accuracy. You can plot your own learning curve with the commands below.

cd ~/doctorhazel
python2 plot_learning_curve.py ./model/train.log ./model/train.png
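The bundled plot_learning_curve.py does the parsing for us, but as a rough sketch (assuming the standard Caffe log format), the validation accuracy can be scraped straight out of train.log:

import re

log = open('./model/train.log').read()
# Caffe logs 'Iteration N, Testing net' followed by 'Test net output #0: accuracy = ...'
iters = [int(m) for m in re.findall(r'Iteration (\d+), Testing net', log)]
accs = [float(m) for m in re.findall(r'Test net output #\d+: accuracy = ([\d.]+)', log)]
for it, acc in zip(iters, accs):
    print(it, acc)   # validation accuracy every test_interval iterations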

Training Curve

Step 6: Deploying and Running on the Edge (Movidius NCS)

For this article we are using the Up2 Board so that we have an offline device we can carry around anywhere. The Up2 Board comes with Ubuntu 16.04 already installed, which makes things a bit easier. On the device, we first create a folder and copy over everything we trained on the server.

mkdir -p ~/workspace
cd workspace
mkdir doctorhazel
cd doctorhazel
mkdir CancerNet
scp colfax:/home/[your_username]/doctorhazel/model/* ./CancerNet

Afterwards, we need to install the Movidius NCS SDK via https://developer.movidius.com/start so that we can run our programs on the edge. The commands are:

cd ~/workspace
git clone https://github.com/movidius/ncsdk.git
cd ~/workspace/ncsdk
make install

Full instructions can be seen in the video below.

VIDEO

Installing Movidius NCS

Next, we need to download ncappzoo, the sample-app repository also created by Movidius. The specific app we need is stream_infer, which we copy from the example files:

cd ~/workspace
git clone https://github.com/movidius/ncappzoo.git
cd doctorhazel
cp ~/workspace/ncappzoo/apps/stream_infer/* ./

Then change the network settings in stream_infer.py to point at the CancerNet folder we've just created:

NETWORK_IMAGE_WIDTH = 227                     # the width of images the network requires
NETWORK_IMAGE_HEIGHT = 227                    # the height of images the network requires
NETWORK_IMAGE_FORMAT = "BGR"                  # the format of the images the network requires
NETWORK_DIRECTORY = "./CancerNet/"            # directory of the network
NETWORK_STAT_TXT = "./squeezenet_stat.txt"    # stat.txt for network
NETWORK_CATEGORIES_TXT = "./CancerNet/label.txt" # categories.txt for network

For the last and final part, we compile a graph file for the Movidius NCS SDK and run the stream inference app:

cd ~/workspace/doctorhazel/CancerNet/
mvNCCompile deploy.prototxt -w 
cd ..
python3 stream_infer.py

And from here, you pretty much have a Doctor Hazel of your own.

Showing a regular mole

Showing melanoma

We've used Intel DevCloud with Intel-optimized Caffe in the cloud, and the Movidius Neural Compute Stick on the edge.

Step 7: Casing

Schematics

Base model (download)

Code

Nyceane / blue-scan — Blue Scan AI for skin cancer

Latest commit to the master branch on 1-30-2019

Credits

Peter Ma
Prototype Hacker, Intel Software Innovator, Hackathon Goer, World Traveler, Ecological balancer, integrationist, technologist, futurist.

Sarah Han
Software Engineer, Design, 3D

Josue
Software Dev

serena xu
hell-o

Phil Levy

Shterny Isseroff