DeepMicroscopy – my portable ML laboratory


Today I’m very happy to finally release my open source project DeepMicroscopy.
In this project I have built a platform where you can capture images from a microscope, annotate them, train a TensorFlow model and finally observe real-time object detection.
The platform runs on a Jetson Nano device, so the whole setup stays compact and portable.

The project code is available on my GitHub: https://github.com/qooba/deepmicroscopy

Before you continue reading, please watch the quick introduction:

1. Architecture

The solution requires three devices:
* Microscope with usb camera – e.g. Velleman CAMCOLMS3 2Mpx
* Inference server – Jetson Nano
* Training server – PC equipped with GPU card e.g. NVIDIA GTX 1050 Ti

The whole solution is built from Docker images, so below I describe the components installed on each device.

Jetson

The Jetson device contains three components:
* Frontend – Vue application running on Nginx
* Backend – Python application which is the core of the solution
* Storage – Minio storage where projects, images and annotations are stored

Training Server

The training server contains two components:
* Frontend – Vue application running on Nginx
* Backend – Python application which handles the training logic

2. Platform functionalities

Most of the platform’s functionality is installed on the Jetson Nano. Because the compute capabilities of the Jetson Nano are insufficient for model training, I have decided to split training into three stages, which I describe in the Training section.

Projects management

In DeepMicroscopy you can create multiple projects in which you annotate and recognize different objects.

You can create and switch projects in the top left menu. The data of each project is kept in a separate bucket in the Minio storage.

Images Capture

When you open the Capture panel in the web application and click the Play ▶ button, a WebRTC connection between the browser and the backend is created (I have used the aiortc Python library). To make it work in the Chrome browser we need two things:
* use TLS for the web application – a self-signed certificate is already configured in nginx
* allow the application to use the camera – you have to set it in the browser

Now we can stream the image from the camera to the browser (I have used the OpenCV library to fetch the image from the microscope over USB).

When we decide to capture a specific frame and click the Plus ✚ button, the backend saves the current frame into the project bucket in the Minio storage.

Annotation

The annotation engine is based on the VGG Image Annotator (VIA). Here you can see all the images you have captured for a specific project. There are a lot of features, e.g. switching between images (left/right arrow), zooming in/out (+/-) and of course annotation tools with different shapes (currently the training algorithm expects rectangles) and attributes (by default the class attribute is added, which is also expected by the training algorithm).

This is a rather painstaking, manual task, so when you finish remember to save the annotations by clicking the save button (currently there is no auto save). When you save, the project file (in the VIA schema) is written to the project bucket.
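For readers who want to consume the saved project file outside the platform, here is a minimal sketch of pulling the rectangles and their class attribute out of it. It assumes the VIA 2.x project layout (an _via_img_metadata map whose entries carry filename and a list of regions). The helper name extract_boxes and the sample data are invented for illustration and are not taken from the DeepMicroscopy code.

```python
def extract_boxes(via_project):
    """Return (filename, class, x, y, width, height) for every labeled rectangle."""
    boxes = []
    # assumption: VIA 2.x layout, one metadata entry per image
    for entry in via_project.get("_via_img_metadata", {}).values():
        for region in entry.get("regions", []):
            shape = region.get("shape_attributes", {})
            if shape.get("name") != "rect":
                continue  # the training step expects rectangles only
            label = region.get("region_attributes", {}).get("class")
            if not label:
                continue  # skip rectangles without a class attribute
            boxes.append((entry["filename"], label,
                          shape["x"], shape["y"],
                          shape["width"], shape["height"]))
    return boxes


# hypothetical project with one rectangle and one polygon
sample = {
    "_via_img_metadata": {
        "cell_001.jpg12345": {
            "filename": "cell_001.jpg",
            "regions": [
                {"shape_attributes": {"name": "rect", "x": 10, "y": 20,
                                      "width": 30, "height": 40},
                 "region_attributes": {"class": "yeast"}},
                {"shape_attributes": {"name": "polygon",
                                      "all_points_x": [1, 2, 3],
                                      "all_points_y": [1, 2, 1]},
                 "region_attributes": {"class": "yeast"}},
            ],
        }
    }
}

print(extract_boxes(sample))
```

Only the rectangle survives the filter, which mirrors the note above that other shapes are not used by the training algorithm.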

Training

When we finish image annotation we can start model training. As mentioned before it is split into three stages.

Data package

First we have to prepare the data package (which contains the captured images and our annotations) by clicking the DATA button.

Training server

Then we drag and drop the data package onto the application running on the machine with higher compute capabilities.

After the upload the training server automatically extracts the data package, splits it into train/test data and starts training.
Currently I use the MobileNet V2 model architecture and start from the pretrained TensorFlow model.
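The train/test split mentioned above can be sketched in a few lines. This is only an illustration: the 80/20 ratio, the fixed seed and the function name split_train_test are assumptions, and the training server may split the data differently.

```python
import random


def split_train_test(items, test_fraction=0.2, seed=42):
    """Shuffle the items reproducibly and cut off a test subset."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    if not shuffled:
        return [], []
    # keep at least one test item so evaluation is always possible
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


# hypothetical file names of annotated images
images = ["img_%03d.jpg" % i for i in range(10)]
train, test = split_train_test(images)
print(len(train), len(test))  # 8 2
```

A fixed seed keeps the split stable between runs, which makes two training runs on the same data package comparable.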

When the training is finished the model is exported using TensorRT, which optimizes inference performance, especially on NVIDIA devices like the Jetson Nano.

During and after training you can inspect all models using the built-in TensorBoard.

The web application periodically checks the training state and when the training is finished we can download the model.

Uploading model

Finally we upload the TensorRT model back to the Jetson Nano device. The model is saved into the bucket of the selected project, so you can keep multiple models for each project.

Object detection

On the Execute panel we can choose a model from the drop-down list (which contains the models uploaded for the selected project) and load it by clicking RUN (it typically takes some time to load the model). When we click the Play ▶ button the application shows real-time object detection. If we want to change the model we can click CLEAR and then choose and RUN another model.

Additionally we can fetch detection statistics, which are sent over a WebSocket. Currently the number of detected items and the average width, height and score are returned.
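To make the statistics concrete, here is a small sketch of how such a summary could be computed for one frame. The detection layout (pixel box corners plus a score) and all field names are assumptions made for this example. The real payload sent over the WebSocket may look different.

```python
def detection_stats(detections):
    """Summarize one frame: item count and mean width, height and score."""
    count = len(detections)
    if count == 0:
        return {"count": 0, "avg_width": 0.0, "avg_height": 0.0, "avg_score": 0.0}
    widths = [d["xmax"] - d["xmin"] for d in detections]
    heights = [d["ymax"] - d["ymin"] for d in detections]
    scores = [d["score"] for d in detections]
    return {
        "count": count,
        "avg_width": sum(widths) / float(count),
        "avg_height": sum(heights) / float(count),
        "avg_score": sum(scores) / float(count),
    }


# two hypothetical detections in pixel coordinates
frame = [
    {"xmin": 10, "ymin": 20, "xmax": 50, "ymax": 80, "score": 0.5},
    {"xmin": 100, "ymin": 100, "xmax": 160, "ymax": 140, "score": 0.75},
]
print(detection_stats(frame))
```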

3. Setup

To start working with the Jetson Nano we first have to set up the Jetson Nano Developer Kit.

The whole platform runs on Docker and all Dockerfiles are included in the GitHub repository.

Because the Jetson Nano has the aarch64 / arm64 architecture, we need separate images for the Jetson components.

Jetson dockers:
* front – frontend web app
* app – backend web app
* minio – minio storage for aarch64 / arm64 architecture

Training Server dockers:
* serverfront – frontend app
* server – backend app

You can build the images yourself or use the prebuilt images from DockerHub.

The simplest option is to run run.app.sh on the Jetson Nano and run.server.sh on the training server, which will set up the whole platform.

Thanks for reading 🙂

FastAI with TensorRT on Jetson Nano


IoT and AI are among the hottest topics nowadays, and they meet on the Jetson Nano device.
In this article I’d like to show how to use the FastAI library, which is built on top of PyTorch, on the Jetson Nano. Additionally I will show how to optimize the FastAI model for use with TensorRT.

You can find the code on https://github.com/qooba/fastai-tensorrt-jetson.git.

1. Training

Although the Jetson Nano is equipped with a GPU, it should be used as an inference device rather than for training. Thus I will use another PC with a GTX 1050 Ti for the training.

Docker gives flexibility when you want to try different libraries, so I will use an image which contains the complete environment.

Training environment Dockerfile:

FROM nvcr.io/nvidia/tensorrt:20.01-py3
WORKDIR /
RUN apt-get update && apt-get -yq install python3-pil
RUN pip3 install jupyterlab torch torchvision
RUN pip3 install fastai
RUN apt update && DEBIAN_FRONTEND=noninteractive apt install -yq curl git cmake ack g++ tmux
RUN pip3 install ipywidgets && jupyter nbextension enable --py widgetsnbextension
CMD ["sh","-c", "jupyter lab --notebook-dir=/opt/notebooks --ip='0.0.0.0' --port=8888 --no-browser --allow-root --NotebookApp.password='' --NotebookApp.token=''"]

To use the GPU, additional NVIDIA drivers (included in the NVIDIA CUDA Toolkit) are needed.

If you don’t want to build the image yourself, simply run:

docker run --gpus all  --name jupyter -d --rm -p 8888:8888 -v $(pwd)/docker/gpu/notebooks:/opt/notebooks qooba/fastai:1.0.60-gpu

Now you can use the pets.ipynb notebook (the code is taken from lesson 1 of the FastAI course) to train and export the pets classification model.

from fastai.vision import *
from fastai.metrics import error_rate

# download dataset
path = untar_data(URLs.PETS)
path_anno = path/'annotations'
path_img = path/'images'
fnames = get_image_files(path_img)

# prepare data 
np.random.seed(2)
pat = r'/([^/]+)_\d+.jpg$'
bs = 16
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs).normalize(imagenet_stats)

# prepare model learner
learn = cnn_learner(data, models.resnet34, metrics=error_rate)

# train 
learn.fit_one_cycle(4)

# export
learn.export('/opt/notebooks/export.pkl')

Finally you get the pickled pets model (export.pkl).
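The regular expression in the notebook derives the class label from the file name: it captures everything between the last slash and the trailing _<number>.jpg. A small standalone check of that idea follows. The pattern is written here with an escaped dot so that it only matches a literal ".jpg", and the file paths and the helper name label_from_path are invented for the example.

```python
import re

# same idea as the pattern in the notebook, with the dot escaped
PATTERN = re.compile(r'/([^/]+)_\d+\.jpg$')


def label_from_path(path):
    """Return the class name encoded in the file name, or None."""
    match = PATTERN.search(str(path))
    return match.group(1) if match else None


print(label_from_path('/data/images/great_pyrenees_173.jpg'))  # great_pyrenees
print(label_from_path('/data/images/Abyssinian_1.jpg'))        # Abyssinian
```

Class names that contain underscores still work, because only the final _<number> part is cut off.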

2. Inference (Jetson Nano)

The Jetson Nano with the Jetson Nano Developer Kit already comes with Docker, so I will use it to set up the inference environment.

I have used the base image nvcr.io/nvidia/l4t-base:r32.2.1 and installed PyTorch and torchvision.
If you have JetPack 4.4 Developer Preview you can skip these steps and start with the base image nvcr.io/nvidia/l4t-pytorch:r32.4.2-pth1.5-py3.

The FastAI installation on the Jetson is more problematic because of the blis package. Finally I have found the solution here.

Additionally I have installed the torch2trt package, which converts a PyTorch model to TensorRT.

Finally I have used the TensorRT Python package from JetPack, which can be found in
/usr/lib/python3.6/dist-packages/tensorrt .

The final Dockerfile is:

FROM nvcr.io/nvidia/l4t-base:r32.2.1
WORKDIR /
# install pytorch 
RUN apt update && apt install -y --fix-missing make g++ python3-pip libopenblas-base
RUN wget https://nvidia.box.com/shared/static/ncgzus5o23uck9i5oth2n8n06k340l6k.whl -O torch-1.4.0-cp36-cp36m-linux_aarch64.whl
RUN pip3 install Cython
RUN pip3 install numpy torch-1.4.0-cp36-cp36m-linux_aarch64.whl
# install torchvision
RUN apt update && apt install libjpeg-dev zlib1g-dev git libopenmpi-dev openmpi-bin -yq
RUN git clone --branch v0.5.0 https://github.com/pytorch/vision torchvision
RUN cd torchvision && python3 setup.py install
# install fastai
RUN pip3 install jupyterlab
ENV TZ=Europe/Warsaw
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone && apt update && apt -yq install npm nodejs python3-pil python3-opencv
RUN apt update && apt -yq install python3-matplotlib
RUN git clone https://github.com/NVIDIA-AI-IOT/torch2trt.git /torch2trt && mv /torch2trt/torch2trt /usr/local/lib/python3.6/dist-packages && rm -r /torch2trt
COPY tensorrt /usr/lib/python3.6/dist-packages/tensorrt
RUN pip3 install --no-deps fastai
RUN git clone https://github.com/fastai/fastai /fastai
RUN apt update && apt install libblas3 liblapack3 liblapack-dev libblas-dev gfortran -yq
RUN curl -LO https://github.com/explosion/cython-blis/files/3566013/blis-0.4.0-cp36-cp36m-linux_aarch64.whl.zip && unzip blis-0.4.0-cp36-cp36m-linux_aarch64.whl.zip && rm blis-0.4.0-cp36-cp36m-linux_aarch64.whl.zip
COPY blis-0.4.0-cp36-cp36m-linux_aarch64.whl .
RUN pip3 install scipy pandas blis-0.4.0-cp36-cp36m-linux_aarch64.whl spacy fastai scikit-learn
CMD ["sh","-c", "jupyter lab --notebook-dir=/opt/notebooks --ip='0.0.0.0' --port=8888 --no-browser --allow-root --NotebookApp.password='' --NotebookApp.token=''"]

As before, you can skip the Docker image build and use the ready image:

docker run --runtime nvidia --network app_default --name jupyter -d --rm -p 8888:8888 -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix -v $(pwd)/docker/jetson/notebooks:/opt/notebooks qooba/fastai:1.0.60-jetson

Now we can open the Jupyter notebook on the Jetson and copy the pickled model file export.pkl from the PC.
The notebook jetson_pets.ipynb shows how to load the model.

import torch
from torch2trt import torch2trt
from fastai.vision import *
from fastai.metrics import error_rate

learn = load_learner('/opt/notebooks/')
learn.model.eval()
model=learn.model

# move the model to the GPU; the input batch prepared below has to be moved
# to the same device before it is passed to the model
if torch.cuda.is_available():
    model.to('cuda')

Additionally we can optimize the model using torch2trt package:

x = torch.ones((1, 3, 224, 224)).cuda()
model_trt = torch2trt(learn.model, [x])

Let’s prepare example input data:

import urllib.request

# download an example image
url, filename = ("https://github.com/pytorch/hub/raw/master/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)

from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0)

Finally we can run prediction for PyTorch and TensorRT model:

x = input_batch.cuda()  # both models live on the GPU, so the input has to as well
y = model(x)
y_trt = model_trt(x)

and compare PyTorch and TensorRT performance:

import time
import numpy as np

def prediction_time(model, x, runs=20):
    # time `runs` forward passes and report the mean latency and throughput
    times = []
    for _ in range(runs):
        start_time = time.time()
        model(x)
        times.append(time.time() - start_time)
    mean_delta = np.array(times).mean()
    fps = 1 / mean_delta
    print('average(sec):{},fps:{}'.format(mean_delta, fps))

prediction_time(model,x)
prediction_time(model_trt,x)

where for:
* PyTorch – average(sec):0.0446, fps:22.401
* TensorRT – average(sec):0.0094, fps:106.780

The TensorRT model is almost 5 times faster, so torch2trt is worth using.
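Dividing the two reported averages gives 0.0446 / 0.0094 ≈ 4.7, which is where the "almost 5 times" comes from. If you want to repeat the measurement, a slightly more defensive timing helper could look like the sketch below. It is not the code used for the numbers above: the warm-up runs and the optional sync callback are my additions, based on the fact that CUDA work is launched asynchronously, so a timer can stop before the GPU has finished.

```python
import time


def benchmark(fn, runs=20, warmup=3, sync=None):
    """Return (mean seconds per call, calls per second) for fn()."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    if sync is not None:
        sync()
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait until queued device work has finished
        times.append(time.perf_counter() - start)
    mean = sum(times) / len(times)
    fps = 1.0 / mean if mean > 0 else float("inf")
    return mean, fps


# CPU-only stand-in workload so the sketch runs anywhere;
# on the Jetson one would pass e.g.
#   benchmark(lambda: model_trt(x), sync=torch.cuda.synchronize)
mean, fps = benchmark(lambda: sum(range(10000)))
print("average(sec):{:.6f}, fps:{:.1f}".format(mean, fps))
```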


Boosting Elasticsearch with machine learning – Elasticsearch, RankLib, Docker


Elasticsearch is a powerful search engine. Its distributed architecture makes it possible to build scalable full-text search solutions. Additionally it provides a comprehensive query language.

Despite this, sometimes the engine and its search results are not enough to meet the expectations of users. In such situations it is possible to boost search quality using machine learning algorithms.

Before you continue reading, please watch the short introduction:

In this article I will show how to do this using the RankLib library and the LambdaMART algorithm. Moreover I have created a ready-to-use platform which:

  1. Indexes the data
  2. Helps to label the search results in a user-friendly way
  3. Trains the model
  4. Deploys the model to Elasticsearch
  5. Helps to test the model

The whole project runs on Docker using docker compose, so you can set it up very easily.
The platform is based on the Elasticsearch Learning to Rank plugin. I have also used the Python example described in that project.

Before you start you will need docker and docker-compose installed on your machine (https://docs.docker.com/get-started/).

To run the project you have to clone it:

git clone https://github.com/qooba/elasticsearch-learning-to-rank.git

Then, for Elasticsearch to work, you need to create a data folder with appropriate access rights:

cd elasticsearch-learning-to-rank/
mkdir docker/elasticsearch/esdata1
chmod g+rwx docker/elasticsearch/esdata1
chgrp 1000 docker/elasticsearch/esdata1

Finally you can run the project:

docker-compose -f app/docker-compose.yml up

Now you can open http://localhost:8020/.

1. Architecture

There are three main components:

A. The nginx reverse proxy with the Angular app
B. The Flask Python app which orchestrates the whole ML solution
C. Elasticsearch with the learning to rank plugin installed

A. Nginx

I have used the nginx reverse proxy to expose the Flask API and the Angular GUI, which guides you through the whole process.

ngnix.config

server {
    listen 80;
    server_name localhost;
    root /www/data;

    location / {
        autoindex on;
    }

    location /images/ {
        autoindex on;
    }

    location /js/ {
        autoindex on;
    }

    location /css/ {
        autoindex on;
    }

    location /training/ {
        proxy_set_header   Host                 $host;
        proxy_set_header   X-Real-IP            $remote_addr;
        proxy_set_header   X-Forwarded-For      $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto    $scheme;
        proxy_set_header Host $http_host;

        proxy_pass http://training-app:5090;
    }
}

B. Flask python app

This is the core of the project. It exposes api for:

  • Indexing
  • Labeling
  • Training
  • Testing

It calls Elasticsearch directly to get the data and make modifications.
Because training with RankLib requires Java, the Dockerfile for this part installs default-jre. Additionally it downloads RankLib-2.8.jar and tmdb.json (which is used as the default data source) from: http://es-learn-to-rank.labs.o19s.com/.

Dockerfile

FROM python:3

RUN \
    apt update && \
    apt-get -yq install default-jre
RUN mkdir -p /opt/services/flaskapp/src
COPY . /opt/services/flaskapp/src
WORKDIR /opt/services/flaskapp/src
RUN pip install -r requirements.txt
RUN python /opt/services/flaskapp/src/prepare.py
EXPOSE 5090
CMD ["python", "-u", "app.py"]

C. Elastic search

As mentioned before, this is an Elasticsearch instance with the learning to rank plugin installed.

Dockerfile

FROM docker.elastic.co/elasticsearch/elasticsearch:6.2.4
RUN /usr/share/elasticsearch/bin/elasticsearch-plugin install \
    -b http://es-learn-to-rank.labs.o19s.com/ltr-1.1.0-es6.2.4.zip

All layers are composed with docker-compose.yml:

version: '2.2'
services:
  elasticsearch:
    build: ../docker/elasticsearch
    container_name: elasticsearch
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ../docker/elasticsearch/esdata1:/usr/share/elasticsearch/data
    networks:
      - esnet

  training-app:
    build: ../docker/training-app
    networks:
      - esnet
    depends_on:
      - elasticsearch
    environment:
      - ES_HOST=http://elasticsearch:9200
      - ES_INDEX=tmdb
      - ES_TYPE=movie
    volumes:
      - ../docker/training-app:/opt/services/flaskapp/src

  nginx:
    image: "nginx:1.13.5"
    ports:
      - "8020:80"
    volumes:
      - ../docker/frontend-reverse-proxy/conf:/etc/nginx/conf.d
      - ../docker/frontend-reverse-proxy/www/data:/www/data
    depends_on:
      - elasticsearch
      - training-app
    networks:
      - esnet


volumes:
  esdata1:
    driver: local

networks:
  esnet:

2. Platform

The platform helps to run and understand the whole process through four steps:

A. Indexing the data
B. Labeling the search results
C. Training the model
D. Testing trained model

A. Indexing

The first step is straightforward, so I will summarize it briefly. As mentioned before, the default data source is the tmdb.json file, but it can be changed using the ES_DATA environment variable in docker-compose.yml:

training-app:
    environment:
      - ES_HOST=http://elasticsearch:9200
      - ES_DATA=/opt/services/flaskapp/tmdb.json
      - ES_INDEX=tmdb
      - ES_TYPE=movie
      - ES_FEATURE_SET_NAME=movie_features
      - ES_MODEL_NAME=test_6
      - ES_MODEL_TYPE=6
      - ES_METRIC_TYPE=ERR@10

When you click Prepare Index, the data is read from the ES_DATA file and indexed in Elasticsearch.


Additionally you can define:
ES_HOST – the Elasticsearch URL
ES_USER/ES_PASSWORD – Elasticsearch credentials (authentication is turned off by default)
ES_INDEX/ES_TYPE – index/type name for the data from the ES_DATA file
ES_FEATURE_SET_NAME – name of the container for the defined features (described later)
ES_MODEL_NAME – name of the trained model kept in Elasticsearch (described later)
ES_MODEL_TYPE – algorithm used to train the model (described later)
ES_METRIC_TYPE – metric type (described later)

We can train and keep multiple models in Elasticsearch, which can be used for A/B testing.
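As an illustration of how an application can consume these variables, here is a minimal sketch that reads them with fallbacks. The fallback values are simply copied from the docker-compose snippet above. Whether the training app uses the same defaults internally is an assumption, and load_settings is a made-up helper name.

```python
import os

# assumption: fallbacks mirror the values from the docker-compose snippet
DEFAULTS = {
    "ES_HOST": "http://elasticsearch:9200",
    "ES_DATA": "/opt/services/flaskapp/tmdb.json",
    "ES_INDEX": "tmdb",
    "ES_TYPE": "movie",
    "ES_FEATURE_SET_NAME": "movie_features",
    "ES_MODEL_NAME": "test_6",
    "ES_MODEL_TYPE": "6",
    "ES_METRIC_TYPE": "ERR@10",
}


def load_settings(environ=None):
    """Read the ES_* settings from the environment, with fallbacks."""
    environ = os.environ if environ is None else environ
    settings = {name: environ.get(name, default)
                for name, default in DEFAULTS.items()}
    # credentials are optional because authentication is off by default
    settings["ES_USER"] = environ.get("ES_USER")
    settings["ES_PASSWORD"] = environ.get("ES_PASSWORD")
    # the ranker type is a number on the RankLib side
    settings["ES_MODEL_TYPE"] = int(settings["ES_MODEL_TYPE"])
    return settings


print(load_settings({"ES_MODEL_NAME": "test_8", "ES_MODEL_TYPE": "8"}))
```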

B. Labeling

Supervised learning algorithms like learning to rank need labeled data, so in this step I will focus on labeling.
First of all I have to prepare the file label_list.json, which contains the list of queries to label, e.g.:

[
    "rambo",
    "terminator",
    "babe",
    "die hard",
    "goonies"
]

When the file is ready I can go to the second tab (Step 2 Label).


For each query the platform prepares result candidates, which have to be graded from 0 to 4.

You have to go through the whole list; at the last step the labeled movies are saved to a file:

# grade (0-4)   queryid docId   title
# 
# Add your keyword strings below, the feature script will 
# Use them to populate your query templates 
# 
# qid:1: rambo
# qid:2: terminator
# qid:3: babe
# qid:4: die hard
# 
# https://sourceforge.net/p/lemur/wiki/RankLib%20File%20Format/
# 
# 
4 qid:1 # 7555 Rambo
4 qid:1 # 1370 Rambo III
4 qid:1 # 1368 First Blood
4 qid:1 # 1369 Rambo: First Blood Part II
0 qid:1 # 31362 In the Line of Duty: The F.B.I. Murders
0 qid:1 # 13258 Son of Rambow
0 qid:1 # 61410 Spud
4 qid:2 # 218 The Terminator
4 qid:2 # 534 Terminator Salvation
4 qid:2 # 87101 Terminator Genisys
4 qid:2 # 61904 Lady Terminator
...

Each labeling cycle is saved to a separate file: timestamp_judgments.txt
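To show how the judgment lines above are structured, here is a small parser sketch. It is written against the sample shown in this post (grade, qid, then a comment holding the document id and the title) and is not the code used by the platform. The function name parse_judgment is invented.

```python
def parse_judgment(line):
    """Parse one line like '4 qid:1 # 7555 Rambo'; return None for other lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or header comment
    head, _, comment = line.partition("#")
    parts = head.split()
    if len(parts) < 2 or not parts[1].startswith("qid:"):
        return None  # not a judgment line
    comment_parts = comment.strip().split(None, 1)
    return {
        "grade": int(parts[0]),
        "qid": int(parts[1].split(":", 1)[1]),
        "doc_id": comment_parts[0] if comment_parts else "",
        "title": comment_parts[1] if len(comment_parts) > 1 else "",
    }


print(parse_judgment("4 qid:1 # 7555 Rambo"))
print(parse_judgment("# qid:1: rambo"))  # None, header comment
```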

C. Training

Now it is time to use the labeled data to make Elasticsearch much smarter. To do this we have to define the features of the candidates.
The feature list is defined in the files 1.json to 4.json in the training-app directory.
Each feature file is an Elasticsearch query, e.g. the {{keywords}} parameter (the searched text) matched against the title property:

{
    "query": {
        "match": {
            "title": "{{keywords}}"
        }
    }
}

In this example I have used 4 features:
– title matches the keyword
– overview matches the keyword
– keyword is a prefix of the title
– keyword is a prefix of the overview

I can add more features without modifying the code, because the list of features is discovered using the naming pattern (1.json to n.json).

Now I can go to the Step 3 Train tab and simply click the train button.


In the first stage the training app takes all the feature files and builds the feature set, which is saved in Elasticsearch (the ES_FEATURE_SET_NAME environment variable defines the name of this set).
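Building the feature set from numbered files can be sketched as follows. This is an illustration rather than the code of the training app: it assumes the feature set body used by the Elasticsearch Learning to Rank plugin (a featureset object with a name and a list of features, each with name, params and template), and build_feature_set is a hypothetical helper.

```python
import json
import os
import tempfile


def build_feature_set(directory, name):
    """Collect 1.json, 2.json, ... into one feature set definition."""
    features = []
    index = 1
    while True:
        path = os.path.join(directory, "%d.json" % index)
        if not os.path.exists(path):
            break  # the first missing number ends the list
        with open(path) as handle:
            template = json.load(handle)["query"]
        features.append({
            "name": str(index),
            "params": ["keywords"],  # filled in at query time
            "template": template,
        })
        index += 1
    return {"featureset": {"name": name, "features": features}}


# usage with two throwaway feature files
with tempfile.TemporaryDirectory() as directory:
    for number, field in enumerate(["title", "overview"], start=1):
        with open(os.path.join(directory, "%d.json" % number), "w") as handle:
            json.dump({"query": {"match": {field: "{{keywords}}"}}}, handle)
    feature_set = build_feature_set(directory, "movie_features")

print([f["name"] for f in feature_set["featureset"]["features"]])  # ['1', '2']
```

Because the loop stops at the first missing number, adding a feature only requires dropping the next numbered file into the directory.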

In the next step the latest labeling file (ordered by timestamp) is processed (for each labeled item the feature values are loaded), e.g.:

4 qid:1 # 7555 Rambo

The app takes the document with id=7555 and gets the Elasticsearch score for each defined feature.
The Rambo example is translated into:

4   qid:1   1:12.318446 2:10.573845 3:1.0   4:1.0 # 7555    rambo

This means that the score of feature 1 is 12.318446 (and 10.573845, 1.0 and 1.0 for features 2, 3 and 4 respectively).
This format is readable by the RankLib library, so the training can be performed.
The full list of parameters is available at https://sourceforge.net/p/lemur/wiki/RankLib/.
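The translation of one judgment plus its feature scores into a training line can be sketched like this. The function name to_ranklib_line is invented, and the tab separators are an assumption based on the sample line above.

```python
def to_ranklib_line(grade, qid, feature_scores, doc_id, keywords):
    """Format one training example; features are numbered from 1."""
    features = "\t".join("%d:%s" % (number, score)
                         for number, score in enumerate(feature_scores, start=1))
    # everything after '#' is a comment kept for readability
    return "%d\tqid:%d\t%s # %s\t%s" % (grade, qid, features, doc_id, keywords)


line = to_ranklib_line(4, 1, [12.318446, 10.573845, 1.0, 1.0], "7555", "rambo")
print(line)
```

The feature numbers correspond to the numbered feature files, so feature 1 here is the title match.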

The ranker type is chosen using ES_MODEL_TYPE parameter:
– 0: MART (gradient boosted regression tree)
– 1: RankNet
– 2: RankBoost
– 3: AdaRank
– 4: Coordinate Ascent
– 6: LambdaMART
– 7: ListNet
– 8: Random Forests

The default is LambdaMART.

Additionally, by setting ES_METRIC_TYPE we can choose the metric to optimize.
Possible values:
– MAP
– NDCG@k
– DCG@k
– P@k
– RR@k
– ERR@k

The default value is ERR@10.
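To give a feeling for what ERR@k rewards, here is a short sketch of the usual definition of Expected Reciprocal Rank: each result satisfies the user with a probability derived from its grade, and results further down only count if the earlier ones did not satisfy. It assumes the highest grade is 4, as in the labeling step. RankLib has its own implementation, which may differ in details.

```python
def err_at_k(grades, k=10, max_grade=4):
    """Expected Reciprocal Rank for a ranked list of relevance grades."""
    err = 0.0
    p_continue = 1.0  # probability that the user reaches this rank
    for rank, grade in enumerate(grades[:k], start=1):
        p_satisfied = (2 ** grade - 1) / float(2 ** max_grade)
        err += p_continue * p_satisfied / rank
        p_continue *= 1.0 - p_satisfied
    return err


print(err_at_k([4]))     # 0.9375
print(err_at_k([0, 4]))  # 0.46875
```

Moving the perfect result from rank 1 to rank 2 halves the score, which is why the metric pushes the best documents to the very top.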


Finally we obtain the trained model, which is deployed to Elasticsearch.
The project can deploy multiple trained models; the name of the deployed model is defined by ES_MODEL_NAME.

D. Testing

In the last step we can test the trained and deployed model.


We can choose the model using the ES_MODEL_NAME parameter.

The model name is part of the search query and can be different in each request, which is useful when we need to perform A/B testing.
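As a sketch of what such a request can look like, the snippet below builds a search body that rescores the top hits with a named model through the sltr query of the Learning to Rank plugin, and picks the model per user for an A/B test. The base match on title, the window size, the second model name test_8 and both helper names are assumptions for illustration. The platform may build its query differently.

```python
import hashlib


def build_ltr_search(keywords, model_name, window_size=100):
    """Search body: plain match first, then rescore the top hits with the model."""
    return {
        "query": {"match": {"title": keywords}},
        "rescore": {
            "window_size": window_size,
            "query": {
                "rescore_query": {
                    "sltr": {
                        "params": {"keywords": keywords},
                        "model": model_name,
                    }
                }
            },
        },
    }


def pick_model(user_id, models):
    """Assign a user to one model deterministically (stable A/B bucket)."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return models[int(digest, 16) % len(models)]


models = ["test_6", "test_8"]  # test_8 is a hypothetical second model
body = build_ltr_search("rambo", pick_model("user-42", models))
print(body["rescore"]["query"]["rescore_query"]["sltr"]["model"])
```

Hashing the user id keeps each user on the same model across requests, so their results do not flip between variants.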

Happy searching 🙂

Tensorflow meets C# Azure function


In this post I would like to show how to deploy a TensorFlow model with a C# Azure Function. I will use TensorFlowSharp, the .NET bindings to the TensorFlow library. The Inception model will be used to build an HTTP endpoint that recognizes images.

Code

I will start by creating a .NET Core class library and adding the TensorFlowSharp package:

dotnet new classlib
dotnet add package TensorFlowSharp -v 1.9.0

Then create file TensorflowImageClassification.cs:

Here I have defined the HTTP entry point for the Azure Function (the Run method). The q query parameter is taken from the URL and used as the URL of the image to recognize.

The solution analyzes the image using a convolutional neural network with the Inception architecture.

The function automatically downloads the trained Inception model, so the first run of the function takes a little longer. The model is saved to D:\home\site\wwwroot\.

The convolutional neural network graph is kept in memory (graphCache), so the function doesn’t have to read the model on every request. On the other hand, the input image tensor has to be prepared and preprocessed every time (ConstructGraphToNormalizeImage).

Finally I can run command:

dotnet publish

which will create the package for the function deployment.

Azure function

To deploy the code I will create an Azure Function (Consumption plan) with an HTTP trigger. Additionally I will set the function entry point; the function.json will be defined as:

Kudu will be used to deploy the prepared package. Additionally I have to deploy libtensorflow.dll from /runtimes/win7-x64/native (otherwise Azure Functions won’t load it). The bin directory should look like:

Finally I can test the Azure Function:

The function recognizes the image and returns the label with the highest probability.