In this article I will show how to process streams with Apache Flink and an MLflow model.
Before you continue reading, please watch the short introduction:
Apache Flink allows for efficient and scalable stream processing. It is a distributed processing engine which supports multiple sources like Kafka, NiFi and many others
(and if we need a custom source, we can create one ourselves).
Apache Flink also provides a framework for defining stream operations in languages like
Java, Scala, Python and SQL.
To simplify such definitions we can use Jupyter Notebook as an interface. Of course we can write in Python using the PyFlink library, but we can make it even easier by writing a Jupyter Notebook extension (“magic words”).
Using the Flink extension (magic.ipynb) we can use Flink SQL syntax directly in Jupyter Notebook.
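Such a magic can be built on IPython's extension API. Below is a deliberately minimal sketch of the idea (the real implementation lives in magic.ipynb; the create_stream_table_environment helper is hypothetical and the execute_sql call assumes a newer PyFlink version):

# minimal sketch of a custom Jupyter magic (illustrative, not the actual magic.ipynb code)
from IPython.core.magic import Magics, magics_class, line_magic, cell_magic

@magics_class
class FlinkMagics(Magics):

    @line_magic
    def flink_init_stream_env(self, line):
        # the real extension would create the PyFlink StreamTableEnvironment here
        self.shell.user_ns["st_env"] = create_stream_table_environment()  # hypothetical helper

    @cell_magic
    def flink_execute_sql(self, line, cell):
        # execute the cell body as Flink SQL in the shared environment
        # (execute_sql exists in newer PyFlink versions; older ones used sql_update)
        st_env = self.shell.user_ns["st_env"]
        st_env.execute_sql(cell)

def load_ipython_extension(ipython):
    ipython.register_magics(FlinkMagics)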
To use the extension we need to load it:
%reload_ext flinkmagic
Then we need to initialize the Flink StreamEnvironment:
%flink_init_stream_env
Now we can use SQL, for example to define a
FileSystem connector:
%%flink_execute_sql
CREATE TABLE MySinkTable (
    word varchar,
    cnt bigint
) WITH (
    'connector.type' = 'filesystem',
    'format.type' = 'csv',
    'connector.path' = '/opt/flink/notebooks/data/word_count_output1')
The magic keyword will automatically execute the SQL in the existing StreamingEnvironment.
Now we can apply a Machine Learning model. In plain Flink we could use a UDF function defined
in Python, but we will use an MLflow model, which wraps the ML frameworks (like PyTorch, Tensorflow, Scikit-learn etc.). Because MLflow exposes a homogeneous interface, we can
create another “jupyter magic” which will automatically load the MLflow model as a Flink function.
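Under the hood such a magic could load the model with mlflow.pyfunc and register it as a PyFlink scalar function. A rough sketch (the model URI and the SPAM_CLASSIFIER name are examples, the st_env variable comes from the initialized environment, and the exact UDF API depends on the PyFlink version):

# sketch: wrapping an MLflow model as a PyFlink UDF (illustrative)
import mlflow.pyfunc
import pandas as pd
from pyflink.table import DataTypes
from pyflink.table.udf import udf

model = mlflow.pyfunc.load_model("models:/spam_classifier/Production")  # example model URI

@udf(input_types=[DataTypes.STRING()], result_type=DataTypes.STRING())
def spam_classifier(text):
    # MLflow pyfunc models expect a DataFrame-like input;
    # the shape of the prediction depends on the wrapped model
    prediction = model.predict(pd.DataFrame({"text": [text]}))
    return str(prediction[0])

# register under the name used in the SQL query below
st_env.register_function("SPAM_CLASSIFIER", spam_classifier)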
%%flink_sql_query
SELECT word as smstext, SPAM_CLASSIFIER(word) as smstype FROM MySourceKafkaTable
which in our case will fetch Kafka events and classify them using the MLflow spam classifier. The
results will be displayed in real time in the Jupyter Notebook as an events DataFrame.
If we want, we can use other Python libraries (like matplotlib and others) to create a
graphical representation of the results, e.g. a pie chart.
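For example, assuming the classification results have been collected into a pandas DataFrame (df below is illustrative), a pie chart takes only a few lines:

import matplotlib.pyplot as plt

# df is a pandas DataFrame with the collected classification results (illustrative)
df["smstype"].value_counts().plot.pie(autopct="%1.0f%%")
plt.title("SPAM vs HAM")
plt.show()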
You can also use the docker image qooba/flink:dev to test and run the notebooks inside.
Please check run.sh,
where all components (Kafka, MySQL, Jupyter with Flink, MLflow repository) are set up.
In this article I will show how to use artificial intelligence to add motion to images and photos.
Before you continue reading, please watch the short introduction:
Face reenactment
To bring photos to life we can use a face reenactment algorithm designed to transfer the facial movements from a video to another image.
In this project I have used the github implementation: https://github.com/AliaksandrSiarohin/first-order-model; an extensive description of the neural network architecture can be found in this paper. The solution consists of two parts: the motion module and the generation module.
At the first stage the motion module extracts the key points from the source and target image. In fact, the solution assumes that a reference image exists, and at the first stage the transformations from the reference image to the source (T_{S \leftarrow R} (p_k)) and to the target (T_{T \leftarrow R} (p_k)) image are calculated. Then the first order Taylor expansions \frac{d}{dp}T_{S \leftarrow R} (p)|_{p=p_k} and \frac{d}{dp}T_{T \leftarrow R} (p)|_{p=p_k} are used to calculate the dense motion field.
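Putting these together (following the notation of the first-order-model paper, to the best of my reading), the motion near each keypoint p_k is approximated as:

T_{S \leftarrow T}(z) \approx T_{S \leftarrow R}(p_k) + J_k \left( z - T_{T \leftarrow R}(p_k) \right), \qquad J_k = \left( \frac{d}{dp}T_{S \leftarrow R}(p)\Big|_{p=p_k} \right) \left( \frac{d}{dp}T_{T \leftarrow R}(p)\Big|_{p=p_k} \right)^{-1}

so the dense motion field is built from the keypoint locations and these local Jacobians.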
The generation module uses the calculated dense motion field and the source image to generate a new image that resembles the target image.
The whole solution is packed into a docker image, thus we can simply reproduce the results using the command:
NOTE: additional volumes (torch_models and checkpoints) are mounted because the trained neural networks are downloaded during the first run.
To reproduce the results we need to provide two files: the motion (driving) video and the source image. In the above example I put them into the test directory and mount it into the docker container (-v $(pwd)/test:/ai/test) so they can be used inside it.
Below you have all command line options:
usage: prepare.py [-h] [--config CONFIG] [--checkpoint CHECKPOINT]
[--source_image SOURCE_IMAGE]
[--driving_video DRIVING_VIDEO] [--crop_image]
[--crop_image_padding CROP_IMAGE_PADDING [CROP_IMAGE_PADDING ...]]
[--crop_video] [--output OUTPUT] [--relative]
[--no-relative] [--adapt_scale] [--no-adapt_scale]
[--find_best_frame] [--best_frame BEST_FRAME] [--cpu]
first-order-model
optional arguments:
-h, --help show this help message and exit
--config CONFIG path to config
--checkpoint CHECKPOINT
path to checkpoint to restore
--source_image SOURCE_IMAGE
source image
--driving_video DRIVING_VIDEO
driving video
--crop_image, -ci autocrop image
--crop_image_padding CROP_IMAGE_PADDING [CROP_IMAGE_PADDING ...], -cip CROP_IMAGE_PADDING [CROP_IMAGE_PADDING ...]
autocrop image paddings left, upper, right, lower
--crop_video, -cv autocrop video
--output OUTPUT output video
--relative use relative or absolute keypoint coordinates
--no-relative don't use relative or absolute keypoint coordinates
--adapt_scale adapt movement scale based on convex hull of keypoints
--no-adapt_scale no adapt movement scale based on convex hull of
keypoints
--find_best_frame Generate from the frame that is the most alligned with
source. (Only for faces, requires face_aligment lib)
--best_frame BEST_FRAME
Set frame to start from.
--cpu cpu mode.
The GAN network consists of two parts: the Generator, whose task is to generate an image from random input, and the Discriminator, which checks if the generated image is realistic.
During training the networks compete with each other: the Generator tries to generate better and better images
and thereby deceive the Discriminator. On the other hand, the Discriminator learns to distinguish between real and generated photos.
To train the Discriminator, we use both real photos and those generated by the Generator.
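A minimal PyTorch sketch of this adversarial training loop is shown below (netG, netD and dataloader are assumed to be defined elsewhere; this is only an illustration of the idea, not a full DCGAN implementation):

import torch
import torch.nn as nn

# netG (generator), netD (discriminator) and dataloader are assumed to exist (sketch only)
criterion = nn.BCELoss()
opt_g = torch.optim.Adam(netG.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(netD.parameters(), lr=2e-4, betas=(0.5, 0.999))
latent_dim = 100

for real_images, _ in dataloader:
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1. train the discriminator on real photos and on generated (fake) images
    noise = torch.randn(batch, latent_dim, 1, 1)
    fake_images = netG(noise)
    loss_d = criterion(netD(real_images), real_labels) + criterion(netD(fake_images.detach()), fake_labels)
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # 2. train the generator to fool the discriminator
    loss_g = criterion(netD(fake_images), real_labels)
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()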
Finally, we can achieve the following results using the DCGAN network.
As you can see, some faces look realistic while some are distorted; additionally, the network can only generate low resolution images.
We can achieve much better results using the StyleGAN (arxiv article) network, which, among other things, differs in that successive layers of the network are progressively added during training.
I generated the images using pretrained networks and the effect is really amazing.
In this article I will show how to improve the quality of blurred face images using
artificial intelligence. For this purpose I will use neural networks and the FastAI library (ver. 1).
Before you continue reading, please watch the short introduction:
I have based a lot of this on the fastai course, thus I definitely recommend going through it.
Data
To teach the neural network how to rebuild face images we need to provide a
faces dataset which shows how low quality and blurred images should be reconstructed.
Thus we need pairs of low and high quality images.
We will treat the original images as the high resolution data and rescale them
to prepare the low resolution input:
import fastai
from fastai.vision import *
from fastai.callbacks import *
from fastai.utils.mem import *
from torchvision.models import vgg16_bn
from pathlib import Path
path = Path('/opt/notebooks/faces')
path_hr = path/'high_resolution'
path_lr = path/'small-96'
il = ImageList.from_folder(path_hr)
def resize_one(fn, i, path, size):
    dest = path/fn.relative_to(path_hr)
    dest.parent.mkdir(parents=True, exist_ok=True)
    img = PIL.Image.open(fn)
    targ_sz = resize_to(img, size, use_min=True)
    img = img.resize(targ_sz, resample=PIL.Image.BILINEAR).convert('RGB')
    img.save(dest, quality=60)

sets = [(path_lr, 96)]
for p,size in sets:
    if not p.exists():
        print(f"resizing to {size} into {p}")
        parallel(partial(resize_one, path=p, size=size), il.items)
Now we can create a data bunch for training:
bs,size=32,128
arch = models.resnet34
src = ImageImageList.from_folder(path_lr).split_by_rand_pct(0.1, seed=42)
def get_data(bs,size):
    data = (src.label_from_func(lambda x: path_hr/x.name)
            .transform(get_transforms(max_zoom=2.), size=size, tfm_y=True)
            .databunch(bs=bs,num_workers=0).normalize(imagenet_stats, do_y=True))
    data.c = 3
    return data
data = get_data(bs,size)
Training
In this solution we will use a neural network with the UNET architecture.
The UNET neural network contains two parts, an Encoder and a Decoder, which are used to reconstruct the face image.
During the first stage the Encoder fetches the input, then extracts and aggregates the image features. At each stage the feature maps are downsampled.
Then the Decoder uses the extracted features and tries to rebuild the image, upsampling it at each decoding stage. Finally we get the regenerated images.
Additionally, we need to define the Loss Function, which tells the model whether the image was rebuilt correctly and allows the model to be trained.
To do this we will use an additional neural network, VGG-16. We will put the generated image and the original image (which is our target) on the network input, then compare the features extracted for both images at selected layers and calculate the loss accordingly.
Finally we will use the Adam optimizer to minimize the loss and achieve a better result.
def gram_matrix(x):
    n,c,h,w = x.size()
    x = x.view(n, c, -1)
    return (x @ x.transpose(1,2))/(c*h*w)

base_loss = F.l1_loss

vgg_m = vgg16_bn(True).features.cuda().eval()
requires_grad(vgg_m, False)

blocks = [i-1 for i,o in enumerate(children(vgg_m)) if isinstance(o,nn.MaxPool2d)]
class FeatureLoss(nn.Module):
    def __init__(self, m_feat, layer_ids, layer_wgts):
        super().__init__()
        self.m_feat = m_feat
        self.loss_features = [self.m_feat[i] for i in layer_ids]
        self.hooks = hook_outputs(self.loss_features, detach=False)
        self.wgts = layer_wgts
        self.metric_names = ['pixel',] + [f'feat_{i}' for i in range(len(layer_ids))
              ] + [f'gram_{i}' for i in range(len(layer_ids))]

    def make_features(self, x, clone=False):
        self.m_feat(x)
        return [(o.clone() if clone else o) for o in self.hooks.stored]

    def forward(self, input, target):
        out_feat = self.make_features(target, clone=True)
        in_feat = self.make_features(input)
        self.feat_losses = [base_loss(input,target)]
        self.feat_losses += [base_loss(f_in, f_out)*w
                             for f_in, f_out, w in zip(in_feat, out_feat, self.wgts)]
        self.feat_losses += [base_loss(gram_matrix(f_in), gram_matrix(f_out))*w**2 * 5e3
                             for f_in, f_out, w in zip(in_feat, out_feat, self.wgts)]
        self.metrics = dict(zip(self.metric_names, self.feat_losses))
        return sum(self.feat_losses)

    def __del__(self): self.hooks.remove()

feat_loss = FeatureLoss(vgg_m, blocks[2:5], [5,15,2])
wd = 1e-3  # weight decay; this value is an assumption taken from the fastai lesson this code is based on
learn = unet_learner(data, arch, wd=wd, loss_func=feat_loss, callback_fns=LossMetrics,
                     blur=True, norm_type=NormType.Weight)
Results
After training we can use the model to regenerate the images:
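A minimal inference sketch with fastai v1 (the paths and file names are illustrative, and the exact return values of predict depend on the data setup):

from fastai.vision import open_image

# minimal inference sketch; paths are illustrative
img = open_image(path/'test'/'blurred_face.jpg')
restored, _, _ = learn.predict(img)   # with tfm_y=True the first element should be the restored image
restored.save('restored_face.png')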
Application
Finally we can export the model and create a drag and drop web application which fixes the face images.
The whole solution is packed into docker images, thus you can simply start it using the commands:
# with GPU
docker run -d --gpus all --rm -p 8000:8000 --name aiunblur qooba/aiunblur
# without GPU
docker run -d --rm -p 8000:8000 --name aiunblur qooba/aiunblur
To use the GPU, additional nvidia drivers (included in the NVIDIA CUDA Toolkit) are needed.
The popularity of drones and the range of their applications is growing each year.
In this article I will show how to programmatically control the Tello Ryze drone, capture the camera video and detect objects using Tensorflow. I have packed the whole solution into docker images (the backend and the Web App UI are in separate images), thus you can simply run it.
Before you continue reading, please watch the short introduction:
Architecture
The application will use two network interfaces.
The first will be used by the python backend to connect to the Tello wifi, send the commands and capture the video stream. In the backend layer I have used the DJITelloPy library, which covers all required Tello move commands and video stream capture.
To efficiently show the video stream in the browser I have used the WebRTC protocol and the aiortc library. Finally, I have used Tensorflow 2.0 object detection with a pretrained SSD ResNet50 model.
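The backend essentially boils down to DJITelloPy calls like these (a simplified sketch, not the actual application code):

from djitellopy import Tello

# simplified sketch of the backend logic (not the actual application code)
tello = Tello()
tello.connect()                        # connects over the Tello wifi network
tello.streamon()                       # starts the video stream

frame = tello.get_frame_read().frame   # numpy BGR frame that can be passed to the detector

tello.takeoff()
tello.move_up(30)                      # distance in cm
tello.rotate_clockwise(90)             # angle in degrees
tello.land()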
The second network interface will be used to expose the Vue web application.
I have used nginx to serve the frontend application.
Application
Using the Web interface you can control the Tello movement, where you can:
* start video stream
* stop video stream
* takeoff – which starts Tello flight
* land
* up
* down
* rotate left
* rotate right
* forward
* backward
* left
* right
In addition, using the draw detection switch you can turn on/off the detection boxes on the captured video stream (however, this introduces a delay in the video, thus it is turned off by default). Additionally, I send the list of detected classes through web sockets, and they are also displayed.
As mentioned before, I have used a pretrained model, thus it is a good idea to train your own model to get better results for a narrower and more specific class of objects.
Finally the whole solution is packed into docker images thus you can simply start it using commands:
Machine Learning is one of the hottest areas nowadays. New algorithms and models are widely used in commercial solutions, thus the whole ML process, like any software development and deployment process, needs to be optimized.
On the other hand, MLflow is a platform which can be run as a standalone application. It doesn't require Kubernetes, thus the setup is much simpler than Kubeflow, but it doesn't support multi-user/multi-team separation.
In this article we will use Kubeflow and MLflow to build an isolated workspace and MLOps pipelines for analytical teams.
Currently we use the Kubeflow platform at @BankMillennium to build AI solutions and conduct the MLOps process, and this article is inspired by the experience gained while launching and using the platform.
Before you continue reading, please watch the short introduction:
AI Platform
The core of the platform will be set up using Kubeflow (version 1.0.1) on Kubernetes (v1.17.0). Kubernetes was set up using Rancher RKE, which simplifies the installation.
By default Kubeflow is equipped with a metadata and artifact store shared between namespaces, which makes it difficult to secure and organize spaces for teams. To fix this we will set up a separate MLflow Tracking Server and Model Registry for each team namespace.
MLflow docker image qooba/mlflow:
FROM continuumio/miniconda3
RUN apt update && apt install python3-mysqldb default-libmysqlclient-dev -yq
RUN pip install mlflow sklearn jupyterlab watchdog[watchmedo] boto3
RUN conda install pymysql
ENV NB_PREFIX /
CMD ["sh","-c", "jupyter notebook --notebook-dir=/home/jovyan --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]
import os
import warnings
import sys
import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from urllib.parse import urlparse
import mlflow
import mlflow.sklearn
import logging
remote_server_uri='http://mlflow:5000'
mlflow.set_tracking_uri(remote_server_uri)
mlflow.set_experiment("/my-experiment2")
logging.basicConfig(level=logging.WARN)
logger = logging.getLogger(__name__)
def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2
warnings.filterwarnings("ignore")
np.random.seed(40)
# Read the wine-quality csv file from the URL
csv_url = (
"./winequality-red.csv"
)
try:
    data = pd.read_csv(csv_url, sep=";")
except Exception as e:
    logger.exception(
        "Unable to download training & test CSV, check your internet connection. Error: %s", e
    )
train, test = train_test_split(data)
train_x = train.drop(["quality"], axis=1)
test_x = test.drop(["quality"], axis=1)
train_y = train[["quality"]]
test_y = test[["quality"]]
alpha = 0.5
l1_ratio = 0.5
with mlflow.start_run():
    lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
    lr.fit(train_x, train_y)

    predicted_qualities = lr.predict(test_x)
    (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

    print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
    print("  RMSE: %s" % rmse)
    print("  MAE: %s" % mae)
    print("  R2: %s" % r2)

    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1_ratio", l1_ratio)
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)
    mlflow.log_metric("mae", mae)

    tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme

    if tracking_url_type_store != "file":
        mlflow.sklearn.log_model(lr, "model", registered_model_name="ElasticnetWineModel2")
    else:
        mlflow.sklearn.log_model(lr, "model")
I definitely recommend using git-versioned MLflow projects instead of running code directly from jupyter, because
the MLflow model registry will keep the git commit hash used for the run, which helps to reproduce the results.
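A project versioned in git can then be launched directly from the repository URL, for example (the repository URL and parameters below are illustrative):

import mlflow

# repository URL and parameters are illustrative
mlflow.run(
    "https://github.com/example/wine-quality-project",
    parameters={"alpha": 0.5, "l1_ratio": 0.5},
)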
MLOps
Now I’d like to propose the process of building and deploying ML models.
Training
As described before, the model is prepared and trained by the analyst, who works in the Jupyter workspace and logs the metrics and the model to the MLflow tracking server and model registry.
MLflow UI
A Senior Analyst (currently MLflow doesn't support role assignment) checks the model metrics and decides to promote the model to the Staging/Production stage in the MLflow UI.
Model promotion
We will create an additional application which will track the changes in the MLflow registry and initialize the deployment process.
Then, on each MLflow registry change, the python application will check the database, prepare and commit k8s deployments and upload the model artifacts to minio.
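A rough sketch of the registry-watching part using the MLflow client API (the tracking URI, stages and the prepare_k8s_deployment helper are illustrative):

from mlflow.tracking import MlflowClient

# tracking URI, stages and the deployment helper below are illustrative
client = MlflowClient(tracking_uri="http://mlflow:5000")

# check which model versions are currently promoted to Production
for model in client.list_registered_models():
    for version in client.get_latest_versions(model.name, stages=["Production"]):
        artifact_uri = version.source                        # s3 path of the model artifacts
        prepare_k8s_deployment(model.name, artifact_uri)     # hypothetical helper that commits the k8s manifests to git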
Because the application commits the deployments to a git repository, we need to generate ssh keys:
Now it is time to set up ArgoCD, which will sync the git deployment changes with the Kubernetes configuration and automatically deploy newly promoted models.
Each time a new model is promoted, ArgoCD applies a new deployment with the new model s3 path:
- name: MODEL
value: s3://qooba/mlflow/1/e0167f65abf4429b8c58f56b547fe514/artifacts/model
Inference services
Finally we can access the model externally and generate predictions. Please note that in this article the model is deployed in the same k8s namespace (in a real solution the model would be deployed on a separate k8s cluster), thus to access the model I have to send the authservice_session cookie, otherwise the request will be redirected to the dex login page.
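For example, a prediction request could be sent like this (the URL, payload format and cookie value are illustrative and depend on the serving stack used):

import requests

# URL, payload format and cookie value are illustrative
response = requests.post(
    "https://kubeflow.example.com/v1/models/elasticnetwinemodel2:predict",
    json={"instances": [[7.4, 0.7, 0.0, 1.9, 0.076, 11.0, 34.0, 0.9978, 3.51, 0.56, 9.4]]},
    cookies={"authservice_session": "<session cookie copied from the browser>"},
)
print(response.json())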
Cutting out photo backgrounds is one of the most tedious graphical tasks. In this article I will show how to simplify it using neural networks.
I will use the U^2-Net network, which is described in detail in the arxiv article, and the python library rembg to create a ready to use drag and drop web application which you can run as a docker image.
Before you continue reading, please watch the quick introduction:
Neural network
To correctly remove the image background we need to select the most visually attractive objects in an image, which is covered by Salient Object Detection (SOD). To combine low memory and computation cost with results competitive with state of the art methods, the novel U^2-Net architecture will be used.
U-Net convolutional networks have a characteristic U shape with a symmetric encoder-decoder structure. At each encoding stage the feature maps are downsampled (torch.nn.MaxPool2d) and then upsampled at each decoding
stage (torch.nn.functional.upsample). Downsampled features are transferred and concatenated with the upsampled features using residual connections.
The U^2-Net network uses a two-level nested U-structure where the main architecture is a U-Net like encoder-decoder and each stage contains a residual U-block. Each residual U-block repeats the downsampling/upsampling procedure, again using residual connections.
The nested U-structure extracts and aggregates the features at each level and enables capturing local and global information from shallow and deep layers.
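To illustrate the encoder/decoder with skip connections, here is a deliberately tiny PyTorch sketch (it is not the real U2NET definition, which is much deeper and uses the nested residual U-blocks described above):

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUNet(nn.Module):
    # deliberately tiny illustration of the U-shape, not the real U2NET definition
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Conv2d(3, 16, 3, padding=1)
        self.enc2 = nn.Conv2d(16, 32, 3, padding=1)
        self.pool = nn.MaxPool2d(2)                    # downsampling on the encoder side
        self.dec1 = nn.Conv2d(32, 16, 3, padding=1)
        self.out = nn.Conv2d(32, 1, 3, padding=1)      # 16 decoded + 16 skipped channels

    def forward(self, x):
        e1 = F.relu(self.enc1(x))
        e2 = F.relu(self.enc2(self.pool(e1)))
        d1 = F.interpolate(F.relu(self.dec1(e2)), scale_factor=2)   # upsampling on the decoder side
        return self.out(torch.cat([d1, e1], dim=1))    # skip connection: concatenate encoder features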
The U^2-Net architecture is precisely described in the arxiv article. Moreover, we can go through the pytorch model definitions of U2NET and U2NETP.
The lighter U2NETP version is only 4.7 MB, thus it can be used in mobile applications.
Web application
The neural network is wrapped with the rembg library, which automatically downloads the pretrained networks and provides a simple python api. To simplify the usage I have decided to create a drag and drop web application (https://github.com/qooba/aiscissors).
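The python api itself is minimal; assuming the current rembg interface, removing a background boils down to something like:

from rembg import remove

# assuming the current rembg interface; older versions exposed rembg.bg.remove instead
with open("input.jpg", "rb") as f:
    result = remove(f.read())          # returns PNG bytes with a transparent background

with open("output.png", "wb") as f:
    f.write(result)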
In the application you can drag and drop the image and then compare the image with and without background side by side.
You can simply run the application using the docker image:
docker run --gpus all --name aiscissors -d -p 8000:8000 --rm -v $(pwd)/u2net_models:/root/.u2net qooba/aiscissors
To use the GPU, additional nvidia drivers (included in the NVIDIA CUDA Toolkit) are needed.
When you run the container the pretrained models are downloaded, thus I have mounted the local directory u2net_models to /root/.u2net to avoid downloading them each time I run the container.
Qin, Xuebin; Zhang, Zichen; Huang, Chenyang; Dehghan, Masood; Zaiane, Osmar; Jagersand, Martin. "U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection." Pattern Recognition 106, 107404 (2020).
In this article I will show how to build a complete CI/CD solution for building, training and deploying multilingual chatbots.
I will use the Rasa core framework, GitLab pipelines, Minio and Redis to build a simple two-language google assistant.
Before you continue reading, please watch the quick introduction:
Architecture
The solution contains several components thus I will describe each of them.
Google actions
To build a google assistant we need to create and configure a google action project.
We will build our own nlu engine, thus we will start with a blank project.
Then we need to install the gactions CLI to manage the project from the command line.
To access your projects you need to authenticate using the command:
gactions login
If you want, you can create the project using templates:
As mentioned before, for development purposes I have used ngrok to proxy the traffic from a public endpoint (used as the webhook destination) to localhost:8081:
ngrok http 8081
NGINX with LuaJIT
Currently in a google action project it is not possible to set different webhook addresses for different languages, thus I have used NGINX and LuaJIT to route the traffic to the proper language container.
The information about the language context is included in the request body, which can be handled using a Lua script:
server {
    listen 80;
    resolver 127.0.0.11 ipv6=off;

    location / {
        set $target '';
        access_by_lua '
            local cjson = require("cjson")
            ngx.req.read_body()
            local text = ngx.var.request_body
            local value = cjson.new().decode(text)
            local lang = string.sub(value["user"]["locale"],1,2)
            ngx.var.target = "http://heygoogle-" .. lang
        ';
        proxy_pass $target;
    }
}
Rasa application
Rasa core is one of the most popular frameworks for building chatbots. I have decided to create a separate docker container for each language, which gives flexibility in terms of scalability and deployment.
Dockerfile (development version with watchdog) for rasa application (qooba/rasa:1.10.10_app):
FROM rasa/rasa:1.10.10
USER root
RUN pip3 install python-jose watchdog[watchmedo]
ENTRYPOINT watchmedo auto-restart -d . -p '*.py' --recursive -- python3 app.py
Using the default rasa engine you have to restart the container when you want to deploy a newly retrained model, thus I have decided to wrap it with a simple python application which additionally listens on a redis PubSub topic and waits for an event that reloads the model without restarting the whole application. Additionally, there are separate topics for different languages, thus we can simply deploy and reload the model for a specific language.
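A simplified sketch of that wrapper logic (the channel names and model paths are illustrative, and the Agent API refers to Rasa 1.x):

import redis
from rasa.core.agent import Agent

# channel name and model path are illustrative; Agent.load comes from Rasa 1.x
LANG = "en"
agent = Agent.load(f"/app/models/{LANG}/model.tar.gz")

pubsub = redis.Redis(host="redis").pubsub()
pubsub.subscribe(f"model.{LANG}")

for message in pubsub.listen():
    if message["type"] == "message":
        # a new model was trained and uploaded - reload it without restarting the container
        agent = Agent.load(message["data"].decode())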
Redis
In this solution redis has two responsibilities:
* EventBus – as mentioned above, the chatbot app listens for events sent from the GitLab pipeline worker.
* Session Store – which keeps the conversation state, thus we can simply scale the chatbots.
We can simply run Redis using the command:
docker run --name redis -d --rm --network gitlab redis
Minio
Minio is used as the Rasa Model Store (Rasa supports the S3 protocol). After model training, the GitLab pipeline worker uploads the model package to Minio. Each language has a separate bucket:
To run minio we will use the command (for the whole solution setup use run.sh, where the environment variables are set):
Notice that I have used the gitlab hostname (without this the pipelines do not work correctly on localhost), thus you will need to add an appropriate entry to /etc/hosts:
127.0.1.1 gitlab
Now you can create a new project (in my case I called it heygoogle).
Most likely you already use port 22, thus for ssh I used 8022.
You can clone the project using the command (remember to set up ssh keys):
#!/bin/bash
lang=$1
echo "Processing $lang"

if (($(git diff-tree --no-commit-id --name-only -r $CI_COMMIT_SHA | grep ^$lang/ | wc -l) > 0)); then
    echo "Training $lang"
    cd $lang
    rasa train
    rasa test
    cd ..
    python3 pipeline.py --language $lang
else
    echo
fi
The script checks if something has changed in the chosen language directory, trains and tests the model,
and finally uploads the trained model to Minio and publishes an event to Redis using pipeline.py:
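For reference, a simplified sketch of what pipeline.py could look like (the bucket names, the minio endpoint, credentials and channel names below are illustrative):

import argparse
import redis
from minio import Minio

# bucket names, endpoint, credentials and channel names are illustrative
parser = argparse.ArgumentParser()
parser.add_argument("--language", required=True)
lang = parser.parse_args().language

client = Minio("minio:9000", access_key="minio", secret_key="minio123", secure=False)
model_file = f"{lang}/models/model.tar.gz"
client.fput_object(f"rasa-{lang}", "model.tar.gz", model_file)   # upload the trained model

# notify the chatbot container so it reloads the model without a restart
redis.Redis(host="redis").publish(f"model.{lang}", "model.tar.gz")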
Now, after each change in the repository, GitLab starts the pipeline run:
Summary
We have built a complete solution for creating, training, testing and deploying chatbots.
Additionally, the solution supports multi-language chatbots while keeping scalability and deployment flexibility.
Moreover, trained models can be continuously deployed without chatbot downtime (for Kubernetes environments
Canary Deployment could be another solution).
Finally, we have integrated the solution with google actions and created a simple chatbot.
Today I’m very happy to finally release my open source project DeepMicroscopy.
In this project I have created a platform where you can capture images from the microscope, annotate them, train a Tensorflow model and finally observe real time object detection.
The project is configured on the Jetson Nano device, thus it can work in compact and portable setups.
The whole solution was built using docker images, thus I will now describe the components installed on each device.
Jetson
The Jetson device contains three components:
* Frontend – Vue application running on Nginx
* Backend – Python application which is the core of the solution
* Storage – Minio storage where projects, images and annotations are stored
Training Server
The training server contains two components:
* Frontend – Vue application running on Nginx
* Backend – Python application which handles the training logic
Platform functionalities
Most of the platform's functionality is installed on the Jetson Nano. Because the Jetson Nano's compute capabilities are insufficient for model training purposes, I have decided to split this part into three stages, which I will describe in the training paragraph.
Projects management
In Deep Microscopy you can create multiple projects, where you annotate and recognize different objects.
You can create and switch projects in the top left menu. Each project's data is kept in a separate bucket in the minio storage.
Images Capture
When you open the Capture panel in the web application and click the Play ▶ button, a WebRTC socket between the browser and the backend is created (I have used the aiortc python library). To make it work in the Chrome browser we need two things:
* use TLS for the web application – a self signed certificate is already configured in the nginx
* allow the Camera to be used by the application – you have to set this in the browser
Now we can stream the image from the camera to the browser (I have used the OpenCV library to fetch the image from the microscope through usb).
When we decide to capture a specific frame and click the Plus ✚ button, the backend saves the current frame into the project bucket in the minio storage.
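A simplified sketch of that capture logic on the backend (the device index, bucket and object names are illustrative):

import io
import cv2
from minio import Minio

# device index, bucket and object names are illustrative
camera = cv2.VideoCapture(0)            # the microscope is visible as a regular usb camera
minio_client = Minio("minio:9000", access_key="minio", secret_key="minio123", secure=False)

success, frame = camera.read()          # grab the current frame
if success:
    _, encoded = cv2.imencode(".jpg", frame)
    data = encoded.tobytes()
    minio_client.put_object("project1", "captures/frame0001.jpg",
                            io.BytesIO(data), length=len(data),
                            content_type="image/jpeg")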
Annotation
The annotation engine is based on the Via Image Annotator. Here you can see all the images you have captured for a specific project. There are a lot of features, e.g. switching between images (left/right arrow), zoom in/out (+/-) and of course annotation tools with different shapes (currently the training algorithm expects rectangles) and attributes (by default the class attribute is added, which is also expected by the training algorithm).
This is a rather painstaking and manual task, thus when you finish remember to save the annotations by clicking the save button (currently there is no auto save). When you save the project, the project file (with the via schema) is saved in the project bucket.
Training
When we finish image annotation we can start model training. As mentioned before it is split into three stages.
Data package
At the beginning we have to prepare data package (which contains captured images and our annotations) by clicking the DATA button.
Training server
Then we drag and drop the data package to the application placed on a machine with higher compute capabilities.
After upload, the training server automatically extracts the data package, splits it into train/test data and starts training.
Currently I have used the MobileNet V2 model architecture and I start from a pretrained tensorflow model.
When the training is finished, the model is exported using TensorRT, which optimizes the model inference performance, especially on NVIDIA devices like the Jetson Nano.
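The export step roughly corresponds to the standard TensorFlow-TensorRT conversion API (the paths are illustrative and this is a sketch, not necessarily the exact code used in the project):

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# paths are illustrative
converter = trt.TrtGraphConverterV2(input_saved_model_dir="exported_model/saved_model")
converter.convert()
converter.save("exported_model/saved_model_trt")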
During and after training you can inspect all models using the built-in tensorboard.
The web application periodically checks the training state, and when the training is finished we can download the model.
Uploading model
Finally, we upload the TensorRT model back to the Jetson Nano device. The model is saved into the selected project bucket, thus you can use multiple models for each project.
Object detection
On the Execute panel we can choose a model from the drop down list (which contains the models uploaded for the selected project) and load it by clicking RUN (typically it takes some time to load the model). When we click the Play ▶ button the application shows real time object detection. If we want to change the model we can click CLEAR and then choose and RUN another model.
Additionally, we can fetch extra detection statistics, which are sent using a Web Socket. Currently the number of detected items and the average width, height and score are returned.
Jupyter Notebook is one of the most useful tools for data exploration, machine learning and fast prototyping. There are many plugins and projects which make it even more powerful:
* jupyterlab-git
* nbdev
* jupyter debugger
But sometimes you simply need an IDE …
One of my favorite text editors is vim. It is lightweight, fast and with appropriate plugins it can be used as an IDE.
Using the Dockerfile below you can build a jupyter environment with a fully equipped vim:
FROM continuumio/miniconda3
RUN apt update && apt install curl git cmake ack g++ python3-dev vim-youcompleteme tmux -yq
RUN sh -c "$(curl -fsSL https://raw.githubusercontent.com/qooba/vim-python-ide/master/setup.sh)"
RUN conda install xeus-python jupyterlab jupyterlab-git -c conda-forge
RUN jupyter labextension install @jupyterlab/debugger @jupyterlab/git
RUN pip install nbdev
RUN echo "alias ls='ls --color=auto'" >> /root/.bashrc
CMD bin/bash