Drones are becoming more popular every year, and the range of their applications keeps growing.
In this article I will show how to programmatically control a Ryze Tello drone, capture its camera video, and detect objects using TensorFlow. I have packed the whole solution into Docker images (the backend and the web app UI are in separate images), so you can simply run it.
Before you continue reading, please watch the short introduction video.
The application will use two network interfaces.
The first is used by the Python backend to connect to the Tello Wi-Fi network, send commands, and capture the video stream. In the backend layer I have used the DJITelloPy library, which covers all the required Tello movement commands and the video stream capture.
To show the video stream efficiently in the browser I have used the WebRTC protocol and the aiortc library. Finally, I have used TensorFlow 2.0 object detection with a pretrained SSD ResNet50 model.
The second network interface is used to expose the Vue web application.
I have used nginx to serve the frontend application.
Using the web interface you can control the Tello. You can:
* start video stream
* stop video stream
* takeoff – which starts Tello flight
* rotate left
* rotate right
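The controls listed above map onto the plain-text commands of the Tello SDK, which the drone receives over UDP. The following is a minimal sketch using only the standard library, not the code from this project (which uses DJITelloPy). The action names and the helper functions are hypothetical; the address, port, and command strings are the ones documented in the Tello SDK.

```python
import socket

# Address of the drone on its own Wi-Fi network, as documented in the Tello SDK.
TELLO_ADDRESS = ("192.168.10.1", 8889)

# Hypothetical UI action names mapped to Tello SDK text commands.
UI_ACTIONS = {
    "start_video": "streamon",
    "stop_video": "streamoff",
    "takeoff": "takeoff",
    "land": "land",  # not in the list above, but part of the SDK
}


def build_command(action: str, degrees: int = 90) -> str:
    """Translate a UI action into the text command the Tello SDK expects."""
    if action in UI_ACTIONS:
        return UI_ACTIONS[action]
    if action in ("rotate_left", "rotate_right"):
        # Conservative range; check the SDK version of your drone.
        if not 1 <= degrees <= 360:
            raise ValueError("degrees must be between 1 and 360")
        # 'cw' turns clockwise (right), 'ccw' counter-clockwise (left).
        direction = "cw" if action == "rotate_right" else "ccw"
        return f"{direction} {degrees}"
    raise ValueError(f"unknown action: {action}")


def send_command(command: str, timeout: float = 5.0) -> str:
    """Send one command over UDP and return the drone's text reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(command.encode("ascii"), TELLO_ADDRESS)
        reply, _ = sock.recvfrom(1024)
        return reply.decode("ascii", errors="replace").strip()
```

With the computer connected to the Tello Wi-Fi, the drone first has to be switched into SDK mode with `send_command("command")`; after that, `send_command(build_command("rotate_left", 45))` would send `ccw 45`.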
In addition, the draw detection switch turns the detection boxes on the captured video stream on or off (drawing them introduces a delay in the video, so it is turned off by default). I also send the list of detected classes through web sockets, and the web application displays them.
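As an illustration of the class list sent over the web socket, here is a small sketch that turns the detections of one frame into a JSON text message. The message layout, the `min_score` threshold, and the function name are assumptions made for this example, not the schema used by the project.

```python
import json


def detection_message(detections, min_score=0.5):
    """Build a JSON text message listing the classes detected in one frame.

    `detections` is a list of (class_name, score) pairs. The message layout
    is a hypothetical example, not the project's actual schema.
    """
    kept = [name for name, score in detections if score >= min_score]
    # Report each class once, in a stable order.
    classes = sorted(set(kept))
    return json.dumps({"type": "detections", "classes": classes})
```

The resulting string can be passed to whatever web socket library the backend uses; the browser only needs to parse the JSON and render the `classes` array.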
As mentioned before, I have used a pretrained model, so it is a good idea to train your own model to get better results for a narrower and more specific class of objects.
Finally, the whole solution is packed into Docker images, so you can simply start it using the commands:
Today I’m very happy to finally release my open source project DeepMicroscopy.
In this project I have created a platform where you can capture images from a microscope, annotate them, train a TensorFlow model, and finally observe real-time object detection.
The project is configured for the Jetson Nano device, so it can be used in compact and portable solutions.
The whole solution is built from Docker images, so I will now describe the components installed on each device.
The Jetson device contains three components:
* Frontend – Vue application running on Nginx
* Backend – Python application which is the core of the solution
* Storage – Minio storage where projects, images and annotations are stored
The training server contains two components:
* Frontend – Vue application running on Nginx
* Backend – Python application which handles the training logic
2. Platform functionalities
Most of the platform's functionality is installed on the Jetson Nano. Because the Jetson Nano's compute capabilities are insufficient for model training, I have decided to split training into three stages, which I describe in the training part below.
In DeepMicroscopy you can create multiple projects in which you annotate and recognize different objects.
You can create and switch projects in the top left menu. Each project's data is kept in a separate bucket in the Minio storage.
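Because every project gets its own bucket, a free-form project name has to be turned into a valid bucket name at some point. The sketch below shows one way to do that under the usual S3-style naming rules (3 to 63 characters, lowercase letters, digits, and hyphens). The function is hypothetical and is not taken from the DeepMicroscopy code.

```python
import re


def bucket_name_for(project_name: str) -> str:
    """Derive an S3-style bucket name from a free-form project name."""
    name = project_name.strip().lower()
    # Replace every run of disallowed characters with a single hyphen.
    name = re.sub(r"[^a-z0-9]+", "-", name).strip("-")
    # Enforce the maximum length without leaving a trailing hyphen.
    name = name[:63].rstrip("-")
    if len(name) < 3:
        raise ValueError("bucket names need at least 3 valid characters")
    return name
```

For example, a project called "Blood Cells #1" would be stored in the bucket `blood-cells-1`.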
When you open the Capture panel in the web application and click the Play ▶ button, a WebRTC connection between the browser and the backend is created (I have used the aiortc Python library). To make it work in the Chrome browser we need two things:
* use TLS for the web application – a self-signed certificate is already configured in nginx
* allow the application to use the camera – you have to set this in the browser
Now we can stream the image from the camera to the browser (I have used the OpenCV library to fetch the image from the microscope over USB).
When we decide to capture a specific frame and click the Plus ✚ button, the backend saves the current frame into the project's bucket in the Minio storage.
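Saving "the current frame" implies that the streaming loop and the capture request share the most recent frame. A minimal sketch of that idea follows, assuming frames arrive as already encoded bytes. The class, the function, and the object naming scheme are hypothetical and only illustrate the pattern.

```python
import threading
from datetime import datetime, timezone


class LatestFrame:
    """Holds the most recent encoded frame produced by the streaming loop."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None

    def update(self, frame: bytes) -> None:
        # Called by the streaming loop for every new frame.
        with self._lock:
            self._frame = frame

    def snapshot(self) -> bytes:
        # Called when the user clicks the capture button.
        with self._lock:
            if self._frame is None:
                raise RuntimeError("no frame has been received yet")
            return self._frame


def object_key(now: datetime) -> str:
    """Hypothetical object name for a captured frame, unique per microsecond."""
    return now.strftime("images/%Y%m%dT%H%M%S%f.jpg")
```

The capture handler would then upload `holder.snapshot()` under `object_key(datetime.now(timezone.utc))` into the bucket of the active project.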
The annotation engine is based on the VGG Image Annotator (VIA). Here you can see all the images you have captured for a specific project. There are many features, e.g. switching between images (left/right arrow), zooming in and out (+/-), and of course annotation tools with different shapes (currently the training algorithm expects rectangles) and attributes (by default the class attribute is added, which the training algorithm also expects).
Annotation is a rather painstaking, manual task, so when you finish, remember to save the annotations by clicking the save button (currently there is no auto save). When you save the project, the project file (with the VIA schema) is saved in the project bucket.
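To make the expectations of the training step concrete, the sketch below reads rectangles and their class attribute from a saved project file. It assumes the VIA 2.x project layout (`_via_img_metadata`, `regions`, `shape_attributes`, `region_attributes`); the function name and the sample data are made up for illustration.

```python
def rectangles_from_via(project: dict, class_attribute: str = "class"):
    """Yield (filename, label, x, y, width, height) for each labelled rectangle."""
    for image in project.get("_via_img_metadata", {}).values():
        for region in image.get("regions", []):
            shape = region.get("shape_attributes", {})
            if shape.get("name") != "rect":
                continue  # the training step only expects rectangles
            label = region.get("region_attributes", {}).get(class_attribute)
            if not label:
                continue  # skip regions without a class
            yield (image["filename"], label,
                   shape["x"], shape["y"], shape["width"], shape["height"])


# Made-up sample in the assumed VIA 2.x layout.
sample_project = {
    "_via_img_metadata": {
        "img1.jpg1234": {
            "filename": "img1.jpg",
            "size": 1234,
            "regions": [
                {"shape_attributes": {"name": "rect", "x": 10, "y": 20,
                                      "width": 30, "height": 40},
                 "region_attributes": {"class": "cell"}},
                {"shape_attributes": {"name": "circle", "cx": 5, "cy": 5, "r": 2},
                 "region_attributes": {"class": "cell"}},
            ],
            "file_attributes": {},
        }
    }
}
```

Calling `list(rectangles_from_via(sample_project))` keeps the rectangle and silently drops the circle, which mirrors the restriction mentioned above.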
When we finish annotating the images we can start model training. As mentioned before, it is split into three stages.
First, we prepare the data package (which contains the captured images and our annotations) by clicking the DATA button.
Then we drag and drop the data package into the training application, which runs on a machine with more compute power.
After the upload, the training server automatically extracts the data package, splits it into train and test data, and starts training.
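The train/test split mentioned above can be sketched as a seeded shuffle followed by a cut. The 20% test fraction, the seed, and the function name are assumptions for this example; the ratio actually used by the training server is not stated here.

```python
import random


def split_train_test(items, test_fraction=0.2, seed=42):
    """Return (train, test) lists; the same seed always gives the same split."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    if len(items) < 2:
        raise ValueError("need at least two items to split")
    # Sort first so the result does not depend on the input order.
    shuffled = sorted(items)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one item on each side of the split.
    test_size = min(len(shuffled) - 1, max(1, round(len(shuffled) * test_fraction)))
    return shuffled[test_size:], shuffled[:test_size]
```

Splitting by image file name (rather than by individual annotation) keeps all rectangles of one image on the same side of the split.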
Currently I use the MobileNet V2 model architecture and start from a pretrained TensorFlow model.
When the training is finished, the model is exported using TensorRT, which optimizes inference performance, especially on NVIDIA devices like the Jetson Nano.
During and after training you can inspect all models using the built-in TensorBoard.
The web application periodically checks the training state, and when the training is finished we can download the model.
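The periodic status check is a plain polling loop. In the project it runs in the browser; the sketch below shows the same logic in Python so it stays consistent with the backend examples. The state names `finished` and `failed` and the function name are hypothetical.

```python
import time


def wait_until_finished(get_state, interval=5.0, timeout=3600.0, sleep=time.sleep):
    """Call get_state() repeatedly until training reaches a final state."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in ("finished", "failed"):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError("training did not finish in time")
        # Wait before asking the training server again.
        sleep(interval)
```

Here `get_state` stands for any callable that asks the training server for its current state, for example a small wrapper around an HTTP request.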
Finally, we upload the TensorRT model back to the Jetson Nano device. The model is saved into the selected project's bucket, so you can keep multiple models for each project.
On the Execute panel we can choose a model from the drop-down list (which lists the models uploaded for the selected project) and load it by clicking RUN (loading the model typically takes some time). When we click the Play ▶ button, the application shows real-time object detection. If we want to change the model, we can click CLEAR and then choose and RUN another model.
We can also fetch detection statistics, which are sent over a web socket. Currently the number of detected items and the average width, height, and score are returned.
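The statistics described above are simple aggregates over the detections of one frame. A minimal sketch, assuming each detection is a dictionary with `width`, `height`, and `score` keys (the key names and the result layout are hypothetical):

```python
def detection_statistics(detections):
    """Return the item count and the average width, height, and score."""
    count = len(detections)
    if count == 0:
        # Avoid dividing by zero when nothing was detected.
        return {"count": 0, "avg_width": 0.0, "avg_height": 0.0, "avg_score": 0.0}
    return {
        "count": count,
        "avg_width": sum(d["width"] for d in detections) / count,
        "avg_height": sum(d["height"] for d in detections) / count,
        "avg_score": sum(d["score"] for d in detections) / count,
    }
```

The returned dictionary can be serialized to JSON and pushed to the browser for every processed frame.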
Tensorflow meets C# Azure function and … . In this post I would like to show how to deploy a TensorFlow model with a C# Azure function. I will use TensorFlowSharp, the .NET bindings to the TensorFlow library. The InterceptionInterface will be used to create an HTTP endpoint that recognizes images.
I will start by creating a .NET Core class library and adding the TensorFlowSharp package:
dotnet new classlib
dotnet add package TensorFlowSharp -v 1.9.0
Then create the file TensorflowImageClassification.cs:
Here I have defined the HTTP entry point for the Azure function (the Run method). The q query parameter is taken from the request URL and used as the URL of the image to be recognized.
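The handling of the q parameter can be illustrated independently of the C# function. The sketch below, written in Python, extracts the image URL from a request URL and rejects anything that is not an HTTP(S) address. The function name and the example host are hypothetical, and the scheme check is an addition for this example.

```python
from urllib.parse import parse_qs, urlparse


def image_url_from_request(request_url: str) -> str:
    """Extract the image URL passed in the `q` query parameter."""
    query = parse_qs(urlparse(request_url).query)
    values = query.get("q")
    if not values:
        raise ValueError("missing 'q' query parameter")
    image_url = values[0]
    # Only accept web addresses, since the function downloads this URL.
    if urlparse(image_url).scheme not in ("http", "https"):
        raise ValueError("'q' must be an http or https URL")
    return image_url
```

Note that the image URL should be percent-encoded by the caller, because it is itself embedded in a URL.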
The function automatically downloads the trained Inception model, so the first run of the function takes a little longer. The model is saved to D:\home\site\wwwroot\.
The convolutional neural network graph is kept in memory (graphCache), so the function doesn't have to read the model on every request. On the other hand, the input image tensor has to be prepared and preprocessed every time (ConstructGraphToNormalizeImage).
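The caching idea behind graphCache is language independent: load the model once and reuse it for later requests served by the same process. Here is a minimal Python sketch of that pattern. It is not the C# code from the function, and the `loader` callable is a stand-in for whatever actually reads the model file.

```python
import threading

_graph_cache = {}
_cache_lock = threading.Lock()


def get_graph(model_path, loader):
    """Return the cached graph for model_path, loading it on first use."""
    with _cache_lock:
        if model_path not in _graph_cache:
            # Expensive step: happens only once per process and model path.
            _graph_cache[model_path] = loader(model_path)
        return _graph_cache[model_path]
```

The cache lives only as long as the host process, so a cold start pays the loading cost again.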
Finally, I can run the command:
which will create the package for the function deployment.
To deploy the code I will create an Azure Function (Consumption plan) with an HTTP trigger. Additionally, I will set the function entry point; the function.json will be defined as:
Kudu will be used to deploy the already prepared package. Additionally, I have to deploy libtensorflow.dll from /runtimes/win7-x64/native (otherwise Azure Functions won't load it). The bin directory should look like:
Finally, I can test the Azure function:
The function recognizes the image and returns the label with the highest probability.
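Picking the label with the highest probability is an argmax over the model output. A short Python sketch of that last step, with made-up labels and values (the function name is hypothetical):

```python
def best_label(probabilities, labels):
    """Return the (label, probability) pair with the highest probability."""
    if not labels or len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must be non-empty and equal in length")
    # Index of the largest probability in the model output.
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best_index], probabilities[best_index]
```

Given the output `[0.1, 0.7, 0.2]` and the labels `["cat", "dog", "bird"]`, the function returns the `dog` entry.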