陆宇振
[Repost] Integration of Qt, OpenCV and TensorFlow
2019-5-5 21:42

How to Create an Image Classifier Using Qt, OpenCV and TensorFlow

[Updated this post on April 04, 2019, to make sure this tutorial is compatible with OpenCV 4.x and TensorFlow 2.0]

In this post we’re going to learn how to create an image classifier application with a proper GUI that allows users to choose a camera or a video file as the input and classifies the incoming images (video or camera frames) in real time. We’ll be using the power of Qt for cross-platform GUI creation and for everything related to visualizing the output and providing a seamless user experience. We’ll also be using TensorFlow and OpenCV to handle the actual classification task, as well as to access cameras and video files.


With the more recent versions of OpenCV, a new module was introduced for handling Deep Learning problems, and it is getting better and better with each release. This module is called “dnn” and I recommend getting the latest version of OpenCV (3.4.1 at the time of writing this article) to make sure you face no issues at all (or at least fewer issues). Using this new module, we can load and use deep learning models from popular third-party libraries such as TensorFlow, Caffe, DarkNet and so on. In our example project we’ll use a pre-trained TensorFlow model, namely ssd_mobilenet_v1_coco; however, you can easily use other models too once you have a firm grasp of the information provided here.
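To give you a taste of how straightforward this is, here is a minimal sketch of loading a model from each of these frameworks (the file names below are placeholders, not files used in this project):

#include <opencv2/dnn.hpp>

// Each function takes the trained weights and, optionally, a text file
// describing the network; the file names here are placeholders
cv::dnn::Net tfNet = cv::dnn::readNetFromTensorflow("graph.pb", "graph.pbtxt");
cv::dnn::Net caffeNet = cv::dnn::readNetFromCaffe("deploy.prototxt", "weights.caffemodel");
cv::dnn::Net darknetNet = cv::dnn::readNetFromDarknet("model.cfg", "model.weights");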

Prerequisites

We’ll be performing everything described here on the Windows operating system, so we’ll cover the prerequisites for Windows. On other operating systems such as macOS or Linux, though, the only major change is the compiler, which is provided as part of Xcode on macOS or is GCC on Linux. So, here is what you need on Windows:

• Microsoft Visual Studio 2017 (https://www.visualstudio.com)
• Qt5 (https://www.qt.io)
• OpenCV 3 (https://opencv.org)
• CMake (https://cmake.org)
• Python 64-bit (https://www.python.org)
• TensorFlow (https://www.tensorflow.org)

Make sure to install the latest version of all dependencies mentioned here. At the time of writing this article, it would be Qt5.10.1, OpenCV 3.4.1, CMake 3.10.2, Python 3.6.4 and TensorFlow 1.6.

[Update April 04, 2019: Qt5.12.2, OpenCV 4.0.1 and TensorFlow 2.0 will also work, as long as you take into account the updated notes such as this one, throughout this tutorial.]

If your OpenCV installation does not contain 64-bit MSVC 15 libraries, then you need to build them from source yourself. You can search my website for detailed guides on how to do just that! You can continue reading once you have all dependencies in place.

Creating a Qt GUI Project That Uses CMake

We are going to create a Qt GUI application project that uses CMake. If you have never used CMake with Qt Creator before, this is a good opportunity to learn about it, and it will also make integrating the OpenCV libraries into our Qt project a lot easier. So, start by creating a new project as seen in the next screenshot (choose Plain C++ Application):

[Screenshot: Qt Creator’s New File or Project dialog]

Press “Choose” and on the next screen make sure to name your project “Image Classifier” (or anything else you prefer). Also, choose CMake as the “Build System” on the next screen:

[Screenshot: choosing CMake as the build system]

After the project is created, replace all of the contents of the CMakeLists.txt file with the following (the comments describe why each line exists):

# Specify the minimum version of CMake (3.1 is currently recommended by Qt)
cmake_minimum_required(VERSION 3.1)

# Specify project title
project(ImageClassifier)

# To automatically run MOC (Meta-Object Compiler) when building
set(CMAKE_AUTOMOC ON)

# To automatically run UIC (User Interface Compiler) when building
set(CMAKE_AUTOUIC ON)

# To automatically run RCC (Resource Compiler) when building
set(CMAKE_AUTORCC ON)

# Specify OpenCV folder, and take care of dependencies and includes
set(OpenCV_DIR "path_to_opencv")
find_package(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})

# Take care of Qt dependencies
find_package(Qt5 COMPONENTS Core Gui Widgets REQUIRED)

# Add required source, header, UI, and resource files
add_executable(${PROJECT_NAME} "main.cpp" "mainwindow.h" "mainwindow.cpp" "mainwindow.ui")

# Link required libraries
target_link_libraries(${PROJECT_NAME} Qt5::Core Qt5::Gui Qt5::Widgets ${OpenCV_LIBS})

You can also download the final CMakeLists.txt file from here:

http://amin-ahmadi.com/downloadfiles/qt-opencv-tensorflow/CMakeLists.txt

Just make sure to replace “path_to_opencv” with the actual path to your OpenCV installation. That would be the folder where “OpenCVConfig.cmake” and “OpenCVConfig-version.cmake” files exist. Don’t worry about the mainwindow entries as they will be added later on.

Next, replace all of the contents of “main.cpp” with the following:

#include "mainwindow.h"
#include <QApplication>

int main(int argc, char* argv[])
{
	QApplication a(argc, argv);
	MainWindow w;
	w.show();

	return a.exec();
}

Now, to add a Main Window to our application, select “New File or Project” from the “File” menu and then choose Qt Designer Form Class as seen below:

[Screenshot: New File or Project dialog, Qt Designer Form Class]

Make sure to choose “MainWindow” in the next screen, as seen here:

[Screenshot: choosing the MainWindow template]

We’ll design a GUI similar to the following:

[Screenshot: the GUI design in Qt Creator]

You can also get “mainwindow.ui” file from here:

http://amin-ahmadi.com/downloadfiles/qt-opencv-tensorflow/mainwindow.ui

In the “mainwindow.h” file, you need to start by adding the required includes, as seen here:

#include <QMainWindow>
#include <QMessageBox>
#include <QDebug>
#include <QFile>
#include <QElapsedTimer>
#include <QGraphicsScene>
#include <QGraphicsPixmapItem>
#include <QCloseEvent>
#include <QFileDialog>
#include <opencv2/opencv.hpp>

We will also need the following private members:

cv::dnn::Net tfNetwork;
QGraphicsScene scene;
QGraphicsPixmapItem pixmap;
bool videoStopped;

Here, tfNetwork is OpenCV’s deep learning network class, scene and pixmap are used for displaying the output properly, and finally videoStopped is used as a flag to stop the video. We are going to keep things as simple as possible here.
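These members need a bit of wiring up; here is a minimal sketch of how that might look (the actual code is in the downloadable file below): the pixmap item is added to the scene, the scene is shown in the Graphics View, and videoStopped is raised when the window is closed:

MainWindow::MainWindow(QWidget *parent) :
	QMainWindow(parent),
	ui(new Ui::MainWindow)
{
	ui->setupUi(this);

	// Show the pixmap item through the scene in the Graphics View
	// widget (named videoView in the UI file)
	scene.addItem(&pixmap);
	ui->videoView->setScene(&scene);
}

void MainWindow::closeEvent(QCloseEvent *event)
{
	// Raise the flag so the capture loop can exit cleanly
	videoStopped = true;
	event->accept();
}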

You can download “mainwindow.h” file from here:

http://amin-ahmadi.com/downloadfiles/qt-opencv-tensorflow/mainwindow.h

“mainwindow.cpp” contains a number of methods that deal with user interactions, but the most important piece of code in it is the part responsible for loading the TensorFlow model and configuration files and then performing detections. Here is how it is done. First of all, the pre-trained model is loaded into the network (you’ll see how and where to get the models later on):

tfNetwork = readNetFromTensorflow(ui->pbFileEdit->text().toStdString(), ui->pbtxtFileEdit->text().toStdString());

pbFileEdit and pbtxtFileEdit in this code are two Qt Line Edit widgets that hold the paths to the required files. The next step is loading a video or an existing camera on the computer. Using two radio buttons, we can let the user switch between camera and video mode and then open the chosen one; here’s how:

VideoCapture video;
if (ui->cameraRadio->isChecked())
	video.open(ui->cameraSpin->value());
else
	video.open(ui->videoEdit->text().toStdString());

Here, cameraRadio is a Radio Button, cameraSpin is a Spin Box, and videoEdit is a Line Edit widget. The next thing we need to do is loop, reading and processing video frames until the video ends or is stopped. Here’s a simple solution for that:

Mat image;

while (!videoStopped && video.isOpened())
{
	video >> image;

	// Detect objects ...

	qApp->processEvents();
}

There are many different ways to achieve responsiveness in the GUI when performing such tasks, and this is one of them. The more recommended way would be to move this part of the code into a QThread (see the sketch below), but as mentioned before, we’ll keep things simple.
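If you do want to take the QThread route, a minimal sketch could look like the following (CaptureWorker, frameReady and showFrame are my own names, not part of this project’s code):

#include <QObject>
#include <atomic>
#include <opencv2/opencv.hpp>

// A worker that runs the capture loop off the GUI thread and hands each
// processed frame back through a signal; note that cv::Mat must be
// registered with qRegisterMetaType<cv::Mat>() for queued connections
class CaptureWorker : public QObject
{
	Q_OBJECT

public:
	std::atomic_bool stopped { false };

public slots:
	void process()
	{
		cv::VideoCapture video(0); // camera index hard-coded for brevity
		cv::Mat image;
		while (!stopped && video.isOpened())
		{
			video >> image;
			if (image.empty())
				break;
			// ... run the detection code on 'image' here ...
			emit frameReady(image.clone());
		}
		emit finished();
	}

signals:
	void frameReady(const cv::Mat &image);
	void finished();
};

// Wiring it up in MainWindow (showFrame would be a slot that updates
// the pixmap, running safely in the GUI thread):
//
//   auto *thread = new QThread(this);
//   auto *worker = new CaptureWorker;
//   worker->moveToThread(thread);
//   connect(thread, &QThread::started, worker, &CaptureWorker::process);
//   connect(worker, &CaptureWorker::frameReady, this, &MainWindow::showFrame);
//   thread->start();

With that aside, the actual detection part is done as seen below.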

First create a BLOB compatible with TensorFlow models:

Mat inputBlob = blobFromImage(image,
	inScaleFactor,
	Size(inWidth, inHeight),
	Scalar(meanVal, meanVal, meanVal),
	true,
	false);

The values passed to blobFromImage are defined as constants and come from the provider of the network (see the references section at the bottom for where they come from). Since blobFromImage subtracts the mean value and then multiplies by the scale factor, these constants map each 8-bit pixel value p to (p - 127.5) / 127.5, which is roughly the [-1, 1] range that the MobileNet-based models expect. Here’s what they are:

const int inWidth = 300;
const int inHeight = 300;
const float meanVal = 127.5; // 255 divided by 2
const float inScaleFactor = 1.0f / meanVal;

[Update April 04, 2019: If you are using OpenCV 4.x and TensorFlow 2.0, you will only need inScaleFactor; pass it a value of 0.95.]

To actually provide the blob to the network and get the detection results, do the following:

tfNetwork.setInput(inputBlob);
Mat result = tfNetwork.forward();
Mat detections(result.size[2], result.size[3], CV_32F, result.ptr<float>());

In the preceding code we simply set the network’s input to the prepared blob, then compute the result of the network using the forward() method, and finally construct a detections Mat whose rows and columns equal the third (height) and fourth (width) elements of the result’s Mat::size. See the following link, or the documentation of the imagesFromBlob function, if you feel lost:

https://docs.opencv.org/3.4.1/d6/d0f/group__dnn.html#ga4051b5fa2ed5f54b76c059a8625df9f5
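For reference, each row of detections follows the standard layout of SSD-style detection outputs in OpenCV’s dnn module, which the loop below relies on (the enum is simply my own annotation of the column indices):

enum DetectionColumn
{
	ImageId    = 0, // index of the input image (always 0 for a single image)
	ClassId    = 1, // index into the class names list
	Confidence = 2, // detection score between 0 and 1
	Left       = 3, // bounding box edges, normalized to the [0, 1] range
	Top        = 4,
	Right      = 5,
	Bottom     = 6
};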

The next part is extracting the detections that pass an acceptable confidence threshold, getting the bounding box for each object, printing the name of the detected object class over it (from a previously loaded list of labels), and finally displaying the result:

for (int i = 0; i < detections.rows; i++)
{
	float confidence = detections.at<float>(i, 2);

	if (confidence > confidenceThreshold)
	{
		using namespace cv;

		int objectClass = (int)(detections.at<float>(i, 1));

		int left = static_cast<int>(
			detections.at<float>(i, 3) * image.cols);
		int top = static_cast<int>(
			detections.at<float>(i, 4) * image.rows);
		int right = static_cast<int>(
			detections.at<float>(i, 5) * image.cols);
		int bottom = static_cast<int>(
			detections.at<float>(i, 6) * image.rows);

		rectangle(image, Point(left, top),
			Point(right, bottom), Scalar(0, 255, 0));
		String label = classNames[objectClass].toStdString();
		int baseLine = 0;
		Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX,
			0.5, 2, &baseLine);
		top = max(top, labelSize.height);
		rectangle(image, Point(left, top - labelSize.height),
			Point(left + labelSize.width, top + baseLine),
			Scalar(255, 255, 255), FILLED);
		putText(image, label, Point(left, top),
			FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
	}
}

// OpenCV stores images in BGR channel order, so rgbSwapped() is needed
// to get correct colors in the RGB888 QImage
pixmap.setPixmap(
	QPixmap::fromImage(QImage(image.data,
		image.cols,
		image.rows,
		image.step,
		QImage::Format_RGB888).rgbSwapped()));
ui->videoView->fitInView(&pixmap, Qt::KeepAspectRatio);

You can download mainwindow.cpp file from here:

http://amin-ahmadi.com/downloadfiles/qt-opencv-tensorflow/mainwindow.cpp

Our application is more than ready! But we still need to get and prepare a TensorFlow network, so, let’s move on to the next section.


Getting and Preparing TensorFlow Model

First of all, start by downloading a pre-trained model from the TensorFlow model zoo:

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

We will be using ssd_mobilenet_v1_coco which you can directly download from here:

http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2017_11_17.tar.gz

[Update April 04, 2019: Or you can use a more recent version such as this one.]

Extract it to get the ssd_mobilenet_v1_coco_2017_11_17 folder with the pre-trained files.

You need to get the text graph file for the model, one that is compatible with OpenCV. To do this you must use the following script:

opencv-source-files\samples\dnn\tf_text_graph_ssd.py

If you don’t have OpenCV source files (which would be quite strange at this point), you can get the script from here:

https://github.com/opencv/opencv/blob/master/samples/dnn/tf_text_graph_ssd.py

Just copy it to the ssd_mobilenet_v1_coco_2017_11_17 folder and execute the following:

tf_text_graph_ssd.py --input frozen_inference_graph.pb --output frozen_inference_graph.pbtxt

Update April 04, 2019: In more recent versions of OpenCV, you also need to copy the tf_text_graph_common.py file, in addition to tf_text_graph_ssd.py, into the folder mentioned above and execute the following command:

tf_text_graph_ssd.py --input frozen_inference_graph.pb --output frozen_inference_graph.pbtxt --config pipeline.config

Class names for this model can be found here:

https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_label_map.pbtxt

This file won’t be of use to us the way it is, so here’s a simpler (CSV) format that I’ve prepared to display class names during detection:

http://amin-ahmadi.com/downloadfiles/qt-opencv-tensorflow/class-names.txt
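Reading that file could look something like the following sketch (the classNames container, its QMap<int, QString> type, and the file path are assumptions based on how the labels are used in the detection loop above):

// Read "id,name" pairs from the class names file into a lookup map;
// requires the <QTextStream> header in addition to the includes above
QMap<int, QString> classNames;
QFile file("class-names.txt"); // path is a placeholder
if (file.open(QFile::ReadOnly | QFile::Text))
{
	QTextStream stream(&file);
	while (!stream.atEnd())
	{
		QStringList parts = stream.readLine().split(',');
		if (parts.size() == 2)
			classNames[parts[0].trimmed().toInt()] = parts[1].trimmed();
	}
}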

Now we have everything we need to run and test our classification app in action.


Running the Image Classifier Application

Run the application in Qt Creator and switch to the Settings tab at the bottom of the screen. You can choose the input files on this page, as seen in the screenshot:

[Screenshot: the Settings tab of the image classifier]

Now switch back to the Controls tab and hit the Start button. Try objects that exist in the class names list and see what the results look like. Here’s what I got, which is pretty good if we ignore the fact that my iPad is not exactly a TV:

[Screenshot: the image classifier detecting objects]

The following pages, one way or another, helped me a lot while writing this guide:

https://github.com/opencv/opencv/tree/master/samples/dnn

https://www.tensorflow.org/tutorials/image_retraining

References:

http://amin-ahmadi.com/2018/03/15/how-to-create-an-image-classifier-using-qt-opencv-and-tensorflow/

https://felgo.com/cross-platform-development/machine-learning-add-image-classification-for-ios-and-android-with-qt-and-tensorflow


To repost this article, please contact the original author for authorization, and please note that it comes from 陆宇振’s ScienceNet blog.

Link: https://wap.sciencenet.cn/blog-578676-1177281.html?mobile=1
