Literature Review

ABSTRACT

In recent years, drone technology has made its presence felt in many areas. For example, it is no longer unusual to see drones flying in the air in journalism and television, cargo delivery, agriculture, and emergency response. Drones are unmanned aerial vehicles; a drone consists of components such as propellers, motors, a frame, and a flight control board. We will automate the drone using artificial intelligence and image processing technologies, so that it can detect previously defined obstacles and reach its target. At the same time, we will ensure that when a breakage occurs, the situation is resolved promptly at minimal cost, that safety issues are kept to a minimum, and that the drone can react quickly to obstacles. In this literature review, we discuss projects previously carried out on this subject, how we will carry out our own project, the problems we may encounter, and their solutions. Keywords: Drone, Artificial Intelligence and Image Processing

INTRODUCTION

Drones are known in the technical field as Unmanned Aerial Vehicles (UAVs) [1]. They are robots that can be controlled by remote control or that fly automatically under the control of software added to their embedded systems. Drones, which in the past were used only to provide security in the military field, are now used in many varied and useful areas such as transportation, camera work, and fire extinguishing. Drones can perform the tasks assigned to them, such as taking off, landing, and flying from one place to another, through a remote controller. These controllers communicate with the drone using radio waves [2].

The aim of this study is to develop obstacle recognition capabilities for unmanned aerial vehicles using Image Processing and Artificial Intelligence methods, thereby enabling drones on autopilot to determine their position by detecting the objects around them.

Our Literature Review Report has a Work Part containing Drone Preparation; Image Processing with a Deep Neural Network Model; and ensuring that the software created with the Image Processing and Deep Artificial Neural Network Model communicates with Pixhawk to fly automatically to desired locations. It also covers some of the problems we may encounter during the project and their solutions. Finally, there is the conclusion, where we evaluate the results of all our studies.

WORK PART

2.1. Drone Preparation

This is the preparation phase of the test drone. Based on our research, we chose a multicopter-style platform [3]. In order to carry out tests easily, to repair the drone at an affordable cost after crashes and make it ready for flight again, and to keep safety risks low, we plan to build a small quadcopter-style drone, 20-30 cm in diameter, with 5-7 inch propellers. A quadcopter [4] is a system that keeps the flying vehicle balanced by processing the information coming from the gyroscope and other sensors in software running on its processor and issuing commands to the motors accordingly. The motors will be chosen to be powerful, with a drone payload of up to 500 grams.

The drone will carry a mini camera and a mini computer for image processing and artificial intelligence. The mini computer is planned to be a Jetson Xavier [5], which will benefit us in artificial intelligence coding and drone automation. According to the results of our research, we plan to use the open-source Pixhawk autopilot on the drone, together with a 3A regulator to feed this computer and camera. The Pixhawk autopilot is used in all kinds of open-source vehicles (multicopter, helicopter, airplane, etc.) and all multicopter types (tricopter, quadcopter, hexacopter, etc.) [6]. If a problem is encountered in later tests, we plan to keep the option of shutting off the motors with a remote command for safety reasons.

Communication between the Jetson and the Pixhawk autopilot is planned over the UART ports on both boards; UART is a communication protocol that connects computers and microcontrollers to their peripheral units. We plan to use a baud rate of 115,200 [7], and to structure the UART link with 8 data bits, a chosen parity setting, 1 stop bit, and suitable timeout values, applying these settings on both the Jetson Xavier and the Pixhawk autopilot. Linux Ubuntu is planned as the operating system on the Jetson Xavier.

The Pixhawk autopilot can drive standard motor drivers at 50 hertz, but we plan to raise the motor driver update rate to 400 hertz so that the drone responds faster. The images taken from the camera will be processed on the Jetson using image processing and artificial intelligence, the guidance information obtained will be transferred quickly to the Pixhawk autopilot over the UART, and the Pixhawk will fly with automatic responses at 400 hertz. The image processing model with the artificial neural network on our Jetson board is explained in the next section.
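As a rough illustration of the serial link described above, here is a minimal Jetson-side sketch using the pyserial library with the planned settings. The device path /dev/ttyTHS0, the 8N1 (no parity) framing, and the message content are illustrative assumptions, not confirmed project choices; in practice the Pixhawk speaks the MAVLink protocol (e.g., via pymavlink), so the raw bytes below are only a placeholder.

```python
# Minimal UART setup sketch, assuming the pyserial library; the device
# path, 8N1 framing, and message bytes are illustrative assumptions.
import serial

port = serial.Serial(
    "/dev/ttyTHS0",                # hypothetical UART device on the Jetson
    baudrate=115200,               # baud rate planned in the text
    bytesize=serial.EIGHTBITS,     # 8 data bits
    parity=serial.PARITY_NONE,     # assuming no parity (8N1)
    stopbits=serial.STOPBITS_ONE,  # 1 stop bit
    timeout=0.05,                  # illustrative read timeout, in seconds
)

port.write(b"GUIDANCE:FORWARD\n")  # placeholder for a real MAVLink message
reply = port.read(64)              # read up to 64 bytes from the autopilot
port.close()
```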

2.2. Image Processing with Deep Neural Network Model

According to our research, a dataset will first be created from images of the objects the drone is meant to detect, photographed from different angles. This dataset will not be used in its entirety for training: 80% will be reserved for training and 20% for testing [8]. Based on the results of our research, we plan to train a CNN (Convolutional Neural Network) on this dataset and to use the YOLO (You Only Look Once) [9] algorithm, an object detection algorithm built on CNNs, to process the training data. There are several reasons why we chose YOLO: its speed, its accuracy, and its learning capacity for object detection and recognition. YOLO [10] is generally superior to other algorithms in these respects, which is why we plan to use it. The importance of these points is outlined below, after a short sketch of the planned data split.
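As a minimal sketch of the 80/20 split described above, using only NumPy (one of the libraries named in the conclusion); the `image_paths` array stands in for the real collected dataset:

```python
# An 80/20 train/test split sketch using NumPy; `image_paths` is an
# illustrative placeholder for the collected image file names.
import numpy as np

image_paths = np.array([f"object_{i:03d}.jpg" for i in range(100)])

rng = np.random.default_rng(seed=42)         # fixed seed: reproducible split
indices = rng.permutation(len(image_paths))  # shuffle before splitting

split = int(0.8 * len(image_paths))          # 80% of the data for training
train_files = image_paths[indices[:split]]
test_files = image_paths[indices[split:]]    # remaining 20% for testing

print(len(train_files), len(test_files))     # -> 80 20
```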

Speed: In systems that must run in real time, objects have to be detected very quickly. The drone is an agile platform that can move very fast, so speed is one of the most important issues for us. YOLO, the algorithm we will use, is roughly 4-5 times faster than comparable object detection algorithms (such as RetinaNet-101 and RetinaNet-50).

Accuracy: Accuracy matters because the drone moves quickly in a live setting, and YOLO can detect objects with a very low error rate. It is, in short, an algorithm that detects objects both quickly and with high accuracy.

Learning capacity: The YOLO algorithm [11] scales well as the dataset is expanded. Some other algorithms suffer from poor prediction performance (especially in accuracy) when the dataset grows large. With YOLO, a large number of object classes can be recognized quickly and accurately on extensible datasets.

YOLO's high detection speed and prediction performance during the drone's fast movements, together with its expandable learning capacity, made us prefer this algorithm [12]. The model formed by training YOLO's CNN will be run on our test dataset, and our accuracy rate will be revealed by the results obtained [13].

We will also include images of other objects in the test data; the trained model's error rate on these images will show how the model behaves when faced with unfamiliar objects. Our accuracy rate will be determined by the tests we perform on the test data.

Figure 1: YOLO Architecture

Figure 2: CNN Architecture

CNN (Convolutional Neural Networks)

First used in its modern form in 1990, the CNN is the class of artificial neural networks most commonly used for image analysis in deep learning. Because CNNs use ReLU as the activation function, they can be trained on large amounts of data without excessive computational burden. A CNN is built from three basic components: the Convolutional Layer, the Pooling Layer, and the Fully Connected (FC) Layer.

Convolutional Layer

The images in our training data have dimensions W x H x 3, where W is the width, H is the height, and 3 is the number of RGB (Red, Green, Blue) channels. W x H defines a matrix in which each cell represents a pixel, and each pixel holds an RGB value between 0 and 255. An N x N filter is chosen according to the dimensions of the image matrices. The feature map of the image is created by sliding this filter over the image according to the stride value (the step size). This process is repeated many times with different filters. In this way, information such as the object's colors, corners, and protrusions is extracted into the feature maps, as the sketch below illustrates.
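The following sketch illustrates this sliding-filter operation in plain NumPy on a single grayscale channel; the toy image and the 3 x 3 edge filter are made up for illustration:

```python
# A toy single-channel convolution showing the sliding-filter idea; the
# image values and the 3x3 filter are illustrative, not project data.
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide an N x N kernel over the image and return the feature map."""
    h, w = image.shape
    n = kernel.shape[0]
    out_h = (h - n) // stride + 1
    out_w = (w - n) // stride + 1
    fmap = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            region = image[i * stride:i * stride + n,
                           j * stride:j * stride + n]
            fmap[i, j] = np.sum(region * kernel)  # multiply and sum
    return fmap

# A 5x5 image with a vertical edge between dark (0) and bright (255) pixels
image = np.full((5, 5), 255.0)
image[:, :2] = 0.0

vertical_edge = np.array([[-1., 0., 1.],
                          [-1., 0., 1.],
                          [-1., 0., 1.]])

print(conv2d(image, vertical_edge))  # strongest response along the edge
```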

ReLU

ReLU (Rectified Linear Unit) is a nonlinear function defined as f(x) = max(0, x) [15]. If the input is negative, ReLU outputs 0; if it is positive, it outputs the value itself. ReLU, whose main purpose is to eliminate negative values, holds a very important place in CNNs. Nonlinear functions such as ReLU, tanh, and sigmoid are used so that our model does not learn from spurious negative values or miss features because of them [14].
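A one-line NumPy illustration of this definition (the input values are made up):

```python
# ReLU applied element-wise with NumPy: negatives become 0, positives pass.
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])  # illustrative inputs
print(np.maximum(0, x))                    # -> [0.  0.  0.  1.5 3. ]
```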

Max Pooling

Max Pooling reduces the size of large images without losing their essential properties [15]. It is performed by sliding an N x N window over our W x H image; within each region the size of the N x N pooling window, the highest value is taken as the new pixel value and the other values are discarded. Because the new value is taken directly from the image rather than derived, the image is not distorted. Feature extraction takes place through these operations, as sketched below.
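A minimal NumPy sketch of 2 x 2 max pooling on a toy feature map (the values are illustrative):

```python
# 2x2 max pooling on a toy feature map; the input values are illustrative.
import numpy as np

def max_pool(fmap, n=2):
    """Keep the largest value in each non-overlapping n x n region."""
    h, w = fmap.shape
    trimmed = fmap[:h - h % n, :w - w % n]            # trim so n divides evenly
    th, tw = trimmed.shape
    blocks = trimmed.reshape(th // n, n, tw // n, n)  # group into n x n blocks
    return blocks.max(axis=(1, 3))                    # highest value per block

fmap = np.array([[1., 0., 2., 4.],
                 [0., 5., 0., 2.],
                 [0., 1., 3., 0.],
                 [4., 0., 0., 6.]])

print(max_pool(fmap))  # -> [[5. 4.]
                       #     [4. 6.]]
```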

Fully Connected Layer

The matrix that reaches the Fully Connected layer is flattened into a single row of values [2]. These values are treated as the Input Layer. Our neural network model has three stages: the Input, Hidden, and Output Layers. The Input Layer takes the flattened data as input and passes it to the hidden layers; the number of connections between them is the input-layer node count multiplied by the hidden-layer node count. The values arriving at the Hidden Layer are connected to the Output Layer according to the outputs we want. In our project, we will have one output per object class we have defined, plus one output for the absence of any of these objects, and the number of connections will be the hidden-layer node count multiplied by the output-layer node count. The final value will be the detected object's name, or the absence of an object. As soon as an object is detected, our drone will react accordingly. A small sketch of this classifier head follows.
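As a rough illustration with Keras (one of the libraries named in the conclusion), the sketch below builds such a flatten + hidden + output head. The feature-map shape, the hidden-layer size of 128, and the count of 5 object classes are assumptions made only for this example:

```python
# A sketch of the fully connected head described above, using Keras; the
# feature-map shape, hidden size, and class count are illustrative.
from tensorflow import keras

num_objects = 5  # hypothetical number of object classes defined in the project

model = keras.Sequential([
    keras.Input(shape=(7, 7, 64)),               # assumed feature-map shape
    keras.layers.Flatten(),                      # 7*7*64 = 3136 input nodes
    keras.layers.Dense(128, activation="relu"),  # 3136 * 128 connections
    keras.layers.Dense(num_objects + 1,          # one output per class,
                       activation="softmax"),    # plus one for "no object";
])                                               # 128 * 6 connections

model.summary()  # prints layer shapes and connection (parameter) counts
```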

2.3. Ensuring that the software created with the Image Processing and Deep Artificial Neural Network Model communicates with Pixhawk to fly automatically to desired locations

Based on our research, we plan to develop the image processing and artificial neural network application in a structure that can process the camera feed at 20 fps and make decisions at 20 hertz accordingly, so that each frame is processed as soon as it arrives and the resulting information is passed on immediately. If the processor were on the ground, there would be several hundred milliseconds of delay from RF (radio frequency) transmission and reception; since the processor is on the drone, communication runs over cables from the Jetson Xavier through the UART to the autopilot at 115,200 baud, without RF delay, so the drone can react as quickly as possible. Considering that the drone can exceed 60 km/h, this minimal delay is very important for avoiding problems. The system will make decisions with over 90 percent accuracy on each frame; when the second frame arrives, the combined accuracy becomes 90 percent plus 90 percent of the remaining 10 percent, 99 percent in total, and with the third and fourth frames it rises further still [16]. We will test the system in many flight trials and observe the accuracy rate, and we will keep updating the image processing and artificial neural network algorithms against potential problems, accelerating them where necessary [17].
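A small sketch of the frame-to-frame accuracy argument above: if each frame is treated as an independent decision with per-frame accuracy p, the chance that all of n consecutive frames are wrong is (1 - p)^n, so the combined accuracy is 1 - (1 - p)^n. The independence of consecutive frames is an assumption of this simple model:

```python
# Combined accuracy over consecutive frames, assuming (as a simplification)
# that each frame is an independent decision with per-frame accuracy p.
def combined_accuracy(p: float, n: int) -> float:
    """Probability that the decision is correct within n frames."""
    return 1 - (1 - p) ** n

for n in range(1, 5):
    print(n, f"{combined_accuracy(0.9, n):.4f}")
# -> 1 0.9000, 2 0.9900, 3 0.9990, 4 0.9999 (the 90% -> 99% argument)
```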

CONCLUSION

In conclusion, the aim of this study is to model possible obstacles for drones using Image Processing and Artificial Intelligence techniques, and to enable drones on autopilot to pass these obstacles and reach the target location. According to the results of our research, we will work in the CNN and Image Processing sub-branches of computer programming, use the PyCharm IDE, LabelImg, and cloud GPU services, and take the YOLO algorithm as our reference method. We will use the TensorFlow, Keras, OpenCV, and NumPy libraries for this project. A low confidence score in object detection is seen as a potential problem; a possible solution is to expand the dataset and add images from different angles.

REFERENCES

[1] About UAVs [Online]. Available:

https://en.wikipedia.org/wiki/Unmanned_aerial_vehicle

[Accessed 02/11/2021]

[2] Study about Drone [Online]. Available:

https://internetofthingsagenda.techtarget.com/definition/drone

[Accessed 02/11/2021]

[3] Study about “Attitude Control of Multicopter” [Online]. Available:

https://core.ac.uk/download/pdf/295548558.pdf

[Accessed 04/11/2021]

[4] Study about “Building a Quadcopter with Arduino” [Online]. Available:

https://abis.bozok.edu.tr/indir.php?file_id=710

[Accessed 06/11/2021]

[5] Jetson Xavier [Online]. Available:

https://developer.nvidia.com/embedded/jetson-agx-xavier-developer-kit

[Accessed 06/11/2021]

[6] About multicopters [Online]. Available:

https://www.dronedoktoru.com/multikopter-nedir.html

[Accessed 06/11/2021]

[7] How UART works [Online]. Available:

https://herenkeskin.com/uart-nedir-ve-nasil-calisir/

[Accessed 07/11/2021]

[8] Shobhit Bhatnagar, "Classification of Fashion Article Images using Convolutional Neural Networks," 2017 Fourth International Conference on Image Information Processing (ICIIP) [Online]. Available:

https://ieeexplore.ieee.org/document/8313740

[Accessed 08/11/2021]

[9] Xia Zhao, “A novel three-dimensional object detection with the modified You Only Look Once method,” 2018, International Journal of Advanced Robotic Systems [Online]. Available:

https://journals.sagepub.com/doi/full/10.1177/1729881418765507

[Accessed 08/11/2021]

[10] Daniel Pestana, "A Full Featured Configurable Accelerator for Object Detection with YOLO," 2021, IEEE [Online]. Available:

https://ieeexplore.ieee.org/document/9435338

[Accessed 09/11/2021]

[11] About YOLO algorithm [Online]. Available:

https://medium.com/deep-learning-turkiye/yolo-algoritmas%C4%B1n%C4%B1-anlamak-290f2152808f

[Accessed 09/11/2021]

[12] About YOLO algorithm [Online]. Available:

https://www.youtube.com/watch?v=vRqSO6RsptU

[Accessed 10/11/2021]

[13] You Only Look Once: Unified, Real-Time Object Detection [Online]. Available:

https://arxiv.org/pdf/1506.02640.pdf

[Accessed 10/11/2021]

[14] About Residual blocks — Building blocks of ResNet [Online]. Available:

https://towardsdatascience.com/residual-blocks-building-blocks-of-resnet-fd90ca15d6ec

[Accessed 11/11/2021]

[15] Convolutional Neural Network (CNN) [Online]. Available:

https://medium.com/@tuncerergin/convolutional-neural-network-convnet-yada-cnn-nedir-nasil-calisir-97a0f5d34cad

[Accessed 11/11/2021]

[16] Smart Autopilot Drone System for Surface Surveillance and Anomaly Detection via Customizable Deep Neural Network [Online]. Available:

https://www.researchgate.net/publication/338529296_Smart_Autopilot_Drone_System_for_Surface_Surveillance_and_Anomaly_Detection_via_Customizable_Deep_Neural_Network

[Accessed 12/11/2021]

[17] Detection of a Moving UAV Based on Deep Learning-Based Distance Estimation [Online]. Available:

https://www.mdpi.com/2072-4292/12/18/3035/htm

[Accessed 13/11/2021]