Advanced Science and Technology Letters Vol.142 (SIT 2016), pp.103-110 http://dx.doi.org/10.14257/astl.2016.142.19

Deep Learning Based Real-time Object Recognition System with Image Web Crawler

Myung-jae Lee1, Hyeok-june Jeong1, Young-guk Ha2

1 Department of Computer Science & Engineering, Konkuk University, Neungdong-ro, Gwangjin-gu, Seoul 143-701, Korea {dualespresso,amitajung}@naver.com
2 Department of Computer Science & Engineering, Konkuk University, Neungdong-ro, Gwangjin-gu, Seoul 143-701, Korea [email protected]

Abstract. Recently, deep learning algorithms have become an effective solution in various fields. The Convolutional Neural Network (CNN), a kind of neural network, is known to be well suited to image processing. This paper proposes a real-time object recognition system based on a CNN. Since deep learning requires many images, the system includes an image web crawler that collects images automatically. The paper reports high accuracy in object recognition.

Keywords: deep learning, object recognition, CNN, crawler

1 Introduction

There is a large amount of data on the Internet. These data become useful with suitable processing such as big data analysis, which is the process of collecting, organizing and analyzing large amounts of data to discover patterns and useful information. There are many types of data, and one of them is the image. Images carry a lot of information and are used in various systems such as road speed cameras, license plate recognition systems and Google image search. Object recognition in images is an active research topic because it means that a system can understand a scene in a way similar to how humans do. In other words, object recognition is related to the field of Artificial Intelligence.

Furthermore, the growth of deep learning algorithms has accelerated the development of object recognition systems. Deep learning, which is a part of machine learning, is used in research and industry to help solve big data problems. It has various architectures, such as the Convolutional Neural Network (CNN). A CNN, which is inspired by the organization of the animal visual cortex, is a feed-forward network. It can be used in computer vision systems such as object recognition systems. Deep learning with a CNN can help analyze image data: it is trained on a large number of categorized images and then helps with recognition. In other words, deep learning needs a lot of images. These images can be collected from the Internet. However, categorizing the data is the most important task before using it, because data that are usable and unusable for a given purpose are mixed together on the Internet. For this reason, research on data collection is increasing. Web crawling is one such collection method. A web crawler collects data according to established categories and helps manage the data.

This paper proposes a real-time object recognition system with a web crawler. The system collects images automatically and trains on the collected images. With the trained model, the system recognizes objects in real time with a camera. The system focuses on objects that appear on the road, such as cars, traffic signs and police.

2 Corresponding author

ISSN: 2287-1233 ASTL Copyright © 2016 SERSC

2 Related Work

There has been significant research aimed at achieving object recognition in images. Several papers have proposed using the Scale Invariant Feature Transform (SIFT) algorithm [1,2]. Paper [1] suggested an image recognition system with a pyramidal descriptor adapted from the SIFT algorithm, and paper [2] proposed an image recognition system for colorectal polyp histology with SIFT. Both are well-designed systems; however, their processing is simple, so their results can be incorrect. Research [3] suggested a persimmon growth monitoring system based on image analysis, and paper [4] proposed an image recognition system with a three-dimensional Speeded Up Robust Features (SURF) algorithm. However, both can also produce erroneous results. For this reason, many studies have aimed to achieve object recognition with deep learning. Paper [5] proposed a deep learning based visual tracking system, and research [6] suggested a multiple instance analysis system with deep learning. Both focused on the use of deep learning in object recognition rather than on performance. To improve performance, various designs have been suggested. Related work [7] suggested a very deep CNN for large-scale image recognition. Paper [8] presented hierarchical feature extraction to improve image recognition performance. Research [9] proposed a simple learning network for fast image recognition. These studies showed great performance, but they provide neither a system for managing the training images nor a real-time system.

This paper suggests a real-time image recognition system with an image crawler. The system aims to recognize objects that are encountered on the road, such as cars, ambulances and pedestrians. It also proposes a way to collect and manage images and to recognize objects in real time.


3 System Design

Fig. 1. Overall architecture of this system.

Figure 1 shows the overall architecture of the system. The Image Web Crawler designs an ontology and collects images according to it. It crawls images automatically and saves them in the Hadoop Distributed File System (HDFS). The Image Trainer fetches the images from HDFS and learns from them. After learning, the Image Trainer produces a DNN model profile, which is the fundamental input of the Image Recognizer. The Image Recognizer builds a DNN model from the downloaded DNN model profile and detects objects in images captured from the camera in real time.

3.1 Image Web Crawler

Fig. 2. System Flow of Image Web Crawler

Figure 2 shows the automatic image web crawling system. The Ontology Manager generates the ontology and instance files. An ontology turns real-world experience into modeled concepts that a computer can process. The ontology managed here consists of various objects found on the road. The Web Page Searcher searches for keywords taken from the ontology instances and retrieves the web page source. This study used Selenium with the Google Chrome Driver for page searching. The Image Crawler extracts image URLs from the parsed web source. The File Handler saves images to HDFS. Before saving, it checks for duplicate URLs and downloads the images that the URLs point to. This study built HDFS on a cluster server with 60 virtual nodes. HDFS is a suitable system for storing big data.
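The URL extraction and duplicate check described above can be illustrated with a minimal sketch. It uses only the Python standard library; the names ImageUrlParser, collect_image_urls and SAMPLE_PAGE are hypothetical, and the sample page source merely stands in for what the Web Page Searcher would return. The Selenium-based page retrieval and the HDFS storage used in this study are not reproduced here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageUrlParser(HTMLParser):
    """Collects the src attribute of every <img> tag in a page source."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        for name, value in attrs:
            if name == "src" and value:
                # Resolve relative paths against the page URL.
                self.urls.append(urljoin(self.base_url, value))


def collect_image_urls(page_source, base_url, seen):
    """Return image URLs from page_source that are not already in `seen`.

    `seen` is updated in place so duplicates are skipped across pages.
    """
    parser = ImageUrlParser(base_url)
    parser.feed(page_source)
    fresh = []
    for url in parser.urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


# Hypothetical page source; the first two tags resolve to the same URL.
SAMPLE_PAGE = """
<html><body>
  <img src="/img/police_car_1.jpg">
  <img src="http://example.com/img/police_car_1.jpg">
  <img src="http://example.com/img/ambulance_1.jpg">
</body></html>
"""

seen_urls = set()
new_urls = collect_image_urls(SAMPLE_PAGE, "http://example.com/search", seen_urls)
print(new_urls)
```

Keeping the set of seen URLs outside the function lets one crawl session share it across many result pages, so an image that appears under several keywords is fetched only once.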


3.2 Image Trainer

Fig. 3. System Flow of Image Trainer

As shown in Figure 3, the Image Trainer is a three-layered process. The Image Downloader fetches the large image data set from HDFS. Images saved in HDFS are classified by category, and the Image Downloader fetches them as they are. The Image Learner trains on the images with deep learning. As the deep learning network, this study used a Convolutional Neural Network (CNN) to recognize objects in images.
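Because the downloaded images are already grouped by category, turning them into labeled training input mostly amounts to assigning each category an integer label. The following is a minimal sketch under the assumption that each category is stored as one sub-directory; the function name build_listing and the demo directory contents are hypothetical. The "relative/path label" line format is assumed here because Caffe's image conversion tooling commonly consumes such a list; the paper does not state how its training input was prepared.

```python
import os
import tempfile


def build_listing(root_dir):
    """Map each category sub-directory to an integer label and list its images.

    Returns (labels, lines): labels maps category name -> label id, and
    lines holds "relative/path label" strings, one per image.
    """
    categories = sorted(
        d for d in os.listdir(root_dir)
        if os.path.isdir(os.path.join(root_dir, d))
    )
    labels = {name: idx for idx, name in enumerate(categories)}
    lines = []
    for name in categories:
        for fname in sorted(os.listdir(os.path.join(root_dir, name))):
            if fname.lower().endswith((".jpg", ".jpeg", ".png")):
                lines.append(f"{name}/{fname} {labels[name]}")
    return labels, lines


# Build a tiny throwaway directory tree standing in for the downloaded images.
demo_root = tempfile.mkdtemp()
for category, files in {"car": ["a.jpg", "b.jpg"], "police": ["c.png"]}.items():
    os.makedirs(os.path.join(demo_root, category))
    for fname in files:
        open(os.path.join(demo_root, category, fname), "wb").close()

labels, lines = build_listing(demo_root)
print(labels)
print("\n".join(lines))
```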

Fig. 4. Graph and Example of Overfitting

A CNN uses multiple filters, each of which focuses on a small area and produces one number. By focusing on small areas repeatedly, the features of an image are found. However, this useful network was not widely used until a few years ago. Because it focuses on small areas repeatedly, what it learns becomes very detailed. As a result, a trained system may recognize only the images it was trained on and fail on test images that belong to the same category but were not used in training. This is called overfitting. For example, a system trained on Police Car images cannot recognize a Police Car it was not trained on, as shown in Figure 4. However, since the dropout concept was proposed [10], the overfitting problem has been greatly reduced. Dropout reduces overfitting and increases accuracy.

Deep learning with a CNN can be implemented with various libraries. This paper used the Caffe framework, which is considered fast and is modular, with C++, Python and MATLAB interfaces. The DNN Model Profile Manager creates a DNN model profile from the result of image training and sends it to the Image Recognizer.

3.3 Image Recognizer

Fig. 5. System Flow of Image Recognizer

As shown in Figure 5, the Image Recognizer has two inputs: the captured image and the DNN Model Profile. The DNN Model Regenerator generates a DNN model from the received DNN Model Profile. Because each Image Recognizer regenerates the DNN model itself, multiple Image Recognizers can be used in the system. The Image Receiver captures images from the camera. The Object Recognizer detects objects in the received images with the DNN model regenerated by the DNN Model Regenerator. The Recognition Result Logger saves the results of object recognition. This log can be used as feedback for the system.
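The flow of the four components can be summarized in a short sketch. Everything here is a toy stand-in: ModelProfile, ResultLogger, regenerate_model and run_recognizer are hypothetical names, frames are plain lists of pixel values instead of camera images, and the "model" is a nearest-brightness rule rather than a CNN. The sketch only shows how a profile is turned into a model once and then reused for every frame while results are logged.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ModelProfile:
    """Hypothetical stand-in for the DNN model profile sent by the Image Trainer."""
    class_names: list
    prototypes: list  # one reference brightness per class (toy parameters)


@dataclass
class ResultLogger:
    """Keeps recognition results in memory; a real logger would persist them."""
    records: list = field(default_factory=list)

    def log(self, frame_id, label, elapsed_ms):
        self.records.append({"frame": frame_id, "label": label, "ms": elapsed_ms})


def regenerate_model(profile):
    """Rebuild a classifier from a profile (toy rule, not a real CNN)."""
    def classify(frame):
        mean = sum(frame) / len(frame)
        # Pick the class whose prototype brightness is closest to the frame mean.
        distances = [abs(mean - p) for p in profile.prototypes]
        return profile.class_names[distances.index(min(distances))]
    return classify


def run_recognizer(profile, frames, logger):
    model = regenerate_model(profile)              # DNN Model Regenerator
    for frame_id, frame in enumerate(frames):      # Image Receiver
        start = time.perf_counter()
        label = model(frame)                       # Object Recognizer
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        logger.log(frame_id, label, elapsed_ms)    # Recognition Result Logger
    return logger.records


profile = ModelProfile(class_names=["car", "ambulance"], prototypes=[50.0, 200.0])
frames = [[40, 60, 55], [210, 190, 205]]
records = run_recognizer(profile, frames, ResultLogger())
print([r["label"] for r in records])
```

Since the recognizer needs only the profile, several recognizers can be started from the same profile without contacting the trainer again.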

4 Implementation

The proposed Image Trainer is implemented on Ubuntu 14.04. To increase learning performance, we used four GPGPUs and a high-performance CPU. The Caffe library was used for deep learning and CUDA was used to drive the GPGPUs. The Image Recognizer is also implemented on Ubuntu 14.04 and uses a GPGPU for image recognition.

Table 1. Implementation Environment

            Image Trainer             Image Recognizer
CPU         Intel Xeon E5 2.40GHz     Intel i7 3.60 GHz
RAM         128GB                     16GB
HDD         1TB SSD                   256GB SSD
GPGPU       Geforce GTX 1080 * 4      Geforce GTX 1080
OS          Ubuntu 14.04 LTS          Ubuntu 14.04 LTS
Libraries   CUDA, OpenCV, Caffe       CUDA, OpenCV, Caffe

This implementation was trained on 65,000 images collected by the Image Web Crawler at common resolutions, ranging from 640x480 to 1920x1080. To increase accuracy, training ran for 100,000 iterations with 25 network layers. The recognizer experiment uses images captured from the camera in real time.

Fig. 6. A Part of the Designed Ontology

The ontology was designed as shown in Figure 6. It comprises various objects that are found on the road. Instances are located at the bottom of the ontology tree, and sub-instances are the children of instances. For example, two sub-instances, Kia K5 and Hyundai Sonata, are children of Mid-size Car. These sub-instances are used as keywords to search for images.
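A minimal sketch of how search keywords could be read off such a tree is given below. The nested-dictionary representation and the function name iter_keywords are assumptions made for illustration; the parent nodes "Road Object" and "Car" are hypothetical placeholders, and only Mid-size Car with its two sub-instances comes from the example in the text.

```python
# Toy ontology: dictionaries are inner nodes, lists hold sub-instances.
ONTOLOGY = {
    "Road Object": {                 # hypothetical root
        "Car": {                     # hypothetical parent of the instance
            "Mid-size Car": ["Kia K5", "Hyundai Sonata"],
        },
    },
}


def iter_keywords(node, path=()):
    """Yield (path, keyword) for every sub-instance at the bottom of the tree."""
    if isinstance(node, dict):
        for name, child in node.items():
            yield from iter_keywords(child, path + (name,))
    else:
        for keyword in node:
            yield path, keyword


keywords = list(iter_keywords(ONTOLOGY))
for path, keyword in keywords:
    print(" > ".join(path), "->", keyword)
```

Carrying the path along with each keyword keeps the category of every crawled image known, so images can be stored already classified.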

Fig. 7. Designed Convolutional Neural Network

As shown in Figure 7, the convolutional neural network is designed with 24 layers. All of the small image regions that are focused on use this network. To address the vanishing gradient problem, a ReLU layer is used as the activation function after every convolutional layer. If the system used the sigmoid function as the activation function, the gradient could shrink toward zero as it propagates back through many layers. The ReLU function avoids this problem at a low computational cost. However, the outputs of these layers remain large, which can be a critical problem for the learning algorithm because it increases computation time and can exhaust memory. The pooling layer is the solution to this problem: with a pooling layer, the input size passed to the following layers is reduced. Local Response Normalization (LRN) and Dropout layers help prevent overfitting. The Fully Connected (FC) layers, which follow the convolutional layers, classify the images. In the output layer, a Softmax layer transforms the result values into probabilities.
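Two of the points above, the vanishing gradient with sigmoid activations and the conversion of output values into probabilities by softmax, can be checked numerically with a small sketch. It is a simplification under stated assumptions: weights are ignored, the same input is used at every layer, and the depth of 24 is borrowed from the layer count quoted above purely for illustration (not every layer in the network is an activation layer). All function names are hypothetical.

```python
import math


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25, reached at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, x, depth):
    """Product of `depth` identical activation derivatives (weights ignored)."""
    return grad_fn(x) ** depth


def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


depth = 24
sigmoid_chain = chained_gradient(sigmoid_grad, 0.0, depth)  # best case for sigmoid
relu_chain = chained_gradient(relu_grad, 1.0, depth)
probs = softmax([2.0, 1.0, 0.1])

print("sigmoid chain:", sigmoid_chain)
print("ReLU chain:   ", relu_chain)
print("softmax:      ", probs)
```

Even in the best case for the sigmoid, the chained factor is 0.25 raised to the depth, whereas the ReLU factor stays at 1 for active units.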

Fig. 8. Result of implementation

The experiment trained on 65,000 images for 100,000 iterations. Training took 48 hours. The recognition system classified images into 17 classes. Object recognition accuracy was 99%, and the average computation time was below 50 ms.
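For reference, the two reported figures correspond to simple aggregates over recognition records: accuracy is the fraction of correct predictions and the average computation time is the mean per-image latency. The sketch below shows the arithmetic on made-up records; the values are illustrative only and are not the measurements of this experiment, and the record fields and the function name summarize are hypothetical.

```python
def summarize(records):
    """Return (accuracy, mean latency in ms) for a list of recognition records."""
    if not records:
        raise ValueError("no records to summarize")
    total = len(records)
    correct = sum(1 for r in records if r["predicted"] == r["actual"])
    mean_ms = sum(r["ms"] for r in records) / total
    return correct / total, mean_ms


# Made-up records for illustration; not data from the paper.
toy_records = [
    {"predicted": "car", "actual": "car", "ms": 40.0},
    {"predicted": "police", "actual": "police", "ms": 45.0},
    {"predicted": "car", "actual": "ambulance", "ms": 50.0},
    {"predicted": "ambulance", "actual": "ambulance", "ms": 45.0},
]

accuracy, mean_ms = summarize(toy_records)
print(accuracy, mean_ms)
```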

5 Conclusion

This paper proposed a real-time object recognition system with an image web crawler. The proposed crawling system was designed for flexibility: it can be modified easily through the ontology. This paper designed a CNN to achieve high performance, and the deep learning system achieved high accuracy. The recognizer was designed to run on its own once the DNN model profile is provided. In other words, the recognizer does not need to exchange images or recognition data with the deep learning system; it only needs to download the DNN model profile once. The proposed system will be extended in the near future. The recognition system will be applied to the crawling process: a recognizer embedded in the crawling process will check whether each image is suitable. This will raise the quality of the images used for training.


Acknowledgments. This work was supported by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (R7118-16-1002, Development of Driving Computing System Supporting Real-time Sensor Fusion Processing for Self-Driving Car).

References

1. Seidenari, L.: Local pyramidal descriptors for image recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 36.5 (2014): 1033-1040.
2. Kominami, Y.: Computer-aided diagnosis of colorectal polyp histology by using a real-time image recognition system and narrow-band imaging magnifying colonoscopy. Gastrointestinal Endoscopy 83.3 (2016): 643-649.
3. Chang, K.-C.: Design of persimmon growing stage monitoring system using image recognition technique. Consumer Electronics-Taiwan (ICCE-TW), 2016 IEEE International Conference on. IEEE, 2016.
4. Redondo-Cabrera, C.: Surfing the point clouds: Selective 3d spatial pyramids for category-level object recognition. Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
5. Wang, N., Yeung, D.-Y.: Learning a deep compact image representation for visual tracking. Advances in Neural Information Processing Systems. 2013.
6. Xu, Y.: Deep learning of feature representation with multiple instance learning for medical image analysis. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014.
7. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).
8. Li, H.: Hierarchical feature extraction with local neural response for image recognition. IEEE Transactions on Cybernetics 43.2 (2013): 412-424.
9. Chan, T.-H.: PCANet: A simple deep learning baseline for image classification? IEEE Transactions on Image Processing 24.12 (2015): 5017-5032.
10. Srivastava, N.: Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research 15.1 (2014): 1929-1958.
