Image Recognition Using Deep Learning

Student: Bruno Barbosa
Supervisors: António Ramires and Manuel João Ferreira (Neadvance)

Abstract

Computer vision is a broad field concerned with translating digital images and videos into higher-level, understandable information. Image recognition is one of the tasks within this field, and it can be subdivided into object recognition (also called object classification), segmentation, identification, and detection.

Some of the available approaches to image recognition are based on machine learning. Deep learning is a branch of machine learning that has become very popular in recent years due to its success in tasks previously considered hard. A few years ago, the lack of large amounts of data and of efficient computational resources was a barrier to the expansion of deep learning. However, thanks to today's easy access to data and the development of more powerful computational resources, including both CPUs and GPUs, attention has returned to it, and it has become easier and faster to train a model that can distinguish between different classes with a very low error rate. One interesting property of deep learning is its ability to learn automatically from data and to identify its most discriminative features.

From an industry point of view, many artificial vision inspection lines still rely on traditional computer vision methods and algorithms. Yet in more complex domains, such as texture patterns, things can get more difficult. This is where deep learning comes in.

This document begins with an introduction to deep learning for artificial vision. It starts by addressing the theoretical fundamentals of deep learning for image recognition and then focuses on the general aspects of Convolutional Neural Networks (CNNs). Next, it reviews the state-of-the-art network configurations that have stood out recently.

A high-level toolkit for image recognition was created to simplify the whole process of building deep learning models, from data pre-processing to testing the trained model. It made it easy to prepare a set of experiments that address some of the common practices used with CNNs and highlight the power of deep learning in image recognition tasks.

This dissertation was developed in a business environment at an artificial vision company called Neadvance, Machine Vision, SA. Neadvance is also interested in researching new trends in deep learning for image recognition in order to learn how to apply them to its projects, since they open up a new range of challenging opportunities.
