Abstract
This is the web page for Panos Kalodimas's master's thesis at the Technical University of Crete, Department of Electronics and Computer Engineering, Chania, Greece, 2016.
This thesis presents a custom implementation of the “Face Detection, Pose Estimation, and Landmark Localization in the Wild” system by Xiangxin Zhu and Deva Ramanan. The implementation was originally designed for embedded systems, but it can also be used on large multiprocessor systems. This is because modern embedded systems resemble what used to be called multiprocessor systems years ago: the latest embedded systems are effectively small multiprocessor systems, with 2 to 4 or even more cores in their central processing unit.
Our Tree Structural Model (TSM) Face Detection system was implemented in both C (basic) and C++ (object-oriented), and the core of the algorithm uses no external C++ library. As a result, the algorithm can be used on both Windows and UNIX systems without further changes. The code is also easy to read, which allows further improvement and modification by those who would like to adapt it for a custom application. The functionality of the algorithm can be customized through a set of settings and parameters that are easy to modify.
Because this implementation is designed for embedded systems, it had to reduce both memory consumption and processing time. For that reason, a number of changes were made relative to the original implementation by the algorithm's creators. Compared to the implementation of Hang Su, this implementation is two times faster and consumes about ten times less memory, as described in the documentation. The thesis also presents a set of techniques, some of which may reduce the algorithm's detection performance but in return offer additional speedup and memory savings. These techniques may be very useful for custom applications.
Regardless of any further speedup, the main reason face detection is so time-consuming is the size of the image. Large images force the system to build large image pyramids and search them for faces. In addition, the larger the top image of the pyramid is, the more time it takes to process. The solution usually proposed for this problem is to scale the original image down to a smaller size in order to reduce the amount of data to be processed. This makes a system faster, but it costs detection performance, because scaling an image down makes small faces undetectable. Our implementation presents a method that scans the image pyramid for face detections faster by avoiding detection processing in pyramid levels that appear to contain no faces. This can be a very effective method for video applications, where frames without faces can be processed and rejected more quickly.
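To make the cost of large images concrete, the following is a minimal sketch in C of how the amount of data in an image pyramid grows with the input size. It is not the code of the TSM implementation: the function names, the `min_size` parameter and the assumption that each pyramid level halves the previous one are illustrative choices only. The actual system may use a different scale step between levels.

```c
#include <assert.h>

/* Hypothetical helper (not part of the TSM code base): counts the levels
   of a simplified pyramid in which every level is half the size of the
   previous one. Levels with a side shorter than min_size are dropped,
   because nothing could be detected in them. */
int pyramid_levels(int width, int height, int min_size)
{
    assert(width > 0 && height > 0 && min_size > 0);

    int levels = 0;
    while (width >= min_size && height >= min_size) {
        ++levels;
        width /= 2;   /* assumed scale step of 1/2 per level */
        height /= 2;
    }
    return levels;
}

/* Hypothetical helper: total number of pixels over all levels of the
   same simplified pyramid, as a rough measure of the work to be done. */
long pyramid_total_pixels(int width, int height, int min_size)
{
    assert(width > 0 && height > 0 && min_size > 0);

    long total = 0;
    while (width >= min_size && height >= min_size) {
        total += (long)width * (long)height;
        width /= 2;
        height /= 2;
    }
    return total;
}
```

Under these assumptions, a 640x480 image with `min_size` set to 80 yields 3 levels and 403,200 pixels in total, while the same image scaled down to 320x240 yields 2 levels and 96,000 pixels. The scaled-down input is much cheaper to process, but the full-resolution level, the only one in which the smallest faces could be found, is no longer searched.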
Documentation
The available documentation consists of the master's thesis text, a short presentation and the TSM system user manual. The thesis text contains the following chapters:
- Chapters 4 and 5 contain a short and a detailed description of the TSM algorithm.
- Chapter 6 describes the code optimization steps. The results are summarized in subchapter 6.21.
- Chapter 7 describes two algorithm optimization patches that are included in the implementation.
- Chapter 8 describes the speedup of the TSM system using multi-threading. The results are summarized in subchapter 8.9.4.
- Chapter 9 describes methods that can be applied to the TSM system to speed it up in some cases.
- Chapter 10 compares this implementation with the one presented by the creators of the algorithm.
Implementation
The TSM Face Detection system is implemented in both basic C and object-oriented C++.