A benchmark for evaluating Deep Learning based Image Analytics

dc.contributor.author	Ikram, Chaudhry Rehan
dc.date.accessioned	2019-08-26T23:46:34Z
dc.date.available	2019-08-26T23:46:34Z
dc.date.issued	2019
dc.identifier.citation	Ikram, Chaudhry Rehan. A benchmark for evaluating Deep Learning based Image Analytics. Master thesis, University of Oslo, 2019
dc.identifier.uri	http://hdl.handle.net/10852/69588
dc.description.abstract	Deep learning based systems are on the rise because they have shown tremendous potential to extract concealed patterns from data. Today, deep learning systems are surpassing human-level vision capabilities, which has led to the widespread adoption of deep learning for image classification and object detection. There is a wide variety of hardware, deep learning frameworks, model architectures and algorithms to choose from when implementing a deep learning based system, but a fair apples-to-apples comparison to aid selection remains a considerable challenge. This study aims to serve as a guide for selecting a deep learning framework and an object detection algorithm with an appropriate backbone feature extractor that together provide the optimal speed and accuracy trade-off. The proposed solution, ImageMark, has two parts. The first part is a classification benchmark that provides a comparative study of seven state-of-the-art GPU-accelerated deep learning software tools (MXNet, TensorFlow, PyTorch, CNTK, MXNet (Gluon), Keras, Theano) by executing a Convolutional Neural Network workload on each of them. All of them achieve nearly the same accuracy, while some differences in training and inference speed are observed. MXNet is the fastest in training and Theano is the fastest in inference, and all of the frameworks utilize the GPU very efficiently. The second part is an object detection benchmark that evaluates state-of-the-art object detection algorithms, which we view as “meta-architectures” (Faster-RCNN, RFCN and SSD), using seven different base feature extractor CNN architectures for high-level feature extraction from the input image. We use a small, more practical dataset on road damage detection for the workloads. 
Faster-RCNN based object detectors generally provide better accuracy than RFCN based models, while RFCN based models, in turn, perform better than SSD based models. SSD based models provide higher inference and training speed than RFCN and Faster-RCNN based models. We study the speed-accuracy trade-off curve by keeping the hyper-parameters the same across all models, apply multi-objective optimization to speed and accuracy, and present a range of object detectors on the Pareto front. The Faster-RCNN based model with the PNasNet base feature extractor achieves the highest mAP and F1-score but takes much more time to train and is impractical for scenarios that require a high frame rate during inference. At the other extreme, we present an SSD based model with an Inception base feature extractor, which takes the least time for training and inference and still provides decent accuracy.	eng
dc.language.iso	eng
dc.subject
dc.title	A benchmark for evaluating Deep Learning based Image Analytics	eng
dc.type	Master thesis
dc.date.updated	2019-08-26T23:46:34Z
dc.creator.author	Ikram, Chaudhry Rehan
dc.identifier.urn	URN:NBN:no-72729
dc.type.document	Masteroppgave
dc.identifier.fulltext	Fulltext https://www.duo.uio.no/bitstream/handle/10852/69588/5/imagemark_final.pdf
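
The abstract mentions placing object detectors on a Pareto front of speed and accuracy. The following is a minimal sketch of that idea in Python, not the thesis's implementation: the `Detector` type, the `dominates` and `pareto_front` helpers, the detector names and all numbers are hypothetical and chosen only for illustration. A detector is kept when no other detector is at least as accurate and at least as fast while being strictly better on one of the two.

```python
from typing import NamedTuple


class Detector(NamedTuple):
    name: str            # hypothetical label, not a model from the thesis
    accuracy: float      # higher is better (for example, mAP)
    inference_ms: float  # lower is better


def dominates(a: Detector, b: Detector) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.inference_ms <= b.inference_ms
    strictly_better = a.accuracy > b.accuracy or a.inference_ms < b.inference_ms
    return no_worse and strictly_better


def pareto_front(detectors: list[Detector]) -> list[Detector]:
    """Keep every detector that no other detector dominates."""
    return [d for d in detectors if not any(dominates(o, d) for o in detectors)]


# Made-up numbers for illustration only; they are not results from the thesis.
candidates = [
    Detector("detector_a", accuracy=0.60, inference_ms=400.0),
    Detector("detector_b", accuracy=0.55, inference_ms=120.0),
    Detector("detector_c", accuracy=0.50, inference_ms=150.0),  # dominated by detector_b
    Detector("detector_d", accuracy=0.45, inference_ms=30.0),
]

front = pareto_front(candidates)
print([d.name for d in front])
```

In this toy data, `detector_c` drops out because `detector_b` is both more accurate and faster; the remaining detectors each represent a different speed-accuracy trade-off.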

Files in this item

Name: imagemark_final.pdf
Size: 21.16Mb
Format: application/

Appears in the following Collection

Institutt for informatikk [4956]
