Abstract
Endoscopy is a routine imaging technique used for both diagnosis and minimally invasive surgical treatment. While the endoscopy video contains a wealth of information, tools to capture this information for the purpose of clinical reporting are rather poor. In date, endoscopists do not have any access to tools that enable them to browse the video data in an efficient and user friendly manner. Fast and reliable video retrieval methods could for example, allow them to review data from previous exams and therefore improve their ability to monitor disease progression. Deep learning provides new avenues of compressing and indexing video in an extremely efficient manner. In this study, we propose to use an autoencoder for efficient video compression and fast retrieval of video images. To boost the accuracy of video image retrieval and to address data variability like multi-modality and view-point changes, we propose the integration of a Siamese network. We demonstrate that our approach is competitive in retrieving images from 3 large scale videos of 3 different patients obtained against the query samples of their previous diagnosis. Quantitative validation shows that the combined approach yield an overall improvement of 5% and 8% over classical and variational autoencoders, respectively.
Abstract (translated by Google)
URL
http://arxiv.org/abs/1905.04384