We propose a novel method for motion planning and illustrate its implementation on several canonical examples. The core novel idea underlying the method is to define a metric for which a path of minimal length is an admissible path, that is, a path that respects the various constraints imposed on the system's dynamics by the environment and the physics of the system. To be more precise, our method takes as input a control system with holonomic and non-holonomic constraints, an initial and final point in configuration space, a description of obstacles to avoid, and an initial trajectory for the system, called a sketch. This initial trajectory need not meet the constraints, except for the obstacle avoidance constraints. The constraints are then encoded in an inner product, which is used to deform (via a homotopy) the initial sketch into an admissible trajectory from which controls realizing the transfer can be obtained. We illustrate the method on various examples, including vehicle motion with obstacles and a two-link manipulator problem.
http://arxiv.org/abs/1901.10094
We describe our solution for the PIRM Super-Resolution Challenge 2018, where we achieved the 2nd best perceptual quality for average RMSE<=16, 5th best for RMSE<=12.5, and 7th best for RMSE<=11.5. We modify a recently proposed Multi-Grid Back-Projection (MGBP) architecture to work as a generative system with an input parameter that can control the amount of artificial detail in the output. We propose a discriminator for adversarial training with the following novel properties: it is multi-scale, resembling a progressive GAN; it is recursive, balancing the architecture of the generator; and it includes a new layer to capture significant statistics of natural images. Finally, we propose a training strategy that avoids conflicts between reconstruction and perceptual losses. Our configuration uses only 281k parameters and upscales each image of the competition in 0.2s on average.
http://arxiv.org/abs/1809.10711
Cloud detection in satellite images is an important first step in many remote sensing applications. This problem is more challenging when only a limited number of spectral bands are available. To address this problem, a deep learning-based algorithm is proposed in this paper. This algorithm consists of a Fully Convolutional Network (FCN) that is trained on multiple patches of Landsat 8 images. This network, called Cloud-Net, is capable of capturing global and local cloud features in an image using its convolutional blocks. Since the proposed method is an end-to-end solution, no complicated pre-processing step is required. Our experimental results show that the proposed method outperforms the state-of-the-art method on a benchmark dataset by 8.7% in Jaccard index.
http://arxiv.org/abs/1901.10077
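The Jaccard index used as the figure of merit above is straightforward to compute from binary masks; a minimal NumPy sketch (the function name is ours), not the paper's evaluation code:

```python
import numpy as np

def jaccard_index(pred_mask, true_mask):
    """Intersection over union between two binary cloud masks."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return np.logical_and(pred, true).sum() / union
```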
Probability theory and Dempster-Shafer theory are two germane theories for representing and handling uncertain information. A recent study suggested a transformation to obtain the negation of a probability distribution based on maximum entropy. Correspondingly, determining the negation of a belief structure remains an open issue in Dempster-Shafer theory, one that is important in both theoretical research and practical applications. In this paper, a negation transformation for belief structures is proposed based on maximum uncertainty allocation, and several important properties satisfied by the transformation are studied. The proposed negation transformation is more general and is fully compatible with the existing transformation for probability distributions.
http://arxiv.org/abs/1901.10072
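For context, the probability-distribution negation that the above work generalizes assigns each outcome the normalized mass of its complement. A minimal sketch of that baseline transformation (the paper's belief-structure version, based on maximum uncertainty allocation, is not reproduced here):

```python
import numpy as np

def negate_distribution(p):
    """Negation of a probability distribution:
    neg(p)_i = (1 - p_i) / (n - 1), which is again a distribution."""
    p = np.asarray(p, dtype=float)
    return (1.0 - p) / (p.size - 1)

# Negating a distribution moves it toward maximum entropy:
print(negate_distribution([0.7, 0.2, 0.1]))  # -> [0.15 0.4  0.45]
```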
We introduce a novel deep-learning architecture for image upscaling by large factors (e.g. 4x, 8x) based on examples of pristine high-resolution images. Our target is to reconstruct high-resolution images from their downscaled versions. The proposed system performs multi-level progressive upscaling, starting from small factors (2x) and updating for higher factors (4x and 8x). The system is recursive, as it repeats the same procedure at each level. It is also residual, since we use the network to update the outputs of a classic upscaler. The network residuals are improved by Iterative Back-Projections (IBP) computed in the features of a convolutional network. To work at multiple levels we extend the standard back-projection algorithm using a recursion analogous to the Multi-Grid algorithms commonly used as solvers of large systems of linear equations. We then show how the network can be interpreted as a standard upsampling-and-filter upscaler with a space-variant filter that adapts to the geometry. This interpretation allows us to visualize how the network learns to upscale. Finally, our system reaches state-of-the-art quality among models with relatively few parameters.
http://arxiv.org/abs/1809.09326
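To make the back-projection idea concrete, here is the classic single-level IBP loop in plain image space; the paper applies it in convolutional feature space across multiple grid levels, so this is only an illustrative sketch assuming an integer scale factor:

```python
import numpy as np
from scipy.ndimage import zoom

def iterative_back_projection(lr, factor=2, iters=10):
    """Refine an upscaled estimate so that its downscaled version
    matches the observed low-resolution image."""
    lr = np.asarray(lr, dtype=float)
    hr = zoom(lr, factor, order=3)               # initial classic upscale
    for _ in range(iters):
        down = zoom(hr, 1.0 / factor, order=3)   # simulate re-acquisition
        residual = lr - down                     # error in LR space
        hr = hr + zoom(residual, factor, order=3)  # back-project the error
    return hr
```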
Approval-ballot-based committee formation is concerned with aggregating individual approvals of voters. Voters submit their approvals of candidates, and these approvals are aggregated to arrive at an optimal committee of specified size. Several aggregation techniques have been proposed in the literature, and they differ among themselves in the criterion function they optimize. A voter's preference for a candidate is based on his/her opinion of the candidate's suitability. We note that candidates have attributes that make them suitable or otherwise. Hence, it is relevant to approve attributes and select candidates who have the approved attributes. This paper addresses the committee selection problem when voters submit their approvals on attributes. Though attribute-based preference has been addressed in several contexts, the committee selection problem with attribute approval has not been attempted earlier. We note that extending the theory of candidate approval to attribute approval in the committee selection problem is not trivial. In this paper, we study different aspects of this problem and show that none of the existing aggregation rules satisfies Unanimity and Justified Representation when attribute-based approvals are considered. We propose a new aggregation rule that satisfies both of the above properties. We also present further analyses of the committee selection problem with attribute approval.
http://arxiv.org/abs/1901.10064
In several different applications, including data transformation and entity resolution, rules are used to capture aspects of knowledge about the application at hand. Often, a large set of such rules is generated automatically or semi-automatically, and the challenge is to refine the encapsulated knowledge by selecting a subset of rules based on the expected operational behavior of the rules on available data. In this paper, we carry out a systematic complexity-theoretic investigation of the following rule selection problem: given a set of rules specified by Horn formulas, and a pair of an input database and an output database, find a subset of the rules that minimizes the total error, that is, the number of false positive and false negative errors arising from the selected rules. We first establish computational hardness results for the decision problems underlying this minimization problem, as well as upper and lower bounds for its approximability. We then investigate a bi-objective optimization version of the rule selection problem in which both the total error and the size of the selected rules are taken into account. We show that testing for membership in the Pareto front of this bi-objective optimization problem is DP-complete. Finally, we show that a similar DP-completeness result holds for a bi-level optimization version of the rule selection problem, where one minimizes first the total error and then the size.
http://arxiv.org/abs/1901.10051
We present a deep learning system to infer the posterior distribution of a dense depth map associated with an image, by exploiting sparse range measurements, for instance from a lidar. While the lidar may provide a depth value for a small percentage of the pixels, we exploit regularities reflected in the training set to complete the map so as to have a probability over depth for each pixel in the image. We exploit a Conditional Prior Network, that allows associating a probability to each depth value given an image, and combine it with a likelihood term that uses the sparse measurements. Optionally we can also exploit the availability of stereo during training, but in any case only require a single image and a sparse point cloud at run-time. We test our approach on both unsupervised and supervised depth completion using the KITTI benchmark, and improve the state-of-the-art in both.
http://arxiv.org/abs/1901.10034
We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that do not take the agent to undesirable situations. We formulate these problems as constrained Markov decision processes (CMDPs) and present safe policy optimization algorithms that are based on a Lyapunov approach to solve them. Our algorithms can use any standard policy gradient (PG) method, such as deep deterministic policy gradient (DDPG) or proximal policy optimization (PPO), to train a neural network policy, while guaranteeing near-constraint satisfaction for every policy update by projecting either the policy parameter or the action onto the set of feasible solutions induced by the state-dependent linearized Lyapunov constraints. Compared to the existing constrained PG algorithms, ours are more data efficient as they are able to utilize both on-policy and off-policy data. Moreover, our action-projection algorithm often leads to less conservative policy updates and allows for natural integration into an end-to-end PG training pipeline. We evaluate our algorithms and compare them with the state-of-the-art baselines on several simulated (MuJoCo) tasks, as well as a real-world indoor robot navigation problem, demonstrating their effectiveness in terms of balancing performance and constraint satisfaction. Videos of the experiments can be found at the following link: https://drive.google.com/file/d/1pzuzFqWIE710bE2U6DmS59AfRzqK2Kek/view?usp=sharing .
http://arxiv.org/abs/1901.10031
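When a single linearized constraint is active, the action projection has a closed form; a minimal sketch under that assumption (variable names ours), where the constraint has been linearized to g @ a <= eps at the current state:

```python
import numpy as np

def project_action(a, g, eps):
    """Project a candidate action onto the half-space {a : g @ a <= eps}
    induced by a state-dependent linearized Lyapunov constraint."""
    slack = g @ a - eps
    if slack <= 0:
        return a                       # already feasible: no change
    return a - (slack / (g @ g)) * g   # closest feasible action in L2
```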
We propose a new framework for constructing polar codes (i.e., selecting the frozen bit positions) for arbitrary channels, tailored to a given decoding algorithm rather than based on the (not necessarily optimal) assumption of successive cancellation (SC) decoding. The proposed framework is based on the Genetic Algorithm (GenAlg), where populations (i.e., collections) of information sets evolve successively via evolutionary transformations based on their individual error-rate performance. These populations converge towards an information set that fits both the decoding behavior and the defined channel. Using our proposed algorithm over the additive white Gaussian noise (AWGN) channel, we construct a polar code of length 2048 with code rate 0.5, without the CRC-aid, tailored to plain successive cancellation list (SCL) decoding, achieving the same error-rate performance as CRC-aided SCL decoding and leading to a coding gain of 1 dB at a BER of $10^{-6}$. Further, a belief propagation (BP)-tailored construction approaches the SCL error-rate performance without any modifications in the decoding algorithm itself. The performance gains can be attributed to the significant reduction in the total number of low-weight codewords. To demonstrate the flexibility, coding gains for the Rayleigh channel are shown under SCL and BP decoding. Besides improvements in error-rate performance, we show that, when required, the GenAlg can also be set up to reduce the decoding complexity, e.g., the SCL list size or the number of BP iterations can be reduced, while maintaining the same error-rate performance.
http://arxiv.org/abs/1901.10464
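The evolutionary loop can be sketched generically. Below is a minimal, mutation-only variant; the error_rate callback, assumed to Monte-Carlo-simulate the target decoder on a candidate information set, and all names are ours, and the paper's genetic operators differ in detail:

```python
import numpy as np

def genalg_construct(n, k, error_rate, pop_size=20, generations=50, seed=0):
    """Evolve information sets (the k non-frozen positions of an n-bit
    polar code) toward low simulated error rate."""
    rng = np.random.default_rng(seed)
    pop = [set(rng.choice(n, size=k, replace=False).tolist())
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: error_rate(frozenset(s)))  # fitness ranking
        survivors = pop[: pop_size // 2]
        children = []
        for parent in survivors:
            child = set(parent)
            # mutation: swap one information position for a frozen one
            child.remove(rng.choice(sorted(child)))
            child.add(int(rng.choice(sorted(set(range(n)) - child))))
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda s: error_rate(frozenset(s)))
```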
While intelligence of autonomous vehicles (AVs) has significantly advanced in recent years, accidents involving AVs suggest that these autonomous systems lack gracefulness in driving when interacting with human drivers. In the setting of a two-player game, we propose model predictive control based on social gracefulness, which is measured by the discrepancy between the actions taken by the AV and those that could have been taken in favor of the human driver. We define social awareness as the ability of an agent to infer such favorable actions based on knowledge about the other agent’s intent, and further show that empathy, i.e., the ability to understand others’ intent by simultaneously inferring others’ understanding of the agent’s self intent, is critical to successful intent inference. Lastly, through an intersection case, we show that the proposed gracefulness objective allows an AV to learn more sophisticated behavior, such as passive-aggressive motions that gently force the other agent to yield.
http://arxiv.org/abs/1901.10013
Latent Semantic Analysis (LSA) was initially conceived in cognitive psychology in the 1990s. Since its emergence, LSA has been used to model cognitive processes, assess academic texts, compare literary works and analyse political speeches, among other applications. Taking a multivariate method for dimensionality reduction as its starting point, this paper proposes a semantic space for the Spanish language. Our results include a document-term matrix with dimensions 1.3x10^6 by 5.9x10^6, which is later decomposed into singular values. Those singular values are used to semantically compare words or texts.
https://arxiv.org/abs/1902.02173
The key challenge in semi-supervised learning is how to effectively leverage unlabeled data to improve learning performance. The classical label propagation method, despite its popularity, has limited modeling capability in that it only exploits graph information for making predictions. In this paper, we consider label propagation from a graph signal processing perspective and decompose it into three components: signal, filter, and classifier. By extending the three components, we propose a simple generalized label propagation (GLP) framework for semi-supervised learning. GLP naturally integrates graph and data feature information, and offers the flexibility of selecting appropriate filters and domain-specific classifiers for different applications. Interestingly, GLP also provides new insight into the popular graph convolutional network and elucidates its working mechanisms. Extensive experiments on three citation networks, one knowledge graph, and one image dataset demonstrate the efficiency and effectiveness of GLP.
http://arxiv.org/abs/1901.09993
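One concrete instantiation of the signal/filter/classifier decomposition: treat node features as the signal, apply a low-pass filter built from the normalized adjacency, and fit an off-the-shelf classifier on the filtered features of the labeled nodes. A minimal sketch under these choices (filter and names ours):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def glp_predict(adj, X, y_labeled, labeled_idx, k=2):
    """Generalized label propagation, schematically: filter the feature
    signal with powers of the normalized adjacency, then classify."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    a_norm = d_inv_sqrt @ adj @ d_inv_sqrt   # symmetric normalization
    H = X.copy()
    for _ in range(k):                       # low-pass filter: A_norm^k X
        H = a_norm @ H
    clf = LogisticRegression(max_iter=1000).fit(H[labeled_idx], y_labeled)
    return clf.predict(H)                    # predictions for all nodes
```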
Compressed domain image classification aims to directly perform classification on compressive measurements generated from the single-pixel camera. While neural network approaches have achieved state-of-the-art performance, previous methods require training a dedicated network for each different measurement rate which is computationally costly. In this work, we present a general approach that endows a single neural network with multi-rate property for compressed domain classification where a single network is capable of classifying over an arbitrary number of measurements using dataset-independent fixed binary sensing patterns. We demonstrate the multi-rate neural network performance on MNIST and grayscale CIFAR-10 datasets. We also show that using the Partial Complete binary sensing matrix, the multi-rate network outperforms previous methods especially in the case of very few measurements.
http://arxiv.org/abs/1901.09983
Online social platforms have been the battlefield of users with different emotions and attitudes toward each other in recent years. While sexism has been considered a category of hateful speech in the literature, there is no comprehensive definition and categorization of sexism amenable to natural language processing techniques. Categorizing sexism as either benevolent or hostile is so broad that it easily overlooks other categories of sexism on social media. Sharifirad and Matwin (2018) proposed a well-defined categorization of sexism, comprising indirect harassment, information threat, sexual harassment and physical harassment, inspired by social science and intended for natural language processing. In this article, we take advantage of a newly released dataset, SemEval-2018 Task 1: Affect in Tweets, to show the type and intensity of emotion in each category. We train, test and evaluate different classification methods on the SemEval-2018 dataset and choose the classifier with the highest accuracy for testing each category of sexist tweets, in order to characterize the mental and affectual state of the user who tweets in each category. This is a worthwhile avenue to explore because not all of the tweets are directly sexist, and they carry different emotions from the users. To the best of our knowledge, this is the first work to experiment with affect detection in such depth on sexist tweets.
http://arxiv.org/abs/1902.03089
Cardiovascular diseases are among the most common causes of death in the world. Prevention, knowledge of previous cases in the family, and early detection are the best strategies to reduce their impact. Different machine learning approaches to automatic diagnosis have been proposed for this task. As in most health problems, the imbalance between examples and classes is predominant and affects the performance of automated solutions. In this paper, we address the classification of heartbeat images into different cardiovascular diseases. We propose a two-dimensional Convolutional Neural Network for classification, after using an InfoGAN architecture to generate synthetic images for the unbalanced classes. We call this proposal Adversarial Oversampling and compare it with classical oversampling methods such as SMOTE, ADASYN, and RandomOversampling. The results show that the proposed approach improves classifier performance on the minority classes without harming performance on the balanced classes.
http://arxiv.org/abs/1901.09972
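The classical baselines named above are available in the imbalanced-learn package; a minimal comparison sketch on toy data standing in for the heartbeat-image features (the adversarial InfoGAN sampler itself is not shown):

```python
from collections import Counter
from imblearn.over_sampling import SMOTE, ADASYN, RandomOverSampler
from sklearn.datasets import make_classification

# Toy 3-class imbalanced data in place of the heartbeat-image features.
X, y = make_classification(n_samples=2000, n_classes=3, n_informative=6,
                           weights=[0.8, 0.15, 0.05], random_state=0)
for sampler in (SMOTE(random_state=0), ADASYN(random_state=0),
                RandomOverSampler(random_state=0)):
    X_res, y_res = sampler.fit_resample(X, y)   # rebalance the classes
    print(type(sampler).__name__, Counter(y_res))
```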
Modern optical flow methods make use of salient scene feature points detected and matched within the scene as a basis for sparse-to-dense optical flow estimation. Current feature detectors, however, either give sparse, non-uniform point clouds (resulting in flow inaccuracies) or lack the efficiency for frame-rate real-time applications. In this work we use the novel Dense Gradient Based Features (DeGraF) as the input to a sparse-to-dense optical flow scheme. This consists of three stages: 1) efficient detection of uniformly distributed Dense Gradient Based Features (DeGraF); 2) feature tracking via robust local optical flow; and 3) edge-preserving flow interpolation to recover overall dense optical flow. The tunable density and uniformity of DeGraF features yield superior dense optical flow estimation compared to other popular feature detectors within this three-stage pipeline. Furthermore, the comparable speed of feature detection also lends itself well to the aim of real-time optical flow recovery. Evaluation on established real-world benchmark datasets shows test performance in an autonomous vehicle setting, where DeGraF-Flow shows promising results in terms of accuracy with competitive computational efficiency among non-GPU based methods, including a marked increase in speed over the conceptually similar EpicFlow approach.
http://arxiv.org/abs/1901.09971
Tuning a pre-trained network is commonly thought to improve data efficiency. However, Kaiming He et al. have called into question the utility of pre-training by showing that training from scratch can often yield similar performance, should the model train long enough. We show that although pre-training may not improve performance on traditional classification metrics, it does provide large benefits to model robustness and uncertainty. Through extensive experiments on label corruption, class imbalance, adversarial examples, out-of-distribution detection, and confidence calibration, we demonstrate large gains from pre-training and complementary effects with task-specific methods. We show approximately a 30% relative improvement in label noise robustness and a 10% absolute improvement in adversarial robustness on CIFAR-10 and CIFAR-100. In some cases, using pre-training without task-specific methods surpasses the state-of-the-art, highlighting the importance of using pre-training when evaluating future methods on robustness and uncertainty tasks.
http://arxiv.org/abs/1901.09960
In this paper, we present OpenHowNet, an open sememe-based lexical knowledge base. Based on the well-known HowNet, OpenHowNet comprises three components: the core data, composed of more than 100 thousand senses annotated with sememes; OpenHowNet Web, which gives a brief introduction to OpenHowNet and provides an online exhibition of its information; and the OpenHowNet API, which includes several useful APIs for accessing the core data and drawing the sememe tree structures of senses. In the main text, we first give some background, including the definition of sememes and details of HowNet. We then introduce previous HowNet- and sememe-based research. Finally, we detail the constituents of OpenHowNet and their basic features and functionalities, briefly summarize, and list future work.
http://arxiv.org/abs/1901.09957
Deep generative models have been successfully applied to many applications. However, existing works experience limitations when generating large images (the literature usually generates small images, e.g. 32 * 32 or 128 * 128). In this paper, we propose a novel scheme, called deep tensor generative adversarial nets (TGAN), that generates large high-quality images by exploring tensor structures. Essentially, the adversarial process of TGAN takes place in a tensor space. First, we impose tensor structures for concise image representation, which is superior in capturing the pixel proximity information and the spatial patterns of elementary objects in images, over the vectorization preprocessing in existing works. Second, we propose TGAN, which integrates deep convolutional generative adversarial networks and tensor super-resolution in a cascading manner, to generate high-quality images from random distributions. More specifically, we design a tensor super-resolution process that consists of tensor dictionary learning and tensor coefficient learning. Finally, on three datasets, the proposed TGAN generates images with more realistic textures, compared with state-of-the-art adversarial autoencoders. The size of the generated images is increased by over 8.5 times, namely 374 * 374 on PASCAL2.
http://arxiv.org/abs/1901.09953
We present an end-to-end CNN architecture for fine-grained visual recognition called Collaborative Convolutional Network (CoCoNet). The network uses a collaborative filter after the convolutional layers to represent an image as an optimal weighted collaboration of features learned from training samples as a whole rather than one at a time. This gives CoCoNet more power to encode the fine-grained nature of the data with limited samples in an end-to-end fashion. We perform a detailed study of the performance with 1-stage and 2-stage transfer learning and different configurations with benchmark architectures like AlexNet and VggNet. The ablation study shows that the proposed method outperforms its constituent parts considerably and consistently. CoCoNet also outperforms the popular baseline deep learning based fine-grained recognition method, namely Bilinear-CNN (BCNN), with statistical significance. Experiments have been performed on the fine-grained species recognition problem, but the method is general enough to be applied to other similar tasks. Lastly, we also introduce a new public dataset for fine-grained species recognition, namely Indian endemic birds, and report initial results on it. The training metadata and new dataset are available through the corresponding author.
http://arxiv.org/abs/1901.09886
Capsule Networks envision an innovative point of view about the representation of objects in the brain and preserve the hierarchical spatial relationships between them. This type of network exhibits huge potential for several machine learning tasks, like image classification, while outperforming Convolutional Neural Networks (CNNs). A large body of work has explored adversarial examples for CNNs, but their efficacy against Capsule Networks is not well explored. In our work, we study the vulnerabilities of Capsule Networks to adversarial attacks. These perturbations, added to the test inputs, are small and imperceptible to humans, but fool the network into mispredicting. We propose a greedy algorithm to automatically generate targeted imperceptible adversarial examples in a black-box attack scenario. We show that this kind of attack, when applied to the German Traffic Sign Recognition Benchmark (GTSRB), misleads Capsule Networks. Moreover, we apply the same kind of adversarial attacks to a 9-layer CNN and analyze the outcome, compared to the Capsule Networks, to study their differences and commonalities.
http://arxiv.org/abs/1901.09878
Meteorological forecasting provides reliable predictions about the future weather within a given interval of time. Meteorological forecasting can be viewed as a form of hybrid diagnostic reasoning and can be mapped onto an integrated conceptual framework. The automation of the forecasting process would be helpful in a number of contexts, in particular: when the amount of data is too large to be dealt with manually; to support forecasters' education; and when forecasting for underpopulated geographic areas is of no interest for everyday life (and thus falls outside human forecasters' tasks) but is central for tourism sponsorship. We present the logic MeteoLOG, a framework that models the main steps of the reasoning a forecaster adopts to produce a bulletin. MeteoLOG rests on several traditions, mainly on fuzzy, temporal and probabilistic logics. On this basis, we also introduce the algorithm Tournament, which transforms a set of MeteoLOG rules into a defeasible theory that can be implemented in an automatic reasoner. We finally propose an example that models a real-world forecasting scenario.
http://arxiv.org/abs/1901.09867
We present a multi-modal dialog system to assist online shoppers in visually browsing through large catalogs. Visual browsing differs from visual search in that it allows the user to explore the wide range of products in a catalog, beyond the exact search matches. We focus on a slightly asymmetric version of the complete multi-modal dialog, where the system can understand both text and image queries but responds only in images. We formulate our problem of "showing the $k$ best images to a user", based on the dialog context so far, as sampling from a Gaussian Mixture Model in a high-dimensional joint multi-modal embedding space that embeds both the text and the image queries. Our system remembers the context of the dialog and uses an exploration-exploitation paradigm to assist in visual browsing. We train and evaluate the system on a multi-modal dialog dataset that we generate from large catalog data. Our experiments are promising and show that the agent is capable of learning and can display relevant results with an average cosine similarity of 0.85 to the ground truth. Our preliminary human evaluation also corroborates the fact that such a multi-modal dialog system for visual browsing is well received and capable of engaging human users.
http://arxiv.org/abs/1901.09854
Topic models are in widespread use in natural language processing and beyond. Here, we propose a new framework for the evaluation of probabilistic topic modeling algorithms based on synthetic corpora containing an unambiguously defined ground truth topic structure. The major innovation of our approach is the ability to quantify the agreement between the planted and inferred topic structures by comparing the assigned topic labels at the level of the tokens. In experiments, our approach yields novel insights about the relative strengths of topic models as corpus characteristics vary, and the first evidence of an “undetectable phase” for topic models when the planted structure is weak. We also establish the practical relevance of the insights gained for synthetic corpora by predicting the performance of topic modeling algorithms in classification tasks in real-world corpora.
http://arxiv.org/abs/1901.09848
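The planted-structure evaluation can be illustrated end to end: draw a corpus from known topic-word distributions, run any topic model, and score token-level label agreement. A minimal sketch using normalized mutual information as a stand-in for the paper's agreement measure (all sizes and names ours):

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

rng = np.random.default_rng(0)
K, V, D, L = 5, 1000, 200, 100                 # topics, vocab, docs, tokens/doc
topic_word = rng.dirichlet(np.full(V, 0.01), size=K)   # planted topics
true_z, docs = [], []
for _ in range(D):
    theta = rng.dirichlet(np.full(K, 0.1))     # per-document topic mixture
    z = rng.choice(K, size=L, p=theta)         # planted token topic labels
    w = np.array([rng.choice(V, p=topic_word[t]) for t in z])
    true_z.append(z)
    docs.append(w)

# After running any topic model on `docs`, compare its token labels with
# the planted ones; inferred_z here is a stand-in for the model's output.
inferred_z = [z.copy() for z in true_z]
score = normalized_mutual_info_score(np.concatenate(true_z),
                                     np.concatenate(inferred_z))
print(f"token-level agreement (NMI): {score:.3f}")
```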
Most of the research on convolutional neural networks has focused on increasing network depth to improve accuracy, resulting in a massive number of parameters that restricts the trained network to platforms with memory and processing constraints. We propose to modify the structure of the Very Deep Convolutional Neural Network (VDCNN) model to fit mobile platform constraints while preserving performance. In this paper, we evaluate the impact of Temporal Depthwise Separable Convolutions and Global Average Pooling on the network parameters, storage size, and latency. The squeezed model (SVDCNN) is between 10x and 20x smaller, depending on the network depth, maintaining a maximum size of 6MB. Regarding accuracy, the network experiences a loss between 0.4% and 1.3% and obtains lower latencies compared to the baseline model.
http://arxiv.org/abs/1901.09821
Detecting anomalous activity in video surveillance often involves using only normal activity data in order to learn an accurate detector. Due to the lack of annotated data for a specific target domain, one could employ existing data from a source domain to produce better predictions. Hence, transfer learning presents itself as an important tool. But how should the resulting data space be analyzed? This paper investigates video anomaly detection, in particular feature embeddings of pre-trained CNNs that can be used with non-fully-supervised data. By proposing novel cross-domain generalization measures, we study how source features can generalize for different target video domains, and also analyze unsupervised transfer learning. The proposed generalization measures are not only a theoretical contribution; they are shown to be useful in practice as a way to understand which datasets can be used or transferred to describe video frames, and with which it is possible to better discriminate between normal and anomalous activity.
http://arxiv.org/abs/1901.09819
Word embeddings generated by neural network methods such as word2vec (W2V) are well known to exhibit seemingly linear behaviour, e.g. the embeddings of analogy “woman is to queen as man is to king” approximately describe a parallelogram. This property is particularly intriguing since the embeddings are not trained to achieve it. Several explanations have been proposed, but each introduces assumptions that do not hold in practice. We derive a probabilistically grounded definition of paraphrasing and show it can be re-interpreted as word transformation, a mathematical description of “$w_x$ is to $w_y$”. From these concepts we prove existence of the linear relationship between W2V-type embeddings that underlies the analogical phenomenon, and identify explicit error terms in the relationship.
http://arxiv.org/abs/1901.09813
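The parallelogram behaviour discussed above is what the standard analogy test probes; a minimal sketch over a dictionary of word vectors (names ours):

```python
import numpy as np

def analogy(emb, wx, wy, wa):
    """Return the word whose embedding is closest (by cosine) to
    v(wy) - v(wx) + v(wa): the parallelogram prediction for
    'wx is to wy as wa is to ?'."""
    target = emb[wy] - emb[wx] + emb[wa]
    target /= np.linalg.norm(target)
    best, best_sim = None, -np.inf
    for w, v in emb.items():
        if w in (wx, wy, wa):        # exclude the query words themselves
            continue
        sim = v @ target / np.linalg.norm(v)
        if sim > best_sim:
            best, best_sim = w, sim
    return best
```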
Robotic-assisted minimally invasive surgery (MIS) has enabled procedures with increased precision and dexterity, but surgical robots are still open loop and require surgeons to work with a tele-operation console providing only limited visual feedback. In this setting, mechanical failures, software faults, or human errors might lead to adverse events resulting in patient complications or fatalities. We argue that impending adverse events could be detected and mitigated by applying context-specific safety constraints on the motions of the robot. We present a context-aware safety monitoring system which segments a surgical task into subtasks using kinematics data and monitors safety constraints specific to each subtask. To test our hypothesis about context specificity of safety constraints, we analyze recorded demonstrations of dry-lab surgical tasks collected from the JIGSAWS database as well as from experiments we conducted on a Raven II surgical robot. Analysis of the trajectory data shows that each subtask of a given surgical procedure has consistent safety constraints across multiple demonstrations by different subjects. Our preliminary results show that violations of these safety constraints lead to unsafe events, and there is often sufficient time between the constraint violation and the safety-critical event to allow for a corrective action.
http://arxiv.org/abs/1901.09802
Very deep convolutional neural networks offer excellent recognition results, yet their computational expense limits their impact for many real-world applications. We introduce BlockDrop, an approach that learns to dynamically choose which layers of a deep network to execute during inference so as to best reduce total computation without degrading prediction accuracy. Exploiting the robustness of Residual Networks (ResNets) to layer dropping, our framework selects on-the-fly which residual blocks to evaluate for a given novel image. In particular, given a pretrained ResNet, we train a policy network in an associative reinforcement learning setting for the dual reward of utilizing a minimal number of blocks while preserving recognition accuracy. We conduct extensive experiments on CIFAR and ImageNet. The results provide strong quantitative and qualitative evidence that these learned policies not only accelerate inference but also encode meaningful visual information. Built upon a ResNet-101 model, our method achieves a speedup of 20% on average, going as high as 36% for some images, while maintaining the same 76.4% top-1 accuracy on ImageNet.
http://arxiv.org/abs/1711.08393
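At inference time the learned policy simply gates residual blocks. A schematic PyTorch sketch of that execution path (the policy network producing the binary vector is omitted, the class name is ours, and real ResNet blocks are more involved):

```python
import torch
import torch.nn as nn

class GatedResNet(nn.Module):
    """BlockDrop-style inference: a per-image binary policy decides
    which residual blocks to execute; a skipped block reduces to the
    identity shortcut, so its computation is avoided entirely."""
    def __init__(self, blocks):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)

    def forward(self, x, policy):
        # policy: 1-D tensor of {0, 1}, one entry per residual block
        for block, keep in zip(self.blocks, policy):
            if keep:                  # run the block, keep residual path
                x = x + block(x)
            # else: identity shortcut only; the block is never evaluated
        return x
```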
Boolean networks are one of the most studied discrete models in the context of gene expression. In order to define the dynamics associated with a Boolean network, there are several update schemes, ranging from parallel or synchronous to asynchronous. However, studying each possible dynamics defined by different update schemes might not be efficient. In this context, considering some type of temporal delay in the dynamics of Boolean networks emerges as an alternative approach. In this paper, we study the effect of a particular type of delay, called firing memory, on the dynamics of Boolean networks. In particular, we focus on symmetric (non-directed) conjunctive networks and show that there exist examples that exhibit attractors of non-polynomial period. In addition, we study the prediction problem of determining whether some vertex will eventually change its state, given an initial condition. We prove that this problem is PSPACE-complete.
https://arxiv.org/abs/1901.09789
An extensive evaluation of a large number of word embedding models for language processing applications is conducted in this work. First, we introduce popular word embedding models and discuss desired properties of word models and evaluation methods (or evaluators). Then, we categorize evaluators into two types: intrinsic and extrinsic. Intrinsic evaluators test the quality of a representation independent of specific natural language processing tasks, while extrinsic evaluators use word embeddings as input features to a downstream task and measure changes in performance metrics specific to that task. We report experimental results of intrinsic and extrinsic evaluators on six word embedding models. It is shown that different evaluators focus on different aspects of word models, and some are more correlated with natural language processing tasks. Finally, we adopt correlation analysis to study the performance consistency of extrinsic and intrinsic evaluators.
http://arxiv.org/abs/1901.09785
We present AMOS Patches, a large set of image cut-outs, intended primarily for the robustification of trainable local feature descriptors to illumination and appearance changes. Images contributing to AMOS Patches originate from the AMOS dataset of recordings from a large set of outdoor webcams. The semiautomatic method used to generate AMOS Patches is described. It includes camera selection, viewpoint clustering and patch selection. For training, we provide both the registered full source images and the patches. A new descriptor, trained on the AMOS Patches and 6Brown datasets, is introduced. It achieves state-of-the-art performance in matching under illumination changes on standard benchmarks.
http://arxiv.org/abs/1901.09780
Facial attributes are important since they provide a detailed description and determine the visual appearance of human faces. In this paper, we aim at converting a face image to a sketch while simultaneously generating facial attributes. To this end, we propose a novel Attribute-Guided Sketch Generative Adversarial Network (ASGAN), an end-to-end framework containing two pairs of generators and discriminators, one of which is used to generate faces with attributes while the other is employed for image-to-sketch translation. The two generators form a W-shaped network (W-net) and are trained jointly with a weight-sharing constraint. Additionally, we propose two novel discriminators, a residual one focusing on attribute generation and a triplex one helping to generate realistic-looking sketches. To validate our model, we have created a new large dataset with 8,804 images, named the Attribute Face Photo & Sketch (AFPS) dataset, which is the first dataset containing attributes associated with face sketch images. The experimental results demonstrate that the proposed network (i) generates more photo-realistic faces with sharper facial attributes than the baselines and (ii) has good generalization capability on different generative tasks.
http://arxiv.org/abs/1901.09774
Speaker recognition is a challenging task with essential applications such as authentication, automation, and security. SincNet is a new deep learning based model which has produced promising results on this task. When training deep learning systems, the loss function is essential to the network's performance. The Softmax loss function is widely used in deep learning methods, but it is not the best choice for all kinds of problems. For distance-based problems, a new Softmax-based loss function called Additive Margin Softmax (AM-Softmax) is proving to be a better choice than the traditional Softmax. AM-Softmax introduces a margin of separation between the classes that forces samples from the same class to be closer to each other and also maximizes the distance between classes. In this paper, we propose a new approach for speaker recognition systems called AM-SincNet, which is based on SincNet but uses an improved AM-Softmax layer. The proposed method is evaluated on the TIMIT dataset and obtains an improvement of approximately 40% in the Frame Error Rate compared to SincNet.
http://arxiv.org/abs/1901.10826
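The AM-Softmax objective itself is compact: cosine logits with a margin m subtracted from the target class, scaled by s before cross-entropy. A minimal PyTorch sketch (the scale and margin values are common defaults, not necessarily the paper's):

```python
import torch
import torch.nn.functional as F

def am_softmax_loss(features, weight, labels, s=30.0, m=0.35):
    """Additive Margin Softmax: logits are s*(cos - m) for the target
    class and s*cos for the others, followed by cross-entropy."""
    f = F.normalize(features, dim=1)           # unit-norm embeddings
    w = F.normalize(weight, dim=1)             # unit-norm class weights
    cos = f @ w.t()                            # cosine similarity logits
    margin = torch.zeros_like(cos)
    margin.scatter_(1, labels.unsqueeze(1), m) # subtract m at target only
    return F.cross_entropy(s * (cos - margin), labels)
```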
In many applications requiring multiple inputs to obtain a desired output, if any of the input data are missing, large amounts of bias are often introduced. Although many techniques have been developed for imputing missing data, image imputation remains difficult due to the complicated nature of natural images. To address this problem, we propose a novel framework for missing image data imputation, called Collaborative Generative Adversarial Network (CollaGAN). CollaGAN converts the image imputation problem into a multi-domain image-to-image translation task, so that a single generator and discriminator network can successfully estimate the missing data using the remaining clean data set. We demonstrate that CollaGAN produces images of higher visual quality than the existing competing approaches in various image imputation tasks.
http://arxiv.org/abs/1901.09764
In this research note we present a language independent system to model Opinion Target Extraction (OTE) as a sequence labelling task. The system consists of a combination of clustering features implemented on top of a simple set of shallow local features. Experiments on the well known Aspect Based Sentiment Analysis (ABSA) benchmarks show that our approach is very competitive across languages, obtaining best results for six languages in seven different datasets. Furthermore, the results provide further insights into the behaviour of clustering features for sequence labelling tasks. The system and models generated in this work are available for public use and to facilitate reproducibility of results.
http://arxiv.org/abs/1901.09755
We reduce the computational cost of Neural AutoML with transfer learning. AutoML relieves human effort by automating the design of ML algorithms. Neural AutoML has become popular for the design of deep learning architectures, however, this method has a high computation cost. To address this we propose Transfer Neural AutoML that uses knowledge from prior tasks to speed up network design. We extend RL-based architecture search methods to support parallel training on multiple tasks and then transfer the search strategy to new tasks. On language and image classification tasks, Transfer Neural AutoML reduces convergence time over single-task training by over an order of magnitude on many tasks.
https://arxiv.org/abs/1803.02780
We present a novel approach to the detection and characterization of edges, ridges, and blobs in two-dimensional images which exploits the symmetry properties of directionally sensitive analyzing functions in multiscale systems that are constructed in the framework of alpha-molecules. The proposed feature detectors are inspired by the notion of phase congruency, stable in the presence of noise, and by definition invariant to changes in contrast. We also show how the behavior of coefficients corresponding to differently scaled and oriented analyzing functions can be used to obtain a comprehensive characterization of the geometry of features in terms of local tangent directions, widths, and heights. The accuracy and robustness of the proposed measures are validated and compared to various state-of-the-art algorithms in extensive numerical experiments in which we consider sets of clean and distorted synthetic images that are associated with reliable ground truths. To further demonstrate the applicability, we show how the proposed ridge measure can be used to detect and characterize blood vessels in digital retinal images and how the proposed blob measure can be applied to automatically count the number of cell colonies in a Petri dish.
http://arxiv.org/abs/1901.09723
In one-class learning tasks, only the normal case (foreground) can be modeled with data, whereas the variation of all possible anomalies is too erratic to be described by samples. Thus, due to the lack of representative data, the widespread discriminative approaches cannot cover such learning tasks, and rather generative models, which attempt to learn the input density of the foreground, are used. However, generative models suffer from large input dimensionality (as in images) and are typically inefficient learners. We propose to learn the data distribution of the foreground more efficiently with a multi-hypotheses autoencoder. Moreover, the model is criticized by a discriminator, which prevents artificial data modes not supported by data and enforces diversity across hypotheses. Our multiple-hypotheses-based anomaly detection framework allows the reliable identification of out-of-distribution samples. For anomaly detection on CIFAR-10, it yields up to 3.9 percentage points of improvement over previously reported results. On a real anomaly detection task, the approach reduces the error of the baseline models from 6.8% to 1.5%.
http://arxiv.org/abs/1810.13292
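Training a multi-hypotheses decoder typically uses a winner-takes-all reconstruction loss, so each sample only trains its best hypothesis; a schematic PyTorch sketch of that loss (the discriminator critique described above is not included):

```python
import torch

def wta_loss(hypotheses, target):
    """Winner-takes-all loss over k decoder hypotheses: only the best
    hypothesis per sample receives gradient, which enforces diversity."""
    # hypotheses: (batch, k, ...) reconstructions; target: (batch, ...)
    errs = (hypotheses - target.unsqueeze(1)) ** 2
    errs = errs.flatten(2).mean(dim=2)   # per-hypothesis MSE: (batch, k)
    best = errs.min(dim=1).values        # pick the winning hypothesis
    return best.mean()
```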
Endowing a dialogue system with particular personality traits is essential to deliver more human-like conversations. However, due to the challenge of embodying personality via language expression and the lack of large-scale persona-labeled dialogue data, this research problem is still far from well studied. In this paper, we investigate the problem of incorporating explicit personality traits in dialogue generation to deliver personalized dialogues. To this end, firstly, we construct PersonalDialog, a large-scale multi-turn dialogue dataset containing various traits from a large number of speakers. The dataset consists of 20.83M sessions and 56.25M utterances from 8.47M speakers. Each utterance is associated with a speaker who is marked with traits like Age, Gender, Location, Interest Tags, etc. Several anonymization schemes are designed to protect the privacy of each speaker. This large-scale dataset will facilitate not only the study of personalized dialogue generation, but also other research in sociolinguistics and social science. Secondly, to study how personality traits can be captured and addressed in dialogue generation, we propose persona-aware dialogue generation models within the sequence-to-sequence learning framework. Explicit personality traits (structured by key-value pairs) are embedded using a trait fusion module. During the decoding process, two techniques, namely persona-aware attention and persona-aware bias, are devised to capture and address trait-related information. Experiments demonstrate that our model is able to address proper traits in different contexts. Case studies also show interesting results for this challenging research problem.
http://arxiv.org/abs/1901.09672
The problem of planar registration consists in finding the transformation that better aligns two point sets. In our setting, the search domain is the set of planar rigid transformations and the objective function is the sum of the distances between each point of the transformed source set and the destination set. We propose a novel Branch and Bound (BnB) method for finding the globally optimal solution. The algorithm recursively splits the search domain into boxes and computes an upper and a lower bound for the minimum value of the restricted problem. We present two main contributions. First, we define two lower bounds. The cheap bound consists of the sum of the minimum distances between each point of source point set, transformed according to current box, and all the candidate points in the destination point set. The relaxation bound corresponds to the solution of a concave relaxation of the objective function based on the linearization of the distance. In large boxes, the cheap bound is a better approximation of the function minimum, while, in small boxes, the relaxation bound is much more accurate. Second, we present a queue-based algorithm that considerably speeds up the computation.
http://arxiv.org/abs/1901.09641
In the context of 3D mapping, larger and larger point clouds are acquired with LIDAR sensors. The Iterative Closest Point (ICP) algorithm is used to align these point clouds. However, its complexity depends directly on the number of points to process. Several strategies exist to address this problem by reducing the number of points, but they tend to underperform with non-uniform density, large sensor noise, spurious measurements, and large-scale point clouds, which is the case in mobile robotics. This paper presents a novel sampling algorithm for registration in the ICP algorithm, based on spectral decomposition analysis and called the Spectral Decomposition Filter (SpDF). It preserves geometric information along the topology of point clouds and is able to scale to large environments with non-uniform density. The effectiveness of our method is validated and illustrated by quantitative and qualitative experiments in various environments.
http://arxiv.org/abs/1810.01666
DNA read mapping is a computationally expensive bioinformatics task, required for genome assembly and consensus polishing. It requires finding the best-fitting location for each DNA read on a long reference sequence. A novel resistive approximate similarity search accelerator, RASSA, exploits charge distribution and parallel in-memory processing to reflect a mismatch count between DNA sequences. The RASSA implementation of DNA long-read pre-alignment outperforms the state-of-the-art solution, minimap2, by 16-77x with comparable accuracy, and provides two orders of magnitude higher throughput than GateKeeper, a short-read pre-alignment hardware architecture implemented in FPGA.
https://arxiv.org/abs/1809.01127
Information Extraction (IE) refers to automatically extracting structured relation tuples from unstructured texts. Common IE solutions, including Relation Extraction (RE) and open IE systems, can hardly handle cross-sentence tuples, and are severely restricted by limited relation types as well as informal relation specifications (e.g., free-text based relation tuples). In order to overcome these weaknesses, we propose a novel IE framework named QA4IE, which leverages the flexible question answering (QA) approaches to produce high quality relation triples across sentences. Based on the framework, we develop a large IE benchmark with high quality human evaluation. This benchmark contains 293K documents, 2M golden relation triples, and 636 relation types. We compare our system with some IE baselines on our benchmark and the results show that our system achieves great improvements.
http://arxiv.org/abs/1804.03396
A convolutional layer in a Convolutional Neural Network (CNN) consists of many filters which apply the convolution operation to the input, capture some special patterns and pass the result to the next layer. If the same patterns also occur at the deeper layers of the network, why not use the same convolutional filters in those layers as well? In this paper, we propose a CNN architecture, the Layer Reuse Network (LruNet), where the convolutional layers are used repeatedly without the need to introduce new layers to obtain better performance. This approach brings several advantages: (i) a considerable number of parameters is saved, since we reuse layers instead of introducing new ones; (ii) the Memory Access Cost (MAC) can be reduced, since reused layer parameters can be fetched only once; (iii) the number of nonlinearities increases with layer reuse; and (iv) reused layers get gradient updates from multiple parts of the network. The proposed approach is evaluated on the CIFAR-10, CIFAR-100 and Fashion-MNIST datasets for the image classification task, and layer reuse improves performance by 5.14%, 5.85% and 2.29%, respectively. The source code and pretrained models are publicly available.
http://arxiv.org/abs/1901.09615
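The core mechanism is just one shared block applied in a loop. A schematic PyTorch sketch (channel counts and the single-block layout are ours; LruNet's actual blocks differ):

```python
import torch
import torch.nn as nn

class LayerReuseNet(nn.Module):
    """Layer reuse, schematically: one convolutional block applied
    repeatedly, so effective depth grows without adding parameters."""
    def __init__(self, channels=32, reuse=4, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.block = nn.Sequential(            # the single shared block
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.reuse = reuse
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        x = self.stem(x)
        for _ in range(self.reuse):            # same weights every pass
            x = self.block(x)
        x = x.mean(dim=(2, 3))                 # global average pooling
        return self.head(x)
```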
For convolutional neural networks, a simple algorithm to reduce off-chip memory accesses is proposed that maximally utilizes the on-chip memory of a neural processing unit. In particular, the algorithm provides an effective way to process a module which consists of multiple branches and a merge layer. For Inception-V3 on Samsung's NPU in Exynos, our evaluation shows that the proposed algorithm reduces off-chip memory accesses by a factor of 50, and accordingly achieves a 97.59% reduction in the amount of feature-map data transferred to and from off-chip memory.
http://arxiv.org/abs/1901.09614
Sensors are routinely mounted on robots to acquire various forms of measurements in spatio-temporal fields. Locating features within these fields and reconstruction (mapping) of the dense fields can be challenging in resource-constrained situations, such as when trying to locate the source of a gas leak from a small number of measurements. In such cases, a model of the underlying complex dynamics can be exploited to discover informative paths within the field. We use a fluid simulator as a model, to guide inference for the location of a gas leak. We perform localization via minimization of the discrepancy between observed measurements and gas concentrations predicted by the simulator. Our method is able to account for dynamically varying parameters of wind flow (e.g., direction and strength), and its effects on the observed distribution of gas. We develop algorithms for off-line inference as well as for on-line path discovery via active sensing. We demonstrate the efficiency, accuracy and versatility of our algorithm using experiments with a physical robot conducted in outdoor environments. We deploy an unmanned air vehicle (UAV) mounted with a CO2 sensor to automatically seek out a gas cylinder emitting CO2 via a nozzle. We evaluate the accuracy of our algorithm by measuring the error in the inferred location of the nozzle, based on which we show that our proposed approach is competitive with respect to state of the art baselines.
http://arxiv.org/abs/1901.09608
Disparity by block matching stereo is usually used in applications with limited computational power in order to get depth estimates. However, research on simple stereo methods has been less extensive than on the energy-based counterparts, which promise better-quality depth maps with more potential for future improvements. Semi-global matching (SGM) methods offer good performance and easy implementation but suffer from a very high memory footprint, because they work on the full disparity space image. Block matching stereo, on the other hand, needs much less memory. In this paper, we introduce a novel multi-scale hierarchical block-matching approach using a pyramidal variant of the depth and cost functions, which drastically improves the results of standard block matching stereo techniques while preserving the low memory footprint and further reducing the complexity of standard block matching. We tested our new multi-block-matching scheme on the Middlebury stereo benchmark, where we obtain results only slightly worse than state-of-the-art SGM implementations.
http://arxiv.org/abs/1901.09593
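The coarse-to-fine idea can be illustrated with OpenCV's standard block matcher: estimate disparity on a downsampled pair, then upsample and keep coarse values wherever the finer level fails to match. A rough sketch only, not the paper's pyramidal depth and cost functions:

```python
import cv2
import numpy as np

def pyramid_block_matching(left, right, levels=3, num_disp=64, block=9):
    """Coarse-to-fine block matching; left/right: uint8 grayscale images."""
    pyr_l, pyr_r = [left], [right]
    for _ in range(levels - 1):
        pyr_l.append(cv2.pyrDown(pyr_l[-1]))
        pyr_r.append(cv2.pyrDown(pyr_r[-1]))
    disp = None
    for lvl in reversed(range(levels)):        # coarsest level first
        nd = max(16, (num_disp >> lvl) // 16 * 16)  # StereoBM needs /16
        bm = cv2.StereoBM_create(numDisparities=nd, blockSize=block)
        d = bm.compute(pyr_l[lvl], pyr_r[lvl]).astype(np.float32) / 16.0
        if disp is None:
            disp = d
        else:
            up = cv2.pyrUp(disp) * 2.0         # disparity scales with width
            up = up[: d.shape[0], : d.shape[1]]
            disp = np.where(d > 0, d, up)      # fall back to coarse values
    return disp
```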