Welcome to AMDS123 Blog!

Recent Papers about CV, CL and SD

Semi-supervised GANs to Infer Travel Modes in GPS Trajectories

2019-02-27

Ali Yazdizadeh, Zachary Patterson, Bilal Farooq

arXiv_CV

arXiv_CV Adversarial GAN Survey CNN Inference Prediction
Abstract

Semi-supervised Generative Adversarial Networks (GANs) are developed in the context of travel mode inference with uni-dimensional smartphone trajectory data. We use data from a large-scale smartphone travel survey in Montreal, Canada. We convert GPS trajectories into fixed-sized segments with five channels (variables). We develop different GAN architectures and compare their prediction results with Convolutional Neural Networks (CNNs). The best semi-supervised GAN model led to a prediction accuracy of 83.4%, while the best CNN model achieved a prediction accuracy of 81.3%. The results compare favorably with previous studies, especially when taking the large-scale real-world nature of the dataset into account.

URL

https://arxiv.org/abs/1902.10768

PDF

https://arxiv.org/pdf/1902.10768
Controllable Neural Story Plot Generation via Reinforcement Learning

2019-02-27

Pradyumna Tambwekar, Murtaza Dhuliawala, Animesh Mehta, Lara J. Martin, Brent Harrison, Mark O. Riedl

arXiv_CL

arXiv_CL Reinforcement_Learning Language_Model
Abstract

Language-modeling–based approaches to story plot generation attempt to construct a plot by sampling from a language model (LM) to predict the next character, word, or sentence to add to the story. LM techniques lack the ability to receive guidance from the user to achieve a specific goal, resulting in stories that don’t have a clear sense of progression and lack coherence. We present a reward-shaping technique that analyzes a story corpus and produces intermediate rewards that are backpropagated into a pre-trained LM in order to guide the model towards a given goal. Automated evaluations show our technique can create a model that generates story plots which consistently achieve a specified goal. Human-subject studies show that the generated stories have more plausible event ordering than baseline plot generation techniques.

URL

http://arxiv.org/abs/1809.10736

PDF

http://arxiv.org/pdf/1809.10736
Nonlinear Markov Random Fields Learned via Backpropagation

2019-02-27

Mikael Brudfors, Yaël Balbastre, John Ashburner

arXiv_CV

arXiv_CV Segmentation CNN
Abstract

Although convolutional neural networks (CNNs) currently dominate competitions on image segmentation, for neuroimaging analysis tasks, more classical generative approaches based on mixture models are still used in practice to parcellate brains. To bridge the gap between the two, in this paper we propose a marriage between a probabilistic generative model, which has been shown to be robust to variability among magnetic resonance (MR) images acquired via different imaging protocols, and a CNN. The link is in the prior distribution over the unknown tissue classes, which are classically modelled using a Markov random field. In this work we model the interactions among neighbouring pixels by a type of recurrent CNN, which can encode more complex spatial interactions. We validate our proposed model on publicly available MR data, from different centres, and show that it generalises across imaging protocols. This result demonstrates a successful and principled inclusion of a CNN in a generative model, which in turn could be adapted by any probabilistic generative approach for image segmentation.

URL

http://arxiv.org/abs/1902.10747

PDF

http://arxiv.org/pdf/1902.10747
Joint Face Detection and Facial Motion Retargeting for Multiple Faces

2019-02-27

Bindita Chaudhuri, Noranart Vesdapunt, Baoyuan Wang

arXiv_CV

arXiv_CV Face CNN Detection Face_Detection
Abstract

Facial motion retargeting is an important problem in both computer graphics and vision, which involves capturing the performance of a human face and transferring it to another 3D character. Learning 3D morphable model (3DMM) parameters from 2D face images using convolutional neural networks is common in 2D face alignment, 3D face reconstruction etc. However, existing methods either require an additional face detection step before retargeting or use a cascade of separate networks to perform detection followed by retargeting in a sequence. In this paper, we present a single end-to-end network to jointly predict the bounding box locations and 3DMM parameters for multiple faces. First, we design a novel multitask learning framework that learns a disentangled representation of 3DMM parameters for a single face. Then, we leverage the trained single face model to generate ground truth 3DMM parameters for multiple faces to train another network that performs joint face detection and motion retargeting for images with multiple faces. Experimental results show that our joint detection and retargeting network has high face detection accuracy and is robust to extreme expressions and poses while being faster than state-of-the-art methods.

URL

http://arxiv.org/abs/1902.10744

PDF

http://arxiv.org/pdf/1902.10744
Cautious Deep Learning

2019-02-27

Yotam Hechtlinger, Barnabás Póczos, Larry Wasserman

arXiv_AI

arXiv_AI Adversarial CNN Deep_Learning Prediction
Abstract

Most classifiers operate by selecting the maximum of an estimate of the conditional distribution $p(y|x)$ where $x$ stands for the features of the instance to be classified and $y$ denotes its label. This often results in a hubristic bias: overconfidence in the assignment of a definite label. Usually, the observations are concentrated on a small volume but the classifier provides definite predictions for the entire space. We propose constructing conformal prediction sets which contain a set of labels rather than a single label. These conformal prediction sets contain the true label with probability $1-\alpha$. Our construction is based on $p(x|y)$ rather than $p(y|x)$ which results in a classifier that is very cautious: it outputs the null set (meaning “I don’t know”) when the object does not resemble the training examples. An important property of our approach is that adversarial attacks are likely to be predicted as the null set or would also include the true label. We demonstrate the performance on the ImageNet ILSVRC dataset and the CelebA and IMDB-Wiki facial datasets using high dimensional features obtained from state of the art convolutional neural networks.
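
The idea of building label sets from $p(x|y)$ can be illustrated with a small sketch. It is not the authors' implementation: the diagonal Gaussian density, the function names, and the choice to set each class threshold on the same points used to fit the density are all assumptions made to keep the example short (a careful version would calibrate on held-out data).

```python
import numpy as np

def log_density(X, mean, var):
    """Log of a diagonal Gaussian density, evaluated row by row."""
    return -0.5 * np.sum((X - mean) ** 2 / var + np.log(2 * np.pi * var), axis=1)

def fit_class_models(X, y, alpha=0.1):
    """Fit one density p(x|y) per class plus a per-class threshold.

    The threshold is the empirical alpha-quantile of the log-density over
    that class's own points, so roughly a 1 - alpha share of them pass.
    """
    models = {}
    for label in np.unique(y):
        pts = X[y == label]
        mean = pts.mean(axis=0)
        var = pts.var(axis=0) + 1e-6  # small floor keeps the variance positive
        threshold = np.quantile(log_density(pts, mean, var), alpha)
        models[label.item()] = (mean, var, threshold)
    return models

def predict_set(x, models):
    """Return every label whose density at x clears that label's threshold.

    An empty list plays the role of the null set ("I don't know").
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    return [label for label, (mean, var, threshold) in models.items()
            if log_density(x, mean, var)[0] >= threshold]

# Two synthetic classes centred at (0, 0) and (5, 5).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
               rng.normal(5.0, 1.0, size=(200, 2))])
y = np.array([0] * 200 + [1] * 200)
models = fit_class_models(X, y, alpha=0.1)

print(predict_set([0.0, 0.0], models))     # a point close to class 0
print(predict_set([50.0, -50.0], models))  # a point far from all training data
```

Because each class is tested against its own density, a point that resembles no class is rejected by every test, which is what produces the null set.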

URL

http://arxiv.org/abs/1805.09460

PDF

http://arxiv.org/pdf/1805.09460
Stereo Visual Inertial LiDAR Simultaneous Localization and Mapping

2019-02-27

Weizhao Shao, Srinivasan Vijayarangan, Cong Li, George Kantor

arXiv_RO

arXiv_RO SLAM
Abstract

Simultaneous Localization and Mapping (SLAM) is a fundamental task for mobile and aerial robotics. LiDAR based systems have proven to be superior to vision based systems due to their accuracy and robustness. In spite of their superiority, pure LiDAR based systems fail in certain degenerate cases like traveling through a tunnel. We propose Stereo Visual Inertial LiDAR (VIL) SLAM that performs better on these degenerate cases and has comparable performance on all other cases. VIL-SLAM accomplishes this by incorporating tightly-coupled stereo visual inertial odometry (VIO) with LiDAR mapping and LiDAR enhanced visual loop closure. The system generates loop-closure corrected 6-DOF LiDAR poses in real-time and 1cm voxel dense maps near real-time. VIL-SLAM demonstrates improved accuracy and robustness compared to state-of-the-art LiDAR methods.

URL

http://arxiv.org/abs/1902.10741

PDF

http://arxiv.org/pdf/1902.10741
Object-driven Text-to-Image Synthesis via Adversarial Training

2019-02-27

Wenbo Li, Pengchuan Zhang, Lei Zhang, Qiuyuan Huang, Xiaodong He, Siwei Lyu, Jianfeng Gao

arXiv_CV

arXiv_CV Salient Adversarial Attention GAN
Abstract

In this paper, we propose Object-driven Attentive Generative Adversarial Networks (Obj-GANs) that allow object-centered text-to-image synthesis for complex scenes. Following the two-step (layout-image) generation process, a novel object-driven attentive image generator is proposed to synthesize salient objects by paying attention to the most relevant words in the text description and the pre-generated semantic layout. In addition, a new Fast R-CNN based object-wise discriminator is proposed to provide rich object-wise discrimination signals on whether the synthesized object matches the text description and the pre-generated layout. The proposed Obj-GAN significantly outperforms the previous state of the art in various metrics on the large-scale COCO benchmark, increasing the Inception score by 27% and decreasing the FID score by 11%. A thorough comparison between the traditional grid attention and the new object-driven attention is provided through analyzing their mechanisms and visualizing their attention layers, showing insights into how the proposed model generates complex scenes in high quality.

URL

http://arxiv.org/abs/1902.10740

PDF

http://arxiv.org/pdf/1902.10740
A Replication Study: Machine Learning Models Are Capable of Predicting Sexual Orientation From Facial Images

2019-02-27

John Leuner

arXiv_CV

arXiv_CV
Abstract

Recent research used machine learning methods to predict a person’s sexual orientation from their photograph (Wang and Kosinski, 2017). To verify this result, two of these models are replicated, one based on a deep neural network (DNN) and one on facial morphology (FM). Using a new dataset of 20,910 photographs from dating websites, the ability to predict sexual orientation is confirmed (DNN accuracy male 68%, female 77%, FM male 62%, female 72%). To investigate whether facial features such as brightness or predominant colours are predictive of sexual orientation, a new model based on highly blurred facial images was created. This model was also able to predict sexual orientation (male 63%, female 72%). The tested models are invariant to intentional changes to a subject’s makeup, eyewear, facial hair and head pose (angle that the photograph is taken at). It is shown that the head pose is not correlated with sexual orientation. While demonstrating that dating profile images carry rich information about sexual orientation these results leave open the question of how much is determined by facial morphology and how much by differences in grooming, presentation and lifestyle. The advent of new technology that is able to detect sexual orientation in this way may have serious implications for the privacy and safety of gay men and women.

URL

http://arxiv.org/abs/1902.10739

PDF

http://arxiv.org/pdf/1902.10739
Shallow Water Bathymetry Mapping from UAV Imagery based on Machine Learning

2019-02-27

Panagiotis Agrafiotis, Dimitrios Skarlatos, Andreas Georgopoulos, Konstantinos Karantzalos

arXiv_CV

arXiv_CV Survey Quantitative
Abstract

The determination of accurate bathymetric information is a key element for near offshore activities, hydrological studies such as coastal engineering applications, sedimentary processes, hydrographic surveying as well as archaeological mapping and biological research. UAV imagery processed with Structure from Motion (SfM) and Multi View Stereo (MVS) techniques can provide a low-cost alternative to established shallow seabed mapping techniques, while also offering important visual information. Nevertheless, water refraction poses significant challenges for depth determination. Until now, this problem has been addressed through customized image-based refraction correction algorithms or by modifying the collinearity equation. In this paper, in order to overcome the water refraction errors, we employ machine learning tools that are able to learn the systematic underestimation of the estimated depths. In the proposed approach, based on known depth observations from bathymetric LiDAR surveys, an SVR model was developed that estimates more accurately the real depths of point clouds derived from SfM-MVS procedures. Experimental results over two test sites along with the performed quantitative validation indicated the high potential of the developed approach.

URL

http://arxiv.org/abs/1902.10733

PDF

http://arxiv.org/pdf/1902.10733
Private Center Points and Learning of Halfspaces

2019-02-27

Amos Beimel, Shay Moran, Kobbi Nissim, Uri Stemmer

arXiv_AI

arXiv_AI Relation
Abstract

We present a private learner for halfspaces over an arbitrary finite domain $X\subset \mathbb{R}^d$ with sample complexity $\mathrm{poly}(d,2^{\log^*|X|})$. The building block for this learner is a differentially private algorithm for locating an approximate center point of $m>\mathrm{poly}(d,2^{\log^*|X|})$ points – a high dimensional generalization of the median function. Our construction establishes a relationship between these two problems that is reminiscent of the relation between the median and learning one-dimensional thresholds [Bun et al., FOCS '15]. This relationship suggests that the problem of privately locating a center point may have further applications in the design of differentially private algorithms. We also provide a lower bound on the sample complexity for privately finding a point in the convex hull. For approximate differential privacy, we show a lower bound of $m=\Omega(d+\log^*|X|)$, whereas for pure differential privacy $m=\Omega(d\log|X|)$.

URL

http://arxiv.org/abs/1902.10731

PDF

http://arxiv.org/pdf/1902.10731
Analyzing the Perceived Severity of Cybersecurity Threats Reported on Social Media

2019-02-27

Shi Zong, Alan Ritter, Graham Mueller, Evan Wright

arXiv_CL

arXiv_CL Face
Abstract

Breaking cybersecurity events are shared across a range of websites, including security blogs (FireEye, Kaspersky, etc.), in addition to social media platforms such as Facebook and Twitter. In this paper, we investigate methods to analyze the severity of cybersecurity threats based on the language that is used to describe them online. A corpus of 6,000 tweets describing software vulnerabilities is annotated with authors’ opinions toward their severity. We show that our corpus supports the development of automatic classifiers with high precision for this task. Furthermore, we demonstrate the value of analyzing users’ opinions about the severity of threats reported online as an early indicator of important software vulnerabilities. We present a simple, yet effective method for linking software vulnerabilities reported in tweets to Common Vulnerabilities and Exposures (CVEs) in the National Vulnerability Database (NVD). Using our predicted severity scores, we show that it is possible to achieve a Precision@50 of 0.86 when forecasting high severity vulnerabilities, significantly outperforming a baseline that is based on tweet volume. Finally we show how reports of severe vulnerabilities online are predictive of real-world exploits.
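
For readers unfamiliar with the Precision@50 figure quoted above, the metric can be sketched as follows. The function name and the toy scores are illustrative assumptions, not data or code from the paper.

```python
def precision_at_k(scores, is_relevant, k):
    """Share of the k highest-scored items that are truly relevant.

    scores      -- predicted severity score for each item
    is_relevant -- True where the item really is high severity
    """
    if k <= 0 or not scores:
        raise ValueError("k must be positive and scores must be non-empty")
    ranked = sorted(zip(scores, is_relevant), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(1 for _, relevant in top if relevant) / len(top)

# Toy example: the three highest scores are 0.9, 0.8 and 0.7,
# and two of those three items are relevant.
print(precision_at_k([0.9, 0.8, 0.3, 0.7], [True, False, True, True], k=3))
```

A Precision@50 of 0.86 therefore means that 43 of the 50 top-ranked vulnerabilities were in fact high severity.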

URL

http://arxiv.org/abs/1902.10680

PDF

http://arxiv.org/pdf/1902.10680
On Constrained Open-World Probabilistic Databases

2019-02-27

Tal Friedman, Guy Van den Broeck

arXiv_AI

arXiv_AI Knowledge
Abstract

Increasing amounts of available data have led to a heightened need for representing large-scale probabilistic knowledge bases. One approach is to use a probabilistic database, a model with strong assumptions that allow for efficiently answering many interesting queries. Recent work on open-world probabilistic databases strengthens the semantics of these probabilistic databases by discarding the assumption that any information not present in the data must be false. While intuitive, these semantics are not sufficiently precise to give reasonable answers to queries. We propose overcoming these issues by using constraints to restrict this open world. We provide an algorithm for one class of queries, and establish a basic hardness result for another. Finally, we propose an efficient and tight approximation for a large class of queries.

URL

http://arxiv.org/abs/1902.10677

PDF

http://arxiv.org/pdf/1902.10677
Meta-learning with differentiable closed-form solvers

2019-02-27

Luca Bertinetto, João F. Henriques, Philip H.S. Torr, Andrea Vedaldi

arXiv_CV

arXiv_CV Gradient_Descent
Abstract

Adapting deep networks to new concepts from a few examples is challenging, due to the high computational requirements of standard fine-tuning procedures. Most work on few-shot learning has thus focused on simple learning techniques for adaptation, such as nearest neighbours or gradient descent. Nonetheless, the machine learning literature contains a wealth of methods that learn non-deep models very efficiently. In this paper, we propose to use these fast convergent methods as the main adaptation mechanism for few-shot learning. The main idea is to teach a deep network to use standard machine learning tools, such as ridge regression, as part of its own internal model, enabling it to quickly adapt to novel data. This requires back-propagating errors through the solver steps. While normally the cost of the matrix operations involved in such a process would be significant, by using the Woodbury identity we can make the small number of examples work to our advantage. We propose both closed-form and iterative solvers, based on ridge regression and logistic regression components. Our methods constitute a simple and novel approach to the problem of few-shot learning and achieve performance competitive with or superior to the state of the art on three benchmarks.
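
The role of the Woodbury identity can be seen in a short numerical check. This is a sketch under assumed shapes (5 examples, 64 features) and hypothetical function names, not the authors' code: with few examples $n$ and many features $d$, ridge regression can be solved through an $n \times n$ system instead of a $d \times d$ one.

```python
import numpy as np

def ridge_primal(X, Y, lam):
    """Solve (X^T X + lam I) W = X^T Y, a d x d linear system."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def ridge_woodbury(X, Y, lam):
    """Equivalent form W = X^T (X X^T + lam I)^{-1} Y, an n x n linear system."""
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), Y)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 64))  # 5 examples, 64 features (few-shot regime)
Y = rng.normal(size=(5, 3))   # 3 regression targets per example

W_primal = ridge_primal(X, Y, lam=1.0)
W_small = ridge_woodbury(X, Y, lam=1.0)
print(np.allclose(W_primal, W_small))
```

Both functions return the same weights, but the second only factorises a 5 x 5 matrix, which is why a small number of examples becomes an advantage.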

URL

http://arxiv.org/abs/1805.08136

PDF

http://arxiv.org/pdf/1805.08136
Customizing Object Detectors for Indoor Robots

2019-02-27

Saif Alabachi, Gita Sukthankar, Rahul Sukthankar

arXiv_RO

arXiv_RO Object_Detection Face CNN Detection
Abstract

Object detection models based on convolutional neural networks (CNNs) demonstrate impressive performance when trained on large-scale labeled datasets. While a generic object detector trained on such a dataset performs adequately in applications where the input data is similar to user photographs, the detector performs poorly on small objects, particularly ones with limited training data or imaged from uncommon viewpoints. Also, a specific room will have many objects that are missed by standard object detectors, frustrating a robot that continually operates in the same indoor environment. This paper describes a system for rapidly creating customized object detectors. Data is collected from a quadcopter that is teleoperated with an interactive interface. Once an object is selected, the quadcopter autonomously photographs the object from multiple viewpoints to collect data to train DUNet (Dense Upscaled Network), our proposed model for learning customized object detectors from scratch given limited data. Our experiments compare the performance of learning models from scratch with DUNet vs. fine-tuning existing state of the art object detectors, both on our indoor robotics domain and on standard datasets.

URL

http://arxiv.org/abs/1902.10671

PDF

http://arxiv.org/pdf/1902.10671
Bridging the Gap: Attending to Discontinuity in Identification of Multiword Expressions

2019-02-27

Omid Rohanian, Shiva Taslimipoor, Samaneh Kouchaki, Le An Ha, Ruslan Mitkov

arXiv_AI

arXiv_AI Attention CNN Deep_Learning Relation
Abstract

We introduce a new method to tag Multiword Expressions (MWEs) using a linguistically interpretable language-independent deep learning architecture. We specifically target discontinuity, an under-explored aspect that poses a significant challenge to computational treatment of MWEs. Two neural architectures are explored: Graph Convolutional Network (GCN) and multi-head self-attention. GCN leverages dependency parse information, and self-attention attends to long-range relations. We finally propose a combined model that integrates complementary information from both through a gating mechanism. The experiments on a standard multilingual dataset for verbal MWEs show that our model outperforms the baselines not only in the case of discontinuous MWEs but also in overall F-score.

URL

http://arxiv.org/abs/1902.10667

PDF

http://arxiv.org/pdf/1902.10667
Achieving Non-Uniform Densities in Vibration Driven Robot Swarms Using Phase Separation Theory

2019-02-27

Siddharth Mayya, Gennaro Notomista, Dylan Shell, Seth Hutchinson, Magnus Egerstedt

arXiv_RO

arXiv_RO
Abstract

In robot swarms operating with severely constrained sensing and communication, individuals may need to use direct physical proximity to facilitate information exchange, perform task-specific actions, or, crucially, both. Unfortunately, the sorts of densities that are most appropriate for information exchange may differ markedly from densities that are apt for performing the task at hand. We envision a scenario where a swarm of vibration-driven robots - which sit atop bristles and achieve directed motion by vibrating them - move somewhat randomly in an environment while colliding with each other. We demonstrate that such a swarm of brushbots can predictably form high-density robot clusters along with simultaneously co-existing regions with lower robot densities. Theoretical techniques from the study of far-from-equilibrium collectives and statistical mechanics clarify the mechanisms underlying the formation of these regions. Specifically, we capitalize on a transformation that connects the collective properties of a system of self-propelled particles with a fluid system which is passive and classical, thereby inheriting the rich theory of equilibrium thermodynamics. This deeply surprising connection is a formal one and is a relatively recent result in studies of motility induced phase separation; it is previously unexplored in the context of robotics. Experiments are presented for a swarm of differential-drive like brushbots.

URL

http://arxiv.org/abs/1902.10662

PDF

http://arxiv.org/pdf/1902.10662
From explanation to synthesis: Compositional program induction for learning from demonstration

2019-02-27

Michael Burke, Svetlin Penkov, Subramanian Ramamoorthy

arXiv_CV

arXiv_CV Inference
Abstract

Hybrid systems are a compact and natural mechanism with which to address problems in robotics. This work introduces an approach to learning hybrid systems from demonstrations, with an emphasis on extracting models that are explicitly verifiable and easily interpreted by robot operators. We fit a sequence of controllers using sequential importance sampling under a generative switching proportional controller task model. Here, we parameterise controllers using a proportional gain and a visually verifiable joint angle goal. Inference under this model is challenging, but we address this by introducing an attribution prior extracted from a neural end-to-end visuomotor control model. Given the sequence of controllers comprising a task, we simplify the trace using grammar parsing strategies, taking advantage of the sequence compositionality, before grounding the controllers by training perception networks to predict goals given images. Using this approach, we are successfully able to induce a program for a visuomotor reaching task involving loops and conditionals from a single demonstration and a neural end-to-end model. In addition, we are able to discover the program used for a tower building task. We argue that computer program-like control systems are more interpretable than alternative end-to-end learning approaches, and that hybrid systems inherently allow for better generalisation across task configurations.

URL

http://arxiv.org/abs/1902.10657

PDF

http://arxiv.org/pdf/1902.10657
F10-SGD: Fast Training of Elastic-net Linear Models for Text Classification and Named-entity Recognition

2019-02-27

Stanislav Peshterliev, Alexander Hsieh, Imre Kiss

arXiv_CL

arXiv_CL Text_Classification Classification Recognition
Abstract

Voice-assistant text classification and named-entity recognition (NER) models are trained on millions of example utterances. Because of the large datasets, long training time is one of the bottlenecks for releasing improved models. In this work, we develop F10-SGD, a fast optimizer for text classification and NER elastic-net linear models. On internal datasets, F10-SGD provides a 4x reduction in training time compared to the OWL-QN optimizer without loss of accuracy or increase in model size. Furthermore, we incorporate biased sampling that prioritizes harder examples towards the end of the training. As a result, in addition to faster training, we were able to obtain statistically significant accuracy improvements for NER. On public datasets, F10-SGD obtains 22% faster training time compared to FastText for text classification, and a 4x reduction in training time compared to CRFSuite OWL-QN for NER.

URL

http://arxiv.org/abs/1902.10649

PDF

http://arxiv.org/pdf/1902.10649
Unifying Ensemble Methods for Q-learning via Social Choice Theory

2019-02-27

Rishav Chourasia, Adish Singla

arXiv_AI

arXiv_AI Reinforcement_Learning
Abstract

Ensemble methods have been widely applied in Reinforcement Learning (RL) in order to enhance stability, increase convergence speed, and improve exploration. These methods typically work by employing an aggregation mechanism over actions of different RL algorithms. We show that a variety of these methods can be unified by drawing parallels from committee voting rules in Social Choice Theory. We map the problem of designing an action aggregation mechanism in an ensemble method to a voting problem which, under different voting rules, yields popular ensemble-based RL algorithms like Majority Voting Q-learning or Bootstrapped Q-learning. Our unification framework, in turn, allows us to design new ensemble-RL algorithms with better performance. For instance, we map two diversity-centered committee voting rules, namely Single Non-Transferable Voting Rule and Chamberlin-Courant Rule, into new RL algorithms that demonstrate excellent exploratory behavior in our experiments.
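
Action aggregation as a voting problem can be sketched in a few lines. The example below assumes the simplest reading of majority voting, where each ensemble member casts one vote for its own greedy action; the function name, the tie-breaking rule, and the toy Q-values are illustrative assumptions rather than the paper's formulation.

```python
import numpy as np

def majority_vote_action(q_values):
    """Pick the action that the most ensemble members rank first.

    q_values has shape (n_members, n_actions). Ties go to the lowest
    action index, an arbitrary choice made for this sketch.
    """
    q_values = np.asarray(q_values)
    votes = np.argmax(q_values, axis=1)  # each member's greedy action
    counts = np.bincount(votes, minlength=q_values.shape[1])
    return int(np.argmax(counts))

# Three members, three actions: members 0 and 2 prefer action 1,
# member 1 prefers action 2.
q = [[1.0, 5.0, 2.0],
     [0.0, 3.0, 9.0],
     [4.0, 6.0, 1.0]]
print(majority_vote_action(q))
```

Swapping the body of this function for a different committee voting rule is the kind of substitution the unification framework describes.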

URL

http://arxiv.org/abs/1902.10646

PDF

http://arxiv.org/pdf/1902.10646
Provable Guarantees for Gradient-Based Meta-Learning

2019-02-27

Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

arXiv_AI

arXiv_AI Regularization Optimization Deep_Learning
Abstract

We study the problem of meta-learning through the lens of online convex optimization, developing a meta-algorithm bridging the gap between popular gradient-based meta-learning and classical regularization-based multi-task transfer methods. Our method is the first to simultaneously satisfy good sample efficiency guarantees in the convex setting, with generalization bounds that improve with task-similarity, while also being computationally scalable to modern deep learning architectures and the many-task setting. Despite its simplicity, the algorithm matches, up to a constant factor, a lower bound on the performance of any such parameter-transfer method under natural task similarity assumptions. We use experiments in both convex and deep learning settings to verify and demonstrate the applicability of our theory.

URL

http://arxiv.org/abs/1902.10644

PDF

http://arxiv.org/pdf/1902.10644
Efficient Video Classification Using Fewer Frames

2019-02-27

Shweta Bhardwaj, Mukundhan Srinivasan, Mitesh M. Khapra

arXiv_CV

arXiv_CV Video_Classification Inference Classification
Abstract

Recently, there has been a lot of interest in building compact models for video classification which have a small memory footprint (<1 GB). While these models are compact, they typically operate by repeated application of a small weight matrix to all the frames in a video. For example, recurrent neural network based methods compute a hidden state for every frame of the video using a recurrent weight matrix. Similarly, cluster-and-aggregate based methods such as NetVLAD have a learnable clustering matrix which is used to assign soft-clusters to every frame in the video. Since these models look at every frame in the video, the number of floating point operations (FLOPs) is still large even though the memory footprint is small. We focus on building compute-efficient video classification models which process fewer frames and hence require fewer FLOPs. Similar to memory efficient models, we use the idea of distillation, albeit in a different setting. Specifically, in our case, a compute-heavy teacher which looks at all the frames in the video is used to train a compute-efficient student which looks at only a small fraction of frames in the video. This is in contrast to a typical memory efficient Teacher-Student setting, wherein both the teacher and the student look at all the frames in the video but the student has fewer parameters. Our work thus complements the research on memory efficient video classification. We do an extensive evaluation with three types of models for video classification, viz. (i) recurrent models, (ii) cluster-and-aggregate models and (iii) memory-efficient cluster-and-aggregate models, and show that in each of these cases, a see-it-all teacher can be used to train a compute-efficient see-very-little student. We show that the proposed student network can reduce the inference time by 30% and the number of FLOPs by approximately 90% with a negligible drop in the performance.
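
The teacher-student setting described above can be sketched with placeholder components. Everything specific here is an assumption: mean pooling stands in for the real encoders, the squared-error loss is one possible way to match representations, and the frame count, feature size, and sampling rate are arbitrary.

```python
import numpy as np

def subsample_frames(frames, every):
    """Keep one frame out of every `every` frames, starting with the first."""
    return frames[::every]

def encode(frames):
    """Placeholder encoder: mean-pool frame features into one video vector."""
    return frames.mean(axis=0)

def representation_loss(teacher_repr, student_repr):
    """Squared L2 distance between the two video-level representations."""
    return float(np.sum((teacher_repr - student_repr) ** 2))

rng = np.random.default_rng(0)
frames = rng.normal(size=(300, 128))  # 300 frames, 128 features per frame

teacher_repr = encode(frames)                        # teacher sees all frames
student_repr = encode(subsample_frames(frames, 10))  # student sees 1 in 10

print(subsample_frames(frames, 10).shape)
print(representation_loss(teacher_repr, student_repr))
```

In training, a loss of this kind would be minimised with respect to the student's parameters so that its representation from few frames approaches the teacher's representation from all frames.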

URL

http://arxiv.org/abs/1902.10640

PDF

http://arxiv.org/pdf/1902.10640
Neural Imaging Pipelines - the Scourge or Hope of Forensics?

2019-02-27

Pawel Korus, Nasir Memon

arXiv_CV

arXiv_CV Optimization Detection
Abstract

Forensic analysis of digital photographs relies on intrinsic statistical traces introduced at the time of their acquisition or subsequent editing. Such traces are often removed by post-processing (e.g., down-sampling and re-compression applied upon distribution on the Web) which inhibits reliable provenance analysis. Increasing adoption of computational methods within digital cameras further complicates the process and renders explicit mathematical modeling infeasible. While this trend challenges forensic analysis even in near-acquisition conditions, it also creates new opportunities. This paper explores end-to-end optimization of the entire image acquisition and distribution workflow to facilitate reliable forensic analysis at the end of the distribution channel, where state-of-the-art forensic techniques fail. We demonstrate that a neural network can be trained to replace the entire photo development pipeline, and jointly optimized for high-fidelity photo rendering and reliable provenance analysis. Such an optimized neural imaging pipeline allowed us to increase image manipulation detection accuracy from approx. 45% to over 90%. The network learns to introduce carefully crafted artifacts, akin to digital watermarks, which facilitate subsequent manipulation detection. Analysis of performance trade-offs indicates that most of the gains can be obtained with only minor distortion. The findings encourage further research towards building more reliable imaging pipelines with explicit provenance-guaranteeing properties.

URL

http://arxiv.org/abs/1902.10707

PDF

http://arxiv.org/pdf/1902.10707
Read All
Learning a Family of Optimal State Feedback Controllers

2019-02-27

Christopher Iliffe Sprague, Dario Izzo, Petter Ögren

arXiv_RO

arXiv_RO
Abstract

Solving optimal control problems is well known to be very computationally demanding. In this paper we show how a combination of Pontryagin’s minimum principle and machine learning can be used to learn optimal feedback controllers for a parametric cost function. This enables an unmanned system with limited computational resources to run optimal feedback controllers, and furthermore change the objective being optimised on the fly in response to external events. Thus, a time optimal control policy can be changed to a fuel optimal one, in the event of e.g., fuel leakage. The proposed approach is illustrated on both a standard inverted pendulum swing-up problem and a more complex interplanetary spacecraft orbital transfer.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10139

PDF

http://arxiv.org/pdf/1902.10139
Read All
Zoho at SemEval-2019 Task 9: Semi-supervised Domain Adaptation using Tri-training for Suggestion Mining

2019-02-27

Sai Prasanna, Sri Ananda Seelan

arXiv_CL

arXiv_CL CNN Classification Language_Model
Abstract

This paper describes our submission for the SemEval-2019 Suggestion Mining task. A simple Convolutional Neural Network (CNN) classifier with contextual word representations from a pre-trained language model was used for sentence classification. The model is trained using tri-training, a semi-supervised bootstrapping mechanism for labelling unseen data. Tri-training proved to be an effective technique to accommodate domain shift for cross-domain suggestion mining (Subtask B) where there is no hand labelled training data. For in-domain evaluation (Subtask A), we use the same technique to augment the training set. Our system ranks thirteenth in Subtask A with an $F_1$-score of 68.07 and third in Subtask B with an $F_1$-score of 81.94.
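The tri-training step mentioned above can be sketched as follows. This shows only the classic agreement rule (an unlabeled example is pseudo-labeled for one classifier when the other two agree on it); the keyword-based classifiers and the example sentences are hypothetical stand-ins, not the CNN system described in the abstract.

```python
def tri_training_round(classifiers, unlabeled):
    """One round of pseudo-label selection for three classifiers.

    For classifier i, an unlabeled example is added (with the agreed label)
    whenever the other two classifiers predict the same label for it.
    """
    additions = {i: [] for i in range(3)}
    for x in unlabeled:
        preds = [clf(x) for clf in classifiers]
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            if preds[j] == preds[k]:
                additions[i].append((x, preds[j]))
    return additions


# Hypothetical toy classifiers: 1 = suggestion, 0 = not a suggestion.
classifiers = [
    lambda s: int("should" in s),
    lambda s: int("please" in s or "should" in s),
    lambda s: int("recommend" in s or "should" in s),
]

unlabeled = [
    "you should add dark mode",
    "please fix the login page",
    "the app crashed",
]

additions = tri_training_round(classifiers, unlabeled)
```

Each classifier would then be retrained on its original labeled data plus its own `additions` list, and the round repeated until the pseudo-labels stop changing.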

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10623

PDF

http://arxiv.org/pdf/1902.10623
Read All
Learning Factored Markov Decision Processes with Unawareness

2019-02-27

Craig Innes, Alex Lascarides

arXiv_AI

arXiv_AI
Abstract

Methods for learning and planning in sequential decision problems often assume the learner is aware of all possible states and actions in advance. This assumption is sometimes untenable. In this paper, we give a method to learn factored Markov decision problems from both domain exploration and expert assistance, which guarantees convergence to near-optimal behaviour, even when the agent begins unaware of factors critical to success. Our experiments show our agent learns optimal behaviour on small and large problems, and that conserving information on discovering new possibilities results in faster convergence.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10619

PDF

http://arxiv.org/pdf/1902.10619
Read All
Differential Private Stack Generalization with an Application to Diabetes Prediction

2019-02-27

Xiawei Guo, Quanming Yao, James T. Kwok, WeiWei Tu, Yuqiang Chen, Wenyuan Dai, Qiang Yang

arXiv_AI

arXiv_AI GAN Prediction
Abstract

Differential privacy has recently developed as a standard to ensure data privacy in machine learning. However, to meet such a standard, noise is usually introduced into the original data to disambiguate the learning algorithms, which inevitably leads to a deterioration in predictive performance. In this paper, motivated by the success of ensemble learning in improving predictive performance, we propose to enhance privacy-preserving logistic regression by stacking. We show that this can be done by either sample-based or feature-based partitioning. However, we prove that when the privacy budgets are the same, feature-based partitioning requires fewer samples than the sample-based one, and thus is likely to have better empirical performance. Moreover, we prove that predictive performance can be further boosted for feature-based partitioning when feature importance is known. Finally, we not only demonstrate the effectiveness of our method on two benchmark data sets, i.e., MNIST and NEWS20, but also apply it to a real application of cross-organizational diabetes prediction on the RUIJIN data set, where privacy is of significant concern.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1811.09491

PDF

http://arxiv.org/pdf/1811.09491
Read All
Still a Pain in the Neck: Evaluating Text Representations on Lexical Composition

2019-02-27

Vered Shwartz, Ido Dagan

arXiv_CL

arXiv_CL Embedding
Abstract

Building meaningful phrase representations is challenging because phrase meanings are not simply the sum of their constituent meanings. Lexical composition can shift the meanings of the constituent words and introduce implicit information. We tested a broad range of textual representations for their capacity to address these issues. We found that, as expected, contextualized word representations perform better than static word embeddings, more so on detecting meaning shift than on recovering implicit information, in which their performance is still far from that of humans. Our evaluation suite, including 5 tasks related to lexical composition effects, can serve future research aiming to improve such representations.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10618

PDF

http://arxiv.org/pdf/1902.10618
Read All
Necessary and Sufficient Conditions for Passivity of Velocity-Sourced Impedance Control of Series Elastic Actuators

2019-02-27

Fatih Emre Tosun, Volkan Patoglu

arXiv_RO

arXiv_RO
Abstract

Series Elastic Actuation (SEA) has become prevalent in applications involving physical human-robot interaction as it provides considerable advantages over traditional stiff actuators in terms of stability robustness and fidelity of force control. Several impedance control architectures have been proposed for SEA. Among these alternatives, the cascaded controller with an inner-most velocity loop, an intermediate torque loop and an outer-most impedance loop is particularly favoured for its simplicity, robustness, and performance. In this paper, we derive the necessary and sufficient conditions to ensure the passivity of this cascade-controller architecture for rendering the two most common virtual impedance models. Based on the newly established passivity conditions, we provide non-conservative design guidelines to haptically display a null impedance and a pure spring while ensuring the passivity of interaction. We also demonstrate the importance of including physical damping in the actuator model during derivation of passivity conditions, when integral controllers are utilized. In particular, we show the adverse effect of physical damping on system passivity.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10607

PDF

http://arxiv.org/pdf/1902.10607
Read All
Architecting Dependable Learning-enabled Autonomous Systems: A Survey

2019-02-27

Chih-Hong Cheng, Dhiraj Gulati, Rongjie Yan

arXiv_AI

arXiv_AI Survey CNN
Abstract

We provide a summary of architectural approaches that can be used to construct dependable learning-enabled autonomous systems, with a focus on automated driving. We consider three technology pillars for architecting dependable autonomy, namely diverse redundancy, information fusion, and runtime monitoring. For learning-enabled components, we additionally summarize recent architectural approaches to increase the dependability beyond standard convolutional neural networks. We conclude the study with a list of promising research directions addressing the challenges of existing approaches.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10590

PDF

http://arxiv.org/pdf/1902.10590
Read All
Road is Enough! Extrinsic Calibration of Non-overlapping Stereo Camera and LiDAR using Road Information

2019-02-27

Jiyong Jeong, Lucas Y. Cho, Ayoung Kim

arXiv_CV

arXiv_CV Optimization Detection
Abstract

This paper presents a framework for the targetless extrinsic calibration of stereo cameras and Light Detection and Ranging (LiDAR) sensors with a non-overlapping Field of View (FOV). In order to solve the extrinsic calibration problem under such a challenging configuration, the proposed solution exploits road markings as static and robust features among the various dynamic objects that are present in urban environments. First, this study utilizes road markings that are commonly captured by the two sensor modalities to select informative images for estimating the extrinsic parameters. In order to accomplish stable optimization, multiple cost functions are defined, including Normalized Information Distance (NID), edge alignment, and plane fitting cost. Therefore, a smooth cost curve is formed for global optimization to prevent convergence to a local optimum. We further evaluate each cost function by examining parameter sensitivity near the optimal point. Another key characteristic of extrinsic calibration, repeatability, is analyzed by running the proposed method multiple times with randomly perturbed initial points.
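For readers unfamiliar with the Normalized Information Distance, a minimal sketch of the standard definition, NID(X, Y) = (H(X, Y) - I(X; Y)) / H(X, Y), is given below for two paired sequences of discrete values. Which sensor signals are compared, and how they are quantized, is not stated in the abstract; the toy inputs here are purely illustrative.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy in bits of a distribution given as raw counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def nid(xs, ys):
    """Normalized Information Distance between two paired discrete sequences.

    Returns a value in [0, 1]: 0 when one sequence determines the other
    (and vice versa), 1 when the two are statistically independent.
    """
    n = len(xs)
    h_x = entropy(Counter(xs).values(), n)
    h_y = entropy(Counter(ys).values(), n)
    h_xy = entropy(Counter(zip(xs, ys)).values(), n)
    if h_xy == 0.0:
        return 0.0  # both sequences are constant
    mutual_info = h_x + h_y - h_xy
    return (h_xy - mutual_info) / h_xy


aligned = nid([0, 0, 1, 1], [0, 0, 1, 1])    # perfectly dependent samples
unrelated = nid([0, 0, 1, 1], [0, 1, 0, 1])  # independent samples
```

A calibration cost built on this quantity would be lower when the two sensors' projected measurements line up, which is why it can serve as an alignment objective.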

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10586

PDF

http://arxiv.org/pdf/1902.10586
Read All
FastCal: Robust Online Self-Calibration for Robotic Systems

2019-02-27

Fernando Nobre, Christoffer Heckman

arXiv_RO

arXiv_RO
Abstract

We propose a solution for sensor extrinsic self-calibration with very low time complexity, competitive accuracy and graceful handling of often-avoided corner cases: drift in calibration parameters and unobservable directions in the parameter space. It consists of three main parts: 1) information-theoretic based segment selection for constant-time estimation; 2) observability-aware parameter update through a rank-revealing decomposition of the Fisher information matrix; 3) drift-correcting self-calibration through the time-decay of segments. At the core of our FastCal algorithm is the loosely-coupled formulation for sensor extrinsics calibration and efficient selection of measurements. FastCal runs up to an order of magnitude faster than similar self-calibration algorithms (camera-to-camera extrinsics, excluding feature-matching and image pre-processing on all comparisons), making FastCal ideal for integration into existing, resource-constrained, robotics systems.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10585

PDF

http://arxiv.org/pdf/1902.10585
Read All
When a Tweet is Actually Sexist. A more Comprehensive Classification of Different Online Harassment Categories and The Challenges in NLP

2019-02-27

Sima Sharifirad, Stan Matwin

arXiv_CL

arXiv_CL Classification
Abstract

Sexism is very common in social media and makes the boundaries of freedom tighter for feminist and female users. There is still no comprehensive classification of sexism attracting natural language processing techniques. Categorizing sexism in social media into hostile or benevolent sexism is so general that it ignores the other types of sexism occurring in these media. This paper proposes a more comprehensive and in-depth categorization of online harassment in social media, e.g. Twitter, into the following categories: “Indirect harassment”, “Information threat”, “sexual harassment”, “Physical harassment” and “Not sexist”. It also addresses the challenge of labeling them and presents the classification results for these categories. It is preliminary work applying machine learning to learn the concept of sexism, and it distinguishes itself by looking at more precise categories of sexism in social media.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10584

PDF

http://arxiv.org/pdf/1902.10584
Read All
Multiresolution Graph Attention Networks for Relevance Matching

2019-02-27

Ting Zhang, Bang Liu, Di Niu, Kunfeng Lai, Yu Xu

arXiv_CL

arXiv_CL Attention CNN Deep_Learning
Abstract

A large number of deep learning models have been proposed for the text matching problem, which is at the core of various typical natural language processing (NLP) tasks. However, existing deep models are mainly designed for the semantic matching between a pair of short texts, such as paraphrase identification and question answering, and do not perform well on the task of relevance matching between short-long text pairs. This is partially due to the fact that the essential characteristics of short-long text matching have not been well considered in these deep models. More specifically, these methods fail to handle extreme length discrepancy between text pieces and neither can they fully characterize the underlying structural information in long text documents. In this paper, we are especially interested in relevance matching between a piece of short text and a long document, which is critical to problems like query-document matching in information retrieval and web searching. To extract the structural information of documents, an undirected graph is constructed, with each vertex representing a keyword and the weight of an edge indicating the degree of interaction between keywords. Based on the keyword graph, we further propose a Multiresolution Graph Attention Network to learn multi-layered representations of vertices through a Graph Convolutional Network (GCN), and then match the short text snippet with the graphical representation of the document with the attention mechanisms applied over each layer of the GCN. Experimental results on two datasets demonstrate that our graph approach outperforms other state-of-the-art deep matching models.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10580

PDF

http://arxiv.org/pdf/1902.10580
Read All
Balancing Global Exploration and Local-connectivity Exploitation with Rapidly-exploring Random disjointed-Trees

2019-02-27

Tin Lai, Fabio Ramos, Gilad Francis

arXiv_RO

arXiv_RO
Abstract

Sampling efficiency in a highly constrained environment has long been a major challenge for sampling-based planners. In this work, we propose Rapidly-exploring Random disjointed-Trees* (RRdT*), an incremental optimal multi-query planner. RRdT* uses multiple disjointed-trees to exploit local-connectivity of spaces via Markov Chain random sampling, which utilises neighbourhood information derived from previous successful and failed samples. To balance local exploitation, RRdT* actively explores unseen global spaces when local-connectivity exploitation is unsuccessful. The active trade-off between local exploitation and global exploration is formulated as a multi-armed bandit problem. We argue that the active balancing of global exploration and local exploitation is the key to improving sample efficiency in sampling-based motion planners. We provide rigorous proofs of completeness and optimal convergence for this novel approach. Furthermore, we demonstrate experimentally the effectiveness of RRdT*’s locally exploring trees in granting improved visibility for planning. Consequently, RRdT* outperforms existing state-of-the-art incremental planners, especially in highly constrained environments.
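A bandit-style choice between local exploitation and global exploration can be sketched as follows. The abstract does not say which bandit strategy the planner uses; UCB1 is swapped in here purely for illustration, and the arm names and reward counts are hypothetical.

```python
import math


def ucb1_choose(counts, rewards):
    """Pick an arm with the UCB1 rule.

    counts[arm]  -- how many times the arm has been tried
    rewards[arm] -- total reward collected by the arm (e.g. successful samples)
    """
    # Try every arm once before comparing scores.
    for arm, n in counts.items():
        if n == 0:
            return arm
    total = sum(counts.values())

    def score(arm):
        mean = rewards[arm] / counts[arm]
        bonus = math.sqrt(2.0 * math.log(total) / counts[arm])
        return mean + bonus

    return max(counts, key=score)


# Local sampling has worked well so far, so it keeps being chosen...
exploit = ucb1_choose({"local": 10, "global": 10}, {"local": 8, "global": 2})

# ...but a rarely tried arm gets a large exploration bonus.
explore = ucb1_choose({"local": 100, "global": 1}, {"local": 50, "global": 0})
```

The exploration bonus shrinks as an arm is tried more often, so the rule gradually shifts effort toward whichever sampling mode has been paying off.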

Abstract (translated by Google)

URL

http://arxiv.org/abs/1810.03749

PDF

http://arxiv.org/pdf/1810.03749
Read All
DeepLO: Geometry-Aware Deep LiDAR Odometry

2019-02-27

Younggun Cho, Giseop Kim, Ayoung Kim

arXiv_CV

arXiv_CV Detection
Abstract

Recently, learning-based ego-motion estimation approaches have drawn strong interest from studies mostly focusing on visual perception. These groundbreaking works focus on unsupervised learning for odometry estimation but mostly for visual sensors. Compared to images, a learning-based approach using Light Detection and Ranging (LiDAR) has been reported in only a few studies where, most often, a supervised learning framework is proposed. In this paper, we propose a novel approach to geometry-aware deep LiDAR odometry trainable via both supervised and unsupervised frameworks. We incorporate the Iterative Closest Point (ICP) algorithm into a deep-learning framework and show the reliability of the proposed pipeline. We provide two loss functions that allow switching between supervised and unsupervised learning depending on the ground-truth validity in the training phase. An evaluation using the KITTI and Oxford RobotCar datasets demonstrates the prominent performance and efficiency of the proposed method when achieving pose accuracy.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10562

PDF

http://arxiv.org/pdf/1902.10562
Read All
STRIPStream: Integrating Symbolic Planners and Blackbox Samplers

2019-02-27

Caelan Reed Garrett, Tomás Lozano-Pérez, Leslie Pack Kaelbling

arXiv_AI

arXiv_AI Relation
Abstract

Many planning applications involve complex relationships defined on high-dimensional, continuous variables. For example, robotic manipulation requires planning with kinematic, collision, visibility, and motion constraints involving robot configurations, object transforms, and robot trajectories. These constraints typically require specialized procedures to sample satisfying values. We extend the STRIPS planning language to support a generic, declarative specification for these procedures while treating their implementation as black boxes. We also describe cost-sensitive planning within this framework. We provide several domain-independent algorithms that reduce STRIPStream problems to a sequence of finite-domain STRIPS planning problems. Finally, we evaluate our algorithms on three robotic planning domains.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1802.08705

PDF

http://arxiv.org/pdf/1802.08705
Read All
Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference

2019-02-27

Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, Long Quan

arXiv_CV

arXiv_CV Regularization Inference Deep_Learning
Abstract

Deep learning has recently demonstrated its excellent performance for multi-view stereo (MVS). However, one major limitation of current learned MVS approaches is scalability: the memory-consuming cost volume regularization makes learned MVS hard to apply to high-resolution scenes. In this paper, we introduce a scalable multi-view stereo framework based on the recurrent neural network. Instead of regularizing the entire 3D cost volume in one go, the proposed Recurrent Multi-view Stereo Network (R-MVSNet) sequentially regularizes the 2D cost maps along the depth direction via the gated recurrent unit (GRU). This dramatically reduces memory consumption and makes high-resolution reconstruction feasible. We first show the state-of-the-art performance achieved by the proposed R-MVSNet on the recent MVS benchmarks. Then, we further demonstrate the scalability of the proposed method on several large-scale scenarios, where previous learned approaches often fail due to the memory constraint. Code is available at https://github.com/YoYo000/MVSNet.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10556

PDF

http://arxiv.org/pdf/1902.10556
Read All
Technical report of 'Empirical Study on Human Evaluation of Complex Argumentation Frameworks'

2019-02-27

Marcos Cramer, Mathieu Guillaume

arXiv_AI

arXiv_AI Relation
Abstract

In abstract argumentation, multiple argumentation semantics have been proposed that allow selecting sets of jointly acceptable arguments from a given argumentation framework, i.e. based only on the attack relation between arguments. The existence of multiple argumentation semantics raises the question of which of these semantics best predicts how humans evaluate arguments. Previous empirical cognitive studies that have tested how humans evaluate sets of arguments depending on the attack relation between them have been limited to a small set of very simple argumentation frameworks, so that some semantics studied in the literature could not be meaningfully distinguished by these studies. In this paper we report on an empirical cognitive study that overcomes these limitations by taking into consideration twelve argumentation frameworks of three to eight arguments each. These argumentation frameworks were mostly more complex than the argumentation frameworks considered in previous studies. All twelve argumentation frameworks were systematically instantiated with natural language arguments based on a certain fictional scenario, and participants were shown both the natural language arguments and a graphical depiction of the attack relation between them. Our data shows that grounded and CF2 semantics were the best predictors of human argument evaluation. A detailed analysis revealed that part of the participants chose a cognitively simpler strategy that is predicted very well by grounded semantics, while another part of the participants chose a cognitively more demanding strategy that is mostly predicted well by CF2 semantics.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10552

PDF

http://arxiv.org/pdf/1902.10552
Read All
An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models

2019-02-27

Alexandra Chronopoulou, Christos Baziotis, Alexandros Potamianos

arXiv_CL

arXiv_CL Text_Classification Transfer_Learning Optimization Classification Language_Model
Abstract

A growing number of state-of-the-art transfer learning methods employ language models pretrained on large generic corpora. In this paper we present a conceptually simple and effective transfer learning approach that addresses the problem of catastrophic forgetting. Specifically, we combine the task-specific optimization function with an auxiliary language model objective, which is adjusted during the training process. This preserves language regularities captured by language models, while enabling sufficient adaptation for solving the target task. Our method does not require pretraining or finetuning separate components of the network and we train our models end-to-end in a single step. We present results on a variety of challenging affective and text classification tasks, surpassing well-established transfer learning methods with a greater level of complexity.
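The combination of a task loss with an auxiliary language-model loss can be sketched as a weighted sum whose weight changes over training. The abstract only says the auxiliary objective "is adjusted during the training process"; the step-wise decay schedule and all constants below are assumptions chosen for illustration.

```python
def aux_weight(step, gamma0=0.2, decay=0.5, decay_every=100):
    """Hypothetical schedule: halve the auxiliary weight every `decay_every` steps."""
    return gamma0 * decay ** (step // decay_every)


def combined_loss(task_loss, lm_loss, step):
    """Task loss plus a down-weighted auxiliary language-model loss."""
    return task_loss + aux_weight(step) * lm_loss


early = combined_loss(task_loss=1.0, lm_loss=2.0, step=0)    # weight 0.2
later = combined_loss(task_loss=1.0, lm_loss=2.0, step=100)  # weight 0.1
```

Early in training the language-model term has more influence, which helps retain what the pretrained model already knows; as the weight shrinks, the task objective dominates.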

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10547

PDF

http://arxiv.org/pdf/1902.10547
Read All
Attributes-aided Part Detection and Refinement for Person Re-identification

2019-02-27

Shuzhao Li, Huimin Yu, Wei Huang, Jing Zhang

arXiv_CV

arXiv_CV Re-identification Object_Detection Person_Re-identification Classification Detection
Abstract

Person attributes are often exploited as mid-level human semantic information to help promote the performance of person re-identification task. In this paper, unlike most existing methods simply taking attribute learning as a classification problem, we perform it in a different way with the motivation that attributes are related to specific local regions, which refers to the perceptual ability of attributes. We utilize the process of attribute detection to generate corresponding attribute-part detectors, whose invariance to many influences like poses and camera views can be guaranteed. With detected local part regions, our model extracts local features to handle the body part misalignment problem, which is another major challenge for person re-identification. The local descriptors are further refined by fused attribute information to eliminate interferences caused by detection deviation. Extensive experiments on two popular benchmarks with attribute annotations demonstrate the effectiveness of our model and competitive performance compared with state-of-the-art algorithms.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10528

PDF

http://arxiv.org/pdf/1902.10528
Read All
DiscoFuse: A Large-Scale Dataset for Discourse-based Sentence Fusion

2019-02-27

Mor Geva, Eric Malmi, Idan Szpektor, Jonathan Berant

arXiv_CL

arXiv_CL Transfer_Learning
Abstract

Sentence fusion is the task of joining several independent sentences into a single coherent text. Current datasets for sentence fusion are small and insufficient for training modern neural models. In this paper, we propose a method for automatically generating fusion examples from raw text and present DiscoFuse, a large-scale dataset for discourse-based sentence fusion. We author a set of rules for identifying a diverse set of discourse phenomena in raw text, and decomposing the text into two independent sentences. We apply our approach to two document collections, Wikipedia and Sports articles, yielding 60 million fusion examples annotated with the discourse information required to reconstruct the fused text. We develop a sequence-to-sequence model on DiscoFuse and thoroughly analyze its strengths and weaknesses with respect to the various discourse phenomena, using both automatic and human evaluation. Finally, we conduct transfer learning experiments with WebSplit, a recent dataset for text simplification. We show that pretraining on DiscoFuse substantially improves performance on WebSplit when viewed as a sentence fusion task.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10526

PDF

http://arxiv.org/pdf/1902.10526
Read All
Viable Dependency Parsing as Sequence Labeling

2019-02-27

Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez

arXiv_CL

arXiv_CL RNN
Abstract

We recast dependency parsing as a sequence labeling problem, exploring several encodings of dependency trees as labels. While dependency parsing by means of sequence labeling had been attempted in existing work, results suggested that the technique was impractical. We show instead that with a conventional BiLSTM-based model it is possible to obtain fast and accurate parsers. These parsers are conceptually simple, not needing traditional parsing algorithms or auxiliary structures. However, experiments on the PTB and a sample of UD treebanks show that they provide a good speed-accuracy tradeoff, with results competitive with more complex approaches.
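To make the idea of encoding a dependency tree as per-word labels concrete, here is a minimal sketch of one simple encoding: each word's label stores the offset to its head plus the relation name. This is an illustrative choice, not necessarily one of the encodings evaluated in the paper, and the label format `offset@relation` is hypothetical.

```python
def encode(heads, rels):
    """Turn a dependency tree into one label per word.

    heads[i] is the 1-based index of the head of word i+1 (0 means root).
    Each label is "<head position minus word position>@<relation>".
    """
    return [
        f"{head - position}@{rel}"
        for position, (head, rel) in enumerate(zip(heads, rels), start=1)
    ]


def decode(labels):
    """Recover head indices and relations from the per-word labels."""
    heads, rels = [], []
    for position, label in enumerate(labels, start=1):
        offset, rel = label.split("@", 1)
        heads.append(position + int(offset))
        rels.append(rel)
    return heads, rels


# "She reads books": "reads" is the root, the other two words attach to it.
heads = [2, 0, 2]
rels = ["nsubj", "root", "obj"]
labels = encode(heads, rels)
```

Once trees are written this way, any sequence tagger (such as a BiLSTM) can predict one label per word, and `decode` turns the predictions back into a tree without a dedicated parsing algorithm.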

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10505

PDF

http://arxiv.org/pdf/1902.10505
Read All
EL Embeddings: Geometric construction of models for the Description Logic EL++

2019-02-27

Maxat Kulmanov, Wang Liu-Wei, Yuan Yan, Robert Hoehndorf

arXiv_AI

arXiv_AI Knowledge_Graph Knowledge Embedding Optimization Prediction Relation
Abstract

An embedding is a function that maps entities from one algebraic structure into another while preserving certain characteristics. Embeddings are being used successfully for mapping relational data or text into vector spaces where they can be used for machine learning, similarity search, or similar tasks. We address the problem of finding vector space embeddings for theories in the Description Logic $\mathcal{EL}^{++}$ that are also models of the TBox. To find such embeddings, we define an optimization problem that characterizes the model-theoretic semantics of the operators in $\mathcal{EL}^{++}$ within $\Re^n$, thereby solving the problem of finding an interpretation function for an $\mathcal{EL}^{++}$ theory given a particular domain $\Delta$. Our approach is mainly relevant to large $\mathcal{EL}^{++}$ theories and knowledge bases such as the ontologies and knowledge graphs used in the life sciences. We demonstrate that our method can be used for improved prediction of protein–protein interactions when compared to semantic similarity measures or knowledge graph embedding

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10499

PDF

http://arxiv.org/pdf/1902.10499
Read All
Few-Shot Text Classification with Induction Network

2019-02-27

Ruiying Geng, Binhua Li, Yongbin Li, Yuxiao Ye, Ping Jian, Jian Sun

arXiv_CL

arXiv_CL Sentiment Sentiment_Classification Text_Classification Classification
Abstract

Text classification tends to struggle when data is deficient or when it needs to adapt to unseen classes. In such challenging scenarios, recent studies often use meta learning to simulate the few-shot task, in which new queries are compared to a small support set on a sample-wise level. However, this sample-wise comparison may be severely disturbed by the various expressions in the same class. Therefore, we should be able to learn a general representation of each class in the support set and then compare it to new queries. In this paper, we propose a novel Induction Network to learn such generalized class-wise representations, innovatively combining the dynamic routing algorithm with the typical meta learning framework. In this way, our model is able to induce from particularity to universality, which is a more human-like learning approach. We evaluate our model on a well-studied sentiment classification dataset (English) and a real-world dialogue intent classification dataset (Chinese). Experimental results show that, on both datasets, our model significantly outperforms existing state-of-the-art models and improves the average accuracy by more than 3%, which proves the effectiveness of class-wise generalization in few-shot text classification.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10482

PDF

http://arxiv.org/pdf/1902.10482
Read All
Improving drone localisation around wind turbines using monocular model-based tracking

2019-02-27

Oliver Moolan-Feroze, Konstantinos Karachalios, Dimitrios N. Nikolaidis, Andrew Calway

arXiv_RO

arXiv_RO Tracking Drone CNN
Abstract

We present a novel method of integrating image-based measurements into a drone navigation system for the automated inspection of wind turbines. We take a model-based tracking approach, where a 3D skeleton representation of the turbine is matched to the image data. Matching is based on comparing the projection of the representation to that inferred from images using a convolutional neural network. This enables us to find image correspondences using a generic turbine model that can be applied to a wide range of turbine shapes and sizes. To estimate 3D pose of the drone, we fuse the network output with GPS and IMU measurements using a pose graph optimiser. Results illustrate that the use of the image measurements significantly improves the accuracy of the localisation over that obtained using GPS and IMU alone.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10474

PDF

http://arxiv.org/pdf/1902.10474
Read All
Fractional spectral graph wavelets and their applications

2019-02-27

Jiasong Wu, Fuzhi Wu, Qihan Yang, Youyong Kong, Xilin Liu, Yan Zhang, Lotfi Senhadji, Huazhong Shu

arXiv_CV

arXiv_CV
Abstract

One of the key challenges in the area of signal processing on graphs is to design transform and dictionary methods to identify and exploit structure in signals on weighted graphs. In this paper, we first generalize the graph Fourier transform (GFT) to the graph fractional Fourier transform (GFRFT), which is then used to define a novel transform named the spectral graph fractional wavelet transform (SGFRWT), a generalized and extended version of the spectral graph wavelet transform (SGWT). A fast algorithm for SGFRWT is also derived and implemented based on Fourier series approximation. The potential applications of SGFRWT are also presented.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10471

PDF

http://arxiv.org/pdf/1902.10471
Read All
Real-Time detection, classification and DOA estimation of Unmanned Aerial Vehicle

2019-02-27

Konstantinos Polyzos, Evangelos Dermatas

arXiv_SD

arXiv_SD Classification Detection
Abstract

The present work deals with a new passive system for real-time detection, classification and direction-of-arrival estimation of Unmanned Aerial Vehicles (UAVs). The proposed system, composed of very low-cost hardware components, comprises two different arrays of three or six microphones and non-linear amplification and filtering of the analog acoustic signal, which also avoids the saturation effect when the UAV is located near the microphones. Advanced array processing methods are used to detect and locate the wide-band sources in the near and far field, including array calibration and energy-based beamforming techniques. Moreover, oversampling techniques are adopted to increase the accuracy of the acquired signals and to decrease the quantization noise. The classifier is based on the nearest neighbor rule applied to a normalized Power Spectral Density, the acoustic signature of the UAV spectrum over short periods of time. The low-cost, low-power and high-efficiency embedded processor STM32F405RG is used for system implementation. Preliminary experimental results have shown the effectiveness of the proposed approach.
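A nearest-neighbor classifier over normalized power spectra can be sketched in a few lines. This is only an illustration of the general technique: the plain DFT, the squared Euclidean distance, the synthetic tones, and the labels `uav_a` and `uav_b` are all assumptions, not details taken from the system described above.

```python
import cmath
import math


def normalized_psd(signal):
    """Power spectrum of a real signal, scaled so the bins sum to 1."""
    n = len(signal)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        power.append(abs(coeff) ** 2)
    total = sum(power)
    return [p / total for p in power] if total > 0 else power


def nearest_label(query_psd, templates):
    """Return the label of the stored spectrum closest to the query."""
    def distance(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min(templates, key=lambda label: distance(query_psd, templates[label]))


def tone(freq_bin, n=32):
    """Synthetic test signal: a sine wave that falls exactly on one DFT bin."""
    return [math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]


# Hypothetical acoustic signatures for two UAV types.
templates = {"uav_a": normalized_psd(tone(2)), "uav_b": normalized_psd(tone(5))}

# A quieter recording of the second type: normalization removes the level change.
quiet_recording = [0.3 * x for x in tone(5)]
predicted = nearest_label(normalized_psd(quiet_recording), templates)
```

Normalizing the spectrum makes the comparison insensitive to overall loudness, which matters when the distance between the source and the microphones varies.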

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.11130

PDF

http://arxiv.org/pdf/1902.11130
Read All
Generative Collaborative Networks for Single Image Super-Resolution

2019-02-27

Mohamed El Amine Seddik, Mohamed Tamaazousti, John Lin

arXiv_CV

arXiv_CV Super_Resolution CNN
Abstract

A common issue of deep neural network-based methods for the problem of Single Image Super-Resolution (SISR) is the recovery of finer texture details when super-resolving at large upscaling factors. This issue is particularly related to the choice of the objective loss function. In particular, recent works proposed the use of a VGG loss, which consists in minimizing the error between the generated high-resolution images and the ground truth in the feature space of a Convolutional Neural Network (VGG19) pre-trained on the very “large” ImageNet dataset. When considering the problem of super-resolving images with a distribution “far” from the ImageNet image distribution (e.g., satellite images), this fixed VGG loss is no longer relevant. In this paper, we present a general framework named Generative Collaborative Networks (GCN), where the idea consists in optimizing the generator (the mapping of interest) in the feature space of a feature extractor network. The two networks (generator and extractor) are collaborative in the sense that the latter “helps” the former by constructing discriminative and relevant features (not necessarily fixed, and possibly learned mutually with the generator). We evaluate the GCN framework in the context of SISR, and we show that it results in a method that is adapted to super-resolution domains that are “far” from the ImageNet domain.
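
The idea of measuring error in a feature space instead of pixel space can be illustrated with a toy sketch. Everything named below is hypothetical: `toy_extractor` is a hand-written stand-in for a learned network such as VGG19, and the tiny nested-list images are made up for the example. It shows only how the loss is computed, not how the extractor would be trained jointly with the generator.

```python
def feature_space_loss(generated, target, extractor):
    """Mean squared error between extractor features of two images."""
    f_gen = extractor(generated)
    f_tgt = extractor(target)
    return sum((a - b) ** 2 for a, b in zip(f_gen, f_tgt)) / len(f_gen)

def pixel_loss(generated, target):
    """Plain per-pixel mean squared error, for comparison."""
    diffs = [(a - b) ** 2
             for row_g, row_t in zip(generated, target)
             for a, b in zip(row_g, row_t)]
    return sum(diffs) / len(diffs)

def toy_extractor(image):
    """Hypothetical feature extractor: horizontal gradients of each row."""
    return [row[i + 1] - row[i] for row in image for i in range(len(row) - 1)]

target = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]
shifted = [[v + 5.0 for v in row] for row in target]  # same texture, brighter
flat = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]             # texture lost

shifted_feature_loss = feature_space_loss(shifted, target, toy_extractor)
flat_feature_loss = feature_space_loss(flat, target, toy_extractor)
shifted_pixel_loss = pixel_loss(shifted, target)
```

With this toy extractor the brightness-shifted image has zero feature loss but a large pixel loss, while the textureless image is penalized in feature space. Which features are worth matching depends on the extractor, which is the choice the abstract argues should not be fixed to ImageNet.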

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10467

PDF

http://arxiv.org/pdf/1902.10467
Read All
Flash Lightens Gray Pixels

2019-02-27

Yanlin Qian, Song Yan, Joni-Kristian Kämäräinen, Jiri Matas

arXiv_CV

arXiv_CV Detection
Abstract

In the real world, a scene is usually cast by multiple illuminants and herein we address the problem of spatial illumination estimation. Our solution is based on detecting gray pixels with the help of flash photography. We show that flash photography significantly improves the performance of gray pixel detection without illuminant prior, training data or calibration of the flash. We also introduce a novel flash photography dataset generated from the MIT intrinsic dataset.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10466

PDF

http://arxiv.org/pdf/1902.10466
Read All
Multilingual Neural Machine Translation with Knowledge Distillation

2019-02-27

Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, Tieyan Liu

arXiv_CL

arXiv_CL Knowledge Attention
Abstract

Multilingual machine translation, which translates multiple languages with a single model, has attracted much attention due to its efficiency of offline training and online serving. However, traditional multilingual translation usually yields inferior accuracy compared with the counterpart using individual models for each language pair, due to language diversity and model capacity limitations. In this paper, we propose a distillation-based approach to boost the accuracy of multilingual machine translation. Specifically, individual models are first trained and regarded as teachers, and then the multilingual model is trained to fit the training data and match the outputs of individual models simultaneously through knowledge distillation. Experiments on IWSLT, WMT and Ted talk translation datasets demonstrate the effectiveness of our method. Particularly, we show that one model is enough to handle multiple languages (up to 44 languages in our experiment), with comparable or even better accuracy than individual models.
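
The training objective described above, fitting the reference data while also matching a teacher's outputs, can be sketched for a single token position as follows. The mixing weight `alpha`, the function names and the example logits are assumptions for illustration; the abstract does not state how the two terms are weighted.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_probs, target_index, alpha=0.5):
    """(1 - alpha) * NLL on the reference token + alpha * cross-entropy to the teacher.

    alpha is a hypothetical mixing weight, not a value taken from the paper.
    """
    student_probs = softmax(student_logits)
    # Term 1: fit the training data (negative log-likelihood of the reference token).
    nll = -math.log(student_probs[target_index])
    # Term 2: match the teacher's output distribution.
    soft_ce = -sum(q * math.log(p) for q, p in zip(teacher_probs, student_probs))
    return (1 - alpha) * nll + alpha * soft_ce

# Hypothetical three-word vocabulary; index 1 is the reference token.
student_logits = [1.0, 2.0, 0.5]
teacher_probs = softmax([0.5, 3.0, 0.1])  # output of an individual (teacher) model
loss = distillation_loss(student_logits, teacher_probs, target_index=1, alpha=0.5)
```

Setting `alpha=0` recovers ordinary training on the data alone, and a one-hot teacher makes both terms coincide, so the teacher term only adds information when the teacher spreads probability over several tokens.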

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.10461

PDF

http://arxiv.org/pdf/1902.10461
Read All
