Welcome to AMDS123 Blog!

Recent Papers about CV, CL and SD

Controller Synthesis for Discrete-time Hybrid Polynomial Systems via Occupation Measures

2019-05-15

Weiqiao Han, Russ Tedrake

arXiv_RO

arXiv_RO Optimization
Abstract

We consider the feedback design for stabilizing a rigid body system by making and breaking multiple contacts with the environment without prespecifying the timing or the number of occurrences of the contacts. We model such a system as a discrete-time hybrid polynomial system, where the state-input space is partitioned into several polytopic regions, with each region associated with a different polynomial dynamics equation. Based on the notion of occupation measures, we present a novel controller synthesis approach that solves finite-dimensional semidefinite programs as approximations to an infinite-dimensional linear program to stabilize the system. The optimization formulation is simple and convex, and for any fixed degree of approximation the computational complexity is polynomial in the state and control input dimensions. We illustrate our approach on some robotics examples.

URL

http://arxiv.org/abs/1809.06715

PDF

http://arxiv.org/pdf/1809.06715
Read All
A Clinical Approach to Training Effective Data Scientists

2019-05-15

Kit T Rodolfa, Adolfo De Unanue, Matt Gee, Rayid Ghani

arXiv_AI

arXiv_AI
Abstract

Like medicine, psychology, or education, data science is fundamentally an applied discipline, with most students who receive advanced degrees in the field going on to work on practical problems. Unlike these disciplines, however, data science education remains heavily focused on theory and methods, and practical coursework typically revolves around cleaned or simplified data sets that have little analog in professional applications. We believe that the environment in which new data scientists are trained should more accurately reflect that in which they will eventually practice, and we propose here a data science master’s degree program that takes inspiration from the residency model used in medicine. Students in the suggested program would spend three years working on a practical problem with an industry, government, or nonprofit partner, supplemented with coursework in data science methods and theory. We also discuss how this program can be implemented in shorter formats to augment existing professional master’s programs in different disciplines. This approach to learning by doing is designed to fill gaps in our current approach to data science education and ensure that students develop the skills they need to practice data science in a professional context and under the many constraints imposed by that context.

URL

http://arxiv.org/abs/1905.06875

PDF

http://arxiv.org/pdf/1905.06875
Read All
Unsupervised Deep Power Saving and Contrast Enhancement for OLED Displays

2019-05-15

Yong-Goo Shin, Seung Park, Min-Jae Yoo, Sung-Jea Ko

arXiv_AI

arXiv_AI Salient GAN CNN Deep_Learning
Abstract

Various power saving and contrast enhancement (PSCE) techniques have been applied to organic light emitting diode (OLED) displays to reduce the power demands of the display while preserving image quality. In this paper, we propose a new deep learning-based PSCE scheme that can save power consumed by the OLED display while enhancing the contrast of the displayed image. In the proposed method, power is saved by simply reducing the brightness by a certain ratio, whereas the perceived visual quality is preserved as much as possible by enhancing the contrast of the image using a convolutional neural network (CNN). Furthermore, our CNN can learn the PSCE technique without a reference image by unsupervised learning. Experimental results show that the proposed method is superior to conventional ones in terms of image quality assessment metrics such as a visual saliency-induced index (VSI) and a measure of enhancement (EME).

URL

http://arxiv.org/abs/1905.05916

PDF

http://arxiv.org/pdf/1905.05916
Read All
Explicit Utilization of General Knowledge in Machine Reading Comprehension

2019-05-15

Chao Wang, Hui Jiang

arXiv_AI

arXiv_AI Knowledge Attention
Abstract

To bridge the gap between Machine Reading Comprehension (MRC) models and human beings, which is mainly reflected in the hunger for data and the robustness to noise, we explore in this paper how to integrate the neural networks of MRC models with the general knowledge of human beings. On the one hand, we propose a data enrichment method, which uses WordNet to extract inter-word semantic connections as general knowledge from each given passage-question pair. On the other hand, we propose a new MRC model named Knowledge Aided Reader (KAR), which explicitly utilizes the extracted general knowledge in its attention mechanisms. Based on the data enrichment method, KAR is comparable in performance with the state-of-the-art MRC models and significantly more robust to noise than they are. Moreover, when only a subset (20% - 80%) of the training examples is available, KAR outperforms the state-of-the-art MRC models by a large margin and is still fairly robust to noise.

URL

http://arxiv.org/abs/1809.03449

PDF

http://arxiv.org/pdf/1809.03449
Read All
Passage Ranking with Weak Supervision

2019-05-15

Peng Xu, Xiaofei Ma, Ramesh Nallapati, Bing Xiang

arXiv_CL

arXiv_CL
Abstract

In this paper, we propose a weak supervision framework for neural ranking tasks based on the data programming paradigm (Ratner et al., 2016), which enables us to leverage multiple weak supervision signals from different sources. Empirically, we consider two sources of weak supervision signals: unsupervised ranking functions and semantic feature similarities. We train a BERT-based passage-ranking model (which achieves new state-of-the-art performance on two benchmark datasets with full supervision) in our weak supervision framework. Without using ground-truth training labels, BERT-PR models outperform the BM25 baseline by a large margin on all three datasets and even beat the previous state-of-the-art results with full supervision on two of the datasets.

URL

http://arxiv.org/abs/1905.05910

PDF

http://arxiv.org/pdf/1905.05910
Read All
A Learning based Branch and Bound for Maximum Common Subgraph Problems

2019-05-15

Yan-li Liu, Chu-min Li, Hua Jiang, Kun He

arXiv_CV

arXiv_CV Reinforcement_Learning
Abstract

Branch-and-bound (BnB) algorithms are widely used to solve combinatorial problems, and their performance crucially depends on the branching heuristic. In this work, we consider a typical problem, maximum common subgraph (MCS), and propose a branching heuristic inspired by reinforcement learning, with the goal of reaching a tree leaf as early as possible to greatly reduce the search tree size. Extensive experiments show that our method is beneficial and outperforms the current best BnB algorithm for MCS.

URL

http://arxiv.org/abs/1905.05840

PDF

http://arxiv.org/pdf/1905.05840
Read All
Task-Driven Modular Networks for Zero-Shot Compositional Learning

2019-05-15

Senthil Purushwalkam, Maximilian Nickel, Abhinav Gupta, Marc'Aurelio Ranzato

arXiv_CV

arXiv_CV Knowledge Classification
Abstract

One of the hallmarks of human intelligence is the ability to compose learned knowledge into novel concepts which can be recognized without a single training example. In contrast, current state-of-the-art methods require hundreds of training examples for each possible category to build reliable and accurate classifiers. To alleviate this striking difference in efficiency, we propose a task-driven modular architecture for compositional reasoning and sample efficient learning. Our architecture consists of a set of neural network modules, which are small fully connected layers operating in semantic concept space. These modules are configured through a gating function conditioned on the task to produce features representing the compatibility between the input image and the concept under consideration. This enables us to express tasks as a combination of sub-tasks and to generalize to unseen categories by reweighting a set of small modules. Furthermore, the network can be trained efficiently as it is fully differentiable and its modules operate on small sub-spaces. We focus our study on the problem of compositional zero-shot classification of object-attribute categories. We show in our experiments that current evaluation metrics are flawed as they only consider unseen object-attribute pairs. When extending the evaluation to the generalized setting which accounts also for pairs seen during training, we discover that naive baseline methods perform similarly or better than current approaches. However, our modular network is able to outperform all existing approaches on two widely-used benchmark datasets.

URL

http://arxiv.org/abs/1905.05908

PDF

http://arxiv.org/pdf/1905.05908
Read All
Speaker-Independent Speech-Driven Visual Speech Synthesis using Domain-Adapted Acoustic Models

2019-05-15

Ahmed Hussen Abdelaziz, Barry-John Theobald, Justin Binder, Gabriele Fanelli, Paul Dixon, Nicholas Apostoloff, Thibaut Weise, Sachin Kajareker

arXiv_SD

arXiv_SD Face Speech_Recognition Recognition
Abstract

Speech-driven visual speech synthesis involves mapping features extracted from acoustic speech to the corresponding lip animation controls for a face model. This mapping can take many forms, but a powerful approach is to use deep neural networks (DNNs). However, a limitation is the lack of synchronized audio, video, and depth data required to reliably train the DNNs, especially for speaker-independent models. In this paper, we investigate adapting an automatic speech recognition (ASR) acoustic model (AM) for the visual speech synthesis problem. We train the AM on ten thousand hours of audio-only data. The AM is then adapted to the visual speech synthesis domain using ninety hours of synchronized audio-visual speech. Using a subjective assessment test, we compared the performance of the AM-initialized DNN to one with a random initialization. The results show that viewers significantly prefer animations generated from the AM-initialized DNN to the ones generated using the randomly initialized model. We conclude that visual speech synthesis can significantly benefit from the powerful representation of speech in the ASR acoustic models.

URL

http://arxiv.org/abs/1905.06860

PDF

http://arxiv.org/pdf/1905.06860
Read All
Addressing the Loss-Metric Mismatch with Adaptive Loss Alignment

2019-05-15

Chen Huang, Shuangfei Zhai, Walter Talbott, Miguel Angel Bautista, Shih-Yu Sun, Carlos Guestrin, Josh Susskind

arXiv_CV

arXiv_CV Reinforcement_Learning Classification
Abstract

In most machine learning training paradigms a fixed, often handcrafted, loss function is assumed to be a good proxy for an underlying evaluation metric. In this work we assess this assumption by meta-learning an adaptive loss function to directly optimize the evaluation metric. We propose a sample efficient reinforcement learning approach for adapting the loss dynamically during training. We empirically show how this formulation improves performance by simultaneously optimizing the evaluation metric and smoothing the loss landscape. We verify our method in metric learning and classification scenarios, showing considerable improvements over the state-of-the-art on a diverse set of tasks. Importantly, our method is applicable to a wide range of loss functions and evaluation metrics. Furthermore, the learned policies are transferable across tasks and data, demonstrating the versatility of the method.

URL

http://arxiv.org/abs/1905.05895

PDF

http://arxiv.org/pdf/1905.05895
Read All
Crowd Density Estimation using Novel Feature Descriptor

2019-05-15

Adwan Alownie Alanazi, Muhammad Bilal

arXiv_CV

arXiv_CV
Abstract

Crowd density estimation is an important task for crowd monitoring. Many efforts have been made to automate the process of estimating crowd density from images and videos. Despite these efforts, it remains a challenging task. In this paper, we propose a new texture feature-based approach for the estimation of crowd density based on the Completed Local Binary Pattern (CLBP). We first divide the image into blocks and then re-divide the blocks into cells. For each cell, we compute the CLBP and then concatenate the results to describe the texture of the corresponding block. We then train a multi-class Support Vector Machine (SVM) classifier, which classifies each image block into one of four categories, i.e., Very Low, Low, Medium, and High. We evaluate our technique on the PETS 2009 dataset, and our experiments show that the proposed descriptor achieves 95% accuracy. We also compare against other state-of-the-art texture descriptors, and the experimental results show that our proposed method outperforms them.
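
The block-and-cell pipeline described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the basic 8-neighbour Local Binary Pattern as a simpler stand-in for CLBP, the function names (`lbp_code`, `cell_histogram`, `block_descriptor`) are hypothetical, and the block and cell sizes are arbitrary. The SVM stage is omitted.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code for the interior pixel at (r, c).

    Assumption: plain LBP is used here as a stand-in for CLBP.
    """
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def cell_histogram(img, r0, c0, size):
    """256-bin histogram of LBP codes over one square cell.

    Pixels on the image border are skipped because they lack neighbours.
    """
    hist = [0] * 256
    rows, cols = len(img), len(img[0])
    for r in range(r0, r0 + size):
        for c in range(c0, c0 + size):
            if 0 < r < rows - 1 and 0 < c < cols - 1:
                hist[lbp_code(img, r, c)] += 1
    return hist


def block_descriptor(img, r0, c0, block, cell):
    """Concatenate the cell histograms of one block into a single vector."""
    desc = []
    for r in range(r0, r0 + block, cell):
        for c in range(c0, c0 + block, cell):
            desc.extend(cell_histogram(img, r, c, cell))
    return desc


# Toy usage: one 8x8 block split into four 4x4 cells.
image = [[5] * 8 for _ in range(8)]
descriptor = block_descriptor(image, 0, 0, block=8, cell=4)
```

Each block's descriptor would then be passed to a multi-class classifier that assigns one of the four density levels.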

URL

http://arxiv.org/abs/1905.05891

PDF

http://arxiv.org/pdf/1905.05891
Read All
DARNet: Deep Active Ray Network for Building Segmentation

2019-05-15

Dominic Cheng, Renjie Liao, Sanja Fidler, Raquel Urtasun

arXiv_CV

arXiv_CV Segmentation CNN
Abstract

In this paper, we propose a Deep Active Ray Network (DARNet) for automatic building segmentation. Taking an image as input, it first exploits a deep convolutional neural network (CNN) as the backbone to predict energy maps, which are further utilized to construct an energy function. A polygon-based contour is then evolved via minimizing the energy function, of which the minimum defines the final segmentation. Instead of parameterizing the contour using Euclidean coordinates, we adopt polar coordinates, i.e., rays, which not only prevents self-intersection but also simplifies the design of the energy function. Moreover, we propose a loss function that directly encourages the contours to match building boundaries. Our DARNet is trained end-to-end by back-propagating through the energy minimization and the backbone CNN, which makes the CNN adapt to the dynamics of the contour evolution. Experiments on three building instance segmentation datasets demonstrate our DARNet achieves either state-of-the-art or comparable performances to other competitors.
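
To illustrate the polar (ray) parameterization mentioned above, here is a minimal Python sketch. It assumes evenly spaced ray angles around a fixed center, which the abstract does not specify, and the function names are hypothetical rather than taken from the paper. Because the vertices are ordered by angle, positive ray lengths give a contour that cannot cross itself.

```python
import math


def rays_to_polygon(center, radii):
    """Convert ray lengths at evenly spaced angles into polygon vertices.

    Assumption: ray i points at angle 2*pi*i/n from the given center.
    """
    cx, cy = center
    n = len(radii)
    vertices = []
    for i, r in enumerate(radii):
        theta = 2.0 * math.pi * i / n
        vertices.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
    return vertices


def polygon_area(vertices):
    """Shoelace formula; positive for counter-clockwise vertex order."""
    area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return area / 2.0


# Four unit-length rays describe a diamond with area 2.
contour = rays_to_polygon((0.0, 0.0), [1.0, 1.0, 1.0, 1.0])
area = polygon_area(contour)
```

In this representation, evolving the contour amounts to updating one scalar per ray instead of two coordinates per vertex.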

URL

http://arxiv.org/abs/1905.05889

PDF

http://arxiv.org/pdf/1905.05889
Read All
Online Center of Mass Estimation for a Humanoid Wheeled Inverted Pendulum Robot

2019-05-14

Munzir Zafar, Akash Patel, Bogdan Vlahov, Nathaniel Glaser, Sergio Aguillera, Seth Hutchinson

arXiv_RO

arXiv_RO Gradient_Descent
Abstract

We present a novel application of robust control and online learning for the balancing of an n Degree of Freedom (DoF) Wheeled Inverted Pendulum (WIP) humanoid robot. Our technique condenses the inaccuracies of a mass model into a Center of Mass (CoM) error, balances despite this error, and uses online learning to update the mass model for a better CoM estimate. Using a simulated model of our robot, we meta-learn a set of excitatory joint poses that makes our gradient descent algorithm quickly converge to an accurate CoM estimate. This simulated pipeline executes in a fully online fashion, using active disturbance rejection to address the mass errors that result from a steadily evolving mass model. Experiments were performed on a 19 DoF WIP, in which we manually acquired the data for the learned set of poses, and they show that the mass model obtained by gradient descent produces a CoM estimate that improves overall control and efficiency. This work contributes to a greater corpus of whole body control on the Golem Krang humanoid robot.

URL

http://arxiv.org/abs/1810.03076

PDF

http://arxiv.org/pdf/1810.03076
Read All
Generative Design in Minecraft: Chronicle Challenge

2019-05-14

Christoph Salge, Christian Guckelsberger, Michael Cerny Green, Rodrigo Canaan, Julian Togelius

arXiv_AI

arXiv_AI
Abstract

We introduce the Chronicle Challenge as an optional addition to the Settlement Generation Challenge in Minecraft. One of the foci of the overall competition is adaptive procedural content generation (PCG), an arguably under-explored problem in computational creativity. In the base challenge, participants must generate new settlements that respond to and ideally interact with existing content in the world, such as the landscape or climate. The goal is to understand the underlying creative process, and to design better PCG systems. The Chronicle Challenge in particular focuses on the generation of a narrative based on the history of a generated settlement, expressed in natural language. We discuss the unique features of the Chronicle Challenge in comparison to other competitions, clarify the characteristics of a chronicle eligible for submission, and describe the evaluation criteria. We furthermore draw on simulation-based approaches in computational storytelling as examples of how this challenge could be approached.

URL

http://arxiv.org/abs/1905.05888

PDF

http://arxiv.org/pdf/1905.05888
Read All
Kernel Mean Matching for Content Addressability of GANs

2019-05-14

Wittawat Jitkrittum, Patsorn Sangkloy, Muhammad Waleed Gondal, Amit Raj, James Hays, Bernhard Schölkopf

arXiv_CV

arXiv_CV Adversarial Knowledge GAN
Abstract

We propose a novel procedure which adds “content-addressability” to any given unconditional implicit model, e.g., a generative adversarial network (GAN). The procedure allows users to control the generative process by specifying a set (of arbitrary size) of desired examples, based on which similar samples are generated from the model. The proposed approach, based on kernel mean matching, is applicable to any generative model which transforms latent vectors to samples, and does not require retraining of the model. Experiments on various high-dimensional image generation problems (CelebA-HQ, LSUN bedroom, bridge, tower) show that our approach is able to generate images which are consistent with the input set, while retaining the image quality of the original model. To our knowledge, this is the first work that attempts to construct, at test time, a content-addressable generative model from a trained marginal model.

URL

http://arxiv.org/abs/1905.05882

PDF

http://arxiv.org/pdf/1905.05882
Read All
Budget-aware Semi-Supervised Semantic and Instance Segmentation

2019-05-14

Miriam Bellver, Amaia Salvador, Jordi Torres, Xavier Giro-i-Nieto

arXiv_CV

arXiv_CV Segmentation
Abstract

Methods that move towards less supervised scenarios are key for image segmentation, as dense labels demand significant human intervention. Generally, the annotation burden is mitigated by labeling datasets with weaker forms of supervision, e.g., image-level labels or bounding boxes. Another option is semi-supervised settings, which commonly leverage a few strong annotations and a huge amount of unlabeled or weakly-labeled data. In this paper, we revisit semi-supervised segmentation schemes and significantly narrow down the annotation budget (in terms of total labeling time of the training set) compared to previous approaches. With a very simple pipeline, we demonstrate that at low annotation budgets, semi-supervised methods outperform weakly-supervised ones by a wide margin for both semantic and instance segmentation. Our approach also outperforms previous semi-supervised works at a much reduced labeling cost. We present results for the Pascal VOC benchmark and unify weakly and semi-supervised approaches by considering the total annotation budget, thus allowing a fairer comparison between methods.

URL

http://arxiv.org/abs/1905.05880

PDF

http://arxiv.org/pdf/1905.05880
Read All
Zero-Shot Voice Style Transfer with Only Autoencoder Loss

2019-05-14

Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Mark Hasegawa-Johnson

arXiv_AI

arXiv_AI Adversarial GAN Style_Transfer
Abstract

Non-parallel many-to-many voice conversion, as well as zero-shot voice conversion, remain under-explored areas. Deep style transfer algorithms, such as generative adversarial networks (GAN) and conditional variational autoencoders (CVAE), are being applied as new solutions in this field. However, GAN training is sophisticated and difficult, and there is no strong evidence that its generated speech is of good perceptual quality. On the other hand, CVAE training is simple but does not come with the distribution-matching property of a GAN. In this paper, we propose a new style transfer scheme that involves only an autoencoder with a carefully designed bottleneck. We formally show that this scheme can achieve distribution-matching style transfer by training only on a self-reconstruction loss. Based on this scheme, we propose AUTOVC, which achieves state-of-the-art results in many-to-many voice conversion with non-parallel data, and which is the first to perform zero-shot voice conversion.

URL

http://arxiv.org/abs/1905.05879

PDF

http://arxiv.org/pdf/1905.05879
Read All
Extraction and Analysis of Clinically Important Follow-up Recommendations in a Large Radiology Dataset

2019-05-14

Wilson Lau, Thomas H Payne, Ozlem Uzuner, Meliha Yetisgen

arXiv_CL

arXiv_CL Deep_Learning Recommendation
Abstract

Communication of follow-up recommendations when abnormalities are identified on imaging studies is prone to error. In this paper, we present a natural language processing approach based on deep learning to automatically identify clinically important recommendations in radiology reports. Our approach first identifies the recommendation sentences and then extracts reason, test, and time frame of the identified recommendations. To train our extraction models, we created a corpus of 567 radiology reports annotated for recommendation information. Our extraction models achieved 0.92 f-score for recommendation sentence, 0.65 f-score for reason, 0.73 f-score for test, and 0.84 f-score for time frame. We applied the extraction models to a set of over 3.3 million radiology reports and analyzed the adherence of follow-up recommendations.

URL

http://arxiv.org/abs/1905.05877

PDF

http://arxiv.org/pdf/1905.05877
Read All
Improving Head Pose Estimation with a Combined Loss and Bounding Box Margin Adjustment

2019-05-14

Mingzhen Shao, Zhun Sun, Mete Ozay, Takayuki Okatani

arXiv_CV

arXiv_CV Face Pose_Estimation
Abstract

We address the problem of estimating the pose of a person’s head from an RGB image. The employment of CNNs for the problem has contributed to significant improvement in accuracy in recent works. However, we show that the following two methods, despite their simplicity, can attain further improvement: (i) proper adjustment of the margin of the bounding box of a detected face, and (ii) choice of loss functions. We show that the integration of these two methods achieves the new state-of-the-art on standard benchmark datasets for in-the-wild head pose estimation.
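
As a rough illustration of what adjusting the margin of a detected face box can look like, the following Python sketch grows a box by a fraction of its own size and clips it to the image. The relative-margin parameterization and the name `expand_box` are assumptions made for this example; the abstract does not say how the margin is defined or which value works best.

```python
def expand_box(box, margin, image_size):
    """Grow a face box (x1, y1, x2, y2) by `margin` times its width and
    height on each side, then clip the result to the image bounds.

    Assumption: the margin is relative to the box size (hypothetical choice).
    """
    x1, y1, x2, y2 = box
    width, height = image_size
    dx = margin * (x2 - x1)
    dy = margin * (y2 - y1)
    return (
        max(0.0, x1 - dx),
        max(0.0, y1 - dy),
        min(float(width), x2 + dx),
        min(float(height), y2 + dy),
    )


# A 100x100 face box grown by 50% of its size on each side.
crop = expand_box((100, 100, 200, 200), 0.5, (640, 480))
```

The expanded crop would then be resized and fed to the pose network in place of the tight detector output.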

URL

http://arxiv.org/abs/1905.08609

PDF

http://arxiv.org/pdf/1905.08609
Read All
Interpretable Deep Neural Networks for Patient Mortality Prediction: A Consensus-based Approach

2019-05-14

Shaeke Salman, Seyedeh Neelufar Payrovnaziri, Xiuwen Liu, Zhe He

arXiv_AI

arXiv_AI Adversarial Classification Prediction
Abstract

Deep neural networks have achieved remarkable success in challenging tasks. However, the black-box approach of training and testing such networks is not acceptable for critical applications. In particular, the existence of adversarial examples and their overgeneralization to irrelevant inputs make it difficult, if not impossible, to explain decisions by commonly used neural networks. In this paper, we analyze the underlying mechanism of generalization of deep neural networks and propose an (n, k) consensus algorithm that is insensitive to adversarial examples and at the same time able to reject irrelevant samples. Furthermore, the consensus algorithm is able to improve classification accuracy by using multiple trained deep neural networks. To handle the complexity of deep neural networks, we cluster linear approximations and use cluster means to capture feature importance. Due to weight symmetry, a small number of clusters is sufficient to produce a robust interpretation. Experimental results on a health dataset show the effectiveness of our algorithm in enhancing the prediction accuracy and interpretability of deep neural network models on one-year patient mortality prediction.

URL

http://arxiv.org/abs/1905.05849

PDF

http://arxiv.org/pdf/1905.05849
Read All
Identification and Recognition of Rice Diseases and Pests Using Convolutional Neural Networks

2019-05-14

Chowdhury Rafeed Rahman, Preetom Saha Arko, Mohammed Eunus Ali, Mohammad Ashik Iqbal Khan, Sajid Hasan Apon, Farzana Nowrin, Abu Wasif

arXiv_CV

arXiv_CV CNN Image_Classification Classification Deep_Learning Detection Recognition
Abstract

Accurate and timely detection of diseases and pests in rice plants can help farmers apply timely treatment and thereby substantially reduce economic losses. Recent developments in deep learning based convolutional neural networks (CNN) have greatly improved image classification accuracy. In this paper, we present deep learning based approaches to detect diseases and pests in rice plants using images captured in real-life scenarios. We have experimented with various state-of-the-art CNN architectures on our large dataset of rice diseases and pests collected manually from the field, which contains both inter-class and intra-class variations and has nine classes in total. The results show that we can effectively detect and recognize rice diseases and pests using CNNs, with a best accuracy of 99.53% on the test set using the VGG16 architecture. Though the accuracy of CNN models built on VGG16 or other similar architectures is impressive, these models are not suitable for mobile devices because of their large size and huge number of parameters. To solve this problem, we propose a new CNN architecture, namely stacked CNN, that exploits two-stage training to reduce the size of the model significantly while maintaining high classification accuracy. Our experimental results show that we achieve 95% test accuracy with stacked CNN, while reducing the model size by 98% compared to VGG16. Such memory-efficient CNN architectures can contribute to the development of mobile applications for rice disease detection and identification.

URL

http://arxiv.org/abs/1812.01043

PDF

http://arxiv.org/pdf/1812.01043
Read All
Efficient 2D-3D Matching for Multi-Camera Visual Localization

2019-05-14

Marcel Geppert, Peidong Liu, Zhaopeng Cui, Marc Pollefeys, Torsten Sattler

arXiv_RO

arXiv_RO Pose_Estimation
Abstract

Visual localization, i.e., determining the position and orientation of a vehicle with respect to a map, is a key problem in autonomous driving. We present a multi-camera visual inertial localization algorithm for large scale environments. To efficiently and effectively match features against a pre-built global 3D map, we propose a prioritized feature matching scheme for multi-camera systems. In contrast to existing works, designed for monocular cameras, we (1) tailor the prioritization function to the multi-camera setup and (2) run feature matching and pose estimation in parallel. This significantly accelerates the matching and pose estimation stages and allows us to dynamically adapt the matching efforts based on the surrounding environment. In addition, we show how pose priors can be integrated into the localization system to increase efficiency and robustness. Finally, we extend our algorithm by fusing the absolute pose estimates with motion estimates from a multi-camera visual inertial odometry (VIO) pipeline. This results in a system that provides reliable and drift-less pose estimation. Extensive experiments show that our localization is fast and robust under varying conditions, and that our extended algorithm enables reliable real-time pose estimation.

URL

http://arxiv.org/abs/1809.06445

PDF

http://arxiv.org/pdf/1809.06445
Read All
Supervised Learning of the Next-Best-View for 3D Object Reconstruction

2019-05-14

Miguel Mendoza, J. Irving Vasquez-Gomez, Hind Taud, Luis Enrique Sucar, Carolina Reta

arXiv_CV

arXiv_CV Face CNN Deep_Learning
Abstract

Motivated by the advances in 3D sensing technology and the spreading of low-cost robotic platforms, 3D object reconstruction has become a common task in many areas. Nevertheless, the selection of the optimal sensor pose that maximizes the reconstructed surface is a problem that remains open. It is known in the literature as the next-best-view planning problem. In this paper, we propose a novel next-best-view planning scheme based on supervised deep learning. The scheme contains an algorithm for automatic generation of datasets and an original three-dimensional convolutional neural network (3D-CNN) used to learn the next-best-view. Unlike previous work where the problem is addressed as a search, the trained 3D-CNN directly predicts the sensor pose. We present a comparison of the proposed network against a similar net, and we present several experiments of the reconstruction of unknown objects validating the effectiveness of the proposed scheme.

URL

http://arxiv.org/abs/1905.05833

PDF

http://arxiv.org/pdf/1905.05833
Read All
A deep neural network to enhance prediction of 1-year mortality using echocardiographic videos of the heart

2019-05-14

Alvaro Ulloa, Linyuan Jing, Christopher W Good, David P vanMaanen, Sushravya Raghunath, Jonathan D Suever, Christopher D Nevius, Gregory J Wehner, Dustin Hartzel, Joseph B Leader, Amro Alsaid, Aalpen A Patel, H Lester Kirchner, Marios S Pattichis, Christopher M Haggerty, Brandon K Fornwalt

arXiv_AI

arXiv_AI Prediction
Abstract

Predicting future clinical events helps physicians guide appropriate intervention. Machine learning has tremendous promise to assist physicians with predictions based on the discovery of complex patterns from historical data, such as large, longitudinal electronic health records (EHR). This study is a first attempt to demonstrate such capabilities using raw echocardiographic videos of the heart. We show that a large dataset of 723,754 clinically-acquired echocardiographic videos (~45 million images) linked to longitudinal follow-up data in 27,028 patients can be used to train a deep neural network to predict 1-year mortality with good accuracy (area under the curve (AUC) in an independent test set = 0.839). Prediction accuracy was further improved by adding EHR data (AUC = 0.858). Finally, we demonstrate that the trained neural network was more accurate in mortality prediction than two expert cardiologists. These results highlight the potential of neural networks to add new power to clinical predictions.
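
For readers unfamiliar with the AUC figures quoted above, the metric can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The Python sketch below computes it by direct pairwise comparison; the function name `auc` and the toy labels and scores are illustrative only and have no connection to the study's data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win.
    Requires at least one positive (1) and one negative (0) label.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
value = auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.5])
```

This quadratic-time form is meant for clarity; evaluations on large test sets normally use a sort-based computation.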

URL

http://arxiv.org/abs/1811.10553

PDF

http://arxiv.org/pdf/1811.10553
Read All
Reconstruction-Aware Imaging System Ranking by use of a Sparsity-Driven Numerical Observer Enabled by Variational Bayesian Inference

2019-05-14

Yujia Chen, Yang Lou, Kun Wang, Matthew A. Kupinski, Mark A. Anastasio

arXiv_CV

arXiv_CV Sparse Knowledge Optimization Inference
Abstract

It is widely accepted that optimization of imaging system performance should be guided by task-based measures of image quality (IQ). It has been advocated that imaging hardware or data-acquisition designs should be optimized by use of an ideal observer (IO) that exploits full statistical knowledge of the measurement noise and class of objects to be imaged, without consideration of the reconstruction method. In practice, accurate and tractable models of the complete object statistics are often difficult to determine. Moreover, in imaging systems that employ compressive sensing concepts, imaging hardware and sparse image reconstruction are innately coupled technologies. In this work, a sparsity-driven observer (SDO) that can be employed to optimize hardware by use of a stochastic object model describing object sparsity is described and investigated. The SDO and sparse reconstruction method can therefore be “matched” in the sense that they both utilize the same statistical information regarding the class of objects to be imaged. To efficiently compute the SDO test statistic, computational tools developed recently for variational Bayesian inference with sparse linear models are adopted. The use of the SDO to rank data-acquisition designs in a stylized example as motivated by magnetic resonance imaging (MRI) is demonstrated. This study reveals that the SDO can produce rankings that are consistent with visual assessments of the reconstructed images but different from those produced by use of the traditionally employed Hotelling observer (HO).

URL

http://arxiv.org/abs/1905.05820

PDF

http://arxiv.org/pdf/1905.05820
Read All
Ontology-Aware Clinical Abstractive Summarization

2019-05-14

Sean MacAvaney, Sajad Sotudeh, Arman Cohan, Nazli Goharian, Ish Talati, Ross W. Filice

arXiv_CL

arXiv_CL Ontology Summarization
Abstract

Automatically generating accurate summaries from clinical reports could save a clinician’s time, improve summary coverage, and reduce errors. We propose a sequence-to-sequence abstractive summarization model augmented with domain-specific ontological information to enhance content selection and summary generation. We apply our method to a dataset of radiology reports and show that it significantly outperforms the current state-of-the-art on this task in terms of ROUGE scores. Extensive human evaluation conducted by a radiologist further indicates that this approach yields summaries that are less likely to omit important details, without sacrificing readability or accuracy.

URL

http://arxiv.org/abs/1905.05818

PDF

http://arxiv.org/pdf/1905.05818
Read All
Curriculum Learning for Domain Adaptation in Neural Machine Translation

2019-05-14

Xuan Zhang, Pamela Shapiro, Gaurav Kumar, Paul McNamee, Marine Carpuat, Kevin Duh

arXiv_CL

arXiv_CL
Abstract

We introduce a curriculum learning approach to adapt generic neural machine translation models to a specific domain. Samples are grouped by their similarities to the domain of interest and each group is fed to the training algorithm with a particular schedule. This approach is simple to implement on top of any neural framework or architecture, and consistently outperforms both unadapted and adapted baselines in experiments with two distinct domains and two language pairs.
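The grouping-and-schedule idea can be sketched as follows. The abstract does not spell out the similarity score or the exact schedule, so this sketch assumes that a higher score means "closer to the target domain" and that each training phase adds the next most similar group to the visible data. The names `make_shards` and `curriculum_phases` are hypothetical.

```python
def make_shards(samples, scores, num_shards):
    """Sort samples by domain-similarity score (highest first) and split into groups.

    Returns at most num_shards groups of roughly equal size.
    """
    order = sorted(range(len(samples)), key=lambda i: scores[i], reverse=True)
    ranked = [samples[i] for i in order]
    size = -(-len(ranked) // num_shards)  # ceiling division
    return [ranked[k:k + size] for k in range(0, len(ranked), size)]


def curriculum_phases(shards):
    """Yield the training data visible in each phase (assumed schedule: one more group per phase)."""
    visible = []
    for shard in shards:
        visible = visible + shard
        yield list(visible)


# Toy usage with made-up sentences and scores.
sentences = ["a", "b", "c", "d"]
similarity = [0.1, 0.9, 0.5, 0.7]
for phase, data in enumerate(curriculum_phases(make_shards(sentences, similarity, 2)), start=1):
    print(phase, data)
```

Because the schedule only decides which samples the trainer may draw from, a sketch like this sits in the data pipeline and leaves the model untouched, which is consistent with the claim that the approach works on top of any architecture.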

URL

http://arxiv.org/abs/1905.05816

PDF

http://arxiv.org/pdf/1905.05816
Read All
Stochastic Gradient Push for Distributed Deep Learning

2019-05-14

Mahmoud Assran, Nicolas Loizou, Nicolas Ballas, Michael Rabbat

arXiv_AI

arXiv_AI Image_Classification Classification Deep_Learning
Abstract

Distributed data-parallel algorithms aim to accelerate the training of deep neural networks by parallelizing the computation of large mini-batch gradient updates across multiple nodes. Approaches that synchronize nodes using exact distributed averaging (e.g., via AllReduce) are sensitive to stragglers and communication delays. The PushSum gossip algorithm is robust to these issues, but only performs approximate distributed averaging. This paper studies Stochastic Gradient Push (SGP), which combines PushSum with stochastic gradient updates. We prove that SGP converges to a stationary point of smooth, non-convex objectives at the same sub-linear rate as SGD, and that all nodes achieve consensus. We empirically validate the performance of SGP on image classification (ResNet-50, ImageNet) and machine translation (Transformer, WMT’16 En-De) workloads. Our code will be made publicly available.
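To make the PushSum building block concrete, here is a minimal sketch of the averaging step alone, without the gradient updates that SGP interleaves with it. Each node keeps a value and a weight, splits both equally among itself and its out-neighbours, and estimates the average as value divided by weight. The function name `push_sum`, the equal-split mixing rule, and the small directed graph are assumptions chosen for illustration.

```python
def push_sum(values, out_neighbors, steps=200):
    """Approximate the average of `values` over a directed graph with PushSum.

    out_neighbors[i] lists the nodes that node i sends to (self-loop is added here).
    """
    n = len(values)
    x = list(values)      # running numerators
    w = [1.0] * n         # running weights
    for _ in range(steps):
        new_x = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            targets = [i] + list(out_neighbors[i])
            share = 1.0 / len(targets)       # equal split keeps total mass constant
            for j in targets:
                new_x[j] += share * x[i]
                new_w[j] += share * w[i]
        x, w = new_x, new_w
    # The ratio removes the bias introduced by unequal in-degrees.
    return [xi / wi for xi, wi in zip(x, w)]


# Hypothetical strongly connected directed graph on four nodes.
graph = {0: [1], 1: [2], 2: [3, 0], 3: [0]}
print(push_sum([1.0, 2.0, 3.0, 10.0], graph))
```

Nodes only ever push messages to their out-neighbours and never wait for a global reduction, which is why this style of averaging tolerates stragglers better than an exact AllReduce, at the cost of being approximate after any finite number of steps.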

URL

http://arxiv.org/abs/1811.10792

PDF

http://arxiv.org/pdf/1811.10792
Read All
Multi-task Learning for Multi-modal Emotion Recognition and Sentiment Analysis

2019-05-14

Md Shad Akhtar, Dushyant Singh Chauhan, Deepanway Ghosal, Soujanya Poria, Asif Ekbal, Pushpak Bhattacharyya

arXiv_CL

arXiv_CL Sentiment Attention Recognition
Abstract

Related tasks often depend on each other and perform better when solved in a joint framework. In this paper, we present a deep multi-task learning framework that jointly performs both sentiment and emotion analysis. The multi-modal inputs (i.e., text, acoustic and visual frames) of a video convey diverse and distinctive information, and usually do not contribute equally to the decision making. We propose a context-level inter-modal attention framework for simultaneously predicting the sentiment and expressed emotions of an utterance. We evaluate our proposed approach on the CMU-MOSEI dataset for multi-modal sentiment and emotion analysis. Evaluation results suggest that the multi-task learning framework offers an improvement over the single-task framework. The proposed approach reports new state-of-the-art performance for both sentiment analysis and emotion analysis.

URL

http://arxiv.org/abs/1905.05812

PDF

http://arxiv.org/pdf/1905.05812
Read All
Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates

2019-05-14

Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson, Cameron Browne

arXiv_AI

arXiv_AI
Abstract

In recent years, state-of-the-art game-playing agents often involve policies that are trained in self-playing processes where Monte Carlo tree search (MCTS) algorithms and trained policies iteratively improve each other. The strongest results have been obtained when policies are trained to mimic the search behaviour of MCTS by minimising a cross-entropy loss. Because MCTS, by design, includes an element of exploration, policies trained in this manner are also likely to exhibit a similar extent of exploration. In this paper, we are interested in learning policies for a project with future goals including the extraction of interpretable strategies, rather than state-of-the-art game-playing performance. For these goals, we argue that such an extent of exploration is undesirable, and we propose a novel objective function for training policies that are not exploratory. We derive a policy gradient expression for maximising this objective function, which can be estimated using MCTS value estimates, rather than MCTS visit counts. We empirically evaluate various properties of resulting policies, in a variety of board games.

URL

http://arxiv.org/abs/1905.05809

PDF

http://arxiv.org/pdf/1905.05809
Read All
Cross-Domain 3D Equivariant Image Embeddings

2019-05-14

Carlos Esteves, Avneesh Sud, Zhengyi Luo, Kostas Daniilidis, Ameesh Makadia

arXiv_CV

arXiv_CV Pose_Estimation Embedding CNN Classification
Abstract

Spherical convolutional networks have been introduced recently as tools to learn powerful feature representations of 3D shapes. Spherical CNNs are equivariant to 3D rotations making them ideally suited to applications where 3D data may be observed in arbitrary orientations. In this paper we learn 2D image embeddings with a similar equivariant structure: embedding the image of a 3D object should commute with rotations of the object. We introduce a cross-domain embedding from 2D images into a spherical CNN latent space. This embedding encodes images with 3D shape properties and is equivariant to 3D rotations of the observed object. The model is supervised only by target embeddings obtained from a spherical CNN pretrained for 3D shape classification. We show that learning a rich embedding for images with appropriate geometric structure is sufficient for tackling varied applications, such as relative pose estimation and novel view synthesis, without requiring additional task-specific supervision.

URL

http://arxiv.org/abs/1812.02716

PDF

http://arxiv.org/pdf/1812.02716
Read All
Towards Automated Melanoma Detection with Deep Learning: Data Purification and Augmentation

2019-05-14

Devansh Bisla, Anna Choromanska, Jennifer A. Stein, David Polsky, Russell Berman

arXiv_CV

arXiv_CV Adversarial GAN Classification Deep_Learning Detection
Abstract

Melanoma is one of the ten most common cancers in the US. Early detection is crucial for survival, but often the cancer is diagnosed in the fatal stage. Deep learning has the potential to improve cancer detection rates, but its applicability to melanoma detection is compromised by the limitations of the available skin lesion databases, which are small, heavily imbalanced, and contain images with occlusions. We build deep-learning-based tools for data purification and augmentation to counteract these limitations. The developed tools can be utilized in a deep learning system for lesion classification and we show how to build such a system. The system relies heavily on the processing unit for removing image occlusions and the data generation unit, based on generative adversarial networks, for populating scarce lesion classes, or equivalently creating virtual patients with pre-defined types of lesions. We empirically verify our approach and show that incorporating these two units into a melanoma detection system results in superior performance over common baselines.

URL

http://arxiv.org/abs/1902.06061

PDF

http://arxiv.org/pdf/1902.06061
Read All
Diffusion Methods for Classification with Pairwise Relationships

2019-05-14

Pedro F. Felzenszwalb, Benar F. Svaiter

arXiv_AI

arXiv_AI Classification Relation
Abstract

We define two algorithms for propagating information in classification problems with pairwise relationships. The algorithms are based on contraction maps and are related to non-linear diffusion and random walks on graphs. The approach is also related to message passing algorithms, including belief propagation and mean field methods. The algorithms we describe are guaranteed to converge on graphs with arbitrary topology. Moreover they always converge to a unique fixed point, independent of initialization. We prove that the fixed points of the algorithms under consideration define lower-bounds on the energy function and the max-marginals of a Markov random field. The theoretical results also illustrate a relationship between message passing algorithms and value iteration for an infinite horizon Markov decision process. We illustrate the practical application of the algorithms under study with numerical experiments in image restoration, stereo depth estimation and binary classification on a grid.

URL

http://arxiv.org/abs/1505.06072

PDF

http://arxiv.org/pdf/1505.06072
Read All
Misleading Failures of Partial-input Baselines

2019-05-14

Shi Feng, Eric Wallace, Jordan Boyd-Graber

arXiv_AI

arXiv_AI Inference
Abstract

Recent work establishes dataset difficulty and removes annotation artifacts via partial-input baselines (e.g., hypothesis-only or image-only models). While the success of a partial-input baseline indicates a dataset is cheatable, our work cautions the converse is not necessarily true. Using artificial datasets, we illustrate how the failure of a partial-input baseline might shadow more trivial patterns that are only visible in the full input. We also identify such artifacts in real natural language inference datasets. Our work provides an alternative view on the use of partial-input baselines in future dataset creation.

URL

http://arxiv.org/abs/1905.05778

PDF

http://arxiv.org/pdf/1905.05778
Read All
Robust Neural Network Training using Periodic Sampling over Model Weights

2019-05-14

Samarth Tripathi, Jiayi Liu, Unmesh Kurup, Mohak Shah

arXiv_AI

arXiv_AI Segmentation Face Classification Detection
Abstract

Deep neural networks provide best-in-class performance for a number of computer vision problems. However, training these networks is computationally intensive and requires fine-tuning various hyperparameters. In addition, performance swings widely as the network converges making it hard to decide when to stop training. In this paper, we introduce a trio of techniques (PSWA, PWALKS, and PSWM) centered around periodic sampling of model weights that provide consistent and more robust convergence on a variety of vision problems (classification, detection, segmentation) and gradient update methods (vanilla SGD, Momentum, Adam) with marginal additional computation time. Our techniques use existing optimal training policies but converge in a less volatile fashion with performance improvements that are approximately monotonic. Our analysis of the loss surface shows that these techniques also produce minima that are deeper and wider than those found by SGD.

URL

http://arxiv.org/abs/1905.05774

PDF

http://arxiv.org/pdf/1905.05774
Read All
Learnable Triangulation of Human Pose

2019-05-14

Karim Iskakov, Egor Burkov, Victor Lempitsky, Yury Malkov

arXiv_AI

arXiv_AI Pose_Estimation
Abstract

We present two novel solutions for multi-view 3D human pose estimation based on new learnable triangulation methods that combine 3D information from multiple 2D views. The first (baseline) solution is a basic differentiable algebraic triangulation with an addition of confidence weights estimated from the input images. The second solution is based on a novel method of volumetric aggregation from intermediate 2D backbone feature maps. The aggregated volume is then refined via 3D convolutions that produce final 3D joint heatmaps and allow modelling a human pose prior. Crucially, both approaches are end-to-end differentiable, which allows us to directly optimize the target metric. We demonstrate transferability of the solutions across datasets and considerably improve the multi-view state of the art on the Human3.6M dataset. Video demonstration, annotations and additional materials will be posted on our project page (this https URL).

URL

https://arxiv.org/abs/1905.05754

PDF

https://arxiv.org/pdf/1905.05754
Read All
A classical-quantum hybrid oracle architecture for Boolean oracle identification in the noisy intermediate-scale quantum era

2019-05-14

Wooyeong Song, Marcin Wieśniak, Nana Liu, Marcin Pawłowski, Jinhyoung Lee, Jaewan Kim, Jeongho Bang

arXiv_CV

arXiv_CV Embedding
Abstract

Quantum algorithms have the potential to be very powerful. However, to exploit quantum parallelism, some quantum algorithms require an embedding of large classical data into quantum states. This embedding can cost a lot of resources, for instance by implementing quantum random-access memory (QRAM). An important instance of this is in quantum-enhanced machine learning algorithms. We propose a new way of circumventing this requirement by using a classical-quantum hybrid architecture where the input data can remain classical, which differs from other hybrid models. We apply this to a fundamental computational problem called Boolean oracle identification, which offers a useful primitive for quantum machine learning algorithms. Its aim is to identify an unknown oracle amongst a list of candidates while minimising the number of queries to the oracle. In our scheme, we replace the classical oracle with our hybrid oracle. We demonstrate both theoretically and numerically that the success rates of the oracle query can be improved in the presence of noise, and that the scheme also enables us to explore a larger search space. This also makes the model suitable for realisation in the current era of noisy intermediate-scale quantum (NISQ) devices. Furthermore, we show that our scheme can lead to a reduction in the learning sample complexity. This means that for certain sizes of learning samples, our classical-quantum hybrid learner can complete the learning task faithfully whereas a classical learner cannot.

URL

https://arxiv.org/abs/1905.05751

PDF

https://arxiv.org/pdf/1905.05751
Read All
DeepFlow: History Matching in the Space of Deep Generative Models

2019-05-14

Lukas Mosser, Olivier Dubrule, Martin J. Blunt

arXiv_CV

arXiv_CV Adversarial Face Gradient_Descent
Abstract

The calibration of a reservoir model with observed transient data of fluid pressures and rates is a key task in obtaining a predictive model of the flow and transport behaviour of the earth’s subsurface. The model calibration task, commonly referred to as “history matching”, can be formalised as an ill-posed inverse problem where we aim to find the underlying spatial distribution of petrophysical properties that explain the observed dynamic data. We use a generative adversarial network pretrained on geostatistical object-based models to represent the distribution of rock properties for a synthetic model of a hydrocarbon reservoir. The dynamic behaviour of the reservoir fluids is modelled using a transient two-phase incompressible Darcy formulation. We invert for the underlying reservoir properties by first modeling property distributions using the pre-trained generative model then using the adjoint equations of the forward problem to perform gradient descent on the latent variables that control the output of the generative model. In addition to the dynamic observation data, we include well rock-type constraints by introducing an additional objective function. Our contribution shows that for a synthetic test case, we are able to obtain solutions to the inverse problem by optimising in the latent variable space of a deep generative model, given a set of transient observations of a non-linear forward problem.

URL

https://arxiv.org/abs/1905.05749

PDF

https://arxiv.org/pdf/1905.05749
Read All
Graph Convolutional Gaussian Processes

2019-05-14

Ian Walker, Ben Glocker

arXiv_CV

arXiv_CV CNN Relation
Abstract

We propose a novel Bayesian nonparametric method to learn translation-invariant relationships on non-Euclidean domains. The resulting graph convolutional Gaussian processes can be applied to problems in machine learning for which the input observations are functions with domains on general graphs. The structure of these models allows for high dimensional inputs while retaining expressibility, as is the case with convolutional neural networks. We present applications of graph convolutional Gaussian processes to images and triangular meshes, demonstrating their versatility and effectiveness, comparing favorably to existing methods, despite being relatively simple models.

URL

https://arxiv.org/abs/1905.05739

PDF

https://arxiv.org/pdf/1905.05739
Read All
Multi-step Retriever-Reader Interaction for Scalable Open-domain Question Answering

2019-05-14

Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum

arXiv_CL

arXiv_CL QA
Abstract

This paper introduces a new framework for open-domain question answering in which the retriever and the reader iteratively interact with each other. The framework is agnostic to the architecture of the machine reading model, only requiring access to the token-level hidden representations of the reader. The retriever uses fast nearest neighbor search to scale to corpora containing millions of paragraphs. A gated recurrent unit updates the query at each step conditioned on the state of the reader and the reformulated query is used to re-rank the paragraphs by the retriever. We conduct analysis and show that iterative interaction helps in retrieving informative paragraphs from the corpus. Finally, we show that our multi-step-reasoning framework brings consistent improvement when applied to two widely used reader architectures DrQA and BiDAF on various large open-domain datasets — TriviaQA-unfiltered, QuasarT, SearchQA, and SQuAD-Open.

URL

https://arxiv.org/abs/1905.05733

PDF

https://arxiv.org/pdf/1905.05733
Read All
Learning to Groove with Inverse Sequence Transformations

2019-05-14

Jon Gillick, Adam Roberts, Jesse Engel, Douglas Eck, David Bamman

arXiv_SD

arXiv_SD Adversarial GAN
Abstract

We explore models for translating abstract musical ideas (scores, rhythms) into expressive performances using Seq2Seq and recurrent Variational Information Bottleneck (VIB) models. Though Seq2Seq models usually require painstakingly aligned corpora, we show that it is possible to adapt an approach from the Generative Adversarial Network (GAN) literature (e.g. Pix2Pix (Isola et al., 2017) and Vid2Vid (Wang et al. 2018a)) to sequences, creating large volumes of paired data by performing simple transformations and training generative models to plausibly invert these transformations. Music, and drumming in particular, provides a strong test case for this approach because many common transformations (quantization, removing voices) have clear semantics, and models for learning to invert them have real-world applications. Focusing on the case of drum set players, we create and release a new dataset for this purpose, containing over 13 hours of recordings by professional drummers aligned with fine-grained timing and dynamics information. We also explore some of the creative potential of these models, including demonstrating improvements on state-of-the-art methods for Humanization (instantiating a performance from a musical score).
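The paired-data idea can be illustrated with quantization, one of the transformations named in the abstract. In this minimal sketch a recorded performance is a list of (onset in beats, pitch, velocity) tuples; snapping onsets to a grid and discarding velocity yields a score-like input, and the original performance becomes the target a model would learn to recover. The note representation, the grid size, and the names `quantize` and `make_pair` are assumptions for illustration and are not taken from the released dataset.

```python
def quantize(notes, grid=0.25):
    """Snap each onset to the nearest grid position and drop velocity.

    notes: list of (onset_in_beats, pitch, velocity); grid=0.25 is a 16th note in 4/4.
    """
    return [(round(onset / grid) * grid, pitch) for onset, pitch, _ in notes]


def make_pair(performance, grid=0.25):
    """Build one (input, target) training pair from a single expressive performance."""
    return quantize(performance, grid), performance


# Made-up drum hits: slightly early/late onsets with varying velocity.
performance = [(0.02, 36, 100), (0.27, 42, 60), (0.98, 38, 110)]
score_like, target = make_pair(performance)
print(score_like)
print(target)
```

Since the transformation is cheap and deterministic, every recorded performance yields an aligned pair automatically, which is what removes the need for a hand-aligned corpus.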

URL

http://arxiv.org/abs/1905.06118

PDF

http://arxiv.org/pdf/1905.06118
Read All
Successor Options: An Option Discovery Framework for Reinforcement Learning

2019-05-14

Rahul Ramesh, Manan Tomar, Balaraman Ravindran

arXiv_AI

arXiv_AI Reinforcement_Learning
Abstract

The options framework in reinforcement learning models the notion of a skill or a temporally extended sequence of actions. The discovery of a reusable set of skills has typically entailed building options that navigate to bottleneck states. This work adopts a complementary approach, where we attempt to discover options that navigate to landmark states. These states are prototypical representatives of well-connected regions and can hence access the associated region with relative ease. In this work, we propose Successor Options, which leverages Successor Representations to build a model of the state space. The intra-option policies are learnt using a novel pseudo-reward and the model scales to high-dimensional spaces easily. We also propose an Incremental Successor Options model that iterates between constructing Successor Representations and building options, which is useful when robust Successor Representations cannot be built solely from primitive actions. We demonstrate the efficacy of our approach on a collection of grid-worlds, and on the high-dimensional robotic control environment of Fetch.

URL

https://arxiv.org/abs/1905.05731

PDF

https://arxiv.org/pdf/1905.05731
Read All
DisSent: Sentence Representation Learning from Explicit Discourse Relations

2019-05-14

Allen Nie, Erin D. Bennett, Noah D. Goodman

arXiv_AI

arXiv_AI Embedding Represenation_Learning RNN Prediction Relation
Abstract

Learning effective representations of sentences is one of the core missions of natural language understanding. Existing models either train on a vast amount of text, or require costly, manually curated sentence relation datasets. We show that with dependency parsing and rule-based rubrics, we can curate a high quality sentence relation task by leveraging explicit discourse relations. We show that our curated dataset provides an excellent signal for learning vector representations of sentence meaning, representing relations that can only be determined when the meanings of two sentences are combined. We demonstrate that the automatically curated corpus allows a bidirectional LSTM sentence encoder to yield high quality sentence embeddings and can serve as a supervised fine-tuning dataset for larger models such as BERT. We evaluate our sentence embeddings on a variety of transfer tasks, including SentEval. We achieve state-of-the-art results on the Penn Discourse Treebank implicit relation prediction task.

URL

http://arxiv.org/abs/1710.04334

PDF

http://arxiv.org/pdf/1710.04334
Read All
Deep Neural Architecture Search with Deep Graph Bayesian Optimization

2019-05-14

Lizheng Ma, Jiaxu Cui, Bo Yang

arXiv_CV

arXiv_CV NAS Optimization
Abstract

Bayesian optimization (BO) is an effective method of finding the global optima of black-box functions. Recently BO has been applied to neural architecture search and shows better performance than pure evolutionary strategies. All these methods adopt Gaussian processes (GPs) as the surrogate function, with handcrafted similarity metrics as input. In this work, we propose a Bayesian graph neural network as a new surrogate, which can automatically extract features from deep neural architectures, and use such learned features to fit and characterize black-box objectives and their uncertainty. Based on the new surrogate, we then develop a graph Bayesian optimization framework to address the challenging task of deep neural architecture search. Experimental results show our method significantly outperforms the comparative methods on benchmark tasks.

URL

https://arxiv.org/abs/1905.06159

PDF

https://arxiv.org/pdf/1905.06159
Read All
Timeline-based Planning and Execution with Uncertainty: Theory, Modeling Methodologies and Practice

2019-05-14

Alessandro Umbrico

arXiv_AI

arXiv_AI
Abstract

Automated Planning has been one of the main research fields of Artificial Intelligence since its beginnings. Research in Automated Planning aims at developing general reasoners (i.e., planners) capable of automatically solving complex problems. Broadly speaking, planners rely on a general model characterizing the possible states of the world and the actions that can be performed in order to change the status of the world. Given a model and an initial known state, the objective of a planner is to synthesize a set of actions needed to achieve a particular goal state. The classical approach to planning roughly corresponds to the description given above. The timeline-based approach is a particular planning paradigm capable of integrating causal and temporal reasoning within a unified solving process. This approach has been successfully applied in many real-world scenarios although a common interpretation of the related planning concepts is missing. Indeed, there are significant differences among the existing frameworks that apply this technique. Each framework relies on its own interpretation of timeline-based planning and therefore it is not easy to compare these systems. Thus, the objective of this work is to investigate the timeline-based approach to planning by addressing several aspects ranging from the semantics of the related planning concepts to the modeling and solving techniques. Specifically, the main contributions of this PhD work consist of: (i) the proposal of a formal characterization of the timeline-based approach capable of dealing with temporal uncertainty; (ii) the proposal of a hierarchical modeling and solving approach; (iii) the development of a general purpose framework for planning and execution with timelines; (iv) the validation of this approach in real-world manufacturing scenarios.

URL

https://arxiv.org/abs/1905.05713

PDF

https://arxiv.org/pdf/1905.05713
Read All
DAS3H: Modeling Student Learning and Forgetting for Optimally Scheduling Distributed Practice of Skills

2019-05-14

Benoît Choffin, Fabrice Popineau, Yolaine Bourda, Jill-Jênn Vie

arXiv_AI

arXiv_AI Knowledge Relation
Abstract

Spaced repetition is among the most studied learning strategies in the cognitive science literature. It consists of temporally distributing exposure to a piece of information so as to improve long-term memorization. Providing students with an adaptive and personalized distributed practice schedule would benefit them more than a generic scheduler. However, the applicability of such adaptive schedulers seems to be limited to pure memorization, e.g. flashcards or foreign language learning. In this article, we first frame the research problem of optimizing an adaptive and personalized spaced repetition scheduler when memorization concerns the application of multiple underlying skills. To this end, we choose to rely on a student model for inferring knowledge state and memory dynamics on any skill or combination of skills. We argue that no knowledge tracing model takes both memory decay and multiple skill tagging into account for predicting student performance. As a consequence, we propose a new student learning and forgetting model suited to our research problem: DAS3H builds on the additive factor models and includes a representation of the temporal distribution of past practice on the skills involved by an item. In particular, DAS3H allows the learning and forgetting curves to differ from one skill to another. Finally, we provide empirical evidence on three real-world educational datasets that DAS3H outperforms other state-of-the-art EDM models. These results suggest that incorporating both item-skill relationships and the forgetting effect improves over student models that consider one or the other.

URL

http://arxiv.org/abs/1905.06873

PDF

http://arxiv.org/pdf/1905.06873
Read All
Trajectory-Based Off-Policy Deep Reinforcement Learning

2019-05-14

Andreas Doerr, Michael Volpp, Marc Toussaint, Sebastian Trimpe, Christian Daniel

arXiv_AI

arXiv_AI Reinforcement_Learning Optimization Gradient_Descent
Abstract

Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks. However, these methods are also data-inefficient, afflicted with high variance gradient estimates, and frequently get stuck in local optima. This work addresses these weaknesses by combining recent improvements in the reuse of off-policy data and exploration in parameter space with deterministic behavioral policies. The resulting objective is amenable to standard neural network optimization strategies like stochastic gradient descent or stochastic gradient Hamiltonian Monte Carlo. Incorporation of previous rollouts via importance sampling greatly improves data-efficiency, whilst stochastic optimization schemes facilitate the escape from local optima. We evaluate the proposed approach on a series of continuous control benchmark tasks. The results show that the proposed algorithm is able to successfully and reliably learn solutions using fewer system interactions than standard policy gradient methods.

URL

https://arxiv.org/abs/1905.05710

PDF

https://arxiv.org/pdf/1905.05710
Read All
Sparse Sequence-to-Sequence Models

2019-05-14

Ben Peters, Vlad Niculae, André F.T. Martins

arXiv_CL

arXiv_CL Sparse Attention
Abstract

Sequence-to-sequence models are a powerful workhorse of NLP. Most variants employ a softmax transformation in both their attention mechanism and output layer, leading to dense alignments and strictly positive output probabilities. This density is wasteful, making models less interpretable and assigning probability mass to many implausible outputs. In this paper, we propose sparse sequence-to-sequence models, rooted in a new family of $\alpha$-entmax transformations, which includes softmax and sparsemax as particular cases, and is sparse for any $\alpha > 1$. We provide fast algorithms to evaluate these transformations and their gradients, which scale well for large vocabulary sizes. Our models are able to produce sparse alignments and to assign nonzero probability to a short list of plausible outputs, sometimes rendering beam search exact. Experiments on morphological inflection and machine translation reveal consistent gains over dense models.
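The contrast between dense and sparse outputs can be seen with the two special cases the abstract names: softmax ($\alpha = 1$) and sparsemax ($\alpha = 2$). The sketch below implements sparsemax with the standard sort-and-threshold procedure; it is not the paper's general $\alpha$-entmax algorithm, and the function names are chosen here for illustration.

```python
import math


def softmax(z):
    """Dense mapping: every output is strictly positive."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(z):
    """Euclidean projection of z onto the probability simplex (can return exact zeros)."""
    zs = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, v in enumerate(zs, start=1):
        cumsum += v
        if 1.0 + k * v > cumsum:
            tau = (cumsum - 1.0) / k   # threshold for a support of size k
        else:
            break                      # the condition fails for all larger k
    return [max(v - tau, 0.0) for v in z]


scores = [1.0, 0.5, -1.0]
print(softmax(scores))    # all three entries are nonzero
print(sparsemax(scores))  # the lowest-scoring entry gets exactly zero
```

With an output layer that assigns exactly zero probability to most of the vocabulary, the set of hypotheses with nonzero probability can become small enough to enumerate, which is the sense in which beam search can sometimes become exact.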

URL

https://arxiv.org/abs/1905.05702

PDF

https://arxiv.org/pdf/1905.05702
Read All
The Lower The Simpler: Simplifying Hierarchical Recurrent Models

2019-05-14

Chao Wang, Hui Jiang

arXiv_AI

arXiv_AI
Abstract

To improve the training efficiency of hierarchical recurrent models without compromising their performance, we propose a strategy named “the lower the simpler”, which is to simplify the baseline models by making the lower layers simpler than the upper layers. We carry out this strategy to simplify two typical hierarchical recurrent models, namely Hierarchical Recurrent Encoder-Decoder (HRED) and R-NET, whose basic building block is GRU. Specifically, we propose Scalar Gated Unit (SGU), which is a simplified variant of GRU, and use it to replace the GRUs at the middle layers of HRED and R-NET. We also use Fixed-size Ordinally-Forgetting Encoding (FOFE), which is an efficient encoding method without any trainable parameter, to replace the GRUs at the bottom layers of HRED and R-NET. The experimental results show that the simplified HRED and the simplified R-NET contain significantly fewer trainable parameters, consume significantly less training time, and achieve slightly better performance than their baseline models.
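The parameter-free nature of FOFE is easy to see in code. The abstract does not give the formula, so this sketch assumes the usual FOFE recursion $z_t = \alpha z_{t-1} + e_t$, where $e_t$ is the one-hot vector of the $t$-th token and $\alpha \in (0, 1)$ is a fixed forgetting factor. The function name `fofe` and the toy vocabulary are illustrative.

```python
def fofe(token_ids, vocab_size, alpha=0.5):
    """Encode a variable-length token sequence into one fixed-size vector.

    Assumed recursion: z_t = alpha * z_{t-1} + one_hot(token_t); nothing is trained.
    """
    z = [0.0] * vocab_size
    for t in token_ids:
        z = [alpha * v for v in z]   # older tokens decay geometrically
        z[t] += 1.0                  # the current token enters with weight 1
    return z


# Toy vocabulary {A: 0, B: 1, C: 2}; encode the sequence A B C B C.
print(fofe([0, 1, 2, 1, 2], vocab_size=3, alpha=0.5))
```

The only quantity involved is the constant `alpha`, so replacing a bottom-layer GRU with an encoding of this kind removes that layer's trainable weights entirely.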

URL

http://arxiv.org/abs/1809.02790

PDF

http://arxiv.org/pdf/1809.02790
A Unified Linear-Time Framework for Sentence-Level Discourse Parsing

2019-05-14

Xiang Lin, Shafiq Joty, Prathyusha Jwalapuram, Saiful Bari

arXiv_AI

arXiv_AI Relation
Abstract

We propose an efficient neural framework for sentence-level discourse analysis in accordance with Rhetorical Structure Theory (RST). Our framework comprises a discourse segmenter to identify the elementary discourse units (EDUs) in a text, and a discourse parser that constructs a discourse tree in a top-down fashion. Both the segmenter and the parser are based on Pointer Networks and operate in linear time. Our segmenter yields an $F_1$ score of 95.4, and our parser achieves an $F_1$ score of 81.7 on the aggregated labeled (relation) metric, surpassing previous approaches by a good margin and approaching human agreement on both tasks (98.3 and 83.0 $F_1$, respectively).

URL

https://arxiv.org/abs/1905.05682

PDF

https://arxiv.org/pdf/1905.05682
Sense Vocabulary Compression through the Semantic Knowledge of WordNet for Neural Word Sense Disambiguation

2019-05-14

Loïc Vial, Benjamin Lecouteux, Didier Schwab

arXiv_CL

arXiv_CL Knowledge Relation
Abstract

In this article, we tackle the issue of the limited quantity of manually sense-annotated corpora for the task of word sense disambiguation (WSD) by exploiting the semantic relationships between senses, such as synonymy, hypernymy and hyponymy, in order to compress the sense vocabulary of Princeton WordNet and thus reduce the number of different sense tags that must be observed to disambiguate all words of the lexical database. We propose two different methods that greatly reduce the size of neural WSD models, with the benefit of improving their coverage without additional training data and without impacting their precision. In addition to our methods, we present a new WSD system which relies on pre-trained BERT word vectors in order to achieve results that significantly outperform the state of the art on all WSD evaluation tasks.

URL

https://arxiv.org/abs/1905.05677

PDF

https://arxiv.org/pdf/1905.05677