Welcome to AMDS123 Blog!

Recent Papers about CV, CL and SD

Deep Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching

2019-01-24

Tom Zahavy, Shie Mannor

arXiv_AI

arXiv_AI Sentiment Classification
Abstract

We study the neural-linear bandit model for solving sequential decision-making problems with high dimensional side information. Neural-linear bandits leverage the representation power of deep neural networks and combine it with efficient exploration mechanisms, designed for linear contextual bandits, on top of the last hidden layer. Since the representation is being optimized during learning, information regarding exploration with “old” features is lost. Here, we propose the first limited memory neural-linear bandit that is resilient to this phenomenon, which we term catastrophic forgetting. We evaluate our method on a variety of real-world data sets, including regression, classification, and sentiment analysis, and observe that our algorithm is resilient to catastrophic forgetting and achieves superior performance.

URL

http://arxiv.org/abs/1901.08612

PDF

http://arxiv.org/pdf/1901.08612
Multi-stream Network With Temporal Attention For Environmental Sound Classification

2019-01-24

Xinyu Li, Venkata Chebiyyam, Katrin Kirchhoff

arXiv_SD

arXiv_SD Attention CNN Classification
Abstract

Environmental sound classification systems often do not perform robustly across different sound classification tasks and audio signals of varying temporal structures. We introduce a multi-stream convolutional neural network with temporal attention that addresses these problems. The network relies on three input streams consisting of raw audio and spectral features and utilizes a temporal attention function computed from energy changes over time. Training and classification utilize decision fusion and data augmentation techniques that incorporate uncertainty. We evaluate this network on three commonly used data sets for environmental sound and audio scene classification and achieve new state-of-the-art performance without any changes in network architecture or front-end preprocessing, thus demonstrating better generalizability.

URL

http://arxiv.org/abs/1901.08608

PDF

http://arxiv.org/pdf/1901.08608
Forecasting Transformative AI: An Expert Survey

2019-01-24

Ross Gruetzemacher, David Paradice, Kang Bok Lee

arXiv_AI

arXiv_AI Survey
Abstract

Transformative AI technologies have the potential to reshape critical aspects of society in the near future. However, in order to properly prepare policy initiatives for the arrival of such technologies, accurate forecasts and timelines are necessary. A survey was administered to attendees of three AI conferences during the summer of 2018 (ICML, IJCAI and the HLAI conference). The survey included questions for estimating AI capabilities over the next decade, questions for forecasting five scenarios of transformative AI and questions concerning the impact of computational resources in AI research. Respondents indicated a median of 21.5% of human tasks (i.e., all tasks that humans are currently paid to do) can be feasibly automated now, and that this figure would rise to 40% in 5 years and 60% in 10 years. Median forecasts indicated a 50% probability of AI systems being capable of automating 90% of current human tasks in 25 years and 99% of current human tasks in 50 years. The conference of attendance was found to have a statistically significant impact on all forecasts, with attendees of HLAI providing more optimistic timelines with less uncertainty. These findings suggest that AI experts expect major advances in AI technology to continue over the next decade to a degree that will likely have profound transformative impacts on society.

URL

http://arxiv.org/abs/1901.08579

PDF

http://arxiv.org/pdf/1901.08579
F1/10: An Open-Source Autonomous Cyber-Physical Platform

2019-01-24

Matthew O'Kelly, Varundev Sukhil, Houssam Abbas, Jack Harkins, Chris Kao, Yash Vardhan Pant, Rahul Mangharam, Dipshil Agarwal, Madhur Behl, Paolo Burgio, Marko Bertogna

arXiv_RO

arXiv_RO
Abstract

In 2005 DARPA labeled the realization of viable autonomous vehicles (AVs) a grand challenge; a short time later the idea became a moonshot that could change the automotive industry. Today, the question of safety stands between reality and solved. Given the right platform the CPS community is poised to offer unique insights. However, testing the limits of safety and performance on real vehicles is costly and hazardous. The use of such vehicles is also outside the reach of most researchers and students. In this paper, we present F1/10: an open-source, affordable, and high-performance 1/10 scale autonomous vehicle testbed. The F1/10 testbed carries a full suite of sensors, perception, planning, control, and networking software stacks that are similar to full scale solutions. We demonstrate key examples of the research enabled by the F1/10 testbed, and how the platform can be used to augment research and education in autonomous systems, making autonomy more accessible.

URL

http://arxiv.org/abs/1901.08567

PDF

http://arxiv.org/pdf/1901.08567
Analysis of cause-effect inference by comparing regression errors

2019-01-24

Patrick Blöbaum, Dominik Janzing, Takashi Washio, Shohei Shimizu, Bernhard Schölkopf

arXiv_AI

arXiv_AI Inference Prediction Relation
Abstract

We address the problem of inferring the causal direction between two variables by comparing the least-squares errors of the predictions in both possible directions. Under the assumption of independence between the function relating cause and effect, the conditional noise distribution, and the distribution of the cause, we show that the errors are smaller in the causal direction if both variables are equally scaled and the causal relation is close to deterministic. Based on this, we provide an easily applicable algorithm that only requires a regression in both possible causal directions and a comparison of the errors. The performance of the algorithm is compared with various related causal inference methods in different artificial and real-world data sets.
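The regress-both-ways-and-compare idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the min-max scaling to [0, 1], the piecewise-constant (binned) regressor, the function names, and the noiseless toy data `y = x**3`.

```python
def min_max_scale(values):
    """Rescale values to [0, 1] so both variables are equally scaled."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def binned_regression_mse(inputs, targets, n_bins=10):
    """Mean squared error of a piecewise-constant regressor.

    Inputs are assumed to lie in [0, 1]. The prediction for a sample is
    the mean target of all samples that fall into the same input bin.
    """
    bins = [min(int(v * n_bins), n_bins - 1) for v in inputs]
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for b, t in zip(bins, targets):
        sums[b] += t
        counts[b] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return sum((t - means[b]) ** 2 for b, t in zip(bins, targets)) / len(targets)


def infer_direction(x, y, n_bins=10):
    """Return the direction with the smaller regression error, plus both errors."""
    xs, ys = min_max_scale(x), min_max_scale(y)
    err_x_to_y = binned_regression_mse(xs, ys, n_bins)
    err_y_to_x = binned_regression_mse(ys, xs, n_bins)
    direction = "X->Y" if err_x_to_y < err_y_to_x else "Y->X"
    return direction, err_x_to_y, err_y_to_x


# Hypothetical toy data: X is the cause, Y = X^3 is a deterministic effect.
x = [i / 999 for i in range(1000)]
y = [v ** 3 for v in x]
direction, e_xy, e_yx = infer_direction(x, y)
print(direction, e_xy, e_yx)
```

On this toy data the error is smaller when predicting Y from X, so the sketch picks X->Y. A real application would swap the binned regressor for a stronger nonparametric regression method.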

URL

http://arxiv.org/abs/1802.06698

PDF

http://arxiv.org/pdf/1802.06698
Simple Fusion: Return of the Language Model

2019-01-24

Felix Stahlberg, James Cross, Veselin Stoyanov

arXiv_CL

arXiv_CL NMT Language_Model Prediction
Abstract

Neural Machine Translation (NMT) typically leverages monolingual data in training through backtranslation. We investigate an alternative simple method to use monolingual data for NMT training: We combine the scores of a pre-trained and fixed language model (LM) with the scores of a translation model (TM) while the TM is trained from scratch. To achieve that, we train the translation model to predict the residual probability of the training data added to the prediction of the LM. This enables the TM to focus its capacity on modeling the source sentence since it can rely on the LM for fluency. We show that our method outperforms previous approaches to integrate LMs into NMT while the architecture is simpler as it does not require gating networks to balance TM and LM. We observe gains of between +0.24 and +2.36 BLEU on all four test sets (English-Turkish, Turkish-English, Estonian-English, Xhosa-English) on top of ensembles without LM. We compare our method with alternative ways to utilize monolingual data such as backtranslation, shallow fusion, and cold fusion.
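As a rough illustration of adding the translation model's output to a fixed language model's prediction, the sketch below combines the two in log space and renormalizes. This is only one plausible reading of the abstract: where the normalization is applied, the three-token vocabulary, and all names and numbers are assumptions for illustration.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def fuse(tm_scores, lm_log_probs):
    """Add normalized TM scores to fixed LM log-probabilities, then renormalize.

    The TM only needs to model the residual on top of the LM: with
    uninformative (uniform) TM scores the result equals the LM distribution.
    """
    combined = [t + l for t, l in zip(log_softmax(tm_scores), lm_log_probs)]
    return log_softmax(combined)


# Hypothetical three-token vocabulary.
lm_log_probs = log_softmax([2.0, 1.0, 0.0])  # fixed, pre-trained LM prefers token 0
tm_scores = [0.0, 0.0, 3.0]                  # TM strongly prefers token 2
fused = fuse(tm_scores, lm_log_probs)
best_token = max(range(len(fused)), key=lambda i: fused[i])
print(best_token, [round(math.exp(v), 3) for v in fused])
```

In this example the TM's preference for token 2 outweighs the LM's preference for token 0, while a TM with uniform scores would leave the LM distribution unchanged.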

URL

https://arxiv.org/abs/1809.00125

PDF

https://arxiv.org/pdf/1809.00125
Distributed Learning of Decentralized Control Policies for Articulated Mobile Robots

2019-01-24

Guillaume Sartoretti, William Paivine, Yunfei Shi, Yue Wu, Howie Choset

arXiv_RO

arXiv_RO Reinforcement_Learning Relation
Abstract

State-of-the-art distributed algorithms for reinforcement learning rely on multiple independent agents, which simultaneously learn in parallel environments while asynchronously updating a common, shared policy. Moreover, decentralized control architectures (e.g., CPGs) can coordinate spatially distributed portions of an articulated robot to achieve system-level objectives. In this work, we investigate the relationship between distributed learning and decentralized control by learning decentralized control policies for the locomotion of articulated robots in challenging environments. To this end, we present an approach that leverages the structure of the asynchronous advantage actor-critic (A3C) algorithm to provide a natural means of learning decentralized control policies on a single articulated robot. Our primary contribution shows individual agents in the A3C algorithm can be defined by independently controlled portions of the robot’s body, thus enabling distributed learning on a single robot for efficient hardware implementation. We present results of closed-loop locomotion in unstructured terrains on a snake and a hexapod robot, using decentralized controllers learned offline and online respectively. Preprint of the paper submitted to the IEEE Transactions in Robotics (T-RO) journal in October 2018, and conditionally accepted for publication as a regular paper in January 2019.

URL

http://arxiv.org/abs/1901.08537

PDF

http://arxiv.org/pdf/1901.08537
Learning Disentangled Representations with Reference-Based Variational Autoencoders

2019-01-24

Adria Ruiz, Oriol Martinez, Xavier Binefa, Jakob Verbeek

arXiv_CV

arXiv_CV
Abstract

Learning disentangled representations from visual data, where different high-level generative factors are independently encoded, is of importance for many computer vision tasks. Solving this problem, however, typically requires explicitly labeling all the factors of interest in training images. To alleviate the annotation cost, we introduce a learning setting which we refer to as “reference-based disentangling”. Given a pool of unlabeled images, the goal is to learn a representation where a set of target factors are disentangled from others. The only supervision comes from an auxiliary “reference set” containing images where the factors of interest are constant. In order to address this problem, we propose reference-based variational autoencoders, a novel deep generative model designed to exploit the weak supervision provided by the reference set. By addressing tasks such as feature learning, conditional image generation or attribute transfer, we validate the ability of the proposed model to learn disentangled representations from this minimal form of supervision.

URL

http://arxiv.org/abs/1901.08534

PDF

http://arxiv.org/pdf/1901.08534
Mixed-Granularity Human-Swarm Interaction

2019-01-24

Jayam Patel, Yicong Xu, Carlo Pinciroli

arXiv_RO

arXiv_RO Face
Abstract

We present an augmented reality human-swarm interface that combines two modalities of interaction: environment-oriented and robot-oriented. The environment-oriented modality allows the user to modify the environment (either virtual or physical) to indicate a goal to attain for the robot swarm. The robot-oriented modality makes it possible to select individual robots to reassign them to other tasks to increase performance or remedy failures. Previous research has concluded that environment-oriented interaction might prove more difficult to grasp for untrained users. In this paper, we report a user study which indicates that, at least in collective transport, environment-oriented interaction is more effective than purely robot-oriented interaction, and that the two combined achieve remarkable efficacy.

URL

http://arxiv.org/abs/1901.08522

PDF

http://arxiv.org/pdf/1901.08522
Squared English Word: A Method of Generating Glyph to Use Super Characters for Sentiment Analysis

2019-01-24

Baohua Sun, Lin Yang, Catherine Chi, Wenhan Zhang, Michael Lin

arXiv_CL

arXiv_CL Sentiment Classification
Abstract

The Super Characters method addresses sentiment analysis problems by first converting the input text into images and then applying 2D-CNN models to classify the sentiment. It achieves state-of-the-art performance on many benchmark datasets. However, it is not as straightforward to apply in Latin languages as in Asian languages. Because the 2D-CNN model is designed to recognize two-dimensional images, it is better if the inputs are in the form of glyphs. In this paper, we propose the SEW (Squared English Word) method, which generates a squared glyph for each English word by drawing Super Characters images of each English word at the alphabet level, combines the squared glyphs into a whole Super Characters image at the sentence level, and then applies the CNN model to classify the sentiment within the sentence. We applied the SEW method to a Wikipedia dataset and obtained a 2.1% accuracy gain compared to the original Super Characters method. In the CL-Aff shared task on the HappyDB dataset, we applied Super Characters with the SEW method and obtained 86.9% accuracy for agency classification and 85.8% for social accuracy classification on the validation set based on an 80%:20% random split of the given labeled dataset.

URL

http://arxiv.org/abs/1902.02160

PDF

http://arxiv.org/pdf/1902.02160
Maximum Entropy Generators for Energy-Based Models

2019-01-24

Rithesh Kumar, Anirudh Goyal, Aaron Courville, Yoshua Bengio

arXiv_AI

arXiv_AI Adversarial GAN Detection
Abstract

Unsupervised learning is about capturing dependencies between variables and is driven by the contrast between the probable vs. improbable configurations of these variables, often either via a generative model that only samples probable ones or with an energy function (unnormalized log-density) that is low for probable ones and high for improbable ones. Here, we consider learning both an energy function and an efficient approximate sampling mechanism. Whereas the discriminator in generative adversarial networks (GANs) learns to separate data and generator samples, introducing an entropy maximization regularizer on the generator can turn the interpretation of the critic into an energy function, which separates the training distribution from everything else, and thus can be used for tasks like anomaly or novelty detection. Then, we show how Markov Chain Monte Carlo can be done in the generator latent space whose samples can be mapped to data space, producing better samples. These samples are used for the negative phase gradient required to estimate the log-likelihood gradient of the data space energy function. To maximize entropy at the output of the generator, we take advantage of recently introduced neural estimators of mutual information. We find that in addition to producing a useful scoring function for anomaly detection, the resulting approach produces sharp samples while covering the modes well, leading to high Inception and Frechet scores.

URL

http://arxiv.org/abs/1901.08508

PDF

http://arxiv.org/pdf/1901.08508
MPC for Humanoid Gait Generation: Stability and Feasibility

2019-01-24

Nicola Scianca, Daniele De Simone, Leonardo Lanari, Giuseppe Oriolo

arXiv_RO

arXiv_RO Review Prediction
Abstract

We present a novel MPC framework for humanoid gait generation which incorporates an explicit stability constraint in the formulation. The proposed method uses as prediction model a dynamically extended LIP where ZMP velocities are the control inputs, producing in real time a gait (including footsteps with the associated timing) that realizes omnidirectional motion commands coming from an external source. The stability constraint links the future ZMP velocities to the current system state so as to guarantee the essential requirement that the generated CoM trajectory is bounded with respect to the ZMP trajectory. Since the control horizon of the MPC algorithm is finite, only part of the future ZMP velocities are decision variables of the MPC problem; the remaining part, called tail, must be either conjectured or anticipated using preview information on the reference motion. Several possible options for the tail are discussed, and each of them is shown to correspond to a specific terminal constraint. The stability and feasibility of the proposed method are analyzed in detail: in particular, a theoretical analysis of the feasibility of the generic MPC iteration is developed and used to obtain sufficient conditions for recursive feasibility and stability. Simulation and experimental results on the NAO and the HRP-4 humanoids are presented to illustrate the performance of the proposed method.
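For readers unfamiliar with the model, the boundedness requirement can be written out using the standard linear inverted pendulum (LIP) relations. The notation below is an assumption chosen for illustration and may differ from the paper's: $x_c$ is the CoM position, $x_z$ the ZMP position, $h_c$ the constant CoM height, $g$ gravity, $u$ the ZMP velocity input, and $x_u$ the unstable (divergent) component.

```latex
\begin{align}
  \ddot{x}_c &= \eta^2 (x_c - x_z), \qquad \eta = \sqrt{g / h_c}, \\
  \dot{x}_z &= u, \\
  x_u &= x_c + \dot{x}_c / \eta, \qquad \dot{x}_u = \eta\,(x_u - x_z), \\
  x_u(t_k) &= \eta \int_{t_k}^{\infty} e^{-\eta(\tau - t_k)}\, x_z(\tau)\, d\tau .
\end{align}
```

The last line is the condition under which $x_u$, and hence the CoM, stays bounded with respect to the ZMP. Splitting the integral at the end of the control horizon separates the part determined by the MPC decision variables from the remainder, which corresponds to the tail described in the abstract.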

URL

http://arxiv.org/abs/1901.08505

PDF

http://arxiv.org/pdf/1901.08505
Feudal Multi-Agent Hierarchies for Cooperative Reinforcement Learning

2019-01-24

Sanjeevan Ahilan, Peter Dayan

arXiv_AI

arXiv_AI GAN Reinforcement_Learning
Abstract

We investigate how reinforcement learning agents can learn to cooperate. Drawing inspiration from human societies, in which successful coordination of many individuals is often facilitated by hierarchical organisation, we introduce Feudal Multi-agent Hierarchies (FMH). In this framework, a ‘manager’ agent, which is tasked with maximising the environmentally-determined reward function, learns to communicate subgoals to multiple, simultaneously-operating, ‘worker’ agents. Workers, which are rewarded for achieving managerial subgoals, take concurrent actions in the world. We outline the structure of FMH and demonstrate its potential for decentralised learning and control. We find that, given an adequate set of subgoals from which to choose, FMH performs, and particularly scales, substantially better than cooperative approaches that use a shared reward function.

URL

http://arxiv.org/abs/1901.08492

PDF

http://arxiv.org/pdf/1901.08492
Separators and Adjustment Sets in Causal Graphs: Complete Criteria and an Algorithmic Framework

2019-01-24

Benito van der Zander (1), Maciej Liśkiewicz (1), Johannes Textor (2) ((1) Institute for Theoretical Computer Science, Universität zu Lübeck, Germany, (2) Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands)

arXiv_AI

arXiv_AI
Abstract

Principled reasoning about the identifiability of causal effects from non-experimental data is an important application of graphical causal models. This paper focuses on effects that are identifiable by covariate adjustment, a commonly used estimation approach. We present an algorithmic framework for efficiently testing, constructing, and enumerating $m$-separators in ancestral graphs (AGs), a class of graphical causal models that can represent uncertainty about the presence of latent confounders. Furthermore, we prove a reduction from causal effect identification by covariate adjustment to $m$-separation in a subgraph for directed acyclic graphs (DAGs) and maximal ancestral graphs (MAGs). Jointly, these results yield constructive criteria that characterize all adjustment sets as well as all minimal and minimum adjustment sets for identification of a desired causal effect with multivariate exposures and outcomes in the presence of latent confounding. Our results extend several existing solutions for special cases of these problems. Our efficient algorithms allowed us to empirically quantify the identifiability gap between covariate adjustment and the do-calculus in random DAGs and MAGs, covering a wide range of scenarios. Implementations of our algorithms are provided in the R package dagitty.

URL

http://arxiv.org/abs/1803.00116

PDF

http://arxiv.org/pdf/1803.00116
Decentralization of Multiagent Policies by Learning What to Communicate

2019-01-24

James Paulos, Steven W. Chen, Daigo Shishika, Vijay Kumar

arXiv_RO

arXiv_RO Optimization
Abstract

Effective communication is required for teams of robots to solve sophisticated collaborative tasks. In practice it is typical for both the encoding and semantics of communication to be manually defined by an expert; this is true regardless of whether the behaviors themselves are bespoke, optimization based, or learned. We present an agent architecture and training methodology using neural networks to learn task-oriented communication semantics based on the example of a communication-unaware expert policy. A perimeter defense game illustrates the system’s ability to handle dynamically changing numbers of agents and its graceful degradation in performance as communication constraints are tightened or the expert’s observability assumptions are broken.

URL

http://arxiv.org/abs/1901.08490

PDF

http://arxiv.org/pdf/1901.08490
Never Forget: Balancing Exploration and Exploitation via Learning Optical Flow

2019-01-24

Hsuan-Kung Yang, Po-Han Chiang, Kuan-Wei Ho, Min-Fong Hong, Chun-Yi Lee

arXiv_AI

arXiv_AI Sparse Reinforcement_Learning Prediction
Abstract

Exploration bonuses derived from the novelty of the states in an environment have become a popular approach to motivate exploration for deep reinforcement learning agents in the past few years. Recent methods such as curiosity-driven exploration usually estimate the novelty of new observations by the prediction errors of their system dynamics models. Due to the capacity limitation of the models and the difficulty of performing next-frame prediction, however, these methods typically fail to balance between exploration and exploitation in high-dimensional observation tasks, resulting in the agents forgetting the visited paths and exploring those states repeatedly. Such inefficient exploration behavior causes significant performance drops, especially in large environments with sparse reward signals. In this paper, we propose to introduce the concept of optical flow estimation from the field of computer vision to deal with the above issue. We propose to employ optical flow estimation errors to examine the novelty of new observations, such that agents are able to memorize and understand the visited states in a more comprehensive fashion. We compare our method against the previous approaches in a number of experiments. Our results indicate that the proposed method appears to deliver superior and longer-lasting performance compared with the previous methods. We further provide a comprehensive set of ablative analyses of the proposed method, and investigate the impact of optical flow estimation on the learning curves of the DRL agents.

URL

http://arxiv.org/abs/1901.08486

PDF

http://arxiv.org/pdf/1901.08486
Evolutionary-Neural Hybrid Agents for Architecture Search

2019-01-24

Krzysztof Maziarz, Andrey Khorlin, Quentin de Laroussilhe, Stanisław Jastrzębski, Mingxing Tan, Andrea Gesmundo

arXiv_CV

arXiv_CV Text_Classification NAS Reinforcement_Learning Image_Classification Classification
Abstract

Neural Architecture Search has recently shown potential to automate the design of Neural Networks. The use of Neural Network agents trained with Reinforcement Learning can offer the possibility to learn complex architectural patterns, as well as the ability to explore a vast and compositional search space. On the other hand, evolutionary algorithms offer the sample efficiency needed for such a resource intensive application. We propose a class of Evolutionary-Neural hybrid agents (Evo-NAS), that retain the qualities of the two approaches. We show that the Evo-NAS agent outperforms both Neural and Evolutionary agents when applied to architecture search for a suite of text classification and image classification benchmarks. On a high-complexity architecture search space for image classification, the Evo-NAS agent surpasses the performance of commonly used agents with only 1/3 of the trials.

URL

https://arxiv.org/abs/1811.09828

PDF

https://arxiv.org/pdf/1811.09828
Truly eccentric. II. When can two circular planets mimic a single eccentric orbit?

2019-01-24

Robert A. Wittenmyer, Christoph Bergmann, Jonathan Horner, Jake Clark, Stephen R. Kane

arXiv_CV

arXiv_CV Sparse
Abstract

When, in the course of searching for exoplanets, sparse sampling and noisy data make it necessary to disentangle possible solutions to the observations, one must consider the possibility that what appears to be a single eccentric Keplerian signal may in reality be attributed to two planets in near-circular orbits. There is precedent in the literature for such outcomes, whereby further data or new analysis techniques reveal hitherto occulted signals. Here, we perform suites of simulations to explore the range of possible two-planet configurations that can result in such confusion. We find that a single Keplerian orbit with $e > 0.5$ can virtually never be mimicked by such deceptive system architectures. This result adds credibility to the most eccentric planets that have been found to date, and suggests that it could well be worth revisiting the catalogue of moderately eccentric ‘confirmed’ exoplanets in the coming years, as more data become available, to determine whether any such deceptive couplets are hidden in the observational data.

URL

https://arxiv.org/abs/1901.08472

PDF

https://arxiv.org/pdf/1901.08472
Communication-Efficient and Decentralized Multi-Task Boosting while Learning the Collaboration Graph

2019-01-24

Valentina Zantedeschi, Aurélien Bellet, Marc Tommasi

arXiv_AI

arXiv_AI Sparse Knowledge Optimization
Abstract

We study the decentralized machine learning scenario where many users collaborate to learn personalized models based on (i) their local datasets and (ii) a similarity graph over the users’ learning tasks. Our approach trains nonlinear classifiers in a multi-task boosting manner without exchanging personal data and with low communication costs. When background knowledge about task similarities is not available, we propose to jointly learn the personalized models and a sparse collaboration graph through an alternating optimization procedure. We analyze the convergence rate, memory consumption and communication complexity of our decentralized algorithms, and demonstrate the benefits of our approach compared to competing techniques on synthetic and real datasets.

URL

http://arxiv.org/abs/1901.08460

PDF

http://arxiv.org/pdf/1901.08460
Reducing Over-confident Errors outside the Known Distribution

2019-01-24

Zhizhong Li, Derek Hoiem

arXiv_CV

arXiv_CV Face Prediction Detection Recognition
Abstract

Intuitively, unfamiliarity should lead to lack of confidence. In reality, current algorithms often make highly confident yet wrong predictions when faced with unexpected test samples from an unknown distribution different from training. Unlike domain adaptation methods, we cannot gather an “unexpected dataset” prior to test, and unlike novelty detection methods, a best-effort original task prediction is still expected. We compare a number of methods from related fields such as calibration and epistemic uncertainty modeling, as well as two proposed methods that reduce overconfident errors of samples from an unknown novel distribution without drastically increasing evaluation time: (1) G-distillation, training an ensemble of classifiers and then distilling it into a single model using both labeled and unlabeled examples, or (2) NCR, reducing prediction confidence based on its novelty detection score. Experimentally, we investigate the overconfidence problem and evaluate our solution by creating “familiar” and “novel” test splits, where “familiar” are identically distributed with training and “novel” are not. We discover that calibrating using temperature scaling on familiar data is the best single-model method for improving novel confidence, followed by our proposed methods. In addition, some methods’ NLL performance is roughly equivalent to a regularly trained model with a certain degree of smoothing. Calibrating can also reduce confident errors, for example, in gender recognition by 95% on demographic groups different from the training data.
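Temperature scaling, the calibration baseline highlighted in the abstract, divides the logits by a scalar T before the softmax, with T chosen to minimize the negative log-likelihood (NLL) on held-out data. The sketch below is a generic illustration rather than the authors' code: the grid search, the candidate temperatures, and the toy logits and labels are all assumptions.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature; temperature > 1 softens the output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_nll(logit_rows, labels, temperature):
    """Average negative log-likelihood of the true labels at a given temperature."""
    losses = []
    for logits, label in zip(logit_rows, labels):
        probs = softmax_with_temperature(logits, temperature)
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels, candidates):
    """Pick the candidate temperature with the lowest held-out NLL (grid search)."""
    return min(candidates, key=lambda t: mean_nll(logit_rows, labels, t))


# Hypothetical held-out set: the model is ~99% confident but only 75% accurate.
held_out_logits = [[5.0, 0.0], [5.0, 0.0], [5.0, 0.0], [5.0, 0.0]]
held_out_labels = [0, 0, 1, 0]
best_t = fit_temperature(held_out_logits, held_out_labels,
                         candidates=[0.5, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

uncalibrated = softmax_with_temperature([5.0, 0.0], 1.0)
calibrated = softmax_with_temperature([5.0, 0.0], best_t)
print(best_t, max(uncalibrated), max(calibrated))
```

Scaling by a single positive T never changes the predicted class, only the confidence attached to it, which is why it can reduce confident errors without affecting accuracy.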

URL

http://arxiv.org/abs/1804.03166

PDF

http://arxiv.org/pdf/1804.03166
Semantic Classification of Tabular Datasets via Character-Level Convolutional Neural Networks

2019-01-24

Paul Azunre, Craig Corcoran, Numa Dhamani, Jeffrey Gleason, Garrett Honke, David Sullivan, Rebecca Ruppel, Sandeep Verma, Jonathon Morgan

arXiv_CL

arXiv_CL CNN Transfer_Learning Inference Classification Prediction
Abstract

A character-level convolutional neural network (CNN) motivated by applications in “automated machine learning” (AutoML) is proposed to semantically classify columns in tabular data. Simulated data containing a set of base classes is first used to learn an initial set of weights. Hand-labeled data from the CKAN repository is then used in a transfer-learning paradigm to adapt the initial weights to a more sophisticated representation of the problem (e.g., including more classes). In doing so, realistic data imperfections are learned and the set of classes handled can be expanded from the base set with reduced labeled data and computing power requirements. Results show the effectiveness and flexibility of this approach in three diverse domains: semantic classification of tabular data, age prediction from social media posts, and email spam classification. In addition to providing further evidence of the effectiveness of transfer learning in natural language processing (NLP), our experiments suggest that analyzing the semantic structure of language at the character level without additional metadata—i.e., network structure, headers, etc.—can produce competitive accuracy for type classification, spam classification, and social media age prediction. We present our open-source toolkit SIMON, an acronym for Semantic Inference for the Modeling of ONtologies, which implements this approach in a user-friendly and scalable/parallelizable fashion.

URL

http://arxiv.org/abs/1901.08456

PDF

http://arxiv.org/pdf/1901.08456
CT synthesis from MR images for orthopedic applications in the lower arm using a conditional generative adversarial network

2019-01-24

Frank Zijlstra, Koen Willemsen, Mateusz C. Florkow, Ralph J.B. Sakkers, Harrie H. Weinans, Bart C.H. van der Wal, Marijn van Stralen, Peter R. Seevinck

arXiv_CV

arXiv_CV Adversarial Segmentation Face Deep_Learning Quantitative
Abstract

Purpose: To assess the feasibility of deep learning-based high resolution synthetic CT generation from MRI scans of the lower arm for orthopedic applications. Methods: A conditional Generative Adversarial Network was trained to synthesize CT images from multi-echo MR images. A training set of MRI and CT scans of 9 ex vivo lower arms was acquired and the CT images were registered to the MRI images. Three-fold cross-validation was applied to generate independent results for the entire dataset. The synthetic CT images were quantitatively evaluated with the mean absolute error metric, and Dice similarity and surface to surface distance on cortical bone segmentations. Results: The mean absolute error was 63.5 HU on the overall tissue volume and 144.2 HU on the cortical bone. The mean Dice similarity of the cortical bone segmentations was 0.86. The average surface to surface distance between bone on real and synthetic CT was 0.48 mm. Qualitatively, the synthetic CT images corresponded well with the real CT scans and partially maintained high resolution structures in the trabecular bone. The bone segmentations on synthetic CT images showed some false positives on tendons, but the general shape of the bone was accurately reconstructed. Conclusions: This study demonstrates that high quality synthetic CT can be generated from MRI scans of the lower arm. The good correspondence of the bone segmentations demonstrates that synthetic CT could be competitive with real CT in applications that depend on such segmentations, such as planning of orthopedic surgery and 3D printing.
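Two of the evaluation metrics named in the abstract have simple textbook definitions: the mean absolute error (MAE) over voxel intensities and the Dice similarity between two binary segmentations. The sketch below uses flat Python lists and made-up toy values. It is not the study's evaluation code, and it omits the surface to surface distance.

```python
def dice_similarity(mask_a, mask_b):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks of equal length."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / (size_a + size_b)


def mean_absolute_error(real_values, synthetic_values):
    """Average absolute difference, e.g. in Hounsfield units (HU) per voxel."""
    diffs = [abs(r - s) for r, s in zip(real_values, synthetic_values)]
    return sum(diffs) / len(diffs)


# Hypothetical toy data: flattened bone masks and HU values for a few voxels.
real_mask = [0, 1, 1, 1, 0, 0]
synthetic_mask = [0, 1, 1, 0, 1, 0]
real_hu = [0.0, 1200.0, 1100.0, 900.0]
synthetic_hu = [50.0, 1100.0, 1150.0, 900.0]

dice = dice_similarity(real_mask, synthetic_mask)   # 2 * 2 / (3 + 3)
mae = mean_absolute_error(real_hu, synthetic_hu)    # (50 + 100 + 50 + 0) / 4
print(dice, mae)
```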

URL

http://arxiv.org/abs/1901.08449

PDF

http://arxiv.org/pdf/1901.08449
Read All
Learning to compress and search visual data in large-scale systems

2019-01-24

Sohrab Ferdowsi

arXiv_CV

arXiv_CV
Abstract

The problem of high-dimensional and large-scale representation of visual data is addressed from an unsupervised learning perspective. The emphasis is put on discrete representations, where the description length can be measured in bits and hence the model capacity can be controlled. The algorithmic infrastructure is developed based on the synthesis and analysis prior models whose rate-distortion properties, as well as capacity vs. sample complexity trade-offs are carefully optimized. These models are then extended to multi-layers, namely the RRQ and the ML-STC frameworks, where the latter is further evolved as a powerful deep neural network architecture with fast and sample-efficient training and discrete representations. For the developed algorithms, three important applications are developed. First, the problem of large-scale similarity search in retrieval systems is addressed, where a double-stage solution is proposed leading to faster query times and shorter database storage. Second, the problem of learned image compression is targeted, where the proposed models can capture more redundancies from the training images than the conventional compression codecs. Finally, the proposed algorithms are used to solve ill-posed inverse problems. In particular, the problems of image denoising and compressive sensing are addressed with promising results.

URL

https://arxiv.org/abs/1901.08437

PDF

https://arxiv.org/pdf/1901.08437
Read All
Securing Tag-based recommender systems against profile injection attacks: A comparative study.

2019-01-24

Georgios K. Pitsilis, Heri Ramampiaro, Helge Langseth

arXiv_CL

arXiv_CL Deep_Learning Recommendation
Abstract

This work addresses the challenges related to attacks on collaborative tagging systems, which often come in the form of malicious annotations or profile injection attacks. In particular, we study various countermeasures against two types of such attacks for social tagging systems, the Overload attack and the Piggyback attack. The countermeasure schemes studied here include baseline classifiers, such as a Naive Bayes filter and a Support Vector Machine, as well as a Deep Learning approach. Our evaluation, performed over synthetic spam data generated from the del.icio.us dataset, shows that in most cases Deep Learning can outperform the classical solutions, providing high-level protection against threats.

URL

http://arxiv.org/abs/1901.08422

PDF

http://arxiv.org/pdf/1901.08422
Read All
A model for a Lindenmayer reconstruction algorithm

2019-01-24

Diego Gabriel Krivochen, Beth Phillips

arXiv_CL

arXiv_CL
Abstract

Given an input string s and a specific Lindenmayer system (the so-called Fibonacci grammar), we define an automaton which is capable of (i) determining whether s belongs to the set of strings that the Fibonacci grammar can generate (in other words, if s corresponds to a generation of the grammar) and, if so, (ii) reconstructing the previous generation.
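The two tasks in the abstract, membership testing and reconstruction of the previous generation, can be sketched as follows. The abstract does not state the rewriting rules, so this sketch assumes the common Fibonacci L-system with rules A -> AB, B -> A and axiom A; the function names are hypothetical and this is not the authors' automaton.

```python
# Assumed rules and axiom (not given in the abstract).
RULES = {"A": "AB", "B": "A"}
AXIOM = "A"


def step(s):
    """Apply the rewriting rules to every symbol in parallel."""
    return "".join(RULES[c] for c in s)


def is_generation(s):
    """True if s is exactly one of the generations derived from the axiom."""
    g = AXIOM
    while len(g) < len(s):
        g = step(g)
    return g == s


def previous_generation(s):
    """Invert one rewriting step: 'AB' came from 'A', a lone 'A' came from 'B'.

    Returns None if s cannot be the image of any string under the rules.
    """
    out = []
    i = 0
    while i < len(s):
        if s.startswith("AB", i):
            out.append("A")
            i += 2
        elif s[i] == "A":
            out.append("B")
            i += 1
        else:
            return None
    return "".join(out)


if __name__ == "__main__":
    s = "ABAAB"
    if is_generation(s):
        print(previous_generation(s))
```

The inverse step works here because every B in a derived string is produced together with the A before it, so the parse is unambiguous under the assumed rules.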

URL

http://arxiv.org/abs/1901.08407

PDF

http://arxiv.org/pdf/1901.08407
Read All
Context Prediction for Unsupervised Deep Learning on Point Clouds

2019-01-24

Jonathan Sauder, Bjarne Sievers

arXiv_CV

arXiv_CV Segmentation Semantic_Segmentation Classification Deep_Learning Prediction Relation
Abstract

Point clouds provide a flexible and natural representation usable in countless applications such as robotics or self-driving cars. Recently, deep neural networks operating on raw point cloud data have shown promising results on supervised learning tasks such as object classification and semantic segmentation. While massive point cloud datasets can be captured using modern scanning technology, manually labelling such large 3D point clouds for supervised learning tasks is a cumbersome process. This necessitates effective unsupervised learning methods that can produce representations such that downstream tasks require significantly fewer annotated samples. We propose a novel method for unsupervised learning on raw point cloud data in which a neural network is trained to predict the spatial relationship between two point cloud segments. While solving this task, representations that capture semantic properties of the point cloud are learned. Our method outperforms previous unsupervised learning approaches in downstream object classification and segmentation tasks and performs on par with fully supervised methods.

URL

http://arxiv.org/abs/1901.08396

PDF

http://arxiv.org/pdf/1901.08396
Read All
Application of Decision Rules for Handling Class Imbalance in Semantic Segmentation

2019-01-24

Robin Chan, Matthias Rottmann, Fabian Hüger, Peter Schlicht, Hanno Gottschalk

arXiv_CV

arXiv_CV Segmentation Semantic_Segmentation Classification Detection
Abstract

As part of autonomous car driving systems, semantic segmentation is an essential component to obtain a full understanding of the car’s environment. One difficulty that occurs while training neural networks for this purpose is class imbalance in the training data. Consequently, a neural network trained on unbalanced data in combination with maximum a-posteriori classification may easily ignore classes that are rare in terms of their frequency in the dataset. However, these classes are often of highest interest. We approach such potential misclassifications by weighting the posterior class probabilities with the prior class probabilities, which in our case are the inverse frequencies of the corresponding classes in the training dataset. More precisely, we adopt a localized method by computing the priors pixel-wise such that the impact can be analyzed at pixel level as well. In our experiments, we train one network from scratch using a proprietary dataset containing 20,000 annotated frames of video sequences recorded from street scenes. The evaluation on our test set shows an increase of average recall with regard to instances of pedestrians and info signs by 25% and 23.4%, respectively. In addition, we significantly reduce the non-detection rate for instances of the same classes by 61% and 38%.
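The decision rule described above can be sketched for a single pixel: instead of picking the class with the largest posterior, each posterior is weighted by the inverse class frequency before taking the maximum. The function names and the example numbers are assumptions for illustration only.

```python
def map_decision(posteriors):
    """Standard maximum a-posteriori rule: pick the most probable class."""
    return max(range(len(posteriors)), key=lambda k: posteriors[k])


def prior_weighted_decision(posteriors, class_frequencies):
    """Weight each posterior by the inverse class frequency, then take the maximum.

    class_frequencies holds the (pixel-wise) training frequency of each class
    and must contain only positive values.
    """
    scores = [p / f for p, f in zip(posteriors, class_frequencies)]
    return max(range(len(scores)), key=lambda k: scores[k])


if __name__ == "__main__":
    # Hypothetical pixel: class 0 = road (frequent), class 1 = pedestrian (rare).
    posteriors = [0.7, 0.3]
    frequencies = [0.95, 0.05]
    print(map_decision(posteriors))
    print(prior_weighted_decision(posteriors, frequencies))
```

With these numbers the plain rule selects the frequent class, while the weighted rule selects the rare one, which is the trade-off toward higher recall on rare classes that the abstract reports.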

URL

http://arxiv.org/abs/1901.08394

PDF

http://arxiv.org/pdf/1901.08394
Read All
Regret Minimisation in Multi-Armed Bandits Using Bounded Arm Memory

2019-01-24

Arghya Roy Chaudhuri, Shivaram Kalyanakrishnan

arXiv_AI

arXiv_AI
Abstract

In this paper, we propose a constant word (RAM model) algorithm for regret minimisation for both finite and infinite Stochastic Multi-Armed Bandit (MAB) instances. Most of the existing regret minimisation algorithms need to remember the statistics of all the arms they encounter. This may become a problem for the cases where the number of available words of memory is limited. Designing an efficient regret minimisation algorithm that uses a constant number of words has long been interesting to the community. Some early attempts consider the number of arms to be infinite, and require the reward distribution of the arms to belong to some particular family. Recently, for finitely many-armed bandits, an explore-then-commit based algorithm (Liau et al., 2018) seems to escape such an assumption. However, due to the underlying PAC-based elimination, their method incurs a high regret. We present a conceptually simple and efficient algorithm that needs to remember statistics of at most $M$ arms, and for any $K$-armed finite bandit instance it enjoys an $O(KM +K^{1.5}\sqrt{T\log (T/MK)}/M)$ upper bound on regret. We extend it to achieve sub-linear quantile-regret (Roy Chaudhuri and Kalyanakrishnan, 2018) and empirically verify the efficiency of our algorithm via experiments.

URL

http://arxiv.org/abs/1901.08387

PDF

http://arxiv.org/pdf/1901.08387
Read All
Using CycleGANs for effectively reducing image variability across OCT devices and improving retinal fluid segmentation

2019-01-24

Philipp Seeböck, David Romo-Bucheli, Sebastian Waldstein, Hrvoje Bogunović, José Ignacio Orlando, Bianca S. Gerendas, Georg Langs, Ursula Schmidt-Erfurth

arXiv_CV

arXiv_CV Segmentation GAN
Abstract

Optical coherence tomography (OCT) has become the most important imaging modality in ophthalmology. A substantial amount of research has recently been devoted to the development of machine learning (ML) models for the identification and quantification of pathological features in OCT images. Among the several sources of variability the ML models have to deal with, a major factor is the acquisition device, which can limit the ML model’s generalizability. In this paper, we propose to reduce the image variability across different OCT devices (Spectralis and Cirrus) by using CycleGAN, an unsupervised unpaired image transformation algorithm. The usefulness of this approach is evaluated in the setting of retinal fluid segmentation, namely intraretinal cystoid fluid (IRC) and subretinal fluid (SRF). First, we train a segmentation model on images acquired with a source OCT device. Then we evaluate the model on (1) source, (2) target and (3) transformed versions of the target OCT images. The presented transformation strategy shows an F1 score of 0.4 (0.51) for IRC (SRF) segmentations. Compared with traditional transformation approaches, this means an F1 score gain of 0.2 (0.12).

URL

http://arxiv.org/abs/1901.08379

PDF

http://arxiv.org/pdf/1901.08379
Read All
3D Backbone Network for 3D Object Detection

2019-01-24

Xuesong Li, Jose E Guivant, Ngaiming Kwok, Yongzhi Xu

arXiv_CV

arXiv_CV Object_Detection Sparse Detection
Abstract

The task of detecting 3D objects in point clouds has a pivotal role in many real-world applications. However, 3D object detection performance is behind that of 2D object detection due to the lack of powerful 3D feature extraction methods. In order to address this issue, we propose to build a 3D backbone network to learn rich 3D feature maps by using sparse 3D CNN operations for 3D object detection in point clouds. The 3D backbone network can inherently learn 3D features from almost raw data without compressing the point cloud into multiple 2D images, and generate rich feature maps for object detection. The sparse 3D CNN takes full advantage of the sparsity in the 3D point cloud to accelerate computation and save memory, which makes the 3D backbone network achievable. Empirical experiments are conducted on the KITTI benchmark and results show that the proposed method can achieve state-of-the-art performance for 3D object detection.

URL

http://arxiv.org/abs/1901.08373

PDF

http://arxiv.org/pdf/1901.08373
Read All
Robust Learning at Noisy Labeled Medical Images: Applied to Skin Lesion Classification

2019-01-24

Cheng Xue, Qi Dou, Xueying Shi, Hao Chen, Pheng Ann Heng

arXiv_CV

arXiv_CV Image_Classification Classification
Abstract

Deep neural networks (DNNs) have achieved great success in a wide variety of medical image analysis tasks. However, these achievements rely heavily on accurately annotated datasets. With noisy-labeled images, the training procedure immediately encounters difficulties, leading to a suboptimal classifier. This problem is even more crucial in the medical field, given that high-quality annotation requires great expertise. In this paper, we propose an effective iterative learning framework for noisy-labeled medical image classification, to combat the lack of high-quality annotated medical data. Specifically, an online uncertainty sample mining method is proposed to eliminate the disturbance from noisy-labeled images. Next, we design a sample re-weighting strategy to preserve the usefulness of correctly-labeled hard samples. Our proposed method is validated on a skin lesion classification task and achieves very promising results.

URL

http://arxiv.org/abs/1901.07759

PDF

http://arxiv.org/pdf/1901.07759
Read All
Deep Reasoning with Multi-scale Context for Salient Object Detection

2019-01-24

Zun Li, Congyan Lang, Yunpeng Chen, Junhao Liew, Jiashi Feng

arXiv_CV

arXiv_CV Salient Object_Detection Attention CNN Inference Detection
Abstract

To detect and segment salient objects accurately, existing methods usually focus on designing complex network architectures to fuse powerful features from the backbone networks. However, they put much less effort into the saliency inference module and only use a few fully convolutional layers to perform saliency reasoning from the fused features. Should feature fusion strategies receive so much attention while saliency reasoning is largely ignored? In this paper, we find that the weakness of the saliency reasoning unit limits salient object detection performance, and claim that saliency reasoning after multi-scale convolutional feature fusion is critical. To verify our findings, we first extract multi-scale features with a fully convolutional network, and then directly reason from these comprehensive features using a deep yet lightweight network, modified from ShuffleNet, to quickly and precisely predict salient objects. Such a simple design is shown to be capable of reasoning from multi-scale saliency features as well as giving superior saliency detection performance with less computation cost. Experimental results show that our simple framework outperforms the best existing method with 2.3% and 3.6% improvement in F-measure scores and a 2.8% reduction in MAE score on the PASCAL-S, DUT-OMRON and SOD datasets, respectively.

URL

http://arxiv.org/abs/1901.08362

PDF

http://arxiv.org/pdf/1901.08362
Read All
Extracting PICO elements from RCT abstracts using 1-2gram analysis and multitask classification

2019-01-24

Xia Yuan, Liao xiaoli, Li Shilei, Shi Qinwen, Wu Jinfa, Li Ke

arXiv_CL

arXiv_CL Classification
Abstract

The core of evidence-based medicine is to read and analyze numerous papers in the medical literature on a specific clinical problem and summarize the authoritative answers to that problem. Currently, to formulate a clear and focused clinical problem, the popular PICO framework is usually adopted, in which each clinical problem is considered to consist of four parts: patient/problem (P), intervention (I), comparison (C) and outcome (O). In this study, we compared several classification models that are commonly used in traditional machine learning. Next, we developed a multitask classification model based on a soft-margin SVM with a specialized feature engineering method that combines 1-2gram analysis with TF-IDF analysis. Finally, we trained and tested several generic models on an open-source data set from BioNLP 2018. The results show that the proposed multitask SVM classification model based on 1-2gram TF-IDF features exhibits the best performance among the tested models.
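The feature engineering step, 1-2gram analysis combined with TF-IDF, can be sketched in plain Python. The whitespace tokenizer, the TF-IDF variant (relative term frequency times log(N/df)), and the function names are assumptions for illustration; the abstract does not specify them, and the SVM classifier itself is not shown.

```python
import math
from collections import Counter


def one_two_grams(text):
    """Return all unigrams and bigrams of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def tfidf_features(documents):
    """Map each document to a dict {gram: tf * idf} over its 1-2grams."""
    grams_per_doc = [Counter(one_two_grams(doc)) for doc in documents]
    n_docs = len(documents)
    doc_freq = Counter()
    for counts in grams_per_doc:
        doc_freq.update(counts.keys())  # count each gram once per document
    features = []
    for counts in grams_per_doc:
        total = sum(counts.values())
        features.append({
            gram: (count / total) * math.log(n_docs / doc_freq[gram])
            for gram, count in counts.items()
        })
    return features


if __name__ == "__main__":
    docs = ["patients received aspirin", "patients received placebo"]
    for feats in tfidf_features(docs):
        print(sorted(feats.items()))
```

Grams shared by every document get weight zero under this variant, so only the distinguishing grams (here the intervention terms) carry signal into the classifier.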

URL

http://arxiv.org/abs/1901.08351

PDF

http://arxiv.org/pdf/1901.08351
Read All
Patch-Based Sparse Representation For Bacterial Detection

2019-01-24

Ahmed Karam Eldaly, Yoann Altmann, Ahsan Akram, Antonios Perperidis, Kevin Dhaliwal, Stephen McLaughlin

arXiv_CV

arXiv_CV Sparse Detection Relation
Abstract

In this paper, we propose an unsupervised approach for bacterial detection in optical endomicroscopy images. This approach splits each image into a set of overlapping patches and assumes that observed intensities are linear combinations of the actual intensity values associated with background image structures, corrupted by additive Gaussian noise and potentially by a sparse outlier term modelling anomalies (which are considered to be candidate bacteria). The actual intensity term representing background structures is modelled as a linear combination of a few atoms drawn from a dictionary which is learned from bacteria-free data and then fixed while analyzing new images. The bacteria detection task is formulated as a minimization problem and an alternating direction method of multipliers (ADMM) is then used to estimate the unknown parameters. Simulations conducted using two ex vivo lung datasets show good detection and correlation performance between bacteria counts identified by a trained clinician and those of the proposed method.
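The first step of the approach, splitting an image into overlapping patches, can be sketched as below. The patch size, the stride, and the function name are hypothetical; the dictionary learning and ADMM stages are not shown.

```python
def overlapping_patches(image, size, stride):
    """Split a 2D image (list of rows) into size x size patches.

    Patches overlap whenever stride < size. Border regions that do not
    fit a full patch are skipped in this simple sketch.
    """
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            patches.append([row[c:c + size] for row in image[r:r + size]])
    return patches


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    for patch in overlapping_patches(image, size=2, stride=1):
        print(patch)
```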

URL

http://arxiv.org/abs/1810.12043

PDF

http://arxiv.org/pdf/1810.12043
Read All
Semantic Matching by Weakly Supervised 2D Point Set Registration

2019-01-24

Zakaria Laskar, Hamed R. Tavakoli, Juho Kannala

arXiv_CV

arXiv_CV Weakly_Supervised CNN
Abstract

In this paper we address the problem of establishing correspondences between different instances of the same object. The problem is posed as finding the geometric transformation that aligns a given image pair. We use a convolutional neural network (CNN) to directly regress the parameters of the transformation model. The alignment problem is defined in the setting where an unordered set of semantic key-points per image is available, but without the correspondence information. To this end we propose a novel loss function based on cyclic consistency that solves this 2D point set registration problem by inferring the optimal geometric transformation model parameters. We train and test our approach on a standard benchmark dataset, Proposal-Flow (PF-PASCAL). The proposed approach achieves state-of-the-art results, demonstrating the effectiveness of the method. In addition, we show that our approach further benefits from additional training samples in PF-PASCAL generated by using category level information.

URL

http://arxiv.org/abs/1901.08341

PDF

http://arxiv.org/pdf/1901.08341
Read All
Semi-Supervised Semantic Matching

2019-01-24

Zakaria Laskar, Juho Kannala

arXiv_CV

arXiv_CV CNN
Abstract

Convolutional neural networks (CNNs) have been successfully applied to solve the problem of correspondence estimation between semantically related images. Due to the non-availability of large training datasets, existing methods resort to self-supervised or unsupervised training paradigms. In this paper we propose a semi-supervised learning framework that imposes a cyclic consistency constraint on unlabeled image pairs. Together with the supervised loss, the proposed model achieves state-of-the-art results on a benchmark semantic matching dataset.

URL

http://arxiv.org/abs/1901.08339

PDF

http://arxiv.org/pdf/1901.08339
Read All
Teaching robots to imitate a human with no on-teacher sensors. What are the key challenges?

2019-01-24

Radoslav Skoviera, Karla Stepanova, Michael Tesar, Gabriela Sejnova, Jiri Sedlar, Michal Vavrecka, Robert Babuska, Josef Sivic

arXiv_RO

arXiv_RO Pose_Estimation Tracking
Abstract

In this paper, we consider the problem of learning object manipulation tasks from human demonstration using RGB or RGB-D cameras. We highlight the key challenges in capturing sufficiently good data with no tracking devices, starting from sensor selection and accurate 6DoF pose estimation to natural language processing. In particular, we focus on two showcases: a gluing task with a glue gun and simple block-stacking with variable blocks. Furthermore, we discuss how a linguistic description of the task could help to improve the accuracy of the task description. We also present the whole architecture of our transfer of the imitated task to the simulated and real robot environment.

URL

http://arxiv.org/abs/1901.08335

PDF

http://arxiv.org/pdf/1901.08335
Read All
A review of sentiment computation methods with R packages

2019-01-24

Maurizio Naldi

arXiv_CL

arXiv_CL Sentiment Review
Abstract

Four R packages for carrying out sentiment analysis are analyzed. All packages allow custom dictionaries to be defined. Just one - Sentiment R - properly accounts for the presence of negators.

URL

http://arxiv.org/abs/1901.08319

PDF

http://arxiv.org/pdf/1901.08319
Read All
Whole slide image registration for the study of tumor heterogeneity

2019-01-24

Leslie Solorzano, Gabriela M. Almeida, Bárbara Mesquita, Diana Martins, Carla Oliveira, Carolina Wählby

arXiv_CV

arXiv_CV
Abstract

Consecutive thin sections of tissue samples make it possible to study local variation in e.g. protein expression and tumor heterogeneity by staining for a new protein in each section. In order to compare and correlate patterns of different proteins, the images have to be registered with high accuracy. The problem we want to solve is registration of gigapixel whole slide images (WSI). This presents 3 challenges: (i) Images are very large; (ii) Thin sections result in artifacts that make global affine registration prone to very large local errors; (iii) Local affine registration is required to preserve correct tissue morphology (local size, shape and texture). In our approach we compare WSI registration based on automatic and manual feature selection on either the full image or natural sub-regions (as opposed to square tiles). Working with natural sub-regions in an interactive tool makes it possible to exclude regions containing scientifically irrelevant information. We also present a new way to visualize local registration quality by a Registration Confidence Map (RCM). With this method, intra-tumor heterogeneity and characteristics of the tumor microenvironment can be observed and quantified.

URL

http://arxiv.org/abs/1901.08317

PDF

http://arxiv.org/pdf/1901.08317
Read All
Deep Learning on Attributed Graphs: A Journey from Graphs to Their Embeddings and Back

2019-01-24

Martin Simonovsky

arXiv_CV

arXiv_CV Image_Caption Embedding Deep_Learning Prediction Relation
Abstract

A graph is a powerful concept for representation of relations between pairs of entities. Data with underlying graph structure can be found across many disciplines and there is a natural desire for understanding such data better. Deep learning (DL) has achieved significant breakthroughs in a variety of machine learning tasks in recent years, especially where data is structured on a grid, such as in text, speech, or image understanding. However, surprisingly little has been done to explore the applicability of DL on arbitrary graph-structured data directly. The goal of this thesis is to investigate architectures for DL on graphs and study how to transfer, adapt or generalize concepts that work well on sequential and image data to this domain. We concentrate on two important primitives: embedding graphs or their nodes into a continuous vector space representation (encoding) and, conversely, generating graphs from such vectors back (decoding). To that end, we make the following contributions. First, we introduce Edge-Conditioned Convolutions (ECC), a convolution-like operation on graphs performed in the spatial domain where filters are dynamically generated based on edge attributes. The method is used to encode graphs with arbitrary and varying structure. Second, we propose SuperPoint Graph, an intermediate point cloud representation with rich edge attributes encoding the contextual relationship between object parts. Based on this representation, ECC is employed to segment large-scale point clouds without major sacrifice in fine details. Third, we present GraphVAE, a graph generator allowing us to decode graphs with variable but upper-bounded number of nodes making use of approximate graph matching for aligning the predictions of an autoencoder with its inputs. The method is applied to the task of molecule generation.

URL

http://arxiv.org/abs/1901.08296

PDF

http://arxiv.org/pdf/1901.08296
Read All
Anomaly Detection in Road Traffic Using Visual Surveillance: A Survey

2019-01-24

Santhosh Kelathodi Kumaran, Debi Prosad Dogra, Partha Pratim Roy

arXiv_CV

arXiv_CV Survey Detection
Abstract

Computer vision has evolved in the last decade as a key technology for numerous applications replacing human supervision. In this paper, we present a survey of relevant visual surveillance research on anomaly detection in public places, focusing primarily on roads. Firstly, we revisit the surveys done in the last 10 years in this field. Since the underlying building block of a typical anomaly detection system is learning, we place more emphasis on learning methods applied to video scenes. We then summarize the important contributions made during the last six years on anomaly detection, primarily focusing on features, underlying techniques, applied scenarios and types of anomalies using a single static camera. Finally, we discuss the challenges in computer vision related anomaly detection techniques and some of the important future possibilities.

URL

http://arxiv.org/abs/1901.08292

PDF

http://arxiv.org/pdf/1901.08292
Read All
Combinational Q-Learning for Dou Di Zhu

2019-01-24

Yang You, Liangwei Li, Baisong Guo, Weiming Wang, Cewu Lu

arXiv_AI

arXiv_AI Adversarial Knowledge Attention Reinforcement_Learning Relation
Abstract

Deep reinforcement learning (DRL) has gained a lot of attention in recent years, and has been proven to be able to play Atari games and Go at or above human levels. However, those games are assumed to have a small fixed number of actions and could be trained with a simple CNN network. In this paper, we study a special class of popular Asian card games called Dou Di Zhu, in which two adversarial groups of agents must consider numerous card combinations at each time step, leading to a huge number of actions. We propose a novel method to handle combinatorial actions, which we call combinational Q-learning (CQL). We employ a two-stage network to reduce the action space and also leverage order-invariant max-pooling operations to extract relationships between primitive actions. Results show that our method prevails over state-of-the-art methods like naive Q-learning and A3C. We develop an easy-to-use card game environment and train all agents adversarially from scratch, with only knowledge of game rules, and verify that our agents are comparable to humans. Our code to reproduce all reported results will be available online.
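The order-invariant max-pooling idea mentioned in the abstract can be shown with a minimal sketch: pooling a set of feature vectors with an element-wise maximum gives the same result regardless of the order in which the primitive actions are listed. The function name and the toy vectors are assumptions; this is not the authors' network.

```python
def max_pool(vectors):
    """Element-wise maximum over a non-empty collection of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    return [max(column) for column in zip(*vectors)]


if __name__ == "__main__":
    # Hypothetical feature vectors for two primitive actions.
    features = [[1.0, 5.0, 2.0], [4.0, 0.0, 3.0]]
    print(max_pool(features))
    print(max_pool(list(reversed(features))))  # same result in any order
```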

URL

http://arxiv.org/abs/1901.08925

PDF

http://arxiv.org/pdf/1901.08925
Read All
Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks

2019-01-24

Charles H. Martin, Michael W. Mahoney

arXiv_CV

arXiv_CV Regularization Relation
Abstract

Given two or more Deep Neural Networks (DNNs) with the same or similar architectures, and trained on the same dataset, but trained with different solvers, parameters, hyper-parameters, regularization, etc., can we predict which DNN will have the best test accuracy, and can we do so without peeking at the test data? In this paper, we show how to use a new Theory of Heavy-Tailed Self-Regularization (HT-SR) to answer this. HT-SR suggests, among other things, that modern DNNs exhibit what we call Heavy-Tailed Mechanistic Universality (HT-MU), meaning that the correlations in the layer weight matrices can be fit to a power law with exponents that lie in common Universality classes from Heavy-Tailed Random Matrix Theory (HT-RMT). From this, we develop a Universal capacity control metric that is a weighted average of these PL exponents. Rather than considering small toy NNs, we examine over 50 different, large-scale pre-trained DNNs, ranging over 15 different architectures, trained on ImageNet, each of which has been reported to have different test accuracies. We show that this new capacity metric correlates very well with the reported test accuracies of these DNNs, looking across each architecture (VGG16/…/VGG19, ResNet10/…/ResNet152, etc.). We also show how to approximate the metric by the more familiar Product Norm capacity measure, as the average of the log Frobenius norm of the layer weight matrices. Our approach requires no changes to the underlying DNN or its loss function, it does not require us to train a model (although it could be used to monitor training), and it does not even require access to the ImageNet data.
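The approximation named at the end of the abstract, the average of the log Frobenius norm of the layer weight matrices, can be sketched in a few lines. The base-10 logarithm, the nested-list matrix format, and the function names are assumptions; the abstract does not fix these details.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of squared entries of a 2D list."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def average_log_norm(weight_matrices):
    """Average of log10 Frobenius norms over a non-empty list of layer matrices."""
    logs = [math.log10(frobenius_norm(w)) for w in weight_matrices]
    return sum(logs) / len(logs)


if __name__ == "__main__":
    # Two toy "layers"; real use would pass the weights of a pre-trained network.
    layers = [[[3.0, 4.0]], [[1.0, 0.0], [0.0, 0.0]]]
    print(average_log_norm(layers))
```

Note that the computation touches only the weights, which matches the abstract's point that no training or test data is needed.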

URL

http://arxiv.org/abs/1901.08278

PDF

http://arxiv.org/pdf/1901.08278
Read All
Federated Reinforcement Learning

2019-01-24

Hankz Hankui Zhuo, Wenfeng Feng, Qian Xu, Qiang Yang, Yufeng Lin

arXiv_AI

arXiv_AI Knowledge Reinforcement_Learning
Abstract

In reinforcement learning, building high-quality policies is challenging when the feature space of states is small and the training data is limited. Directly transferring data or knowledge from one agent to another will not work due to the privacy requirements of data and models. In this paper, we propose a novel reinforcement learning approach that considers the privacy requirement and builds a Q-network for each agent with the help of other agents, namely federated reinforcement learning (FRL). To protect the privacy of data and models, we exploit Gaussian differentials on the information shared with each other when updating their local models. In the experiment, we evaluate our FRL framework in two diverse domains, Grid-world and Text2Action, by comparing it to various baselines.
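One plausible reading of the privacy step is that each agent perturbs its outputs with Gaussian noise before sharing them. The sketch below illustrates only that general idea; the function name, the noise scale, and the choice of Q-values as the shared quantity are assumptions, and the abstract does not give the authors' exact mechanism.

```python
import random


def noisy_share(q_values, sigma, rng=None):
    """Return a copy of q_values with independent Gaussian noise N(0, sigma^2) added.

    The original values stay private; only the perturbed copy is shared.
    """
    rng = rng or random.Random()
    return [q + rng.gauss(0.0, sigma) for q in q_values]


if __name__ == "__main__":
    local_q = [0.2, 1.5, -0.3]           # hypothetical Q-values of one agent
    shared = noisy_share(local_q, sigma=0.1, rng=random.Random(0))
    print(shared)
```

A larger sigma hides more about the local model but also degrades the signal the other agent receives.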

URL

http://arxiv.org/abs/1901.08277

PDF

http://arxiv.org/pdf/1901.08277
Read All
A Novel Self-Intersection Penalty Term for Statistical Body Shape Models and Its Applications in 3D Pose Estimation

2019-01-24

Zaiqiang Wu, Wei Jiang, Hao Luo, Lin Cheng

arXiv_CV

arXiv_CV Face Pose_Estimation Optimization Prediction Quantitative Detection
Abstract

Statistical body shape models are widely used in 3D pose estimation due to their low-dimensional parameter representation. However, it is difficult to avoid self-intersection between body parts accurately. Motivated by this fact, we propose a novel self-intersection penalty term for statistical body shape models applied in 3D pose estimation. To avoid the trouble of computing self-intersection for complex surfaces like body meshes, the gradient of our proposed self-intersection penalty term is manually derived from the perspective of geometry. First, the self-intersection penalty term is defined as the volume of the self-intersection region. To calculate the partial derivatives with respect to the coordinates of the vertices, we employ detection rays to divide the vertices of statistical body shape models into different groups depending on whether the vertex is in the region of self-intersection. Second, the partial derivatives can be easily derived from the normal vectors of the neighboring triangles of the vertices. Finally, this penalty term can be applied in gradient-based optimization algorithms to remove the self-intersection of triangular meshes without using any approximation. Qualitative and quantitative evaluations were conducted to demonstrate the effectiveness and generality of our proposed method compared with previous approaches. The experimental results show that our proposed penalty term can avoid self-intersection to exclude unreasonable predictions, and indirectly improves the accuracy of 3D pose estimation. Furthermore, the proposed method could be employed universally in triangular mesh based 3D reconstruction.

URL

http://arxiv.org/abs/1901.08274

PDF

http://arxiv.org/pdf/1901.08274
Read All
Learning Vector Representation of Content and Matrix Representation of Change: Towards a Representational Model of V1

2019-01-24

Ruiqi Gao, Jianwen Xie, Song-Chun Zhu, Ying Nian Wu

arXiv_CV

arXiv_CV Inference Prediction
Abstract

This paper entertains the hypothesis that the primary purpose of the cells of the primary visual cortex (V1) is to perceive motions and predict changes of local image contents. Specifically, we propose a model that couples the vector representations of local image contents with the matrix representations of local pixel displacements caused by the relative motions between the agent and the surrounding objects and scene. When the image changes from one time frame to the next due to pixel displacements, the vector at each pixel is multiplied by a matrix that represents the displacement of this pixel. We show that by learning from pairs of images that are deformed versions of each other, we can learn both vector and matrix representations. The units in the learned vector representations resemble V1 cells. The learned vector-matrix representations enable prediction of image frames over time and, more importantly, inference of the local pixel displacements caused by relative motions.
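
The coupling described above, where a content vector is multiplied by a displacement-dependent matrix, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the paper learns its matrices from image pairs, whereas this sketch substitutes a fixed 2x2 rotation for the matrix, and the names displacement_matrix, apply, infer_displacement, and the parameter omega are hypothetical.

```python
import math

def displacement_matrix(delta, omega=0.5):
    """Toy stand-in for a learned M(delta): a 2x2 rotation by omega * delta."""
    angle = omega * delta
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def apply(matrix, vector):
    """Matrix-vector product: the content vector after the displacement."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def infer_displacement(v_t, v_next, candidates):
    """Pick the candidate displacement whose matrix best maps v_t onto v_next."""
    def error(delta):
        predicted = apply(displacement_matrix(delta), v_t)
        return sum((p - q) ** 2 for p, q in zip(predicted, v_next))
    return min(candidates, key=error)

v_t = [1.0, 0.0]                               # content vector at time t
v_next = apply(displacement_matrix(2.0), v_t)  # predicted vector after a displacement of 2
estimate = infer_displacement(v_t, v_next, [-2, -1, 0, 1, 2, 3])
```

The same matrices serve both purposes named in the abstract: applying a matrix predicts the next vector, and searching over candidate matrices recovers the displacement. Rotations were chosen here only because two successive displacements compose into their sum.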

Abstract (translated by Google)

URL

http://arxiv.org/abs/1902.03871

PDF

http://arxiv.org/pdf/1902.03871
Read All
FANDA: A Novel Approach to Perform Follow-up Query Analysis

2019-01-24

Qian Liu, Bei Chen, Jian-Guang Lou, Ge Jin, Dongmei Zhang

arXiv_AI

arXiv_AI Attention Weakly_Supervised Face
Abstract

Recent work on Natural Language Interfaces to Databases (NLIDB) has attracted considerable attention. NLIDBs allow users to search databases using natural language instead of SQL-like query languages. While this saves users from having to learn query languages, multi-turn interaction with an NLIDB usually involves multiple queries in which contextual information is vital for understanding the users' query intents. In this paper, we address a typical contextual understanding problem, termed follow-up query analysis. In spite of its ubiquity, follow-up query analysis has not been well studied due to two primary obstacles: the multifarious nature of follow-up query scenarios and the lack of high-quality datasets. Our work summarizes typical follow-up query scenarios and provides a new FollowUp dataset with 1000 query triples on 120 tables. Moreover, we propose a novel approach, FANDA, which takes into account the structures of queries and employs a ranking model with weakly supervised max-margin learning. The experimental results on FollowUp demonstrate the superiority of FANDA over multiple baselines across multiple metrics.
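
For readers unfamiliar with max-margin ranking, the following is a minimal sketch of a generic hinge-style ranking loss. It is not FANDA's actual objective, whose weak-supervision details are not given in the abstract; the function name max_margin_loss, the margin value, and the example scores are all assumptions.

```python
def max_margin_loss(score_pos, scores_neg, margin=1.0):
    """Hinge loss against the highest-scoring negative candidate.

    The loss is zero once the preferred candidate outscores every
    other candidate by at least `margin`.
    """
    return max(0.0, margin - score_pos + max(scores_neg))

# Hypothetical scores assigned by a ranking model to candidate analyses.
loss_violated = max_margin_loss(2.0, [0.5, 1.5])   # margin not yet satisfied
loss_satisfied = max_margin_loss(3.0, [0.5, 1.5])  # margin satisfied
```

Training on such a loss pushes the score of the preferred candidate above the strongest competitor, which is the general idea behind ranking with a margin.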

Abstract (translated by Google)

URL

http://arxiv.org/abs/1901.08259

PDF

http://arxiv.org/pdf/1901.08259
Read All
Generative Adversarial Network with Multi-Branch Discriminator for Cross-Species Image-to-Image Translation

2019-01-24

Ziqiang Zheng, Zhibin Yu, Haiyong Zheng, Yang Wu, Bing Zheng, Ping Lin

arXiv_CV

arXiv_CV Adversarial GAN
Abstract

Current approaches have made great progress on image-to-image translation tasks, benefiting from the success of image synthesis methods, especially generative adversarial networks (GANs). However, existing methods are limited to handling translation tasks between two species while keeping the content matching on the semantic level. A more challenging task would be the translation among more than two species. To explore this new area, we propose a simple yet effective structure of a multi-branch discriminator for enhancing an arbitrary generative adversarial architecture (GAN), named GAN-MBD. It takes advantage of the boosting strategy to break a common discriminator into several smaller ones with fewer parameters, which can enhance the generation and synthesis abilities of GANs efficiently and effectively. Comprehensive experiments show that the proposed multi-branch discriminator can dramatically improve the performance of popular GANs on cross-species image-to-image translation tasks while reducing the number of parameters for computation. The code and some datasets are attached as supplementary materials for reference.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1901.10895

PDF

http://arxiv.org/pdf/1901.10895
Read All
Unsupervised Image-to-Image Translation with Self-Attention Networks

2019-01-24

Taewon Kang, Kwang Hee Lee

arXiv_CV

arXiv_CV Attention GAN Style_Transfer Quantitative
Abstract

Unsupervised image translation aims to learn the transformation from a source domain to a target domain given unpaired training data. Several state-of-the-art works have yielded impressive results in GAN-based unsupervised image-to-image translation. However, they fail to capture strong geometric or structural changes between domains, or are unsatisfactory for complex scenes, compared to texture change tasks such as style transfer. Recently, SAGAN (Han Zhang, 2018) showed that the self-attention network produces better results than the convolution-based GAN. However, the effectiveness of the self-attention network in unsupervised image-to-image translation tasks has not been verified. In this paper, we propose an unsupervised image-to-image translation method with self-attention networks, in which long-range dependency helps not only to capture strong geometric change but also to generate details using cues from all feature locations. In experiments, we qualitatively and quantitatively show the superiority of the proposed method over existing state-of-the-art unsupervised image-to-image translation methods.

Abstract (translated by Google)

URL

http://arxiv.org/abs/1901.08242

PDF

http://arxiv.org/pdf/1901.08242
Read All
Location reference identification from tweets during emergencies: A deep learning approach

2019-01-24

Abhinav Kumar, Jyoti Prakash Singh

arXiv_CL

arXiv_CL CNN Deep_Learning
Abstract

Twitter has recently been used during crises to communicate with officials and provide rescue and relief operations in real time. The geographical locations of the event and of the users are vitally important in such scenarios. Identifying geographic location is challenging because the location information fields, such as the user location and place name of tweets, are not reliable. Extracting location information from tweet text is difficult because it contains a lot of non-standard English, grammatical errors, spelling mistakes, non-standard abbreviations, and so on. This research aims to extract location words used in the tweet using a Convolutional Neural Network (CNN) based model. We achieved an exact matching score of 0.929, a Hamming loss of 0.002, and an F1-score of 0.96 for tweets related to the earthquake. Our model was able to extract even three- to four-word-long location references, which is also evident from the exact matching score of over 92%. The findings of this paper can help in early event localization, emergency situations, real-time road traffic management, localized advertisement, and various location-based services.
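
The three reported metrics can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's evaluation code: each tweet is assumed to be labelled with a binary vector marking which words are location words, the F1 shown is micro-averaged over all word positions, and the function names and toy data are hypothetical.

```python
def exact_match(y_true, y_pred):
    """Fraction of tweets whose whole predicted label vector is correct."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def hamming_loss(y_true, y_pred):
    """Fraction of individual word labels that are predicted incorrectly."""
    wrong = sum(a != b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    total = sum(len(t) for t in y_true)
    return wrong / total

def f1_score(y_true, y_pred):
    """Micro-averaged F1 for the positive (location word) label."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        for a, b in zip(t, p):
            tp += int(a == 1 and b == 1)
            fp += int(a == 0 and b == 1)
            fn += int(a == 1 and b == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Toy labels for three four-word tweets (1 = location word).
y_true = [[0, 1, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
y_pred = [[0, 1, 1, 0], [1, 0, 0, 0], [0, 0, 1, 1]]
```

Exact match is the strictest of the three: a single wrong word makes the whole tweet count as a miss, whereas Hamming loss and F1 give partial credit at the word level.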

Abstract (translated by Google)

URL

http://arxiv.org/abs/1901.08241

PDF

http://arxiv.org/pdf/1901.08241
Read All
