papers AI Learner
The Github is limit! Click to go to the new site.

Optimizing Sequential Medical Treatments with Auto-Encoding Heuristic Search in POMDPs

2019-05-17
Luchen Li, Matthieu Komorowski, Aldo A. Faisal

Abstract

Health-related data is noisy and stochastic in implying the true physiological states of patients, limiting information contained in single-moment observations for sequential clinical decision making. We model patient-clinician interactions as partially observable Markov decision processes (POMDPs) and optimize sequential treatment based on belief states inferred from history sequence. To facilitate inference, we build a variational generative model and boost state representation with a recurrent neural network (RNN), incorporating an auxiliary loss from sequence auto-encoding. Meanwhile, we optimize a continuous policy of drug levels with an actor-critic method where policy gradients are obtained from a stablized off-policy estimate of advantage function, with the value of belief state backed up by parallel best-first suffix trees. We exploit our methodology in optimizing dosages of vasopressor and intravenous fluid for sepsis patients using a retrospective intensive care dataset and evaluate the learned policy with off-policy policy evaluation (OPPE). The results demonstrate that modelling as POMDPs yields better performance than MDPs, and that incorporating heuristic search improves sample efficiency.

Abstract (translated by Google)
URL

http://arxiv.org/abs/1905.07465

PDF

http://arxiv.org/pdf/1905.07465


Similar Posts

Comments