Off-Policy Evaluation for Action-Dependent Non-Stationary Environments

Chandak, Yash; Shankar, Shiv; Bastian, Nathaniel D.; Castro da Silva, Bruno; Brunskill, Emma; Thomas, Philip

Files

Off-Policy Evaluation for Action-Dependent Non-Stationary Environ.pdf (1.11 MB)

Authors

Chandak, Yash

Shankar, Shiv

Bastian, Nathaniel D.

Castro da Silva, Bruno

Brunskill, Emma

Thomas, Philip

Issue Date

2022

Type

Conference presentations, papers, posters

Keywords

Machine Learning, Artificial Intelligence

Abstract

Methods for sequential decision-making are often built upon a foundational assumption that the underlying decision process is stationary. This limits the application of such methods because real-world problems are often subject to changes due to external factors (passive non-stationarity), changes induced by interactions with the system itself (active non-stationarity), or both (hybrid non-stationarity). In this work, we take the first steps towards the fundamental challenge of on-policy and off-policy evaluation amidst structured changes due to active, passive, or hybrid non-stationarity. Towards this goal, we make a higher-order stationarity assumption such that non-stationarity results in changes over time, but the way changes happen is fixed. We propose OPEN, an algorithm that uses a double application of counterfactual reasoning and a novel importance-weighted instrument-variable regression to obtain a lower-bias and lower-variance estimate of the structure in the changes of a policy’s past performances. Finally, we show promising results on how OPEN can be used to predict future performances for several domains inspired by real-world applications that exhibit non-stationarity.

Citation

Chandak, Yash; Shankar, Shiv; Bastian, Nathaniel D.; Castro da Silva, Bruno; Brunskill, Emma; and Thomas, Philip, "Off-Policy Evaluation for Action-Dependent Non-Stationary Environments" (2022).

Publisher

36th Conference on Neural Information Processing Systems

URI

https://hdl.handle.net/20.500.14216/549

Collections

Army Cyber Institute
