Off-Policy Evaluation for Action-Dependent Non-Stationary Environments
dc.contributor.author | Chandak, Yash | |
dc.contributor.author | Shankar, Shiv | |
dc.contributor.author | Bastian, Nathaniel D. | |
dc.contributor.author | Castro da Silva, Bruno | |
dc.contributor.author | Brunskill, Emma | |
dc.contributor.author | Thomas, Philip | |
dc.date.accessioned | 2023-09-05T19:39:56Z | |
dc.date.available | 2023-09-05T19:39:56Z | |
dc.date.issued | 2022 | |
dc.description.abstract | Methods for sequential decision-making are often built upon a foundational assumption that the underlying decision process is stationary. This limits the application of such methods because real-world problems are often subject to changes due to external factors (passive non-stationarity), changes induced by interactions with the system itself (active non-stationarity), or both (hybrid non-stationarity). In this work, we take the first steps towards the fundamental challenge of on-policy and off-policy evaluation amidst structured changes due to active, passive, or hybrid non-stationarity. Towards this goal, we make a higher-order stationarity assumption such that non-stationarity results in changes over time, but the way changes happen is fixed. We propose OPEN, an algorithm that uses a double application of counterfactual reasoning and a novel importance-weighted instrument-variable regression to obtain both a lower-bias and a lower-variance estimate of the structure in the changes of a policy’s past performances. Finally, we show promising results on how OPEN can be used to predict future performances for several domains inspired by real-world applications that exhibit non-stationarity. | |
dc.description.sponsorship | Army Cyber Institute | |
dc.identifier.citation | Chandak, Yash; Shankar, Shiv; Bastian, Nathaniel D.; Castro da Silva, Bruno; Brunskill, Emma; and Thomas, Philip, "Off-Policy Evaluation for Action-Dependent Non-Stationary Environments" (2022). | |
dc.identifier.other | NA | |
dc.identifier.uri | https://hdl.handle.net/20.500.14216/549 | |
dc.publisher | 36th Conference on Neural Information Processing Systems | |
dc.subject | Machine Learning | |
dc.subject | Artificial Intelligence | |
dc.title | Off-Policy Evaluation for Action-Dependent Non-Stationary Environments | |
dc.type | Conference presentations, papers, posters | |
local.peerReviewed | Yes |
Files
Original bundle
- Name: Off-Policy Evaluation for Action-Dependent Non-Stationary Environ.pdf
- Size: 1.11 MB
- Format: Adobe Portable Document Format