Announcement_1

New preprint on arXiv: Decision Points RL (DPRL), which identifies “diffs” to the behavior policy in batch RL settings and achieves provably high-confidence improvement.