Announcement_1
New preprint on arXiv: Decision Points RL (DPRL) to identify “diffs” to the behavior policy in batch RL settings. We achieve provably high-confidence improvement.