Unsupervised RL Paper List

Predictable MDP Abstraction for Unsupervised Model-Based RL
Abs: Skill learning and curiosity for Model-based RL
https://arxiv.org/abs/2302.03921

Efficient Exploration via State Marginal Matching
Abs: Reframes exploration driven by prediction error as exploration driven by a minimax game. It also introduces a target distribution to guide exploration, so that max-entropy exploration can be turned into goal-conditioned exploration.
https://arxiv.org/abs/1906.05274
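
A minimal sketch of the reward this idea suggests, assuming a small discrete state space: the policy is rewarded with log p*(s) - log q(s), where p* is the target distribution and q is a density model fitted to the states visited so far. The function name `smm_reward`, the smoothing constant `eps`, and the use of an empirical histogram as the density model are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def smm_reward(visit_counts, target_probs, eps=1e-8):
    """Tabular sketch: reward = log p*(s) - log q(s) for every discrete state s."""
    counts = np.asarray(visit_counts, dtype=np.float64)
    target = np.asarray(target_probs, dtype=np.float64)
    # Density model q: smoothed empirical visitation frequencies (an assumption).
    q = (counts + eps) / (counts + eps).sum()
    # Under-visited states (q < p*) get positive reward, over-visited ones negative.
    return np.log(target + eps) - np.log(q)

# Uniform target: the rarely visited state receives the larger reward.
rewards = smm_reward([90, 10], [0.5, 0.5])
```

With a uniform target this reduces to a max-entropy style bonus; a non-uniform `target_probs` steers exploration toward the states the target favors.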

Reinforcement learning with prototypical representations
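Abs: Clusters the explored state space; the exploration policy must maximize the entropy over the k clusters. The algorithm uses SwAV to accelerate representation learning and achieves unsupervised exploration by maximizing the cluster entropy.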
https://arxiv.org/abs/2102.11271
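
To make "entropy over the k clusters" concrete, here is a rough sketch that assigns each state embedding to its nearest prototype and computes the Shannon entropy of the resulting cluster occupancy. Hard nearest-prototype assignment and the function name `cluster_entropy` are simplifying assumptions; this is not the estimator used in the paper.

```python
import numpy as np

def cluster_entropy(states, prototypes):
    """Shannon entropy (in nats) of how states are spread across the prototypes."""
    s = np.asarray(states, dtype=np.float64)
    p = np.asarray(prototypes, dtype=np.float64)
    # Squared distance from every state to every prototype, shape (n_states, k).
    d2 = ((s[:, None, :] - p[None, :, :]) ** 2).sum(axis=-1)
    # Hard assignment: each state goes to its closest prototype.
    assign = d2.argmin(axis=1)
    counts = np.bincount(assign, minlength=p.shape[0])
    probs = counts / counts.sum()
    nz = probs[probs > 0]  # skip empty clusters to avoid log(0)
    return float(-(nz * np.log(nz)).sum())

prototypes = [[0.0, 0.0], [10.0, 10.0]]
narrow = cluster_entropy([[0.1, 0.0], [0.0, 0.2]], prototypes)   # one cluster used
spread = cluster_entropy([[0.1, 0.0], [9.9, 10.0]], prototypes)  # both clusters used
```

A policy that visits only one cluster scores zero, while one that covers all k clusters evenly reaches the maximum of log k.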

Behavior From the Void: Unsupervised Active Pre-Training
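Abs: A contrastive-learning head plus an intrinsic reward design drive unsupervised exploration. The intrinsic reward is defined by the distances between samples within a minibatch: the larger the average distance between samples, the larger the reward.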
https://arxiv.org/abs/2103.04551
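
A minimal sketch of a distance-based intrinsic reward of this kind, assuming each sample is scored by the log of its mean distance to its k nearest neighbors in the minibatch. The k-nearest-neighbor restriction, the log form, and the names `knn_intrinsic_reward`, `k`, and `c` are assumptions made for illustration; in practice the embeddings would come from the contrastive encoder.

```python
import numpy as np

def knn_intrinsic_reward(embeddings, k=3, c=1.0):
    """Per-sample reward: log(c + mean distance to the k nearest neighbors in the batch)."""
    z = np.asarray(embeddings, dtype=np.float64)
    n = z.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < batch size")
    # Pairwise Euclidean distances, shape (n, n).
    diff = z[:, None, :] - z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude each sample's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    # Keep the k smallest distances in every row.
    knn = np.sort(dist, axis=1)[:, :k]
    return np.log(c + knn.mean(axis=1))

batch = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0]]
rewards = knn_intrinsic_reward(batch, k=2)
```

The isolated point at (10, 10) gets the largest reward, which is the behavior the summary describes: samples far from the rest of the batch are treated as novel.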

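Prototypical Context-Aware Dynamics Generalization for High-Dimensional Model-Based Reinforcement Learning
Abs: Uses SwAV to cluster trajectories without supervision and learn the context of the current trajectory, improving the agent's generalization to unseen dynamics.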
https://arxiv.org/abs/2211.12774

DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations
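Abs: Uses SwAV to cluster positive and negative sample pairs of observations, jointly learning dynamics-related temporal features and image features. This improves state abstraction and, in turn, generalization.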
https://arxiv.org/abs/2110.14565

URLB: Unsupervised Reinforcement Learning Benchmark
Abs: An unsupervised RL benchmark built on dm_control.
https://arxiv.org/pdf/2110.15191.pdf

http://mooricanna.github.io/2024/04/21/Unsupervised-RL-Paper-List/
Author: mooricAnna
Published: April 21, 2024