RL | A Quick Read of IJCAI 2025 Reinforcement Learning Papers: What Is Worth Digging Into?

Paper list

359 Multi-granularity Knowledge Transfer for Continual Reinforcement Learning
769 BILE: An Effective Behavior-based Latent Exploration Scheme for Deep Reinforcement Learning
908 Imagination-Limited Q-Learning for Offline Reinforcement Learning
2430 Self-Consistent Model-based Adaptation for Visual Reinforcement Learning
3591 Two-Stage Feature Generation with Transformer and Reinforcement Learning
3621 PNAct: Crafting Backdoor Attacks in Safe Reinforcement Learning
3768 Efficient Diversity-based Experience Replay for Deep Reinforcement Learning
4744 Deduction with Induction: Combining Knowledge Discovery with Reasoning for Interpretable Deep Reinforcement Learning
4997 From End-to-end to Step-by-step: Learning to Abstract via Abductive Reinforcement Learning
5103 Efficient Multi-view Clustering via Reinforcement Contrastive Learning

Table of Contents

Paper list
359 Multi-granularity Knowledge Transfer for Continual Reinforcement Learning
I. Research background and core pain point (The Gap)
II. Motivation and narrative construction (Motivation & Narrative)
III. Reviewer-facing positioning analysis (Positioning Strategy)
IV. Method justification and technical details (Method Justification)
1. Architecture: hierarchical collaboration (HRL Structure)
2. Knowledge-transfer mechanism: policy library and symbolic recipes
3. Robustness guarantee: closed-loop feedback (Closed-Loop Feedback)
769 BILE: An Effective Behavior-based Latent Exploration Scheme for Deep Reinforcement Learning
0. Introducing the \(\pi\)-bisimulation metric
I. Background and challenge: the exploration dilemma in high-dimensional sparse-reward environments
II. Core mechanism: the role and sampling of the latent vector \(\mathbf{z}\)
III. BILE's key technical innovation: robust behavioral metric learning
3.1 Objective of the metric: value diversity
3.2 Robustness mechanism: incorporating prediction error
IV. Comparative analysis of BILE vs. METRA / ETD
Summary of key differences:
908 Imagination-Limited Q-Learning for Offline Reinforcement Learning
I. Introduction: the challenges of offline RL and the shortcomings of existing methods
II. ILQ's narrative core: seeking "reasonable optimism"
III. Core method: imagination-limited Bellman