• 00 07-17 (4) Nash equilibrium seeking for a class of quadratic-bilinear Wasserstein distributionally robust games Nash Gleichgewicht Suche nach einer Klasse von quadratisch-bilinearen Wasserstein Verteilung robusten Spielen Nash 均衡, 寻求类二次- 贝里尼奥尔 瓦西斯坦分配强强的游戏 2411.09636v2
  • 01 07-17 Coral Protocol: Open Infrastructure Connecting The Internet of Agents Coral Protocol: Open Infrastructure Connecting Das Internet der Agenten 珊瑚议定书:开放基础设施连接代理物互联网 2505.00749v2
  • 02 07-17 Imitating Mistakes in a Learning Companion AI Agent for Online Peer Learning Nachahmen von Fehlern in einem Learning Companion KI Agent für Online Peer Learning 模拟学习伙伴AI在线同行学习代理的错误 2507.12801v1
  • 03 07-16 (3) Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation Lernpolitik für dynamische Koalitionsbildung in der Multi-Roboter-Task-Allokation 多机器人任务分配中动态联盟形成学习政策 2412.20397v3
  • 04 07-16 NLI4VolVis: Natural Language Interaction for Volume Visualization via LLM Multi-Agents and Editable 3D Gaussian Splatting NLI4VolVis: Natürliche Sprachinteraktion für die Volumenvisualisierung über LLM Multi-Agenten und editierbare 3D Gaussian Splatting NLI4VolVis:通过LLM多代理商和可编辑的 3D Gaussian Splatting 进行卷量可视化的自然语言互动 2507.12621v1
  • 05 07-16 Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies Mixed-Reality Digital Twins: Nutzung der physischen und virtuellen Welten für Hybrid Sim2Real Transition von Multi-Agent Verstärkungs-Learning-Politiken 混合-现实数字双对:利用物理和虚拟世界促进混合的Sim2重新过渡多机构强化学习政策 2403.10996v6
  • 06 07-16 Modeling Feasible Locomotion of Nanobots for Cancer Detection and Treatment Modellierung einer Machbarkeitslokomotion von Nanobots für Krebserkennung und -behandlung 用于癌症检测和治疗的纳米机器人的可行易燃活动模型 2507.12400v1
  • 07 07-16 From Semantic Web and MAS to Agentic AI: A Unified Narrative of the Web of Agents Von Semantic Web und MAS zu Agentic AI: Ein einheitliches Narrativ des Web of Agents 从语义网站和MAS到AA:关于 “ 代理人网络 “ 的统一说明 2507.10644v2
  • 08 07-16 Robot Metabolism: Towards machines that can grow by consuming other machines Robotermetabolismus: Auf dem Weg zu Maschinen, die durch den Verzehr anderer Maschinen wachsen können 机器人新陈代谢:研制能够通过消耗其他机器而成长的机器 2411.11192v2
  • 09 07-16 Programming Distributed Collective Processes in the eXchange Calculus Programmierung verteilter kollektiver Prozesse im eXchange Calculus eXchange Calculus 中的程序编程分配集体进程 2401.11212v5
  • 10 07-16 Fast and Scalable Game-Theoretic Trajectory Planning with Intentional Uncertainties Schnelle und skalierbare game-theoretische Trajektorie-Planung mit Absichtsunsicherheiten 具有有意不确定性的快速和可缩放游戏-理论轨迹规划 2507.12174v1
  • 11 07-16 Value-Based Large Language Model Agent Simulation for Mutual Evaluation of Trust and Interpersonal Closeness Value-Based Large Language Model Agent Simulation zur gegenseitigen Bewertung von Vertrauen und zwischenmenschlicher Nähe 用于相互评价信任和人际亲密的基于价值的大型语言模型模拟剂 2507.11979v1
  • 12 07-16 BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modeling BRIDGE: Bootstrapping-Text zur Steuerung der Time-Series-Generation über Multi-Agent iterative Optimierung und Diffusionsmodellierung BRIDGE:通过多代理迭代优化和传播模型化控制时间- 系列生成的推进文本 2503.02445v5
  • 13 07-16 CoCre-Sam (Kokkuri-san): Modeling Ouija Board as Collective Langevin Dynamics Sampling from Fused Language Models CoCre-Sam (Kokkuri-san): Modellierung des Ouija Boards als kollektive Langevin Dynamics-Probenahme aus Fused Language Models CoCre-Sam (Kokkuri-san):将 Ouija 板建模为基于融合语言模型的集体 Langevin 动力学采样 2507.11906v1
  • 14 07-15 (2) Bridging Literature and the Universe Via A Multi-Agent Large Language Model System Überbrückung der Literatur und des Universums über ein Multi-Agenten Large Language Model System 搭桥文学和宇宙通过多种需要的大型语言模式系统 2507.08958v2
  • 15 07-15 Horus: A Protocol for Trustless Delegation Under Uncertainty Horus: Ein Protokoll für eine treulose Delegation unter Unsicherheit 荷鲁斯:不确定性下无信托代表团议定书 2507.00631v6
  • 16 07-15 Large-scale distributed synchronization systems, using a cancel-on-completion redundancy mechanism Großmaßstäbliche verteilte Synchronisationssysteme, mit einem storn-on-completion Redundanz-Mechanismus 使用完成后注销冗余机制,使用大规模分布式分布式的大规模同步系统 2507.11779v1
  • 17 07-15 A Cellular Automata Approach to Donation Game Ein zellulärer Automata Ansatz zur Spende Spiel 捐赠游戏的细胞自动模式 2507.11744v1
  • 18 07-15 MR-LDM – The Merge-Reactive Longitudinal Decision Model: Game Theoretic Human Decision Modeling for Interactive Sim Agents MR-LDM – Das Merge-Reactive Longitudinal Decision Model: Game Theoretic Human Decision Modeling for Interactive Sim Agents MR-LDM – – 合并-反反应纵向决定模型:互动模拟剂的游戏理论人类决定模型 2507.12494v1
  • 19 07-15 Let’s Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification Lassen Sie uns in zwei Schritten denken: Abmildern Vereinbarung Bias in MLLMs mit selbst-gerundete Verifikation 让我们思考两步:在MLLMs中减少协议与自我核查的偏见 2507.11662v1
  • 20 07-15 STAGED: A Multi-Agent Neural Network for Learning Cellular Interaction Dynamics STAGED: Ein multi-agent-neurales Netzwerk zum Lernen zellulärer Interaktionsdynamik STAGED: 学习细胞互动动态多要素神经网络 2507.11660v1
  • 21 07-15 LF: Online Multi-Robot Path Planning Meets Optimal Trajectory Control LF: Online-Multi-Roboter-Pfadplanung trifft auf optimale Trajektoriensteuerung LF: 在线多机器人路径规划满足最佳轨迹控制 2507.11464v1
  • 22 07-15 Simulation for All: A Step-by-Step Cookbook for Developing Human-Centered Multi-Agent Transportation Simulators Simulation für alle: Ein Schritt für Schritt Kochbuch für die Entwicklung von Mensch-zentrierten Multi-Agenten-Transportsimulatoren 面向所有人的模拟:开发以人为本的多机构运输模拟器的《一步一步步编制手册》 2507.09367v2
  • 23 07-15 From Kinetic Theory to AI: a Rediscovery of High-Dimensional Divergences and Their Properties Von der Kinetischen Theorie zur KI: Eine Wiederentdeckung hochdimensionaler Divergenzen und ihrer Eigenschaften 从动从理论到AI:重现高度多元差异及其属性 2507.11387v1
  • 24 07-15 Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs Schrittweise Richtlinie für Wissen über seltene Werkzeuge (SPaRK): Offline-RL, die vielfältige Werkzeugnutzung in LLMs antreibt 有限工具知识(SPaRK)的逐步政策:驱动在LLM中使用多样化工具的离线RL 2507.11371v1
  • 25 07-15 Beyond Predictions: A Participatory Framework for Multi-Stakeholder Decision-Making Beyond Predictions: Ein partizipatorischer Rahmen für Entscheidungsfindung mit mehreren Interessenträgern 超越预测:多方利益攸关方决策参与框架 2502.08542v2
  • 26 07-15 Taming Uncertainty via Automation: Observing, Analyzing, and Optimizing Agentic AI Systems Ungewissheit durch Automatisierung: Beobachten, Analysieren und Optimieren Agentischer KI-Systeme 通过自动化防止不确定性:观测、分析、优化ATI系统 2507.11277v1
  • 27 07-15 On multiagent online problems with predictions Auf Multiagent Online-Probleme mit Vorhersagen 多试剂在线预测问题 2507.12486v1
  • 28 07-15 Voting or Consensus? Decision-Making in Multi-Agent Debate Abstimmung oder Konsens? Entscheidungsfindung in Multi-Agent-Debatte 表决还是协商一致?多机构辩论中的决策 2502.19130v3
  • 29 07-15 A unifying approach to self-organizing systems interacting via conservation laws Ein vereinheitlichter Ansatz für selbstorganisierende Systeme, die über Erhaltungsgesetze interagieren 对通过养护法相互作用的自我组织系统采取统一办法 2507.02575v3
  • 30 07-15 MATE: LLM-Powered Multi-Agent Translation Environment for Accessibility Applications MATE: LLM-Powered Multi-Agent Translation Environment for Accessibility Applications MATE:为无障碍应用提供LLM 授权多机构翻译环境 2506.19502v2
  • 31 07-15 Fully Data-driven but Interpretable Human Behavioural Modelling with Differentiable Discrete Choice Model Vollständig datengesteuerte, aber interpretierbare menschliche Verhaltensmodellierung mit differenzierbarem diskretes Wahlmodell 完全由数据驱动但可解释的人类行为模型与差异分辨选择模型 2412.19403v3
  • 32 07-15 Trajectory Imputation in Multi-Agent Sports with Derivative-Accumulating Self-Ensemble Trajektorien-Imputation im Multi-Agenten-Sport mit demivativ-akkumulierendem Selbst-Ensemble 多机构体育中具有衍生-累积自我集合功能的多机构体育 2408.10878v4
  • 33 07-15 A Learning Framework For Cooperative Collision Avoidance of UAV Swarms Leveraging Domain Knowledge Ein Lernrahmen zur kooperativen Kollision Vermeidung von UAV-Schwärmen Nutzung von Domain-Wissen 合作协作避免无人驾驶航空飞行器冲冲冲器利用域域知识学习框架 2507.10913v1
  • 34 07-15 Autonomous Multi-Modal LLM Agents for Treatment Planning in Focused Ultrasound Ablation Surgery Autonome Multi-Modal LLM-Agenten für die Behandlungsplanung in fokussierter Ultraschallablationschirurgie 重点超声速超声振动外科手术治疗规划代理 2505.21418v2
  • 35 07-14 (1) AI-Powered Math Tutoring: Platform for Personalized and Adaptive Education KI-Powered Math Tutoring: Plattform für Personalisierte und Adaptive Bildung AI 授权数学教学:个性化和适应教育平台 2507.12484v1
  • 36 07-14 DroidSpeak: KV Cache Sharing for Cross-LLM Communication and Multi-LLM Serving DroidSpeak: KV Cache Sharing für Cross-LLM Kommunikation und Multi-LLM Serving DroidSpeak: KV 共享缓存, 用于跨 LLM 通信和多 LLM 服务 2411.02820v4
  • 37 07-14 DeepResearch$^{\text{Eco}}$: A Recursive Agentic Workflow for Complex Scientific Question Answering in Ecology DeepResearch$^{\text{Eco}}$: Ein rekursiver Agentischer Workflow für komplexe wissenschaftliche Fragen in der Ökologie 深层研究$^{\text{Eco}}$:生态中复杂科学问题答案的递递性制剂工作流程 2507.10522v1
  • 38 07-14 Toolsuite for Implementing Multiagent Systems Based on Communication Protocols Toolsuite zur Implementierung von Multiagentensystemen auf Basis von Kommunikationsprotokollen 基于通信议定书的用于实施多剂系统的工具 2507.10324v1
  • 39 07-14 Prompt Informed Reinforcement Learning for Visual Coverage Path Planning Prompt Informierte Verstärkung Lernen für die visuelle Abdeckung Pfadplanung 视力覆盖规划快速信息强化学习 2507.10284v1
  • 40 07-14 ToMacVF : Temporal Macro-action Value Factorization for Asynchronous Multi-Agent Reinforcement Learning ToMacVF : Zeitliche Makro-Wirkungs-Wertfaktorisierung für asynchrones Mehr-Agenten-Verstärkungs-Lernen ToMacVF: 同步多机构强化学习的时际宏观行动价值系数 2507.10251v1
  • 41 07-14 Multi-Robot Cooperative Herding through Backstepping Control Barrier Functions Multi-Roboter-Kooperative Herdung durch rückschrittliche Steuerungsbarrieren-Funktionen 多机器人合作通过后步控制障碍功能 2507.10249v1
  • 42 07-14 Adaptability in Multi-Agent Reinforcement Learning: A Framework and Unified Review Anpassungsfähigkeit im Mehr-Agenten-Verstärkungs-Lernen: Ein Rahmen und eine einheitliche Überprüfung 多机构加强学习中的适应性:框架和统一审查 2507.10142v1
  • 43 07-14 Collaboration Promotes Group Resilience in Multi-Agent RL Zusammenarbeit fördert Gruppenresistenz in Multi-Agent RL 协作促进多机构RL中的团体复原力 2111.06614v3
  • 44 07-14 CoDe: A Cooperative and Decentralized Collision Avoidance Algorithm for Small-Scale UAV Swarms Considering Energy Efficiency CoDe: Ein kooperativer und dezentralisierter Kollisionsvermeidungsalgorithmus für kleine UAV-Schwärme unter Berücksichtigung der Energieeffizienz CoDe:考虑到能源效率的小型无人驾驶航空器的小型蜂群合作和分散协调避免费用等级 2204.08594v2
  • 45 07-14 Improving monotonic optimization in heterogeneous multi-agent reinforcement learning with optimal marginal deterministic policy gradient Verbesserung der monotonen Optimierung im heterogenen Multi-Agenten-Verstärkungslernen mit optimalem marginalen deterministischen politischen Gradienten 以最优化的边际确定性政策梯度,改进多元多剂强化学习中的单体优化 2507.09989v1
  • 46 07-14 AnalogTester: A Large Language Model-Based Framework for Automatic Testbench Generation in Analog Circuit Design AnalogTester: Ein großsprachiges modellbasiertes Framework für die automatische Testbench-Generierung im Analog Circuit Design 模拟试验者:在模拟电路设计中自动产生自动试验箱的大型语言示范框架 2507.09965v1
  • 47 07-14 Large Population Models Große Bevölkerungsmodelle 大型人口模式 2507.09901v1
  • 48 07-14 Multi-residual Mixture of Experts Learning for Cooperative Control in Multi-vehicle Systems Multi-Residual Mixture of Experts Learning for Cooperative Control in Multi-Vehicle Systems 多车辆系统合作控制专家学习 2507.09836v1
  • 49 07-13 (7) TinyTroupe: An LLM-powered Multiagent Persona Simulation Toolkit TinyTroupe: Ein LLM-powered Multiagent Persona Simulation Toolkit TinyTroupe:一个由LLM驱动的多剂人模拟工具包 2507.09788v1
  • 50 07-13 Negotiating Comfort: Simulating Personality-Driven LLM Agents in Shared Residential Social Networks Verhandeln von Komfort: Simulieren von Persönlichkeits-getriebenen LLM-Agenten in Shared Residential Social Networks 谈判舒适:在共享住宅社会网络中模拟个性驱动的LLM代理 2507.09657v1
  • 51 07-13 VFlow: Discovering Optimal Agentic Workflows for Verilog Generation VFlow: Optimale Agentische Workflows für die Verilog-Generation entdecken VFlow: 为维利罗格生成发现最佳样本工作流程 2504.03723v2
  • 52 07-13 It’s Not All Black and White: Degree of Truthfulness for Risk-Avoiding Agents Es ist nicht alles schwarz und weiß: Grad von Wahrhaftigkeit für risikovermeidende Agenten 并非全黑白:风险避险剂的真实程度 2502.18805v2
  • 53 07-13 Can A Society of Generative Agents Simulate Human Behavior and Inform Public Health Policy? A Case Study on Vaccine Hesitancy Kann eine Gesellschaft Generativer Mittel menschliches Verhalten simulieren und die öffentliche Gesundheitspolitik informieren? 基因代理学会能够模拟人类行为和信息公共卫生政策吗? 疫苗安全案例研究 2503.09639v4
  • 54 07-12 (6) Adaptive Social Learning using Theory of Mind Adaptives Soziallernen unter Verwendung der Geistestheorie 利用思想理论进行适应性社会学习 2507.09409v1
  • 55 07-12 StockSim: A Dual-Mode Order-Level Simulator for Evaluating Multi-Agent LLMs in Financial Markets StockSim: Ein Dual-Mode Order-Level Simulator zur Bewertung von Multi-Agent LLMs in Finanzmärkten StockSim: 金融市场多方商家LMS评估双Mo级命令级模拟器 2507.09255v1
  • 56 07-12 Coordinated Communication and Inventory Optimization in Multi-Retailer Supply Chains Koordinierte Kommunikation und Bestandsoptimierung in Multi-Retailer Supply Chains 多零售供应链中协调通信和库存协调优化 2507.09223v1
  • 57 07-11 (5) Accelerating Drug Discovery Through Agentic AI: A Multi-Agent Approach to Laboratory Automation in the DMTA Cycle Beschleunigen der Wirkstoff-Discovery durch Agentic AI: Multi-Agenten-Ansatz zur Laborautomatisierung im DMTA-Zyklus AI:对DMTTA周期实验室自动化采取多机构办法 2507.09023v1
  • 58 07-11 Equilibria in multiagent online problems with predictions Equilibria in Multiagent Online-Probleme mit Vorhersagen 多试剂在线预测问题中的平衡 2405.11873v3
  • 59 07-11 How to Train a Leader: Hierarchical Reasoning in Multi-Agent LLMs Wie man einen Führer ausbildet: Hierarchische Vernunft in multi-agenten LLMs 如何培训领导者:多机构LLM中的等级原因 2507.08960v1
  • 60 07-11 Optimizing Sequential Multi-Step Tasks with Parallel LLM Agents Optimierung sequentieller Mehrschritt-Aufgaben mit parallelen LLM-Agenten 与平行LLM代理商优化序列式多步骤任务 2507.08944v1
  • 61 07-11 Experimental Setup and Software Pipeline to Evaluate Optimization based Autonomous Multi-Robot Search Algorithms Experimentelle Einrichtung und Software-Pipeline zur Bewertung von Optimierungs-basierten autonomen Multi-Roboter-Suche Algorithmen 实验设置和软件管道以评价基于优化的自动多机器人搜索算法 2506.16710v3
  • 62 07-11 Upgrade or Switch: Do We Need a Next-Gen Trusted Architecture for the Internet of AI Agents? Upgrade oder Switch: Brauchen wir eine vertrauenswürdige Next-Gen-Architektur für das Internet von KI-Agenten? 升级或切换:我们是否需要为AI代理商的互联网建立下一代信任的架构? 2506.12003v2
  • 63 07-11 Safe Deep Reinforcement Learning for Resource Allocation with Peak Age of Information Violation Guarantees Sicheres tiefes Stärkungslernen für Ressourcenallokation mit Spitzenzeit der Informationsverletzungsgarantien 安全深强化学习,以进行违反信息达到高峰年龄的违反信息保障的资源分配 2507.08653v1
  • 64 07-11 Open Source Planning & Control System with Language Agents for Autonomous Scientific Discovery Open Source Planning & Control System mit Language Agents für autonome wissenschaftliche Entdeckung 拥有自主科学发现语言代理的开放源规划和控制系统 2507.07257v2
  • 65 07-11 AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs AgentsNet: Koordination und kollaborative Reasoning in Multi-Agent LLMs AgentsNet:多机构LLM中的协调与协作推理 2507.08616v1
  • 66 07-11 To Trade or Not to Trade: An Agentic Approach to Estimating Market Risk Improves Trading Decisions Handel oder Nichthandel: Ein Agentischer Ansatz zur Schätzung des Marktrisikos verbessert Handelsentscheidungen 贸易或非贸易贸易:估计市场风险的代理办法 改善贸易决定 2507.08584v1
  • 67 07-11 Finding Common Ground: Using Large Language Models to Detect Agreement in Multi-Agent Decision Conferences Gemeinsamer Grund: Mit großen Sprachmodellen Vereinbarungen in Multi-Agent-Entscheidungskonferenzen zu erkennen 寻找共同点:在多机构决定会议上使用大语言模型来检测协议 2507.08440v1
  • 68 07-11 Properties of Quasi-synchronization Time of High-dimensional Hegselmann-Krause Dynamics Eigenschaften der Quasi-Synchronisierung Zeit der hochdimensionalen Hegselmann-Krause-Dynamik 高维 Hegselmann-Krause 动态的 准同步时间属性 2507.08900v1
  • 69 07-11 Exploring Design of Multi-Agent LLM Dialogues for Research Ideation Erforschung der Gestaltung von LLM-Dialogen mit mehreren Agenten für die Forschungsideation 探索设计多种机构用LLM 研究主题对话 2507.08350v1
  • 70 07-11 Conversational Self-Play for Discovering and Understanding Psychotherapy Approaches Conversational Self-Play für das Entdecken und Verstehen von Psychotherapieansätzen 发现和理解心理疗法方法的相互交流的自我宣传 2503.16521v2
  • 71 07-11 CRMAgent: A Multi-Agent LLM System for E-Commerce CRM Message Template Generation CRMAgent: Ein Multi-Agent LLM-System für E-Commerce CRM-Meldungsvorlagen-Erstellung CRMM 信息模板生成多机构代理LLM系统 2507.08325v1
  • 72 07-11 An Outlook on the Opportunities and Challenges of Multi-Agent AI Systems Ausblick auf die Chancen und Herausforderungen multiagenter KI-Systeme 关于多机构AI系统机会和挑战的展望 2505.18397v2
  • 73 07-10 (4) Multi-Actor Generative Artificial Intelligence as a Game Engine Multi-Actor Generative Künstliche Intelligenz als Game Engine 多驱动器生成人工智能作为游戏引擎 2507.08892v1
  • 74 07-10 Noise-Enabled Goal Attainment in Crowded Collectives Lärmfähiges Ziel-Attainment in Crowded Collectives 聚众集体实现无声目标 2507.08100v1
  • 75 07-10 Agent-based visualization of streaming text Agentenbasierte Visualisierung von Streaming-Texten 以代理为基础的流流文本可视化 2507.08884v1
  • 76 07-10 MAEBE: Multi-Agent Emergent Behavior Framework MAEBE: Multi-Agent Emergent Behavior Framework 多边代理新兴行为框架 2506.03053v2
  • 77 07-10 MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework MF-LLM: Simulation von Populationsentscheidungsdynamiken über ein mittleres Feld Large Language Model Framework MF-LLM:通过一个中外地大语言示范框架模拟人口决策动态 2504.21582v3
  • 78 07-10 Conjugated Capabilities: Interrelations of Elementary Human Capabilities and Their Implication on Human-Machine Task Allocation and Capability Testing Procedures Konjugierte Fähigkeiten: Zusammenhänge von elementaren menschlichen Fähigkeiten und deren Implikationen auf Mensch-Maschine-Aufgaben-Zuteilungs- und Fähigkeitsprüfungsverfahren 相容能力:人类基本能力之间的相互关系及其对人类-海洋任务分配和能力测试程序的影响 2507.07560v1
  • 79 07-10 Toward Real-World Chinese Psychological Support Dialogues: CPsDD Dataset and a Co-Evolving Multi-Agent System Auf dem Weg zu echten chinesischen Psychologischen Unterstützungsdialogen: CPsDD-Datensatz und ein gemeinsames Multi-Agenten-System 走向现实世界的中国心理支持对话:CPsDD数据集和共同演进的多行为者系统 2507.07509v1
  • 80 07-10 KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows KVFlow: Effizientes Präfix-Caching zur Beschleunigung von LLM-basierten Multiagenten-Workflows KVFlow: 为加速基于LLM的多重需要工作流程而高效预置缓存 2507.07400v1
  • 81 07-10 Multi-Agent Pathfinding Under Team-Connected Communication Constraint via Adaptive Path Expansion and Dynamic Leading Multi-Agent Pathfinding unter Team-Connected Communication Constraint über Adaptive Path Expansion und Dynamic Leading 通过适应性路径扩展和动态领导,在联成一体的通信制约下,开展多机构多方机构路透调查 2501.02770v4

Article 0

Title@2025-07-17 (4): Nash equilibrium seeking for a class of quadratic-bilinear Wasserstein distributionally robust games

Title: Nash equilibrium seeking for a class of quadratic-bilinear Wasserstein distributionally robust games Nash Gleichgewicht Suche nach einer Klasse von quadratisch-bilinearen Wasserstein Verteilung robusten Spielen Nash 均衡, 寻求类二次- 贝里尼奥尔 瓦西斯坦分配强强的游戏 2411.09636v2

Authors (3): Georgios Pantazis, Reza Rahimi Baghbadorani, Sergio Grammatico

We consider a class of Wasserstein distributionally robust Nash equilibrium problems, where agents construct heterogeneous data-driven Wasserstein ambiguity sets using private samples and radii, in line with their individual risk-averse behaviour. By leveraging relevant properties of this class of games, we show that equilibria of the original seemingly infinite-dimensional problem can be obtained as a solution to a finite-dimensional Nash equilibrium problem. We then reformulate the problem as a finite-dimensional variational inequality and establish the connection between the corresponding solution sets. Our reformulation has scalable behaviour with respect to the data size and maintains a fixed number of constraints, independently of the number of samples. To compute a solution, we leverage two algorithms, based on the golden ratio algorithm. The efficiency of both algorithmic schemes is corroborated through extensive simulation studies on an illustrative example and a stochastic portfolio allocation game, where behavioural coupling among investors is modeled.

我们认为瓦森斯坦分配上非常稳健的纳什均衡问题,在这类问题上,代理商根据个人风险规避行为,利用私人样本和反射行为,构建了不同数据驱动的瓦森斯坦模棱两可的模棱两可的模样。我们利用这一类游戏的相关特性,可以证明原始看似无限的维度问题的平衡性可以作为有限维度纳什平衡问题的解决办法。然后,我们重新将这一问题描述为有限维度的变异性不平等,并在相应的解决方案之间建立联系。我们的重新拟订在数据大小方面有可缩放的行为,并保持固定数量的制约,与样本数量无关。为了计算一个解决办法,我们根据黄金比例算法,运用两种算法。两种算法的效率都通过对一个示例的广泛模拟研究和一个随机组合分配游戏得到证实,投资者之间的行为组合是模拟的。
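
The abstract above computes equilibria by recasting the game as a finite-dimensional variational inequality (VI) and applying methods built on the golden ratio algorithm. The following is a minimal sketch of the basic fixed-step golden ratio iteration on a toy VI, not the paper's two algorithms or its Wasserstein reformulation. The operator `F`, the box constraint, the starting point, and all function names are assumptions chosen purely for illustration.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (illustrative feasible set)."""
    return [min(max(v, lo), hi) for v in x]


def F(x):
    """Toy affine operator F(x) = A x + b with A = [[2, 1], [-1, 2]].

    A has a positive definite symmetric part, so F is strongly monotone.
    b is chosen so that F vanishes at (0.5, 0.25), which lies inside the box.
    """
    return [2 * x[0] + x[1] - 1.25, -x[0] + 2 * x[1]]


def golden_ratio_vi(F, project, x0, step, iters=2000):
    """Fixed-step golden ratio iteration for the VI defined by F and the set behind `project`."""
    x = list(x0)
    x_bar = list(x0)
    for _ in range(iters):
        # Averaging step: convex combination of the current iterate and the previous average.
        x_bar = [((PHI - 1) * xi + bi) / PHI for xi, bi in zip(x, x_bar)]
        # Projected step taken from the averaged point, using F at the current iterate.
        g = F(x)
        x = project([bi - step * gi for bi, gi in zip(x_bar, g)])
    return x


L = math.sqrt(5)       # Lipschitz constant of F: spectral norm of A
step = PHI / (2 * L)   # largest step size allowed by the fixed-step analysis
x_star = golden_ratio_vi(F, project_box, [1.0, 1.0], step)
print(x_star)
```

Each iteration needs only one evaluation of `F` and one projection, which is the usual reason to prefer this scheme over extragradient-type methods that evaluate the operator twice per step.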


Article 1

Title@2025-07-17 (4): Coral Protocol: Open Infrastructure Connecting The Internet of Agents

Title: Coral Protocol: Open Infrastructure Connecting The Internet of Agents Coral Protocol: Open Infrastructure Connecting Das Internet der Agenten 珊瑚议定书:开放基础设施连接代理物互联网 2505.00749v2

Authors (6): Roman J. Georgio, Caelum Forder, Suman Deb, Andri Rahimov, Peter Carroll, Önder Gürcan

Coral Protocol is an open and decentralized collaboration infrastructure that enables communication, coordination, trust and payments for The Internet of Agents. It addresses the growing need for interoperability in a world where organizations are deploying multiple specialized AI agents that must work together across domains and vendors. As a foundational platform for multi-agent AI ecosystems, Coral establishes a common language and coordination framework allowing any agent to participate in complex workflows with others. Its design emphasizes broad compatibility, security, and vendor neutrality, ensuring that agent interactions are efficient and trustworthy. In particular, Coral introduces standardized messaging formats for agent communication, a modular coordination mechanism for orchestrating multi-agent tasks, and secure team formation capabilities for dynamically assembling trusted groups of agents. Together, these innovations position Coral Protocol as a cornerstone of the emerging “Internet of Agents,” unlocking new levels of automation, collective intelligence, and business value through open agent collaboration.

《珊瑚议定书》是一个开放和分散的协作基础设施,它使代理商的互联网能够进行通信、协调、信任和支付,它解决了在这样一个世界中各组织正在部署多个专门的AI代理商的世界中日益需要互操作性的问题,这些代理商必须在各个领域和供应商之间开展合作。作为多试剂AI生态系统的基础平台,珊瑚岛建立了一个共同的语言和协调框架,允许任何代理商与他人参与复杂的工作流程。它的设计强调广泛的兼容性、安全和供应商中立性,确保代理商的互动是高效和可信赖的。特别是,珊瑚岛为代理商的通信引入了标准化的信息传递格式,这是协调多试剂任务的模块化协调机制,也是动态地集合受信任的代理商团体的团队组建能力。这些创新共同将《珊瑚议定书》定位为新兴的“代理商互联网”的基石,通过开放代理商协作释放新的自动化、集体情报和商业价值。


Article 2

Title@2025-07-17 (4): Imitating Mistakes in a Learning Companion AI Agent for Online Peer Learning

Title: Imitating Mistakes in a Learning Companion AI Agent for Online Peer Learning Nachahmen von Fehlern in einem Learning Companion KI Agent für Online Peer Learning 模拟学习伙伴AI在线同行学习代理的错误 2507.12801v1

Authors (2): Sosui Moribe, Taketoshi Ushiama

In recent years, peer learning has gained attention as a method that promotes spontaneous thinking among learners, and its effectiveness has been confirmed by numerous studies. This study aims to develop an AI Agent as a learning companion that enables peer learning anytime and anywhere. However, peer learning between humans has various limitations, and it is not always effective. Effective peer learning requires companions at the same proficiency levels. In this study, we assume that a learner’s peers with the same proficiency level as the learner make the same mistakes as the learner does and focus on English composition as a specific example to validate this approach.

近年来,同侪学习作为一种促进学习者自发思维的方法,得到了人们的注意,其有效性也得到了许多研究的证实。这项研究的目的是发展一个AI代理作为学习伙伴,使同侪能够随时随地相互学习。然而,人与人之间的同侪学习有各种限制,而且并不总是有效的。有效的同侪学习需要具有相同熟练水平的同伴。在这个研究中,我们认为,与学习者同样的熟练水平的同侪会犯与学习者一样的错误,并侧重于英语组成,作为验证这一方法的具体例子。


Article 3

Title@2025-07-16 (3): Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation

Title: Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation Lernpolitik für dynamische Koalitionsbildung in der Multi-Roboter-Task-Allokation 多机器人任务分配中动态联盟形成学习政策 2412.20397v3

Authors (3): Lucas C. D. Bezerra, Ataíde M. G. dos Santos, Shinkyu Park

We propose a decentralized, learning-based framework for dynamic coalition formation in Multi-Robot Task Allocation (MRTA). Our approach extends MAPPO by integrating spatial action maps, robot motion planning, intention sharing, and task allocation revision to enable effective and adaptive coalition formation. Extensive simulation studies confirm the effectiveness of our model, enabling each robot to rely solely on local information to learn timely revisions of task selections and form coalitions with other robots to complete collaborative tasks. The results also highlight the proposed framework’s ability to handle large robot populations and adapt to scenarios with diverse task sets.

我们提出了在多机器人任务分配(MRTA)中建立动态联盟的分散化、基于学习的框架。 我们的方法通过整合空间行动地图、机器人动作规划、意图共享和任务分配修订来扩展MAPPO,以便能够有效和适应性地组成联盟。 广泛的模拟研究证实了我们模型的有效性,使每个机器人能够完全依靠当地信息及时了解任务选择的修订,并与其他机器人结成联盟完成协作任务。 研究结果还突出了拟议框架处理大型机器人人口和适应不同任务组合情景的能力。


Article 4

Title@2025-07-16 (3): NLI4VolVis: Natural Language Interaction for Volume Visualization via LLM Multi-Agents and Editable 3D Gaussian Splatting

Title: NLI4VolVis: Natural Language Interaction for Volume Visualization via LLM Multi-Agents and Editable 3D Gaussian Splatting NLI4VolVis: Natürliche Sprachinteraktion für die Volumenvisualisierung über LLM Multi-Agenten und editierbare 3D Gaussian Splatting NLI4VolVis:通过LLM多代理商和可编辑的 3D Gaussian Splatting 进行卷量可视化的自然语言互动 2507.12621v1

Authors (3): Kuangshi Ai, Kaiyuan Tang, Chaoli Wang

Traditional volume visualization (VolVis) methods, like direct volume rendering, suffer from rigid transfer function designs and high computational costs. Although novel view synthesis approaches enhance rendering efficiency, they require additional learning effort for non-experts and lack support for semantic-level interaction. To bridge this gap, we propose NLI4VolVis, an interactive system that enables users to explore, query, and edit volumetric scenes using natural language. NLI4VolVis integrates multi-view semantic segmentation and vision-language models to extract and understand semantic components in a scene. We introduce a multi-agent large language model architecture equipped with extensive function-calling tools to interpret user intents and execute visualization tasks. The agents leverage external tools and declarative VolVis commands to interact with the VolVis engine powered by 3D editable Gaussians, enabling open-vocabulary object querying, real-time scene editing, best-view selection, and 2D stylization. We validate our system through case studies and a user study, highlighting its improved accessibility and usability in volumetric data exploration. We strongly recommend readers check our case studies, demo video, and source code at https://nli4volvis.github.io/.

传统的体可视化(VolVis)方法(如直接体绘制)存在传递函数设计僵化和计算成本高的问题。新视角合成方法虽然提高了渲染效率,但非专业用户需要付出额外的学习成本,且缺乏对语义级交互的支持。为弥合这一差距,我们提出 NLI4VolVis,这是一个交互式系统,使用户能够用自然语言探索、查询和编辑体数据场景。NLI4VolVis 结合多视图语义分割和视觉语言模型,以提取并理解场景中的语义组件。我们引入了一种多智能体大语言模型架构,配备丰富的函数调用工具,用于解析用户意图并执行可视化任务。各智能体利用外部工具和声明式 VolVis 命令,与由可编辑 3D 高斯驱动的 VolVis 引擎交互,从而实现开放词汇的对象查询、实时场景编辑、最佳视角选择和 2D 风格化。我们通过案例研究和用户研究验证了该系统,表明其在体数据探索中具有更好的可访问性和可用性。我们强烈建议读者在 https://nli4volvis.github.io/ 查看我们的案例研究、演示视频和源代码。


Article 5

Title@2025-07-16 (3): Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies

Title: Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies Mixed-Reality Digital Twins: Nutzung der physischen und virtuellen Welten für Hybrid Sim2Real Transition von Multi-Agent Verstärkungs-Learning-Politiken 混合-现实数字双对:利用物理和虚拟世界促进混合的Sim2重新过渡多机构强化学习政策 2403.10996v6

Authors (3): Chinmay Vilas Samak, Tanmay Vilas Samak, Venkat Narayan Krovi

Multi-agent reinforcement learning (MARL) for cyber-physical vehicle systems usually requires a significantly long training time due to their inherent complexity. Furthermore, deploying the trained policies in the real world demands a feature-rich environment along with multiple physical embodied agents, which may not be feasible due to monetary, physical, energy, or safety constraints. This work seeks to address these pain points by presenting a mixed-reality (MR) digital twin (DT) framework capable of: (i) boosting training speeds by selectively scaling parallelized simulation workloads on-demand, and (ii) immersing the MARL policies across hybrid simulation-to-reality (sim2real) experiments. The viability and performance of the proposed framework are highlighted through two representative use cases, which cover cooperative as well as competitive classes of MARL problems. We study the effect of: (i) agent and environment parallelization on training time, and (ii) systematic domain randomization on zero-shot sim2real transfer, across both case studies. Results indicate up to 76.3% reduction in training time with the proposed parallelization scheme and sim2real gap as low as 2.9% using the proposed deployment method.

面向信息物理车辆系统的多智能体强化学习(MARL)由于其固有的复杂性,通常需要相当长的训练时间。此外,在现实世界中部署训练好的策略需要功能丰富的环境以及多个物理实体智能体,而受资金、物理条件、能源或安全方面的限制,这未必可行。本工作试图解决这些痛点,提出了一个混合现实(MR)数字孪生(DT)框架,它能够:(i) 通过按需有选择地扩展并行化仿真工作负载来提高训练速度;(ii) 将 MARL 策略置于混合的仿真到现实(sim2real)实验中。我们通过两个有代表性的用例来展示所提框架的可行性和性能,它们涵盖了合作类和竞争类 MARL 问题。我们在两个案例研究中考察了:(i) 智能体和环境并行化对训练时间的影响;(ii) 系统性域随机化对零样本 sim2real 迁移的影响。结果表明,采用所提出的并行化方案可使训练时间最多减少 76.3%,采用所提出的部署方法可使 sim2real 差距低至 2.9%。


Article 6

Title@2025-07-16 (3): Modeling Feasible Locomotion of Nanobots for Cancer Detection and Treatment

Title: Modeling Feasible Locomotion of Nanobots for Cancer Detection and Treatment Modellierung einer Machbarkeitslokomotion von Nanobots für Krebserkennung und -behandlung 用于癌症检测和治疗的纳米机器人的可行易燃活动模型 2507.12400v1

Authors (5): Noble Harasha, Cristina Gava, Nancy Lynch, Claudia Contini, Frederik Mallmann-Trenn

Deploying motile nanosized particles, also known as “nanobots”, in the human body promises to improve selectivity in drug delivery and reduce side effects. We consider a swarm of nanobots locating a single cancerous region and treating it by releasing an onboard payload of drugs at the site. At nanoscale, the computation, communication, sensing, and locomotion capabilities of individual agents are extremely limited, noisy, and/or nonexistent. We present a general model to formally describe the individual and collective behavior of agents in a colloidal environment, such as the bloodstream, for cancer detection and treatment by nanobots. This includes a feasible and precise model of agent locomotion, inspired by actual nanoparticles that, in the presence of an external chemical gradient, move towards areas of higher concentration by means of self-propulsion. We present two variants of our general model: the first assumes an endogenous chemical gradient that is fixed over time and centered at the targeted cancer site; the second is a more speculative and dynamic variant in which agents themselves create and amplify a chemical gradient centered at the cancer site. In both settings, agents can sense the gradient and ascend it noisily, locating the cancer site more quickly than via simple Brownian motion. For the first variant of the model, we present simulation results to show the behavior of agents under our locomotion model, as well as analytical results to bound the time it takes for the agents to reach the cancer site. For the second variant, simulation results highlight the collective benefit in having agents issue their own chemical signal. While arguably more speculative in its agent capability assumptions, this variant shows a significant improvement in runtime performance over the first variant, resulting from its chemical signal amplification mechanism.

在人体中部署运动式纳米粒子,也称为“nanobots”,将改善药物供应的选择性,减少副作用。 我们认为, 一群纳米机器人将单一癌症区域定位起来, 并通过在现场释放一个在船上的药物有效载荷来治疗它。 在纳米规模上, 单个物剂的计算、 通信、 感知和移动能力极为有限、 吵闹和/ 或不存在。 我们提出了一个一般模型, 正式描述在凝固环境中, 如血液流, 用于纳米机器人检测和治疗癌症的个体和集体行为。 这包括一种可行和精确的纳米机器人移动模型。
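The claim that noisy gradient ascent locates the site faster than plain Brownian motion can be illustrated with a toy simulation. The sketch below is not the paper's locomotion model: the 2D geometry, the fixed step length, and the `bias` parameter (the probability that a step points straight up the gradient) are all assumptions made for illustration.

```python
import math
import random


def hitting_time(bias, rng, start=(30.0, 0.0), radius=2.0, step=1.0, max_steps=5000):
    """Steps until a walker first enters the disc of `radius` around the origin.

    bias = 0 is an unbiased random walk (a discrete stand-in for Brownian
    motion). With bias in (0, 1], each step points at the site with
    probability `bias` and in a uniformly random direction otherwise.
    Returns max_steps if the site is not reached in time.
    """
    x, y = start
    for t in range(max_steps):
        if math.hypot(x, y) <= radius:
            return t
        if rng.random() < bias:
            angle = math.atan2(-y, -x)  # up the gradient, towards the origin
        else:
            angle = rng.uniform(0.0, 2.0 * math.pi)  # pure noise
        x += step * math.cos(angle)
        y += step * math.sin(angle)
    return max_steps


def mean_hitting_time(bias, trials=50, seed=0):
    """Average (capped) hitting time over independent walkers."""
    rng = random.Random(seed)
    return sum(hitting_time(bias, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("unbiased walk :", mean_hitting_time(0.0))
    print("noisy ascent  :", mean_hitting_time(0.3))
```

Even a modest bias gives the walker a constant drift towards the site, whereas the unbiased walker relies on diffusion alone and frequently runs into the step cap.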


Article 7

Title@2025-07-16 (3): From Semantic Web and MAS to Agentic AI: A Unified Narrative of the Web of Agents

Title: From Semantic Web and MAS to Agentic AI: A Unified Narrative of the Web of Agents Von Semantic Web und MAS zu Agentic AI: Ein einheitliches Narrativ des Web of Agents 从语义网站和MAS到AA:关于 “ 代理人网络 “ 的统一说明 2507.10644v2

Authors (4): Tatiana Petrova, Boris Bliznioukov, Aleksandr Puzikov, Radu State

The concept of the Web of Agents (WoA), which transforms the static, document-centric Web into an environment of autonomous agents acting on users’ behalf, has attracted growing interest as large language models (LLMs) become more capable. However, research in this area is still fragmented across different communities. Contemporary surveys catalog the latest LLM-powered frameworks, while the rich histories of Multi-Agent Systems (MAS) and the Semantic Web are often treated as separate, legacy domains. This fragmentation obscures the intellectual lineage of modern systems and hinders a holistic understanding of the field’s trajectory. We present the first comprehensive evolutionary overview of the WoA. We show that modern protocols such as A2A and MCP are direct evolutionary responses to the well-documented limitations of earlier approaches such as the FIPA standards and OWL-based semantic agents. To systematize this analysis, we introduce a four-axis taxonomy (semantic foundation, communication paradigm, locus of intelligence, discovery mechanism). This framework provides a unified analytical lens for comparing agent architectures across all generations, revealing a clear line of descent where others have seen a disconnect. Our analysis identifies a paradigm shift in the ‘locus of intelligence’: from being encoded in external data (Semantic Web) or the platform (MAS) to being embedded within the agent’s core model (LLM). This shift is foundational to modern Agentic AI, enabling the scalable and adaptive systems the WoA has long envisioned. We conclude that while new protocols are essential, they are insufficient for building a robust, open, trustworthy ecosystem. Finally, we argue that the next research frontier lies in solving persistent socio-technical challenges, and we map out a new agenda focused on decentralized identity, economic models, security, and governance for the emerging WoA.

将静态的、以文件为中心的网络概念转化为代表用户行事的自主代理机构的环境,随着大型语言模型(LLMS)的能力增强,这一概念引起了越来越多的兴趣。然而,这一领域的研究仍然在不同社区中分散。当代调查将最新的LLM动力框架编成目录,而多机构系统(MAS)和语义网络的丰富历史往往被视为单独的遗留领域。这种分散掩盖了现代系统的知识线,阻碍了对实地运行轨迹的全面理解。我们介绍了WAA的首次全面演进概览。我们显示,A2A和MCP等现代协议是对早期标准(如FIPA标准和OWL的语义媒介)有详细记载的限制的直接进化反应。为了系统系统化,我们引入了四轴分类(命令基础、通信模式、智能中心、发现机制),这个框架为不同代间对代理机构结构的比较提供了一个统一的分析透析透析透析,揭示了一条清晰的路径,而其他人则看到,A2A类和MLA的离子线。我们的分析指出,一个清晰的模型的模型在网络平台上将最终转换了。


Article 8

Title@2025-07-16 (3): Robot Metabolism: Towards machines that can grow by consuming other machines

Title: Robot Metabolism: Towards machines that can grow by consuming other machines Robotermetabolismus: Auf dem Weg zu Maschinen, die durch den Verzehr anderer Maschinen wachsen können 机器人新陈代谢:研制能够通过消耗其他机器而成长的机器 2411.11192v2

Authors (20): Philippe Martin Wyder, Riyaan Bakhda, Meiqi Zhao, Quinn A. Booth, Matthew E. Modi, Andrew Song, Simon Kang, Jiahao Wu, Priya Patel, Robert T. Kasumi, David Yi, Nihar Niraj Garg, Pranav Jhunjhunwala, Siddharth Bhutoria, Evan H. Tong, Yuhang Hu, Judah Goldfeder, Omer Mustel, Donghan Kim, Hod Lipson

Biological lifeforms can heal, grow, adapt, and reproduce – abilities essential for sustained survival and development. In contrast, robots today are primarily monolithic machines with limited ability to self-repair, physically develop, or incorporate material from their environments. While robot minds rapidly evolve new behaviors through AI, their bodies remain closed systems, unable to systematically integrate material to grow or heal. We argue that open-ended physical adaptation is only possible when robots are designed using a small repertoire of simple modules. This allows machines to mechanically adapt by consuming parts from other machines or their surroundings and shedding broken components. We demonstrate this principle on a truss modular robot platform. We show how robots can grow bigger, faster, and more capable by consuming materials from their environment and other robots. We suggest that machine metabolic processes like those demonstrated here will be an essential part of any sustained future robot ecology.

生物生命形态可以治愈、生长、适应和复制 – – 持续生存和发展所必需的能力。相比之下,今天的机器人主要是单一机械,其自我修复、物理开发或融入环境材料的能力有限。虽然机器人的大脑通过人工智能迅速演化新的行为,但其身体仍然是封闭的系统,无法系统地整合材料以生长或治愈。我们争辩说,只有当机器人使用少量的简单模块来设计机器人时,才能进行无限制的物理适应。这使机器能够机械地通过消耗其他机器或其周围的部件来进行机械适应,并拆卸部件。我们在Truss模块机器人平台上展示了这一原则。我们展示了机器人如何长大更大、更快、更有能力从环境和其他机器人中消耗材料。我们建议,像这里展示的机器代谢过程将成为未来任何持续机器人生态的关键部分。


Article 9

Title@2025-07-16 (3): Programming Distributed Collective Processes in the eXchange Calculus

Title: Programming Distributed Collective Processes in the eXchange Calculus Programmierung verteilter kollektiver Prozesse im eXchange Calculus eXchange Calculus 中的程序编程分配集体进程 2401.11212v5

Authors (5): Giorgio Audrito, Roberto Casadei, Ferruccio Damiani, Gianluca Torta, Mirko Viroli

Recent trends like the Internet of Things (IoT) suggest a vision of dense and multi-scale deployments of computing devices in nearly all kinds of environments. A prominent engineering challenge revolves around programming the collective adaptive behaviour of such computational ecosystems. This requires abstractions able to capture concepts like ensembles (dynamic groups of cooperating devices) and collective tasks (joint activities carried out by ensembles). In this work, we consider collections of devices that interact with neighbours and execute in nearly-synchronised sense-compute-interact rounds, where the computation is given by a single program mapping sensing values and incoming messages to output and outgoing messages. To support programming whole computational collectives, we propose the abstraction of a distributed collective process, which can be used to define at once the ensemble formation logic and its collective task. We formalise the abstraction in the eXchange Calculus (XC), a core functional language based on neighbouring values (maps from neighbours to values) where state and interaction are handled through a single primitive, exchange, and provide a corresponding implementation in the FCPP language. Then, we exercise distributed collective processes using two case studies: multi-hop message propagation and distributed monitoring of spatial properties. Finally, we discuss the features of the abstraction and its suitability for different kinds of distributed computing applications.

在这项工作中,我们考虑与邻居发生互动的装置的集成,这些装置以近同步的感知和计算互动周期执行,计算方法是由一个单一程序绘制感测值和发送信息到输出和流出信息。为了支持整个计算集体的编程,我们提议一个分布式集体过程的抽象化,这个过程可以用来立即界定共性形成逻辑及其集体任务。我们在 eXchange Calculus(XC)中将该抽象形式化,这是一个基于相邻价值的核心功能语言(从邻居到价值观的图解),通过单一原始、交换处理国家和互动,并在FCPP语言中提供相应的执行。然后,我们通过两个案例研究来演练分布式集体过程:多跳消息传播和空间属性的分布式监测。最后,我们讨论该抽象的特点及其对不同类型分布式计算应用的适用性。


Article 10

Title@2025-07-16 (3): Fast and Scalable Game-Theoretic Trajectory Planning with Intentional Uncertainties

Title: Fast and Scalable Game-Theoretic Trajectory Planning with Intentional Uncertainties Schnelle und skalierbare game-theoretische Trajektorie-Planung mit Absichtsunsicherheiten 具有有意不确定性的快速和可缩放游戏-理论轨迹规划 2507.12174v1

Authors (5): Zhenmin Huang, Yusen Xie, Benshan Ma, Shaojie Shen, Jun Ma

Trajectory planning involving multi-agent interactions has been a long-standing challenge in the field of robotics, primarily burdened by the inherent yet intricate interactions among agents. While game-theoretic methods are widely acknowledged for their effectiveness in managing multi-agent interactions, significant impediments persist when it comes to accommodating the intentional uncertainties of agents. Under intentional uncertainties, existing game-theoretic methods incur heavy computational burdens, leading to inefficiencies and poor scalability. In this paper, we propose a novel game-theoretic interactive trajectory planning method that effectively addresses the intentional uncertainties of agents and demonstrates both high efficiency and enhanced scalability. As the underpinning basis, we model the interactions between agents under intentional uncertainties as a general Bayesian game, and we show that its agent-form equivalence can be represented as a potential game under certain minor assumptions. The existence and attainability of the optimal interactive trajectories are illustrated, as the corresponding Bayesian Nash equilibrium can be attained by solving a unified optimization problem. Additionally, we present a distributed algorithm based on the dual consensus alternating direction method of multipliers (ADMM) tailored to the parallel solving of the problem, thereby significantly improving the scalability. Results from simulations and experiments demonstrate that the proposed method is effective across a range of scenarios characterized by general forms of intentional uncertainties. Its scalability surpasses that of existing centralized and decentralized baselines, allowing for real-time interactive trajectory planning in uncertain game settings.

在机器人领域,涉及多试剂相互作用的试探性规划是一项长期挑战,主要是由代理人之间内在的复杂互动造成的负担。虽然游戏理论方法在管理多试剂相互作用方面的效力得到广泛承认,但在适应代理人故意的不确定性方面仍然存在重大障碍。在有意的不确定性方面,与现有游戏理论方法相关的沉重的计算负担是诱发的,导致效率低下和伸缩性差。在本文件中,我们提议一种新型的游戏理论互动轨迹规划方法,以有效解决代理人故意的不确定性,这既显示了高效率,又加强了可伸缩性。作为基础,我们将有意不确定性下的代理人之间的互动作为普通贝叶色游戏的一种模式,我们表明其代理形式等同可在某些次要假设下作为一种潜在游戏。 说明现有游戏理论和理论性互动轨迹的最佳存在和可实现性,因为相应的巴耶什平衡可以通过优化一个统一的优化问题来实现。 此外,我们根据双重共识交替的乘数性方向方法(ADMM)和增强的可伸缩性。作为基础,我们把有意不确定性的代理人之间的相互作用模型作为模型的模型模型模型模型模型模型,从而显著地反映其总体的可递增缩性。


Article 11

Title@2025-07-16 (3): Value-Based Large Language Model Agent Simulation for Mutual Evaluation of Trust and Interpersonal Closeness

Title: Value-Based Large Language Model Agent Simulation for Mutual Evaluation of Trust and Interpersonal Closeness Value-Based Large Language Model Agent Simulation zur gegenseitigen Bewertung von Vertrauen und zwischenmenschlicher Nähe 用于相互评价信任和人际亲密的基于价值的大型语言模型模拟剂 2507.11979v1

Authors (3): Yuki Sakamoto, Takahisa Uchida, Hiroshi Ishiguro

Large language models (LLMs) have emerged as powerful tools for simulating complex social phenomena using human-like agents with specific traits. In human societies, value similarity is important for building trust and close relationships; however, it remains unexplored whether this principle holds true in artificial societies comprising LLM agents. Therefore, this study investigates the influence of value similarity on relationship-building among LLM agents through two experiments. First, in a preliminary experiment, we evaluated the controllability of values in LLMs to identify the most effective model and prompt design for controlling the values. Subsequently, in the main experiment, we generated pairs of LLM agents imbued with specific values and analyzed their mutual evaluations of trust and interpersonal closeness following a dialogue. The experiments were conducted in English and Japanese to investigate language dependence. The results confirmed that pairs of agents with higher value similarity exhibited greater mutual trust and interpersonal closeness. Our findings demonstrate that the LLM agent simulation serves as a valid testbed for social science theories, contributes to elucidating the mechanisms by which values influence relationship building, and provides a foundation for inspiring new theories and insights into the social sciences.

大型语言模型(LLMs)已成为利用具有特定特征的类似人类剂模拟复杂社会现象的有力工具。在人类社会,价值相似性对于建立信任和密切的关系很重要;然而,对于这一原则在由LLM代理商组成的人工社会中是否适用,仍没有探讨这一原则是否在由LLLM代理商组成的人工社会中适用。因此,这项研究调查了价值相似性对LLM代理商之间建立关系的影响。首先,在初步实验中,我们评估了LLMS中价值的可控制性,以确定最有效的模式和控制价值的迅速设计。随后,在主要实验中,我们产生了一对配有特定价值的LLM代理商,分析了他们在对话后对信任和人际关系密切的相互评价。实验用英语和日语进行,以调查语言依赖性。结果证实,价值相近的两对代理人表现出更大的相互信任和人际关系密切性。我们的调查结果表明,LLM代理商模拟是社会科学理论的有效检验台,有助于解释价值建立关系的机制,并为激发社会科学的新理论和洞察提供基础。


Article 12

Title@2025-07-16 (3): BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modeling

Title: BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modeling BRIDGE: Bootstrapping-Text zur Steuerung der Time-Series-Generation über Multi-Agent iterative Optimierung und Diffusionsmodellierung BRIDGE:通过多代理迭代优化和传播模型化控制时间- 系列生成的推进文本 2503.02445v5

Authors (8): Hao Li, Yu-Hao Huang, Chang Xu, Viktor Schlegel, Renhe Jiang, Riza Batista-Navarro, Goran Nenadic, Jiang Bian

Time-series Generation (TSG) is a prominent research area with broad applications in simulations, data augmentation, and counterfactual analysis. While existing methods have shown promise in unconditional single-domain TSG, real-world applications demand cross-domain approaches capable of controlled generation tailored to domain-specific constraints and instance-level requirements. In this paper, we argue that text can provide semantic insights, domain information and instance-specific temporal patterns, to guide and improve TSG. We introduce “Text-Controlled TSG”, a task focused on generating realistic time series by incorporating textual descriptions. To address data scarcity in this setting, we propose a novel LLM-based Multi-Agent framework that synthesizes diverse, realistic text-to-TS datasets. Furthermore, we introduce BRIDGE, a hybrid text-controlled TSG framework that integrates semantic prototypes with text description for supporting domain-level guidance. This approach achieves state-of-the-art generation fidelity on 11 of 12 datasets, and improves controllability by up to 12% on MSE and 6% on MAE compared to generation without text input, highlighting its potential for generating tailored time-series data.

时间序列生成(TSG)是一个突出的研究领域,在模拟、数据增强和反事实分析方面广泛应用。虽然现有方法在无条件单域 TSG 中显示出前景,但现实世界应用对跨域方法的需求,这些方法能够根据具体领域的限制和实例要求进行有控制的生成。在本文中,我们认为文本可以提供语义洞察力、域信息和具体实例的时间模式,以指导和改进TSG。我们引入了“Text-croled TSG ”这一任务,其重点是通过纳入文本描述生成现实的时间序列。为了解决这一设置中的数据稀缺问题,我们提出了一个基于LLM的新的多要素框架,以综合多样化、现实的文本到TS数据集。此外,我们引入了BRIDGE,这是一个混合文本控制的 TSG 框架,将语义原型与文本描述相结合,用于支持域级指导。这个方法在12个数据集中的11个中实现了“Text-text-crolled TSG ” ,并改进了对MSE 和 6 % MAE 的可控性,以12 % 的MSE , 将它的潜力与不按时间生成数据进行对比。


Article 13

Title@2025-07-16 (3): CoCre-Sam (Kokkuri-san): Modeling Ouija Board as Collective Langevin Dynamics Sampling from Fused Language Models

Title: CoCre-Sam (Kokkuri-san): Modeling Ouija Board as Collective Langevin Dynamics Sampling from Fused Language Models CoCre-Sam (Kokkuri-san): Modellierung des Ouija Boards als kollektive Langevin Dynamics-Probenahme aus Fused Language Models CoCre-Sam (Kokkuri-san): 将 Ouija 板建模为来自融合语言模型的集体 Langevin 动力学采样 2507.11906v1

Authors (4): Tadahiro Taniguchi, Masatoshi Nagano, Haruumi Omoto, Yoshiki Hayashi

Collective human activities like using an Ouija board (or Kokkuri-san) often produce emergent, coherent linguistic outputs unintended by any single participant. While psychological explanations such as the ideomotor effect exist, a computational understanding of how decentralized, implicit linguistic knowledge fuses through shared physical interaction remains elusive. We introduce CoCre-Sam (Collective-Creature Sampling), a framework modeling this phenomenon as collective Langevin dynamics sampling from implicitly fused language models. Each participant is represented as an agent associated with an energy landscape derived from an internal language model reflecting linguistic priors, and agents exert stochastic forces based on local energy gradients. We theoretically prove that the collective motion of the shared pointer (planchette) corresponds to Langevin MCMC sampling from the sum of individual energy landscapes, representing fused collective knowledge. Simulations validate that CoCre-Sam dynamics effectively fuse different models and generate meaningful character sequences, while ablation studies confirm the essential roles of collective interaction and stochasticity. Altogether, CoCre-Sam provides a novel computational mechanism linking individual implicit knowledge, embodied collective action, and emergent linguistic phenomena, grounding these complex interactions in the principles of probabilistic sampling.

使用 Ouija 板(或 Kokkkuri-san) 等人类集体活动往往产生任何参与者都无意的突发、一致的语言产出。虽然存在诸如idemotor效应等心理解释,但对于通过共同物理互动的分散、隐含语言知识引信的计算理解仍然渺茫。我们引入了CoCre-Sam(Colective-Cretural-Cretural Sam抽样),将这一现象作为集体Langevin动态抽样的模型,来自隐含结合的语言模型。每个参与者都作为与反映语言前缀的内部语言模型产生的能源景观相关的代理物被代表,代理物根据当地的能源梯度施加了随机力。我们理论上证明,共享点(Planchette)的集体运动相当于Langevin MCMC采样从单个能源景观的总和中取的样本,代表了集成的集体知识。模拟证实CoCre-Sam动态有效地融合了不同的模型并产生了有意义的性性序列。同时,关系研究确认了集体互动和相交合性作用的基本作用。加之,Cre-Sam提供了一种新型的计算机制,将个人隐含的隐含的地面知识、体现的物理现象的形成。
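The central claim, that the shared pointer performs Langevin sampling from the sum of the agents' energy landscapes, can be checked numerically in a toy setting. The sketch below is an illustration under strong assumptions rather than the paper's simulation: the pointer position is one-dimensional, each agent's energy is quadratic, E_i(x) = (x - mu_i)^2 / (2 sigma_i^2), and a single shared noise term replaces per-agent stochastic forces. Under these assumptions the summed energy is again quadratic, so the samples should concentrate around the precision-weighted mean of the agents' preferences.

```python
import math
import random


def fused_mean(mus, sigmas):
    """Mean of the Gaussian whose energy is the sum of the agents' energies."""
    precisions = [1.0 / s ** 2 for s in sigmas]
    return sum(p * m for p, m in zip(precisions, mus)) / sum(precisions)


def fused_langevin(mus, sigmas, steps=20000, eps=0.01, seed=0):
    """Unadjusted Langevin dynamics on the summed energy landscape.

    Each agent pushes the pointer down its own energy gradient; the forces
    add up, and Gaussian noise of variance 2 * eps is injected at each step.
    """
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(steps):
        force = sum(-(x - m) / s ** 2 for m, s in zip(mus, sigmas))
        x += eps * force + math.sqrt(2.0 * eps) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


if __name__ == "__main__":
    mus, sigmas = [-1.0, 2.0, 3.0], [1.0, 1.0, 0.5]
    kept = fused_langevin(mus, sigmas)[1000:]  # drop burn-in
    print("empirical mean:", sum(kept) / len(kept))
    print("fused mean    :", fused_mean(mus, sigmas))
```

The most confident agent (smallest `sigma`) pulls the fused mean hardest, which is the sense in which the collective motion weighs individual knowledge.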


Article 14

Title@2025-07-15 (2): Bridging Literature and the Universe Via A Multi-Agent Large Language Model System

Title: Bridging Literature and the Universe Via A Multi-Agent Large Language Model System Überbrückung der Literatur und des Universums über ein Multi-Agenten Large Language Model System 搭桥文学和宇宙通过多种需要的大型语言模式系统 2507.08958v2

Authors (6): Xiaowen Zhang, Zhenyu Bi, Patrick Lachance, Xuan Wang, Tiziana Di Matteo, Rupert A. C. Croft

As cosmological simulations and their associated software become increasingly complex, physicists face the challenge of searching through vast amounts of literature and user manuals to extract simulation parameters from dense academic papers, each using different models and formats. Translating these parameters into executable scripts remains a time-consuming and error-prone process. To improve efficiency in physics research and accelerate the cosmological simulation process, we introduce SimAgents, a multi-agent system designed to automate both parameter configuration from the literature and preliminary analysis for cosmology research. SimAgents is powered by specialized LLM agents capable of physics reasoning, simulation software validation, and tool execution. These agents collaborate through structured communication, ensuring that extracted parameters are physically meaningful, internally consistent, and software-compliant. We also construct a cosmological parameter extraction evaluation dataset by collecting over 40 simulations in published papers from arXiv and leading journals that cover diverse simulation types. Experiments on the dataset demonstrate a strong performance of SimAgents, highlighting its effectiveness and potential to accelerate scientific research for physicists. Our demonstration video is available at: https://youtu.be/w1zLpm_CaWA. The complete system and dataset are publicly available at https://github.com/xwzhang98/SimAgents.

随着宇宙学模拟及其相关软件日益复杂,物理学家面临挑战,通过大量文献和用户手册搜索,从密集的学术论文中提取模拟参数,每个论文都使用不同的模型和格式。将这些参数转换成可执行的脚本仍是一个耗时和容易出错的过程。为了提高物理研究的效率和加速宇宙模拟过程,我们引入了SimAgents,这是一个多试办系统,旨在将宇宙学研究的文献和初步分析的参数配置自动化。SimAgents由能够进行物理推理、模拟软件验证和工具执行的专门LLM代理商提供动力。这些代理商通过结构化的通信进行协作,确保提取的参数具有物理意义、内部一致性和符合软件要求。我们还通过从Arxiv和涵盖多种模拟类型的主要期刊上收集40多份模拟文件来构建一个宇宙参数提取评估数据集。对数据集的实验展示了SimAgents的强大性能,突出其效力和潜力以加速物理学科学研究。我们的演示视频可在 https://youtu.be/w1zLpm_CaWA 观看。完整的系统和数据集可在 https://github.com/xwzhang98/SimAgents 公开获取。


Article 15

Title@2025-07-15 (2): Horus: A Protocol for Trustless Delegation Under Uncertainty

Title: Horus: A Protocol for Trustless Delegation Under Uncertainty Horus: Ein Protokoll für eine treulose Delegation unter Unsicherheit 荷鲁斯:不确定性下无信托代表团议定书 2507.00631v6

Authors (2): David Shi, Kevin Joo

Correctness is an emergent property of systems where exposing error is cheaper than committing it. In dynamic, low-trust environments, autonomous AI agents benefit from delegating work to sub-agents, yet correctness cannot be assured through upfront specification or centralized oversight. We propose a protocol that enforces correctness through collateralized claims in a recursive verification game. Tasks are published as intents, and solvers compete to fulfill them. Selected solvers carry out tasks under risk, with correctness checked post hoc by verifiers. Any challenger can challenge a result by staking against it to trigger the verification process. Incorrect agents are slashed and correct opposition is rewarded, with an escalation path that penalizes erroneous verifiers themselves. When incentives are aligned across solvers, challengers, and verifiers, falsification conditions make correctness the Nash equilibrium.

正确性是暴露错误比实施错误更便宜的系统的一种新兴特性。 在动态的低信任环境中,自主的AI代理商从将工作委托给分代理人中受益,但无法通过先期规格或集中监督来保证正确性。 我们提议了一项协议,在循环性核查游戏中通过抵押债权强制执行正确性。 任务作为意图公布,解决者竞相完成。 选定的解决者执行有风险的任务,由核查者检查是否正确性。 任何挑战者都可以通过对它进行打击以触发核查进程来挑战结果。 错误的代理商被砍断,正确的反对者被奖励,而升级路径则惩罚错误的验证者本身。 当激励措施在解决者、挑战者和核查者之间一致时,伪造的条件可以使纳什平衡得到正确性。
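The incentive logic (incorrect agents are slashed, correct opposition is rewarded) can be made concrete with a toy settlement function. Everything below is a hypothetical sketch: the stake sizes, the fee, the rule that a failed challenge forfeits the challenger's stake to the solver, and the omission of verifiers and the escalation path are all assumptions, not details taken from the protocol.

```python
def settle(result_correct, challenged, solver_stake=10.0, challenger_stake=5.0, fee=3.0):
    """Return (solver_delta, challenger_delta) for one task.

    Assumed rules (illustrative only):
      * unchallenged results are paid the fee, whether right or wrong;
      * a challenge against a correct result forfeits the challenger's stake
        to the solver, who also keeps the fee;
      * a challenge against an incorrect result slashes the solver's stake
        and awards it to the challenger.
    """
    if not challenged:
        return (fee, 0.0)
    if result_correct:
        return (fee + challenger_stake, -challenger_stake)
    return (-solver_stake, solver_stake)


def expected_solver_payoff(result_correct, p_challenge, **stakes):
    """Solver's expected payoff when a challenge arrives with probability p_challenge."""
    unchallenged, _ = settle(result_correct, False, **stakes)
    challenged, _ = settle(result_correct, True, **stakes)
    return (1.0 - p_challenge) * unchallenged + p_challenge * challenged


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5):
        honest = expected_solver_payoff(True, p)
        sloppy = expected_solver_payoff(False, p)
        print(f"p_challenge={p}: correct={honest:+.2f} incorrect={sloppy:+.2f}")
```

With these numbers an incorrect result only breaks even when nobody ever challenges; as soon as challenges become likely, submitting correct work dominates, which is the intuition behind correctness as an equilibrium.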


Article 16

Title@2025-07-15 (2): Large-scale distributed synchronization systems, using a cancel-on-completion redundancy mechanism

Title: Large-scale distributed synchronization systems, using a cancel-on-completion redundancy mechanism Großmaßstäbliche verteilte Synchronisationssysteme, mit einem storn-on-completion Redundanz-Mechanismus 使用完成后注销冗余机制,使用大规模分布式分布式的大规模同步系统 2507.11779v1

Authors (1): Alexander Stolyar

We consider a class of multi-agent distributed synchronization systems, which are modeled as $n$ particles moving on the real line. This class generalizes the model of a multi-server queueing system, considered in [15], employing the so-called cancel-on-completion (c.o.c.) redundancy mechanism, but is motivated by other applications as well. The model in [15] is a particle system, regulated at the left boundary point. The more general model of this paper allows regulation boundaries on either side, on both sides, or no regulation at all. We consider the mean-field asymptotic regime, when the number of particles $n$ and the job arrival rates go to infinity, while the job arrival rates per particle remain constant. The results include: the existence/uniqueness of fixed points of mean-field limits (ML), which describe the limiting dynamics of the system; conditions for the steady-state asymptotic independence (concentration, as $n \to\infty$, of the stationary distribution on a single state, which is necessarily an ML fixed point); the limits, as $n \to\infty$, of the average velocity at which the unregulated (free) particle system advances. In particular, our results for the left-regulated system unify and generalize the corresponding results in [15]. Our technical development is such that the systems with different types of regulation are analyzed within a unified framework. In particular, these systems are used as tools for analysis of each other.

我们考虑的是一组多试剂分布式同步系统,其模型是实际线上移动的零美元粒子。这一类将多服务器排队系统的模式(在[15] 中考虑的多服务器排队系统的模式概括化,采用所谓的“取消即成”(c.o.c.)冗余机制,但也受其他应用的驱动。[15] 中的模型是一个粒子系统,在左边边界点加以规范。本文较为笼统的模式是,我们允许双方或双方的监管界限,或根本不有任何监管。我们考虑的是,当颗粒数量和工作到货率达到无限时,该模型将采用模式,而每个粒子的工作到货率则保持不变,其结果包括:平均场限值固定点的存在/性质(c.o.c.c.)是一个粒子系统在左边边界点上调节,稳定状态在单一状态上分配,这必然是ML固定值分析值的固定点,而每个粒子的到达率率率则以我们普通系统内部的平均结果为准。


Article 17

Title@2025-07-15 (2): A Cellular Automata Approach to Donation Game

Title: A Cellular Automata Approach to Donation Game Ein zellulärer Automata Ansatz zur Spende Spiel 捐赠游戏的细胞自动模式 2507.11744v1

Authors (5): Marcin Kowalik, Przemysław Stokłosa, Mateusz Grabowski, Janusz Starzyk, Paweł Raif

The donation game is a well-established framework for studying the emergence and evolution of cooperation in multi-agent systems. The cooperative behavior can be influenced by the environmental noise in partially observable settings and by the decision-making strategies of agents, which may incorporate not only reputation but also traits such as generosity and forgiveness. Traditional simulations often assume fully random interactions, where cooperation is tested between randomly selected agent pairs. In this paper, we investigate cooperation dynamics using the concept of Stephen Wolfram’s one-dimensional binary cellular automata. This approach allows us to explore how cooperation evolves when interactions are limited to neighboring agents. We define binary cellular automata rules that conform to the donation game mechanics. Additionally, we introduce models of perceptual and action noise, along with a mutation matrix governing the probabilistic evolution of agent strategies. Our empirical results demonstrate that cooperation is significantly affected by agents’ mobility and their spatial locality on the game board. These findings highlight the importance of distinguishing between entirely random multi-agent systems and those in which agents are more likely to interact with their nearest neighbors.

捐赠游戏是研究多试剂系统合作的出现和演变的既定框架。合作行为可以受部分可观测环境中的环境噪音以及代理商决策战略的影响,这些战略可能不仅包括声誉,而且包括慷慨和宽恕等特征。传统模拟往往假定完全随机的相互作用,随机选择的代理商对等方之间对合作进行测试。在本文中,我们利用Stephen Wolfram的单维二维蜂窝自动磁体的概念来调查合作动态。这一方法使我们能够探索合作在互动仅限于邻近代理商时如何发展。我们界定符合捐赠游戏机的二进制蜂窝自动磁体规则。此外,我们引入了概念和行动噪音模型,以及指导代理商战略概率演进的突变矩阵。我们的经验结果表明,合作受到代理人在游戏板上的流动及其空间位置的重大影响。这些研究结果突出表明,必须区分完全随机的多试剂系统和代理人更有可能与其近邻进行互动的系统。
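To illustrate what a neighbour-only donation game on a one-dimensional binary cellular automaton might look like, here is a minimal sketch. The rule is an assumption chosen for brevity and is not one of the paper's rules: cells sit on a ring, state 1 donates to both neighbours at cost `c` each, a recipient gains `b` per donation, and each cell then copies the strategy of the best-scoring cell in its three-cell neighbourhood. Noise, reputation, and the mutation matrix are left out.

```python
def payoffs(states, b=3.0, c=1.0):
    """Payoff of every cell after one round of donations on a ring.

    states[i] == 1 means cell i donates to both neighbours; 0 means it does not.
    """
    n = len(states)
    result = []
    for i in range(n):
        left, right = states[(i - 1) % n], states[(i + 1) % n]
        received = b * (left + right)   # benefit from donating neighbours
        paid = 2 * c * states[i]        # cost of own two donations
        result.append(received - paid)
    return result


def step(states, b=3.0, c=1.0):
    """Synchronous update: copy the best-scoring strategy in the neighbourhood.

    Ties are broken in favour of keeping the cell's own strategy.
    """
    n = len(states)
    p = payoffs(states, b, c)
    new_states = []
    for i in range(n):
        hood = [(i - 1) % n, i, (i + 1) % n]
        best = max(hood, key=lambda j: (p[j], j == i))
        new_states.append(states[best])
    return new_states


if __name__ == "__main__":
    row = [1, 1, 0, 1, 1]
    for _ in range(3):
        print(row)
        row = step(row)
```

Because each new state depends on payoffs, which in turn depend on neighbours' neighbours, the induced automaton has an effective radius of two cells; under this particular imitation rule a lone non-donor spreads to its immediate neighbours in one step.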


Article 18

Title@2025-07-15 (2): MR-LDM – The Merge-Reactive Longitudinal Decision Model: Game Theoretic Human Decision Modeling for Interactive Sim Agents

Title: MR-LDM – The Merge-Reactive Longitudinal Decision Model: Game Theoretic Human Decision Modeling for Interactive Sim Agents MR-LDM – Das Merge-Reactive Longitudinal Decision Model: Game Theoretic Human Decision Modeling for Interactive Sim Agents MR-LDM – – 合并-反反应纵向决定模型:互动模拟剂的游戏理论人类决定模型 2507.12494v1

Authors (4): Dustin Holley, Jovin D’sa, Hossein Nourkhiz Mahjoub, Gibran Ali

Enhancing simulation environments to replicate real-world driver behavior, i.e., more humanlike sim agents, is essential for developing autonomous vehicle technology. In the context of highway merging, previous works have studied the operational-level yielding dynamics of lag vehicles in response to a merging car at highway on-ramps. Other works focusing on tactical decision modeling generally consider limited action sets or utilize payoff functions with large parameter sets and limited payoff bounds. In this work, we aim to improve the simulation of the highway merge scenario by targeting a game theoretic model for tactical decision-making with improved payoff functions and lag actions. We couple this with an underlying dynamics model to have a unified decision and dynamics model that can capture merging interactions and simulate more realistic interactions in an explainable and interpretable fashion. The proposed model demonstrated good reproducibility of complex interactions when validated on a real-world dataset. The model was finally integrated into a high fidelity simulation environment and confirmed to have adequate computation time efficiency for use in large-scale simulations to support autonomous vehicle development.

强化模拟环境以复制真实世界驱动力行为,即更像人性的模拟剂,对于发展自主车辆技术至关重要。在高速公路合并方面,以往的工程已经研究过针对高速灯光上的汽车合并后,滞后车辆在操作层面产生的动力。其他侧重于战术决策模型的工程一般考虑到有限的成套行动,或利用具有大参数集和有限报酬界限的补偿功能。在这项工作中,我们的目标是改进高速公路合并情景的模拟,方法是针对一个游戏理论模型进行战术决策,改进支付功能和滞后动作。我们将此与一个潜在的动态模型结合起来,以便有一个统一的决定和动态模型,能够以可解释和可解释的方式捕捉到合并的相互作用和模拟更现实的互动。拟议的模型表明,在现实世界数据集验证时,复杂互动可以很好地再现。该模型最终被纳入一个高度忠诚的模拟环境,并被确认具有充分的计算时间效率,用于大规模模拟,以支持自主车辆开发。


Article 19

Title@2025-07-15 (2): Let’s Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification

Title: Let’s Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification Lassen Sie uns in zwei Schritten denken: Abmildern Vereinbarung Bias in MLLMs mit selbst-gerundete Verifikation 让我们思考两步:在MLLMs中减少协议与自我核查的偏见 2507.11662v1

Authors (6): Moises Andrade, Joonhyuk Cha, Brandon Ho, Vriksha Srihari, Karmesh Yadav, Zsolt Kira

Verifiers – functions assigning rewards to agent behavior – have been key for AI progress in domains like math and board games. However, extending these gains to domains without clear-cut success criteria (e.g., computer use) remains a challenge: while humans can recognize suitable outcomes, translating this intuition into scalable rules is non-trivial. Multimodal Large Language Models (MLLMs) emerge as a promising solution, given their world knowledge, human-preference alignment, and reasoning skills. We evaluate MLLMs as verifiers of agent trajectories across web navigation, computer use, and robotic manipulation, and identify a critical limitation: agreement bias, a strong tendency for MLLMs to favor information in their context window, often generating chains of thought to rationalize flawed behavior. This bias is pervasive across models, resilient to test-time scaling, and can impact several methods using MLLMs as evaluators (e.g., data filtering). Notably, it occurs despite MLLMs showing strong, human-aligned priors on desired behavior. To address this, we propose Self-Grounded Verification (SGV), a lightweight method that enables more effective use of MLLMs’ knowledge and reasoning by harnessing their own sampling mechanisms via unconditional and conditional generation. SGV operates in two steps: first, the MLLM is elicited to retrieve broad priors about task completion, independent of the data under evaluation. Then, conditioned on self-generated priors, it reasons over and evaluates a candidate trajectory. Enhanced with SGV, MLLM verifiers show gains of up to 20 points in accuracy and failure detection rates, and can perform real-time supervision of heterogeneous agents, boosting task completion of a GUI specialist in OSWorld, a diffusion policy in robomimic, and a ReAct agent in VisualWebArena – setting a new state of the art on the benchmark, surpassing the previous best by 48%.

验证者 – – 对代理行为给予奖励的职能 – – 在数学和棋盘游戏等领域是AI进步的关键。然而,将这些收益扩大到没有明确成功标准的领域(例如计算机使用),这仍然是一个挑战:虽然人类可以承认合适的结果,将这种直觉转化为可扩展的规则是非三角的。多式大语言模型(MLLMs)因其世界知识、人文比比对和推理技能而成为一个有希望的解决方案。我们评估MLLMS是互联网导航、计算机使用和机器人操纵等领域代理轨迹的验证者,并确定了一个关键限制:协议偏向、MLLLMS偏重其上的信息,往往形成将错误行为合理化的思维链条。这种偏向于各种模型,适应测试时间缩放,并且能够影响以 MLLLMMs为评审者(例如,数据过滤)为评价者的一种最佳方法。 值得注意的是,尽管MLLLMM公司在网络导航、计算机和机器人操作过程中表现出强力、人性更接近的预感知性前行,我们提议进行自我循环核查(SGV),一个较轻度的升级的升级的SDRDRDRevral),一个方法使得它们能够更高效地在时间上使用一个更精确的升级的升级的升级的动作。
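The two-step structure of SGV can be sketched as a thin wrapper around any text-generation callable. The function name, the prompt wording, and the SUCCESS/FAILURE answer convention below are hypothetical choices made for illustration; only the ordering (priors first, without the trajectory in context, then evaluation conditioned on those priors) follows the abstract.

```python
from typing import Callable


def self_grounded_verify(generate: Callable[[str], str], task: str, trajectory: str) -> bool:
    """Two-step verification sketch.

    Step 1 elicits the model's priors about what task completion looks like,
    with no access to the trajectory under evaluation.
    Step 2 asks for a verdict on the trajectory, conditioned on those priors.
    `generate` is any prompt -> text function (a hypothetical stand-in for an
    MLLM call).
    """
    prior_prompt = (
        f"Task: {task}\n"
        "Describe, in general terms, what successful completion of this task looks like."
    )
    priors = generate(prior_prompt)

    verdict_prompt = (
        f"Task: {task}\n"
        f"Expected behaviour:\n{priors}\n"
        f"Candidate trajectory:\n{trajectory}\n"
        "Does the trajectory complete the task? Answer SUCCESS or FAILURE, then give a reason."
    )
    verdict = generate(verdict_prompt)
    return verdict.strip().upper().startswith("SUCCESS")


if __name__ == "__main__":
    # Stub model for demonstration: returns canned text instead of calling an MLLM.
    def stub(prompt: str) -> str:
        if "Candidate trajectory" in prompt:
            return "FAILURE: the file was never saved."
        return "The document is saved under the requested name."

    print(self_grounded_verify(stub, "save the file as report.txt", "opened the File menu"))
```

Keeping the trajectory out of the first prompt is the design point: the priors cannot be coloured by the behaviour they will later be used to judge.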


Article 20

Title@2025-07-15 (2): STAGED: A Multi-Agent Neural Network for Learning Cellular Interaction Dynamics

Title: STAGED: A Multi-Agent Neural Network for Learning Cellular Interaction Dynamics STAGED: Ein multi-agent-neurales Netzwerk zum Lernen zellulärer Interaktionsdynamik STAGAD: 学习细胞互动动态多要素神经网络 2507.11660v1

Authors (9): Joao F. Rocha, Ke Xu, Xingzhi Sun, Ananya Krishna, Dhananjay Bhaskar, Blanche Mongeon, Morgan Craig, Mark Gerstein, Smita Krishnaswamy

The advent of single-cell technology has significantly improved our understanding of cellular states and subpopulations in various tissues under normal and diseased conditions by employing data-driven approaches such as clustering and trajectory inference. However, these methods consider cells as independent data points of population distributions. With spatial transcriptomics, we can represent cellular organization, along with dynamic cell-cell interactions that lead to changes in cell state. Still, key computational advances are necessary to enable the data-driven learning of such complex interactive cellular dynamics. While agent-based modeling (ABM) provides a powerful framework, traditional approaches rely on handcrafted rules derived from domain knowledge rather than on data-driven approaches. To address this, we introduce Spatio Temporal Agent-Based Graph Evolution Dynamics (STAGED), which integrates ABM with deep learning to model intercellular communication and its effect on the intracellular gene regulatory network. Using graph ODE networks (GDEs) with shared weights per cell type, our approach represents genes as vertices and interactions as directed edges, dynamically learning their strengths through a designed attention mechanism. Trained to match continuous trajectories, both simulated and inferred from spatial transcriptomics data, the model captures both intercellular and intracellular interactions, enabling a more adaptive and accurate representation of cellular dynamics.

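Single-cell technology has greatly advanced our understanding of cell states and subpopulations in different tissues under normal and diseased conditions. These methods, however, treat cells as independent data points of population distributions. With spatial transcriptomics, we can represent cellular organization together with the dynamic cell-cell interactions that change cell state. Key computational advances are still needed to learn such complex interactive cellular dynamics from data. Agent-based modeling (ABM) offers a powerful framework, but traditional approaches rely on handcrafted rules derived from domain knowledge rather than on data-driven learning. To address this, we introduce Spatio Temporal Agent-Based Graph Evolution Dynamics (STAGED), which combines ABM with deep learning to model intercellular communication and its effect on the intracellular gene regulatory network. Using graph ODE networks (GDEs) with weights shared per cell type, our approach represents genes as vertices and interactions as directed edges, and learns their strengths dynamically through a designed attention mechanism. Trained to match simulated trajectories as well as trajectories inferred from spatial transcriptomics data, the model captures both intercellular and intracellular interactions, giving a more adaptive and accurate representation of cellular dynamics.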


Article 21

Title@2025-07-15 (2): LF: Online Multi-Robot Path Planning Meets Optimal Trajectory Control

Title: LF: Online Multi-Robot Path Planning Meets Optimal Trajectory Control LF: Online-Multi-Roboter-Pfadplanung trifft auf optimale Trajektoriensteuerung LF: 在线多机器人路径规划满足最佳轨迹控制 2507.11464v1

Authors (3): Ajay Shankar, Keisuke Okumura, Amanda Prorok

We propose a multi-robot control paradigm to solve point-to-point navigation tasks for a team of holonomic robots with access to the full environment information. The framework invokes two processes asynchronously at high frequency: (i) a centralized, discrete, and full-horizon planner for computing collision- and deadlock-free paths rapidly, leveraging recent advances in multi-agent pathfinding (MAPF), and (ii) dynamics-aware, robot-wise optimal trajectory controllers that ensure all robots independently follow their assigned paths reliably. This hierarchical shift in planning representation from (i) discrete and coupled to (ii) continuous and decoupled domains enables the framework to maintain long-term scalable motion synthesis. As an instantiation of this idea, we present LF, which combines a fast state-of-the-art MAPF solver (LaCAM), and a robust feedback control stack (Freyja) for executing agile robot maneuvers. LF provides a robust and versatile mechanism for lifelong multi-robot navigation even under asynchronous and partial goal updates, and adapts to dynamic workspaces simply by quick replanning. We present various multirotor and ground robot demonstrations, including the deployment of 15 real multirotors with random, consecutive target updates while a person walks through the operational workspace.

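We propose a multi-robot control paradigm for point-to-point navigation tasks carried out by a team of holonomic robots with access to full environment information. The framework invokes two processes asynchronously at high frequency: (i) a centralized, discrete, full-horizon planner that rapidly computes collision-free and deadlock-free paths, building on recent advances in multi-agent pathfinding (MAPF), and (ii) dynamics-aware, per-robot optimal trajectory controllers that ensure every robot independently and reliably follows its assigned path. This hierarchical shift in planning representation, from (i) discrete and coupled to (ii) continuous and decoupled domains, lets the framework maintain long-term scalable motion synthesis. As an instantiation of this idea, we present LF, which combines a fast state-of-the-art MAPF solver (LaCAM) with a robust feedback control stack (Freyja) for executing agile robot maneuvers. LF provides a robust and versatile mechanism for lifelong multi-robot navigation, even under asynchronous and partial goal updates.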


Article 22

Title@2025-07-15 (2): Simulation for All: A Step-by-Step Cookbook for Developing Human-Centered Multi-Agent Transportation Simulators

Title: Simulation for All: A Step-by-Step Cookbook for Developing Human-Centered Multi-Agent Transportation Simulators Simulation für alle: Ein Schritt für Schritt Kochbuch für die Entwicklung von Mensch-zentrierten Multi-Agenten-Transportsimulatoren 面向所有人的模拟:开发以人为本的多机构运输模拟器的《一步一步步编制手册》 2507.09367v2

Authors (2): Shiva Azimi, Arash Tavakoli

As cities evolve toward more complex and multimodal transportation systems, the need for human-centered multi-agent simulation tools has never been more urgent. Yet most existing platforms remain limited: they often separate different types of road users, rely on scripted or pre-defined behaviors, overlook public transit users as active participants, and are rarely designed with accessibility in mind for non-technical users. To address this gap, this paper presents the specifications of a multi-agent simulation platform designed to support real-time, human-centered, and immersive studies of all road users, accompanied by open-source scripts for replication. Using high-fidelity immersive virtual environments, our platform enables interaction across public transit users, pedestrians, cyclists, automated vehicles, and drivers. The architecture is modular, extensible, and designed for accessibility. The system integrates hardware-specific modules, including an omnidirectional treadmill, a seating arrangement, a smart trainer, and an actuated cockpit. Additionally, the platform collects multimodal physiological, neurological, and behavioral data through embedded sensing devices such as functional near-infrared spectroscopy (fNIRS), eye tracking, and wrist-based biosensors. To show the usability of this system, we present three use cases. Simulation for All aims to lower the barrier to entry for high-fidelity transportation simulation, support experimentation across disciplines, and advance our understanding of multimodal mobility in complex urban environments.

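As cities move toward more complex, multimodal transportation systems, the need for human-centered multi-agent simulation tools has never been more pressing. Most existing platforms remain limited: they tend to separate different types of road users, rely on scripted or pre-defined behaviors, overlook public transit users as active participants, and are rarely designed to be accessible to non-technical users. To close this gap, this paper presents the specifications of a multi-agent simulation platform that supports real-time, human-centered, immersive studies of all road users, together with open-source scripts for replication. Using high-fidelity immersive virtual environments, the platform enables interaction among public transit users, pedestrians, cyclists, automated vehicles, and drivers. Its architecture is modular, extensible, and designed for accessibility. The system integrates hardware-specific modules, including an omnidirectional treadmill, a seating arrangement, a smart trainer, and an actuated cockpit, and collects multimodal physiological, neurological, and behavioral data through embedded sensing devices such as functional near-infrared spectroscopy (fNIRS), eye tracking, and wrist-based biosensors. We present three use cases to show the usability of the system. Simulation for All aims to lower the barrier to entry for high-fidelity transportation simulation, support experimentation across disciplines, and advance our understanding of multimodal mobility in complex urban environments.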


Article 23

Title@2025-07-15 (2): From Kinetic Theory to AI: a Rediscovery of High-Dimensional Divergences and Their Properties

Title: From Kinetic Theory to AI: a Rediscovery of High-Dimensional Divergences and Their Properties Von der Kinetischen Theorie zur KI: Eine Wiederentdeckung hochdimensionaler Divergenzen und ihrer Eigenschaften 从动从理论到AI:重现高度多元差异及其属性 2507.11387v1

Authors (4): Gennaro Auricchio, Giovanni Brigati, Paolo Giudici, Giuseppe Toscani

Selecting an appropriate divergence measure is a critical aspect of machine learning, as it directly impacts model performance. Among the most widely used, we find the Kullback-Leibler (KL) divergence, originally introduced in kinetic theory as a measure of relative entropy between probability distributions. Just as in machine learning, the ability to quantify the proximity of probability distributions plays a central role in kinetic theory. In this paper, we present a comparative review of divergence measures rooted in kinetic theory, highlighting their theoretical foundations and exploring their potential applications in machine learning and artificial intelligence.

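Choosing an appropriate divergence measure is a key aspect of machine learning because it directly affects model performance. Among the most widely used is the Kullback-Leibler (KL) divergence, originally introduced in kinetic theory as a measure of relative entropy between probability distributions. As in machine learning, the ability to quantify how close two probability distributions are plays a central role in kinetic theory. In this paper, we give a comparative review of divergence measures rooted in kinetic theory, highlighting their theoretical foundations and exploring their potential applications in machine learning and artificial intelligence.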
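As a concrete reference point for the KL divergence named in the abstract above, the following is a minimal sketch of its standard discrete form, D_KL(p || q) = sum_i p_i log(p_i / q_i). The function name `kl_divergence` and the example distributions are illustrative assumptions and are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Discrete KL divergence D_KL(p || q) in nats.

    p and q are sequences of probabilities over the same outcomes.
    Terms with p_i == 0 contribute nothing; if q_i == 0 where p_i > 0,
    the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


if __name__ == "__main__":
    p = [0.5, 0.5]   # hypothetical distributions for illustration
    q = [0.9, 0.1]
    print(kl_divergence(p, q), kl_divergence(q, p))
```

The two printed values differ, which illustrates that the KL divergence is not symmetric and therefore is not a metric.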


Article 24

Title@2025-07-15 (2): Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs

Title: Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs Schrittweise Richtlinie für Wissen über seltene Werkzeuge (SPaRK): Offline-RL, die vielfältige Werkzeugnutzung in LLMs antreibt 有限工具知识(SPARK)的逐步政策:驱动在LLM中使用多样化工具的离线RL 2507.11371v1

Authors (3): Gabriel Bo, Koa Chang, Justin Gu

We present Step-wise Policy for Rare-tool Knowledge (SPaRK), a novel reinforcement learning framework that teaches large language models to explore diverse tool usage patterns beyond conventional high-temperature sampling. Building on recent advances in step-wise reinforcement learning, we introduce a dual-objective reward system that simultaneously optimizes for answer quality and tool diversity, training a Llama-3.1 8B model through offline PPO on synthetically generated trajectories from the MMLU-Pro dataset. Our approach uniquely employs a rarity-first exploitation strategy where a GPT-4o judge scores candidate actions across eight distinct tools plus chain-of-thought reasoning, with the policy favoring less-frequently used but still viable tools to encourage systematic exploration. Empirical results demonstrate that SPaRK achieves competitive performance across 14 MMLU-Pro categories while exhibiting significantly higher entropy in tool selection compared to both baseline and supervised fine-tuning approaches, suggesting that algorithmic exploration through explicit tool diversity can enhance reasoning capabilities without sacrificing accuracy.

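We present Step-wise Policy for Rare-tool Knowledge (SPaRK), a novel reinforcement learning framework that teaches large language models to explore diverse tool usage patterns beyond conventional high-temperature sampling. Building on recent advances in step-wise reinforcement learning, we introduce a dual-objective reward system that optimizes answer quality and tool diversity at the same time, training a Llama-3.1 8B model with offline PPO on synthetically generated trajectories from the MMLU-Pro dataset. Our approach uses a rarity-first exploitation strategy in which a GPT-4o judge scores candidate actions across eight distinct tools plus chain-of-thought reasoning, and the policy favors less frequently used but still viable tools to encourage systematic exploration. Empirical results show that SPaRK achieves competitive performance across 14 MMLU-Pro categories while exhibiting significantly higher entropy in tool selection than both the baseline and supervised fine-tuning approaches, suggesting that algorithmic exploration through explicit tool diversity can enhance reasoning capabilities without sacrificing accuracy.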
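The rarity-first idea described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the viability rule (a judge score within `margin` of the best score), the function names, and the example tools are all hypothetical.

```python
import math
from collections import Counter


def rarity_first_choice(scores, usage, margin=1.0):
    """Pick the least-used tool among the viable candidates.

    scores: dict mapping tool name -> judge score (higher is better).
    usage:  Counter of how often each tool was chosen so far.
    margin: hypothetical viability threshold; a tool is viable if its
            score is within `margin` of the best score.
    """
    best = max(scores.values())
    viable = [tool for tool, s in scores.items() if s >= best - margin]
    # Fewest past uses first; ties go to the higher score, then the name.
    return min(viable, key=lambda t: (usage[t], -scores[t], t))


def selection_entropy(usage):
    """Shannon entropy (bits) of the empirical tool-selection distribution."""
    total = sum(usage.values())
    return -sum((c / total) * math.log2(c / total)
                for c in usage.values() if c > 0)


if __name__ == "__main__":
    scores = {"search": 9.0, "calculator": 8.5, "python": 5.0}
    usage = Counter({"search": 10, "calculator": 2})
    print(rarity_first_choice(scores, usage))
    print(selection_entropy(usage))
```

Higher entropy of the selection counts corresponds to the more diverse tool use that the abstract reports.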


Article 25

Title@2025-07-15 (2): Beyond Predictions: A Participatory Framework for Multi-Stakeholder Decision-Making

Title: Beyond Predictions: A Participatory Framework for Multi-Stakeholder Decision-Making Beyond Predictions: Ein partizipatorischer Rahmen für Entscheidungsfindung mit mehreren Interessenträgern 超越预测:多方利益攸关方决策参与框架 2502.08542v2

Authors (3): Vittoria Vineis, Giuseppe Perelli, Gabriele Tolomei

Conventional automated decision-support systems, often based on supervised learning, focus on predicting outcomes to recommend actions. However, they typically overlook the complexity of multi-actor environments, where diverse and conflicting stakeholder preferences must be balanced. At the same time, participatory AI approaches remain largely context-specific, limiting their broader applicability. To address these gaps, we propose a participatory framework that reframes decision-making as a multi-stakeholder optimization problem, using context-dependent reward functions to represent each actor’s preferences. Our modular, model-agnostic framework employs k-fold cross-validation to fine-tune user-provided prediction models and evaluate decision strategies, including compromise functions that mediate stakeholder trade-offs. A synthetic scoring mechanism aggregates user-defined preferences across multiple metrics to rank strategies and select an optimal decision-maker for generating actionable recommendations on new data. Validated on two high-stakes real-world case studies, the framework consistently produces stakeholder-aware decisions that outperform purely predictive baselines across multiple metrics, while enhancing the transparency and accountability of AI-supported decision-making.

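Conventional automated decision-support systems, often based on supervised learning, focus on predicting outcomes in order to recommend actions, but they typically overlook the complexity of multi-actor environments in which diverse and conflicting stakeholder preferences must be balanced. Participatory AI approaches, meanwhile, remain largely context-specific, which limits their broader applicability. To address these gaps, we propose a participatory framework that recasts decision-making as a multi-stakeholder optimization problem and uses context-dependent reward functions to represent each actor's preferences. The modular, model-agnostic framework uses k-fold cross-validation to fine-tune user-provided prediction models and to evaluate decision strategies, including compromise functions that mediate stakeholder trade-offs. A synthetic scoring mechanism aggregates user-defined preferences across multiple metrics to rank strategies and select the best decision-maker for producing actionable recommendations on new data. Validated on two high-stakes real-world case studies, the framework consistently yields stakeholder-aware decisions that outperform purely predictive baselines on multiple metrics, while improving the transparency and accountability of AI-supported decision-making.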
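To make the notion of a compromise function concrete, here is a minimal sketch that ranks candidate actions by aggregating per-stakeholder rewards. The two aggregation rules (sum and minimum), the function name `select_action`, and the lending example are assumptions chosen for illustration; the abstract does not specify which compromise functions the framework uses.

```python
def select_action(rewards, compromise):
    """Return the action whose aggregated stakeholder reward is highest.

    rewards:    dict mapping action -> dict of stakeholder -> reward.
    compromise: function that reduces a list of rewards to one score.
    """
    return max(rewards, key=lambda a: compromise(list(rewards[a].values())))


# Hypothetical reward table for a lending decision.
REWARDS = {
    "approve": {"lender": 1.0, "applicant": 0.3},
    "approve_with_conditions": {"lender": 0.6, "applicant": 0.6},
    "reject": {"lender": 0.3, "applicant": 0.1},
}

if __name__ == "__main__":
    # Utilitarian compromise: maximize the total reward.
    print(select_action(REWARDS, sum))
    # Egalitarian compromise: maximize the worst-off stakeholder's reward.
    print(select_action(REWARDS, min))
```

The two rules pick different actions on this table, which is the kind of trade-off a compromise function is meant to expose.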


Article 26

Title@2025-07-15 (2): Taming Uncertainty via Automation: Observing, Analyzing, and Optimizing Agentic AI Systems

Title: Taming Uncertainty via Automation: Observing, Analyzing, and Optimizing Agentic AI Systems Ungewissheit durch Automatisierung: Beobachten, Analysieren und Optimieren Agentischer KI-Systeme 通过自动化防止不确定性:观测、分析、优化ATI系统 2507.11277v1

Authors (2): Dany Moshkovich, Sergey Zeltyn

Large Language Models (LLMs) are increasingly deployed within agentic systems: collections of interacting, LLM-powered agents that execute complex, adaptive workflows using memory, tools, and dynamic planning. While enabling powerful new capabilities, these systems also introduce unique forms of uncertainty stemming from probabilistic reasoning, evolving memory states, and fluid execution paths. Traditional software observability and operations practices fall short in addressing these challenges. This paper introduces AgentOps: a comprehensive framework for observing, analyzing, optimizing, and automating the operation of agentic AI systems. We identify distinct needs across four key roles (developers, testers, site reliability engineers (SREs), and business users), each of whom engages with the system at different points in its lifecycle. We present the AgentOps Automation Pipeline, a six-stage process encompassing behavior observation, metric collection, issue detection, root cause analysis, optimized recommendations, and runtime automation. Throughout, we emphasize the critical role of automation in managing uncertainty and enabling self-improving AI systems, not by eliminating uncertainty, but by taming it to ensure safe, adaptive, and effective operation.

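Large language models (LLMs) are increasingly deployed inside agentic systems: collections of interacting, LLM-powered agents that carry out complex, adaptive workflows using memory, tools, and dynamic planning. These systems enable powerful new capabilities, but they also introduce distinctive forms of uncertainty that arise from probabilistic reasoning, evolving memory states, and fluid execution paths. Traditional software observability and operations practices fall short of these challenges. This paper introduces AgentOps, a comprehensive framework for observing, analyzing, optimizing, and automating the operation of agentic AI systems. We identify the distinct needs of four key roles (developers, testers, site reliability engineers (SREs), and business users), each of whom engages with the system at a different point in its lifecycle. We present the AgentOps Automation Pipeline, a six-stage process covering behavior observation, metric collection, issue detection, root cause analysis, optimized recommendations, and runtime automation. We emphasize the critical role of automation in managing uncertainty and enabling self-improving AI systems, not by eliminating uncertainty but by taming it to ensure safe, adaptive, and effective operation.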


Article 27

Title@2025-07-15 (2): On multiagent online problems with predictions

Title: On multiagent online problems with predictions Auf Multiagent Online-Probleme mit Vorhersagen 多试剂在线预测问题 2507.12486v1

Authors (3): Gabriel Istrate, Cosmin Bonchis, Victor Bogdan

We study the power of (competitive) algorithms with predictions in a multiagent setting. We introduce a two-predictor framework that assumes that agents use one predictor for their own future behavior and one for the behavior of the other players. The main problem we are concerned with is understanding the best competitive ratios that can be achieved by employing such predictors, under various assumptions on predictor quality. As an illustration of our framework, we introduce and analyze a multiagent version of the ski-rental problem. In this problem, agents can collaborate by pooling resources to get a group license for some asset. If the license price is not met, then agents have to rent the asset individually for the day at a unit price. Otherwise, the license becomes available forever to everyone at no extra cost. In the particular case of perfect predictions about the other players, the algorithm that follows the self predictor is optimal but not robust to mispredictions of the agent's own future behavior; we give an algorithm with better robustness properties and benchmark it.

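We study the power of (competitive) algorithms with predictions in a multiagent setting. We introduce a two-predictor framework in which agents use one predictor for their own future behavior and another for the behavior of the other players. Our main concern is to understand the best competitive ratios achievable with such predictors under various assumptions on predictor quality. To illustrate the framework, we introduce and analyze a multiagent version of the ski-rental problem. In this problem, agents can collaborate by pooling resources to obtain a group license for some asset. If the license price is not met, agents must rent the asset individually for the day at a unit price. Otherwise, the license becomes available to everyone forever at no extra cost. When predictions about the other players are perfect, the algorithm that follows the self predictor is optimal but not robust to mispredictions of the agent's own future behavior; we give an algorithm with better robustness properties and benchmark it.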
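The multiagent ski-rental setting above can be simulated with a small sketch. The abstract does not spell out the pooling mechanism, so several details here are assumptions: the rental price is 1 per day, pledges are collected only on the day they cover the license price, payments are scaled pro rata so that they sum to the price, and the names `simulate` and `pledge` are hypothetical.

```python
def simulate(price, active_days, pledge):
    """Simulate a pooled-license ski-rental game.

    price:       license price (must be positive).
    active_days: active_days[i] is the number of days agent i needs the asset
                 (days 1..active_days[i]).
    pledge:      function (agent, day) -> amount the agent offers that day.
    Returns (costs, purchase_day); purchase_day is None if never bought.
    """
    if price <= 0:
        raise ValueError("price must be positive")
    n = len(active_days)
    costs = [0.0] * n
    for day in range(1, max(active_days) + 1):
        active = [i for i in range(n) if active_days[i] >= day]
        offers = {i: pledge(i, day) for i in active}
        total = sum(offers.values())
        if total >= price:
            # License bought: contributors pay pro rata, then nobody pays again.
            for i, amount in offers.items():
                costs[i] += amount * price / total
            return costs, day
        # Price not met: every active agent rents for the day at unit price.
        for i in active:
            costs[i] += 1.0
    return costs, None


if __name__ == "__main__":
    # Three agents each need the asset for 5 days and start pledging on day 3.
    print(simulate(6.0, [5, 5, 5], lambda i, d: 2.0 if d >= 3 else 0.0))
```

In this run each agent rents for two days and then pays a share of the license, for a cost of 4 each, versus 5 each had they rented throughout.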


Article 28

Title@2025-07-15 (2): Voting or Consensus? Decision-Making in Multi-Agent Debate

Title: Voting or Consensus? Decision-Making in Multi-Agent Debate Abstimmung oder Konsens? Entscheidungsfindung in Multi-Agent-Debatte 表决还是协商一致?多机构辩论中的决策 2502.19130v3

Authors (5): Lars Benedikt Kaesberg, Jonas Becker, Jan Philip Wahle, Terry Ruas, Bela Gipp

Much of the success of multi-agent debates depends on carefully choosing the right parameters. The decision-making protocol stands out because it can strongly affect final model answers, depending on how decisions are reached. Systematic comparison of decision protocols is difficult because many studies alter multiple discussion parameters beyond the protocol. So far, it has been largely unknown how decision-making influences different tasks. This work systematically evaluates the impact of seven decision protocols (e.g., majority voting, unanimity consensus). We change only one variable at a time, the decision protocol, to analyze how different methods affect the collaboration between agents and to measure differences in knowledge and reasoning tasks. Our results show that, compared to other decision protocols, voting protocols improve performance by 13.2% in reasoning tasks and consensus protocols improve it by 2.8% in knowledge tasks. Increasing the number of agents improves performance, while more discussion rounds before voting reduce it. To improve decision-making by increasing answer diversity, we propose two new methods, All-Agents Drafting (AAD) and Collective Improvement (CI). Our methods improve task performance by up to 3.3% with AAD and up to 7.4% with CI. This work demonstrates the importance of decision-making in multi-agent debates beyond scaling.

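Much of the success of multi-agent debates depends on choosing the right parameters carefully. The decision-making protocol stands out because, depending on how decisions are reached, it can strongly affect the final model answers. Comparing decision protocols systematically is difficult because many studies change several discussion parameters besides the protocol. Until now it has been largely unknown how decision-making affects different tasks. This work systematically evaluates the impact of seven decision protocols (for example, majority voting and unanimity consensus). We change only one variable at a time, the decision protocol, to analyze how different methods affect collaboration between agents and to measure differences on knowledge and reasoning tasks. Our results show that, compared to other decision protocols, voting protocols improve performance by 13.2% on reasoning tasks and consensus protocols improve it by 2.8% on knowledge tasks. Increasing the number of agents improves performance, while more discussion rounds before voting reduce it. To improve decision-making by increasing answer diversity, we propose two new methods, All-Agents Drafting (AAD) and Collective Improvement (CI). Our methods improve task performance by up to 3.3% with AAD and up to 7.4% with CI. This work shows the importance of decision-making in multi-agent debates beyond scaling.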
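Two of the protocols named above, majority voting and unanimity consensus, can be contrasted with a short sketch. The definitions used here (strict majority, full agreement) and the `debate` loop are simplifying assumptions for illustration and may differ from the exact protocols evaluated in the paper.

```python
from collections import Counter


def majority_vote(answers):
    """Return the answer held by a strict majority of agents, else None."""
    if not answers:
        return None
    answer, count = Counter(answers).most_common(1)[0]
    return answer if count > len(answers) / 2 else None


def unanimity_consensus(answers):
    """Return the answer only if every agent agrees, else None."""
    if answers and len(set(answers)) == 1:
        return answers[0]
    return None


def debate(rounds, decide):
    """Apply a decision protocol after each round until it yields an answer.

    rounds: list of per-round answer lists, one entry per agent.
    Returns (decision, round_number); decision is None if never reached.
    """
    for number, answers in enumerate(rounds, start=1):
        decision = decide(answers)
        if decision is not None:
            return decision, number
    return None, len(rounds)


if __name__ == "__main__":
    # Hypothetical three-agent debate that converges over three rounds.
    rounds = [["A", "B", "C"], ["A", "A", "B"], ["A", "A", "A"]]
    print(debate(rounds, majority_vote))
    print(debate(rounds, unanimity_consensus))
```

On this example voting terminates one round earlier than consensus, which shows how the protocol alone changes when and how a debate ends.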


Article 29

Title@2025-07-15 (2): A unifying approach to self-organizing systems interacting via conservation laws

Title: A unifying approach to self-organizing systems interacting via conservation laws Ein vereinheitlichter Ansatz für selbstorganisierende Systeme, die über Erhaltungsgesetze interagieren 对通过养护法相互作用的自我组织系统采取统一办法 2507.02575v3

Authors (8): Frank Barrows, Guanming Zhang, Satyam Anand, Zixi Chen, Jonathan Lin, Aman Desai, Stefano Martiniani, Francesco Caravelli

We present a unified framework for embedding and analyzing dynamical systems using generalized projection operators rooted in local conservation laws. By representing physical, biological, and engineered systems as graphs with incidence and cycle matrices, we derive dual projection operators that decompose network fluxes and potentials. This formalism aligns with principles of non-equilibrium thermodynamics and captures a broad class of systems governed by flux-forcing relationships and local constraints. We extend this approach to collective dynamics through the PRojective Embedding of Dynamical Systems (PrEDS), which lifts low-dimensional dynamics into a high-dimensional space, enabling both replication and recovery of the original dynamics. When systems fall within the PrEDS class, their collective behavior can be effectively approximated through projection onto a mean-field space. We demonstrate the versatility of PrEDS across diverse domains, including resistive and memristive circuits, adaptive flow networks (e.g., slime molds), elastic string networks, and particle swarms. Notably, we establish a direct correspondence between PrEDS and swarm dynamics, revealing new insights into optimization and self-organization. Our results offer a general theoretical foundation for analyzing complex networked systems and for designing systems that self-organize through local interactions.

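We present a unified framework for embedding and analyzing dynamical systems using generalized projection operators rooted in local conservation laws. By representing physical, biological, and engineered systems as graphs with incidence and cycle matrices, we derive dual projection operators that decompose network fluxes and potentials. This formalism is consistent with the principles of non-equilibrium thermodynamics and captures a broad class of systems governed by flux-forcing relationships and local constraints. We extend the approach to collective dynamics through the PRojective Embedding of Dynamical Systems (PrEDS), which lifts low-dimensional dynamics into a high-dimensional space and allows the original dynamics to be both replicated and recovered. When systems fall within the PrEDS class, their collective behavior can be approximated effectively by projection onto a mean-field space. We demonstrate the versatility of PrEDS across diverse domains, including resistive and memristive circuits, adaptive flow networks (e.g., slime molds), elastic string networks, and particle swarms. Notably, we establish a direct correspondence between PrEDS and swarm dynamics, which yields new insights into optimization and self-organization. Our results provide a general theoretical foundation for analyzing complex networked systems and for designing systems that self-organize through local interactions.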


Article 30

Title@2025-07-15 (2): MATE: LLM-Powered Multi-Agent Translation Environment for Accessibility Applications

Title: MATE: LLM-Powered Multi-Agent Translation Environment for Accessibility Applications MATE: LLM-Powered Multi-Agent Translation Environment for Accessibility Applications MATE:为无障碍应用提供LLM 授权多机构翻译环境 2506.19502v2

Authors (3): Aleksandr Algazinov, Matt Laing, Paul Laban

Accessibility remains a critical concern in today’s society, as many technologies are not developed to support the full range of user needs. Existing multi-agent systems (MAS) often cannot provide comprehensive assistance for users in need due to the lack of customization stemming from closed-source designs. Consequently, individuals with disabilities frequently encounter significant barriers when attempting to interact with digital environments. We introduce MATE, a multimodal accessibility MAS, which performs the modality conversions based on the user’s needs. The system is useful for assisting people with disabilities by ensuring that data will be converted to an understandable format. For instance, if the user cannot see well and receives an image, the system converts this image to its audio description. MATE can be applied to a wide range of domains, industries, and areas, such as healthcare, and can become a useful assistant for various groups of users. The system supports multiple types of models, ranging from LLM API calling to using custom machine learning (ML) classifiers. This flexibility ensures that the system can be adapted to various needs and is compatible with a wide variety of hardware. Since the system is expected to run locally, it ensures the privacy and security of sensitive information. In addition, the framework can be effectively integrated with institutional technologies (e.g., digital healthcare service) for real-time user assistance. Furthermore, we introduce ModCon-Task-Identifier, a model that is capable of extracting the precise modality conversion task from the user input. Numerous experiments show that ModCon-Task-Identifier consistently outperforms other LLMs and statistical models on our custom data. Our code and data are publicly available at https://github.com/AlgazinovAleksandr/Multi-Agent-MATE.

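Accessibility remains a critical concern in today's society, because many technologies are not developed to support the full range of user needs. Existing multi-agent systems (MAS) often cannot give comprehensive assistance to users in need, because closed-source designs leave little room for customization. As a result, people with disabilities frequently face significant barriers when they try to interact with digital environments. We introduce MATE, a multimodal accessibility MAS that performs modality conversions based on the user's needs. The system helps people with disabilities by ensuring that data is converted into a format they can understand. For example, if a user cannot see well and receives an image, the system converts the image into an audio description. MATE can be applied to a wide range of domains, industries, and areas, such as healthcare, and can serve as a useful assistant for various groups of users. The system supports multiple types of models, from LLM API calls to custom machine learning (ML) classifiers. This flexibility means the system can be adapted to different needs and is compatible with a wide variety of hardware. Because the system is expected to run locally, it protects the privacy and security of sensitive information. The framework can also be integrated with institutional technologies (e.g., digital healthcare services) for real-time user assistance. In addition, we introduce ModCon-Task-Identifier, a model that extracts the precise modality conversion task from the user input. Experiments show that ModCon-Task-Identifier consistently outperforms other LLMs and statistical models on our custom data. Our code and data are publicly available at https://github.com/AlgazinovAleksandr/Multi-Agent-MATE.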


Article 31

Title@2025-07-15 (2): Fully Data-driven but Interpretable Human Behavioural Modelling with Differentiable Discrete Choice Model

Title: Fully Data-driven but Interpretable Human Behavioural Modelling with Differentiable Discrete Choice Model Vollständig datengesteuerte, aber interpretierbare menschliche Verhaltensmodellierung mit differenzierbarem diskretes Wahlmodell 完全由数据驱动但可解释的人类行为模型与差异分辨选择模型 2412.19403v3

Authors (4): Fumiyasu Makinoshima, Tatsuya Mitomi, Fumiya Makihara, Eigo Segawa

Discrete choice models are essential for modelling various decision-making processes in human behaviour. However, the specification of these models has depended heavily on domain knowledge from experts, and the fully automated but interpretable modelling of complex human behaviours has been a long-standing challenge. In this paper, we introduce the differentiable discrete choice model (Diff-DCM), a fully data-driven method for the interpretable modelling, learning, prediction, and control of complex human behaviours, which is realised by differentiable programming. Solely from input features and choice outcomes without any prior knowledge, Diff-DCM can estimate interpretable closed-form utility functions that reproduce observed behaviours. Comprehensive experiments with both synthetic and real-world data demonstrate that Diff-DCM can be applied to various types of data and requires only a small amount of computational resources for the estimations, which can be completed within tens of seconds on a laptop without any accelerators. In these experiments, we also demonstrate that, using its differentiability, Diff-DCM can provide useful insights into human behaviours, such as an optimal intervention path for effective behavioural changes. This study provides a strong basis for the fully automated and reliable modelling, prediction, and control of human behaviours.

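Discrete choice models are essential for modelling decision-making processes in human behaviour. However, specifying these models has depended heavily on domain knowledge from experts, and fully automated yet interpretable modelling of complex human behaviours has been a long-standing challenge. In this paper, we introduce the differentiable discrete choice model (Diff-DCM), a fully data-driven method for the interpretable modelling, learning, prediction, and control of complex human behaviours, realised through differentiable programming. From input features and choice outcomes alone, without any prior knowledge, Diff-DCM can estimate interpretable closed-form utility functions that reproduce observed behaviours. Comprehensive experiments with synthetic and real-world data show that Diff-DCM can be applied to various types of data and needs only a small amount of computation for estimation, which completes within tens of seconds on a laptop without any accelerators. In these experiments, we also show that Diff-DCM can use its differentiability to provide useful insights into human behaviours, such as an optimal intervention path for effective behavioural change. This study provides a strong basis for the fully automated and reliable modelling, prediction, and control of human behaviours.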
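For readers unfamiliar with discrete choice models, the sketch below shows the textbook multinomial logit form, in which each alternative has a utility and choice probabilities follow a softmax over utilities. This is background only: the abstract does not state which choice model or utility form Diff-DCM uses, and the linear utility, the weights, and the travel-mode example are hypothetical.

```python
import math


def linear_utility(features, weights):
    """Utility of one alternative as a weighted sum of its features."""
    return sum(w * x for w, x in zip(weights, features))


def choice_probabilities(utilities):
    """Multinomial logit: P(i) = exp(U_i) / sum_j exp(U_j).

    The maximum utility is subtracted first for numerical stability;
    this does not change the resulting probabilities.
    """
    peak = max(utilities)
    exps = [math.exp(u - peak) for u in utilities]
    norm = sum(exps)
    return [e / norm for e in exps]


if __name__ == "__main__":
    # Hypothetical travel-mode choice: features are (time in hours, cost).
    weights = [-1.0, -0.5]
    alternatives = {"car": [0.5, 4.0], "bus": [1.0, 1.5], "walk": [2.0, 0.0]}
    utilities = [linear_utility(f, weights) for f in alternatives.values()]
    for name, prob in zip(alternatives, choice_probabilities(utilities)):
        print(name, round(prob, 3))
```

A closed-form utility such as the weighted sum here is what makes a fitted model interpretable: each weight states how much a feature raises or lowers an alternative's appeal.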


Article 32

Title@2025-07-15 (2): Trajectory Imputation in Multi-Agent Sports with Derivative-Accumulating Self-Ensemble

Title: Trajectory Imputation in Multi-Agent Sports with Derivative-Accumulating Self-Ensemble Trajektorien-Imputation im Multi-Agenten-Sport mit demivativ-akkumulierendem Selbst-Ensemble 多机构体育中具有衍生-累积自我集合功能的多机构体育 2408.10878v4

Authors (7): Han-Jun Choi, Hyunsung Kim, Minho Lee, Minchul Jeong, Chang-Jo Kim, Jinsung Yoon, Sang-Ki Ko

Multi-agent trajectory data collected from domains such as team sports often suffer from missing values due to various factors. While many imputation methods have been proposed for spatiotemporal data, they are not well-suited for multi-agent sports scenarios where player movements are highly dynamic and inter-agent interactions continuously evolve. To address these challenges, we propose MIDAS (Multi-agent Imputer with Derivative-Accumulating Self-ensemble), a framework that imputes multi-agent trajectories with high accuracy and physical plausibility. It jointly predicts positions, velocities, and accelerations through a Set Transformer-based neural network and generates alternative estimates by recursively accumulating predicted velocity and acceleration values. These predictions are then combined using a learnable weighted ensemble to produce final imputed trajectories. Experiments on three sports datasets demonstrate that MIDAS significantly outperforms existing baselines in both positional accuracy and physical plausibility. Lastly, we showcase use cases of MIDAS, such as approximating total distance and pass success probability, to highlight its applicability to practical downstream tasks that require complete tracking data.

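Multi-agent trajectory data collected in domains such as team sports often has missing values for a variety of reasons. Many imputation methods have been proposed for spatiotemporal data, but they are not well suited to multi-agent sports scenarios, where player movements are highly dynamic and interactions between agents change continuously. To address these challenges, we propose MIDAS (Multi-agent Imputer with Derivative-Accumulating Self-ensemble), a framework that imputes multi-agent trajectories with high accuracy and physical plausibility. It jointly predicts positions, velocities, and accelerations with a Set Transformer-based neural network and generates alternative estimates by recursively accumulating the predicted velocity and acceleration values. These predictions are then combined with a learnable weighted ensemble to produce the final imputed trajectories. Experiments on three sports datasets show that MIDAS significantly outperforms existing baselines in both positional accuracy and physical plausibility. Finally, we showcase use cases of MIDAS, such as approximating total distance and pass success probability, to highlight its applicability to practical downstream tasks that require complete tracking data.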
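The derivative-accumulation step described above can be illustrated in one dimension. This is a minimal sketch under assumptions the abstract does not confirm: a forward kinematic update from a known starting position, fixed ensemble weights in place of learned ones, and hypothetical function names `accumulate` and `ensemble`.

```python
def accumulate(start, velocities, accelerations, dt):
    """Build a position estimate by accumulating predicted derivatives.

    Uses x[t+1] = x[t] + v[t]*dt + 0.5*a[t]*dt**2, starting from `start`.
    """
    estimate = [start]
    for v, a in zip(velocities, accelerations):
        estimate.append(estimate[-1] + v * dt + 0.5 * a * dt * dt)
    return estimate


def ensemble(estimates, weights):
    """Weighted average of several position estimates, frame by frame."""
    total = sum(weights)
    frames = len(estimates[0])
    return [sum(w * e[t] for w, e in zip(weights, estimates)) / total
            for t in range(frames)]


if __name__ == "__main__":
    # Hypothetical model outputs for a player moving as x(t) = t**2.
    direct = [0.0, 1.2, 3.8, 9.4]            # noisy direct position prediction
    vel = [0.0, 2.0, 4.0]                    # predicted velocities
    acc = [2.0, 2.0, 2.0]                    # predicted accelerations
    accumulated = accumulate(0.0, vel, acc, dt=1.0)
    print(accumulated)
    print(ensemble([direct, accumulated], weights=[1.0, 1.0]))
```

Averaging the direct prediction with the accumulated one pulls the noisy positions toward a physically consistent path, which is the intuition behind combining the estimates.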


Article 33

Title@2025-07-15 (2): A Learning Framework For Cooperative Collision Avoidance of UAV Swarms Leveraging Domain Knowledge

Title: A Learning Framework For Cooperative Collision Avoidance of UAV Swarms Leveraging Domain Knowledge Ein Lernrahmen zur kooperativen Kollision Vermeidung von UAV-Schwärmen Nutzung von Domain-Wissen 合作协作避免无人驾驶航空飞行器冲冲冲器利用域域知识学习框架 2507.10913v1

Authors (3): Shuangyao Huang, Haibo Zhang, Zhiyi Huang

This paper presents a multi-agent reinforcement learning (MARL) framework for cooperative collision avoidance of UAV swarms that leverages a domain knowledge-driven reward. The reward is derived from knowledge in the domain of image processing, approximating contours on a two-dimensional field. By modeling obstacles as maxima on the field, collisions are inherently avoided, as contours never go through peaks or intersect. Additionally, contours are smooth and energy-efficient. Our framework enables training with large swarm sizes, as agent interaction is minimized and the need for the complex credit assignment schemes or observation sharing mechanisms of state-of-the-art MARL approaches is eliminated. Moreover, through intensive training, UAVs obtain the ability to adapt to complex environments where contours may be non-viable or non-existent. Extensive experiments are conducted to evaluate the performance of our framework against state-of-the-art MARL algorithms.

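This paper presents a multi-agent reinforcement learning (MARL) framework for cooperative collision avoidance in UAV swarms that uses a reward driven by domain knowledge. The reward comes from image processing and approximates contours on a two-dimensional field. Because obstacles are modeled as maxima of the field, collisions are avoided by construction: contours never pass through peaks and never intersect. Contours are also smooth and energy-efficient. The framework allows training with large swarm sizes, because agent interaction is minimized and the complex credit assignment schemes or observation sharing mechanisms of state-of-the-art MARL approaches are no longer needed. In addition, through intensive training the UAVs learn to adapt to complex environments in which contours may be non-viable or may not exist. Extensive experiments evaluate the performance of our framework against state-of-the-art MARL algorithms.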


Article 34

Title@2025-07-15 (2): Autonomous Multi-Modal LLM Agents for Treatment Planning in Focused Ultrasound Ablation Surgery

Title: Autonomous Multi-Modal LLM Agents for Treatment Planning in Focused Ultrasound Ablation Surgery Autonome Multi-Modal LLM-Agenten für die Behandlungsplanung in fokussierter Ultraschallablationschirurgie 重点超声速超声振动外科手术治疗规划代理 2505.21418v2

Authors (9): Lina Zhao, Jiaxing Bai, Zihao Bian, Qingyue Chen, Yafang Li, Guangbo Li, Min He, Huaiyuan Yao, Zongjiu Zhang

Focused Ultrasound Ablation Surgery (FUAS) has emerged as a promising non-invasive therapeutic modality, valued for its safety and precision. Nevertheless, its clinical implementation entails intricate tasks such as multimodal image interpretation, personalized dose planning, and real-time intraoperative decision-making, processes that demand intelligent assistance to improve efficiency and reliability. We introduce FUAS-Agents, an autonomous agent system that leverages the multimodal understanding and tool-using capabilities of large language models (LLMs). By integrating patient profiles and MRI data, FUAS-Agents orchestrates a suite of specialized medical AI tools, including segmentation, treatment dose prediction, and clinical guideline retrieval, to generate personalized treatment plans comprising MRI images, dose parameters, and therapeutic strategies. We evaluate the system in a uterine fibroid treatment scenario. Human assessment by four senior FUAS experts indicates that 82.5%, 82.5%, 87.5%, and 97.5% of the generated plans were rated 4 or above (on a 5-point scale) in terms of completeness, accuracy, fluency, and clinical compliance, respectively. These results demonstrate the potential of LLM-driven agents in enhancing decision-making across complex clinical workflows, and exemplify a translational paradigm that combines general-purpose models with specialized expert systems to solve practical challenges in vertical healthcare domains.

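Focused Ultrasound Ablation Surgery (FUAS) has emerged as a promising non-invasive therapeutic modality, valued for its safety and precision. Its clinical implementation, however, involves intricate tasks such as multimodal image interpretation, personalized dose planning, and real-time intraoperative decision-making, all of which call for intelligent assistance to improve efficiency and reliability. We introduce FUAS-Agents, an autonomous agent system that draws on the multimodal understanding and tool-using capabilities of large language models (LLMs). By integrating patient profiles and MRI data, FUAS-Agents orchestrates a suite of specialized medical AI tools, including segmentation, treatment dose prediction, and clinical guideline retrieval, to generate personalized treatment plans comprising MRI images, dose parameters, and therapeutic strategies. We evaluate the system in a uterine fibroid treatment scenario. Assessment by four senior FUAS experts indicates that 82.5%, 82.5%, 87.5%, and 97.5% of the generated plans were rated 4 or above (on a 5-point scale) for completeness, accuracy, fluency, and clinical compliance, respectively. These results show the potential of LLM-driven agents to improve decision-making across complex clinical workflows, and they exemplify a translational paradigm that combines general-purpose models with specialized expert systems to solve practical problems in vertical healthcare domains.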


Article 35

Title@2025-07-14 (1): AI-Powered Math Tutoring: Platform for Personalized and Adaptive Education

Title: AI-Powered Math Tutoring: Platform for Personalized and Adaptive Education KI-Powered Math Tutoring: Plattform für Personalisierte und Adaptive Bildung AI 授权数学教学:个性化和适应教育平台 2507.12484v1

Authors (2): Jarosław A. Chudziak, Adam Kostka

The growing ubiquity of artificial intelligence (AI), in particular large language models (LLMs), has profoundly altered the way in which learners gain knowledge and interact with learning material, with many claiming that AI positively influences their learning achievements. Despite this advancement, current AI tutoring systems face limitations associated with their reactive nature, often providing direct answers without encouraging deep reflection or incorporating structured pedagogical tools and strategies. This limitation is most apparent in the field of mathematics, in which AI tutoring systems remain underdeveloped. This research addresses the question: How can AI tutoring systems move beyond providing reactive assistance to enable structured, individualized, and tool-assisted learning experiences? We introduce a novel multi-agent AI tutoring platform that combines adaptive and personalized feedback, structured course generation, and textbook knowledge retrieval to enable modular, tool-assisted learning processes. This system allows students to learn new topics while identifying and targeting their weaknesses, revise for exams effectively, and practice on an unlimited number of personalized exercises. This article contributes to the field of artificial intelligence in education by introducing a novel platform that brings together pedagogical agents and AI-driven components, augmenting the field with modular and effective systems for teaching mathematics.

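The growing ubiquity of artificial intelligence (AI), and of large language models (LLMs) in particular, has profoundly changed how learners gain knowledge and interact with learning material, and many claim that AI has a positive influence on their learning achievements. Despite this progress, current AI tutoring systems are limited by their reactive nature: they often give direct answers without encouraging deep reflection or incorporating structured pedagogical tools and strategies. The limitation is most apparent in mathematics, where AI tutoring systems remain underdeveloped. This research asks how AI tutoring systems can move beyond reactive assistance to enable structured, individualized, and tool-assisted learning experiences. We introduce a novel multi-agent AI tutoring platform that combines adaptive and personalized feedback, structured course generation, and textbook knowledge retrieval to enable modular, tool-assisted learning processes. The system lets students learn new topics while identifying and targeting their weaknesses, revise effectively for exams, and practice on an unlimited number of personalized exercises. This article contributes to the field of artificial intelligence in education by introducing a novel platform that brings together pedagogical agents and AI-driven components, adding modular and effective systems for teaching mathematics to the field.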


Article 36

Title@2025-07-14 (1): DroidSpeak: KV Cache Sharing for Cross-LLM Communication and Multi-LLM Serving

Title: DroidSpeak: KV Cache Sharing for Cross-LLM Communication and Multi-LLM Serving DroidSpeak: KV Cache Sharing für Cross-LLM Kommunikation und Multi-LLM Serving DroidSpeak: KV 共享缓存, 用于跨 LLM 通信和多 LLM 服务 2411.02820v4

Authors (12): Yuhan Liu, Yuyang Huang, Jiayi Yao, Shaoting Feng, Zhuohan Gu, Kuntai Du, Hanchen Li, Yihua Cheng, Junchen Jiang, Shan Lu, Madan Musuvathi, Esha Choukse

Compound AI systems, such as agentic systems, are an emerging trend in large-scale enterprise settings, with multiple LLMs specialized for different users, tasks, and/or roles working together. In these scenarios, different models often process inputs that share the same context prefix. Although much work was done in the past to enable the reuse of prefix KV caches across inputs for a single model, how to enable one model to reuse the prefix KV caches of a different model remains an open question. We introduce DroidSpeak, the first distributed LLM inference system that enables KV cache reuse across distributed nodes running inference of different LLMs, so long as the LLMs have the same architecture. We present the first study that aims at understanding the impact of sharing KV caches across different LLMs, and if/when such sharing affects quality. Inspired by the findings, we present DroidSpeak, which selectively recomputes a few layers of the KV cache produced by another LLM and reuses the remaining layers, with negligible quality loss. Moreover, carefully pipelining the layer-wise re-computation and the loading of reused KV cache further improves the inference performance. Experiments on diverse datasets and model pairs demonstrate that DroidSpeak achieves up to 4x throughput improvement and about 3.1x faster prefill (time to first token), with negligible loss of quality in F1 scores, Rouge-L or code similarity score, compared to the baseline which does not allow any sharing across models.

复合人工智能系统(如智能体系统)是大型企业环境中的新兴趋势,其中多个针对不同用户、任务和/或角色专门化的LLM协同工作。在这些场景中,不同模型处理的输入往往共享相同的上下文前缀。尽管过去已有大量工作使单一模型能够跨输入复用前缀KV缓存,但如何让一个模型复用另一个模型的前缀KV缓存仍是一个未决问题。我们提出DroidSpeak,这是首个分布式LLM推理系统,只要各LLM具有相同的架构,就能在运行不同LLM推理的分布式节点之间复用KV缓存。我们首次研究了在不同LLM之间共享KV缓存的影响,以及这种共享是否及何时影响质量。受研究结果启发,DroidSpeak有选择地重新计算由另一个LLM生成的KV缓存中的少数几层,并复用其余各层,质量损失可以忽略不计。此外,对逐层重新计算与被复用KV缓存的加载进行精心的流水线处理,可进一步提升推理性能。在多种数据集和模型对上的实验表明,与不允许任何跨模型共享的基线相比,DroidSpeak的吞吐量最高提升4倍,预填充(首个token的生成时间)加快约3.1倍,而F1分数、Rouge-L或代码相似度分数的质量损失可以忽略不计。
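The selective reuse described in the DroidSpeak abstract can be illustrated with a minimal sketch. Everything here is a stand-in: `build_kv_cache`, `recompute_fn`, and the list-of-floats "layers" are hypothetical names and types, and the abstract does not say how the layers to recompute are chosen, so the set is simply passed in.

```python
from typing import Callable, List, Sequence, Set

# Hypothetical stand-in: one "layer" of KV cache is just a list of floats here.
LayerKV = List[float]


def build_kv_cache(
    donor_kv: Sequence[LayerKV],
    recompute_layers: Set[int],
    recompute_fn: Callable[[int], LayerKV],
) -> List[LayerKV]:
    """Reuse a donor model's KV cache except for the layers marked for recomputation."""
    merged: List[LayerKV] = []
    for layer_idx, donor_layer in enumerate(donor_kv):
        if layer_idx in recompute_layers:
            # Recompute this layer with the receiving model (assumed callback).
            merged.append(recompute_fn(layer_idx))
        else:
            # Reuse the donor's entry unchanged (copied to avoid aliasing).
            merged.append(list(donor_layer))
    return merged


# Toy usage: 4 layers, recompute layers 0 and 1 only.
donor = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
merged = build_kv_cache(donor, {0, 1}, lambda i: [float(i), float(i)])
```

In a real system the recomputation and the loading of reused layers would be pipelined, as the abstract notes; the sketch only shows the per-layer reuse-or-recompute decision.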


Article 37

Title@2025-07-14 (1): DeepResearch$^{\text{Eco}}$: A Recursive Agentic Workflow for Complex Scientific Question Answering in Ecology

Title: DeepResearch$^{\text{Eco}}$: A Recursive Agentic Workflow for Complex Scientific Question Answering in Ecology DeepResearch$^{\text{Eco}}$: Ein rekursiver Agentischer Workflow für komplexe wissenschaftliche Fragen in der Ökologie DeepResearch$^{\text{Eco}}$:面向生态学复杂科学问答的递归智能体工作流 2507.10522v1

Authors (3): Jennifer D’Souza, Endres Keno Sander, Andrei Aioanei

We introduce DeepResearch$^{\text{Eco}}$, a novel agentic LLM-based system for automated scientific synthesis that supports recursive, depth- and breadth-controlled exploration of original research questions – enhancing search diversity and nuance in the retrieval of relevant scientific literature. Unlike conventional retrieval-augmented generation pipelines, DeepResearch enables user-controllable synthesis with transparent reasoning and parameter-driven configurability, facilitating high-throughput integration of domain-specific evidence while maintaining analytical rigor. Applied to 49 ecological research questions, DeepResearch achieves up to a 21-fold increase in source integration and a 14.9-fold rise in sources integrated per 1,000 words. High-parameter settings yield expert-level analytical depth and contextual diversity. Source code available at: https://github.com/sciknoworg/deep-research.

我们提出DeepResearch$^{\text{Eco}}$,这是一个基于LLM的新型智能体系统,用于自动化科学综述,支持对原创研究问题进行递归的、深度和广度可控的探索,从而提升相关科学文献检索中的搜索多样性和细致程度。与传统的检索增强生成流水线不同,DeepResearch支持用户可控的综合,具有透明的推理和参数驱动的可配置性,在保持分析严谨性的同时,促进对特定领域证据的高通量整合。应用于49个生态学研究问题时,DeepResearch的来源整合量最多增加21倍,每1000词整合的来源数增加14.9倍。高参数设置可产生专家级的分析深度和上下文多样性。源代码见:https://github.com/sciknoworg/deep-research。
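The "recursive, depth- and breadth-controlled exploration" in the DeepResearch abstract can be pictured with a small sketch. The function names `explore` and `toy_expand` are hypothetical, and `toy_expand` merely stands in for whatever LLM call proposes follow-up queries; this is not the project's actual code.

```python
from typing import Callable, List


def explore(
    question: str,
    depth: int,
    breadth: int,
    expand: Callable[[str, int], List[str]],
) -> List[str]:
    """Collect the question and all follow-up questions down to `depth` levels."""
    visited = [question]
    if depth == 0:
        return visited
    # Keep at most `breadth` follow-ups per question, then recurse one level deeper.
    for follow_up in expand(question, breadth)[:breadth]:
        visited.extend(explore(follow_up, depth - 1, breadth, expand))
    return visited


def toy_expand(question: str, breadth: int) -> List[str]:
    # Placeholder for an LLM call that proposes follow-up queries (assumption).
    return [f"{question}/{i}" for i in range(breadth)]


questions = explore("q", depth=2, breadth=2, expand=toy_expand)
```

With breadth `b` and depth `d`, the number of questions visited grows as `1 + b + b^2 + ... + b^d`, which is why the two parameters trade coverage against cost.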


Article 38

Title@2025-07-14 (1): Toolsuite for Implementing Multiagent Systems Based on Communication Protocols

Title: Toolsuite for Implementing Multiagent Systems Based on Communication Protocols Toolsuite zur Implementierung von Multiagentensystemen auf Basis von Kommunikationsprotokollen 基于通信议定书的用于实施多剂系统的工具 2507.10324v1

Authors (3): Amit K. Chopra, Samuel H. Christie V, Munindar P. Singh

Interaction-Oriented Programming (IOP) is an approach to building a multiagent system by modeling the interactions between its roles via a flexible interaction protocol and implementing agents to realize the interactions of the roles they play in the protocol. In recent years, we have developed an extensive suite of software that enables multiagent system developers to apply IOP. These include tools for efficiently verifying protocols for properties such as liveness and safety and middleware that simplifies the implementation of agents. This paper presents some of that software suite.

面向交互的编程(IOP)是一种构建多智能体系统的方法:通过灵活的交互协议对各角色之间的交互进行建模,并实现智能体,以落实其在协议中所扮演角色的交互。近年来,我们开发了一套内容广泛的软件,使多智能体系统开发者能够应用IOP,其中包括用于高效验证协议的活性和安全性等属性的工具,以及简化智能体实现的中间件。本文介绍了该软件套件的一部分。


Article 39

Title@2025-07-14 (1): Prompt Informed Reinforcement Learning for Visual Coverage Path Planning

Title: Prompt Informed Reinforcement Learning for Visual Coverage Path Planning Prompt Informierte Verstärkung Lernen für die visuelle Abdeckung Pfadplanung 视力覆盖规划快速信息强化学习 2507.10284v1

Authors (1): Venkat Margapuri

Visual coverage path planning with unmanned aerial vehicles (UAVs) requires agents to strategically coordinate UAV motion and camera control to maximize coverage, minimize redundancy, and maintain battery efficiency. Traditional reinforcement learning (RL) methods rely on environment-specific reward formulations that lack semantic adaptability. This study proposes Prompt-Informed Reinforcement Learning (PIRL), a novel approach that integrates the zero-shot reasoning ability and in-context learning capability of large language models with curiosity-driven RL. PIRL leverages semantic feedback from an LLM, GPT-3.5, to dynamically shape the reward function of the Proximal Policy Optimization (PPO) RL policy guiding the agent in position and camera adjustments for optimal visual coverage. The PIRL agent is trained using OpenAI Gym and evaluated in various environments. Furthermore, the sim-to-real-like ability and zero-shot generalization of the agent are tested by operating the agent in the Webots simulator, which introduces realistic physical dynamics. Results show that PIRL outperforms multiple learning-based baselines such as PPO with static rewards, PPO with exploratory weight initialization, imitation learning, and an LLM-only controller. Across different environments, PIRL outperforms the best-performing baseline by achieving up to 14% higher visual coverage in OpenAI Gym and 27% higher in Webots, up to 25% higher battery efficiency, and up to 18% lower redundancy, depending on the environment. The results highlight the effectiveness of LLM-guided reward shaping in complex spatial exploration tasks and suggest a promising direction for integrating natural language priors into RL for robotics.

使用无人驾驶飞行器(UAV)进行视觉覆盖路径规划,要求智能体从战略上协调UAV运动和相机控制,以最大化覆盖范围、最小化冗余并保持电池效率。传统的强化学习(RL)方法依赖于缺乏语义适应性的、针对特定环境的奖励设计。本研究提出提示引导的强化学习(PIRL),这是一种新方法,将大型语言模型的零样本推理能力和上下文学习能力与好奇心驱动的RL相结合。PIRL利用LLM(GPT-3.5)的语义反馈,动态塑造近端策略优化(PPO)RL策略的奖励函数,引导智能体调整位置和相机以获得最佳视觉覆盖。PIRL智能体使用OpenAI Gym进行训练,并在多种环境中进行评估。此外,通过在引入真实物理动力学的Webots模拟器中运行智能体,测试了其类似仿真到现实(sim-to-real)的能力和零样本泛化能力。结果表明,PIRL优于多个基于学习的基线,例如采用静态奖励的PPO、采用探索性权重初始化的PPO、模仿学习以及仅使用LLM的控制器。在不同环境中,PIRL均优于表现最佳的基线:视具体环境而定,在OpenAI Gym中视觉覆盖率最多提高14%,在Webots中最多提高27%,电池效率最多提高25%,冗余最多降低18%。结果凸显了LLM引导的奖励塑造在复杂空间探索任务中的有效性,并为将自然语言先验融入机器人RL指明了一个有前景的方向。


Article 40

Title@2025-07-14 (1): ToMacVF : Temporal Macro-action Value Factorization for Asynchronous Multi-Agent Reinforcement Learning

Title: ToMacVF : Temporal Macro-action Value Factorization for Asynchronous Multi-Agent Reinforcement Learning ToMacVF : Zeitliche Makro-Wirkungs-Wertfaktorisierung für asynchrones Mehr-Agenten-Verstärkungs-Lernen ToMacVF:面向异步多智能体强化学习的时间宏动作价值分解 2507.10251v1

Authors (2): Wenjing Zhang, Wei Zhang

Existing asynchronous MARL methods based on MacDec-POMDP typically construct training trajectory buffers by simply sampling limited and biased data at the endpoints of macro-actions, and directly apply conventional MARL methods on the buffers. As a result, these methods lead to an incomplete and inaccurate representation of the macro-action execution process, along with unsuitable credit assignments. To solve these problems, the Temporal Macro-action Value Factorization (ToMacVF) is proposed to achieve fine-grained temporal credit assignment for macro-action contributions. A centralized training buffer, called Macro-action Segmented Joint Experience Replay Trajectory (Mac-SJERT), is designed to work with ToMacVF to collect accurate and complete macro-action execution information, supporting a more comprehensive and precise representation of the macro-action process. To ensure principled and fine-grained asynchronous value factorization, the consistency requirement between joint and individual macro-action selection, called Temporal Macro-action based IGM (To-Mac-IGM), is formalized and shown to generalize the synchronous case. Based on To-Mac-IGM, a modularized ToMacVF architecture, which satisfies the CTDE principle, is designed to conveniently integrate previous value factorization methods. Next, the ToMacVF algorithm is devised as an implementation of the ToMacVF architecture. Experimental results demonstrate that, compared to asynchronous baselines, our ToMacVF algorithm not only achieves optimal performance but also exhibits strong adaptability and robustness across various asynchronous multi-agent experimental scenarios.

现有基于MacDec-POMDP的异步MARL方法通常仅在宏动作的端点采样有限且有偏的数据来构建训练轨迹缓冲区,并直接在这些缓冲区上应用常规MARL方法。因此,这些方法对宏动作执行过程的表示不完整、不准确,信用分配也不恰当。为解决这些问题,本文提出时间宏动作价值分解(ToMacVF),为宏动作的贡献实现细粒度的时间信用分配。我们设计了一种集中式训练缓冲区,称为宏动作分段联合经验回放轨迹(Mac-SJERT),与ToMacVF配合使用,以收集准确而完整的宏动作执行信息,支持对宏动作过程更全面、更精确的表示。为确保异步价值分解有原则且细粒度,我们形式化了联合宏动作选择与个体宏动作选择之间的一致性要求,称为基于时间宏动作的IGM(To-Mac-IGM),并证明它推广了同步情形。基于To-Mac-IGM,我们设计了满足CTDE原则的模块化ToMacVF架构,可方便地集成以往的价值分解方法。随后,我们给出ToMacVF算法,作为ToMacVF架构的一种实现。实验结果表明,与异步基线相比,ToMacVF算法不仅取得了最优性能,还在多种异步多智能体实验场景中表现出很强的适应性和鲁棒性。


Article 41

Title@2025-07-14 (1): Multi-Robot Cooperative Herding through Backstepping Control Barrier Functions

Title: Multi-Robot Cooperative Herding through Backstepping Control Barrier Functions Multi-Roboter-Kooperative Herdung durch rückschrittliche Steuerungsbarrieren-Funktionen 多机器人合作通过后步控制障碍功能 2507.10249v1

Authors (5): Kang Li, Ming Li, Wenkang Ji, Zhiyong Sun, Shiyu Zhao

We propose a novel cooperative herding strategy through backstepping control barrier functions (CBFs), which coordinates multiple herders to herd a group of evaders safely towards a designated goal region. For the herding system with heterogeneous groups involving herders and evaders, the behavior of the evaders can only be influenced indirectly by the herders’ motion, especially when the evaders follow an inverse dynamics model and respond solely to repulsive interactions from the herders. This indirect interaction mechanism inherently renders the overall system underactuated. To address this issue, we first construct separate CBFs for the dual objectives of goal reaching and collision avoidance, which ensure both herding completion and safety guarantees. Then, we reformulate the underactuated herding dynamics into a control-affine structure and employ a backstepping approach to recursively design control inputs for the hierarchical barrier functions, avoiding taking derivatives of the higher-order system. Finally, we present a cooperative herding strategy based on backstepping CBFs that allow herders to safely herd multiple evaders into the goal region. In addition, centralized and decentralized implementations of the proposed algorithm are developed, further enhancing its flexibility and applicability. Extensive simulations and real-world experiments validate the effectiveness and safety of the proposed strategy in multi-robot herding.

我们提出一种基于反步控制障碍函数(CBF)的新型协同驱赶策略,协调多个驱赶者将一群逃避者安全地驱赶到指定的目标区域。在包含驱赶者和逃避者的异构群体驱赶系统中,逃避者的行为只能通过驱赶者的运动间接影响,尤其是当逃避者遵循逆动力学模型且仅对来自驱赶者的排斥作用作出响应时。这种间接作用机制使整个系统本质上是欠驱动的。为解决这一问题,我们首先针对到达目标和避免碰撞这两个目标分别构造CBF,以同时保证驱赶任务的完成和安全性。然后,我们将欠驱动的驱赶动力学重新表述为控制仿射结构,并采用反步方法为分层障碍函数递归地设计控制输入,避免对高阶系统求导。最后,我们提出基于反步CBF的协同驱赶策略,使驱赶者能够将多个逃避者安全地驱赶到目标区域。此外,我们还开发了所提算法的集中式和分布式实现,进一步增强了其灵活性和适用性。大量仿真和真实实验验证了所提策略在多机器人驱赶中的有效性和安全性。
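For readers unfamiliar with control barrier functions, the standard CBF condition that this abstract builds on can be written as below. This is the textbook definition for a control-affine system, not the paper's backstepping construction; the symbols $f$, $g$, $h$, and $\alpha$ are generic notation assumed here, with $\alpha$ an extended class-$\mathcal{K}$ function and $L_f h$, $L_g h$ the Lie derivatives of $h$.

```latex
\dot{x} = f(x) + g(x)\,u, \qquad \mathcal{C} = \{\, x : h(x) \ge 0 \,\}

\sup_{u} \big[\, L_f h(x) + L_g h(x)\,u \,\big] \;\ge\; -\alpha\big(h(x)\big) \quad \text{for all } x \in \mathcal{C}
```

Any control input satisfying the inequality keeps the state inside the safe set $\mathcal{C}$. When $L_g h(x) = 0$, the input does not appear in the condition, which is the kind of difficulty an underactuated system such as herding raises.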


Article 42

Title@2025-07-14 (1): Adaptability in Multi-Agent Reinforcement Learning: A Framework and Unified Review

Title: Adaptability in Multi-Agent Reinforcement Learning: A Framework and Unified Review Anpassungsfähigkeit im Mehr-Agenten-Verstärkungs-Lernen: Ein Rahmen und eine einheitliche Überprüfung 多机构加强学习中的适应性:框架和统一审查 2507.10142v1

Authors (6): Siyi Hu, Mohamad A Hady, Jianglin Qiao, Jimmy Cao, Mahardhika Pratama, Ryszard Kowalczyk

Multi-Agent Reinforcement Learning (MARL) has shown clear effectiveness in coordinating multiple agents across simulated benchmarks and constrained scenarios. However, its deployment in real-world multi-agent systems (MAS) remains limited, primarily due to the complex and dynamic nature of such environments. These challenges arise from multiple interacting sources of variability, including fluctuating agent populations, evolving task goals, and inconsistent execution conditions. Together, these factors demand that MARL algorithms remain effective under continuously changing system configurations and operational demands. To better capture and assess this capacity for adjustment, we introduce the concept of adaptability as a unified and practically grounded lens through which to evaluate the reliability of MARL algorithms under shifting conditions, broadly referring to any changes in the environment dynamics that may occur during learning or execution. Centred on the notion of adaptability, we propose a structured framework comprising three key dimensions: learning adaptability, policy adaptability, and scenario-driven adaptability. By adopting this adaptability perspective, we aim to support more principled assessments of MARL performance beyond narrowly defined benchmarks. Ultimately, this survey contributes to the development of algorithms that are better suited for deployment in dynamic, real-world multi-agent systems.

多智能体强化学习(MARL)在模拟基准和受限场景中协调多个智能体方面已显示出明显的有效性。然而,它在现实世界多智能体系统(MAS)中的部署仍然有限,这主要是由于此类环境的复杂性和动态性。这些挑战源于多种相互作用的变化来源,包括智能体数量的波动、任务目标的演变以及执行条件的不一致。这些因素共同要求MARL算法在不断变化的系统配置和运行需求下保持有效。为了更好地刻画和评估这种调整能力,我们引入适应性(adaptability)这一概念,作为一个统一且立足实践的视角,用以评估MARL算法在变化条件下的可靠性;变化条件泛指学习或执行期间环境动态可能发生的任何变化。围绕适应性这一概念,我们提出一个包含三个关键维度的结构化框架:学习适应性、策略适应性和场景驱动的适应性。通过采用这一适应性视角,我们旨在支持超越狭义基准的、更有原则的MARL性能评估。最终,本综述有助于开发更适合部署在动态的现实世界多智能体系统中的算法。


Article 43

Title@2025-07-14 (1): Collaboration Promotes Group Resilience in Multi-Agent RL

Title: Collaboration Promotes Group Resilience in Multi-Agent RL Zusammenarbeit fördert Gruppenresistenz in Multi-Agent RL 协作促进多机构RL中的团体复原力 2111.06614v3

Authors (6): Ilai Shraga, Guy Azran, Matthias Gerstgrasser, Ofir Abu, Jeffrey S. Rosenschein, Sarah Keren

To effectively operate in various dynamic scenarios, RL agents must be resilient to unexpected changes in their environment. Previous work on this form of resilience has focused on single-agent settings. In this work, we introduce and formalize a multi-agent variant of resilience, which we term group resilience. We further hypothesize that collaboration with other agents is key to achieving group resilience; collaborating agents adapt better to environmental perturbations in multi-agent reinforcement learning (MARL) settings. We test our hypothesis empirically by evaluating different collaboration protocols and examining their effect on group resilience. Our experiments show that all the examined collaborative approaches achieve higher group resilience than their non-collaborative counterparts.

为了在各种动态情景下有效运行,RL代理机构必须具有抵御环境意外变化的复原力。以前关于这种形式的复原力的工作侧重于单一剂环境。在这项工作中,我们引入并正式确定一个多剂的复原力变体,我们称之为集体复原力。我们进一步假设与其他代理机构的合作是实现群体复原力的关键;合作代理机构在多剂强化学习环境中更好地适应环境扰动。我们通过评估不同的协作协议并审查其对集体复原力的影响,对我们的假设进行实证试验。我们的实验表明,所审查的所有合作方法都比非协作性对应方具有更高的集体复原力。


Article 44

Title@2025-07-14 (1): CoDe: A Cooperative and Decentralized Collision Avoidance Algorithm for Small-Scale UAV Swarms Considering Energy Efficiency

Title: CoDe: A Cooperative and Decentralized Collision Avoidance Algorithm for Small-Scale UAV Swarms Considering Energy Efficiency CoDe: Ein kooperativer und dezentralisierter Kollisionsvermeidungsalgorithmus für kleine UAV-Schwärme unter Berücksichtigung der Energieeffizienz CoDe:考虑能源效率的小规模无人机群协同去中心化避碰算法 2204.08594v2

Authors (3): Shuangyao Huang, Haibo Zhang, Zhiyi Huang

This paper introduces a cooperative and decentralized collision avoidance algorithm (CoDe) for small-scale UAV swarms consisting of up to three UAVs. CoDe improves energy efficiency of UAVs by achieving effective cooperation among UAVs. Moreover, CoDe is specifically tailored for UAV’s operations by addressing the challenges faced by existing schemes, such as ineffectiveness in selecting actions from continuous action spaces and high computational complexity. CoDe is based on Multi-Agent Reinforcement Learning (MARL), and finds cooperative policies by incorporating a novel credit assignment scheme. The novel credit assignment scheme estimates the contribution of an individual by subtracting a baseline from the joint action value for the swarm. The credit assignment scheme in CoDe outperforms other benchmarks as the baseline takes into account not only the importance of a UAV’s action but also the interrelation between UAVs. Furthermore, extensive experiments are conducted against existing MARL-based and conventional heuristic-based algorithms to demonstrate the advantages of the proposed algorithm.

本文提出一种面向由最多三架无人机组成的小规模无人机群的协同去中心化避碰算法(CoDe)。CoDe通过实现无人机之间的有效协作来提高无人机的能源效率。此外,CoDe专为无人机的运行而设计,解决了现有方案面临的挑战,例如难以从连续动作空间中有效选择动作,以及计算复杂度高。CoDe基于多智能体强化学习(MARL),并通过引入一种新的信用分配方案来寻找协作策略。该方案通过从机群的联合动作价值中减去一个基线来估计个体的贡献。CoDe中的信用分配方案优于其他基准,因为其基线不仅考虑了单架无人机动作的重要性,还考虑了无人机之间的相互关系。此外,我们针对现有基于MARL的算法和传统启发式算法进行了大量实验,以证明所提算法的优势。
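The idea of "subtracting a baseline from the joint action value" can be shown on a toy problem. The abstract does not define CoDe's baseline, so the sketch below swaps in a simple counterfactual baseline (the average joint value when only one agent varies its action, in the style of counterfactual credit assignment). It also uses a small discrete action set purely for illustration, whereas CoDe targets continuous actions. All names are hypothetical.

```python
from typing import Callable, Sequence, Tuple

JointAction = Tuple[int, ...]


def individual_credit(
    q_joint: Callable[[JointAction], float],
    joint_action: JointAction,
    agent: int,
    candidate_actions: Sequence[int],
) -> float:
    """Joint action value minus a counterfactual baseline for one agent."""
    # Assumed baseline: average joint value when only `agent` varies its action.
    alternatives = [
        q_joint(joint_action[:agent] + (a,) + joint_action[agent + 1:])
        for a in candidate_actions
    ]
    baseline = sum(alternatives) / len(alternatives)
    return q_joint(joint_action) - baseline


def toy_q(actions: JointAction) -> float:
    # Toy joint value: the swarm scores 1.0 when the two UAVs pick different actions.
    return 1.0 if actions[0] != actions[1] else 0.0


credit = individual_credit(toy_q, (0, 1), agent=0, candidate_actions=[0, 1])
```

A positive credit means the agent's chosen action did better than its average alternative given what the other agents did; a negative credit means it did worse.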


Article 45

Title@2025-07-14 (1): Improving monotonic optimization in heterogeneous multi-agent reinforcement learning with optimal marginal deterministic policy gradient

Title: Improving monotonic optimization in heterogeneous multi-agent reinforcement learning with optimal marginal deterministic policy gradient Verbesserung der monotonen Optimierung im heterogenen Multi-Agenten-Verstärkungslernen mit optimalem marginalen deterministischen politischen Gradienten 以最优化的边际确定性政策梯度,改进多元多剂强化学习中的单体优化 2507.09989v1

Authors (4): Xiaoyang Yu, Youfang Lin, Shuo Wang, Sheng Han

In heterogeneous multi-agent reinforcement learning (MARL), achieving monotonic improvement plays a pivotal role in enhancing performance. The HAPPO algorithm proposes a feasible solution by introducing a sequential update scheme, which requires independent learning with No Parameter-sharing (NoPS). However, heterogeneous MARL generally requires Partial Parameter-sharing (ParPS) based on agent grouping to achieve high cooperative performance. Our experiments prove that directly combining ParPS with the sequential update scheme leads to the policy updating baseline drift problem, thereby failing to achieve improvement. To solve the conflict between monotonic improvement and ParPS, we propose the Optimal Marginal Deterministic Policy Gradient (OMDPG) algorithm. First, we replace the sequentially computed $Q_{\psi}^s(s,a_{1:i})$ with the Optimal Marginal Q (OMQ) function $\phi_{\psi}^*(s,a_{1:i})$ derived from Q-functions. This maintains MAAD’s monotonic improvement while eliminating the conflict through optimal joint action sequences instead of sequential policy ratio calculations. Second, we introduce the Generalized Q Critic (GQC) as the critic function, employing pessimistic uncertainty-constrained loss to optimize different Q-value estimations. This provides the required Q-values for OMQ computation and stable baselines for actor updates. Finally, we implement a Centralized Critic Grouped Actor (CCGA) architecture that simultaneously achieves ParPS in local policy networks and accurate global Q-function computation. Experimental results in SMAC and MAMuJoCo environments demonstrate that OMDPG outperforms various state-of-the-art MARL baselines.

在异构多智能体强化学习(MARL)中,实现单调改进对提升性能至关重要。HAPPO算法通过引入顺序更新方案提出了一种可行的解决方案,该方案要求在无参数共享(NoPS)下独立学习。然而,异构MARL通常需要基于智能体分组的部分参数共享(ParPS)才能实现较高的协作性能。我们的实验表明,将ParPS与顺序更新方案直接结合会导致策略更新基线漂移问题,从而无法实现改进。为解决单调改进与ParPS之间的冲突,我们提出最优边际确定性策略梯度(OMDPG)算法。首先,我们用由Q函数导出的最优边际Q(OMQ)函数 $\phi_{\psi}^*(s,a_{1:i})$ 替换顺序计算的 $Q_{\psi}^s(s,a_{1:i})$。这样既保持了MAAD的单调改进,又以最优联合动作序列取代顺序的策略比率计算,从而消除了冲突。其次,我们引入广义Q评论家(GQC)作为评论家函数,采用悲观的不确定性约束损失来优化不同的Q值估计,从而为OMQ计算提供所需的Q值,并为行动者更新提供稳定的基线。最后,我们实现了集中式评论家分组行动者(CCGA)架构,同时实现局部策略网络中的ParPS和准确的全局Q函数计算。在SMAC和MAMuJoCo环境中的实验结果表明,OMDPG优于多种最先进的MARL基线。


Article 46

Title@2025-07-14 (1): AnalogTester: A Large Language Model-Based Framework for Automatic Testbench Generation in Analog Circuit Design

Title: AnalogTester: A Large Language Model-Based Framework for Automatic Testbench Generation in Analog Circuit Design AnalogTester: Ein großsprachiges modellbasiertes Framework für die automatische Testbench-Generierung im Analog Circuit Design 模拟试验者:在模拟电路设计中自动产生自动试验箱的大型语言示范框架 2507.09965v1

Authors (8): Weiyu Chen, Chengjie Liu, Wenhao Huang, Jinyang Lyu, Mingqian Yang, Yuan Du, Li Du, Jun Yang

Recent advancements have demonstrated the significant potential of large language models (LLMs) in analog circuit design. Nevertheless, testbench construction for analog circuits remains manual, creating a critical bottleneck in achieving fully automated design processes. Particularly when replicating circuit designs from academic papers, manual Testbench construction demands time-intensive implementation and frequent adjustments, which fails to address the dynamic diversity and flexibility requirements for automation. AnalogTester tackles automated analog design challenges through an LLM-powered pipeline: a) domain-knowledge integration, b) paper information extraction, c) simulation scheme synthesis, and d) testbench code generation with Tsinghua Electronic Design (TED). AnalogTester has demonstrated automated Testbench generation capabilities for three fundamental analog circuit types: operational amplifiers (op-amps), bandgap references (BGRs), and low-dropout regulators (LDOs), while maintaining a scalable framework for adaptation to broader circuit topologies. Furthermore, AnalogTester can generate circuit knowledge data and TED code corpus, establishing fundamental training datasets for LLM specialization in analog circuit design automation.

最近的进展表明,大型语言模型(LLMs)在模拟电路设计方面具有巨大的潜力,然而,模拟电路的测试模型建设仍然是手工的,为实现完全自动化的设计过程创造了一个关键的瓶颈。特别是在复制学术论文的电路设计时,人工Testbench建筑要求时间密集的实施和频繁的调整,这未能解决动态多样性和自动化的灵活性要求。模拟实验员通过LLM驱动管道解决自动模拟设计挑战:(a) 域知识整合,(b) 纸张信息提取,(c) 模拟计划合成,以及(d) 与Tsinghua电子设计(TED)的测试代码生成。AnalogTester展示了三种基本模拟电路类型(操作放大器(Op-amps)、带宽参考(BGR)和低滴调调调调调调调调调调调调(LDOs)的自动测试生成能力,同时保持一个可扩展的框架,以适应更广泛的电路图学。此外,AnalogTester能够生成电路知识数据和TED代码,为模拟电路设计自动化的LM专业建立基本培训数据集。


Article 47

Title@2025-07-14 (1): Large Population Models

Title: Large Population Models Große Bevölkerungsmodelle 大型人口模式 2507.09901v1

Authors (1): Ayush Chopra

Many of society’s most pressing challenges, from pandemic response to supply chain disruptions to climate adaptation, emerge from the collective behavior of millions of autonomous agents making decisions over time. Large Population Models (LPMs) offer an approach to understand these complex systems by simulating entire populations with realistic behaviors and interactions at unprecedented scale. LPMs extend traditional modeling approaches through three key innovations: computational methods that efficiently simulate millions of agents simultaneously, mathematical frameworks that learn from diverse real-world data streams, and privacy-preserving communication protocols that bridge virtual and physical environments. This allows researchers to observe how agent behavior aggregates into system-level outcomes and test interventions before real-world implementation. While current AI advances primarily focus on creating “digital humans” with sophisticated individual capabilities, LPMs develop “digital societies” where the richness of interactions reveals emergent phenomena. By bridging individual agent behavior and population-scale dynamics, LPMs offer a complementary path in AI research illuminating collective intelligence and providing testing grounds for policies and social innovations before real-world deployment. We discuss the technical foundations and some open problems here. LPMs are implemented by the AgentTorch framework (github.com/AgentTorch/AgentTorch).

社会最紧迫的许多挑战,从大流行病应对、供应链中断到气候适应,都源于数百万自主智能体随时间作出决策的集体行为。大型人口模型(LPM)提供了一种理解这些复杂系统的方法,即以前所未有的规模模拟具有真实行为和互动的整个人口。LPM通过三项关键创新扩展了传统建模方法:可同时高效模拟数百万智能体的计算方法、从多样的现实世界数据流中学习的数学框架,以及连接虚拟环境与物理环境的隐私保护通信协议。这使研究人员能够观察智能体行为如何汇聚为系统层面的结果,并在现实世界实施之前测试干预措施。当前的AI进展主要侧重于创造具有复杂个体能力的“数字人”,而LPM则构建“数字社会”,其中丰富的互动揭示出涌现现象。通过连接个体智能体行为与人口规模的动态,LPM为AI研究提供了一条互补路径,既阐明集体智能,也为政策和社会创新在现实部署之前提供试验场。我们在此讨论其技术基础和一些开放问题。LPM由AgentTorch框架实现(github.com/AgentTorch/AgentTorch)。


Article 48

Title@2025-07-14 (1): Multi-residual Mixture of Experts Learning for Cooperative Control in Multi-vehicle Systems

Title: Multi-residual Mixture of Experts Learning for Cooperative Control in Multi-vehicle Systems Multi-Residual Mixture of Experts Learning for Cooperative Control in Multi-Vehicle Systems 多车辆系统合作控制专家学习 2507.09836v1

Authors (4): Vindula Jayawardana, Sirui Li, Yashar Farid, Cathy Wu

Autonomous vehicles (AVs) are becoming increasingly popular, with their applications now extending beyond just a mode of transportation to serving as mobile actuators of a traffic flow to control flow dynamics. This contrasts with traditional fixed-location actuators, such as traffic signals, and is referred to as Lagrangian traffic control. However, designing effective Lagrangian traffic control policies for AVs that generalize across traffic scenarios introduces a major challenge. Real-world traffic environments are highly diverse, and developing policies that perform robustly across such diverse traffic scenarios is challenging. It is further compounded by the joint complexity of the multi-agent nature of traffic systems, mixed motives among participants, and conflicting optimization objectives subject to strict physical and external constraints. To address these challenges, we introduce Multi-Residual Mixture of Expert Learning (MRMEL), a novel framework for Lagrangian traffic control that augments a given suboptimal nominal policy with a learned residual while explicitly accounting for the structure of the traffic scenario space. In particular, taking inspiration from residual reinforcement learning, MRMEL augments a suboptimal nominal AV control policy by learning a residual correction, but at the same time dynamically selects the most suitable nominal policy from a pool of nominal policies conditioned on the traffic scenarios and modeled as a mixture of experts. We validate MRMEL using a case study in cooperative eco-driving at signalized intersections in Atlanta, Dallas Fort Worth, and Salt Lake City, with real-world data-driven traffic scenarios. The results show that MRMEL consistently yields superior performance, achieving an additional 4%-9% reduction in aggregate vehicle emissions relative to the strongest baseline in each setting.

自动驾驶车辆(AV)日益普及,其应用已不再局限于作为一种交通方式,还可充当交通流中的移动执行器来控制流动态。这与交通信号等传统的固定位置执行器形成对比,被称为拉格朗日交通控制。然而,为AV设计能够泛化到各种交通场景的有效拉格朗日交通控制策略是一项重大挑战。现实世界的交通环境高度多样,制定在如此多样的交通场景中都表现稳健的策略十分困难。交通系统的多智能体特性、参与者之间的混合动机,以及受严格物理和外部约束的相互冲突的优化目标,这些因素交织在一起,使问题更加复杂。为应对这些挑战,我们提出多残差专家混合学习(MRMEL),这是一种用于拉格朗日交通控制的新框架,它用学习到的残差来增强给定的次优标称策略,同时显式考虑交通场景空间的结构。具体而言,MRMEL借鉴残差强化学习,通过学习残差修正来增强次优的标称AV控制策略,同时根据交通场景,从建模为专家混合的标称策略池中动态选择最合适的标称策略。我们以亚特兰大、达拉斯-沃斯堡和盐湖城信号交叉口的协同生态驾驶为案例,使用由真实数据驱动的交通场景对MRMEL进行验证。结果表明,MRMEL始终取得更优的性能,相对于各设置中最强的基线,车辆总排放量额外减少4%-9%。
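The combination described in the MRMEL abstract, a gate over a pool of nominal policies plus a learned residual correction, can be sketched as follows. This is an illustrative reading, not the paper's implementation: the soft (softmax) gating, the scalar action, and every function name here are assumptions, and the abstract's "selects the most suitable nominal policy" could equally mean a hard selection.

```python
import math
from typing import Callable, List, Sequence

Policy = Callable[[Sequence[float]], float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_residual_action(
    state: Sequence[float],
    nominal_policies: Sequence[Policy],
    gate_scores: Callable[[Sequence[float]], Sequence[float]],
    residual: Policy,
) -> float:
    """Gate-weighted nominal action plus a residual correction."""
    weights = softmax(gate_scores(state))
    nominal = sum(w * pi(state) for w, pi in zip(weights, nominal_policies))
    return nominal + residual(state)


# Toy usage: two constant nominal policies, a uniform gate, a constant residual.
action = mixture_residual_action(
    [0.0],
    [lambda s: 1.0, lambda s: 3.0],
    lambda s: [0.0, 0.0],
    lambda s: 0.25,
)
```

In a trained system the gate scores and the residual would be learned functions of the traffic scenario; here they are constants so the arithmetic is easy to follow.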


Article 49

Title@2025-07-13 (7): TinyTroupe: An LLM-powered Multiagent Persona Simulation Toolkit

Title: TinyTroupe: An LLM-powered Multiagent Persona Simulation Toolkit TinyTroupe: Ein LLM-powered Multiagent Persona Simulation Toolkit TinyTroupe:一个由LLM驱动的多智能体人物角色模拟工具包 2507.09788v1

Authors (6): Paulo Salem, Robert Sim, Christopher Olsen, Prerit Saxena, Rafael Barcelos, Yi Ding

Recent advances in Large Language Models (LLM) have led to a new class of autonomous agents, renewing and expanding interest in the area. LLM-powered Multiagent Systems (MAS) have thus emerged, both for assistive and simulation purposes, yet tools for realistic human behavior simulation – with its distinctive challenges and opportunities – remain underdeveloped. Existing MAS libraries and tools lack fine-grained persona specifications, population sampling facilities, experimentation support, and integrated validation, among other key capabilities, limiting their utility for behavioral studies, social simulation, and related applications. To address these deficiencies, in this work we introduce TinyTroupe, a simulation toolkit enabling detailed persona definitions (e.g., nationality, age, occupation, personality, beliefs, behaviors) and programmatic control via numerous LLM-driven mechanisms. This allows for the concise formulation of behavioral problems of practical interest, either at the individual or group level, and provides effective means for their solution. TinyTroupe’s components are presented using representative working examples, such as brainstorming and market research sessions, thereby simultaneously clarifying their purpose and demonstrating their usefulness. Quantitative and qualitative evaluations of selected aspects are also provided, highlighting possibilities, limitations, and trade-offs. The approach, though realized as a specific Python implementation, is meant as a novel conceptual contribution, which can be partially or fully incorporated in other contexts. The library is available as open source at https://github.com/microsoft/tinytroupe.

大型语言模型(LLM)的最新进展催生了一类新的自主智能体,重新激发并扩大了人们对该领域的兴趣。由LLM驱动的多智能体系统(MAS)因此应运而生,既用于辅助目的,也用于模拟目的;然而,用于逼真的人类行为模拟的工具(这类模拟有其独特的挑战和机遇)仍不发达。现有的MAS库和工具缺乏细粒度的人物角色设定、人群抽样功能、实验支持和集成验证等关键能力,限制了它们在行为研究、社会模拟及相关应用中的效用。为弥补这些不足,我们在本文中提出TinyTroupe,这是一个模拟工具包,支持详细的人物角色定义(例如国籍、年龄、职业、性格、信念、行为),并通过多种LLM驱动的机制提供程序化控制。这使得人们能够在个体或群体层面简洁地表述具有实际意义的行为问题,并为解决这些问题提供有效手段。我们通过头脑风暴和市场调研会议等有代表性的工作示例介绍TinyTroupe的各个组件,从而同时阐明其用途并展示其实用性。我们还对选定方面进行了定量和定性评估,突出其可能性、局限性和权衡。该方法虽然以特定的Python实现呈现,但旨在作为一种新的概念性贡献,可以部分或全部应用于其他场景。该库已在 https://github.com/microsoft/tinytroupe 开源。


Article 50

Title@2025-07-13 (7): Negotiating Comfort: Simulating Personality-Driven LLM Agents in Shared Residential Social Networks

Title: Negotiating Comfort: Simulating Personality-Driven LLM Agents in Shared Residential Social Networks Verhandeln von Komfort: Simulieren von Persönlichkeits-getriebenen LLM-Agenten in Shared Residential Social Networks 谈判舒适:在共享住宅社会网络中模拟个性驱动的LLM代理 2507.09657v1

Authors (3): Ann Nedime Nese Rende, Tolga Yilmaz, Özgür Ulusoy

We use generative agents powered by large language models (LLMs) to simulate a social network in a shared residential building, driving the temperature decisions for a central heating system. Agents, divided into Family Members and Representatives, consider personal preferences, personal traits, connections, and weather conditions. Daily simulations involve family-level consensus followed by building-wide decisions among representatives. We tested three personality traits distributions (positive, mixed, and negative) and found that positive traits correlate with higher happiness and stronger friendships. Temperature preferences, assertiveness, and selflessness have a significant impact on happiness and decisions. This work demonstrates how LLM-driven agents can help simulate nuanced human behavior where complex real-life human simulations are difficult to set.

我们使用由大型语言模型(LLMs)驱动的基因代理器,在共用住宅楼模拟社会网络,推动中央供暖系统的温度决定。代理器分为家庭成员和代表,考虑个人偏好、个人特征、连接和天气条件。日常模拟涉及家庭一级的共识,随后代表之间形成全方位的决定。我们测试了三种个性特征分布(积极的、混合的和消极的),发现正面特征与更高幸福度和更牢固的友谊相关。温度偏好、自信和无私对幸福和决定有重大影响。这项工作表明LLM驱动的代理器如何在复杂的现实人类模拟难以设定的情况下帮助模拟细微人类行为。


Article 51

Title@2025-07-13 (7): VFlow: Discovering Optimal Agentic Workflows for Verilog Generation

Title: VFlow: Discovering Optimal Agentic Workflows for Verilog Generation VFlow: Optimale Agentische Workflows für die Verilog-Generation entdecken VFlow: 为维利罗格生成发现最佳样本工作流程 2504.03723v2

Authors (6): Yangbo Wei, Zhen Huang, Huang Li, Wei W. Xing, Ting-Jung Lin, Lei He

Hardware design automation faces challenges in generating high-quality Verilog code efficiently. This paper introduces VFlow, an automated framework that optimizes agentic workflows for Verilog code generation. Unlike traditional approaches relying on fixed prompts or manually designed flows, VFlow treats workflow discovery as a search over graph-structured LLM invocation sequences. It introduces a multi-population cooperative evolution (CEPE-MCTS) algorithm that balances multiple hardware objectives – functional correctness, area, power, timing and token cost – while sharing successful patterns and avoiding repeated failures. Integrated multi-level verification ensures syntactic correctness, functional behavior, and synthesizability. Experiments on VerilogEval and RTLLM2.0 show VFlow improves pass@1 by 20–30% over prompting baselines and closely matches designer-level area/power. Remarkably, VFlow enables small LLMs to outperform larger models with up to 10.9× ROI, offering a cost-effective solution for RTL design. This work paves the way for intelligent, automated hardware development, advancing LLM applications in EDA.

硬件设计自动化在高效生成高质量 Verilog 代码方面面临挑战。 本文介绍了VFlow, 这是一个优化 Verilog 代码生成的代理工作流程的自动化框架。 与依赖固定提示或人工设计的流程的传统方法不同, VFlow 将工作流程发现视为对图形结构LLM 设定序列的搜索。 它引入了多人口合作演进( CEEE- MCTS)算法, 平衡多种硬件目标 – – 功能正确性、 面积、 功率、 时间和象征性成本 – – 同时共享成功模式并避免重复失败。 综合多层次的核查确保了合成正确性、 功能行为和同步性。 VFlow 与 依赖固定提示或人工设计的流程的传统方法不同, VFlow 将工作流程的发现视为对图形结构化LLMM 2. 0 的扩展1 20- 30 ++ 与设计师级区域/ 能力密切匹配。 值得注意的是, VFlow 使小型LMM 能够超越大型模型, 高达10.9\ times ROI, 为RTLL 设计提供成本有效的解决方案。 。 。 。这项工作为智能、自动化硬件开发铺路铺路。


Article 52

Title@2025-07-13 (7): It’s Not All Black and White: Degree of Truthfulness for Risk-Avoiding Agents

Title: It’s Not All Black and White: Degree of Truthfulness for Risk-Avoiding Agents Es ist nicht alles schwarz und weiß: Grad von Wahrhaftigkeit für risikovermeidende Agenten 并非全黑白:风险避险剂的真实程度 2502.18805v2

Authors (3): Eden Hartman, Erel Segal-Halevi, Biaoshuai Tao

The classic notion of truthfulness requires that no agent has a profitable manipulation – an untruthful report that, for some combination of reports of the other agents, increases her utility. This strong notion implicitly assumes that the manipulating agent either knows what all other agents are going to report, or is willing to take the risk and act as if she knows their reports. Without knowledge of the others’ reports, most manipulations are risky – they might decrease the manipulator’s utility for some other combinations of reports by the other agents. Accordingly, a recent paper (Bu, Song and Tao, "On the existence of truthful fair cake cutting mechanisms", Artificial Intelligence 319 (2023), 103904) suggests a relaxed notion, which we refer to as risk-avoiding truthfulness (RAT), which requires only that no agent can gain from a safe manipulation – one that is sometimes beneficial and never harmful. Truthfulness and RAT are two extremes: the former considers manipulators with complete knowledge of others, whereas the latter considers manipulators with no knowledge at all. In reality, agents often know about some – but not all – of the other agents. This paper introduces the RAT-degree of a mechanism, defined as the smallest number of agents whose reports, if known, may allow another agent to safely manipulate, or n if there is no such number. This notion interpolates between classic truthfulness (degree n) and RAT (degree at least 1): a mechanism with a higher RAT-degree is harder to manipulate safely. To illustrate the generality and applicability of this concept, we analyze the RAT-degree of prominent mechanisms across various social choice settings, including auctions, indivisible goods allocations, cake-cutting, voting, and two-sided matching.

经典的 \ emph{ truthfulity 概念要求任何代理商都没有盈利性的操纵 – – 一个不真实的报告,对于其他代理商的报告组合来说,这个不真实的报告会增加她的效用。这个强烈的概念暗含地假设操纵代理商知道所有其他代理商要报告什么,或者如果她知道其他代理商的报告,愿意承担风险和行为。在不了解其他代理商的报告的情况下,多数操纵是 emph{risky} – –它们可能会降低操纵商对其它代理商报告的其他组合的 一种不真实的操作商的实用性。因此,最近的一份论文(布、宋和道,“关于真实公平蛋糕切割机制的存在,人工智能情报319(2023),103904) 暗示了一个宽松的概念,我们称之为“emph{ 风险- 肯定真实性(RAT) ” , 这只要求任何代理商不能从一个 development $ (e) or developmentality) work work (有时有利而且永远不会有害。真相和RAT 是两个极端的极端 : 前的操纵商认为, 而前的操纵商往往知道其他的操纵商的精调控算, 而后, 也认为其他的精调。


Article 53

Title@2025-07-13 (7): Can A Society of Generative Agents Simulate Human Behavior and Inform Public Health Policy? A Case Study on Vaccine Hesitancy

Title: Can A Society of Generative Agents Simulate Human Behavior and Inform Public Health Policy? A Case Study on Vaccine Hesitancy Kann eine Gesellschaft Generativer Mittel menschliches Verhalten simulieren und die öffentliche Gesundheitspolitik informieren? 基因代理学会能够模拟人类行为和信息公共卫生政策吗? 疫苗安全案例研究 2503.09639v4

Authors (9): Abe Bohan Hou, Hongru Du, Yichen Wang, Jingyu Zhang, Zixiao Wang, Paul Pu Liang, Daniel Khashabi, Lauren Gardner, Tianxing He

Can we simulate a sandbox society with generative agents to model human behavior, thereby reducing the over-reliance on real human trials for assessing public policies? In this work, we investigate the feasibility of simulating health-related decision-making, using vaccine hesitancy, defined as the delay in acceptance or refusal of vaccines despite the availability of vaccination services (MacDonald, 2015), as a case study. To this end, we introduce the VacSim framework with 100 generative agents powered by Large Language Models (LLMs). VacSim simulates vaccine policy outcomes with the following steps: 1) instantiate a population of agents with demographics based on census data; 2) connect the agents via a social network and model vaccine attitudes as a function of social dynamics and disease-related information; 3) design and evaluate various public health interventions aimed at mitigating vaccine hesitancy. To align with real-world results, we also introduce simulation warmup and attitude modulation to adjust agents’ attitudes. We propose a series of evaluations to assess the reliability of various LLM simulations. Experiments indicate that models like Llama and Qwen can simulate aspects of human behavior but also highlight real-world alignment challenges, such as inconsistent responses with demographic profiles. This early exploration of LLM-driven simulations is not meant to serve as definitive policy guidance; instead, it serves as a call for action to examine social simulation for policy development.

我们能否模拟沙箱社会,以基因化剂模拟人类行为,从而减少过度依赖真实的人类试验来评估公共政策?在这项工作中,我们调查利用疫苗偏执(即尽管有疫苗服务,但疫苗接受或拒绝疫苗方面的延误)来模拟沙箱社会,以模拟人类行为,从而减少对真实人类行为的过度依赖;为此,我们采用VacSim框架,以100种基因化剂来模拟人类行为,由大语言模型(LLLMs)提供动力;VacSim模拟疫苗政策结果,采取以下步骤:(1) 即时根据普查数据对具有人口统计特征的代理人口群体进行人口测试;(2) 利用社会网络和疫苗模式的态度,作为社会动态和疾病相关信息的函数,将这些代理者联系起来;(3) 设计和评价各种旨在减轻疫苗偏差的公共卫生干预措施(MacDonald,2015年),作为案例研究。 为了与现实世界的号召,我们还采用模拟暖和态度调节,以调整各种LM模拟的可靠性。 实验表明,Llamama和Qwen等模型通过社交网络和模型将模式作为社会行为作为最终的定位,可以模拟,作为人类行为的模拟,用来模拟,用来模拟,从而模拟,从而模拟地模拟地模拟地模拟地模拟地分析人类行为。


Article 54

Title@2025-07-12 (6): Adaptive Social Learning using Theory of Mind

Title: Adaptive Social Learning using Theory of Mind Adaptives Soziallernen unter Verwendung der Geistestheorie 利用思想理论进行适应性社会学习 2507.09409v1

Authors (4): Lance Ying, Ryan Truong, Joshua B. Tenenbaum, Samuel J. Gershman

Social learning is a powerful mechanism through which agents learn about the world from others. However, humans don’t always choose to observe others, since social learning can carry time and cognitive resource costs. How do people balance social and non-social learning? In this paper, we propose a rational mentalizing model of the decision to engage in social learning. This model estimates the utility of social learning by reasoning about the other agent’s goal and the informativity of their future actions. It then weighs the utility of social learning against the utility of self-exploration (non-social learning). Using a multi-player treasure hunt game, we show that our model can quantitatively capture human trade-offs between social and non-social learning. Furthermore, our results indicate that these two components allow agents to flexibly apply social learning to achieve their goals more efficiently.

社会学习是一种强大的机制,通过这种机制,代理商可以向他人了解世界。然而,人类并不总是选择观察他人,因为社会学习可以带来时间和认知资源成本。人们如何平衡社会学习和非社会学习?在本文中,我们提出了一个理性的思维模式,决定进行社会学习。这个模式通过推理另一个代理商的目标来估计社会学习的效用,以及他们未来行动的不通俗性。然后,它衡量社会学习的效用与自我探索(非社会学习)的效用。我们用多玩家的寻宝游戏来显示,我们的模型可以量化地捕捉人类在社会学习和非社会学习之间的取舍。此外,我们的结果表明,这两个组成部分允许代理商灵活应用社会学习来更有效地实现其目标。


Article 55

Title@2025-07-12 (6): StockSim: A Dual-Mode Order-Level Simulator for Evaluating Multi-Agent LLMs in Financial Markets

Title: StockSim: A Dual-Mode Order-Level Simulator for Evaluating Multi-Agent LLMs in Financial Markets StockSim: Ein Dual-Mode Order-Level Simulator zur Bewertung von Multi-Agent LLMs in Finanzmärkten StockSim: 金融市场多方商家LMS评估双Mo级命令级模拟器 2507.09255v1

Authors (6): Charidimos Papadakis, Giorgos Filandrianos, Angeliki Dimitriou, Maria Lymperaiou, Konstantinos Thomas, Giorgos Stamou

We present StockSim, an open-source simulation platform for systematic evaluation of large language models (LLMs) in realistic financial decision-making scenarios. Unlike previous toolkits that offer limited scope, StockSim delivers a comprehensive system that fully models market dynamics and supports diverse simulation modes of varying granularity. It incorporates critical real-world factors, such as latency, slippage, and order-book microstructure, that were previously neglected, enabling more faithful and insightful assessment of LLM-based trading agents. An extensible, role-based agent framework supports heterogeneous trading strategies and multi-agent coordination, making StockSim a uniquely capable testbed for NLP research on reasoning under uncertainty and sequential decision-making. We open-source all our code at https://github.com/harrypapa2002/StockSim.

我们介绍了StockSim(StockSim),这是一个在现实的财政决策情景中系统评价大型语言模型的开放源码模拟平台;与以往提供有限范围的工具包不同,StockSim提供了一个全面系统,充分模拟市场动态,支持不同颗粒的多种模拟模式;它包含以前被忽视的关键性现实世界因素,如潜伏、缓冲和订货簿微观结构,使得能够对以LLLM为基础的贸易代理人进行更忠实和有见地的评估;一个可扩展的、基于作用的代理框架支持多种贸易战略和多剂协调,使StockSim成为NLP关于不确定性和顺序决策下推理研究的独特能力测试台;我们在https:/github.com/harrypa2002/StockSim公开了我们所有的代码。


Article 56

Title@2025-07-12 (6): Coordinated Communication and Inventory Optimization in Multi-Retailer Supply Chains

Title: Coordinated Communication and Inventory Optimization in Multi-Retailer Supply Chains Koordinierte Kommunikation und Bestandsoptimierung in Multi-Retailer Supply Chains 多零售供应链中协调通信和库存协调优化 2507.09223v1

Authors (2): Sagar Sudhakara, Yuchong Zhang

We consider a multi-retailer supply chain where each retailer can dynamically choose when to share information (e.g., local inventory levels or demand observations) with other retailers, incurring a communication cost for each sharing event. This flexible information exchange mechanism contrasts with fixed protocols such as always sharing or never sharing. We formulate a joint optimization of inventory control and communication strategies, aiming to balance the trade-off between communication overhead and operational performance (service levels, holding, and stockout costs). We adopt a common information framework and derive a centralized Partially Observable Markov Decision Process (POMDP) model for a supply chain coordinator. Solving this coordinator’s POMDP via dynamic programming characterizes the structure of optimal policies, determining when retailers should communicate and how they should adjust orders based on available information. We show that, in this setting, retailers can often act optimally by sharing only limited summaries of their private data, reducing communication frequency without compromising performance. We also incorporate practical constraints on communication frequency and propose an approximate point-based POMDP solution method (PBVI/SARSOP) to address computational complexity. Numerical experiments on multi-retailer inventory scenarios demonstrate that our approach significantly improves the cost-service trade-off compared to static information sharing policies, effectively optimizing the schedule of information exchange for cooperative inventory control.

我们考虑建立一个多零售商供应链,使每个零售商能够动态地选择何时与其他零售商分享信息(例如,当地库存水平或需求观察),为每个共享活动带来通信费用。这种灵活的信息交流机制与固定协议形成对比,例如总是共享或永不共享。我们制定联合优化库存控制和通信战略,目的是平衡通信间接费用和业务业绩(服务水平、持有和库存成本)之间的权衡。我们采用一个共同的信息框架,并为供应链协调员推出一个集中的局部可观测的马尔科夫决定程序(POMDP)模式。通过动态的方案编制解决该协调员的POMDP,是最佳政策结构的特点,决定零售商应何时沟通,以及他们应如何根据现有信息调整订单。我们表明,在这一背景下,零售商通常能够最佳地采取行动,只分享有限的私人数据摘要,降低通信频率,同时不损害绩效。我们还纳入了通信频率的实际限制,并提出一种近乎点的POMDP解决方案(PBVI/SARSOP)模式,以解决计算成本的复杂性。关于共享多定期信息交易计划的合作性实验,以大幅改进我们共享的统计式交易时间表,以展示我们的合作性交易列表式交易计划。


Article 57

Title@2025-07-11 (5): Accelerating Drug Discovery Through Agentic AI: A Multi-Agent Approach to Laboratory Automation in the DMTA Cycle

Title: Accelerating Drug Discovery Through Agentic AI: A Multi-Agent Approach to Laboratory Automation in the DMTA Cycle Beschleunigen der Wirkstoff-Discovery durch Agentic AI: Multi-Agenten-Ansatz zur Laborautomatisierung im DMTA-Zyklus AI:对DMTTA周期实验室自动化采取多机构办法 2507.09023v1

Authors (12): Yao Fehlis, Charles Crain, Aidan Jensen, Michael Watson, James Juhasz, Paul Mandel, Betty Liu, Shawn Mahon, Daren Wilson, Nick Lynch-Jonely, Ben Leedom, David Fuller

The pharmaceutical industry faces unprecedented challenges in drug discovery, with traditional approaches struggling to meet modern therapeutic development demands. This paper introduces a novel AI framework, Tippy, that transforms laboratory automation through specialized AI agents operating within the Design-Make-Test-Analyze (DMTA) cycle. Our multi-agent system employs five specialized agents - Supervisor, Molecule, Lab, Analysis, and Report, with Safety Guardrail oversight - each designed to excel in specific phases of the drug discovery pipeline. Tippy represents the first production-ready implementation of specialized AI agents for automating the DMTA cycle, providing a concrete example of how AI can transform laboratory workflows. By leveraging autonomous AI agents that reason, plan, and collaborate, we demonstrate how Tippy accelerates DMTA cycles while maintaining scientific rigor essential for pharmaceutical research. The system shows significant improvements in workflow efficiency, decision-making speed, and cross-disciplinary coordination, offering a new paradigm for AI-assisted drug discovery.

制药业在药物发现方面面临着前所未有的挑战,传统方法在努力满足现代治疗发展需求。本文件介绍一个新的AI框架Tippy,通过在设计-制造-测试-分析(DMTA)周期内运作的专门的AI代理机构改造实验室自动化。我们的多试剂系统雇用了5个专业代理机构 — — 主管、分子、实验室、分析和报告,由安全卫士监督,每个监督机构都旨在在药物发现管道的特定阶段取得优异成绩。Tippy是首次为DMTA周期自动化而实施专门的AI代理机构,为AI如何改变实验室工作流程提供了具体的范例。我们通过利用自主的AI代理机构来解释、规划和合作,我们展示了Tippy如何加速DMTA周期,同时保持药物研究所必需的科学规范。这个系统在工作流程效率、决策速度和跨学科协调方面有了显著的改进,为AI辅助药物发现提供了新的范例。


Article 58

Title@2025-07-11 (5): Equilibria in multiagent online problems with predictions

Title: Equilibria in multiagent online problems with predictions Equilibria in Multiagent Online-Probleme mit Vorhersagen 多试剂在线预测问题中的平衡 2405.11873v3

Authors (3): Gabriel Istrate, Cosmin Bonchiş, Victor Bogdan

We study the power of (competitive) algorithms with predictions in a multiagent setting. To this goal, we introduce a multiagent version of the ski-rental problem. In this problem agents can collaborate by pooling resources to get a group license for some asset. If the license price is not met then agents have to rent the asset individually for the day at a unit price. Otherwise the license becomes available forever to everyone at no extra cost. We investigate the effect of using predictors for self and others’ behavior in such a setting, as well as the new equilibria formed in this way.

我们研究(竞争性)算法在多试剂环境下的预测能力。 为此, 我们引入了滑雪租赁问题的多试剂版本。 在此问题上, 问题代理商可以通过汇集资源获得某些资产的集体许可来合作。 如果不符合许可价格, 代理商必须按单位价格单独租赁一天的资产。 否则, 许可证将永远免费向所有人提供。 我们调查在这种环境下使用预测器进行自我行为和他人行为的效果, 以及由此形成的新的平衡。
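The group-license rule described in the abstract above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the authors' model: the function name `simulate_group_license`, the rule that pledges are paid in full on the day the license is bought, and the choice not to carry pledges over between days are all hypothetical.

```python
def simulate_group_license(pledges_per_day, license_price, rent_price=1.0):
    """Return each agent's total cost under a simple pooling rule.

    pledges_per_day[t][i] is the amount agent i pledges on day t
    (hypothetical input format). Assumption: pledges are only charged
    on the day their sum first reaches the license price.
    """
    n_agents = len(pledges_per_day[0])
    costs = [0.0] * n_agents
    for pledges in pledges_per_day:
        if sum(pledges) >= license_price:
            # License price met: each agent pays its pledge, and the
            # asset is free for everyone from now on.
            for i, pledge in enumerate(pledges):
                costs[i] += pledge
            break
        # License price not met: every agent rents for the day.
        for i in range(n_agents):
            costs[i] += rent_price
    return costs
```

For example, with two agents who pledge nothing on day one and a combined 5 on day two against a license price of 5, each agent pays one day of rent plus its own pledge.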


Article 59

Title@2025-07-11 (5): How to Train a Leader: Hierarchical Reasoning in Multi-Agent LLMs

Title: How to Train a Leader: Hierarchical Reasoning in Multi-Agent LLMs Wie man einen Führer ausbildet: Hierarchische Vernunft in multi-agenten LLMs 如何培训领导者:多机构LLM中的等级原因 2507.08960v1

Authors (4): Andrew Estornell, Jean-Francois Ton, Muhammad Faaiz Taufiq, Hang Li

Large Language Models (LLMs) have achieved strong performance on a wide range of complex reasoning tasks, yet further gains are often possible by leveraging the complementary strengths of multiple models. While multi-agent frameworks can improve solution quality by leveraging multiple LLMs, existing methods are often computationally expensive, both at training and inference time. In this work, we introduce a hierarchical multi-agent framework that addresses these challenges by training only a single leader LLM to coordinate a team of untrained peer agents. To this end, we propose Multi-agent guided Leader Policy Optimization (MLPO), a novel approach which trains the leader to evaluate and synthesize agent responses without auxiliary value networks or explicit agent feedback. Leaders trained with MLPO perform better not only when interacting with the agent team at inference time, but also when deployed in single-agent settings without the team. Empirical results on Big-Bench Hard (BBH), MATH, and MMLU demonstrate that our framework achieves substantial performance improvements over both single-agent and multi-agent baselines. Our results highlight the effectiveness and efficiency of training a single, flexible leader for collaborative reasoning in multi-agent LLM systems.

大型语言模型(LLMS)在一系列复杂的推理任务中取得了强有力的成绩,但通过利用多种模型的互补优势,往往能够取得更大的收益。虽然多试剂框架能够通过利用多个LLMS来提高解决方案的质量,但现有的方法在培训和推论时间往往在计算上都非常昂贵。在这项工作中,我们引入了一个等级分级的多试剂框架来应对这些挑战,只培训一个领导LLMM来协调一组未经培训的同行代理人。为此,我们提议多试剂指导领导政策(Textbf{O}propimization),这是一种新颖的方法,培训领导人在没有辅助价值网络或明确代理商反馈的情况下评价和综合代理方反应。与MLPO培训的领导人不仅在推论时间与代理团队互动时提高了业绩,而且在没有小组的情况下在单一代理环境中部署时也提高了业绩。BBench Hard(BBHH)、MATH和MLU的实证结果表明,我们的框架在单一试剂和多试剂基线上都取得了重大的业绩改进。我们的成果突出表明了培训一名单一、灵活的LM领导在多边推理系统上的有效性和效率。


Article 60

Title@2025-07-11 (5): Optimizing Sequential Multi-Step Tasks with Parallel LLM Agents

Title: Optimizing Sequential Multi-Step Tasks with Parallel LLM Agents Optimierung sequentieller Mehrschritt-Aufgaben mit parallelen LLM-Agenten 与平行LLM代理商优化序列式多步骤任务 2507.08944v1

Authors (6): Enhao Zhang, Erkang Zhu, Gagan Bansal, Adam Fourney, Hussein Mozannar, Jack Gerrits

Large language model (LLM)-based multi-agent systems have demonstrated remarkable promise for tackling complex tasks by breaking them down into subtasks that are iteratively planned, executed, observed, and refined. Despite their effectiveness, these systems often incur high latency because real-world problems frequently demand multiple iterative cycles of reasoning steps. To address this challenge, we propose M1-Parallel, a framework that runs multiple multi-agent teams in parallel to uncover distinct solution paths. By leveraging an event-driven communication model with asynchronous messaging, M1-Parallel efficiently capitalizes on the inherent diversity of valid plans to either reduce end-to-end latency or boost task completion rates. Our experiments on complex tasks show that M1-Parallel with early termination achieves up to 2.2× speedup while preserving accuracy, and that M1-Parallel with aggregation yields higher task completion rates. We further investigate strategies aimed at encouraging diverse execution plans but observe no additional performance gains over repeated sampling. Overall, these findings underscore the potential of parallel plan execution for optimizing multi-agent systems for real-world, high-complexity reasoning tasks.

大型语言模式(LLM)的大型多试剂系统显示,通过将这些系统拆成可反复规划、执行、观察和完善的子任务,在应对复杂任务方面表现出非凡的希望。尽管这些系统是有效的,但它们往往具有很高的潜伏性,因为现实世界的问题经常要求多个反复循环的推理步骤。为了应对这一挑战,我们提议M1-Parallel,这个框架同时运行多个多试剂小组,以找出不同的解决办法。通过利用由事件驱动的通信模式,以无同步信息、M1-Parallel,有效地利用有效计划的内在多样性来降低终端到终端的延时率或提高任务完成率。我们在复杂任务方面的实验表明,M1-Parallel在保持准确性的同时,能够达到22美元的速度,而M1-Parallel则能够带来更高的任务完成率。我们进一步调查旨在鼓励不同执行计划的战略,但在重复抽样后没有看到额外的业绩收益。总体而言,这些结论强调了为现实世界、高复合性推理学任务优化多试系统而平行执行计划的潜力。
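The "parallel teams with early termination" idea in the abstract above can be illustrated with Python's standard `asyncio` library. This is a minimal sketch, not the M1-Parallel implementation: `run_team` is a hypothetical stand-in that sleeps instead of calling any agents, and the policy of accepting the first team to finish is an assumption made for illustration.

```python
import asyncio


async def run_team(team_id, delay, answer):
    """Hypothetical stand-in for one multi-agent team solving the task."""
    await asyncio.sleep(delay)  # simulates planning and execution time
    return team_id, answer


async def first_finished(teams):
    """Run all teams concurrently and keep the first result that arrives."""
    tasks = [asyncio.create_task(run_team(*team)) for team in teams]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # Early termination: cancel the teams that are still working.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return next(iter(done)).result()


def solve_with_early_termination(teams):
    """Synchronous entry point; each team is (team_id, delay, answer)."""
    return asyncio.run(first_finished(teams))
```

Calling `solve_with_early_termination([("a", 0.2, "slow"), ("b", 0.01, "fast")])` returns the result of team "b", since it completes first and the slower team is cancelled. An aggregation variant would instead await all tasks and combine their answers.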


Article 61

Title@2025-07-11 (5): Experimental Setup and Software Pipeline to Evaluate Optimization based Autonomous Multi-Robot Search Algorithms

Title: Experimental Setup and Software Pipeline to Evaluate Optimization based Autonomous Multi-Robot Search Algorithms Experimentelle Einrichtung und Software-Pipeline zur Bewertung von Optimierungs-basierten autonomen Multi-Roboter-Suche Algorithmen 实验设置和软件管道以评价基于优化的自动多机器人搜索算法 2506.16710v3

Authors (5): Aditya Bhatt, Mary Katherine Corra, Franklin Merlo, Prajit KrisshnaKumar, Souma Chowdhury

Signal source localization has been a problem of interest in the multi-robot systems domain given its applications in search & rescue and hazard localization in various industrial and outdoor settings. A variety of multi-robot search algorithms exist that usually formulate and solve the associated autonomous motion planning problem as a heuristic model-free or belief model-based optimization process. Most of these algorithms, however, remain tested only in simulation, thereby losing the opportunity to generate knowledge about how such algorithms would compare/contrast in a real physical setting in terms of search performance and real-time computing performance. To address this gap, this paper presents a new lab-scale physical setup and associated open-source software pipeline to evaluate and benchmark multi-robot search algorithms. The presented physical setup innovatively uses an acoustic source (that is safe and inexpensive) and small ground robots (e-pucks) operating in a standard motion-capture environment. This setup can be easily recreated and used by most robotics researchers. The acoustic source also presents interesting uncertainty in terms of its noise-to-signal ratio, which is useful to assess sim-to-real gaps. The overall software pipeline is designed to readily interface with any multi-robot search algorithm with minimal effort and is executable in parallel asynchronous form. This pipeline includes a framework for distributed implementation of multi-robot or swarm search algorithms, integrated with a ROS (Robot Operating System)-based software stack for motion-capture-supported localization. The utility of this novel setup is demonstrated by using it to evaluate two state-of-the-art multi-robot search algorithms, based on swarm optimization and batch-Bayesian Optimization (called Bayes-Swarm), as well as a random walk baseline.

信号源本地化一直是多机器人系统域中一个令人感兴趣的问题,因为它在各种工业和室外环境中应用了搜索和救援以及危险本地化等应用程序,因此对多机器人系统域产生了兴趣。 存在多种多机器人搜索算法,这些算法通常会将相关的自主动作规划问题作为无超自然模型或基于信仰的模型优化程序来制定和解决。 这些算法大多只是模拟测试,从而失去了在搜索性能和实时计算性能方面如何在真实的战时物理环境中比较/调控的知识。 为解决这一差距,本文展示了一个新的实验室级物理设置和相关的开源的开源软件管道管道以评价和基准多机器人搜索算法。 所展示的物理设置创新使用一个声学源(这是安全和廉价的)和小型地面机器人(e-pucks)在一个标准的运动封套件环境中运作。 这种设置很容易被重新创造出来,并被多数机器人研究人员用作支持。 声学源源也展示了在两个基于噪音到信号的轨道运行运行率比率方面的令人感兴趣的不确定性, 用于对整个管道界面进行快速的搜索。


Article 62

Title@2025-07-11 (5): Upgrade or Switch: Do We Need a Next-Gen Trusted Architecture for the Internet of AI Agents?

Title: Upgrade or Switch: Do We Need a Next-Gen Trusted Architecture for the Internet of AI Agents? Upgrade oder Switch: Brauchen wir eine vertrauenswürdige Next-Gen-Architektur für das Internet von KI-Agenten? 升级或切换:我们是否需要为AI代理商的互联网建立下一代信任的架构? 2506.12003v2

Authors (14): Ramesh Raskar, Pradyumna Chari, Jared James Grogan, Mahesh Lambe, Robert Lincourt, Raghu Bala, Aditi Joshi, Abhishek Singh, Ayush Chopra, Rajesh Ranjan, Shailja Gupta, Dimitris Stripelis, Maria Gorskikh, Sichao Wang

The emerging Internet of AI Agents challenges existing web infrastructure designed for human-scale, reactive interactions. Unlike traditional web resources, autonomous AI agents initiate actions, maintain persistent state, spawn sub-agents, and negotiate directly with peers: demanding millisecond-level discovery, instant credential revocation, and cryptographic behavioral proofs that exceed current DNS/PKI capabilities. This paper analyzes whether to upgrade existing infrastructure or implement purpose-built index architectures for autonomous agents. We identify critical failure points: DNS propagation (24-48 hours vs. required milliseconds), certificate revocation unable to scale to trillions of entities, and IPv4/IPv6 addressing inadequate for agent-scale routing. We evaluate three approaches: (1) Upgrade paths, (2) Switch options, (3) Hybrid index/registries. Drawing parallels to dialup-to-broadband transitions, we find that agent requirements constitute qualitative, and not incremental, changes. While upgrades offer compatibility and faster deployment, clean-slate solutions provide better performance but require longer for adoption. Our analysis suggests hybrid approaches will emerge, with centralized indexes for critical agents and federated meshes for specialized use cases.

AI代理商的新兴互联网挑战了为人类规模和反应性互动设计的现有网络基础设施。与传统的网络资源不同,自主的AI代理商发起行动,保持持久性状态,产卵子试剂,并与同行直接谈判:要求获得毫秒水平的发现,即时认证撤销,以及超出当前DNS/PKI能力的加密行为证明。本文分析是对现有基础设施进行升级,还是为自主代理商实施目的建造的指数结构。我们查明了关键的故障点:DNS传播(24至48小时相对于要求的毫秒 ) , 证书撤销无法达到数万亿实体, IPv4/IPv6 解决代理规模路由不足的问题。我们评估了三种方法:(1) 升级路径,(2) 切换选项,(3) 混合索引/登记。我们发现代理要求是定性的,而不是递增的。升级提供了兼容性和更快的部署,清洁的解决方案提供了更好的绩效,但需要更长的采用。我们的分析认为,混合方法将出现,关键代理商的中央指数和专门使用的美式胶片。


Article 63

Title@2025-07-11 (5): Safe Deep Reinforcement Learning for Resource Allocation with Peak Age of Information Violation Guarantees

Title: Safe Deep Reinforcement Learning for Resource Allocation with Peak Age of Information Violation Guarantees Sicheres tiefes Stärkungslernen für Ressourcenallokation mit Spitzenzeit der Informationsverletzungsgarantien 安全深强化学习,以进行违反信息达到高峰年龄的违反信息保障的资源分配 2507.08653v1

Authors (2): Berire Gunes Reyhan, Sinem Coleri

In Wireless Networked Control Systems (WNCSs), control and communication systems must be co-designed due to their strong interdependence. This paper presents a novel optimization theory-based safe deep reinforcement learning (DRL) framework for ultra-reliable WNCSs, ensuring constraint satisfaction while optimizing performance, for the first time in the literature. The approach minimizes power consumption under key constraints, including Peak Age of Information (PAoI) violation probability, transmit power, and schedulability in the finite blocklength regime. PAoI violation probability is uniquely derived by combining stochastic maximum allowable transfer interval (MATI) and maximum allowable packet delay (MAD) constraints in a multi-sensor network. The framework consists of two stages: optimization theory and safe DRL. The first stage derives optimality conditions to establish mathematical relationships among variables, simplifying and decomposing the problem. The second stage employs a safe DRL model where a teacher-student framework guides the DRL agent (student). The control mechanism (teacher) evaluates compliance with system constraints and suggests the nearest feasible action when needed. Extensive simulations show that the proposed framework outperforms rule-based and other optimization theory based DRL benchmarks, achieving faster convergence, higher rewards, and greater stability.

在无线网络控制系统(WNCS)中,控制与通信系统必须由于其强大的相互依存关系而共同设计。本文件为超可靠网络中超可靠网络系统提供了一个新型优化理论安全深度强化学习框架(DRL)框架,首次在文献中确保限制满意度和最佳性能,首次在文献中确保限制满意度和最佳性能。这种方法在关键限制下最大限度地减少动力消耗,包括信息高峰时代(PaoI)违反概率、传输力和有限区长系统中的延缓性。PaoI违规概率的独特来源是将多传感器网络中最大允许传输间隔(MATI)和最大允许包延迟限制(MAD)相结合。框架由两个阶段组成:优化理论和安全DRL。第一阶段产生最佳性条件,以便在变量之间建立数学关系,简化和分解问题。第二阶段采用安全DRL模型,教师-学生框架指导DRL代理(学生)。控制机制(教师)评估系统限制的遵守情况,并在必要时建议最接近的可行行动。根据更快速的理论、更快速的模拟标准,显示实现更稳定、更稳定、更快速的模型和更快的模型。
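The teacher step described in the abstract above (check constraint compliance, then suggest the nearest feasible action when needed) can be sketched generically. This is an illustration only: the function `teacher_correct`, the finite candidate set, and the Euclidean distance measure are assumptions, and the paper's actual constraints (PAoI violation probability, transmit power, schedulability) are replaced here by an arbitrary feasibility predicate.

```python
import math


def teacher_correct(action, candidates, is_feasible):
    """Return the student's action if feasible, else the nearest feasible one.

    action      -- tuple of floats proposed by the DRL agent (student)
    candidates  -- finite list of tuples the teacher may suggest (assumption)
    is_feasible -- predicate standing in for the system constraints
    """
    if is_feasible(action):
        return action
    feasible = [c for c in candidates if is_feasible(c)]
    if not feasible:
        raise ValueError("no feasible action available")
    # Nearest feasible action by Euclidean distance (assumed metric).
    return min(feasible, key=lambda c: math.dist(c, action))
```

For instance, with candidate power levels `(0.1,)`, `(0.5,)`, `(0.9,)` and a constraint that power stays at or below 0.6, a proposed action of `(0.8,)` is replaced by `(0.5,)`, while `(0.4,)` passes through unchanged.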


Article 64

Title@2025-07-11 (5): Open Source Planning & Control System with Language Agents for Autonomous Scientific Discovery

Title: Open Source Planning & Control System with Language Agents for Autonomous Scientific Discovery Open Source Planning & Control System mit Language Agents für autonome wissenschaftliche Entdeckung 拥有自主科学发现语言代理的开放源规划和控制系统 2507.07257v2

Authors (26): Licong Xu, Milind Sarkar, Anto I. Lonappan, Íñigo Zubeldia, Pablo Villanueva-Domingo, Santiago Casas, Christian Fidler, Chetana Amancharla, Ujjwal Tiwari, Adrian Bayer, Chadi Ait Ekioui, Miles Cranmer, Adrian Dimitrov, James Fergusson, Kahaan Gandhi, Sven Krippendorf, Andrew Laverick, Julien Lesgourgues, Antony Lewis, Thomas Meier, Blake Sherwin, Kristen Surrao, Francisco Villaescusa-Navarro, Chi Wang, Xueqing Xu, Boris Bolliet

We present a multi-agent system for automation of scientific research tasks, cmbagent (https://github.com/CMBAgents/cmbagent). The system is formed by about 30 Large Language Model (LLM) agents and implements a Planning & Control strategy to orchestrate the agentic workflow, with no human-in-the-loop at any point. Each agent specializes in a different task (performing retrieval on scientific papers and codebases, writing code, interpreting results, critiquing the output of other agents) and the system is able to execute code locally. We successfully apply cmbagent to carry out a PhD level cosmology task (the measurement of cosmological parameters using supernova data) and evaluate its performance on two benchmark sets, finding superior performance over state-of-the-art LLMs. The source code is available on GitHub, demonstration videos are also available, and the system is deployed on HuggingFace and will be available on the cloud.

我们提出了科研任务自动化的多试剂系统(https://github.com/CMBAgents/cmbagents),该系统由大约30个大语言模型代理商组成,并执行一项规划与控制战略,以协调代理工作流程,在任何时候都没有人手,每个代理商都从事不同的任务(对科学文件和代码库进行检索、写法、解释代码、解释结果、使其他代理商的输出发生误差),该系统能够在当地执行代码。我们成功地应用了计算机代理商执行博士级宇宙学任务(使用超新星数据测量宇宙参数),并评价其在两套基准系列上的业绩,找到顶级LLMMs的优越性能。源代码可在GitHub上查到,演示视频也可以在Hugging Face上安装,并将在云层上提供。


Article 65

Title@2025-07-11 (5): AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs

Title: AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs AgentsNet: Koordination und kollaborative Reasoning in Multi-Agent LLMs 网:多机构LLM中的协调与合作理由 2507.08616v1

Authors (5): Florian Grötschla, Luis Müller, Jan Tönshoff, Mikhail Galkin, Bryan Perozzi

Large language models (LLMs) have demonstrated powerful problem-solving capabilities, in particular when organized in multi-agent systems. However, the advent of such systems also raises several questions on the ability of a complex network of agents to effectively self-organize and collaborate. While measuring performance on standard reasoning benchmarks indicates how well multi-agent systems can solve reasoning tasks, it is unclear whether these systems are able to leverage their topology effectively. Here, we propose AgentsNet, a new benchmark for multi-agent reasoning. By drawing inspiration from classical problems in distributed systems and graph theory, AgentsNet measures the ability of multi-agent systems to collaboratively form strategies for problem-solving, self-organization, and effective communication given a network topology. We evaluate a variety of baseline methods on AgentsNet including homogeneous networks of agents which first have to agree on basic protocols for organization and communication. We find that some frontier LLMs are already demonstrating strong performance for small networks but begin to fall off once the size of the network scales. While existing multi-agent benchmarks cover at most 2-5 agents, AgentsNet is practically unlimited in size and can scale with new generations of LLMs. As such, we also probe frontier models in a setup with up to 100 agents.

大型语言模型(LLMS)显示了强大的解决问题能力,特别是在多试剂系统中组织起来时,这种系统的出现也引起了关于复杂的代理人网络有效自我组织和协作能力的若干问题。衡量标准推理基准的绩效表明,多试剂系统能如何很好地解决推理任务,但尚不清楚这些系统是否能够有效地利用它们的地形学。我们在此提出多试剂网络,这是多试剂推理的新基准。通过从分布式系统和图表理论的典型问题中汲取灵感,代理网络衡量多试剂系统以协作方式制定解决问题、自我组织和有效通信战略的能力。我们评估了各种关于代理网络的基线方法,包括同质的代理人网络,这些代理网络首先必须商定组织和通信的基本规程。我们发现,一些边际LMS已经显示出小型网络的强大性能,但一旦网络规模缩小,就开始下降。虽然现有的多试剂基准覆盖了大多数2-5个代理人,但代理网络实际上规模有限,可以与新一代的LMSMs合作制定规模和规模。我们还要研究100个边界模型。


Article 66

Title@2025-07-11 (5): To Trade or Not to Trade: An Agentic Approach to Estimating Market Risk Improves Trading Decisions

Title: To Trade or Not to Trade: An Agentic Approach to Estimating Market Risk Improves Trading Decisions Handel oder Nichthandel: Ein Agentischer Ansatz zur Schätzung des Marktrisikos verbessert Handelsentscheidungen 贸易或非贸易贸易:估计市场风险的代理办法 改善贸易决定 2507.08584v1

Authors (4): Dimitrios Emmanoulopoulos, Ollie Olby, Justin Lyon, Namid R. Stillman

Large language models (LLMs) are increasingly deployed in agentic frameworks, in which prompts trigger complex tool-based analysis in pursuit of a goal. While these frameworks have shown promise across multiple domains including in finance, they typically lack a principled model-building step, relying instead on sentiment- or trend-based analysis. We address this gap by developing an agentic system that uses LLMs to iteratively discover stochastic differential equations for financial time series. These models generate risk metrics which inform daily trading decisions. We evaluate our system in both traditional backtests and using a market simulator, which introduces synthetic but causally plausible price paths and news events. We find that model-informed trading strategies outperform standard LLM-based agents, improving Sharpe ratios across multiple equities. Our results show that combining LLMs with agentic model discovery enhances market risk estimation and enables more profitable trading decisions.

大型语言模型(LLMS)越来越多地被部署在代理人框架中,从而触发了为实现一个目标而进行的基于工具的复杂分析。这些框架虽然在包括金融在内的多个领域显示了希望,但通常缺乏原则性的模式建设步骤,而是依赖情绪或趋势分析。我们通过开发一种代理系统,利用LLMS反复发现金融时间序列的随机差异方程式来弥补这一差距。这些模型产生风险指标,为日常贸易决策提供信息。我们在传统的后期测试和使用市场模拟器来评估我们的系统,该模拟器引入了合成但因果似似实的价格路径和新闻事件。我们发现,示范知情的贸易战略超越了基于LLMM的标准代理商,改善了多种股票的普惠比率。我们的成果显示,将LMS与代理模型发现结合起来,可以提高市场风险估计,并使得交易决定更加有利可图。
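
To make the link between a stochastic differential equation and a trading metric concrete, here is a minimal Python sketch. It assumes geometric Brownian motion as the SDE and an Euler-Maruyama discretization with 252 trading days per year; the abstract does not say which equations the system discovers, so the model, the function names, and the parameters are all illustrative assumptions.

```python
import math
import random
import statistics

def simulate_gbm(s0, mu, sigma, days, rng, dt=1 / 252):
    """Euler-Maruyama price path for dS = mu*S*dt + sigma*S*dW (assumed model)."""
    path = [s0]
    for _ in range(days):
        # Brownian increment over one time step of length dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        price = path[-1]
        path.append(price + mu * price * dt + sigma * price * dw)
    return path

def daily_returns(path):
    """Simple returns between consecutive prices."""
    return [b / a - 1.0 for a, b in zip(path, path[1:])]

def sharpe_ratio(returns, risk_free=0.0, periods=252):
    """Annualized Sharpe ratio: mean excess return over its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods)

path = simulate_gbm(100.0, mu=0.05, sigma=0.2, days=252, rng=random.Random(0))
ratio = sharpe_ratio(daily_returns(path))
```

The seeded `random.Random` keeps the simulated path reproducible, which is convenient when comparing strategies on the same synthetic price series.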


Article 67

Title@2025-07-11 (5): Finding Common Ground: Using Large Language Models to Detect Agreement in Multi-Agent Decision Conferences

Title: Finding Common Ground: Using Large Language Models to Detect Agreement in Multi-Agent Decision Conferences Gemeinsamer Grund: Mit großen Sprachmodellen Vereinbarungen in Multi-Agent-Entscheidungskonferenzen zu erkennen 寻找共同点:在多机构决定会议上使用大语言模型来检测协议 2507.08440v1

Authors (4): Selina Heller, Mohamed Ibrahim, David Antony Selby, Sebastian Vollmer

Decision conferences are structured, collaborative meetings that bring together experts from various fields to address complex issues and reach a consensus on recommendations for future actions or policies. These conferences often rely on facilitated discussions to ensure productive dialogue and collective agreement. Recently, Large Language Models (LLMs) have shown significant promise in simulating real-world scenarios, particularly through collaborative multi-agent systems that mimic group interactions. In this work, we present a novel LLM-based multi-agent system designed to simulate decision conferences, specifically focusing on detecting agreement among the participant agents. To achieve this, we evaluate six distinct LLMs on two tasks: stance detection, which identifies the position an agent takes on a given issue, and stance polarity detection, which identifies the sentiment as positive, negative, or neutral. These models are further assessed within the multi-agent system to determine their effectiveness in complex simulations. Our results indicate that LLMs can reliably detect agreement even in dynamic and nuanced debates. Incorporating an agreement-detection agent within the system can also improve the efficiency of group debates and enhance the overall quality and coherence of deliberations, making them comparable to real-world decision conferences regarding outcome and decision-making. These findings demonstrate the potential for LLM-based multi-agent systems to simulate group decision-making processes. They also highlight that such systems could be instrumental in supporting decision-making with expert elicitation workshops across various domains.

为实现这一目标,我们评估了六项不同的任务:立场探测,确定代理人在某一问题上的立场,以及立场对极地探测,确定积极、消极或中性的情绪。这些模型在多机构系统中得到了进一步评估,以确定其在复杂的模拟中的有效性。我们的结果表明,即使在动态和细致的辩论中,LLMS也能可靠地检测到一致意见。在系统内设立一个协议检测代理人也可以提高小组辩论的效率,提高审议的总体质量和一致性。这些结论还表明,在多边机制中,支持成果和决策过程的多方代表制。


Article 68

Title@2025-07-11 (5): Properties of Quasi-synchronization Time of High-dimensional Hegselmann-Krause Dynamics

Title: Properties of Quasi-synchronization Time of High-dimensional Hegselmann-Krause Dynamics Eigenschaften der Quasi-Synchronisierung Zeit der hochdimensionalen Hegselmann-Krause-Dynamik 高维 Hegselmann-Krause 动态的 准同步时间属性 2507.08900v1

Authors (4): Wei Su, Meiru Jiang, Yongguang Yu, Ge Chen

The behavior of one-dimensional Hegselmann-Krause (HK) dynamics driven by noise has been extensively studied. Previous research has indicated that, in one dimension, the HK dynamics attain quasi-synchronization (synchronization in the noisy case) in finite time regardless of whether the space is bounded or unbounded. However, it remains unclear whether this phenomenon holds in high-dimensional space. This paper investigates the random time for quasi-synchronization of the multi-dimensional HK model and reveals that the boundedness and dimension of the space determine different outcomes. Specifically, if the space is bounded, quasi-synchronization is attained almost surely in finite time for all dimensions, whereas in unbounded space it can be achieved only in low-dimensional cases (dimensions one and two). Furthermore, different integrability properties of the random time are proved for the various cases.

由噪音驱动的一维Hegselmann-Krause(HK)动态的行为已经进行了广泛的研究。以前的研究表明,无论一个层面的界限空间或无界限空间,香港动态都在有限的时间内实现准同步化(噪音情况下的同步化 ) , 然而,这一现象是否存在于高维空间尚不清楚。本文调查了多维香港模型准同步化的随机时间,并揭示了空间的界限和尺寸决定了不同的结果。 具体地说,如果空间是界限的,那么在有限的时间内,所有层面都几乎可以肯定实现准同步化,而在无界限的空间中,只有低维(一和二)案例才能实现准同步化。 此外,各种案例随机时间的不易性得到了证明。
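
As a rough illustration of the noisy HK dynamics this abstract refers to, the Python sketch below runs a d-dimensional bounded-confidence update with additive noise and reports the first step at which all opinions lie within a chosen diameter. The uniform noise model, the clipping to the unit cube as the "bounded space", and the diameter threshold used to declare quasi-synchronization are assumptions made for illustration, not the paper's definitions.

```python
import math
import random

def hk_step(x, radius, noise, rng, bounded=True):
    """One noisy HK update; x is a list of d-dimensional opinion tuples."""
    new_x = []
    for xi in x:
        # Neighbors within the confidence radius (the agent itself is included).
        nbrs = [xj for xj in x if math.dist(xi, xj) <= radius]
        mean = [sum(p[k] for p in nbrs) / len(nbrs) for k in range(len(xi))]
        # Assumed noise model: independent uniform noise per coordinate.
        moved = [m + rng.uniform(-noise, noise) for m in mean]
        if bounded:
            # Assumed bounded space: clip every coordinate to [0, 1].
            moved = [min(1.0, max(0.0, v)) for v in moved]
        new_x.append(tuple(moved))
    return new_x

def diameter(x):
    """Largest pairwise distance between opinions."""
    return max(math.dist(a, b) for a in x for b in x)

def quasi_sync_time(n=20, d=2, radius=0.3, noise=0.02,
                    threshold=0.3, max_steps=10_000, seed=0):
    """First step at which the opinion diameter is <= threshold, else None."""
    rng = random.Random(seed)
    x = [tuple(rng.random() for _ in range(d)) for _ in range(n)]
    for t in range(max_steps + 1):
        if diameter(x) <= threshold:
            return t
        x = hk_step(x, radius, noise, rng)
    return None
```

Calling `quasi_sync_time` with different `d` and with `bounded=False` wired through would be one way to explore the bounded versus unbounded contrast empirically; the sketch does not reproduce the paper's proofs.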


Article 69

Title@2025-07-11 (5): Exploring Design of Multi-Agent LLM Dialogues for Research Ideation

Title: Exploring Design of Multi-Agent LLM Dialogues for Research Ideation Erforschung der Gestaltung von LLM-Dialogen mit mehreren Agenten für die Forschungsideation 探索设计多种机构用LLM 研究主题对话 2507.08350v1

Authors (7): Keisuke Ueda, Wataru Hirota, Takuto Asakura, Takahiro Omi, Kosuke Takahashi, Kosuke Arima, Tatsuya Ishigaki

Large language models (LLMs) are increasingly used to support creative tasks such as research idea generation. While recent work has shown that structured dialogues between LLMs can improve the novelty and feasibility of generated ideas, the optimal design of such interactions remains unclear. In this study, we conduct a comprehensive analysis of multi-agent LLM dialogues for scientific ideation. We compare different configurations of agent roles, number of agents, and dialogue depth to understand how these factors influence the novelty and feasibility of generated ideas. Our experimental setup includes settings where one agent generates ideas and another critiques them, enabling iterative improvement. Our results show that enlarging the agent cohort, deepening the interaction depth, and broadening agent persona heterogeneity each enrich the diversity of generated ideas. Moreover, specifically increasing critic-side diversity within the ideation-critique-revision loop further boosts the feasibility of the final proposals. Our findings offer practical guidelines for building effective multi-agent LLM systems for scientific ideation. Our code is available at https://github.com/g6000/MultiAgent-Research-Ideator.

大型语言模型(LLMs)正越来越多地用于支持研究理念生成等创造性任务。最近的工作表明,LLMs之间的结构性对话可以改进新颖和所产生理念的可行性,但这种互动的最佳设计仍然不明确。在本研究中,我们对多种试剂LLM科学理念对话进行了全面分析;我们比较了代理角色的不同配置、代理人数量和对话深度,以了解这些因素如何影响所产生理念的新颖性和可行性。我们的实验设置包括一个代理商产生想法和另一个批评这些想法的环境,从而促成迭接改进。我们的结果显示,扩大代理商群,深化互动深度,扩大代理商的异质性,每个都丰富了所产生想法的多样性。此外,特别增加了思想-评论方的多样性,进一步增强了最后提案的可行性。我们的调查结果为建立有效的多试剂LM科学理念系统提供了实用的指导方针。我们的代码可在 https://github.com/g6000/MultiAgent-Research-Ideator 查阅。


Article 70

Title@2025-07-11 (5): Conversational Self-Play for Discovering and Understanding Psychotherapy Approaches

Title: Conversational Self-Play for Discovering and Understanding Psychotherapy Approaches Conversational Self-Play für das Entdecken und Verstehen von Psychotherapieansätzen 发现和理解心理疗法方法的相互交流的自我宣传 2503.16521v2

Authors (7): Onno P Kampman, Michael Xing, Charmaine Lim, Ahmad Ishqi Jabir, Ryan Louie, Jimmy Lee, Robert JT Morris

This paper explores conversational self-play with LLMs as a scalable approach for analyzing and exploring psychotherapy approaches, evaluating how well AI-generated therapeutic dialogues align with established modalities.

本文探讨与LLMs进行自言自语,作为分析和探讨心理治疗方法的一种可扩展的方法,评估AI产生的治疗对话与既定模式的配合程度。


Article 71

Title@2025-07-11 (5): CRMAgent: A Multi-Agent LLM System for E-Commerce CRM Message Template Generation

Title: CRMAgent: A Multi-Agent LLM System for E-Commerce CRM Message Template Generation CRMAgent: Ein Multi-Agent LLM-System für E-Commerce CRM-Meldungsvorlagen-Erstellung CRMM 信息模板生成多机构代理LLM系统 2507.08325v1

Authors (3): Yinzhu Quan, Xinrui Li, Ying Chen

In e-commerce private-domain channels such as instant messaging and e-mail, merchants engage customers directly as part of their Customer Relationship Management (CRM) programmes to drive retention and conversion. While a few top performers excel at crafting outbound messages, most merchants struggle to write persuasive copy because they lack both expertise and scalable tools. We introduce CRMAgent, a multi-agent system built on large language models (LLMs) that generates high-quality message templates and actionable writing guidance through three complementary modes. First, group-based learning enables the agent to learn from a merchant’s own top-performing messages within the same audience segment and rewrite low-performing ones. Second, retrieval-and-adaptation fetches templates that share the same audience segment and exhibit high similarity in voucher type and product category, learns their successful patterns, and adapts them to the current campaign. Third, a rule-based fallback provides a lightweight zero-shot rewrite when no suitable references are available. Extensive experiments show that CRMAgent consistently outperforms merchants’ original templates, delivering significant gains in both audience-match and marketing-effectiveness metrics.

在电子商业私人领域渠道,如即时短信和电子邮件,商人直接与客户接触,作为其客户关系管理(客户关系管理)方案的一部分,推动保留和转换。虽然少数最优秀的表演者在编造外向信息方面很出色,但大多数商人都因为缺乏专门知识和可扩缩的工具而难以写好有说服力的副本。我们引入了基于大型语言模型(LLLMs)的多试剂系统CROMAGenter,这是一个多试剂系统,它通过三种互补模式生成高质量的信息模板和可操作的写作指南。首先,基于集团的学习使代理商能够在同一受众部分中学习自己的最优秀信息,并改写低性能的版本。第二,检索和改写取取模板,这些模板共享相同的受众部分,在凭证类型和产品类别中表现出高度相似性,学习其成功模式,并适应当前的运动。第三,基于规则的后退提供轻量级零光的改写,因为没有合适的参考。广泛的实验显示CMAentent 持续地超越了商人的原始模板,在观众和销售效率衡量标准方面都取得了重大成果。


Article 72

Title@2025-07-11 (5): An Outlook on the Opportunities and Challenges of Multi-Agent AI Systems

Title: An Outlook on the Opportunities and Challenges of Multi-Agent AI Systems Ausblick auf die Chancen und Herausforderungen multiagenter KI-Systeme 关于多机构AI系统机会和挑战的展望 2505.18397v2

Authors (15): Fangqiao Tian, An Luo, Jin Du, Xun Xian, Robert Specht, Ganghua Wang, Xuan Bi, Jiawei Zhou, Ashish Kundu, Jayanth Srinivasa, Charles Fleming, Rui Zhang, Zirui Liu, Mingyi Hong, Jie Ding

A multi-agent AI system (MAS) is composed of multiple autonomous agents that interact, exchange information, and make decisions based on internal generative models. Recent advances in large language models and tool-using agents have made MAS increasingly practical in areas like scientific discovery and collaborative automation. However, key questions remain: When are MAS more effective than single-agent systems? What new safety risks arise from agent interactions? And how should we evaluate their reliability and structure? This paper outlines a formal framework for analyzing MAS, focusing on two core aspects: effectiveness and safety. We explore whether MAS truly improve robustness, adaptability, and performance, or merely repackage known techniques like ensemble learning. We also study how inter-agent dynamics may amplify or suppress system vulnerabilities. While MAS are relatively new to the signal processing community, we envision them as a powerful abstraction that extends classical tools like distributed estimation and sensor fusion to higher-level, policy-driven inference. Through experiments on data science automation, we highlight the potential of MAS to reshape how signal processing systems are designed and trusted.

多试剂的AI系统(MAS)由多个自主的代理人组成,这些代理人互动、交流信息和根据内部基因模型作出决定。大型语言模型和工具使用代理人的最近进展使MAS在科学发现和协作自动化等领域越来越实用。然而,关键问题仍然存在:MAS何时比单一试剂系统更有效?代理系统产生了哪些新的安全风险?我们应如何评价其可靠性和结构?本文件概述了一个分析MAS的正式框架,侧重于两个核心方面:有效性和安全。我们探讨MAS是否真正改进了坚固性、适应性和性,以及性能,或者仅仅是重新包装已知的技术,例如共同学习。我们还研究机构间动态如何扩大或抑制系统脆弱性。虽然MAS对信号处理界来说相对新,但我们认为它们是一种强大的抽象,将传统的工具,如分布估计和传感器的聚合扩展到更高层次的政策驱动推论。通过数据科学自动化实验,我们强调MAS有可能重新塑造信号处理系统是如何设计和信任的。


Article 73

Title@2025-07-10 (4): Multi-Actor Generative Artificial Intelligence as a Game Engine

Title: Multi-Actor Generative Artificial Intelligence as a Game Engine Multi-Actor Generative Künstliche Intelligenz als Game Engine 多驱动器生成人工智能作为游戏引擎 2507.08892v1

Authors (9): Alexander Sasha Vezhnevets, Jayd Matyas, Logan Cross, Davide Paglieri, Minsuk Chang, William A. Cunningham, Simon Osindero, William S. Isaac, Joel Z. Leibo

Generative AI can be used in multi-actor environments with purposes ranging from social science modeling to interactive narrative and AI evaluation. Supporting this diversity of use cases – which we classify as Simulationist, Dramatist, and Evaluationist – demands a flexible scenario definition framework. We argue here that a good approach is to take inspiration from tabletop role-playing games (TTRPGs), where a Game Master (GM) is responsible for the environment and generates all parts of the story not directly determined by the voluntary actions of player characters. We argue that the Entity-Component architectural pattern is useful here. In such a system, the GM is not a hardcoded computer game but is itself a configurable entity, composed of components just like any other actor. By design, the approach allows for a separation between the underlying implementation details handled by an engineer, the creation of reusable components, and their composition and configuration managed by a designer who constructs entities from the components. This separation of concerns is instrumental for achieving rapid iteration, maintaining modularity, and ultimately to ensure scalability. We describe the ongoing evolution of the Concordia library in terms of this philosophy, demonstrating how it allows users to effectively configure scenarios that align with their specific goals.

从社会科学建模到互动叙事和AI评估等多种用途环境都可以使用生成的AI。支持这种多样化的使用案例 – – 我们将这些案例归类为模拟、戏剧学和评价学 – – 需要灵活的情景定义框架。我们在此提出,一个好的做法是从桌面角色扮演游戏(TTRPGs)中汲取灵感,游戏大师(GM)负责环境,生成故事中并非由玩家角色自愿行动直接决定的所有部分。我们争辩说,实体-综合建筑模式在这里是有用的。在这样一个系统中,GM不是一个硬编码的计算机游戏,而本身就是一个可配置的实体,与任何其他行为者一样,由各个组成部分组成。通过设计,该方法可以将工程师处理的基本执行细节、可再使用组件的创建及其组成和配置分开,而设计师则从各个组成部分中构建实体。这种关注的分离对于迅速复制、保持模块性并最终确保可缩放性都至关重要。我们描述了Concrica图书馆的演变过程,其用户能够有效地调整其特定哲学目标。
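
The Entity-Component idea described in this abstract can be sketched in a few lines of Python. This is a hypothetical illustration only: the `Entity` class, the component functions, and their names are invented here and are not the Concordia library's API. The point it shows is that the Game Master is built from components in exactly the same way as a player.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A component is modeled here as a function from an observation to a text fragment.
Component = Callable[[str], str]

@dataclass
class Entity:
    """Hypothetical entity: a name plus an ordered set of components."""
    name: str
    components: Dict[str, Component] = field(default_factory=dict)

    def act(self, observation: str) -> str:
        # Concatenate each component's contribution in insertion order.
        parts = [fn(observation) for fn in self.components.values()]
        return f"{self.name}: " + " ".join(parts)

def memory(observation: str) -> str:
    return f"[remembers '{observation}']"

def intend(observation: str) -> str:
    return f"[decides how to respond to '{observation}']"

def narrate(observation: str) -> str:
    return f"[narrates outcome of '{observation}']"

# A player and the Game Master are both plain entities; only their components differ.
player = Entity("Alice", {"memory": memory, "intent": intend})
game_master = Entity("GM", {"memory": memory, "narration": narrate})
```

In this arrangement an engineer would write components such as `memory`, while a designer would only choose which components each entity receives, which mirrors the separation of concerns the abstract describes.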


Article 74

Title@2025-07-10 (4): Noise-Enabled Goal Attainment in Crowded Collectives

Title: Noise-Enabled Goal Attainment in Crowded Collectives Lärmfähiges Ziel-Attainment in Crowded Collectives 聚众集体实现无声目标 2507.08100v1

Authors (4): Lucy Liu, Justin Werfel, Federico Toschi, L. Mahadevan

In crowded environments, individuals must navigate around other occupants to reach their destinations. Understanding and controlling traffic flows in these spaces is relevant to coordinating robot swarms and designing infrastructure for dense populations. Here, we combine simulations, theory, and robotic experiments to study how noisy motion can disrupt traffic jams and enable flow as agents travel to individual goals. Above a critical noise level, large jams do not persist. From this observation, we analytically approximate the goal attainment rate as a function of the noise level, then solve for the optimal agent density and noise level that maximize the swarm’s goal attainment rate. We perform robotic experiments to corroborate our simulated and theoretical results. Finally, we compare simple, local navigation approaches with a sophisticated but computationally costly central planner. A simple reactive scheme performs well up to moderate densities and is far more computationally efficient than a planner, suggesting lessons for real-world problems.

在拥挤的环境下,个人必须绕着其他居住者绕行,才能到达目的地。理解和控制这些空间的交通流量对于协调机器人群和设计密集人口基础设施是相关的。在这里,我们结合模拟、理论和机器人实验,研究噪音运动如何能扰乱交通阻塞,并能够作为代理人前往个别目标。在临界的噪音水平之外,大型干扰不会持续。从这个观察中,我们从分析角度将目标实现率与噪音水平的函数相近,然后解决最佳剂密度和噪音水平的问题,以最大限度地提高虫群的目标实现率。我们进行了机器人实验,以证实我们的模拟和理论结果。最后,我们比较了简单的地方导航方法与精密但计算成本昂贵的中心规划器。一个简单的反应计划运行到适中密度,在计算上比规划器效率高得多,为真实世界问题提供教训。


Article 75

Title@2025-07-10 (4): Agent-based visualization of streaming text

Title: Agent-based visualization of streaming text Agentenbasierte Visualisierung von Streaming-Texten 以代理为基础的流流文本可视化 2507.08884v1

Authors (4): Jordan Riley Benson, David Crist, Phil Lafleur, Benjamin Watson

We present a visualization infrastructure that maps data elements to agents, which have behaviors parameterized by those elements. Dynamic visualizations emerge as the agents change position, alter appearance and respond to one another. Agents move to minimize the difference between displayed agent-to-agent distances and an input matrix of ideal distances. Our current application is visualization of streaming text. Each agent represents a significant word, visualizing it by displaying the word itself, centered in a circle sized by the frequency of word occurrence. We derive the ideal distance matrix from word co-occurrence, mapping higher co-occurrence to lower distance. To depict co-occurrence in its textual context, the ratio of intersection to circle area approximates the ratio of word co-occurrence to frequency. A networked backend process gathers articles from news feeds, blogs, Digg or Twitter, exploiting online search APIs to focus on user-chosen topics. Resulting visuals reveal the primary topics in text streams as clusters, with agent-based layout moving without instability as data streams change dynamically.

我们展示了一个可视化的基础设施, 将数据元素映射给代理商, 这些代理商的行为以这些元素为参数参数。 动态可视化随着代理商的改变而出现, 改变外观并互相反应; 代理商移动以尽量减少显示的代理商到代理商距离之间的差别, 以及理想距离的输入矩阵。 我们目前的应用程序是流文本的可视化。 每个代理商代表一个很重要的单词, 通过显示单词本身, 以单词发生频率的圆圈为中心。 我们从单词发生的圆圈中获取理想的距离矩阵, 绘制更高的共同发生到较低距离的图示。 要在文本上描述共发生的情况, 交叉到圆圈区域的比例大约是单词对频率的比。 一个网络后端进程从新闻源、 博客、 Digg 或 Twitter 中收集文章, 利用在线搜索 API 来关注用户- 的话题。 结果视觉显示文本流中的主要话题, 以代理商为基础的布局不会随着数据流的动态变化而变化。
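
The layout rule in this abstract, where agents move to reduce the gap between displayed and ideal distances, can be sketched as a simple gradient-style update in Python. The step size, the linear mapping from co-occurrence to ideal distance, and the function names are assumptions for illustration; the abstract does not specify the authors' actual update rule.

```python
import math

def layout_step(positions, ideal, rate=0.1):
    """Move every agent a small step toward matching its ideal distances."""
    new_positions = []
    for i, (xi, yi) in enumerate(positions):
        dx_total, dy_total = 0.0, 0.0
        for j, (xj, yj) in enumerate(positions):
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            # Positive error: the pair is too far apart, so move toward agent j.
            error = dist - ideal[i][j]
            dx_total += error * dx / dist
            dy_total += error * dy / dist
        new_positions.append((xi + rate * dx_total, yi + rate * dy_total))
    return new_positions

def stress(positions, ideal):
    """Sum of squared differences between displayed and ideal distances."""
    total = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            total += (math.dist(positions[i], positions[j]) - ideal[i][j]) ** 2
    return total

def ideal_from_cooccurrence(cooc, max_distance=1.0):
    """Assumed mapping: higher co-occurrence gives a lower ideal distance."""
    peak = max(max(row) for row in cooc) or 1
    return [[max_distance * (1 - c / peak) for c in row] for row in cooc]
```

Because each call to `layout_step` makes only a small move, the layout can keep running while the ideal matrix is refreshed from new text, which is one plausible way to get smooth motion as the stream changes.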


Article 76

Title@2025-07-10 (4): MAEBE: Multi-Agent Emergent Behavior Framework

Title: MAEBE: Multi-Agent Emergent Behavior Framework MAEBE: Multi-Agent Emergent Behavior Framework 多边代理新兴行为框架 2506.03053v2

Authors (4): Sinem Erisken, Timothy Gothard, Martin Leitgab, Ram Potham

Traditional AI safety evaluations on isolated LLMs are insufficient as multi-agent AI ensembles become prevalent, introducing novel emergent risks. This paper introduces the Multi-Agent Emergent Behavior Evaluation (MAEBE) framework to systematically assess such risks. Using MAEBE with the Greatest Good Benchmark (and a novel double-inversion question technique), we demonstrate that: (1) LLM moral preferences, particularly for Instrumental Harm, are surprisingly brittle and shift significantly with question framing, both in single agents and ensembles. (2) The moral reasoning of LLM ensembles is not directly predictable from isolated agent behavior due to emergent group dynamics. (3) Specifically, ensembles exhibit phenomena like peer pressure influencing convergence, even when guided by a supervisor, highlighting distinct safety and alignment challenges. Our findings underscore the necessity of evaluating AI systems in their interactive, multi-agent contexts.

对孤立的LLMs进行传统的AI安全评价是不够的,因为多试剂AI联合体变得很普遍,带来新的新风险。本文件介绍了多代理新兴行为评价(MAEBE)框架,以系统评估此类风险。我们利用MAEBE和最伟大的良好基准(以及一种新型的双重反向问题技术)来证明:(1)LLM道德偏好,特别是工具伤害的道德偏好,在单一代理体和组合体中都令人惊讶地变得脆弱,随着问题设置而发生重大变化。 (2)LLM联合体的道德推理不能直接从新出现的集团动态造成的孤立的代理体行为中预见出来。(3)具体地说,集合体展示了影响趋同的同行压力等现象,即使由上司指导,也突出了不同的安全和一致性挑战。我们的调查结果强调,有必要在其互动的多剂环境中评价AI系统。


Article 77

Title@2025-07-10 (4): MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework

Title: MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework MF-LLM: Simulation von Populationsentscheidungsdynamiken über ein mittleres Feld Large Language Model Framework MF-LLM:通过一个中外地大语言示范框架模拟人口决策动态 2504.21582v3

Authors (9): Qirui Mi, Mengyue Yang, Xiangning Yu, Zhiyu Zhao, Cheng Deng, Bo An, Haifeng Zhang, Xu Chen, Jun Wang

Simulating collective decision-making involves more than aggregating individual behaviors; it emerges from dynamic interactions among individuals. While large language models (LLMs) offer strong potential for social simulation, achieving quantitative alignment with real-world data remains a key challenge. To bridge this gap, we propose the Mean-Field LLM (MF-LLM) framework, the first to incorporate mean field theory into LLM-based social simulation. MF-LLM models bidirectional interactions between individuals and the population through an iterative process, generating population signals to guide individual decisions, which in turn update the signals. This interplay produces coherent trajectories of collective behavior. To improve alignment with real-world data, we introduce IB-Tune, a novel fine-tuning method inspired by the Information Bottleneck principle, which retains population signals most predictive of future actions while filtering redundant history. Evaluated on a real-world social dataset, MF-LLM reduces KL divergence to human population distributions by 47% compared to non-mean-field baselines, enabling accurate trend forecasting and effective intervention planning. Generalizing across 7 domains and 4 LLM backbones, MF-LLM provides a scalable, high-fidelity foundation for social simulation.

模拟集体决策涉及的不仅仅是综合个人行为;它产生于个人之间的动态互动;虽然大型语言模型(LLMs)为社会模拟提供了巨大的潜力,但与现实世界数据实现数量一致仍然是一个关键挑战。为了缩小这一差距,我们提议了中-实地LLM(MF-LLM)框架,这是第一个将中-实地理论纳入基于LLM的社会模拟中的第一个框架。MF-LLM模型通过一个迭接过程将个人和人口之间的双向互动纳入到LM社会模拟中,生成人口信号以指导个人决策,而后者又更新信号。这种相互作用产生了一致的集体行为轨迹。为了改善与现实世界数据的一致性,我们引入了IB-Tune,这是受信息博特内克原则启发的一种创新的微调方法,它保留了人口信号,在过滤多余历史的同时最能预测未来行动。MF-LM(LM)将KLM减少KLL与人口分布的差别,比非中等水平基线减少47,从而能够准确的趋势预测和有效干预规划。


Article 78

Title@2025-07-10 (4): Conjugated Capabilities: Interrelations of Elementary Human Capabilities and Their Implication on Human-Machine Task Allocation and Capability Testing Procedures

Title: Conjugated Capabilities: Interrelations of Elementary Human Capabilities and Their Implication on Human-Machine Task Allocation and Capability Testing Procedures Konjugierte Fähigkeiten: Zusammenhänge von elementaren menschlichen Fähigkeiten und deren Implikationen auf Mensch-Maschine-Aufgaben-Zuteilungs- und Fähigkeitsprüfungsverfahren 相容能力:人类基本能力之间的相互关系及其对人类-海洋任务分配和能力测试程序的影响 2507.07560v1

Authors (5): Nils Mandischer, Larissa Füller, Torsten Alles, Frank Flemisch, Lars Mikelsons

Human and automation capabilities are the foundation of every human-autonomy interaction and interaction pattern. Therefore, machines need to understand human capacity and performance and adapt their own behavior accordingly. In this work, we address the concept of conjugated capabilities, i.e. capabilities that are dependent or interrelated and between which effort can be distributed. These may be used to overcome human limitations by shifting effort from a deficient capability to a conjugated capability with performative resources. For example, a limited arm's reach may be compensated for by tilting the torso forward. We analyze the interrelation between elementary capabilities within the IMBA standard to uncover potential conjugation, and show evidence in data of post-rehabilitation patients. From the conjugated capabilities, within the example application of stationary manufacturing, we create a network of interrelations. This graph enables a manifold of potential uses. We showcase the graph's usage in optimizing IMBA test design to accelerate data recordings, and discuss implications of conjugated capabilities for task allocation between the human and an autonomy.

人的能力和自动化能力是每一种人类自主互动和互动模式的基础。 因此, 机器需要了解人类行为的能力和表现, 并相应调整自己的行为。 在这项工作中, 我们处理共生能力的概念, 即依赖或相互联系的能力, 以及可以分散努力的能力。 这些能力可以用来克服人类的局限性, 将努力从不足的能力转向与性能资源相融合的能力。 例如: 有限的手臂的覆盖范围可以通过向前倾斜来弥补。 我们分析IMBA标准范围内的基本能力之间的相互关系, 以发现潜在的共生, 并在康复后病人的数据中显示证据。 从共生能力, 在固定的制造业应用中, 我们创建了一个相互联系的网络。 通过这个图表, 我们启用了多种潜在用途。 我们展示了图表在优化IMBA测试设计以加速数据记录, 并讨论共生能力对人与自主任务分配的影响。


Article 79

Title@2025-07-10 (4): Toward Real-World Chinese Psychological Support Dialogues: CPsDD Dataset and a Co-Evolving Multi-Agent System

Title: Toward Real-World Chinese Psychological Support Dialogues: CPsDD Dataset and a Co-Evolving Multi-Agent System Auf dem Weg zu echten chinesischen Psychologischen Unterstützungsdialogen: CPsDD-Datensatz und ein gemeinsames Multi-Agenten-System 走向现实世界的中国心理支持对话:CPsDD数据集和共同演进的多行为者系统 2507.07509v1

Authors (3): Yuanchen Shi, Longyin Zhang, Fang Kong

The growing need for psychological support due to increasing pressures has exposed the scarcity of relevant datasets, particularly in non-English languages. To address this, we propose a framework that leverages limited real-world data and expert knowledge to fine-tune two large language models: Dialog Generator and Dialog Modifier. The Generator creates large-scale psychological counseling dialogues based on predefined paths, which guide system response strategies and user interactions, forming the basis for effective support. The Modifier refines these dialogues to align with real-world data quality. Through both automated and manual review, we construct the Chinese Psychological support Dialogue Dataset (CPsDD), containing 68K dialogues across 13 groups, 16 psychological problems, 13 causes, and 12 support focuses. Additionally, we introduce the Comprehensive Agent Dialogue Support System (CADSS), where a Profiler analyzes user characteristics, a Summarizer condenses dialogue history, a Planner selects strategies, and a Supporter generates empathetic responses. The experimental results of the Strategy Prediction and Emotional Support Conversation (ESC) tasks demonstrate that CADSS achieves state-of-the-art performance on both CPsDD and ESConv datasets.

由于压力增加,对心理支持的需求日益增长,这暴露了相关数据集的稀缺,特别是非英语的数据集。为此,我们提议了一个框架,利用有限的真实世界数据和专家知识对两种大语言模型进行微调:对话框生成器和对话框修饰器。发电机创造了基于预设路径的大规模心理咨询对话,指导系统反应战略和用户互动,为有效支持奠定基础。修饰器将这些对话改进为与现实世界数据质量相一致。通过自动化和人工审查,我们构建了中国心理支持对话数据集,其中包括13个群体之间的68K对话、16个心理问题、13个原因和12个支持重点。此外,我们引入了全面代理对话支持系统(CADSS),其中剖析器分析用户特性,一个解析器压缩对话历史,一个规划器选择战略,以及一个支持器生成了同情性反应。战略预测和情感支持对话(ESC)任务的实验结果显示,中国心理支持系统在CPDD和CURS上都实现了状态。


Article 80

Title@2025-07-10 (4): KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows

Title: KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows KVFlow: Effizientes Präfix-Caching zur Beschleunigung von LLM-basierten Multiagenten-Workflows KVFlow: 为加速基于LLM的多重需要工作流程而高效预置缓存 2507.07400v1

Authors (9): Zaifeng Pan, Ajjkumar Patel, Zhengding Hu, Yipeng Shen, Yue Guan, Wan-Lu Li, Lianhui Qin, Yida Wang, Yufei Ding

Large language model (LLM) based agentic workflows have become a popular paradigm for coordinating multiple specialized agents to solve complex tasks. To improve serving efficiency, existing LLM systems employ prefix caching to reuse key-value (KV) tensors corresponding to agents' fixed prompts, thereby avoiding redundant computation across repeated invocations. However, current systems typically evict KV caches using a Least Recently Used (LRU) policy, which fails to anticipate future agent usage and often discards KV caches shortly before their reuse. This leads to frequent cache misses and substantial recomputation or swapping overhead. We present KVFlow, a workflow-aware KV cache management framework tailored for agentic workloads. KVFlow abstracts the agent execution schedule as an Agent Step Graph and assigns each agent a steps-to-execution value that estimates its temporal proximity to future activation. These values guide a fine-grained eviction policy at the KV node level, allowing KVFlow to preserve entries likely to be reused and efficiently manage shared prefixes in tree-structured caches. Moreover, KVFlow introduces a fully overlapped KV prefetching mechanism, which proactively loads required tensors from CPU to GPU in background threads for agents scheduled in the next step, thereby avoiding cache miss stalls during generation. Compared to SGLang with hierarchical radix cache, KVFlow achieves up to 1.83× speedup for single workflows with large prompts, and up to 2.19× speedup for scenarios with many concurrent workflows.

大型语言模型( LLM) 以大型语言模式为基础的代理工作流程已成为协调多个专门代理商解决复杂任务的流行范例。 为了提高效率, 现有的 LLM 系统使用前缀缓存, 重新使用与代理商固定提示相对的键值( KV) , 从而避免重复计算。 然而, 当前系统通常使用最不常用的( LRU) 政策驱逐 KV 缓存, 这无法预测未来代理商的使用情况, 并经常在重新使用之前不久丢弃 KV 缓存 。 这导致频繁的缓存丢失和大量重置或转换管理管理管理管理。 我们展示了 KVFlow, 一个为代理工作量量定制的工作流程- World KVVV 缓存管理框架。 KVFlow 将代理商执行时间表作为代理Step 图表, 并给每个代理商分配一个步骤到执行值, 估计其与未来激活时间的距离。 这些值指导了 KVPO 节点的细化驱逐政策, 允许 KVFlow 保存可能被再利用的单流流流流和高效共享的预置速度, 。 KVlalal-lickraterateal 时间里, 时间里, 需要完全地在 SG 。
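
The contrast between LRU and a workflow-aware eviction policy can be illustrated with a small Python sketch. It assumes the workflow is a plain directed graph and that steps-to-execution is the breadth-first distance from the currently running agents; the abstract does not give KVFlow's exact computation, so the functions and the example workflow below are hypothetical.

```python
from collections import deque

def steps_to_execution(graph, running):
    """BFS distance (in workflow steps) from the running agents to every agent.

    graph maps each agent to the agents that may run directly after it.
    Agents that cannot be reached get infinity.
    """
    steps = {agent: float("inf") for agent in graph}
    queue = deque()
    for agent in running:
        steps[agent] = 0
        queue.append(agent)
    while queue:
        current = queue.popleft()
        for nxt in graph[current]:
            if steps[nxt] == float("inf"):
                steps[nxt] = steps[current] + 1
                queue.append(nxt)
    return steps

def choose_victim(cached, steps):
    """Evict the cached agent prefix whose next use is furthest away."""
    return max(cached, key=lambda agent: steps[agent])

# Hypothetical four-agent pipeline used only for this example.
workflow = {
    "planner": ["coder"],
    "coder": ["tester"],
    "tester": ["reviewer"],
    "reviewer": [],
}
steps = steps_to_execution(workflow, running=["planner"])
victim = choose_victim(["coder", "tester", "reviewer"], steps)
```

An LRU policy might instead evict `coder` if it happened to be the least recently touched entry, even though it runs next; ranking by distance in the step graph avoids that case in this toy setting.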


Article 81

Title@2025-07-10 (4): Multi-Agent Pathfinding Under Team-Connected Communication Constraint via Adaptive Path Expansion and Dynamic Leading

Title: Multi-Agent Pathfinding Under Team-Connected Communication Constraint via Adaptive Path Expansion and Dynamic Leading Multi-Agent Pathfinding unter Team-Connected Communication Constraint über Adaptive Path Expansion und Dynamic Leading 通过适应性路径扩展和动态领导,在联成一体的通信制约下,开展多机构多方机构路透调查 2501.02770v4

Authors (3): Hoang-Dung Bui, Erion Plaku, Gregory J. Stein

This paper proposes a novel planning framework to handle a multi-agent pathfinding problem under a team-connected communication constraint, where all agents must have a connected communication channel to the rest of the team during their entire movements. Standard multi-agent pathfinding approaches (e.g., priority-based search) have potential in this domain but fail when neighboring configurations at start and goal differ. Their single-expansion approach – computing each agent's path from the start to the goal in just a single expansion – cannot reliably handle planning under communication constraints for agents as their neighbors change during navigation. Similarly, leader-follower approaches (e.g., platooning) are effective at maintaining team communication, but fixing the leader at the outset of planning can cause planning to become stuck in dense-clutter environments, limiting their practical utility. To overcome this limitation, we propose a novel two-level multi-agent pathfinding framework that integrates two techniques: adaptive path expansion, which expands agent paths to their goals in multiple stages; and a dynamic leading technique, which enables the reselection of the leading agent during each agent path expansion whenever progress cannot be made. Simulation experiments show the efficiency of our planners, which can handle up to 25 agents across five environment types under a limited communication range constraint and up to 11-12 agents on three environment types under a line-of-sight communication constraint, with success rates exceeding 90% where baselines routinely fail.

本文提出一个新的规划框架,以处理在团队连接的通信限制下多试剂路径调查问题,所有代理人员必须在整个移动期间拥有与团队其余部分的连接通信渠道。标准多试剂查找方法(例如基于优先的搜索)在这方面具有潜力,但当周边配置在开始时和目标不同时却失败。他们的一个扩展方法 – – 计算每个代理人员从一开始到仅仅以单一扩展为目标的路径 – – 无法可靠地处理在通信限制下对代理人员进行规划,因为其邻国在导航过程中发生变化。同样,领导追随者方法(例如排)在维持团队通信方面十分有效,但在规划开始时确定领导者可以导致规划被困在拥挤的环境中,限制其实际效用。为了克服这一限制,我们提议一个新的两级多试探框架,将两种技术结合起来:适应性路径扩展,将代理人员路径扩大到多个阶段的目标;动态领先技术,使领导代理人员在每一个代理人员扩展过程中,只要无法取得进展,就能在每一个代理人员扩展过程中重新进行选择。模拟实验,但在规划开始时确定领导者的效率时,在密集的第三种通信限制范围下,在超过第四种环境中可以进行限制范围为第三种环境之下,在第四种情况下进行。
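
The team-connected constraint in this abstract can be checked with a short Python sketch. It assumes the limited-range variant, where two agents can communicate when their Euclidean distance is within a fixed range, and time-indexed paths of equal length; the function names are hypothetical and the planner itself (adaptive path expansion, dynamic leading) is not reproduced here.

```python
import math
from collections import deque

def team_connected(positions, comm_range):
    """True if the range-limited communication graph over all agents is connected."""
    if not positions:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(len(positions)):
            # Agents i and j are linked when they are within communication range.
            if j not in seen and math.dist(positions[i], positions[j]) <= comm_range:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(positions)

def paths_keep_team_connected(paths, comm_range):
    """Check the constraint at every timestep of equal-length, time-indexed paths."""
    return all(team_connected(list(step), comm_range) for step in zip(*paths))
```

Note that connectivity is multi-hop: in a chain of three agents the two end agents need not be in direct range, as long as the middle agent relays between them.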