cs.LG @ 2025-06-01: 1952

05-29 (4)

From Chat Logs to Collective Insights: Aggregative Question Answering

Von Chat Logs zu Collective Insights: Aggregative Question Answering

从聊天日志到集体透视:聚合问题解答

2505.23765v1

05-29

Differential Information: An Information-Theoretic Perspective on Preference Optimization

Differentialinformation: Eine informationstheoretische Perspektive zur Preference-Optimierung

差别信息:关于首选优化的信息理论观点

2505.23761v1

05-29

Model Immunization from a Condition Number Perspective

Modell Immunisierung aus einem Zustand Anzahl Perspektive

从条件数字角度进行示范免疫

2505.23760v1

05-29

Puzzled by Puzzles: When Vision-Language Models Can’t Take a Hint

Verwirrt von Puzzles: Wenn Vision-Language-Modelle keinen Hinweis aufnehmen können

由谜题拼取的谜题: 当视觉语言模型无法使用提示时

2505.23759v1

05-29

REOrdering Patches Improves Vision Models

REOrdering Patches verbessert Vision Modelle

重新排列补丁改进愿景模式

2505.23751v1

05-29

Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences?

Verzerrung der AI Alignment: Optimiert Preference Optimization für Preferences?

AI对齐的扭曲:偏好优化是否优化优惠?

2505.23749v1

05-29

Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Raum-MLLM: Steigerung der MLLM-Kapazitäten in visueller räumlicher Intelligenz

空间-MLLM:增强以视觉为基础的空间情报中的MLLM能力

2505.23747v1

05-29

To Trust Or Not To Trust Your Vision-Language Model’s Prediction

Vertrauen oder nicht Vertrauen in die Vorhersage Ihres Vision-Sprache-Modells

相信或不相信你的视觉语言模型的预测

2505.23745v1

05-29

On the Convergence Analysis of Muon

Zur Konvergenzanalyse von Muon

Muon的趋同分析

2505.23737v1

05-29

EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast

EmotionRankCLAP: Bridging Natural Language Speaking Styles und Ordinal Speech Emotion via Rank-N-Contrast

情感-RankCLAP:通过Rank-N-Contrast将自然语言的口语风格和普通语言的情感联系起来

2505.23732v1

05-29

Keep Everyone Happy: Online Fair Division of Numerous Items with Few Copies

Halten Sie alle glücklich: Online Fair Division von zahlreichen Artikeln mit wenigen Kopien

让人人快乐:许多物品的在线公平分会,只有很少的影印件。

2408.12845v2

05-29

MuLoCo: Muon is a practical inner optimizer for DiLoCo

MuLoCo: Muon ist ein praktischer Innenoptimierer für DiLoCo

MuLoCo: Muon 是 DiLoCo 的实用内部优化器

2505.23725v1

05-29

SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA

SC-LoRA: Ausbalancieren effizienter Feinsteuerung und Wissenserhaltung über Subraum-kontrainierte LoRA

SC-LoRA:通过分空间训练LoRA平衡高效微调和知识保护

2505.23724v1

05-29

ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering

ML-Agent: Verstärkung von LLM-Agenten für autonomes Machine-Learning-Engineering

ML-代理:加强自动机械学习工程的LLM代理

2505.23723v1

05-29

Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields

Verteilungsverschiebungen für maschinelle Lernkräfte verstehen und abmildern

机器学习领域理解和缩小分布变化

2503.08674v2

05-29

DiffER: Categorical Diffusion for Chemical Retrosynthesis

DiffER: Kategorische Diffusion für chemische Retrosynthese

DiffER: 化学复制合成的分类扩散

2505.23721v1

05-29

COBRA: Contextual Bandit Algorithm for Ensuring Truthful Strategic Agents

COBRA: Kontextueller Bandit-Algorithmus für die Sicherung wahrheitsgetreuer strategischer Agenten

COBRA: 确保真实战略媒介的背景土匪比重

2505.23720v1

05-29

FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control

FastTD3: Einfaches, schnelles und fähiges Verstärkungslernen für die humanoide Kontrolle

快速TD3: 人类控制简单、快速和有能力的强化学习

2505.22642v2

05-29

TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning

TiRex: Nullschnelle Vorhersagen über lange und kurze Horizonte mit verbessertem In-Context-Lernen

TiRex: 利用强化的内文学习,对长地和短地平线进行零热预测

2505.23719v1

05-29

Foundation Model Hidden Representations for Heart Rate Estimation from Auscultation

Fundamentalmodell versteckte Darstellungen für die Herzfrequenzschätzung aus der Auskultation

基金会 “ 基金会 “ 用于从修术中心速估计的模型隐藏模型代表

2505.20745v2

05-29

Skin Lesion Phenotyping via Nested Multi-modal Contrastive Learning

Haut-Lesion-Phenotypisierung über verschachteltes multimodales kontrastives Lernen

通过Nested多模式反竞争学习进行皮肤脱 Le基因分析

2505.23709v1

05-29

Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better

Wissensisolierende Vision-Sprache-Action-Modelle: Schnell trainieren, schnell laufen, besser generalisieren

知识绝知识的愿景-语言-行动模式:快速列车、快速跑车、更普遍化

2505.23705v1

05-29

(U)NFV: Supervised and Unsupervised Neural Finite Volume Methods for Solving Hyperbolic PDEs

(U)NFV: Überwachte und unüberwachte neurale Finite-Volume-Methoden zur Lösung hyperbolischer PDEs

(U) NFV: 被监督和不受监督的解决双曲 PDE 的神经有限量方法

2505.23702v1

05-29

DiCoFlex: Model-agnostic diverse counterfactuals with flexible control

DiCoFlex: Modell-agnostische diverse Gegenfakten mit flexibler Steuerung

DiCoFlex:具有灵活控制的模型 – – 不可知性多元反事实

2505.23700v1

05-29

Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms

Computational Algebra mit Achtung: Transformer Oracles für Border Basis Algorithmen

注意的计算代数:边境基准比值的变异甲骨文

2505.23696v1

05-29

On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures

Über die Ausbildungskonvergenz von Transformern für die In-Context-Klassifizierung von Gauß-Mischungen

Gaussian混合物内集成分类变异器培训趋同

2410.11778v3

05-29

From Individual Experience to Collective Evidence: A Reporting-Based Framework for Identifying Systemic Harms

Von der individuellen Erfahrung zu kollektiven Beweisen: Ein meldepflichtiger Rahmen für die Identifizierung systemischer Schäden

从个人经验到集体证据:查明系统危害的报告框架

2502.08166v2

05-29

Mobi-$π$: Mobilizing Your Robot Learning Policy

Mobi-$π$: Mobilisierung Ihrer Roboter-Lernpolitik

Mobi-$π$:调动机器人学习政策

2505.23692v1

05-29

Unifying Perspectives: Plausible Counterfactual Explanations on Global, Group-wise, and Local Levels

Vereinheitlichende Perspektiven: Plausible gegenfaktische Erklärungen auf globaler, gruppenweiser und lokaler Ebene

统一观点:关于全球、集团和当地雇员的可视反事实解释

2405.17642v2

05-29

Learning Compositional Functions with Transformers from Easy-to-Hard Data

Komponative Funktionen mit Transformern von einfachen Daten lernen

学习从易读数据转换器的学习构成函数

2505.23683v1

05-29

Understanding Mode Connectivity via Parameter Space Symmetry

Mode-Konnektivität über Parameter Raumsymmetrie verstehen

通过参数空间对称法理解模式连通性

2505.23681v1

05-29

SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem

SVRPBench: Ein realistischer Maßstab für stochastisches Fahrzeugrouting-Problem

SVRPBench: 蒸汽车辆流出问题的现实基准

2505.21887v2

05-29

Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds

Bayesische Optimierung durch menschliches Feedback: Nah-optimale Reue-Bounds

Bayesian 人体反馈的优化:接近最佳的冷却环

2505.23673v1

05-29

GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents

GSO: Herausfordernde Software-Optimierungsaufgaben zur Bewertung von SWE-Agenten

GSO:评估SWE-Agentics的有挑战的软件优化任务

2505.23671v1

05-29

Maximizing Confidence Alone Improves Reasoning

Maximierung des Vertrauens allein verbessert die Vernunft

使信心最大化单独提高合理性

2505.22660v2

05-29

SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression

SLiM: Ein-Schuss-Quantisierung und Sparsamkeit mit Low-Rank-Annäherung für LLM-Gewichtskompression

SLiM: LLM 重量压缩的单射量和与低级别近似相近的分数

2410.09615v3

05-29

LoLA: Low-Rank Linear Attention With Sparse Caching

LoLA: Low-Rank Lineare Aufmerksamkeit mit Sparse Caching

LoLA: 低兰克线性注意, 以粗糙的缓存

2505.23666v1

05-29

AMBER: Adaptive Mesh Generation by Iterative Mesh Resolution Prediction

AMBER: Adaptive Mesh-Generierung durch iterative Mesh-Auflösungsvorhersage

以迭代网目分辨率预测的适应性代谢代谢

2505.23663v1

05-29

Bayesian Perspective on Memorization and Reconstruction

Bayesische Perspektive auf Erinnerung und Wiederaufbau

Bayes人对记忆和重建的看法

2505.23658v1

05-29

Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation

Aktives Layer-Kontrastives Decodieren reduziert Halluzination bei der Generierung von Großsprachenmodellen

大型语言模式生成中活性多语言解层解码减少幻觉

2505.23657v1

05-29

How does Transformer Learn Implicit Reasoning?

Wie lernt Transformer Implizite Vernunft?

变形者如何学习隐含理由?

2505.23653v1

05-29

Optimization-Free Diffusion Model – A Perturbation Theory Approach

Optimierungsfreies Diffusionsmodell – Ein Perturbationstheorie-Ansatz

优化-无优化传播模式 – – 扰动理论方法

2505.23652v1

05-29

Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation

Merge-Friendly Post-Training Quantization für Multi-Target Domain-Anpassung

多目标域适应培训后量化

2505.23651v1

05-29

Optimal Bounds for Adversarial Constrained Online Convex Optimization

Optimale Schranken für adversarielle beschränkte Online-Konvexoptimierung

优化在线电传优化优化

2503.13366v4

05-29

Continuous Chain of Thought Enables Parallel Exploration and Reasoning

Kontinuierliche Gedankenkette ermöglicht parallele Erkundung und Vernunft

连续思考链有助于平行探索和推理

2505.23648v1

05-29

Are Reasoning Models More Prone to Hallucination?

Sind vernünftigere Modelle eher halluzinierend?

理性模型更能让人产生幻觉吗?

2505.23646v1

05-29

Towards Unified Attribution in Explainable AI, Data-Centric AI, and Mechanistic Interpretability

Auf dem Weg zu einer einheitlichen Attribution in erklärbarer KI, datenzentraler KI und mechanistischer Interpretierbarkeit

实现可解释的AI、数据集中AI和机械可解释性的统一归属

2501.18887v3

05-29

Global optimization of graph acquisition functions for neural architecture search

Globale Optimierung von Graphen-Erfassungsfunktionen für die neuronale Architektursuche

全球优化用于神经结构搜索的图图获取功能

2505.23640v1

05-29

Position: Scaling LLM Agents Requires Asymptotic Analysis with LLM Primitives

Position: Skalierung von LLM-Agenten erfordert asymptotische Analyse mit LLM-Primitiven

位置: 缩放 LLM 代理需要用 LLM 原始功能进行抗药性分析

2502.04358v2

05-29

MCP Safety Training: Learning to Refuse Falsely Benign MCP Exploits using Improved Preference Alignment

MCP Safety Training: Lernen, falsch benachbarte MCP-Exploits mit verbesserter Präferenzausrichtung abzulehnen

MCP 安全培训:学会利用改进的优惠协调,错误拒绝 MCP 剥削

2505.23634v1

05-29

Prompting Whisper for Improved Verbatim Transcription and End-to-end Miscue Detection

Prompting Whisper für verbesserte wörtliche Transkription und End-to-End-Missue-Erkennung

逐字记录和终端至终端杂项探测

2505.23627v1

05-29

Quartet: Native FP4 Training Can Be Optimal for Large Language Models

Quartett: Native FP4 Training kann für große Sprachmodelle optimal sein

四方:土著FP4培训可以成为大语言模式的最佳方式

2505.14669v2

05-29

SPACE: SPike-Aware Consistency Enhancement for Test-Time Adaptation in Spiking Neural Networks

SPACE: SPike-Aware Consistency Enhancement für Test-Time-Anpassung in Spiking Neuronal Networks

空间:在Spiking神经网络中加强在测试-时间适应方面的SPike-Aware一致性增强

2504.02298v2

05-29

Instance-Optimality for Private KL Distribution Estimation

Instanz-Optimalität für private KL-Verteilungsabschätzung

私人 KL 分布分布估计的实情- 最佳度

2505.23620v1

05-29

Few-Shot Speech Deepfake Detection Adaptation with Gaussian Processes

Wenig scharfe Rede Deepfake Detection Anpassung an Gaußsche Prozesse

Gaussian 过程的“深假探测”适应

2505.23619v1

05-29

One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory

Eine Trajektorie, ein Token: Erdliche Video-Tokenisierung über panoptische Sub-Objekt-Trajektorie

一个轨迹, 一个 Token: 通过泛光子物件轨迹, 固定的视频轨迹

2505.23617v1

05-29

Causal Machine Learning in IoT-based Engineering Problems: A Tool Comparison in the Case of Household Energy Consumption

Kausales maschinelles Lernen in IoT-basierten Engineering-Problemen: Ein Tool-Vergleich im Fall des Haushaltsenergieverbrauchs

以木工工程问题为基础的因果机械学习:家庭能源消费工具比较

2505.12147v2

05-29

Learning Interpretable Differentiable Logic Networks for Tabular Regression

用于制表递减的可解释可解释逻辑网络

2505.23615v1

05-29

Inference-time Scaling of Diffusion Models through Classical Search

Inferenzzeit Skalierung von Diffusionsmodellen durch klassische Suche

通过古典搜索对传播模型进行传播的推断-时间缩放

2505.23614v1

05-29

The Generalized Skew Spectrum of Graphs

Das generalisierte Skew-Spektrum der Graphen

普通的Skew图象光谱

2505.23609v1

05-29

Data Model Design for Explainable Machine Learning-based Electricity Applications

Datenmodell-Design für erklärbare maschinelle Learning-basierte Stromanwendungen

可解释机器学习用电力应用数据模型设计

2505.23607v1

05-29

Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

Muddit: Befreiende Generation jenseits von Text-zu-Bild mit einem Unified Discrete Diffusion Model

Muddit: 利用统一分解传播模型在文本到图像之外解放一代

2505.23606v1

05-29

STeCa: Step-level Trajectory Calibration for LLM Agent Learning

STeCa: Schritt-Level-Trajektorienkalibrierung für LLM Agent Learning

STeCa:LLM代理学习的职级轨迹校准

2502.14276v2

05-29

On Transferring Transferability: Towards a Theory for Size Generalization

Übertragbarkeit: Auf dem Weg zu einer Theorie der Größenverallgemeinerung

关于转让可转让性:走向一个通用规模理论

2505.23599v1

05-29

LLM Performance for Code Generation on Noisy Tasks

LLM-Performance für Code-Generierung bei lauten Aufgaben

LLM 噪音任务代码生成的LLM性能

2505.23598v1

05-29

Multilook Coherent Imaging: Theoretical Guarantees and Algorithms

Multilook Coherent Imaging: Theoretische Garantien und Algorithmen

多视相协调成像:理论保障和理算

2505.23594v1

05-29

Position: Federated Foundation Language Model Post-Training Should Focus on Open-Source Models

Position: Federated Foundation Language Model Nachschulung sollte sich auf Open-Source-Modelle konzentrieren

立场:联邦基金会语文示范培训后培训应侧重于开放来源模式

2505.23593v1

05-29

Accelerated Training of Federated Learning via Second-Order Methods

Beschleunigte Ausbildung des Föderierten Lernens über Methoden der zweiten Ordnung

通过二级方法加快联邦学习培训

2505.23588v1

05-29

PCA for Enhanced Cross-Dataset Generalizability in Breast Ultrasound Tumor Segmentation

PCA für verbesserte Cross-Dataset-Verallgemeinerung in der Brust-Ultraschall-Tumor-Segmentierung

五氯苯甲醚,用于在乳房超声波肿瘤分割中增强交叉数据的通用性

2505.23587v1

05-29

On-Policy RL with Optimal Reward Baseline

On-Policy RL mit optimaler Prämienbasis

具有最佳回报基准的政策性RL

2505.23585v1

05-29

Improving Time Series Forecasting via Instance-aware Post-hoc Revision

Verbesserung der Zeitreihenprognose über Instance-aware Post-hoc-Revision

改进时间序列预测,通过 “ 热后后预测 “ 改进时间序列预测

2505.23583v1

05-29

Wake-Informed 3D Path Planning for Autonomous Underwater Vehicles Using A* and Neural Network Approximations

Wake-Informierte 3D-Pfadplanung für autonome Unterwasserfahrzeuge mit A*- und Neuralnetzwerk-Annäherungen

使用A* 和神经网络相近的自动水下车辆的觉醒3D路径规划

2502.01918v2

05-29

BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model

BioReason: Förderung multimodaler biologischer Vernunft innerhalb eines DNA-LLM-Modells

BioReason:在DNA-LLM模型中激励多式生物理由

2505.23579v1

05-29

CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring

CoT Red-Handed: Stresstesting Chain-of-Thought-Überwachung

COT 红手:压力测试研究链监测

2505.23575v1

05-29

Maximum Likelihood Learning of Latent Dynamics Without Reconstruction

Maximale Wahrscheinlichkeit Lernen von latenten Dynamiken ohne Rekonstruktion

学习没有重建的原始动力学

2505.23569v1

05-29

DRO: A Python Library for Distributionally Robust Optimization in Machine Learning

DRO: Eine Python-Bibliothek für Distributional Robuste Optimierung im maschinellen Lernen

DRO: 一个用于在机器学习中进行分配式强力优化的 Python 图书馆

2505.23565v1

05-29

Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models

Segment Policy Optimization: Effektive Segment-Level-Kreditvergabe in RL für große Sprachmodelle

政策优化优化:大语言模式RL中有效的分部一级信用分配

2505.23564v1

05-29

LEXam: Benchmarking Legal Reasoning on 340 Law Exams

LEXam: Benchmarking der rechtlichen Begründung von 340 Rechtsprüfungen

LEXam:340项法律考试的法律依据基准

2505.12864v2

05-29

Qwen Look Again: Guiding Vision-Language Reasoning Models to Re-attention Visual Information

Qwen Look Again: Leitende Vision-Sprachen-Reasoning-Modelle, um visuelle Informationen erneut zu speichern

再看一遍:指导视觉信息重新阅读的视觉-语言定位依据模式

2505.23558v1

05-29

Learning Parametric Distributions from Samples and Preferences

Parametrische Verteilungen aus Proben und Präferenzen lernen

抽样和优惠制的学习参数分布

2505.23557v1

05-29

Adaptive Federated LoRA in Heterogeneous Wireless Networks with Independent Sampling

Adaptives Federated LoRA in heterogenen drahtlosen Netzwerken mit unabhängiger Probenahme

具有独立抽样调查的多源无线网络中的联邦适应性

2505.23555v1

05-29

Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters

Nachhaltiges CO2-basiertes und wassereffizientes LLM-Scheduling in Geo-verteilten Cloud-Rechenzentren

地球分布云数据中心的可持续碳软件和水效率高的LLM

2505.23554v1

05-29

Comparing the Moore-Penrose Pseudoinverse and Gradient Descent for Solving Linear Regression Problems: A Performance Analysis

Vergleich der Moore-Penrose Pseudoinverse und Gradient Descent zur Lösung linearer Regressionsprobleme: Eine Leistungsanalyse

将摩尔-彭罗斯-普塞多温和梯底比较以解决线性倒退问题:绩效分析

2505.23552v1

05-29

Diffusion Sampling Correction via Approximately 10 Parameters

Diffusions-Probenahmekorrektur über ca. 10 Parameter

通过大约10个参数校正传播抽样校正

2411.06503v3

05-29

Fast Large Language Model Collaborative Decoding via Speculation

Schnelles Large Language Model Kollaboratives Decodieren über Spekulation

通过投机进行快速大语言合作示范模式

2502.01662v2

05-29

Domain-Aware Tensor Network Structure Search

Domain-Aware Tensor Netzwerkstruktur Suche

域- 软件显示器网络网络结构搜索

2505.23537v1

05-29

It’s a (Blind) Match! Towards Vision-Language Correspondence without Parallel Data

Es ist ein (Blind) Match! Richtung Vision-Sprache Korrespondenz ohne Paralleldaten

这是一个( Blind) 匹配! 向没有平行数据的视觉语言对应函授

2503.24129v2

05-29

NACHOS: Neural Architecture Search for Hardware Constrained Early Exit Neural Networks

NACHOS: Neurale Architektur Suche nach Hardware eingeschränkt Early Exit Neural Networks

NACHOS: 早期外出神经网络硬件控制系统神经结构搜索

2401.13330v2

05-29

Subgraph Gaussian Embedding Contrast for Self-Supervised Graph Representation Learning

Subgraph Gaussian Einbettungskontrast für selbstüberwachtes Graphen-Darstellungslernen

自支持图表代表制学习的 Subgraph Gaussian 嵌入式对比对比度

2505.23529v1

05-29

Comparative assessment of fairness definitions and bias mitigation strategies in machine learning-based diagnosis of Alzheimer’s disease from MR images

Vergleichende Bewertung von Fairness-Definitionen und Bias-Minderungsstrategien in der maschinellen Lern-basierten Diagnose der Alzheimer-Krankheit aus MR-Bildern

对利用MR图像对阿尔茨海默氏病进行机器学习诊断的公平定义和减少偏见战略的比较评估

2505.23528v1

05-29

Normalizing Flows are Capable Models for RL

Normalisierende Strömungen sind fähige Modelle für RL

正常流动是RL的能力模型

2505.23527v1

05-29

Accelerating AllReduce with a Persistent Straggler

AllReduce mit einem persistenten Straggler beschleunigen

使用持久性斯特拉格驱动器加速全部拖动

2505.23523v1

05-29

Of Mice and Machines: A Comparison of Learning Between Real World Mice and RL Agents

Von Mäusen und Maschinen: Ein Vergleich des Lernens zwischen Real World Mäusen und RL Agenten

Mice和Mings:真实世界Mice和RL代理商之间的学习比较

2505.12204v2

05-29

An AI System for Continuous Knee Osteoarthritis Severity Grading Using Self-Supervised Anomaly Detection with Limited Data

Ein KI-System für kontinuierliche Knie-Osteoarthritis Schweregraduierung mittels selbstüberwachter Anomalieerkennung mit begrenzten Daten

AI 使用有限数据的自超异常检测系统

2407.11500v2

05-29

SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning

SimBa: Einfachheit Bias für das Skalieren von Parametern im Deep Reinforcement Learning

SimBa: 深强化学习中增强参数的简单比值

2410.09754v2

05-29

OmniEarth-Bench: Towards Holistic Evaluation of Earth’s Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data

OmniEarth-Bench: Auf dem Weg zu einer ganzheitlichen Bewertung der sechs Sphären und der Wechselwirkungen zwischen der Erde und multimodalen Erddaten

Omni地球环境:争取全面评价地球六层和与多模式对地观测地球数据交互作用

2505.23522v1

05-29

AnchorAttention: Difference-Aware Sparse Attention with Stripe Granularity

AnkerAchtung: Differenz-Bewusst Sparse Achtung mit Streifen Granularität

锁定目标: 带条形颗粒的差别- 软件分散注意

2505.23520v1

05-29

Hyperspherical Normalization for Scalable Deep Reinforcement Learning

Hypersphärische Normalisierung für skalierbares Deep Reinforcement Learning

可缩放深强化学习超球常规化

2502.15280v2

05-29

SGD Jittering: A Training Strategy for Robust and Accurate Model-Based Architectures

SGD Jittering: Eine Schulungsstrategie für robuste und präzise modellbasierte Architekturen

SGD Jittering: 强健和准确的建模建筑培训战略

2410.14667v2

05-29

Joint Localization and Activation Editing for Low-Resource Fine-Tuning

Gemeinsame Lokalisierungs- und Aktivierungsbearbeitung für Low-Resource Fine-Tuning

低资源微调联合定位和启动编辑

2502.01179v4

100

05-29

DeepFilterGAN: A Full-band Real-time Speech Enhancement System with GAN-based Stochastic Regeneration

DeepFilterGAN: Ein Full-Band-Real-Time-Speech Enhancement-System mit GAN-basierter stochastischer Regeneration

DeepFilterGAN:全频实时语音增强系统,以GAN为基础进行蒸汽再生

2505.23515v1

101

05-29

Spectrotemporal Modulation: Efficient and Interpretable Feature Representation for Classifying Speech, Music, and Environmental Sounds

Spektrotemporale Modulation: Effiziente und interpretierbare Feature-Darstellung für die Klassifizierung von Sprach-, Musik- und Umweltgeräuschen

时速变化:演讲、音乐和环境声音的分类化演讲、音乐和环境声音的高效和可解释的地物代表

2505.23509v1

102

05-29

Why Machine Learning Models Fail to Fully Capture Epistemic Uncertainty

Warum Modelle des maschinellen Lernens die epistemische Unsicherheit nicht vollständig erfassen

机器学习模型为何不能完全捕捉宇宙的不确定性

2505.23506v1

103

05-29

Hijacking Large Language Models via Adversarial In-Context Learning

Entführen von großen Sprachmodellen über das adversarische In-Context-Lernen

通过对抗性内书学习劫持大语言模式

2311.09948v3

104

05-29

Epistemic Errors of Imperfect Multitask Learners When Distributions Shift

Epistemische Fehler von unvollkommenen Multitask Learner bei Verteilungsverschiebungen

发行转移时不完美的多任务学习者

2505.23496v1

105

05-29

Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

Diagnose und Bewältigung von Pitfalls in KG-RAG-Datensätzen: Zu zuverlässigerem Benchmarking

分析和处理KG-RAG数据集的缺陷:争取更可靠的基准

2505.23495v1

106

05-29

Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning

Kurzbefehle in audio-visuellen Deepfake-Erkennungsdatensätzen mit unüberwachtem Lernen

在未经监督的学习的视听深假发现数据集中绕过捷径

2412.00175v3

107

05-29

A False Discovery Rate Control Method Using a Fully Connected Hidden Markov Random Field for Neuroimaging Data

Eine falsche Discovery Rate Control-Methode mit einem vollständig verbundenen versteckten Markov Random Field für Neuroimaging-Daten

假发现率控制方法, 使用完全连接的隐藏 Markov 随机字段来生成 Neuroimage 数据

2505.20688v2

108

05-29

Learning to Poison Large Language Models for Downstream Manipulation

Große Sprachmodelle für Downstream-Manipulation zu vergiften

学习下游操作毒物大语言模式

2402.13459v3

109

05-29

SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training

SGD als Freie Energie Minimierung: Ein thermodynamischer Blick auf neurales Netzwerktraining

SGD作为自由能源最小化:关于神经网络培训的热动力学观点

2505.23489v1

110

05-29

Federated Granger Causality Learning for Interdependent Clients with State Space Representation

Föderiertes Granger-Causality-Lernen für interdependente Kunden mit staatlicher Raumdarstellung

为具有国家空间代表制的相互依存客户提供

2501.13890v4

111

05-29

TimePoint: Accelerated Time Series Alignment via Self-Supervised Keypoint and Descriptor Learning

TimePoint: Beschleunigte Zeitreihenausrichtung über selbstüberwachtes Keypoint- und Descriptor-Lernen

时间点:通过自上调关键点和描述学习加速时间序列调整

2505.23475v1

112

05-29

Refining Labeling Functions with Limited Labeled Data

Verfeinerung von Beschriftungsfunktionen mit begrenzten beschrifteten Daten

用有限标签数据改进标签功能

2505.23470v1

113

05-29

Surveying the space of descriptions of a composite system with machine learning

Vermessung des Raumes der Beschreibungen eines Verbundsystems mit maschinellem Lernen

勘查机器学习综合系统说明的空间

2411.18579v2

114

05-29

Retrieval Visual Contrastive Decoding to Mitigate Object Hallucinations in Large Vision-Language Models

Retrieval Visuelle Kontrastive Dekodierung zu Mitigate-Objekt-Halluzinationen in großen Vision-Sprachen-Modellen

在大型视觉-语言模型中,将检索视觉对抗性脱钩作为稀释物体幻觉的大型视觉-语言模型

2505.20569v2

115

05-29

A Tutorial on Meta-Reinforcement Learning

Ein Tutorial zum Meta-Reinforcement-Lernen

关于元加强学习的教学材料

2301.08028v4

116

05-29

Agentic Knowledgeable Self-awareness

Agentisch sachkundiges Selbstbewußtsein

A. 动态知识自觉意识

2504.03553v2

117

05-29

Pessimism Principle Can Be Effective: Towards a Framework for Zero-Shot Transfer Reinforcement Learning

Pessimismus-Prinzip kann wirksam sein: Auf dem Weg zu einem Rahmen für Null-Shot-Transfer-Verstärkungs-Lernen

悲观主义原则可以有效:建立一个零热转移强化学习框架

2505.18447v2

118

05-29

LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection

LENSLLM: Enthüllen von Feintuning-Dynamik für die LLM-Auswahl

LENSLLM: 用于选择LLM的连续精细调整动态

2505.03793v2

119

05-29

Broadband Ground Motion Synthesis by Diffusion Model with Minimal Condition

Broadband Ground Motion Synthese durch Diffusion Modell mit minimalem Zustand

以最小条件传播模型进行宽带地面移动合成

2412.17333v2

120

05-29

On Global Convergence Rates for Federated Policy Gradient under Heterogeneous Environment

Globale Konvergenzraten für Föderierten politischen Gradienten unter heterogener Umwelt

关于不同不同环境下联邦政策分级制全球趋同率的全球趋同率

2505.23459v1

121

05-29

Diffusion Guidance Is a Controllable Policy Improvement Operator

Diffusion Guidance ist ein kontrollierbarer Politikverbesserungs-Betreiber

传播指导是可控制的政策改进操作员

2505.23458v1

122

05-29

TabReason: A Reinforcement Learning-Enhanced Reasoning LLM for Explainable Tabular Data Prediction

TabReason: Eine verstärkte Lern-verbesserte Begründung LLM für erklärbare tabellarische Datenvorhersage

TabReason: 用于可解释的表格数据预测的强化学习增强推理LLM

2505.21807v2

123

05-29

Learning Cascade Ranking as One Network

Kaskaden-Ranking als ein Netzwerk lernen

学习连级安排 “ 一个网络 “ 网络

2503.09492v2

124

05-29

DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation

DynaMem: Online-Dynamischer Raum-Semantischer Speicher für mobile Manipulationen in der offenen Welt

DynaMem: 用于开放世界移动操纵的在线动态空间-空间内存

2411.04999v2

125

05-29

Network Inversion for Uncertainty-Aware Out-of-Distribution Detection

Netzwerk-Inversion für unsichere Out-of-Distribution-Erkennung

用于不确定性软件发送外检测的网络转换

2505.23448v1

126

05-29

GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning

GSQ-Tuning: Group-Shared Exponents integer in einer voll quantifizierten Schulung für LLMs On-Device-Fine-Tuning

GSQ-Tuning:为设备端微调LLM提供全面量化培训的集团共享指数整数

2502.12913v3

127

05-29

SCoTT: Strategic Chain-of-Thought Tasking for Wireless-Aware Robot Navigation in Digital Twins

SCoTT: Strategisches Chain-of-Thought-Tasking für Wireless-Aware-Roboternavigation in digitalen Zwillingen

SCoTT: “ 数字双双 “ 中无线软件机器人导航战略研究链任务

2411.18212v2

128

05-29

The Strong, Weak and Benign Goodhart’s law. An independence-free and paradigm-agnostic formalisation

The Strong, Weak and Benign Goodharts Gesetz. Eine unabhängigkeitsfreie und paradigmatisch-agnostische Formalisierung

强势、弱弱和本尼·古德哈特法,无独立和无范式、不可知的正规化

2505.23445v1

129

05-29

Strategic Classification with Non-Linear Classifiers

Strategische Klassifizierung mit nicht linearen Klassifikatoren

战略分类与非链分类法战略分类

2505.23443v1

130

05-29

Rethinking Regularization Methods for Knowledge Graph Completion

Überdenken von Regularisierungsmethoden für Wissensgraphenvervollständigung

重新思考知识图完成正规化方法

2505.23442v1

131

05-29

The challenge of hidden gifts in multi-agent reinforcement learning

Die Herausforderung der versteckten Gaben in Multi-Agenten-Verstärkung Lernen

多试剂强化学习中隐藏礼品的挑战

2505.20579v2

132

05-29

LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty

LoTUS: Großformatige Maschine entlernen mit einem Geschmack von Ungewissheit

LoTUS: 大型机器与不确定性的味道脱钩

2503.18314v4

133

05-29

Bounded-Abstention Pairwise Learning to Rank

Gebundene Abhaltung Pairwise Learning to Rank

学习排名

2505.23437v1

134

05-29

Train with Perturbation, Infer after Merging: A Two-Stage Framework for Continual Learning

Trainieren mit Perturbation, schlussfolgern nach Merging: Ein Zwei-Stufen-Rahmen für kontinuierliches Lernen

接转训练、合并后的推推:持续学习的双阶段框架

2505.22389v2

135

05-29

Emergent Risk Awareness in Rational Agents under Resource Constraints

Emergent Risk Awareness in Rational Agents unter Ressourcenbeschränkungen

资源限制下对合理代理的新兴风险意识

2505.23436v1

136

05-29

Diversity-Aware Policy Optimization for Large Language Model Reasoning

Diversity-Aware-Politikoptimierung für groß angelegte Sprachmodell-Reasoning

大语言示范理由的多样性政策优化

2505.23433v1

137

05-29

Improved Learning via k-DTW: A Novel Dissimilarity Measure for Curves

Verbessertes Lernen über k-DTW: Ein neuartiges Maß an Unähnlichkeit für Kurven

通过 k-DTW改进学习:曲线的新差异措施

2505.23431v1

138

05-29

Proper Dataset Valuation by Pointwise Mutual Information

Richtiger Datensatz Bewertung durch pointwise Gegenseitige Informationen

按点对点相互信息分列的适当数据集估价

2405.18253v3

139

05-29

Understanding and Mitigating Overrefusal in LLMs from an Unveiling Perspective of Safety Decision Boundary

Überrefusal in LLMs aus Sicht der Sicherheitsentscheidungsgrenze zu verstehen und zu mildern

从安全裁定边界的始终如一的视角理解和减轻LLM的过度拒绝

2505.18325v2

140

05-29

On the Validity of Head Motion Patterns as Generalisable Depression Biomarkers

Über die Gültigkeit von Head Motion Patterns als Generalisable Depression Biomarkers

头动模式作为可普遍适用的萧条生物标志物的有效性

2505.23427v1

141

05-29

Enhanced DACER Algorithm with High Diffusion Efficiency

Verbesserter DACER-Algorithmus mit hoher Diffusionseffizienz

DACER 高传播效率增强的DACER 计算法

2505.23426v1

142

05-29

Hierarchical Neuro-Symbolic Decision Transformer

Hierarchischer neuro-symbolischer Entscheidungstransformator

等级性神经-共制决定变换器

2503.07148v3

143

05-29

Risk-aware Direct Preference Optimization under Nested Risk Measure

Risikobewusste Direktpräferenzoptimierung unter verschachtelter Risikomaßnahme

内层风险措施下认识到风险的直接最优化

2505.20359v2

144

05-29

OTPTO: Joint Product Selection and Inventory Optimization in Fresh E-commerce Front-End Warehouses

OTPTO: Gemeinsame Produktauswahl und Bestandsoptimierung in Fresh E-Commerce Front-End Warehouses

OTPTO: 在新的电子商务前端仓库中联合产品选择和清单优化

2505.23421v1

145

05-29

Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition

Probeneffiziente menschliche Bewertung großer Sprachmodelle durch maximalen Diskrepanzwettbewerb

通过最大差异竞争对大语言模式进行抽样有效人力评价

2404.08008v2

146

05-29

Robustness-Congruent Adversarial Training for Secure Machine Learning Model Updates

Robustheitskongruente Adversarial Training für sicheres maschinelles Lernen Modellaktualisierungen

安全机器学习模型更新的强力和共性安全机器学习模型自动培训

2402.17390v2

147

05-29

Privacy Amplification by Structured Subsampling for Deep Differentially Private Time Series Forecasting

Datenschutzverstärkung durch strukturierte Subsampling für tief differential private Zeitreihen Forecasting

以结构化的分抽样对深相异私人时间序列预测进行隐私放大

2502.02410v2

148

05-29

On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists

On-Device Collaborative Language Modeling über eine Mischung aus Generalisten und Spezialisten

通过通识主义者和专家混合组合的在线合作语言建模

2409.13931v4

149

05-29

KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction

KVzip: Query-Agnostic KV Cache-Kompression mit Kontext-Rekonstruktion

KVzip: 在背景重建中压缩缓存

2505.23416v1

150

05-29

Bidirectional predictive coding

Bidirektionale vorausschauende Kodierung

双向预测双向预测编码

2505.23415v1

151

05-29

Identification and Optimal Nonlinear Control of Turbojet Engine Using Koopman Eigenfunction Model

Identifizierung und optimale nichtlineare Steuerung der Turbojet-Engine mit Koopman Eigenfunktionsmodell

使用 Koopman Eigen功能模型对涡轮喷气发动机进行最佳非线性识别和最佳非线性控制

2505.10438v2

152

05-29

Buffer-free Class-Incremental Learning with Out-of-Distribution Detection

Pufferfreies Klassen-Inkrementelles Lernen mit Out-of-Distribution Detection

带分布外检测的无缓冲区类增量学习

2505.23412v1

153

05-29

Video Editing for Audio-Visual Dubbing

Videobearbeitung für Audio-Visual-Dubbing

音像视频编辑

2505.23406v1

154

05-29

A Refined Analysis of UCBVI

Eine raffinierte Analyse von UCBVI

UCBVI的精细分析

2502.17370v2

155

05-29

Closed-form Solutions: A New Perspective on Solving Differential Equations

Closed-form Lösungen: Eine neue Perspektive zur Lösung von Differentialgleichungen

封闭式解决办法:解决差异等量的新视角

2405.14620v3

156

05-29

Subgroups Matter for Robust Bias Mitigation

Untergruppen sind wichtig für robuste Bias-Minderung

子群体对稳健的偏差缓解至关重要

2505.21363v2

157

05-29

Unraveling the Interplay between Carryover Effects and Reward Autocorrelations in Switchback Experiments

Entschlüsselung des Interplays zwischen Übertragungseffekten und Belohnungsautokorrelationen in Switchback-Experimenten

在回转实验中解开结转效应与回转回实验中回调自动关系之间的交互作用

2403.17285v3

158

05-29

Dynamic Estimation Loss Control in Variational Quantum Sensing via Online Conformal Inference

Dynamische Abschätzungsverlustkontrolle bei der variationalen Quantensensing über Online-Konforme Inferenz

通过在线非正式推断在变化量测量中动态估计损失控制

2505.23389v1

159

05-29

BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction

BatteryLife: Ein umfassender Datensatz und Benchmark für die Vorhersage der Akkulaufzeit

电池寿命:电池寿命预测综合数据集和基准

2502.18807v4

160

05-29

A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers

Eine statistische Lernperspektive zur halbdualen Neural Optimal Transport Solvers

半对半对半的神经神经优化运输解决方案的统计学习视角

2502.01310v2

161

05-29

Automated Modeling Method for Pathloss Model Discovery

Automatisierte Modellierungsmethode für Pathloss Model Discovery

路径损耗模型发现的自动建模方法

2505.23383v1

162

05-29

Tracking Progress Towards Sustainable Development Goal 6 Using Satellite Imagery

Fortschritte auf dem Weg zu einer nachhaltigen Entwicklung verfolgen Ziel 6 Nutzung von Satellitenbildern

利用卫星图像跟踪可持续发展目标6的进展情况

2411.19093v2

163

05-29

Meta-Learning Approaches for Speaker-Dependent Voice Fatigue Models

Meta-Learning-Ansätze für Sprecher-Abhängige Sprachmüdigkeitsmodelle

说话人相关嗓音疲劳模型的元学习方法

2505.23378v1

164

05-29

GWQ: Gradient-Aware Weight Quantization for Large Language Models

GWQ: Gradient-Aware Weight Quantization für große Sprachmodelle

GWQ: 面向大语言模型的梯度感知权重量化

2411.00850v4

165

05-29

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

Das Nachdenken über die Auswahlkriterien bei der Stärkung des Lernens für LLM-Reasoning: Eine Kompetenz-Schwierigkeits-Alignment-Perspektive

重新思考在加强学习学习中为LLM 合理性提供强化学习的抽样标准:能力-困难-协调观点

2505.17652v2

166

05-29

Dynamic Spectral Backpropagation for Efficient Neural Network Training

Dynamische Spektral-Backpropagation für effizientes Neural-Netzwerk-Training

促进高效神经网络培训的动态光谱后方通信

2505.23369v1

167

05-29

Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs

Graph of Records: Steigerung der retrieval Augmented Generation für Langkontext-Zusammenfassung mit Graphen

记录图图:用图表进行长文本摘要的推进检索增量生成器

2410.11001v2

168

05-29

Guarantees of a Preconditioned Subgradient Algorithm for Overparameterized Asymmetric Low-rank Matrix Recovery

Garantien eines vorkonditionierten Subgradienten Algorithmus für überparameterisierte asymmetrische Low-rank Matrix Erholung

保证为超参数化的测量性对称低级矩阵恢复提供先决条件的亚梯分算法的保障

2410.16826v2

169

05-29

Grower-in-the-Loop Interactive Reinforcement Learning for Greenhouse Climate Control

Grower-in-the-Loop Interaktives Verstärkungslernen für Greenhouse Climate Control

种植者在Loop-Loop 互动强化学习促进温室气候控制

2505.23355v1

170

05-29

ChatHuman: Chatting about 3D Humans with Tools

ChatHuman: Chatten über 3D-Menschen mit Tools

聊天:用工具聊天关于3D人类

2405.04533v2

171

05-29

BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Behavioural Change

BAH-Datensatz für Ambivalenz/Hesitanzerkennung in Videos für Verhaltensänderungen

BAH 行为变化视频中双向/隐私识别 BAH 数据集

2505.19328v2

172

05-29

Towards Reward Fairness in RLHF: From a Resource Allocation Perspective

Zur Belohnung Fairness in RLHF: Aus Ressourcenzuweisungsperspektive

走向RLHF的奖励公平:从资源分配角度

2505.23349v1

173

05-29

Sentinel: Scheduling Live Streams with Proactive Anomaly Detection in Crowdsourced Cloud-Edge Platforms

Sentinel: Planung von Livestreams mit proaktiver Anomalieerkennung in Crowdsourced Cloud-Edge-Plattformen

哨兵:将现场流排成日程,在人源云源云源平台上进行主动异常探测

2505.23347v1

174

05-29

Graph Positional Autoencoders as Self-supervised Learners

Graphische Positionale Autoencoder als selbstüberwachte Lernende

作为自监管学习者进行定位自动校对的图形图

2505.23345v1

175

05-29

A Descriptor Is All You Need: Accurate Machine Learning of Nonadiabatic Coupling Vectors

Ein Deskriptor ist alles, was Sie brauchen: Genaues maschinelles Lernen von nichtadiabatischen Kupplungsvektoren

描述符是你需要的:非非异相叠合矢量的精确机器学习

2505.23344v1

176

05-29

Matryoshka Model Learning for Improved Elastic Student Models

Matryoshka Model Learning für verbesserte elastische Studentenmodelle

Matryoshka 改进弹性学生模式示范学习模式

2505.23337v1

177

05-29

X2Graph for Cancer Subtyping Prediction on Biological Tabular Data

X2Graph für Krebs Subtyping Vorhersage auf biologische Tabellendaten

用于对生物表表数据进行癌症子图谱预测的X2Graph

2505.23334v1

178

05-29

Fine-Tuning Next-Scale Visual Autoregressive Models with Group Relative Policy Optimization

Feintuning Next-Scale Visual Autoregressive Modelle mit gruppenrelativer Politikoptimierung

采用群体相对政策优化优化的下尺度视觉自动递减模型

2505.23331v1

179

05-29

Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism

Fehlerübertragung und Decorrelation als potenzieller künstlicher und natürlicher Lernmechanismus

错误广播和装饰关系作为一种潜在的人工和自然学习机制

2504.11558v2

180

05-29

Combinatorial Rising Bandit

Kombinatorial Rising Bandit

混合崛起强盗

2412.00798v3

181

05-29

Efficient Parameter Estimation for Bayesian Network Classifiers using Hierarchical Linear Smoothing

Effiziente Parameterschätzung für Bayesian Network Klassifikatoren mit Hierarchical Linear Glättung

Bayesian 网络分类器使用等级线性线性平滑法的高效参数参数估测

2505.23320v1

182

05-29

A Straightforward Gradient-Based Approach for High-Tc Superconductor Design: Leveraging Domain Knowledge via Adaptive Constraints

Ein einfacher gradient-basierter Ansatz für High-Tc-Supraleiter-Design: Nutzung von Domain-Wissen über adaptive Einschränkungen

高Tc超级导体设计的直向渐进式高超导体设计方法:通过适应性制约因素利用域知识

2403.13627v2

183

05-29

Enhancing Marker Scoring Accuracy through Ordinal Confidence Modelling in Educational Assessments

Verbesserung der Genauigkeit der Markerbewertung durch ordinelles Vertrauensmodellierung in Bildungsbewertungen

通过在教育评估中建立常规信任模型,加强标标码的准确度

2505.23315v1

184

05-29

Adversarial Semantic and Label Perturbation Attack for Pedestrian Attribute Recognition

Adversariale Semantische und Label-Störung Angriff für Fußgänger Attribute Anerkennung

对抗性语义和Label干扰攻击,以确认佩德斯特属性

2505.23313v1

185

05-29

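Rethinking Gradient-Based Methods: Multi-Property Materials Design Beyond Differentiable Targets

Gradientenbasierte Methoden neu denken: Multi-Property-Materialdesign jenseits differenzierbarer Ziele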

重新思考渐进方法:超出可区别目标的多财产材料设计

2410.08562v4

186

05-29

Score-based Generative Modeling for Conditional Independence Testing

Score-basierte Generative Modellierung für die Prüfung der bedingten Unabhängigkeit

有条件独立测试基于记分率生成模型

2505.23309v1

187

05-29

MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction

MGE-LDM: Gemeinsame Latente Diffusion für simultane Musikgeneration und Quellenextraktion

MGE-LDM:同时制作音乐和来源采掘联合前期传播

2505.23305v1

188

05-29

Understanding and Mitigating Miscalibration in Prompt Tuning for Vision-Language Models

Verstehen und Abmildern von Fehlkalibrierung bei sofortiger Tuning für Vision-Language-Modelle

理解和减缓视觉语言模型快速开票时的误差

2410.02681v4

189

05-29

How Does Response Length Affect Long-Form Factuality

Wie wirkt sich die Response-Länge auf die Langform-Faktizität aus?

反应时间长度如何影响长期事实质量

2505.23295v1

190

05-29

Multi-Modal Framing Analysis of News

Multi-Modal Framing Analyse der Nachrichten

新闻多模式结构分析

2503.20960v3

191

05-29

Comparative Analysis of the Land Use and Land Cover Changes in Different Governorates of Oman using Spatiotemporal Multi-spectral Satellite Data

Vergleichende Analyse der Bodennutzungs- und Bodenbedeckungsänderungen in verschiedenen Gouvernements von Oman unter Verwendung spatiotemporaler multispektraler Satellitendaten

利用斯帕蒂多光谱多谱段卫星数据对阿曼不同省份土地利用和土地覆盖变化的比较分析

2505.23285v1

192

05-29

Improving Continual Learning Performance and Efficiency with Auxiliary Classifiers

Verbesserung der kontinuierlichen Lernleistung und Effizienz mit Hilfsklassifikatoren

提高持续学习成绩和效率,辅级分级

2403.07404v4

193

05-29

Optimal Protocols for Continual Learning via Statistical Physics and Control Theory

Optimale Protokolle für kontinuierliches Lernen über statistische Physik und Steuerungstheorie

通过统计物理和控制理论不断学习的最佳最佳协议

2409.18061v3

194

05-29

LADA: Scalable Label-Specific CLIP Adapter for Continual Learning

LADA: Skalierbarer Label-Spezifischer CLIP Adapter für kontinuierliches Lernen

旱地退化评估:用于持续学习的可缩放标签特定CLIP适应器

2505.23271v1

195

05-29

Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs

Entfernt Machine Unlearning wirklich Modellwissen? Ein Rahmen für die Prüfung von Unlearning in LLMs

机器遗忘是否真正删除了模型知识？LLM 遗忘审计框架

2505.23270v1

196

05-29

Behavior-Regularized Diffusion Policy Optimization for Offline Reinforcement Learning

Behavior-Regularized Diffusion Policy Optimierung für Offline-Verstärkung Lernen

离线强化学习的传播政策优化

2502.04778v2

197

05-29

Efficiently Access Diffusion Fisher: Within the Outer Product Span Space

Effizienter Zugriff auf Diffusion Fisher: Innerhalb des Outer Product Span Space

有效获取扩散渔渔场:在外生产品空间内

2505.23264v1

198

05-29

Stable Thompson Sampling: Valid Inference via Variance Inflation

Stabile Thompson-Probenahme: Gültige Schlussfolgerung durch Varianz-Inflation

稳定汤普森抽样:因通货膨胀差异而得出的有效推论

2505.23260v1

199

05-29

BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL

BOFormer: Lernen, Multi-Objektive Bayesian Optimierung über nicht-Markovian RL zu lösen

BOFormer: 学会通过非马尔科维安RL解决多目标巴耶斯最佳利用

2505.21974v2

200

05-29

Skywork Open Reasoner 1 Technical Report

Skywork Open Reasoner 1 Technischer Bericht

” 天窗开放理由1 “ 技术报告

2505.22312v2

201

05-29

Tensor Product Attention Is All You Need

Tensor-Produkt-Aufmerksamkeit ist alles, was Sie brauchen

张量积注意力就是你所需要的一切

2501.06425v4

202

05-29

Sparseformer: a Transferable Transformer with Multi-granularity Token Sparsification for Medical Time Series Classification

Sparseformer: ein übertragbarer Transformer mit Multigranularitäts-Tokensparsifikation für die Klassifizierung medizinischer Zeitreihen

分散式分析器:医疗时间序列分类的可转让变异器,具有多管质质调分法

2503.15578v2

203

05-29

RiverMamba: A State Space Model for Global River Discharge and Flood Forecasting

RiverMamba: Ein Zustandsraummodell für globalen Flussabfluss und Hochwasserprognose

RiverMamba:用于全球河流流量和洪水预报的状态空间模型

2505.22535v2

204

05-29

Accelerating RLHF Training with Reward Variance Increase

Beschleunigung des RLHF-Trainings mit Belohnungsvarianzsteigerung

加快RLHF培训,增加奖励差异

2505.23247v1

205

05-29

Measuring Participant Contributions in Decentralized Federated Learning

Messung der Teilnehmerbeiträge im dezentralisierten Föderierten Lernen

分权联邦学习中的衡量参与者贡献

2505.23246v1

206

05-29

Are You Using Reliable Graph Prompts? Trojan Prompt Attacks on Graph Neural Networks

Verwenden Sie zuverlässige Graph-Prompts? Trojanische Prompt-Angriffe auf Graph-Neural-Netzwerke

你用的是可靠图形提示吗? Trojan对图形神经网络的迅速攻击

2410.13974v2

207

05-29

Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts

Autonome Datenauswahl mit Zero-shot Generative Klassifikatoren für mathematische Texte

具有数学文本零光生成分类器的自动数据选择

2402.07625v6

208

05-29

Equivalence of stochastic and deterministic policy gradients

Gleichwertigkeit stochastischer und deterministischer politischer Gradienten

政策梯度和确定性政策梯度等同

2505.23244v1

209

05-29

Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game

Sprachagenten mit Verstärkung Lernen für strategisches Spiel im Werwolf Spiel

在狼人游戏中进行战略游戏强化学习的语文代理

2310.18940v4

210

05-29

Joint estimation of smooth graph signals from partial linear measurements

Gemeinsame Schätzung glatter Graphensignale aus partiellen linearen Messungen

对部分线性测量得出的平滑图示信号的联合估计

2505.23240v1

211

05-29

Learn Singularly Perturbed Solutions via Homotopy Dynamics

Singulär perturbed Lösungen über Homotopy Dynamics lernen

通过智多基动力学学习单点受扰动的解决方案

2502.00488v3

212

05-29

HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model

HiDe-LLaVA: Hierarchische Entkopplung zur kontinuierlichen Instruktionstuning von multimodalen Großsprachenmodellen

HiDe-LLaVA:多式大语言模式连续教学制导的等级脱钩

2503.12941v2

213

05-29

Graph Random Walk with Feature-Label Space Alignment: A Multi-Label Feature Selection Method

Graph Random Walk mit Feature-Label-Raumausrichtung: Eine Multi-Label-Feature-Auswahlmethode

带有地貌标签空间对齐的任意漫步图图 : 多标签特征选择方法

2505.23228v1

214

05-29

am-ELO: A Stable Framework for Arena-based LLM Evaluation

am-ELO: Ein stabiles Rahmenwerk für Arena-basierte LLM-Evaluierung

AM-ELO:基于竞技场的LLM评价稳定框架

2505.03475v2

215

05-29

Generalizability vs. Counterfactual Explainability Trade-Off

Generalisierbarkeit vs. gegenfaktische Erklärbarkeit Trade-Off

通用与反事实解释

2505.23225v1

216

05-29

JANET: Joint Adaptive predictioN-region Estimation for Time-series

JANET: Gemeinsame adaptive Vorhersage-Region Schätzung für Zeitreihen

JANET: 时间序列联合适应性预测N-区域估算

2407.06390v2

217

05-29

A Signed Graph Approach to Understanding and Mitigating Oversmoothing in GNNs

Ein Ansatz mit signierten Graphen zum Verständnis und zur Minderung von Oversmoothing in GNNs

理解和缓解 GNN 过平滑的符号图方法

2502.11394v2

218

05-29

Daunce: Data Attribution through Uncertainty Estimation

Daunce: Datenzuweisung durch Unsicherheitsabschätzung

Daunce:通过不确定性估计数据归属

2505.23223v1

219

05-29

Trajectory Generator Matching for Time Series

Trajektorie Generator passend für Zeitreihen

时间序列匹配轨迹生成器

2505.23215v1

220

05-29

Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Engere Datenschutzprüfung von DP-SGD im Hidden State Threat Model

对隐藏国家威胁模式DP-SGD的更严格隐私审计

2405.14457v3

221

05-29

Improving Parallel Program Performance with LLM Optimizers via Agent-System Interfaces

Verbesserung der parallelen Programmleistung mit LLM-Optimierern über Agent-System-Schnittstellen

通过代理-系统接口改进与LLM优化器的平行方案绩效

2410.15625v3

222

05-29

On the performance of machine-learning-assisted Monte Carlo in sampling from simple statistical physics models

Über die Leistung von Monte Carlo mit maschinellem Lernen bei der Probenahme von einfachen Modellen der statistischen Physik

关于机器学习辅助蒙特卡洛在简单统计物理模型采样中的性能

2505.22598v2

223

05-29

Towards Robust Overlapping Speech Detection: A Speaker-Aware Progressive Approach Using WavLM

Auf dem Weg zu einer robusten, überlappenden Spracherkennung: Ein Lautsprecher-Bewusst-Progressiver Ansatz mit WavLM

争取强劲的超重叠语音探测:使用WavLM 的演讲者-警示渐进方法

2505.23207v1

224

05-29

Disentangled Multi-span Evolutionary Network against Temporal Knowledge Graph Reasoning

Disentangled Multi-Span Evolutionary Network gegen Temporal Knowledge Graph Reasoning

对抗时间知识图表推理的多空间演进网络

2505.14020v2

225

05-29

Aligning Text to Image in Diffusion Models is Easier Than You Think

Text an Bild in Diffusions-Modellen ausrichten ist einfacher, als Sie denken

在传播模型中将文本对齐到图像比您想象的容易

2503.08250v4

226

05-29

JAPAN: Joint Adaptive Prediction Areas with Normalising-Flows

JAPAN: Gemeinsame adaptive Vorhersagebereiche mit Normalisierungs-Flows

JAPAN: 联合适应性预测区与标准化花束

2505.23196v1

227

05-29

Less is More: Unlocking Specialization of Time Series Foundation Models via Structured Pruning

Weniger ist mehr: Unlocking Spezialisierung von Time Series Foundation Models über strukturiertes Pruning

较少是更多:通过结构式普鲁宁解锁时间序列基础模型的专业化

2505.23195v1

228

05-29

Multimodal Inverse Attention Network with Intrinsic Discriminant Feature Exploitation for Fake News Detection

Multimodale Inverse Aufmerksamkeit Netzwerk mit Intrinsic Discriminant Feature Exploitation für gefälschte Nachrichten Erkennung

多式反向关注网络,利用内在差异性地貌特征利用假新闻探测

2502.01699v2

229

05-29

Beyond Zero Initialization: Investigating the Impact of Non-Zero Initialization on LoRA Fine-Tuning Dynamics

Beyond Zero Initialization: Untersuchung der Auswirkungen von Non-Zero Initialization auf LoRA Fine-Tuning Dynamics

零启动后零启动后:调查非零初始化对LORA微调动力学的影响

2505.23194v1

230

05-29

DeepRTE: Pre-trained Attention-based Neural Network for Radiative Transfer

DeepRTE: Vortrainiertes aufmerksamkeitsbasiertes neuronales Netzwerk für Strahlungstransfer

DeepRTE: 用于辐射传输的预训练注意力神经网络

2505.23190v1

231

05-29

Plug In and Learn: Federated Intelligence over a Smart Grid of Models

Plug In and Learn: Federated Intelligence über ein Smart Grid aus Modellen

插插插和学习:对智能模型网的联邦情报

2302.04363v4

232

05-29

Dequantified Diffusion-Schrödinger Bridge for Density Ratio Estimation

Dequantifizierte Diffusion-Schrödinger-Brücke für Dichte-Verhältnis-Schätzung

密度比率估计的量化扩散 - Schrödinger桥

2505.05034v3

233

05-29

Unsupervisedly Learned Representations: Should the Quest be Over?

Unüberwacht gelernte Repräsentationen: Sollte die Suche vorbei sein?

无人监督的派任代表:调查是否应该结束?

2001.07495v6

234

05-29

Rethinking Positive Pairs in Contrastive Learning

Positive Paare im kontrastistischen Lernen neu denken

在反竞争学习中重新思考正对对

2410.18200v2

235

05-29

Improving the Effective Receptive Field of Message-Passing Neural Networks

Verbesserung des effektiven Empfangsfeldes von message-passing Neural Networks

改进信息传送神经网络的有效接收领域

2505.23185v1

236

05-29

Two Is Better Than One: Rotations Scale LoRAs

Zwei ist besser als eins: Rotationsskala LoRAs

二比一好:轮作规模LORAs

2505.23184v1

237

05-29

MADCluster: Model-agnostic Anomaly Detection with Self-supervised Clustering Network

MADCluster: Modell-agnostische Anomalieerkennung mit selbstüberwachtem Clustering-Netzwerk

MADCluster:使用自监管的集群网进行模型-不可知异常探测

2505.16223v2

238

05-29

FSL-SAGE: Accelerating Federated Split Learning via Smashed Activation Gradient Estimation

FSL-SAGE: Beschleunigung des Federated Split Learning durch Smashed Activation Gradient Abschätzung

FSL-SAGE:通过分散的激励加速渐进式估算,加速联邦分化学习

2505.23182v1

239

05-29

FreRA: A Frequency-Refined Augmentation for Contrastive Learning on Time Series Classification

FreRA: Eine frequenzrefinierte Augmentation für kontrastives Lernen in der Zeitreihenklassifikation

FreRA:关于时间序列分类的校对性学习频率改进

2505.23181v1

240

05-29

The Panaceas for Improving Low-Rank Decomposition in Communication-Efficient Federated Learning

Die Panaceas zur Verbesserung der Zersetzung mit geringem Rank im kommunikativ-effizienten Federated Learning

改善通信-高效联邦学习中低-兰克分解的全景

2505.23176v1

241

05-29

Contrastive Learning and Abstract Concepts: The Case of Natural Numbers

Kontrastives Lernen und abstrakte Konzepte: Der Fall natürlicher Zahlen

差异学习和抽象概念:自然数字案例

2408.02247v6

242

05-29

Pseudo Multi-Source Domain Generalization: Bridging the Gap Between Single and Multi-Source Domain Generalization

Pseudo-Multi-Source-Domain-Verallgemeinerung: Die Lücke zwischen Single- und Multi-Source-Domain-Verallgemeinerung überbrücken

Pseudo多源多源通用化:缩小单一源和多源通用化之间的差距

2505.23173v1

243

05-29

Global Tensor Motion Planning

Globale Tensor-Bewegungsplanung

全球时势规划

2411.19393v3

244

05-29

Pre-training for Recommendation Unlearning

Vorschulung für Empfehlung Unlearning

建议培训前培训

2505.22649v2

245

05-29

Best Arm Identification with Possibly Biased Offline Data

Best Arm Identification mit möglicherweise Biased Offline Daten

最佳武器标识(可能附带的离线数据)

2505.23165v1

246

05-29

Temporal Relation Extraction in Clinical Texts: A Span-based Graph Transformer Approach

Temporale Beziehungsextraktion in klinischen Texten: Ein Span-basierter Graph Transformer-Ansatz

临床文本中的时间关系抽取时间关系:基于泛泛面的图形变形器方法

2503.18085v2

247

05-29

Implicit Inversion turns CLIP into a Decoder

Implizite Inversion macht CLIP zu einem Decoder

隐隐性 Indicide Inversion 将 CLIP 转换为解码器

2505.23161v1

248

05-29

Topological Adaptive Least Mean Squares Algorithms over Simplicial Complexes

Topologische Adaptive Least Mean Squares Algorithmen über Simplicial Complexes

单纯复形上的拓扑自适应最小均方算法

2505.23160v1

249

05-29

Privacy-Aware Joint DNN Model Deployment and Partitioning Optimization for Collaborative Edge Inference Services

Privacy-Aware Joint DNN Model Bereitstellung und Partitionierung Optimierung für kollaborative Edge Inferenz Services

DNN 联合DNN 合作边缘推断服务示范部署和分离优化优化模式

2502.16091v3

250

05-29

Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners

Größer, regularisiert, kategorisch: High-Kapacity-Wert-Funktionen sind effiziente Multi-Task-Lerner

大型、正规、分类:高能力价值功能是高效多任务学习者

2505.23150v1

251

05-29

FlowAlign: Trajectory-Regularized, Inversion-Free Flow-based Image Editing

FlowAlign: Trajektorie-regularisierte, inversionsfreie Fluss-basierte Bildbearbeitung

流动对等: 轨迹- 重新分类、转换- 无流动图像编辑

2505.23145v1

252

05-29

OmniArch: Building Foundation Model For Scientific Computing

OmniArch: Building Foundation Model for Scientific Computing

OmniArch:建筑基金会科学计算模型

2402.16014v3

253

05-29

Policy Filtration for RLHF to Mitigate Noise in Reward Models

Policy-Filtration für RLHF zur Minderung von Rauschen in Belohnungsmodellen

将RLHF政策归类为奖励模型中最小噪音的政策

2409.06957v4

254

05-29

Learning to Reason under Off-Policy Guidance

Unter außerpolitischer Anleitung zur Vernunft lernen

根据非政策指导学习理由

2504.14945v4

255

05-29

VERINA: Benchmarking Verifiable Code Generation

VERINA: Benchmarking der überprüfbaren Code-Generierung

VERINA:可核实代码生成基准

2505.23135v1

256

05-29

DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs

DOPPLER: Dual-Policy-Lernen für die Gerätezuordnung in asynchronen Datenflussgraphen

DOPPLER: 同步数据流图表中设备分配的双政策学习

2505.23131v1

257

05-29

Developing Cryptocurrency Trading Strategy Based on Autoencoder-CNN-GANs Algorithms

Entwicklung einer Cryptowährungs-Handelsstrategie auf der Grundlage von Autoencoder-CNN-GAN-Algorithmen

制定基于自动编码器-CNN-GANs算法的加密货币交易战略

2412.18202v5

258

05-29

Surrogate-Assisted Evolutionary Reinforcement Learning Based on Autoencoder and Hyperbolic Neural Network

Surrogate-Assisted Evolutionary Verstärkung Lernen auf der Grundlage von Autoencoder und Hyperbolic Neural Network

基于自动编码器和双曲神经网络的代用辅助辅助进化辅助进化强化学习

2505.19423v2

259

05-29

Learning to Incentivize in Repeated Principal-Agent Problems with Adversarial Agent Arrivals

Lernen, in wiederholten Hauptagenten-Problemen mit Adversarial Agent Ankunft zu fördern

学习鼓励与抵达时的对冲代理人员重复发生主要问题

2505.23124v1

260

05-29

BroadGen: A Framework for Generating Effective and Efficient Advertiser Broad Match Keyphrase Recommendations

BroadGen: Ein Framework zur Generierung effektiver und effizienter Advertiser Broad Match Keyphrase-Empfehlungen

BroadGen:一个产生有效和高效广告的高效和高效广告大匹配关键词句建议的框架

2505.19164v2

261

05-29

CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark

CASS: Nvidia zu AMD Transpilation mit Daten, Modellen und Benchmark

CASS: Nvidia 到AMD 传输数据、模型和基准

2505.16968v3

262

05-29

To Judge or not to Judge: Using LLM Judgements for Advertiser Keyphrase Relevance at eBay

Zu richten oder nicht zu richten: LLM-Richtungen für Werbetreibende Keyphrase Relevanz bei eBay verwenden

法官或非法官:在eBay使用LLM判决来作广告

2505.04209v2

263

05-29

Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking

Decom-Renorm-Merge: Modellzusammenführung auf dem richtigen Raum verbessert Multitasking

Decom-Renorm-Merge:正确空间的模型合并改进多重任务

2505.23117v1

264

05-29

Learning to Reason from Feedback at Test-Time

Von Feedback bei Test-Time zur Vernunft lernen

从测试时的反馈中学习到理由

2502.15771v2

265

05-29

CrossLinear: Plug-and-Play Cross-Correlation Embedding for Time Series Forecasting with Exogenous Variables

CrossLinear: Plug-and-Play-Cross-Korrelation für Zeitreihenvorhersage mit exogenen Variablen einbetten

CrossLinear: 用外源变量预测时间序列的插件和插件交叉校正嵌入

2505.23116v1

266

05-29

Instance-dependent Convergence Theory for Diffusion Models

Instanz-abhängige Konvergenztheorie für Diffusionsmodelle

扩散模型集成模型理论

2410.13738v2

267

05-29

FutureGen: LLM-RAG Approach to Generate the Future Work of Scientific Article

FutureGen: LLM-RAG Ansatz zur Generierung der zukünftigen Arbeit des wissenschaftlichen Artikels

FutureGen:LLM-RAG 产生科学条款未来工作的方法

2503.16561v2

268

05-29

Neural Interpretable PDEs: Harmonizing Fourier Insights with Attention for Scalable and Interpretable Physics Discovery

Neural Interpretable PDEs: Harmonisierung Fourier Insights mit Aufmerksamkeit für skalierbare und Interpretierbare Physik Discovery

神经可解释的PDEs:协调Fourier Insights,注意可缩放和可解释的物理发现

2505.23106v1

269

05-29

LUMION: Fast Fault Recovery for ML Jobs Using Programmable Optical Fabrics

LUMION: Schnelle Fehlerwiederherstellung für ML-Jobs mit programmierbaren optischen Stoffen

LUMION: 使用可编程光学制造器快速回收 ML 工作

2505.23105v1

270

05-29

Approximate Thompson Sampling for Learning Linear Quadratic Regulators with $O(\sqrt{T})$ Regret

Ungefähre Thompson-Probenahme für das Lernen linearer quadratischer Regulatoren mit $O(\sqrt{T})$ Bedauern

学习线性二次调节器的近似 Thompson 抽样，遗憾为 $O(\sqrt{T})$

2405.19380v2

271

05-29

Weight Spectra Induced Efficient Model Adaptation

Gewicht Spectra Induzierte effiziente Modellanpassung

引导有效模型适应

2505.23099v1

272

05-29

Learning to Search for Vehicle Routing with Multiple Time Windows

Lernen, nach Fahrzeug Routing mit mehreren Zeitfenstern zu suchen

学习搜索多时间窗口运行的车辆

2505.23098v1

273

05-29

Stochastic Diffusion: A Diffusion Based Model for Stochastic Time Series Forecasting

Stochastische Diffusion: Ein diffusionsbasiertes Modell für stochastische Zeitreihenprognose

斯托卡扩散:以传播为基础的斯托卡时间序列预测模型

2406.02827v2

274

05-29

Constraints and Variables Reduction for Optimal Power Flow Using Hierarchical Graph Neural Networks with Virtual Node-Splitting

Einschränkungen und Variablen-Reduktion für optimalen Stromfluss mittels Hierarchischer Graphen-Neural-Netzwerke mit virtuellem Knoten-Splitting

利用具有虚拟节点切除功能的等级形图形神经网络减少最佳电力流动的制约因素和变数

2411.06268v2

275

05-29

MAP: Revisiting Weight Decomposition for Low-Rank Adaptation

MAP: Wiederbesuchen der Gewichtszerlegung für Low-Rank-Anpassung

MAP: 重新审视低浓度适应的重量分解

2505.23094v1

276

05-29

Equivariant Spherical Transformer for Efficient Molecular Modeling

Equivarianter Spherical Transformer für effiziente molekulare Modellierung

高效分子建模的等同球质变变变器

2505.23086v1

277

05-29

Gradient Boosting Decision Tree with LSTM for Investment Prediction

Gradienten Auftrieb Entscheidungsbaum mit LSTM für Investitionsvorhersage

与 LSTM 一起逐步促进投资预测决策树

2505.23084v1

278

05-29

Gradient Methods with Online Scaling Part I. Theoretical Foundations

Gradient Methoden mit Online-Skalierung Teil I. Theoretische Grundlagen

在线扩展第一部分的渐进方法理论基础

2505.23081v1

279

05-29

Second Opinion Matters: Towards Adaptive Clinical AI via the Consensus of Expert Model Ensemble

Zweite Meinungsfrage: Auf dem Weg zu adaptiver klinischer KI über den Konsens des Expert Model Ensembles

第二意见事项:通过专家示范组共识实现适应性临床AI

2505.23075v1

280

05-29

Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts

Shortcut-verbundene Experten-Parallelität für die Beschleunigung von Mixture-of-Experts

加速混合专家专家专家平行专家

2404.05019v3

281

05-29

Multi-Modal Learning with Bayesian-Oriented Gradient Calibration

Multi-Modal-Lernen mit Bayesian-Oriented Gradient Calibration

多模式学习,以巴耶斯为主的梯度校准

2505.23071v1

282

05-29

Sparse Linear Bandits with Blocking Constraints

Sparse Linear Bandits mit Blockierung Einschränkungen

带有阻塞限制的粗细线条强力

2410.20041v2

283

05-29

GrokFormer: Graph Fourier Kolmogorov-Arnold Transformers

GrokFormer: Graph Fourier Kolmogorov-Arnold Transformer

GrokFormer:图示 Fourier Kolmogorov-Arnold变形器

2411.17296v3

284

05-29

Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling

Skalierung von Liquid-Resistance-Liquid-Capacitance-Netzwerken für eine effiziente Sequenzmodellierung

增强增强流动性恢复力的流动性能力网络,以建立高效序列建模

2505.21717v2

285

05-29

SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models

SORSA: Singuläre Werte und Orthonormale Regularisierte Singuläre Vektoren Anpassung großer Sprachmodelle

SORSA: 单项价值和正正正的正规化的单项矢量,以适应大语言模式

2409.00055v6

286

05-29

M3Bench: Benchmarking Whole-body Motion Generation for Mobile Manipulation in 3D Scenes

M3Bench: Benchmarking Ganzkörper-Bewegungs-Generation für mobile Manipulation in 3D-Szenen

M3Bench:3D场景移动操纵基准全体运动生成

2410.06678v3

287

05-29

Topological Structure Learning Should Be A Research Priority for LLM-Based Multi-Agent Systems

Topologisches Strukturlernen sollte eine Forschungspriorität für LLM-basierte Multi-Agent-Systeme sein

地形结构学习应成为以LLM为基础的多种机构系统的研究重点

2505.22467v2

288

05-29

Efficient Quantum Approximate $k$NN Algorithm via Granular-Ball Computing

Effiziente Quanten Ungefähre $k$NN-Algorithmus über Granular-Ball Computing

通过颗粒球式计算机计算, 近于 $k$NN 的高效量量量

2505.23066v1

289

05-29

Machine Learning Framework for Characterizing Processing-Structure Relationship in Block Copolymer Thin Films

Machine Learning Framework zur Charakterisierung von Verarbeitungs-Struktur-Beziehungen in Block Copolymer Thin Films

确定胶合聚合薄薄膜加工-结构关系特征的机械学习框架

2505.23064v1

290

05-29

Loss-Guided Model Sharing and Local Learning Correction in Decentralized Federated Learning for Crop Disease Classification

Loss-Guided Model Sharing und lokale Lernkorrektur bei dezentralisiertem Federated Learning für die Klassifizierung von Pflanzenkrankheiten

关于作物疾病分类的分散化联邦学习中损失指导模式共享和地方学习校正

2505.23063v1

291

05-29

Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data

Composite Flow passend zum Verstärkungslernen mit Shifted-Dynamics-Daten

与上下动动量数据匹配的强化学习综合流程

2505.23062v1

292

05-29

Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design

Spekulative Dekodierung trifft auf Quantisierung: Kompatibilitätsbewertung und Hierarchisches Framework Design

投机性下限符合量化:兼容性评价和等级框架设计

2505.22179v2

293

05-29

DINGO: Constrained Inference for Diffusion LLMs

DINGO: Beschränkte Schlussfolgerung für Diffusion LLMs

DINGO: 扩散 LLM 的约束推理

2505.23061v1

294

05-29

Improved Last-Iterate Convergence of Shuffling Gradient Methods for Nonsmooth Convex Optimization

Verbesserte Last-Iterate-Konvergenz von Shuffling-Gradientenmethoden für nichtglatte konvexe Optimierung

非光滑凸优化中洗牌梯度方法的最后迭代收敛性改进

2505.23056v1

295

05-29

CDR-Agent: Intelligent Selection and Execution of Clinical Decision Rules Using Large Language Model Agents

CDR-Agent: Intelligente Auswahl und Durchführung klinischer Entscheidungsregeln unter Verwendung von Large Language Model Agents

CDR-代理:明智选择和执行使用大语言示范物剂的临床决定规则

2505.23055v1

296

05-29

Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network

Lernen von suboptimalen Daten in der kontinuierlichen Kontrolle über Auto-Regressive Soft Q-Network

通过自动递减软软QNetwork, 从连续控制中的亚最佳数据中学习

2502.00288v2

297

05-29

DenoiseRotator: Enhance Pruning Robustness for LLMs via Importance Concentration

DenoiseRotator: Verbesserung der Beschneidungsfestigkeit für LLMs durch Bedeutungskonzentration

DenoiseRotator:通过重视浓度提高LLMs的稳健力

2505.23049v1

298

05-29

ProDiff: Prototype-Guided Diffusion for Minimal Information Trajectory Imputation

ProDiff: Prototypen-geführte Diffusion für minimale Information Trajektorie Imputation

ProDiff: 用于最小信息轨迹截肢的原型类型辅助扩散

2505.23048v1

299

05-29

Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping

Nicht konvexe stochastische Optimierung unter schwerfälligen Geräuschen: Optimale Konvergenz ohne gradientes Clipping

在重困噪音下非convex 斯托卡优化: 没有梯度缩放的最佳趋同

2412.19529v4

300

05-29

From Theory to Application: Fine-Tuning Large EEG Model with Real-World Stress Data

Von der Theorie zur Anwendung: Feintuning-Großes EEG-Modell mit realen Stressdaten

从理论到应用:使用现实世界应激数据精美应用大型电子EEG模型

2505.23042v1

301

05-29

TINED: GNNs-to-MLPs by Teacher Injection and Dirichlet Energy Distillation

TINED: GNNs-to-MLPs von Lehrerinjektion und Dirichlet Energy Destillation

TINED:通过教师注射和稀释能源蒸馏,将GNNs改为MLP

2412.11180v3

302

05-29

One Model for One Graph: A New Perspective for Pretraining with Cross-domain Graphs

Ein Modell für einen Graphen: Eine neue Perspektive für das Pretraining mit domänenübergreifenden Graphen

一图一模型:带有跨领域图的训练前新视角

2412.00315v2

303

05-29

Cross-modal RAG: Sub-dimensional Retrieval-Augmented Text-to-Image Generation

Cross-modal RAG: Sub-dimensionale Retrieval-Augmented Text-to-Image Generation

跨模式RAG:次二维检索增强的文本到图像生成

2505.21956v2

304

05-29

Case-Based Reasoning Enhances the Predictive Power of LLMs in Drug-Drug Interaction

Case-Based Reasoning verbessert die Vorhersagekraft von LLMs bei Arzneimittelwechselwirkungen

以个案为依据的理由加强药物-药物相互作用LLMs的预测能力

2505.23034v1

305

05-29

Exploring the Limitations of Mamba in COPY and CoT Reasoning

Erforschung der Grenzen von Mamba in COPY und CoT Reasoning

探索COPY和COT理由解释中Mamba的局限性

2410.03810v3

306

05-29

AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge

AntiLeakBench: Datenkontamination durch automatisches Konstruieren von Benchmarks mit aktualisiertem Real-World-Wissen verhindern

防止泄漏:利用最新现实世界知识自动建立基准,防止数据污染

2412.13670v2

307

05-29

Bayesian Neural Scaling Laws Extrapolation with Prior-Fitted Networks

Bayesische Neural Scaling-Gesetze Extrapolation mit vormontierten Netzwerken

Bayesian神经扩增法与事先确定网络的外推法

2505.23032v1

308

05-29

Diverse Prototypical Ensembles Improve Robustness to Subpopulation Shift

Unterschiedliche prototypische Ensembles verbessern die Robustheit der Subpopulationsverschiebung

多样化原型集成提高对子群体偏移的鲁棒性

2505.23027v1

309

05-29

Graph Wave Networks

Graphische Wellennetze

图波网络

2505.20034v2

310

05-29

Offline Learning for Combinatorial Multi-armed Bandits

Offline-Lernen für kombinatorische Multi-Armed Bandits

组合多臂老虎机的离线学习

2501.19300v2

311

05-29

An Empirical Study of Federated Prompt Learning for Vision Language Model

Eine empirische Studie über Federated Prompt Learning for Vision Language Model

联邦快速学习促进愿景语言模式经验研究

2505.23024v1

312

05-29

GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning

GuardAgent: LLM-Agenten durch einen Guard Agent durch wissensgestützte Vernunft schützen

GuardAgent:通过知识赋能推理,由守卫智能体保护LLM智能体

2406.09187v3

313

05-29

SCORPIO: Serving the Right Requests at the Right Time for Heterogeneous SLOs in LLM Inference

SCORPIO: Den richtigen Anfragen zur richtigen Zeit für heterogene SLOs in LLM-Schlussfolgerung dienen

SCORPIO: 在LLM推理中针对异构SLO在正确的时间服务正确的请求

2505.23022v1

314

05-29

SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models

SciHorizon: Benchmarking von KI-für-Science Readiness von wissenschaftlichen Daten zu großen Sprachmodellen

SciHorizon:将AI-SciHorizon科学准备程度从科学数据基准确定为大语言模式

2503.13503v3

315

05-29

BECAME: BayEsian Continual Learning with Adaptive Model MErging

BECAME: BayEsian Continual Learning mit adaptivem Modell-Merging

BECAME: 采用适应性示范招生模型的巴伊连续学习

2504.02666v2

316

05-29

$K^2$VAE: A Koopman-Kalman Enhanced Variational AutoEncoder for Probabilistic Time Series Forecasting

$K^2$VAE: Ein Koopman-Kalman-Verbesserter Variations-AutoEncoder für probabilistische Zeitreihenprognosen

$K^2$VAE: 用于概率时间序列预测的Koopman-Kalman增强变分自编码器

2505.23017v1

317

05-29

Hyperbolic-PDE GNN: Spectral Graph Neural Networks in the Perspective of A System of Hyperbolic Partial Differential Equations

Hyperbolic-PDE GNN: Spektral Graph Neural Networks in the Perspective of A System of Hyperbolic Partial Differential Equations

Hyperbolic-PDE GNN: 双曲偏微分方程组视角下的谱图神经网络

2505.23014v1

318

05-29

SplitLoRA: Balancing Stability and Plasticity in Continual Learning Through Gradient Space Splitting

SplitLoRA: Balance Stabilität und Plastizität im kontinuierlichen Lernen durch gradienten Raum Splitting

Split LoRA:通过逐步空间分割在持续学习中平衡稳定和可塑性

2505.22370v2

319

05-29

Scalable Complexity Control Facilitates Reasoning Ability of LLMs

Skalierbare Komplexitätskontrolle fördert die Reasoning-Fähigkeit von LLMs

可扩展的复杂度控制提升LLM的推理能力

2505.23013v1

320

05-29

BA-LoRA: Bias-Alleviating Low-Rank Adaptation to Mitigate Catastrophic Inheritance in Large Language Models

BA-LoRA: Bias-Alleviating Low-Rank Anpassung an Mitigate Katastrophische Vererbung in großen Sprachmodellen

BA-LORA:在大语言模型中,对减轻灾害传承的低率适应

2408.04556v5

321

05-29

EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge

EmergentTTS-Eval: Bewertung von TTS-Modellen auf komplexe Prosodic, Expressivität und sprachliche Herausforderungen mit Model-as-a-Judge

新兴TTS-Eval:利用 “ 模拟即审法官 “ 评估关于复杂立案、表达性和语言挑战的TTS模型

2505.23009v1

322

05-29

QLIP: A Dynamic Quadtree Vision Prior Enhances MLLM Performance Without Retraining

QLIP: Eine dynamische Quadtree Vision verbessert die MLLM-Performance ohne Umschulung

QLIP: 动态的四方愿景,事先提高MLLM业绩,不再培训

2505.23004v1

323

05-29

Universal Sequence Preconditioning

Universelle Sequenz Vorkonditionierung

通用序列序序预设

2502.06545v2

324

05-29

Hybrid Cross-domain Robust Reinforcement Learning

Hybrides Cross-Domain Robustes Verstärkungslernen

跨部门加强强化学习

2505.23003v1

325

05-29

Improved and Oracle-Efficient Online $\ell_1$-Multicalibration

Verbesserte und Oracle-Effizient Online $\ell_1$-Multikalibrierung

改进且Oracle高效的在线 $\ell_1$-多重校准

2505.17365v2

326

05-29

Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning

Dolphin: Ein programmierbares Framework für skalierbares neurosymbolisches Lernen

Dolphin: 可缩放的神经元学习程序框架

2410.03348v4

327

05-29

A Bayesian Model Selection Criterion for Selecting Pretraining Checkpoints

Ein Bayesian Modellauswahl-Kriterium für die Auswahl von Vortrainings-Checkpoints

选择培训前检查站的巴伊西亚示范甄选标准标准

2410.05612v2

328

05-29

HydraNet: Momentum-Driven State Space Duality for Multi-Granularity Tennis Tournaments Analysis

HydraNet: Momentum-getriebene State Space-Dualität für Multi-Granularity-Tennisturniere Analyse

HydraNet: 面向多粒度网球赛事分析的动量驱动状态空间对偶

2505.21882v2

329

05-29

Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment

Jenseits der Belohnung Hacking: Kausale Belohnungen für großsprachige Modellausrichtung

优胜后加分:大语言模型对齐的因果奖励

2501.09620v2

330

05-29

ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

ReinFlow: Feinsteuerungs-Flow Matching-Politik mit Online-Verstärkungs-Lernen

ReinFlow: 与在线强化学习匹配流动政策的微调

2505.22094v2

331

05-29

Is Attention Required for Transformer Inference? Explore Function-preserving Attention Replacement

Ist Achtung für Transformer-Inferenz erforderlich? Erkunden Sie Funktionserhaltende Aufmerksamkeitsersatz

需要注意吗? 探索功能保持注意替换

2505.21535v2

332

05-29

LLM Agents for Bargaining with Utility-based Feedback

LLM-Agenten für Verhandlungen mit nutzenbasiertem Feedback

LLM 与基于利用的反馈进行交涉的代理代理

2505.22998v1

333

05-29

Theoretical Foundations of the Deep Copula Classifier: A Generative Approach to Modeling Dependent Features

Theoretische Grundlagen des Deep Copula Klassifikators: Ein generativer Ansatz zur Modellierung abhängiger Merkmale

深度Copula分类器的理论基础:一种对相依特征建模的生成式方法

2505.22997v1

334

05-29

Walking the Weight Manifold: a Topological Approach to Conditioning Inspired by Neuromodulation

Wiege manifold gehen: ein topologischer Ansatz zur Konditionierung Inspiriert durch Neuromodulation

身穿轻重背重力:在神经调节的启发下,从地形学角度处理条件问题

2505.22994v1

335

05-29

Number of Clusters in a Dataset: A Regularized K-means Approach

Anzahl der Cluster in einem Datensatz: Ein regularisierter K-Mittelansatz

数据集中的组群数量:正规化的K手段方法

2505.22991v1

336

05-29

MenTeR: A fully-automated Multi-agenT workflow for end-to-end RF/Analog Circuits Netlist Design

MenTeR: Ein vollautomatisierter Multi-AgenT-Workflow für End-to-End-RF/Analog-Schaltungen Netlist Design

MenTeR: 终端至终端RF/Analog 电路网络列表设计全自动多元T工作流程

2505.22990v1

337

05-29

Effects of Dropout on Performance in Long-range Graph Learning Tasks

Auswirkungen des Dropouts auf die Leistungsfähigkeit in großflächigen Graphen-Lernaufgaben

辍学对远程图表学习任务绩效的影响

2502.07364v2

338

05-29

Model-Preserving Adaptive Rounding

Modellschonende adaptive Rundung

模型保护适应性四舍五入

2505.22988v1

339

05-29

Knowledge Distillation for Reservoir-based Classifier: Human Activity Recognition

Wissensdestillation für Reservoir-basierte Klassifikator: Menschliche Aktivitätserkennung

以储量为基础的分类法知识蒸馏:人类活动认识

2505.22985v1

340

05-29

A Computational Approach to Improving Fairness in K-means Clustering

Ein Computational Approach zur Verbesserung der Fairness im K-Mittel-Clustering

改进K类手段分类组合的公平性计算方法

2505.22984v1

341

05-29

MedRAX: Medical Reasoning Agent for Chest X-ray

MedRAX: Medizinischer Reasoning Agent für Bruströntgen

MedRAX: 胸部X光医学推理智能体

2502.02673v2

342

05-29

Theoretical guarantees on the best-of-n alignment policy

Theoretische Garantien für die optimale Ausrichtungspolitik

关于最佳协调政策理论保障

2401.01879v3

343

05-29

Learning coordinated badminton skills for legged manipulators

Koordinierte Badminton-Fähigkeiten für Legged Manipulatoren lernen

为腿脚操纵者学习协调的羽毛球技能

2505.22974v1

344

05-29

EquiReg: Equivariance Regularized Diffusion for Inverse Problems

EquiReg: Äquivarianz Regularisierte Diffusion für Inverse Probleme

equireg: 用于反向问题的公平、正规化传播

2505.22973v1

345

05-29

Minimal Sufficient Views: A DNN model making predictions with more evidence has higher accuracy

Minimal Ausreichende Ansichten: Ein DNN-Modell, das Vorhersagen mit mehr Beweisen macht, hat höhere Genauigkeit

最低限度的充分意见:一个DNN模型,用更多证据作出预测,其准确性更高

2402.01095v2

346

05-29

MermaidFlow: Redefining Agentic Workflow Generation via Safety-Constrained Evolutionary Programming

MermaidFlow: Neudefinition der agentischen Workflow-Generierung durch sicherheitsbeschränkte evolutionäre Programmierung

美人鱼:通过受安全限制的进化方案拟订,重新确定干燥性工作流的产生

2505.22967v1

347

05-29

Exploring Scaling Laws for EHR Foundation Models

Erforschung von Skalierungsgesetzen für EHR-Stiftungsmodelle

探索EHR基金会模式的扩展法律

2505.22964v1

348

05-29

INRFlow: Flow Matching for INRs in Ambient Space

INRFlow: Flow Passend für INRs im Umgebungsraum

INRFlow: 环境空间中INR的流匹配

2412.03791v2

349

05-29

ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind

ToMAP: Training Gegner-Bewusst LLM überzeugt mit Theorie des Geistes

ToMAP:培训有思想理论的对抗者软件软件LLM

2505.22961v1

350

05-29

Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness

Multi-Agenten-Debatte als Test-Time Scaling: Eine systematische Studie der bedingten Wirksamkeit

重新审议作为试验时间尺度的多机构辩论:对有条件有效性的系统研究

2505.22960v1

351

05-29

Unveiling Environmental Impacts of Large Language Model Serving: A Functional Unit View

Enthüllen von Umweltauswirkungen von großsprachigen Modellen: Eine funktionale Einheitsansicht

大型语文服务模式的不懈环境影响:职能单位观点

2502.11256v2

352

05-29

CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance

CodeSteer: Symbolisch-Augmentierte Sprachmodelle über Code/Text Anleitung

代码器:通过编码/文本指导的代码/文本指导的代码器:代号辅助语言模式

2502.04350v2

353

05-29

Understanding Bias Reinforcement in LLM Agents Debate

Verständnis der Bias-Verstärkung in LLM-Agenten-Debatte

理解LLM智能体辩论中的偏见强化

2503.16814v2

354

05-29

Performance Guaranteed Poisoning Attacks in Federated Learning: A Sliding Mode Approach

Leistungsgarantie Vergiftung Angriffe im Föderierten Lernen: Ein Schiebemodus Ansatz

联邦学习中保证中毒袭击的绩效:一种脱落模式方法

2505.16403v2

355

05-29

CellFlux: Simulating Cellular Morphology Changes via Flow Matching

CellFlux: simulierende zelluläre Morphologie-Änderungen durch Flow Matching

细胞通量:通过流动匹配模拟细胞生理变化

2502.09775v2

356

05-29

Directed Graph Grammars for Sequence-based Learning

Gezielte Graphen-Grammatik für sequenzbasiertes Lernen

以序列为基础的学习方向图表语法

2505.22949v1

357

05-28 (3)

NegVQA: Can Vision Language Models Understand Negation?

NegVQA: Können Visions-Sprachmodelle Negation verstehen?

NegVQA:视觉语言模型能理解否定吗?

2505.22946v1

358

05-28

Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates

Können LLMs CLIP täuschen? Benchmarking der adversarialen Kompositionalität vortrainierter multimodaler Repräsentationen über Textaktualisierungen

LLMs CLIP能否通过文本更新确定培训前多模式代表的反向构成基准?

2505.22943v1

359

05-28

Are Domain Generalization Benchmarks with Accuracy on the Line Misspecified?

Sind Domain Generalization Benchmarks mit Genauigkeit auf der Zeile falsch angegeben?

域通用基准与误标线的准确性是否一致?

2504.00186v2

360

05-28

Generative Social Choice: The Next Generation

Generative soziale Wahl: Die nächste Generation

产生社会选择:下一代

2505.22939v1

361

05-28

Is Noise Conditioning Necessary? A Unified Theory of Unconditional Graph Diffusion Models

Ist die Lärmkonditionierung notwendig? Eine einheitliche Theorie der Bedingungslosen Graphen-Diffusionsmodelle

是否有必要设定噪音条件? 无条件图形扩散模型的统一理论

2505.22935v1

362

05-28

Unraveling LoRA Interference: Orthogonal Subspaces for Robust Model Merging

Unraveling LoRA Interferenz: Orthogonale Subräume für robuste Modellzusammenführung

开放 LoRA 干涉度: 用于强力模型合并的正弦形子空间

2505.22934v1

363

05-28

K-Paths: Reasoning over Graph Paths for Drug Repurposing and Drug Interaction Prediction

K-Paths: Begründung über Graphenpfade für Drogenrepurposing und Drogeninteraktionsvorhersage

K-Paths: 以图解路径为依据进行药物再定位和药物相互作用预测

2502.13344v3

364

05-28

How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias

Wie Transformer lernen Regelmäßige Spracherkennung: Eine theoretische Studie über Trainingsdynamik und Implizite Bias

变换人如何学习常规语言识别:关于培训动态和隐含偏见的理论研究

2505.00926v3

365

05-28

Scalable Parameter and Memory Efficient Pretraining for LLM: Recent Algorithmic Advances and Benchmarking

Skalierbare Parameter und Speicher Effizientes Vortraining für LLM: Algorithmische Fortschritte und Benchmarking

LLM的可缩放参数和记忆高效预修培训:最近的演算进展和基准

2505.22922v1

366

05-28

Unlocking Mental Health: Exploring College Students’ Well-being through Smartphone Behaviors

Entsperren der psychischen Gesundheit: Erforschen des Wohlbefindens der Studenten durch Smartphone-Verhalten

解锁心理健康:通过智能手机行为探索大学生福祉

2502.08766v2

367

05-28

Enhancing Semi-supervised Learning with Zero-shot Pseudolabels

Halbbeaufsichtigtes Lernen mit Null-Shot-Pseudo-Labels verbessern

用零弹Pseudo标签加强半监督的学习

2502.12584v2

368

05-28

cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning

cadrille: Multimodale CAD-Rekonstruktion mit Online-Verstärkung

与在线强化学习相结合的多模式 CAD重建

2505.22914v1

369

05-28

Mustafar: Promoting Unstructured Sparsity for KV Cache Pruning in LLM Inference

Mustafar: Förderung unstrukturierter Sparsamkeit für KV Cache Pruning in LLM Inferenz

Mustafar:在LLM推理中促进KV Cache Pruning的无结构平衡

2505.22913v1

370

05-28

GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation

GraphEval: Ein leichter Graph-basierter LLM-Rahmen für die Idee-Evaluierung

图图Eval:基于轻量图图的理论评估LLM框架

2503.12600v2

371

05-28

Ensuring User-side Fairness in Dynamic Recommender Systems

Gewährleistung der benutzerseitigen Fairness in dynamischen Recommender-Systemen

确保动态建议系统在用户方面的公平公正

2308.15651v3

372

05-28

SP2RINT: Spatially-Decoupled Physics-Inspired Progressive Inverse Optimization for Scalable, PDE-Constrained Meta-Optical Neural Network Training

SP2RINT: Spatially-Decoupled Physics-Inspired Progressive Inverse Optimization für skalierbare, PDE-Constrained Meta-Optical Neural Network Training

SP2RINT: 空间-减速物理激励-渐进式反向优化,用于可缩放、PDE-受培训的元神经网络培训

2505.18377v2

373

05-28

Defining Foundation Models for Computational Science: A Call for Clarity and Rigor

Foundation-Modelle für Computational Science definieren: Ein Aufruf zu Klarheit und Stringenz

界定计算科学基础模型:要求明确和严格

2505.22904v1

374

05-28

Norm-Bounded Low-Rank Adaptation

Normgebundene Low-Rank-Anpassung

范数有界低秩适应

2501.19050v3

375

05-28

On the Dynamic Regret of Following the Regularized Leader: Optimism with History Pruning

Zum dynamischen Bedauern, dem regularisierten Führer zu folgen: Optimismus mit Geschichtsveredelung

在追赶正规领导人之后的强烈遗憾:对历史的乐观态度

2505.22899v1

376

05-28

The Geometry of ReLU Networks through the ReLU Transition Graph

Die Geometrie von ReLU-Netzwerken durch den ReLU-Übergangsgraphen

通过 ReLU 过渡图绘制 ReLU 网络的几何图

2505.11692v2

377

05-28

Neural Networks as Universal Finite-State Machines: A Constructive Deterministic Finite Automaton Theory

Neurale Netzwerke als universelle Finite-State-Maschinen: Eine konstruktive Deterministische Finite-Automaten-Theorie

神经网络作为普遍有限国家机器:具有建设性决定作用的有限自定义理论

2505.11694v2

378

05-28

A Combinatorial Theory of Dropout: Subnetworks, Graph Geometry, and Generalization

A Combinatorial Theory of Dropout: Subnetzwerke, Graphische Geometrie und Generalisierung

辍学综合理论:子网络、图形几何和一般化

2504.14762v2

379

05-28

Smart Surrogate Losses for Contextual Stochastic Linear Optimization with Robust Constraints

Intelligente Surrogatverluste für kontextuelle stochastische Linearoptimierung mit robusten Einschränkungen

具有强力限制的内幕斯托卡式线性优化的智能代谢损失

2505.22881v1

380

05-28

Signal attenuation enables scalable decentralized multi-agent reinforcement learning over networks

Signaldämpfung ermöglicht skalierbares dezentrales Multi-Agenten-Verstärkungslernen über Netzwerke

信号减速使可伸缩的分散式多试剂强化学习超越网络

2505.11461v2

381

05-28

CFP-Gen: Combinatorial Functional Protein Generation via Diffusion Language Models

CFP-Gen: Kombinatorische funktionelle Proteinerzeugung über Diffusions-Sprachenmodelle

CFP-Gen:通过传播语言模式生成混合功能性蛋白质

2505.22869v1

382

05-28

Multimodal Survival Modeling in the Age of Foundation Models

Multimodale Überlebensmodellierung im Zeitalter der Gründungsmodelle

基金会时代多模式生存模型

2505.07683v2

383

05-28

CrossNAS: A Cross-Layer Neural Architecture Search Framework for PIM Systems

CrossNAS: Ein Cross-Layer Neural Architecture Search Framework für PIM-Systeme

CrossNAS:PIM系统跨行业神经结构搜索框架

2505.22868v1

384

05-28

Scaling Offline RL via Efficient and Expressive Shortcut Models

Skalierung von Offline-RL über effiziente und Expressive Shortcut-Modelle

通过高效和直表达快捷键模式缩放离线 RL

2505.22866v1

385

05-28

Your Data, My Model: Learning Who Really Helps in Federated Learning

Ihre Daten, mein Modell: Lernen, die wirklich hilft beim Federated Learning

您的数据, 我的模型: 学习谁真正帮助联邦学习

2409.02064v3

386

05-28

Causal-PIK: Causality-based Physical Reasoning with a Physics-Informed Kernel

Causal-PIK: Kausalitätsbasierte Physical Reasoning mit einem physikinformierten Kernel

Causal-PIK: 基于因果关系的物理推理与物理信息核

2505.22861v1

387

05-28

Permissioned LLMs: Enforcing Access Control in Large Language Models

Zugelassene LLMs: Erzwingen der Zugriffskontrolle in großen Sprachmodellen

带权限的LLM:在大语言模型中实施访问控制

2505.22860v1

388

05-28

NGPU-LM: GPU-Accelerated N-Gram Language Model for Context-Biasing in Greedy ASR Decoding

NGPU-LM: GPU-beschleunigtes N-Gram-Sprachenmodell für Kontext-Biasing in Greedy ASR-Dekodierung

NGPU-LM: 加速GPU-加速型N-Gram语语模式,用于在贪婪ASR标记中进行背景切换

2505.22857v1

389

05-28

Leveraging Unlabeled Data Sharing through Kernel Function Approximation in Offline Reinforcement Learning

Nutzung von nicht gekennzeichneten Daten durch Kernel-Funktion Annäherung im Offline-Verstärkungs-Lernen

在离线强化学习中,通过 Kernel 函数相近接近的内核功能利用未贴标签的数据分享来利用无标签数据分享

2408.12307v3

390

05-28

Point Cloud Synthesis Using Inner Product Transforms

Punkt-Cloud-Synthese mit inneren Produkt-Transformationen

使用内产产品变换的点云合成

2410.18987v3

391

05-28

RocqStar: Leveraging Similarity-driven Retrieval and Agentic Systems for Rocq generation

RocqStar: Leveraging-ähnliche Retrieval- und Agentiksysteme für die Rocq-Generation

RocqStar:利用利用相似度驱动回收系统和干系统来生成Rocq

2505.22846v1

392

05-28

Entropy-regularized Gradient Estimators for Approximate Bayesian Inference

Entropie-regularisierte Gradienten-Estimatoren für ungefähre Bayesische Schlussfolgerung

用于近近贝耶斯推断的全天正规化梯度测算器

2503.11964v3

393

05-28

Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion

Jenseits der Permutationssymmetrie der Transformer: Die Rolle der Rotation für die Modellfusion

变异器超越变异对称:变动对模型融合的作用

2502.00264v2

394

05-28

Bayesian Attention Mechanism: A Probabilistic Framework for Positional Encoding and Context Length Extrapolation

Bayesian Attention Mechanism: Ein probabilistisches Framework für die Positionskodierung und Kontextlängen-Extrapolation

Bayesian注意机制:定位编码和背景长度外推概率框架

2505.22842v1

395

05-28

Kernel-Smoothed Scores for Denoising Diffusion: A Bias-Variance Study

Kernelgeglättete Punktzahlen für die Denoisierung der Diffusion: Eine Bias-Varianz-Studie

Disoising 扩散的内核悬浮分数:生物量变化研究

2505.22841v1

396

05-28

Development and Validation of SXI++ LNM Algorithm for Sepsis Prediction

Entwicklung und Validierung von SXI++ LNM-Algorithmus für Sepsis-Vorhersage

SXI++ LNM 脓毒症预测算法的开发与验证

2505.22840v1

397

05-28

How Do Diffusion Models Improve Adversarial Robustness?

Wie verbessern Diffusionsmodelle die widrige Robustheit?

传播模型如何改善反逆能力?

2505.22839v1

398

05-28

Bridging Distribution Shift and AI Safety: Conceptual and Methodological Synergies

Bridging Distribution Shift und KI-Sicherheit: Konzeptionelle und methodische Synergien

搭桥分配转变与AI安全:概念与方法的协同作用

2505.22829v1

399

05-28

PGLearn – An Open-Source Learning Toolkit for Optimal Power Flow

PGLearn – Ein Open-Source-Learning-Toolkit für optimalen Stromfluss

PGLearn – – 最佳电力流动开放源学习工具包

2505.22825v1

400

05-28

Comparing Human and AI Rater Effects Using the Many-Facet Rasch Model

Vergleich menschlicher und KI-Rater-Effekte mit dem Multi-Facet-Rasch-Modell

使用多面 Rasch 模型比较人类和AI Rater效应

2505.18486v2

401

05-28

Hybrid Disagreement-Diversity Active Learning for Bioacoustic Sound Event Detection

Hybride Disagreement-Diversity Aktives Lernen für die bioakustische Sound-Erkennung

生物声波声音事件探测发现活动积极学习

2505.20956v2

402

05-28

Scalable Differentially Private Bayesian Optimization

Skalierbare differenzierte private Bayesian-Optimierung

可扩展的差分隐私贝叶斯优化

2502.06044v2

403

05-28

When Collaborative Filtering is not Collaborative: Unfairness of PCA for Recommendations

Wenn Kollaborative Filterung nicht kollaborativ ist: Unfairness von PCA für Empfehlungen

当协同过滤并不协同时:PCA用于推荐的不公平性

2310.09687v2

404

05-28

Preference Learning with Response Time

Präferenz-Lernen mit Reaktionszeit

具有响应时间的优先学习

2505.22820v1

405

05-28

IMTS is Worth Time $\times$ Channel Patches: Visual Masked Autoencoders for Irregular Multivariate Time Series Prediction

IMTS ist Zeit wert $\times$ Channel Patches: Visual Masked Autoencoder für irreguläre Multivariate Time Series Prediction

IMTS值得用时间 $\times$ 通道补丁表示:用于不规则多变量时间序列预测的视觉掩码自编码器

2505.22815v1

406

05-28

Regression and Forecasting of U.S. Stock Returns Based on LSTM

Regression und Prognose von US-Aktienrenditen basierend auf LSTM

根据LSTM对美国库存收益的回归和预测

2502.05210v3

407

05-28

X-Factor: Quality Is a Dataset-Intrinsic Property

X-Factor: Qualität ist eine datensatzintrinsische Eigenschaft

X 要素: 质量是一个数据集 - Intrins 属性

2505.22813v1

408

05-28

Credit Risk Identification in Supply Chains Using Generative Adversarial Networks

Kreditrisikoidentifizierung in Lieferketten mit generativen Adversarial-Netzwerken

利用产生反逆网络的供应链中的信用风险识别

2501.10348v4

409

05-28

Highly Efficient and Effective LLMs with Multi-Boolean Architectures

Hocheffiziente und effektive LLMs mit Multi-Boolean-Architekturen

多Boolean建筑群高效益、高效益、高效益、高效益、高效益、高效益的LLMs

2505.22811v1

410

05-28

Distribution free M-estimation

Verteilungsfreie M-Schätzung

免费分发 M - 估计

2505.22807v1

411

05-28

Anomalies by Synthesis: Anomaly Detection using Generative Diffusion Models for Off-Road Navigation

Anomalien durch Synthese: Anomalieerkennung mit generativen Diffusionsmodellen für Off-Road-Navigation

合成反常现象:使用非轨道导航生成扩散模型进行异常检测

2505.22805v1

412

05-28

CLUE: Neural Networks Calibration via Learning Uncertainty-Error alignment

CLUE: Neurale Netzwerke Kalibrierung über Learning Uncertainty-Error Alignment

CLUE:通过学习不确定性-差错对齐校准神经网络

2505.22803v1

413

05-28

Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning

Instruct-SkillMix: Eine leistungsstarke Pipeline für LLM Instruction Tuning

指令- SkillMix: 用于LLM 指令导导图的强大管道

2408.14774v4

414

05-28

SequentialBreak: Large Language Models Can be Fooled by Embedding Jailbreak Prompts into Sequential Prompt Chains

SequentialBreak: Große Sprachmodelle können durch Einbetten von Jailbreak Prompts in Sequential Prompt Chains ausgeblendet werden

顺序式布雷克:大语言模型可以通过将破狱线索嵌入顺序式提示链来蒙骗大语言模型

2411.06426v3

415

05-28

Efficient Preimage Approximation for Neural Network Certification

Effiziente Preimage-Annäherung für die Neural Network Zertifizierung

神经网络认证的高效预感近似率

2505.22798v1

416

05-28

DeSocial: Blockchain-based Decentralized Social Networks

DeSocial: Dezentrale soziale Netzwerke auf Blockchain-Basis

DeSocial:基于区块链的去中心化社交网络

2505.21388v2

417

05-28

The Empirical Mean is Minimax Optimal for Local Glivenko-Cantelli

Das Empirische Mittel ist Minimax Optimal für lokale Glivenko-Cantelli

当地格利文科-坎泰利的经验中值为 Minimax 最佳当地格利文科-坎泰利

2410.02835v2

418

05-28

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

KVQuant: In Richtung 10 Millionen Kontextlänge LLM-Inferenz mit KV Cache-Quantisierung

KVQuant: 努力达到1000万个内长长LLM 与 KV 缓存量推论

2401.18079v6

419

05-28

Navigating the Latent Space Dynamics of Neural Models

Navigation der latenten Raumdynamik von Neuralmodellen

导航内壳模型的冷层空间动态

2505.22785v1

420

05-28

On the definition and importance of interpretability in scientific machine learning

Zur Definition und Bedeutung der Deutbarkeit im wissenschaftlichen maschinellen Lernen

关于科学机器学习中可解释性的定义和重要性

2505.13510v2

421

05-28

Adaptive Exploration for Multi-Reward Multi-Policy Evaluation

Adaptive Exploration für Multi-Reward Multi-Policy-Bewertung

多方奖励多政策评价的适应性探索

2502.02516v2

422

05-28

Temporal Convolutional Autoencoder for Interference Mitigation in FMCW Radar Altimeters

Temporal Convolutional Autoencoder für Interferenzmilderung in FMCW Radar Höhenmessern

FMCC 雷达测高仪中用于减少干扰干扰的时时变自动算器

2505.22783v1

423

05-28

Finite-Sample Convergence Bounds for Trust Region Policy Optimization in Mean-Field Games

Finite-Sample-Konvergenzgrenzen für die Optimierung der Treuhandregion-Politik in Mittelfeld-Spielen

平地运动会中信任区政策优化

2505.22781v1

424

05-28

Machine Learning Models Have a Supply Chain Problem

Modelle des maschinellen Lernens haben ein Problem mit der Lieferkette

机器学习模式有供应链问题

2505.22778v1

425

05-28

GraphNarrator: Generating Textual Explanations for Graph Neural Networks

GraphNarrator: Erzeugen von Texterklärungen für Graph Neuronale Netzwerke

图示记录器:生成图形神经网络的文字解释

2410.15268v2

426

05-28

The Value of Information in Human-AI Decision-making

Der Wert von Informationen in der Mensch-AI-Entscheidungsfindung

信息在人类-AI决策中的价值

2502.06152v4

427

05-28

Calibrated Value-Aware Model Learning with Stochastic Environment Models

Kalibriertes wertbewusstes Modelllernen mit stochastischen Umweltmodellen

使用存储环境模型校准价值软件模型学习

2505.22772v1

428

05-28

Multivariate de Bruijn Graphs: A Symbolic Graph Framework for Time Series Forecasting

Multivariate de Bruijn Graphen: Ein symbolisches Graphen-Framework für die Vorhersage von Zeitreihen

布鲁伊图多变量图:时间序列预测符号图框架

2505.22768v1

429

05-28

Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks

Degenerierung von Mess- und Regellösungen über aufgabenorientierte recurrente Neuralnetzwerke hinweg

跨任务技术经常性神经网络的退化

2410.03972v2

430

05-28

Test-time augmentation improves efficiency in conformal prediction

Testzeitvergrößerung verbessert die Effizienz in der konformen Vorhersage

提高试验时间的提高提高符合预测的效率

2505.22764v1

431

05-28

Generalizable Representation Learning for fMRI-based Neurological Disorder Identification

Generalisierbares Repräsentationslernen für die fMRI-basierte neurologische Störungserkennung

FMRI基于神经疾病识别的神经疾病学学习

2412.16197v2

432

05-28

MIAS-SAM: Medical Image Anomaly Segmentation without thresholding

MIAS-SAM: Medizinische Bildanomalie Segmentierung ohne Schwellenbildung

MIAS-SAM: 医学形象非典型分割,无阈值

2505.22762v1

433

05-28

Non-convex entropic mean-field optimization via Best Response flow

Nicht konvexe entropische Mittelfeld-Optimierung über Best Response Flow

通过最佳反应流程优化非convex 电子中位平均场

2505.22760v1

434

05-28

FlashFormer: Whole-Model Kernels for Efficient Low-Batch Inference

FlashFormer: Ganzmodell-Kernel für effiziente Low-Batch-Inferenz

FlashFormer: 用于高效低批量推断的全模块内核

2505.22758v1

435

05-28

Decomposing Elements of Problem Solving: What “Math” Does RL Teach?

Zersetzende Elemente der Problemlösung: Was “Math” lehrt RL?

问题解决的要素分解:强化学习教会了什么“数学”?

2505.22756v1

436

05-28

Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling

Darstellungsdynamiken von Diffusionsmodellen durch Low-Dimensional Modeling verstehen

通过低多样性建模理解通过低多样性建模传播模型的动态

2502.05743v2

437

05-28

VideoRAG: Retrieval-Augmented Generation over Video Corpus

VideoRAG: Retrieval-Augmented Generation über Video Corpus

VideoRAG: 基于视频语料库的检索增强生成

2501.05874v3

438

05-28

Self-orthogonalizing attractor neural networks emerging from the free energy principle

Selbst-orthogonalisierendes Attraktor-Neuralnetzwerk, das aus dem Prinzip der freien Energie entspringt

根据自由能源原则建立的自我调整的吸引人神经网络

2505.22749v1

439

05-28

An unsupervised method for MRI recovery: Deep image prior with structured sparsity

Eine unüberwachte Methode für die MRT-Wiederherstellung: Tiefenbild vor mit strukturierter Sparsamkeit

MRI 恢复的一种不受监督的方法: 结构宽度之前的深图像

2501.01482v3

440

05-28

StarBASE-GP: Biologically-Guided Automated Machine Learning for Genotype-to-Phenotype Association Analysis

StarBASE-GP: Biologisch geführtes automatisiertes maschinelles Lernen für die Analyse von Genotyp-zu-Phenotyp-Verbindungen

StarBASE-GP: 基因型至极型协会分析的生物辅助自动计算机学习

2505.22746v1

441

05-28

Information-Computation Gaps in Quantum Learning via Low-Degree Likelihood

Informations-Computation Lücken im Quanten-Lernen über Low-Degree Likelihood

通过低贫困风险学习的量子学习中的信息估计差距

2505.22743v1

442

05-28

Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing

Darstellung Shattering in Transformers: Synthetische Studie mit Wissensbearbeitung

在变形器中代表变形器:带有知识编辑的合成研究

2410.17194v4

443

05-28

AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

AutoL2S: Auto-Lang-Short-Reasoning für effiziente große Sprachmodelle

自动L2S:高效大语言模式的自动长期短期理由

2505.22662v1

444

05-28

3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model

3DLLM-Mem: Langzeit-Raum-Temporal-Speicher für körpereigenes 3D-Großsprachmodell

3DLLM-Mem:3D大语言模型内嵌成的3D大语言长期空间-时间记忆

2505.22657v1

445

05-28

Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents

Position: Ungewissheitsquantifizierung braucht eine Neubewertung für großsprachige Modellagenten

位置:大语言示范物剂的不确定性量化需求评估

2505.22655v1

446

05-28

Sherlock: Self-Correcting Reasoning in Vision-Language Models

Sherlock: Selbstkorrekte Vernunft in Vision-Sprachen-Modellen

夏洛克:视觉语言模型中的自我校正理由

2505.22651v1

447

05-28

On Learning Verifiers for Chain-of-Thought Reasoning

Über das Lernen von Prüfern für die Ketten-of-Thought-Reasoning

关于研究链理由的学习验证符

2505.22650v1

448

05-28

Private Rate-Constrained Optimization with Applications to Fair Learning

Private Rate-Constrained Optimization mit Anwendungen für faires Lernen

利用公平学习申请实现优化

2505.22703v1

449

05-28

Spectral Survival Analysis

Spektrale Überlebensanalyse

光谱生存分析

2505.22641v1

450

05-28

SimProcess: High Fidelity Simulation of Noisy ICS Physical Processes

SimProcess: Hohe Fidelity-Simulation von lärmigen ICS-Physischen Prozessen

SimProcess:含噪声ICS物理过程的高保真模拟

2505.22638v1

451

05-28

Understanding (Un)Reliability of Steering Vectors in Language Models

Verständnis (Un)Zuverlässigkeit von Steuerungsvektoren in Sprachmodellen

(un) 语言模式指导矢量的可靠性

2505.22637v1

452

05-28

Spatial Knowledge Graph-Guided Multimodal Synthesis

Raumwissen Graph-geführte multimodale Synthese

空间知识图表辅助多模式合成

2505.22633v1

453

05-28

GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks

GraphOmni: Ein umfassender und erweiterbarer Benchmark-Rahmen für große Sprachmodelle zu graphtheoretischen Aufgaben

图图Omni:图理学任务大语言模型综合和可扩展基准框架

2504.12764v3

454

05-28

SCIZOR: A Self-Supervised Approach to Data Curation for Large-Scale Imitation Learning

SCIZOR: Ein selbstüberwachter Ansatz zur Datenkuration für großflächiges Imitationslernen

SCIZOR: 大规模模拟学习数据计算法的自我监督办法

2505.22626v1

455

05-28

Principled Out-of-Distribution Generalization via Simplicity

Prinzipielle Nicht-Verteilung Verallgemeinerung über Einfachheit

通过简单化普遍化

2505.22622v1

456

05-28

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Der Entropie-Mechanismus des Verstärkten Lernens für sinnvolle Sprachmodelle

理由语言模式强化学习的全英机制

2505.22617v1

457

05-28

Bridging Supervised Learning and Reinforcement Learning in Math Reasoning

Bridging Supervised Learning und Verstärkung Lernen in Mathe-Reasoning

在数学原因方面的受监督学习和强化学习架桥

2505.18116v2

458

05-28

Fully Heteroscedastic Count Regression with Deep Double Poisson Networks

Vollständig heteroskedastische Zählregression mit tiefen Doppel-Poisson-Netzwerken

带有深双 Poisson 网络的全导流计数回归

2406.09262v4

459

05-28

Shielded Diffusion: Generating Novel and Diverse Images using Sparse Repellency

Abgeschirmte Diffusion: Erzeugen von neuen und vielfältigen Bildern mit Sparse Repellency

盾牌扩散:利用微缩生成新奇和多样化图像

2410.06025v3

460

05-28

Solving Inverse Problems with Deep Linear Neural Networks: Global Convergence Guarantees for Gradient Descent with Weight Decay

Inverse Probleme mit tiefen linearen neuralen Netzwerken lösen: Globale Konvergenzgarantien für gradienten Abstieg mit Gewichtsverfall

解决深线神经神经网络的反面问题:全球一致保障渐变后裔与体重衰减

2502.15522v2

461

05-28

Chest Disease Detection In X-Ray Images Using Deep Learning Classification Method

Brusterkrankungen Detektion in Röntgenbildern mit Deep Learning-Klassifikationsmethode

利用深学习分类方法在X射线图像中检测胸前疾病

2505.22609v1

462

05-28

AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling

AutoElicit: Mit großen Sprachmodellen für vorausschauende Modellierung von Expertenvoraussagen

AutoElicit:在预测建模中使用大语言模型进行专家先验获取

2411.17284v5

463

05-28

One Rank at a Time: Cascading Error Dynamics in Sequential Learning

Ein Rang zu einer Zeit: Cascading Error Dynamics in Sequential Learning

一次一排: 序列学习中连带错误动态

2505.22602v1

464

05-28

Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching

Adjoint Sampling: Hoch skalierbare Diffusions-Probenehmer über Adjoint Matching

联合采样:通过联合配配制的高可缩放扩散采样器

2504.11713v3

465

05-28

Machine Unlearning under Overparameterization

Maschine Unlearning unter Überparameterisierung

超参数化下脱学机

2505.22601v1

466

05-28

HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym

HDDLGym: Ein Tool zum Studieren multi-agenter Hierarchischer Probleme, definiert in HDDL mit OpenAI Gym

HDDLGym: 与 OpenAI Gym 一起研究在HDDL 中界定的多代理等级问题的工具

2505.22597v1

467

05-28

SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

SynWorld: 用于改进制剂行动知识的虚拟情景合成

2504.03561v2

468

05-28

Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning

Self-Error-Instruct: Verallgemeinern von Fehlern für LLMs Mathematische Begründung

自错误教学法: 数学理由LLMs 的错误一般化

2505.22591v1

469

05-28

VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use

VTool-R1: VLMs lernen mit Bildern zu denken, indem sie mehr über multimodale Werkzeugnutzung lernen

VTool-R1:VLMs通过多模式工具使用强化学习学习如何用图像思考

2505.19255v2

470

05-28

ReLearn: Unlearning via Learning for Large Language Models

ReLearn: Entlernen über Learning for Large Language Models

ReLearn:通过学习实现大语言模型的遗忘学习

2502.11190v3

471

05-28

Benignity of loss landscape with weight decay requires both large overparametrization and initialization

Die Benignität der Verlustlandschaft mit dem Verfall des Gewichts erfordert sowohl große Überparametrierung als auch Initialisierung

损失景观与体重衰减的尊严要求大规模过度平衡和初始化

2505.22578v1

472

05-28

FNOPE: Simulation-based inference on function spaces with Fourier Neural Operators

FNOPE: Simulationsbasierte Inferenz auf Funktionsräumen mit Fourier-Neural-Betreibern

FNOPE: Fourier神经操作员对功能空间的模拟推推

2505.22573v1

473

05-28

PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion

PRISM: Videodatensatz-Kondensation mit progressiver Veredelung und Einfügung für Sparse Motion

PRISM: 视频数据集浓缩,并逐步精化和插入,用于微缩移动

2505.22564v1

474

05-28

Geometric Hyena Networks for Large-scale Equivariant Learning

Geometrische Hyänennetze für großmaßstäbliches Äquivalent-Lernen

大规模平等学习的几何Hyena网络

2505.22560v1

475

05-28

Preference Adaptive and Sequential Text-to-Image Generation

Präferenz Adaptive und sequentielle Text-zu-Bild-Generierung

适应性和顺序性文字到图像生成

2412.10419v2

476

05-28

Can Copulas Be Used for Feature Selection? A Machine Learning Study on Diabetes Risk Prediction

Kann Copulas für die Feature-Auswahl verwendet werden? Eine maschinelle Studie über Diabetes Risikovorhersage

Copulas 能够用来选择特质吗? 糖尿病风险预测的机器学习研究。

2505.22554v1

477

05-28

Data-Distill-Net: A Data Distillation Approach Tailored for Reply-based Continual Learning

Data-Distill-Net: Ein Datendestillationsansatz, der auf Reply-based Continual Learning zugeschnitten ist

Data-Distill-Net:为基于答复的不断学习量身定制的数据蒸馏方法

2505.20135v2

478

05-28

DES-LOC: Desynced Low Communication Adaptive Optimizers for Training Foundation Models

DES-LOC: Entsynced Low Communication Adaptive Optimizers for Training Foundation Models

DES-LOC:用于训练基础模型的去同步低通信自适应优化器

2505.22549v1

479

05-28

A Human-Centric Approach to Explainable AI for Personalized Education

Ein menschlich-zentraler Ansatz zur erklärbaren KI für die personalisierte Bildung

以人文文化方式解释个人个性化教育的可解释的AI

2505.22541v1

480

05-28

Uncertainty Quantification with Proper Scoring Rules: Adjusting Measures to Prediction Tasks

Ungewissheitsquantifizierung mit korrekten Bewertungsregeln: Anpassung von Maßnahmen an Vorhersageaufgaben

以适当排序规则对不确定性进行量化:预测任务调整措施

2505.22538v1

481

05-28

TabularQGAN: A Quantum Generative Model for Tabular Data

TabularQGAN: Ein Quantum Generatives Modell für Tabulardaten

表格QGAN:表格数据量子生成模型

2505.22533v1

482

05-28

Prediction of the Most Fire-Sensitive Point in Building Structures with Differentiable Agents for Thermal Simulators

Vorhersage des feuerempfindlichsten Punkts in Gebäudestrukturen mit differenzierbaren Agenten für thermische Simulatoren

预测热模拟器使用不同物剂建造结构时最能防火的火敏度点

2502.03424v4

483

05-28

Training RL Agents for Multi-Objective Network Defense Tasks

Schulung von RL-Agenten für multi-objektive Netzwerkverteidigungsaufgaben

多目标网络防御任务培训RL代理

2505.22531v1

484

05-28

Symplectic Generative Networks (SGNs): A Hamiltonian Framework for Invertible Deep Generative Modeling

Symplektische Generative Netzwerke (SGNs): Ein Hamiltonsches Framework für invertible Deep Generative Modeling

症状产生网络:一个汉密尔顿框架,用于可垂直产生深层产生模型的建立

2505.22527v1

485

05-28

Test-Time Alignment of Discrete Diffusion Models with Sequential Monte Carlo

Test-Time Alignment von diskreten Diffusionsmodellen mit Sequential Monte Carlo

使用顺序式蒙特卡洛的分解传播模型的测试时间对齐

2505.22524v1

486

05-28

Evaluating Supervised Learning Models for Fraud Detection: A Comparative Study of Classical and Deep Architectures on Imbalanced Transaction Data

Bewertung von überwachten Lernmodellen für Betrugserkennung: Eine vergleichende Studie klassischer und tiefer Architekturen zu unausgewogenen Transaktionsdaten

评价受监督的欺诈侦查学习模式:关于不平衡交易数据的经典和深层结构比较研究

2505.22521v1

487

05-28

IGNIS: A Neural Network Framework for Robust Parameter Estimation in Archimedean Copulas

IGNIS: Ein neurales Netzwerk-Framework für robuste Parameterschätzungen in Archimedischen Copulas

IGNIS: Archimedean Copulas 稳健参数估计的神经网络框架

2505.22518v1

488

05-28

Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers?

Kolmogorov-Arnold Achtung: Ist erlernbare Aufmerksamkeit besser für Vision Transformer?

科尔莫戈罗夫-阿诺尔德关注:对愿景转变者来说,学习关注是否更好?

2503.10632v2

489

05-28

Accelerating Optimization via Differentiable Stopping Time

Beschleunigung der Optimierung durch differenzierbare Stoppzeit

通过有区别的停止时间加速优化

2505.22509v1

490

05-28

Closed-Form Training Dynamics Reveal Learned Features and Linear Structure in Word2Vec-like Models

Closed-Form Training Dynamics Reveal Erlernte Funktionen und lineare Struktur in Word2Vec-ähnlichen Modellen

类似Word2Vec 模型中的封闭形式培训动态观测发现特性和线形结构

2502.09863v2

491

05-28

Sparsification and Reconstruction from the Perspective of Representation Geometry

Sparsifikation und Rekonstruktion aus Sicht der Repräsentationsgeometrie

从代表制角度看分解与重建

2505.22506v1

492

05-28

Geometric GNNs for Charged Particle Tracking at GlueX

Geometrische GNNs für geladene Partikelverfolgung bei GlueX

用于GlueX带电粒子跟踪的几何GNN

2505.22504v1

493

05-28

Assessing Quantum Advantage for Gaussian Process Regression

Bewertung des Quantenvorteils für Gaussian Process Regression

评估高斯过程回归的量子优势

2505.22502v1

494

05-28

Novelty Detection in Reinforcement Learning with World Models

Neuheitserkennung im Verstärkungslernen mit Weltmodellen

利用世界模式加强学习新颖发现

2310.08731v4

495

05-28

ProSpero: Active Learning for Robust Protein Design Beyond Wild-Type Neighborhoods

ProSpero: Aktives Lernen für robustes Proteindesign jenseits von Wild-Typ-Nachbarschaften

ProSpero:在野生部落邻里以外积极学习巨型蛋白设计

2505.22494v1

496

05-28

Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation

Entmystifizierung des Paradoxon der wichtigen Probenahme mit einer geschätzten historisch-nachfolgenden Verhaltenspolitik in der Off-Policy-Bewertung

以非政策评价中的估计历史依赖者行为政策来解开重要性抽样反常现象的神秘化

2505.22492v1

497

05-28

On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling

Über die überraschende Wirksamkeit großer Lernraten unter Standardbreitenskalierung

根据标准宽宽度比例扩大的大型学习率的惊人效果

2505.22491v1

498

05-28

Understanding Adversarial Training with Energy-based Models

Verständnis von Adversarial Training mit energiebasierten Modellen

与基于能源模式的对等培训的谅解

2505.22486v1

499

05-28

Intrinsic User-Centric Interpretability through Global Mixture of Experts

Intrinsische Benutzer-Centric-Interpretability durch globale Mischung von Experten

通过全球专家混合解释

2402.02933v4

500

05-28

A Closer Look at Multimodal Representation Collapse

Ein genauerer Blick auf multimodale Darstellungskollaps

更仔细地审视多模式代表制的崩溃

2505.22483v1

501

05-28

Hypothesis Testing in Imaging Inverse Problems

Hypothesenprüfung in bildgebenden Inversen Problemen

想象反反问题假设测试

2505.22481v1

502

05-28

Position: Don’t Use the CLT in LLM Evals With Fewer Than a Few Hundred Datapoints

Position: Verwenden Sie den CLT nicht in LLM-Evalen mit weniger als ein paar hundert Datenpunkten

位置: 不要在LLM Evals中使用 CLT, 其数据点小于几百个数据点

2503.01747v3

503

05-28

Non-Asymptotic Analysis of (Sticky) Track-and-Stop

Nicht-asymptotische Analyse von (Sticky) Track-and-Stop

(Sticky) Track-and-Stop 的非渐近分析

2505.22475v1

504

05-28

Bridging Language, Vision and Action: Multimodal VAEs in Robotic Manipulation Tasks

Überbrückung von Sprache, Vision und Aktion: Multimodale VAE in Robotermanipulationsaufgaben

架桥语言、愿景和行动:机器人操纵任务中的多式机动性

2404.01932v2

505

05-28

Forecasting Multivariate Urban Data via Decomposition and Spatio-Temporal Graph Analysis

Voraussichtliche Multivariate Stadtdaten durch Zersetzung und räumlich-Temporale Graphenanalyse

通过分解和时空空间图分析预测多变量城市数据

2505.22474v1

506

05-28

Pure Exploration with Infinite Answers

Reine Exploration mit unendlichen Antworten

具有无限答案的纯探索

2505.22473v1

507

05-28

CPINN-ABPI: Physics-Informed Neural Networks for Accurate Power Estimation in MPSoCs

CPINN-ABPI: Physik-informierte Neuralnetze für genaue Leistungsschätzung in MPSoCs

CPINN-ABPI: MPSoCs中精确功率估计物理内建神经网络

2505.22469v1

508

05-28

FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation

FitCF: Ein Framework für die automatische Feature-Importanz-geführte kontrafaktische Beispielgenerierung

FitCF: 自动地物、重要引导反事实实例生成框架

2501.00777v3

509

05-28

Embedding Safety into RL: A New Take on Trust Region Methods

Einbettung der Sicherheit in RL: Ein neuer Ansatz für Methoden der Vertrauensregion

将安全嵌入RL:信任区域方法的新做法

2411.02957v3

510

05-28

OptiMindTune: A Multi-Agent Framework for Intelligent Hyperparameter Optimization

OptiMindTune: Multi-Agenten-Framework für intelligente Hyperparameter-Optimierung

OptiMindTune: 智能超参数优化的多智能体框架

2505.19205v2

511

05-28

Depth-Based Matrix Classification for the HHL Quantum Algorithm

Tiefenbasierte Matrix-Klassifikation für den HHL-Quantenalgorithmus

HHL 量图算法的深度矩阵分类

2505.22454v1

512

05-28

Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Unüberwachte Nachschulung für Multi-Modal LLM Reasoning via GRPO

通过GRPO对多模态LLM推理进行无监督后训练

2505.22453v1

513

05-28

Position: All Current Generative Fidelity and Diversity Metrics are Flawed

Position: Alle aktuellen generativen Fidelity- und Diversity-Metriken sind fehlerhaft

立场:当前所有生成保真度和多样性指标均存在缺陷

2505.22450v1

514

05-28

SOReL and TOReL: Two Methods for Fully Offline Reinforcement Learning

SOReL und TOReL: Zwei Methoden für vollständiges Offline-Verstärkungslernen

SOReL和TOReL: 完全离线强化学习的两种方法

2505.22442v1

515

05-28

Variational Positive-incentive Noise: How Noise Benefits Models

Variational Positive-incentive Noise: Wie Rauschen Modellen nützt

变化式积极积极激励噪音:如何创造噪音效益模式

2306.07651v2

516

05-28

LAMBDA: A Large Model Based Data Agent

LAMBDA: Ein großer modellbasierter Datenagent

LAMBDA:一个大型模型数据代理

2407.17535v3

517

05-28

Data-Driven Antenna Miniaturization: A Knowledge-Based System Integrating Quantum PSO and Predictive Machine Learning Models

Datengetriebene Antenne Miniaturisierung: Ein wissensbasiertes System zur Integration von Quanten-PSO und vorausschauenden Machine Learning-Modellen

数据驱动天线微型化:以知识为基础的系统综合量子PSO和可预测性机器学习模型

2505.22440v1

518

05-28

Synonymous Variational Inference for Perceptual Image Compression

Synonyme Variationsableitung für Wahrnehmungsbildkompression

感知图像压缩的同义同义变异推理

2505.22438v1

519

05-28

Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models

Ausgelagerte Diffusionsprobenahme: Effiziente hintere Inferenz in latenten Räumen generativer Modelle

外部外包扩散采样:在基因变异模型潜在空间中有效的后继推论

2502.06999v2

520

05-28

C-LoRA: Contextual Low-Rank Adaptation for Uncertainty Estimation in Large Language Models

C-LoRA: Kontextuelle Low-Rank-Anpassung für Unsicherheitsabschätzungen in großen Sprachmodellen

C-LoRA:用于大语言模型不确定性估计的上下文低秩适应

2505.17773v2

521

05-28

AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy

AstroVisBench: Ein Code-Bench für wissenschaftliche Computing und Visualisierung in der Astronomie

AstroVisBench:天文学科学计算和可视化的代码基准

2505.20538v2

522

05-28

Scaling Reasoning without Attention

Skalierung ohne Aufmerksamkeit

无人注意的调整理由

2505.22425v1

523

05-28

STaR-Bets: Sequential Target-Recalculating Bets for Tighter Confidence Intervals

STaR-Bets: Sequentielle Target-Rekalkulationswetten für engere Konfidenzintervalle

STaR-Bets: 更密切信任间隔的序列目标-计算重新计算保证

2505.22422v1

524

05-28

Beyond Verifiable Rewards: Scaling Reinforcement Learning for Language Models to Unverifiable Data

Jenseits von überprüfbaren Belohnungen: Skalierung von Verstärkung Lernen für Sprachmodelle zu unüberprüfbaren Daten

超越可核实的奖励:加强语文模式的强化学习,以获得不可核实的数据

2503.19618v2

525

05-28

Mitigating Overthinking in Large Reasoning Models via Manifold Steering

Überdenken in großen Vernunftmodellen durch Manifold Steering verhindern

通过流形引导(Manifold Steering)减轻大型推理模型中的过度思考

2505.22411v1

526

05-28

Decoupled Subgraph Federated Learning

Entkoppelter Subgraph Federated Learning

解耦子图联邦学习

2402.19163v3

527

05-28

Beyond External Monitors: Enhancing Transparency of Large Language Models for Easier Monitoring

Jenseits von externen Monitoren: Verbesserung der Transparenz von großen Sprachmodellen für eine einfachere Überwachung

外部监测之外的外部监测:提高大语言模型的透明度,促进更易监测

2502.05242v2

528

05-28

BILBO: BILevel Bayesian Optimization

BILBO: BILevel Bayesian Optimierung

BILBO: BI级巴耶斯最佳优化

2502.02121v2

529

05-28

Simultaneously Solving FBSDEs and their Associated Semilinear Elliptic PDEs with Small Neural Operators

Gleichzeitige Lösung von FBSDEs und ihren zugehörigen semilinearen elliptischen PDEs mit kleinen neuralen Operatoren

与小型神经操作器同时解决FBSDEs及其相关半线性椭圆形粒体

2410.14788v2

530

05-28

Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

Inferenz-Time Scaling für Flow-Modelle über stochastische Generation und Rollover Budget Forcing

通过存储器生成和滚转预算推力对流动模型的推推时间调整

2503.19385v4

531

05-28

Physics-Informed Distillation of Diffusion Models for PDE-Constrained Generation

Physik-informierte Destillation von Diffusionsmodellen für PDE-kontrainierte Generation

PDE - 受培训的一代的传播模型的物理改造

2505.22391v1

532

05-28

Revisiting Feature Interactions from the Perspective of Quadratic Neural Networks for Click-through Rate Prediction

Überprüfung von Feature-Interaktionen aus der Perspektive quadratischer neuraler Netzwerke für Click-through-Rate-Vorhersage

从 “ 点击通速率预测 “ 四方神经网络的角度重新审视地貌相互作用

2505.17999v2

533

05-28

DAM: Domain-Aware Module for Multi-Domain Dataset Condensation

DAM: Domain-Aware-Modul für Multi-Domain-Datensatz-Kondensation

DAM: 多域数据集集中的域- 软件模块

2505.22387v1

534

05-28

When do neural networks learn world models?

Wann lernen neuronale Netzwerke Weltmodelle?

神经网络何时学习世界模型?

2502.09297v3

535

05-28

Infinite-dimensional Mahalanobis Distance with Applications to Kernelized Novelty Detection

Infinite-dimensionale Mahalanobis-Distanz mit Anwendungen zur kernisierten Neuheitserkennung

无限的马哈拉诺比斯距离,应用内核新闻探测技术

2407.11873v2

536

05-28

Overcoming Dimensional Factorization Limits in Discrete Diffusion Models through Quantum Joint Distribution Learning

Überwindung von Dimensional Factorization Limits in diskreten Diffusionsmodellen durch Quantum Joint Distribution Learning

通过量子联合分发学习克服分辨传播模式中的分量限制

2505.05151v2

537

05-28

A Divide-and-Conquer Approach for Modeling Arrival Times in Business Process Simulation

Ein Divide-and-Conquer-Ansatz für die Modellierung von Ankunftszeiten in der Business Process Simulation

在模拟商业进程中模拟抵达时

2505.22381v1

538

05-28

Smooth Sailing: Lipschitz-Driven Uncertainty Quantification for Spatial Association

Lipschitz-Driven 不确定性为空间协会量化

2502.06067v2

539

05-28

Memento No More: Coaching AI Agents to Master Multiple Tasks via Hints Internalization

Memento No More: Coaching von KI-Agenten zu Master mehrere Aufgaben durch Hinweise Internalisierung

不再纪念:通过Hints内部化,指导AI代理人员掌握多项任务

2502.01562v2

540

05-28

Update Your Transformer to the Latest Release: Re-Basin of Task Vectors

Aktualisieren Sie Ihren Transformer auf die neueste Version: Re-Basin der Task-Vektoren

将您的变换器更新为最新版本: 任务矢量的重新 Basin

2505.22697v1

541

05-28

An Empirical Evaluation of Rewiring Approaches in Graph Neural Networks

Eine empirische Bewertung der Verdrahtungsansätze in Graphen-Neuralen Netzwerken

对图形神经网络重新布线方法的经验评价

2305.19717v2

542

05-28

Topological Eigenvalue Theorems for Tensor Analysis in Multi-Modal Data Fusion

Topologische Eigenwert-Theoreme für die Tensoranalyse in multi-Modal Data Fusion

多模态数据融合中张量分析的拓扑特征值定理

2409.09392v3

543

05-28

Computing Optimal Transport Maps and Wasserstein Barycenters Using Conditional Normalizing Flows

Computing Optimal Transport Maps und Wasserstein Barycenter mit bedingten Normalisierungsflüssen

使用条件性正常流动的最佳运输地图和瓦塞尔斯坦百分点

2505.22364v1

544

05-28

Directed Homophily-Aware Graph Neural Network

Gerichtetes Homophilie-bewusstes Graph Neural Network

直导光电图神经网络

2505.22362v1

545

05-28

Continuum-armed Bandit Optimization with Batch Pairwise Comparison Oracles

Kontinuierliche Bandit-Optimierung mit Batch Pairwise Vergleich Oracles

利用批量成对比较预言机的连续臂老虎机优化

2505.22361v1

546

05-28

Multiclass Loss Geometry Matters for Generalization of Gradient Descent in Separable Classification

多类损失几何对可分分类中梯度下降泛化的影响

2505.22359v1

547

05-28

Budget-Adaptive Adapter Tuning in Orthogonal Subspaces for Continual Learning in LLMs

Budget-Adaptive Adapter Tuning in Orthogonal Subspaces für kontinuierliches Lernen in LLMs

用于LLM持续学习的正交子空间中预算自适应适配器调优

2505.22358v1

548

05-28

Suitability Filter: A Statistical Framework for Classifier Evaluation in Real-World Deployment Settings

Eignungsfilter: Ein statistisches Rahmenwerk für die Klassifikator-Evaluierung in Real-World-Einsatzeinstellungen

适用性过滤器:在现实世界部署设置中进行分类评价的统计框架

2505.22356v1

549

05-28

Look Within or Look Beyond? A Theoretical Comparison Between Parameter-Efficient and Full Fine-Tuning

Schauen Sie nach innen oder schauen Sie darüber hinaus? Ein theoretischer Vergleich zwischen Parameter-Effizient und Full Fine-Tuning

内观还是外观? 参数有效与完全精准之间的理论比较。

2505.22355v1

550

05-28

Context-sensitive neocortical neurons transform the effectiveness and efficiency of neural information processing

Kontext-sensible neocortical Neuronen verwandeln die Wirksamkeit und Effizienz der neuronalen Informationsverarbeitung

环境敏感的新园艺神经元改变神经信息处理的效益和效率

2207.07338v7

551

05-28

AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings

AKRMap: Adaptive Kernel-Regression für vertrauenswürdige Visualisierung von Cross-Modal-Embeddings

AKRMap:跨模式嵌入的可信赖可视化的适应性内核倒退

2505.14664v2

552

05-28

Progressive Data Dropout: An Embarrassingly Simple Approach to Faster Training

Progressive Data Dropout: Ein verblüffend einfacher Ansatz zum schnelleren Training

渐进数据辍学:快速培训的一个令人尴尬的简单方法

2505.22342v1

553

05-28

Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start

Multimodale Reasoning durch verstärktes Lernen mit kaltem Start fördern

通过 “ 冷起 “ 的强化学习推进多模式理由

2505.22334v1

554

05-28

Credal Prediction based on Relative Likelihood

Credal Prediction basierend auf relativer Likelihood

基于相对可能性的裂变预测

2505.22332v1

555

05-28

Learning in Stackelberg Games with Non-myopic Agents

Lernen in Stackelberg Spiele mit nicht-myopischen Agenten

学习与非中色剂在斯塔克尔贝格运动会中的学习

2208.09407v3

556

05-28

When Does Neuroevolution Outcompete Reinforcement Learning in Transfer Learning Tasks?

Wann übertrifft Neuroevolution das Reinforcement Learning in Transfer-Lernaufgaben?

在转让学习任务方面,神经革命何时会超越竞争加强学习?

2505.22696v1

557

05-28

LLM-ODDR: A Large Language Model Framework for Joint Order Dispatching and Driver Repositioning

LLM-ODDR: Ein großes Sprachmodell für Joint Order Dispatching und Driver Repositioning

LLM-ODDR:联合订单调度和司机重新定位的大语言模型框架

2505.22695v1

558

05-28

Individualised Counterfactual Examples Using Conformal Prediction Intervals

Individualisierte gegenfaktische Beispiele mit konformen Vorhersageintervallen

使用非正式预测间隔的个别反事实实例

2505.22326v1

559

05-28

A Closer Look on Memorization in Tabular Diffusion Model: A Data-Centric Perspective

Ein genauerer Blick auf die Erinnerung an Tabular Diffusion Modell: Eine datenzentrische Perspektive

更仔细地看一看表格传播模型中的记忆化:数据核心视角

2505.22322v1

560

05-28

Core Context Aware Transformers for Long Context Language Modeling

Core Context Aware Transformers für lange Kontext-Sprachenmodellierung

长语语言建模核心认知变型器

2412.12465v2

561

05-28

Copresheaf Topological Neural Networks: A Generalized Deep Learning Framework

Copresheaf Topologische neurale Netzwerke: Ein generalisiertes Deep Learning Framework

Copresheaf 地形神经网络:普遍深层学习框架

2505.21251v2

562

05-28

If Pigs Could Fly… Can LLMs Logically Reason Through Counterfactuals?

Wenn Schweine fliegen könnten… können LLMs logischerweise durch Gegenfakten denken?

如果猪能飞……LLM能否通过反事实进行逻辑推理?

2505.22318v1

563

05-28

Rethinking BPS: A Utility-Based Evaluation Framework

Rethinking BPS: Ein Nutzen-basierter Bewertungsrahmen

重新思考BPS:基于公用事业的评价框架

2505.22316v1

564

05-28

MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections

MUDDFormer: Breaking Residual Engpässe in Transformatoren über Multiway Dynamic Dense Connections

MUDDFormer:通过多路动态感应连接在变形器中打破残余瓶颈

2502.12170v2

565

05-28

From Dormant to Deleted: Tamper-Resistant Unlearning Through Weight-Space Regularization

Von Dormant zu Gelöscht: Tamper-Resistent Unlearning durch Gewicht-Raum-Regularisierung

从休眠到删除:通过权重空间正则化实现抗篡改的遗忘学习

2505.22310v1

566

05-28

FireQ: Fast INT4-FP8 Kernel and RoPE-aware Quantization for LLM Inference Acceleration

FireQ: Schnelle INT4-FP8-Kernel- und RoPE-gestützte Quantisierung für LLM-Inferenzbeschleunigung

FireQ:用于LLM推理加速的快速INT4-FP8内核和RoPE感知量化

2505.20839v2

567

05-28

Transformers Pretrained on Procedural Data Contain Modular Structures for Algorithmic Reasoning

Transformer vorgebildet auf verfahrenstechnische Daten enthalten modulare Strukturen für algorithmische Vernunft

在包含用于算法理由的模块结构的程序性数据方面受过预先培训的变异器

2505.22308v1

568

05-28

Risk-Informed Diffusion Transformer for Long-Tail Trajectory Prediction in the Crash Scenario

Risiko-informierter Diffusionstransformator für langspurige Trajektorien-Vorhersage im Crash-Szenario

崩溃设想情景中长帆轨迹预测风险化传导变异器

2501.16349v2

569

05-28

Robustness and Cybersecurity in the EU Artificial Intelligence Act

Robustheit und Cybersicherheit im EU-Gesetz über künstliche Intelligenz

《欧盟人工情报法》中的强力和网络安全

2502.16184v2

570

05-28

Versatile Cardiovascular Signal Generation with a Unified Diffusion Transformer

Vielseitige kardiovaskuläre Signalgenerierung mit einem Unified Diffusion Transformer

具有统一扩散变异器的心血管心血管信号生成

2505.22306v1

571

05-28

LLäMmlein: Compact and Competitive German-Only Language Models from Scratch

LLäMmlein: Kompakte und wettbewerbsfähige deutschsprachige Sprachmodelle von Scratch

LläMmlein:来自斯克拉奇的契约和竞争性独德语言模式

2411.11171v4

572

05-28

Diss-l-ECT: Dissecting Graph Data with Local Euler Characteristic Transforms

Diss-l-ECT: Entschlüsselung von Graphendaten mit lokalen Euler-Charakteristik-Transformationen

Diss- l- ECT: 用本地电磁特征变换解析图表数据

2410.02622v2

573

05-28

360-LLaMA-Factory: Plug & Play Sequence Parallelism for Long Post-Training

360-LLaMA-Factory: Plug & Play-Sequenz-Parallelität für langes Post-Training

360-LLaMA-Factory: 用于长序列后训练的即插即用序列并行

2505.22296v1

574

05-28

Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond

Light-R1: Curriculum SFT, DPO und RL für Long COT aus Scratch und darüber hinaus

Light-R1:SFT、DPO和RL课程,用于Scratch及以后的长期COT

2503.10460v4

575

05-28

MoRE: A Mixture of Low-Rank Experts for Adaptive Multi-Task Learning

MoRE: Eine Mischung aus Low-Rank Experten für adaptives Multi-Task Learning

MoRE: 适应性多任务学习低级专家混合组合

2505.22694v1

576

05-28

Rethinking the Unsolvable: When In-Context Search Meets Test-Time Scaling

Das Unlösbare neu denken: Wenn In-Context Search Test-Time Scaling trifft

重新思考无法解答的问题: 当 In-Ctext 搜索遇到测试时间缩放时

2505.22290v1

577

05-28

A Variational Perspective on Generative Protein Fitness Optimization

Eine abwechslungsreiche Perspektive auf generative Protein-Fitness-Optimierung

关于最优化的生质蛋白质健身的变异视角

2501.19200v2

578

05-28

Random Feature Representation Boosting

Zufällige Merkmalsdarstellung steigert sich

随机特性显示促进

2501.18283v3

579

05-28

Sample Efficient Robot Learning in Supervised Effect Prediction Tasks

Beispiel Effizientes Roboter-Lernen in überwachten Effekt-Vorhersage-Aufgaben

在监督效应预测任务中提高机器人学习效率

2412.02331v2

580

05-28

From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning

Von den Kerneln zu den Features: Eine Multi-Scale Adaptive Theorie des Feature Learning

从核心到地貌特征:多尺度适应性地貌学习理论

2502.03210v2

581

05-28

Zero-Shot Mono-to-Binaural Speech Synthesis

Null-Schuss-Mono-bis-Binaural-Sprachsynthese

零热单声词合成

2412.08356v2

582

05-28

Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration

利用语言代理框架中的双重进程理论促进实时同时人类-AI合作

2502.11882v5

583

05-28

TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup

TransMLA: Migration von GQA-Modellen zu MLA mit voller DeepSeek-Kompatibilität und Speedup

TransMLA:将GQA模型迁移到具有全深搜索兼容性和加速性的司法协助模式

2502.07864v4

584

05-28

Full Domain Analysis in Fluid Dynamics

Vollständige Domänenanalyse in Fluiddynamik

流体动态全域分析

2505.22275v1

585

05-28

EventFlow: Forecasting Temporal Point Processes with Flow Matching

EventFlow: Vorhersage von zeitlichen Punktprozessen mit Flow Matching

EventFlow:利用流匹配预测时间点过程

2410.07430v2

586

05-28

Reward Generalization in RLHF: A Topological Perspective

Lohnverallgemeinerung in RLHF: Eine topologische Perspektive

RLHF的奖励普遍化:地形学观点

2402.10184v7

587

05-28

A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators

Eine neuartige Charakterisierung des Populationsgebiets unter der Risikodeckungskurve (AURC) und Raten von Finite Sample-Schätzern

风险覆盖曲线下人口区的新特点和有限抽样估计率

2410.15361v3

588

05-28

Improving Rule-based Reasoning in LLMs using Neurosymbolic Representations

Verbesserung der regelbasierten Reasoning in LLMs mit neurosymbolischen Darstellungen

改进使用新阳性表示法的LLM中基于规则的理据

2502.01657v3

589

05-28

Training on Plausible Counterfactuals Removes Spurious Correlations

Training auf plausiblen Counterfactuals entfernt Scheinkorrelationen

关于可视反事实消除污损的培训

2505.16583v3

590

05-28

LiDAR Based Semantic Perception for Forklifts in Outdoor Environments

LiDAR basierte semantische Wahrnehmung für Gabelstapler im Freien

室外环境中叉车使用基于 LiDAR 的语义感

2505.22258v1

591

05-28

Something’s Fishy In The Data Lake: A Critical Re-evaluation of Table Union Search Benchmarks

Irgendetwas ist Fishy In The Data Lake: Eine kritische Neubewertung der Tabelle Union Suche Benchmarks

“数据湖中的鱼:对表格联合搜索基准的重要重新评估”

2505.21329v2

592

05-28

Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training

Revisiting Group Relative Policy Optimization: Einblicke in die On-Policy- und Off-Policy-Schulung

重新审视小组相对政策优化:对政策和非政策培训的深入了解

2505.22257v1

593

05-28

Train Sparse Autoencoders Efficiently by Utilizing Features Correlation

Sparse Autoencoder effizient trainieren durch Nutzung von Feature-Korrelation

利用特征相关性高效训练稀疏自编码器

2505.22255v1

594

05-28

A Unified Online-Offline Framework for Co-Branding Campaign Recommendations

Ein einheitliches Online-Offline-Rahmenwerk für Co-Branding-Kampagnenempfehlungen

联合捆绑运动建议统一在线离线框架

2505.22254v1

595

05-28

B-XAIC Dataset: Benchmarking Explainable AI for Graph Neural Networks Using Chemical Data

B-XAIC Datensatz: Benchmarking Erklärbare KI für Graph Neuronale Netzwerke unter Verwendung chemischer Daten

B-XAIC数据集:使用化学数据的图形神经网络基准可解释的AI

2505.22252v1

596

05-28

Evaluating Compact LLMs for Zero-Shot Iberian Language Tasks on End-User Devices

Bewertung kompakter LLMs für Zero-Shot-Aufgaben in iberischen Sprachen auf Endbenutzer-Geräten

评估终端用户设备上用于零样本伊比利亚语言任务的紧凑型LLM

2504.03312v2

597

05-28

UDuo: Universal Dual Optimization Framework for Online Matching

UDuo: Universal Dual Optimization Framework für Online-Matching

UDuo: 通用双优化在线匹配框架

2505.22243v1

598

05-28

Reinforcement Learning with Verifiable Rewards: GRPO’s Effective Loss, Dynamics, and Success Amplification

Verstärktes Lernen mit überprüfbaren Belohnungen: Effektiver Verlust, Dynamik und Erfolgsverstärkung von GRPO

利用可核实的奖励加强学习:GRPO的有效损失、动态和成功扩展

2503.06639v3

599

05-28

Rethinking GNN Expressive Power from a Distributed Computational Model Perspective

Überdenken von GNN Expressive Power aus einer distributed Computational Model Perspective

从分布式计算模型角度重新思考GNN的表达力

2410.01308v3

600

05-28

NRFormer: Nationwide Nuclear Radiation Forecasting with Spatio-Temporal Transformer

NRFormer: landesweite Vorhersage der nuklearen Strahlung mit Spatio-Temporal Transformer

NRFormer:利用时空Transformer进行全国核辐射预报

2410.11924v3

601

05-28

On Provable Length and Compositional Generalization

Über beweisbare Längen- und kompositorische Verallgemeinerung

关于可预见长度和组成式通泛化

2402.04875v6

602

05-28

Yambda-5B – A Large-Scale Multi-modal Dataset for Ranking And Retrieval

Yambda-5B – Ein multimodaler Datensatz für das Ranking und das Retrieval

Yambda-5B – – 用于排名和检索的大型多模式数据集

2505.22238v1

603

05-28

Decision-Focused Forecasting: A Differentiable Multistage Optimisation Architecture

Entscheidungsorientierte Prognose: Eine differenzierbare mehrstufige Optimierungsarchitektur

决定重点预测:可区别的多阶段优化结构

2405.14719v2

604

05-28

Optimal kernel regression bounds under energy-bounded noise

Optimale Kernel-Regressionsgrenzen unter energiegebundenem Rauschen

在受能源限制的噪音下的最佳内核回归界限

2505.22235v1

605

05-28

Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models

Qualität Across-Sprachen beurteilen: Ein mehrsprachiger Ansatz zur Vorschulung von Datenfiltern mit Sprachmodellen

判断各语文的质量:采用多种语文办法,利用语言模式进行培训前数据过滤

2505.22232v1

606

05-28

You Do Not Fully Utilize Transformer’s Representation Capacity

Sie nutzen die Repräsentationskapazität des Transformers nicht voll aus

您没有充分利用变换器的代表能力

2502.09245v2

607

05-28

Solver-Free Decision-Focused Learning for Linear Optimization Problems

Solver-Free decision-focused Learning für lineare Optimierungsprobleme

处理线性优化问题的无解决者决定-集中学习

2505.22224v1

608

05-28

Taming Recommendation Bias with Causal Intervention on Evolving Personal Popularity

Zähmungsempfehlung Bias mit ursächlicher Intervention zur Entwicklung persönlicher Beliebtheit

通过对演变中的个人流行度进行因果干预来抑制推荐偏差

2505.14310v2

609

05-28

Quantum framework for Reinforcement Learning: Integrating Markov decision process, quantum arithmetic, and trajectory search

Quanten-Framework for Reinforcement Learning: Markov-Entscheidungsprozess, Quantenarithmetik und Flugbahnsuche integrieren

强化学习的量子框架:纳入Markov决策程序、量数算术和轨迹搜索

2412.18208v3

610

05-28

Advancing Sequential Numerical Prediction in Autoregressive Models

Advancing Sequential Numerical Prediction in Autoregressive Modelle

自动递减模型中推进序列序号预测

2505.13077v2

611

05-28

On the Within-class Variation Issue in Alzheimer’s Disease Detection

Zur klasseninternen Variationsfrage bei der Alzheimer-Erkennung

阿尔茨海默氏氏病检测的类内变化变化问题

2409.16322v2

612

05-28

Interpreting CLIP with Hierarchical Sparse Autoencoders

CLIP mit Hierarchical Sparse Autoencodern interpretieren

使用等级式的粗度自动解析器解释 CLIP

2502.20578v2

613

05-28

LaMM: Semi-Supervised Pre-Training of Large-Scale Materials Models

LaMM: Halbüberwachte Vorausbildung von großformatigen Werkstoffmodellen

LaMM: 大型材料模型的半监督预训练

2505.22208v1

614

05-28

Pitfalls of Rule- and Model-based Verifiers – A Case Study on Mathematical Reasoning

Pitfalls of Rule- and Model-based Verifiers – Eine Fallstudie zur mathematischen Begründung

规则和基于示范的验证符咒 – – 关于数学理由的个案研究

2505.22203v1

615

05-28

Enhancing Uncertainty Estimation and Interpretability via Bayesian Non-negative Decision Layer

Verbesserung der Unsicherheitsabschätzung und -interpretierbarkeit über Bayesian Non-negative Decision Layer

通过Bayesian非负决定层加强不确定性的估算和解释

2505.22199v1

616

05-28

An Augmentation-Aware Theory for Self-Supervised Contrastive Learning

Eine Augmentations-Bewusst-Theorie für selbstüberwachtes kontrastives Lernen

自我监督违规学习的增强- 软件软件理论

2505.22196v1

617

05-28

Physics-inspired Generative AI models via real hardware-based noisy quantum diffusion

Physik-inspirierte Generative KI-Modelle über reale Hardware-basierte laute Quantendiffusion

通过实实在在的硬件噪音量子扩散产生人工智能模型

2505.22193v1

618

05-28

Beyond RMSE and MAE: Introducing EAUC to unmask hidden bias and unfairness in dyadic regression models

Jenseits von RMSE und MAE: Einführung von EAUC zur Enttarnung versteckter Bias und Ungerechtigkeit in dyadischen Regressionsmodellen

超越RMSE和MAE:引入EAUC以揭示dyadic回归模型中隐蔽的偏见和不公平现象

2401.10690v5

619

05-28

LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently

LoRA-One: Ein einziger Schritt mit vollem Gradienten könnte für das Fine-Tuning großer Sprachmodelle genügen, beweisbar und effizient

LoRA-One: 单步全梯度可能足以对大语言模型进行可证明且高效的微调

2502.01235v2

620

05-28

LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits

LC-Tsallis-INF: Generalisierte Best-of-Both-Worlds Lineare Kontextbanditen

LC-Tsallis-INF: 普遍化的两世界最佳线性线性直线性范围内的强盗

2403.03219v3

621

05-28

Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes

Kontinuierliche und diskrete Diffusion mit nicht gleichzeitigen Diffusionsprozessen

与非平行扩散进程一起进行连续和分解的不连续和分解文本传播

2505.22165v1

622

05-28

AgriFM: A Multi-source Temporal Remote Sensing Foundation Model for Crop Mapping

AgriFM: Multi-Source-Modell für die zeitliche Fernerkundung

AgriFM:多种来源的时空遥感基金会作物绘图模型

2505.21357v2

623

05-28

The informativeness of the gradient revisited

Die Aufschlusskraft des Gradienten wurde überarbeitet

重新讨论的梯度信息性

2505.22158v1

624

05-28

Towards Practical Defect-Focused Automated Code Review

Auf dem Weg zu einer praktischen fehlerorientierten automatisierten Code-Überprüfung

走向实际失效-受污染的自动编码审查

2505.17928v2

625

05-28

Uncertainty Estimation for Heterophilic Graphs Through the Lens of Information Theory

Ungewissheitsschätzung für heterophile Graphen durch die Linse der Informationstheorie

信息镜头信息理论流流中异血哲学图谱的不确定性估计

2505.22152v1

626

05-28

Oryx: a Performant and Scalable Algorithm for Many-Agent Coordination in Offline MARL

Oryx: ein performanter und skalierbarer Algorithmus für viele-Agenten-Koordination in Offline MARL

Oryx: MARL 离线下许多机构协调的性能和可缩放的数值

2505.22151v1

627

05-28

Gradient Boosting Reinforcement Learning

Gradientenfördernde Stärkung des Lernens

逐步推进强化学习

2407.08250v2

628

05-28

Bridging Arbitrary and Tree Metrics via Differentiable Gromov Hyperbolicity

Überbrückung von Willkür- und Baummetrics durch differenzierbare Gromov-Hyperbolizität

通过可微Gromov双曲性连接任意度量与树度量

2505.21073v2

629

05-28

Limited Generalizability in Argument Mining: State-Of-The-Art Models Learn Datasets, Not Arguments

Begrenzte Verallgemeinerbarkeit im Argumentbergbau: State-of-The-Art-Modelle lernen Datensätze, keine Argumente

《争议采矿业的限制性通用性:国家与艺术中的模式学习数据集,非论据》

2505.22137v1

630

05-28

RAD: Redundancy-Aware Distillation for Hybrid Models via Self-Speculative Decoding

RAD: Redundanz-Bewusst-Destillation für Hybridmodelle über selbstspekulative Decodierung

RAD: 通过自投机代号为混合模型进行再利用-软件蒸馏

2505.22135v1

631

05-28

JEDI: Latent End-to-end Diffusion Mitigates Agent-Human Performance Asymmetry in Model-Based Reinforcement Learning

JEDI: Latent End-to-End-Diffusion mildert die Asymmetrie von Agent-Human Performance im modellbasierten Verstärkungslernen

JEDI: 以模型为基础的加强学习中前端至终端扩散消化剂-人类性能对称性

2505.19698v2

632

05-28

Optimize Cardinality Estimation Model Pretraining by Simplifying the Training Datasets

Kardinalitätsabschätzungsmodell optimieren Vorschulung durch Vereinfachung der Trainingsdatensätze

通过简化培训数据集,优化红红心估计模型预培训模式

2502.14350v2

633

05-28

Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL

Neuvisualisierung von Schwach-zu-Strong-Verallgemeinerung in Theorie und Praxis: Reverse KL vs. Forward KL

重新审视理论和实践中的弱到强泛化:反向 KL vs. 前向 KL

2502.11107v3

634

05-28

BiMi Sheets: Infosheets for bias mitigation methods

BiMi Sheets: Infosheets für Methoden zur Biasminderung

BiMi 工作表:用于减少偏差方法的信息表

2505.22114v1

635

05-28

Understanding Model Ensemble in Transferable Adversarial Attack

Model-Ensemble in übertragbarem Widersacher-Angriff verstehen

理解可转让反向攻击中可相互转让攻击的示范组合

2410.06851v3

636

05-28

The quest for the GRAph Level autoEncoder (GRALE)

Die Suche nach dem GRAph Level AutoEncoder (GRALE)

寻求图级自动编码器(GRALE)

2505.22109v1

637

05-28

Inclusive, Differentially Private Federated Learning for Clinical Data

Inklusives, differenziert privates Federated Learning für klinische Daten

包容性、差异化私联校临床数据学习

2505.22108v1

638

05-28

Curse of High Dimensionality Issue in Transformer for Long-context Modeling

Fluch der Hochdimensionalitätsfrage im Transformer für die Langkontextmodellierung

变异器中高多维度问题的诅咒,用于长期建模

2505.22107v1

639

05-28

Devil is in the Details: Density Guidance for Detail-Aware Generation with Flow Models

Devil ist in den Details: Dichte-Anleitung für Detail-Aware-Generation mit Flow-Modellen

魔鬼在细节中: 使用流动模型生成详细软件的密度指导

2502.05807v2

640

05-28

Visuospatial Cognitive Assistant

活性呼吸空间感知助理

2505.12312v3

641

05-28

Efficient Dynamic Shielding for Parametric Safety Specifications

Effiziente dynamische Abschirmung für parametrische Sicherheitsspezifikationen

用于参数安全规格的有效动态防护

2505.22104v1

642

05-28

Towards Visuospatial Cognition via Hierarchical Fusion of Visual Experts

Auf dem Weg zur Visuospatialen Kognition durch hierarchische Fusion von visuellen Experten

争取通过视觉专家的等级化融合实现纵向空间聚合

2505.12363v3

643

05-28

Conditional Denoising Meets Polynomial Modeling: A Flexible Decoupled Framework for Time Series Forecasting

Bedingtes Entrauschen trifft auf Polynommodellierung: Ein flexibles entkoppeltes Framework für die Zeitreihenprognose

条件去噪遇上多项式建模:灵活解耦的时间序列预测框架

2410.13253v6

644

05-28

On the Transferability and Discriminability of Representation Learning in Unsupervised Domain Adaptation

Über die Übertragbarkeit und Diskriminierbarkeit von Representation Learning in unüberwachter Domain-Anpassung

关于无监督域适应中表示学习的可迁移性和可判别性

2505.22099v1

645

05-28

Knowledge Base Construction for Knowledge-Augmented Text-to-SQL

Knowledge Base Construction für wissensbasierte Text-zu-SQL

知识强化文字到SQL知识基础建设

2505.22096v1

646

05-28

Diffusion Models as Cartoonists: The Curious Case of High Density Regions

Diffusionsmodelle als Karikaturisten: Der seltsame Fall von Regionen mit hoher Dichte

作为漫画家的传播模型:高密度地区令人好奇的案例

2411.01293v4

647

05-28

High Volume Rate 3D Ultrasound Reconstruction with Diffusion Models

3D-Ultraschall-Rekonstruktion mit hoher Volumenrate mit Diffusions-Modellen

3D超声波重建,采用传播模型

2505.22090v1

648

05-28

Base and Exponent Prediction in Mathematical Expressions using Multi-Output CNN

Basis- und Exponentvorhersage in mathematischen Ausdrücken mit Multi-Output CNN

利用多输出CNN对数学表达式进行底数和指数预测

2407.14967v2

649

05-28

Domain-Specific Pruning of Large Mixture-of-Experts Models with Few-shot Demonstrations

Domain-spezifisches Pruning von großen Mixture-of-Experts-Modellen mit nur wenigen Demonstrationen

大型混合型专家模型的域特定情景,少发示范

2504.06792v2

650

05-28

PADAM: Parallel averaged Adam reduces the error for stochastic optimization in scientific machine learning

PADAM: Parallel gemittelter Adam reduziert Fehler bei stochastischer Optimierung im wissenschaftlichen maschinellen Lernen

PADAM: 平行平均 Adam 减少科学机器学习中随机优化的错误

2505.22085v1

651

05-28

Hyperbolic recurrent neural network as the first type of non-Euclidean neural quantum state ansatz

Hyperbolisches rekurrentes neuronales Netzwerk als erste Art von nicht-euklidischem neuronalem Quantenzustandsansatz

双曲循环神经网络,作为第一种非欧几里得的神经量子态 ansatz

2505.22083v1

652

05-28

Improved Bounds for Swap Multicalibration and Swap Omniprediction

Verbesserte Bounds für Swap Multikalibrierung und Swap Omniprediction

用于交换多校准和交换面宽度的改进宽度

2505.20885v2

653

05-28

LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation

LongReD: Abschwächung der Kurztext-Degradierung von Long-Context-Sprachmodellen durch Restaurationsdestillation

LongReD:通过恢复蒸馏减少长长长大语言模型的短期退化

2502.07365v3

654

05-28

A Hybrid Multi-Factor Network with Dynamic Sequence Modeling for Early Warning of Intraoperative Hypotension

Hybrides Multi-Factor-Netzwerk mit dynamischer Sequenzmodellierung zur Frühwarnung von intraoperativer Hypotonie

用于术中低血压早期预警的具有动态序列建模的混合多因素网络

2409.11064v3

655

05-28

Can Test-time Computation Mitigate Memorization Bias in Neural Symbolic Regression?

Kann Testzeit-Computation Mitigate Memorization Bias in Neural Symbolische Regression?

测试时计算在神经符号回落中是否可模拟记忆回弹?

2505.22081v1

656

05-28

The Resurrection of the ReLU

Die Auferstehung der ReLU

ReLU的复活

2505.22074v1

657

05-28

PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models

PRMBench: Ein feinkörniger und anspruchsvoller Benchmark für Prozess-Level-Reward-Modelle

PRMBench:过程级奖励模型的细粒度且具有挑战性的基准

2501.03124v4

658

05-28

Message-Passing GNNs Fail to Approximate Sparse Triangular Factorizations

Message-Passing-GNNs scheitern an der Approximation dünnbesetzter Dreiecksfaktorisierungen

投送信件 GNN 失败于近似偏差的三角三角因子化

2502.01397v2

659

05-28

Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head

Dual-Head-Wissensdestillation: Optimierung der Logits-Nutzung mit Hilfe eines Hilfskopfes

双头知识蒸馏:用辅助头加强登录的使用

2411.08937v2

660

05-28

Learning Latent Graph Structures and their Uncertainty

Lernen Latent Graph Structures und ihre Unsicherheit

学习后边图结构及其不确定性

2405.19933v2

661

05-28

Towards Resilient and Sustainable Global Industrial Systems: An Evolutionary-Based Approach

Auf dem Weg zu stabilen und nachhaltigen globalen Industriesystemen: ein evolutionärer Ansatz

走向具有复原力和可持续的全球工业系统:基于演变的方法

2503.11688v2

662

05-28

Quantum Kernel Learning for Small Dataset Modeling in Semiconductor Fabrication: Application to Ohmic Contact

Quanten-Kernel-Lernen für kleine Datensätze Modellierung in Halbleiterfertigung: Anwendung auf Ohm-Kontakt

半导体制造中小型数据集建模的量子核心学习: Ohmic 接触的应用

2409.10803v3

663

05-28

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Eine umfassende Umfrage in LLM(-Agent) Full Stack Sicherheit: Daten, Schulung und Bereitstellung

LLM(-Agent)全栈安全综述:数据、训练和部署

2504.15585v3

664

05-28

ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation

ORIGEN: Zero-Shot 3D-Orientierungsgrundierung in Text-zu-Bild-Generierung

ORIGEN: 文本到图像生成中的零样本3D朝向定位

2503.22194v2

665

05-28

Reinforced Reasoning for Embodied Planning

Verstärkte Begründung für die körperbetonte Planung

强化规划强化理由

2505.22050v1

666

05-28

Differentiable Generalized Sliced Wasserstein Plans

Differenzierbare verallgemeinerte Sliced-Wasserstein-Pläne

刀切瓦西斯坦计划

2505.22049v1

667

05-28

Learning Curves of Stochastic Gradient Descent in Kernel Regression

Lernkurven des stochastischen Gradienten Abstiegs in Kernel-Regression

内核倒退中尾部渐变源的学习曲线

2505.22048v1

668

05-28

Learning to Steer Learners in Games

Lernen zu Steer Learners in Spielen

学习在博弈中引导学习者

2502.20770v2

669

05-28

PUATE: Efficient Average Treatment Effect Estimation from Treated (Positive) and Unlabeled Units

PUATE: Effiziente Schätzung des durchschnittlichen Behandlungseffekts aus behandelten (Positiven) und nicht gekennzeichneten Einheiten

PUATE: 高效平均处理效果估算处理(积极)单位和无标签单位的高效平均处理效果

2501.19345v2

670

05-28

MultiScale Contextual Bandits for Long Term Objectives

MultiScale Contextual Bandits für langfristige Ziele

长期目标多层次背景影响

2503.17674v2

671

05-28

Latent Mamba Operator for Partial Differential Equations

Latent Mamba Operator für partielle Differentialgleichungen

部分差异方程的中端 Mamba 运算符

2505.19105v2

672

05-28

Estimating the Effects of Sample Training Orders for Large Language Models without Retraining

Bewertung der Auswirkungen von Mustertrainingsaufträgen für große Sprachmodelle ohne Umschulung

估计无再培训的大语言模式抽样培训令的影响

2505.22042v1

673

05-28

Detecting Undesired Process Behavior by Means of Retrieval Augmented Generation

Erkennung von unerwünschtem Prozessverhalten mittels retrievaler Augmented Generation

通过回收增加一代的手段检测不想要的流程行为

2505.22041v1

674

05-28

Revisiting In-Context Learning with Long Context Language Models

Das In-Context-Lernen mit langen Kontext-Sprachmodellen

以长方语言模式重新研究内文学习

2412.16926v3

675

05-28

Weakly-Supervised Contrastive Learning for Imprecise Class Labels

Schwachüberwachtes Kontrastives Lernen für ungenaue Klassen-Etiketten

简便类标签的微弱监督反竞争学习

2505.22028v1

676

05-28

Evaluation of the impact of expert knowledge: How decision support scores impact the effectiveness of automatic knowledge-driven feature engineering (aKDFE)

Bewertung der Auswirkungen von Expertenwissen: Wie die Entscheidungsunterstützung die Wirksamkeit des automatischen wissensbasierten Feature Engineerings beeinflusst (aKDFE)

评价专家知识的影响:决策支持的评分如何影响知识驱动的自动知识特性工程(KDFE)的有效性

2504.05928v2

677

05-28

Efficient Online Reinforcement Learning for Diffusion Policy

Effizientes Online-Verstärkungslernen für die Diffusionspolitik

高效在线强化学习促进传播政策

2502.00361v3

678

05-28

Model Diffusion for Certifiable Few-shot Transfer Learning

Modell-Diffusion für zertifizierbares Few-Shot-Transfer-Lernen

可核证的 “ 几光 “ 转让学习模型传播

2502.06970v2

679

05-28

Learning in Compact Spaces with Approximately Normalized Transformers

Lernen in kompakten Räumen mit etwa normalisierten Transformatoren

学习与大约正常化变异器的紧凑空间的学习

2505.22014v1

680

05-28

SageAttention2++: A More Efficient Implementation of SageAttention2

SageAttention2++: Effizientere Umsetzung von SageAttention2

SageAttention2++:更有效地实施SageAttention2

2505.21136v2

681

05-28

A Comprehensive Real-World Assessment of Audio Watermarking Algorithms: Will They Survive Neural Codecs?

Eine umfassende Real-World Bewertung von Audio Watermarking Algorithmen: Werden sie überleben Neural Codecs?

对音频水标定法的全面现实世界评估:它们能否生存神经规范?

2505.19663v2

682

05-28

Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains

Domaino1s: Leitende LLM-Gründung für erklärbare Antworten in High-Stakes-Domains

域1:在高占用域中解释可解答案的指导性LLM

2501.14431v2

683

05-28

Align-DA: Align Score-based Atmospheric Data Assimilation with Multiple Preferences

Align-DA: Align Score-basierte atmosphärische Daten Assimilation mit mehreren Präferenzen

Align-DA: 基于分数的大气数据同化与多重偏好对齐

2505.22008v1

684

05-28

Generalization Analysis for Supervised Contrastive Representation Learning under Non-IID Settings

Generalisierungsanalyse für überwachtes Kontrastives Repräsentationslernen unter Nicht-IID-Einstellungen

在非IID设置下受监督的违反代表制学习的通用分析

2505.04937v3

685

05-28

Locking-Free Training of Physics-Informed Neural Network for Solving Nearly Incompressible Elasticity Equations

Locking-Free Training of Physics-informed Neural Network for Solving Fast Incompressible Elasticity Equations

用于解决近不压缩弹性等量的物理内成神经网络的无锁化培训

2505.21994v1

686

05-28

Identifying Causal Direction via Variational Bayesian Compression

Identifizierung der Kausalrichtung durch variationale Bayesische Kompression

通过变异贝耶斯压缩确定因果方向

2505.07503v3

687

05-28

ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning

ACE: Exploring Activation Cosine Ähnlichkeit und Varianz für genaues und kalibrationseffizientes LLM Pruning

ACE: 探索在准确度和校准-有效LLM Pruning 方面活跃共生相近性和差异

2505.21987v1

688

05-28

Reward-Independent Messaging for Decentralized Multi-Agent Reinforcement Learning

Reward-independent Messaging für dezentralisiertes Mehr-Agenten-Verstärkungs-Lernen

权力下放多机构加强学习分权式多机构加强学习的回报独立通信

2505.21985v1

689

05-28

How to Synthesize Text Data without Model Collapse?

Wie können Sie Textdaten ohne Modellkollaps synthesieren?

如何在没有模式折叠的情况下合成文本数据 ?

2412.14689v3

690

05-28

Latent Weight Diffusion: Generating reactive policies instead of trajectories

Latent Weight Diffusion: Erzeugen von reaktiven Strategien anstelle von Trajektorien

负负重扩散: 产生反应性政策, 而不是轨迹

2410.14040v2

691

05-28

Two-Stage Feature Generation with Transformer and Reinforcement Learning

Zweistufige Feature-Generierung mit Transformer und Verstärkungslernen

具有变换器和强化学习的两阶段特色生成

2505.21978v1

692

05-28

Judging LLMs on a Simplex

LLMs auf einem Simplex zu urteilen

以简单方式判断LLMs

2505.21972v1

693

05-28

Mitigating Heterogeneous Token Overfitting in LLM Knowledge Editing

Heterogene Token-Übertragung in LLM-Wissensbearbeitung abmildern

减轻LLM知识编辑中的异质Token过拟合

2502.00602v2

694

05-28

Robust Reward Alignment via Hypothesis Space Batch Cutting

Robuste Belohnung Ausrichtung durch Hypothesis Raum Batch Schneiden

通过假设空间批量切割进行强力奖励调整

2502.02921v3

695

05-28

Cooperation of Experts: Fusing Heterogeneous Information with Large Margin

Kooperation von Experten: Verschmelzende Heterogene Informationen mit großer Spanne

专家合作:利用具有较大边际效应的异种信息

2505.20853v2

696

05-28

EnsemW2S: Enhancing Weak-to-Strong Generalization with Large Language Model Ensembles

EnsemW2S: Verbesserung der Schwach-zu-Strong-Verallgemeinerung mit großsprachigen Modellensembles

EnsemW2S:用大语言模型组合加强弱至强的通用化

2505.21959v1

697

05-28

A Stochastic Approximation Approach for Efficient Decentralized Optimization on Random Networks

Ein stochastischer Annäherungsansatz für eine effiziente dezentralisierte Optimierung von Random Networks

随机网络高效分散优化优化的斯托卡接近方法

2410.18774v2

698

05-28

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Kimi k1.5: Skalierungs-Verstärkungs-Lernen mit LLMs

Kimi k1.5:利用LLMs加强加强学习

2501.12599v3

699

05-28

Stochastic Primal-Dual Double Block-Coordinate for Two-way Partial AUC Maximization

Stochastische primäre Doppelblockkoordinate für Zwei-Wege-Partielle AUC-Maximierung

双向部分AUC 最大化

2505.21944v1

700

05-28

Continual Learning Beyond Experience Rehearsal and Full Model Surrogates

Kontinuierliches Lernen über die Erfahrung hinaus Proben und vollständige Modellüberlagerungen

超越经验回放和全模型替代的持续学习

2505.21942v1

701

05-28

Go With the Flow: Fast Diffusion for Gaussian Mixture Models

Mit dem Fluss gehen: Schnelle Diffusion für Gaussian Mixture Models

随流而去:高山混合模型的快速扩散

2412.09059v4

702

05-28

Practical Adversarial Attacks on Stochastic Bandits via Fake Data Injection

Praktische Adversarialangriffe auf stochastische Banditen durch gefälschte Dateninjektion

通过假数据注射,实际对抗性攻击斯托卡强盗

2505.21938v1

703

05-28

ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation

ReQFlow: Rektifizierter Quaternionsfluss für effiziente und hochwertige Protein-Backbone-Generation

ReQFlow:为高效和高品质蛋白后骨生成而调整的四量流动

2502.14637v3

704

05-28

Higher-Order Group Synchronization

Gruppensynchronisierung mit höherer Ordnung

高级分级组同步化

2505.21932v1

705

05-28

Exploring Criteria of Loss Reweighting to Enhance LLM Unlearning

Ermittlung von Kriterien für die Neugewichtung von Verlusten zur Verbesserung des LLM-Entlernens

探索损失重新加权标准,加强LLM遗忘学习

2505.11953v2

706

05-28

Efficient Ensemble for Fine-tuning Language Models on Multiple Datasets

Effizientes Ensemble für die Feinabstimmung von Sprachmodellen auf mehreren Datensätzen

多个数据集微调语言模型高效组合组合

2505.21930v1

707

05-28

Efficient Logit-based Knowledge Distillation of Deep Spiking Neural Networks for Full-Range Timestep Deployment

Effiziente Logit-basierte Wissensdestillation von Tiefen-Spiking-Neural-Netzwerken für die Bereitstellung von Vollstrecken-Zeitschritten

用于全红时间步骤部署的深渗透神经网络的高效基于逻辑的知识蒸馏

2501.15925v2

708

05-28

Subspecialty-Specific Foundation Model for Intelligent Gastrointestinal Pathology

Subspezialitätsspezifisches Stiftungsmodell für intelligente Gastrointestinalpathologie

面向智能胃肠道病理学的亚专科特定基础模型

2505.21928v1

709

05-28

RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination

RenderFormer: Transformer-basiertes Neural-Rendering von Dreiecksnetzen mit globaler Beleuchtung

RenderFormer:基于Transformer的具有全局光照的三角网格神经渲染

2505.21925v1

710

05-28

FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design

FALCON: Ein ML-Framework für vollautomatisierte Layout-Kontrainierte analoge Schaltungen

FALCON: 完全自动布局约束模拟电路设计 ML 框架

2505.21923v1

711

05-28

Self-supervised Learning Method Using Transformer for Multi-dimensional Sensor Data Processing

Selbstüberwachte Lernmethode mit Transformer für mehrdimensionale Sensordatenverarbeitung

利用变压器进行多维传感器数据处理的自监督学习方法

2505.21918v1

712

05-28

SlimLLM: Accurate Structured Pruning for Large Language Models

SlimLLM: Genau strukturiertes Pruning für große Sprachmodelle

SlimLLM:大型语言模型的准确结构审慎

2505.22689v1

713

05-28

Understanding the behavior of representation forgetting in continual learning

Das Verhalten der Repräsentation verstehen vergessen im kontinuierlichen Lernen

理解在不断学习中遗忘的代言人行为

2505.20970v2

714

05-28

ExpProof : Operationalizing Explanations for Confidential Models with ZKPs

ExpProof : Operationalisierung von Erklärungen für vertrauliche Modelle mit ZKPs

ExpProof:利用ZKPs使机密模型的解释可操作化

2502.03773v3

715

05-28

Taming Transformer Without Using Learning Rate Warmup

Zähmung Transformer ohne Verwendung von Lernrate Warmup

不使用学习率预热来驯服Transformer

2505.21910v1

716

05-28

Criticality and Safety Margins for Reinforcement Learning

Kritizität und Sicherheitsmargen für verstärktes Lernen

强化学习的临界和安全边缘

2409.18289v2

717

05-28

Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding

Verstärktes Lernen für Out-of-Distribution-Reasoning in LLMs: Eine empirische Studie zur diagnostischen Gruppencodierung

LLM中分布外推理的强化学习:诊断相关分组编码的实证研究

2505.21908v1

718

05-28

OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models

OVERT: Ein Benchmark für eine überwiderrechtliche Bewertung von Text-zu-Bild-Modellen

OVERT: 文本到图像模型的过度拒绝评估基准

2505.21347v2

719

05-28

Geometry-Informed Neural Operator Transformer

Geometrie-informierter Neuraloperator Transformer

智能神经操作器变换器

2504.19452v3

720

05-28

Integrating Intermediate Layer Optimization and Projected Gradient Descent for Solving Inverse Problems with Diffusion Models

Integration von Intermediate Layer Optimization und projizierter Gradient Descent zur Lösung inverser Probleme mit Diffusionsmodellen

整合中间层优化和预测梯度,以解决传播模型的反向问题

2505.20789v2

721

05-28

Combinatorial Reinforcement Learning with Preference Feedback

Kombinatorisches Stärkungslernen mit Präferenz-Feedback

结合强化学习与优先反馈

2502.10158v2

722

05-28

ReGNet: Reciprocal Space-Aware Long-Range Modeling for Crystalline Property Prediction

ReGNet: Reziproke Raum-Bewusst-Langstrecken-Modellierung für kristalline Eigenschaftsvorhersage

ReGNet:水晶财产预测的对等空间-软件长距离模型模型

2502.02748v2

723

05-28

Language-Enhanced Representation Learning for Single-Cell Transcriptomics

Sprachverstärktes Repräsentationslernen für Single-Cell-Transkriptomik

单一计算机转基因学的提高语言代表性学习

2503.09427v3

724

05-28

Federated Continual Graph Learning

Föderiertes kontinuierliches Graphenlernen

联邦连续图学习

2411.18919v3

725

05-28

Towards Large Reasoning Models for Agriculture

Auf dem Weg zu groß angelegten Konzepten für die Landwirtschaft

争取实现农业大理由解释模式

2505.19259v2

726

05-28

Compressing Sine-Activated Low-Rank Adapters through Post-Training Quantization

Komprimierende Sine-Activated Low-Rank-Adapter durch Quantisierung nach dem Training

通过培训后定量化压缩松状活动低Rank适应器

2505.21895v1

727

05-28

SDPO: Importance-Sampled Direct Preference Optimization for Stable Diffusion Training

SDPO: Importance-Sampled Direct Preference Optimierung für stabile Diffusionsschulungen

SDPO: 稳定传播培训的重要性抽样直接优惠优化

2505.21893v1

728

05-28

ControlTac: Force- and Position-Controlled Tactile Data Augmentation with a Single Reference Image

ControlTac: Kraft- und positionsgesteuerte taktile Datenvergrößerung mit einem einzigen Referenzbild

控制塔克: 带有单一参考图像的力控和位置控轨迹数据增强

2505.20498v2

729

05-28

Almost Linear Convergence under Minimal Score Assumptions: Quantized Transition Diffusion

Fast lineare Konvergenz unter Minimal-Score Annahmen: Quantisierte Transition Diffusion

在最低分数假设下几乎线性聚合:量化过渡扩散

2505.21892v1

730

05-28

Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models

Auf dem Weg zu robuster automatisierter Wahrnehmungsqualitätsbewertung mit Sprachstiftungsmodellen

以语音基金会模式进行强有力的自主声音质量评估

2505.21356v2

731

05-28

Symbolic Foundation Regressor on Complex Networks

Symbolischer Foundation-Regressor auf komplexen Netzwerken

复杂网络上的反射器

2505.21879v1

732

05-28

Hybrid Batch Normalisation: Resolving the Dilemma of Batch Normalisation in Federated Learning

Hybride Batch-Normalisierung: Lösung des Dilemmas der Batch-Normalisierung im Federated Learning

混合批次正常化:解决联邦学习中批次正常化的难题

2505.21877v1

733

05-28

Targeted Unlearning Using Perturbed Sign Gradient Methods With Applications On Medical Images

Gezieltes Verlernen mit gestörten Sign-Gradient-Methoden mit Anwendungen auf medizinische Bilder

采用固定信号渐进方法,在医学图像上应用医学图象,有针对性地取消学习

2505.21872v1

734

05-28

Coarse-to-fine Q-Network with Action Sequence for Data-Efficient Robot Learning

Coarse-to-fine Q-Network mit Aktionssequenz für dateneffizientes Roboterlernen

Coarse 至 fine Q 网络与数据效率机器人学习行动序列

2411.12155v4

735

05-28

Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures

Mini-Batch Coresets für speichereffiziente Sprachmodellschulungen auf Datenmischungen

记忆效率语言数据混合模型培训微型批量核心数据集

2407.19580v4

736

05-28

Revisiting Bayesian Model Averaging in the Era of Foundation Models

Bayesianisches Modell im Zeitalter der Gründungsmodelle neu besuchen

重新审查基金会模式时代的贝耶斯模式

2505.21857v1

737

05-28

Meta Co-Training: Two Views are Better than One

Meta Co-Training: Zwei Ansichten sind besser als eine

Meta联合培训:两种观点比一种观点更好

2311.18083v5

738

05-28

Investigating the effectiveness of multimodal data in forecasting SARS-COV-2 case surges

Untersuchung der Wirksamkeit multimodaler Daten bei der Prognose von SARS-COV-2-Fallfluten

调查多式联运数据在预测SARS-COV-2案件激增方面的有效性

2505.22688v1

739

05-28

Multi-Label Bayesian Active Learning with Inter-Label Relationships

Multi-Label Bayesian Aktives Lernen mit inter-Label Beziehungen

多标签贝耶斯人积极学习与跨标签关系

2411.17941v2

740

05-28

Improving the Variance of Differentially Private Randomized Experiments through Clustering

Verbesserung der Varianz von differenziert privaten Randomisierten Experimenten durch Clustering

通过集群化改进差异私人随机化实验的差异

2308.00957v3

741

05-28

ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model

ItDPDM: Informationstheoretisches Diskretes Poisson-Diffusionsmodell

ItDPDM:信息论离散Poisson扩散模型

2505.05082v3

742

05-28

Solving Empirical Bayes via Transformers

Lösen von Empirical Bayes über Transformer

通过变换器解决实证贝贝

2502.09844v2

743

05-28

Continuous Thought Machines

Kontinuierliche Gedankenmaschinen

连续思考机

2505.05522v3

744

05-28

Statistical Inference for Temporal Difference Learning with Linear Function Approximation

Statistische Schlussfolgerung für zeitliches Differenzlernen mit linearer Funktionsannäherung

与线性函数接近一致的时空差异学习统计推推

2410.16106v3

745

05-28

A Provable Approach for End-to-End Safe Reinforcement Learning

Ein beweisbarer Ansatz für durchgängig sicheres Reinforcement Learning

最终至最终安全强化学习的可行办法

2505.21852v1

746

05-28

Streaming Flow Policy: Simplifying diffusion$/$flow-matching policies by treating action trajectories as flow trajectories

Streaming Flow Policy: Vereinfachende Diffusion$/$ Flow-Matching-Richtlinien durch Behandlung von Aktionsbahnen als Flow-Trajektorien

流式流策略:通过将动作轨迹视为流轨迹来简化扩散/流匹配策略

2505.21851v1

747

05-28

Spectral clustering for dependent community Hawkes process models of temporal networks

Spektrales Clustering für abhängige Community Hawkes Prozessmodelle von zeitlichen Netzwerken

依赖依赖性社区霍克斯时间网络过程模型光谱群群群

2505.21845v1

748

05-28

A Physics-Informed Learning Framework to Solve the Infinite-Horizon Optimal Control Problem

Ein physikinformiertes Lernrahmenwerk zur Lösung des Unendlichen-Horizon-Optimalen Steuerungsproblems

求解无限时域最优控制问题的物理信息学习框架

2505.21842v1

749

05-28

An Optimistic Algorithm for online CMDPS with Anytime Adversarial Constraints

Optimistischer Algorithmus für Online-CMDPS mit jederzeit feindlichen Einschränkungen

带有任何时间的反逆限制的在线 CMDPS 优化算法

2505.21841v1

750

05-28

Natural Language Reinforcement Learning

Natürlichsprachliches Reinforcement Learning

自然语言强化学习

2411.14251v3

751

05-28

UniMoGen: Universal Motion Generation

UniMoGen:通用运动生成

2505.21837v1

752

05-27 (2)

Inferring Traffic Models in Terminal Airspace from Flight Tracks and Procedures

Ableiten von Verkehrsmodellen im Terminal-Luftraum von Flugspuren und -verfahren

从飞行轨道和程序中推断终端航空空间的交通模式

2303.09981v3

753

05-27

TuneComp: Joint Fine-tuning and Compression for Large Foundation Models

TuneComp: Gemeinsame Feinabstimmung und Kompression für große Fundamentmodelle

TuneComp:大型基础模型的联合微调与压缩

2505.21835v1

754

05-27

Constrained Discrete Diffusion

Beschränkte diskrete Diffusion

约束离散扩散

2503.09790v2

755

05-27

In Search of Adam’s Secret Sauce

Auf der Suche nach Adams geheimer Sauce

寻找Adam的秘密配方

2505.21829v1

756

05-27

Music Source Restoration

Restaurierung der Musikquelle

音乐来源恢复

2505.21827v1

757

05-27

From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization

Von EduVisBench zu EduVisAgent: Ein Benchmark und Multi-Agenten-Framework für reasoning-gesteuerte pädagogische Visualisierung

从EduVisBench到EduVisAgent:面向推理驱动教学可视化的基准与多智能体框架

2505.16832v2

758

05-27

Let Me Think! A Long Chain-of-Thought Can Be Worth Exponentially Many Short Ones

Lass mich nachdenken! Eine lange Gedankenkette kann exponentiell viele kurze wert sein

让我想想!一条长思维链的价值可能相当于指数级数量的短思维链

2505.21825v1

759

05-27

Unsupervised Latent Pattern Analysis for Estimating Type 2 Diabetes Risk in Undiagnosed Populations

Unüberwachte Latent Pattern Analyse zur Schätzung des Typ-2-Diabetes-Risikos in nicht diagnostizierten Populationen

用于估计未确诊人群2型糖尿病风险的无监督潜在模式分析

2505.21824v1

760

05-27

An Innovative Data-Driven and Adaptive Reinforcement Learning Approach for Context-Aware Prescriptive Process Monitoring

Ein innovativer datengetriebener und adaptiver Reinforcement-Learning-Ansatz für kontextbewusstes präskriptives Prozessmonitoring

一种用于情境感知规范性流程监控的创新型数据驱动自适应强化学习方法

2501.10543v2

761

05-27

DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra

DiffMS: Diffusionserzeugung von Molekülen auf Massenspektren

DiffMS: 受质量光谱约束的分子的扩散生成

2502.09571v2

762

05-27

Representative Language Generation

Repräsentative Sprachgenerierung

代表性语言生成

2505.21819v1

763

05-27

Optimizing Data Augmentation through Bayesian Model Selection

Optimierung der Datenvergrößerung durch Bayesian Model Selection

通过Bayesian模式选择优化数据增加

2505.21813v1

764

05-27

Learning Enhanced Ensemble Filters

Enhanced Ensemble Filter lernen

学习增强的组合过滤器

2504.17836v2

765

05-27

ThinkGuard: Deliberative Slow Thinking Leads to Cautious Guardrails

ThinkGuard: Bedächtiges langsames Denken führt zu vorsichtigen Guardrails

ThinkGuard:审慎的慢思考带来谨慎的护栏

2502.13458v2

766

05-27

Voice Quality Dimensions as Interpretable Primitives for Speaking Style for Atypical Speech and Affect

Sprachqualitätsdimensionen als Interpretierbare Primitive für sprechenden Stil für atypische Sprache und Affekt

语音质量方面作为非非典型演讲和影响说话风格的可解释的原始语言

2505.21809v1

767

05-27

Towards Operational Automated Greenhouse Gas Plume Detection

Auf dem Weg zu einer operationell automatisierten Treibhausgas-Plume-Erkennung

实现操作性自动温室气体管道探测

2505.21806v1

768

05-27

From Directions to Cones: Exploring Multidimensional Representations of Propositional Facts in LLMs

Von Richtungen zu Kegeln: Erforschung mehrdimensionaler Darstellungen propositionaler Fakten in LLMs

从方向到锥体:探索LLM中命题事实的多维表示

2505.21800v1

769

05-27

PolarGrad: A Class of Matrix-Gradient Optimizers from a Unifying Preconditioning Perspective

PolarGrad: Eine Klasse von Matrix-Gradienten-Optimierern aus einer einheitlichen Sicht der Vorkonditionierung

PolarGrad:统一预条件视角下的一类矩阵梯度优化器

2505.21799v1

770

05-27

A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging

Ein Allzweck-Theorem für Hochwahrscheinlichkeitsschranken stochastischer Approximation mit Polyak-Mittelung

带Polyak平均的随机逼近高概率界的通用定理

2505.21796v1

771

05-27

End-to-End Breast Cancer Radiotherapy Planning via LMMs with Consistency Embedding

End-to-End-Brustkrebs-Radiotherapie Planung über LMMs mit Konsistenz-Embedding

通过具有一致嵌入的LMMs进行端至端乳腺癌放射治疗规划

2311.15876v4

772

05-27

Multimodal Federated Learning: A Survey through the Lens of Different FL Paradigms

Multimodales Federated Learning: Eine Umfrage durch die Linse verschiedener FL-Paradigmen

多模式联邦学习:通过不同FL范式的镜头进行调查

2505.21792v1

773

05-27

LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models

LV-XAttn: Verteilte Cross-Attention für lange visuelle Eingänge in multimodalen großen Sprachmodellen

LV-XAttn:多式大语言模型中长视输入分布式交叉注意

2502.02406v3

774

05-27

Global Minimizers of $\ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks

$\ell^p$正则化目标的全局最小值点给出最稀疏的ReLU神经网络

2505.21791v1

775

05-27

Faster Rates for Private Adversarial Bandits

Schnellere Raten für private Adversarial Bandits

隐私对抗性老虎机的更快速率

2505.21790v1

776

05-27

Wanda++: Pruning Large Language Models via Regional Gradients

Wanda++: Beschneiden großer Sprachmodelle über regionale Gradienten

Wanda++:通过区域梯度对大语言模型进行剪枝

2503.04992v3

777

05-27

Born a Transformer – Always a Transformer?

Geboren ein Transformer - immer ein Transformer?

生为Transformer,永远是Transformer?

2505.21785v1

778

05-27

Universal Approximation of Mean-Field Models via Transformers

Universelle Approximation von Mean-Field-Modellen mittels Transformern

通过Transformer实现平均场模型的通用逼近

2410.16295v2

779

05-27

Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models

Wasserzeichen im Sand: Unmöglichkeit der starken Wasserzeichen für generative Modelle

沙中的水印:生成模型强水印的不可能性

2311.04378v5

780

05-27

P-DROP: Poisson-Based Dropout for Graph Neural Networks

P-DROP: Poisson-basiertes Dropout für Graphen-Neural-Netzwerke

P-DROP:用于图神经网络的基于泊松的Dropout

2505.21783v1

781

05-27

Diffusion Adversarial Post-Training for One-Step Video Generation

Diffusions-Adversarial-Post-Training für die One-Step-Videogenerierung

面向一步视频生成的扩散对抗后训练

2501.08316v2

782

05-27

Memorization to Generalization: Emergence of Diffusion Models from Associative Memory

Von der Memorisierung zur Generalisierung: Entstehung von Diffusionsmodellen aus dem assoziativen Gedächtnis

从记忆到泛化:扩散模型从联想记忆中涌现

2505.21777v1

783

05-27

DualSchool: How Reliable are LLMs for Optimization Education?

DualSchool: Wie zuverlässig sind LLMs für die Optimierungsbildung?

两所学校:优化教育LLMs有多可靠?

2505.21775v1

784

05-27

Backdoors in DRL: Four Environments Focusing on In-distribution Triggers

Hintertüren in DRL: Vier Umgebungen mit Fokus auf In-Distribution Trigger

DRL的后门:四个环境,侧重于内部分配触发器

2505.17248v2

785

05-27

Beyond 1D: Vision Transformers and Multichannel Signal Images for PPG-to-ECG Reconstruction

Beyond 1D: Vision Transformers und Multichannel Signal Images für PPG-zu-ECG-Rekonstruktion

1D之后:为重建PPPG至ECG提供愿景变形器和多通道信号图像

2505.21767v1

786

05-27

Explainable Multi-modal Time Series Prediction with LLM-in-the-Loop

Erklärbare multimodale Zeitreihenvorhersage mit LLM-in-the-Loop

与LLM in-Loop的可解释的多时时间序列预测

2503.01013v2

787

05-27

TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster

TS-RAG: Retrieval-Augmented Generation basierte Time Series Foundation Modelle sind stärker Zero-Shot Forecaster

TS-RAG:基于时间序列的回收-养殖一代基于时间序列的基础模型是更强的零热预测仪

2503.07649v3

788

05-27

Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and Over-Parameterization

Scheinkorrelationen in der hochdimensionalen Regression: Die Rollen von Regularisierung, Simplicity Bias und Überparametrisierung

高维回归中的伪相关:正则化、简单性偏差与过参数化的作用

2502.01347v2

789

05-27

FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering

FRAMES-VQA: Benchmarking Fine-Tuning Robustheit über Multi-Modal Shifts in der visuellen Fragestellung

FRAMES-VQA:确定视觉问题解答中多模式变化的精确调整强度基准

2505.21755v1

790

05-27

Path Planning for Masked Diffusion Model Sampling

Pfadplanung für maskierte Diffusions-Modell-Probenahme

蒙面扩散模型取样规划路径

2502.03540v4

791

05-27

Hierarchical Reinforcement Learning with Uncertainty-Guided Diffusional Subgoals

Hierarchisches Stärkungslernen mit unsicheren, diffusionalen Unterzielen

具有不确定性的梯级强化学习,有不确定的辅助分传播目标

2505.21750v1

792

05-27

Revisiting Bi-Linear State Transitions in Recurrent Neural Networks

Ein neuer Blick auf bilineare Zustandsübergänge in rekurrenten neuronalen Netzen

重新审视循环神经网络中的双线性状态转移

2505.21749v1

793

05-27

Privacy for Free in the Overparameterized Regime

Privatsphäre kostenlos im überparameterisierten Regime

过参数化情形下的免费隐私

2410.14787v2

794

05-27

Learning to See More: UAS-Guided Super-Resolution of Satellite Imagery for Precision Agriculture

Mehr erfahren: UAS-geführte Super-Resolution von Satellitenbildern für Präzisionslandwirtschaft

学习更多见:UAS-UAS指导的精密农业卫星图像超级分辨率

2505.21746v1

795

05-27

Simulating the Unseen: Crash Prediction Must Learn from What Did Not Happen

Das Unsichtbare simulieren: Crash Prediction muss lernen, was nicht passiert ist

模拟看不见:崩溃预测必须从没有发生的事情中吸取教训

2505.21743v1

796

05-27

Outlier-Robust Linear System Identification Under Heavy-tailed Noise

Ausreißerrobuste lineare Systemidentifikation unter Heavy-Tailed-Rauschen

重尾噪声下对离群值鲁棒的线性系统辨识

2501.00421v2

797

05-27

What is Adversarial Training for Diffusion Models?

Was ist ein Adversarial Training für Diffusionsmodelle?

什么是扩散模型的对抗训练?

2505.21742v1

798

05-27

Polynomial Chaos Expanded Gaussian Process

Polynomisches Chaos erweiterter Gauß-Prozess

多项式混沌展开高斯过程

2405.01052v2

799

05-27

Moment kernels: a simple and scalable approach for equivariance to rotations and reflections in deep convolutional networks

Momentkerne: ein einfacher und skalierbarer Ansatz für Äquivarianz gegenüber Rotationen und Spiegelungen in tiefen Faltungsnetzwerken

矩核:在深度卷积网络中实现旋转和反射等变性的一种简单且可扩展的方法

2505.21736v1

800

05-27

Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization

Adressierung von Konzept-Mislabeling in Konzept-Bottleneck-Modellen durch Preference-Optimierung

通过优先优化处理概念瓶颈模式中的概念误贴标签问题

2504.18026v2

801

05-27

Non-Markovian Discrete Diffusion with Causal Language Models

Nicht-Markovsche diskrete Diffusion mit kausalen Sprachmodellen

基于因果语言模型的非马尔可夫离散扩散

2502.09767v2

802

05-27

MIND-Stack: Modular, Interpretable, End-to-End Differentiability for Autonomous Navigation

MIND-Stack: Modulare, interpretierbare End-to-End-Differenzierbarkeit für die autonome Navigation

MIND-Stack:面向自主导航的模块化、可解释、端到端可微性

2505.21734v1

803

05-27

LaX: Boosting Low-Rank Training of Foundation Models via Latent Crossing

LaX: Förderung der Low-Rank-Schulung von Stiftungsmodellen durch Latent Crossing

LaX:通过潜在交叉提升基础模型的低秩训练

2505.21732v1

804

05-27

Deep Reinforcement Learning Agents are not even close to Human Intelligence

Deep-Reinforcement-Learning-Agenten sind der menschlichen Intelligenz nicht einmal nahe

深度强化学习智能体远未接近人类智能

2505.21731v1

805

05-27

Are Statistical Methods Obsolete in the Era of Deep Learning?

Sind statistische Methoden im Zeitalter des tiefen Lernens überholt?

统计方法是否在深层学习时代过时?

2505.21723v1

806

05-27

Saddle-To-Saddle Dynamics in Deep ReLU Networks: Low-Rank Bias in the First Saddle Escape

Sattel-zu-Sattel-Dynamik in Deep ReLU Networks: Low-Rank Bias bei der ersten Sattelflucht

深度ReLU网络中的鞍点到鞍点动力学:首次逃离鞍点时的低秩偏差

2505.21722v1

807

05-27

CTBENCH: A Library and Benchmark for Certified Training

CTBENCH: Eine Bibliothek und Benchmark für zertifizierte Ausbildung

CTBENCH:认证训练的库与基准

2406.04848v4

808

05-27

Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference

Nahezu dimensionsunabhängige Konvergenz der Mean-Field-Black-Box-Variationsinferenz

平均场黑盒变分推断的近乎维度无关收敛

2505.21721v1

809

05-27

Simple Guidance Mechanisms for Discrete Diffusion Models

Einfache Leitmechanismen für diskrete Diffusionsmodelle

离散扩散模型的简单引导机制

2412.10193v3

810

05-27

Training Dynamics of In-Context Learning in Linear Attention

Trainingsdynamik des In-Context-Lernens in linearer Aufmerksamkeit

线性注意力中上下文学习的训练动力学

2501.16265v2

811

05-27

Network classification through random walks

Netzwerkklassifizierung durch Random Walks

通过随机行走进行网络分类

2505.21706v1

812

05-27

AMSFL: Adaptive Multi-Step Federated Learning via Gradient Difference-Based Error Modeling

AMSFL: Adaptives Multi-Step-Federated Learning über gradient Difference-based Error Modeling

AMSFL:通过基于梯度差的误差建模实现自适应多步联邦学习

2505.21695v1

813

05-27

What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization

Welche Daten ermöglichen optimale Entscheidungen? Eine genaue Charakterisierung für lineare Optimierung

什么数据能使最佳决定实现最佳决定? 线性优化的精确属性

2505.21692v1

814

05-27

LLMPR: A Novel LLM-Driven Transfer Learning based Petition Ranking Model

LLMPR: Ein neuartiges LLM-getriebenes Transfer-Learning-basiertes Petitions-Ranking-Modell

LLMPR:基于请愿排级的新式LLM-驱动转移学习模式

2505.21689v1

815

05-27

Empirical analysis of binding precedent efficiency in Brazilian Supreme Court via case classification

Empirische Analyse der verbindlichen Präzedenzeffizienz im brasilianischen Obersten Gerichtshof über die Fallklassifizierung

通过案件分类对巴西最高法院具有约束力的先例效率进行经验分析

2407.07004v3

816

05-27

Probabilistic Reasoning with LLMs for k-anonymity Estimation

Probabilistische Begründung mit LLMs für k-Anonymitätsschätzung

K-匿名性估计法LLMs的概率推理

2503.09674v3

817

05-27

Improving User Behavior Prediction: Leveraging Annotator Metadata in Supervised Machine Learning Models

Verbesserung der Benutzerverhaltensvorhersage: Annotator-Metadaten in überwachten Machine Learning-Modellen nutzen

改进用户行为预测:在受监督的机器学习模型中利用标记元数据

2503.21000v2

818

05-27

tenSVD algorithm for compression

tenSVD-Algorithmus zur Kompression

用于压缩的 tenSVD 算法

2505.21686v1

819

05-27

Edit Distance Robust Watermarks via Indexing Pseudorandom Codes

Gegenüber Editierdistanz robuste Wasserzeichen durch Indexierung pseudozufälliger Codes

通过索引伪随机码实现对编辑距离鲁棒的水印

2406.02633v2

820

05-27

Incentivizing Permissionless Distributed Learning of LLMs

Anreize für erlaubnisfreies verteiltes Lernen von LLMs

激励LLM的无许可分布式学习

2505.21684v1

821

05-27

multivariateGPT: a decoder-only transformer for multivariate categorical and numeric data

multivariateGPT: ein nur Decoder-Transformator für multivariate kategoriale und numerische Daten

multivariateGPT:用于多变量类别型和数值型数据的仅解码器Transformer

2505.21680v1

822

05-27

Fast meta-solvers for 3D complex-shape scatterers using neural operators trained on a non-scattering problem

Schnelle Meta-Löser für 3D-Streukörper komplexer Form mit neuronalen Operatoren, die auf einem Nicht-Streuproblem trainiert wurden

利用在非散射问题上训练的神经算子,构建复杂形状三维散射体的快速元求解器

2405.12380v2

823

05-27

Robust LLM Alignment via Distributionally Robust Direct Preference Optimization

Robuste LLM-Ausrichtung über distributiv robuste Direktpräferenzoptimierung

通过分布鲁棒直接偏好优化实现鲁棒的LLM对齐

2502.01930v2

824

05-27

What happens when generative AI models train recursively on each others’ generated outputs?

Was passiert, wenn generative KI-Modelle rekursiv auf den jeweils anderen generierten Ausgängen trainieren?

当生成式AI模型在彼此生成的输出上递归训练时会发生什么?

2505.21677v1

825

05-27

In-Context Linear Regression Demystified: Training Dynamics and Mechanistic Interpretability of Multi-Head Softmax Attention

In-Context-Lineare-Regression entmystifiziert: Trainingsdynamik und mechanistische Interpretierbarkeit von Multi-Head-Softmax-Attention

揭秘上下文线性回归:多头Softmax注意力的训练动力学与机制可解释性

2503.12734v2

826

05-27

Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations

Schnelles lebenslanges adaptives Inverse Reinforcement Learning aus Demonstrationen

基于示范的快速终身自适应逆强化学习

2209.11908v8

827

05-27

Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing

Adaptive Frontier Exploration von Graphen mit Anwendungen für netzwerkbasierte Krankheitstests

图上的自适应前沿探索及其在基于网络的疾病检测中的应用

2505.21671v1

828

05-27

Efficient Controllable Diffusion via Optimal Classifier Guidance

Effiziente steuerbare Diffusion über Optimal Classifier Guidance

通过最佳分类指南有效控制可控扩散

2505.21666v1

829

05-27

Constraint-Adaptive Policy Switching for Offline Safe Reinforcement Learning

Constraint-Adaptive Policy Switching für sicheres Offline-Reinforcement-Learning

面向离线安全强化学习的约束自适应策略切换

2412.18946v2

830

05-27

PreGenie: An Agentic Framework for High-quality Visual Presentation Generation

PreGenie: Agentisches Framework für hochwertige visuelle Präsentationsgeneration

PreGenie:高质量视觉演示制作的代理框架

2505.21660v1

831

05-27

STACI: Spatio-Temporal Aleatoric Conformal Inference

STACI: Spatio-Temporale aleatorische Konforme Schlussfolgerung

STACI:时空偶然不确定性共形推断

2505.21658v1

832

05-27

Explainability of Large Language Models using SMILE: Statistical Model-agnostic Interpretability with Local Explanations

Erklärbarkeit großer Sprachmodelle mit SMILE: Statistische Modell-agnostische Interpretierbarkeit mit lokalen Erklärungen

使用SMILE实现大语言模型的可解释性:基于局部解释的统计模型无关可解释性

2505.21657v1

833

05-27

BACON: A fully explainable AI model with graded logic for decision making problems

BACON: Ein voll erklärbares KI-Modell mit abgestufter Logik für Entscheidungsprobleme

BACON:面向决策问题、采用分级逻辑的完全可解释AI模型

2505.14510v3

834

05-27

AutoSGD: Automatic Learning Rate Selection for Stochastic Gradient Descent

AutoSGD: Automatische Lernratenauswahl für den stochastischen Gradientenabstieg

AutoSGD:随机梯度下降的自动学习率选择

2505.21651v1

835

05-27

QuARI: Query Adaptive Retrieval Improvement

QuARI: Abfrageadaptive Verbesserung des Retrievals

QuARI: 查询适应性检索改进

2505.21647v1

836

05-27

PrivATE: Differentially Private Confidence Intervals for Average Treatment Effects

PrivATE: Differentiell private Konfidenzintervalle für durchschnittliche Behandlungseffekte

PrivATE:平均处理效应的差分隐私置信区间

2505.21641v1

837

05-27

Efficient Diffusion Models for Symmetric Manifolds

Effiziente Diffusionsmodelle für symmetrische Mannigfaltigkeiten

对称流形上的高效扩散模型

2505.21640v1

838

05-27

Apprenticeship learning with prior beliefs using inverse optimization

Apprenticeship Learning mit Vorannahmen mittels inverser Optimierung

利用逆优化进行带先验信念的学徒学习

2505.21639v1

839

05-27

Is Your LLM Overcharging You? Tokenization, Transparency, and Incentives

Berechnet Ihr LLM Ihnen zu viel? Tokenisierung, Transparenz und Anreize

你的LLM是否多收了你的费用?分词、透明度与激励

2505.21627v1

840

05-27

Localized Weather Prediction Using Kolmogorov-Arnold Network-Based Models and Deep RNNs

Lokalisierte Wettervorhersage mit Kolmogorov-Arnold-Netzwerk-basierten Modellen und tiefen RNNs

利用基于Kolmogorov-Arnold网络的模型和深度RNN进行局地天气预测

2505.22686v1

841

05-27

Learning Where to Learn: Training Distribution Selection for Provable OOD Performance

Lernen, wo man lernen sollte: Auswahl der Trainingsverteilung für beweisbare OOD-Leistung

学习从何处学习:为可证明的OOD性能选择训练分布

2505.21626v1

842

05-27

VideoMarkBench: Benchmarking Robustness of Video Watermarking

VideoMarkBench: Benchmarking Robustheit von Video Watermarking

VideoMarkBench:视频水印鲁棒性基准测试

2505.21620v1

843

05-27

Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making

Schweigen ist kein Konsens: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making

沉默不是共识:通过用于临床决策的Catfish代理商在多方代理LLMs中破坏协议的偏见

2505.21503v1

844

05-27

UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents

UI-Genie: Ein selbstverbesserender Ansatz zur iterativen Steigerung von MLLM-basierten mobilen GUI-Agenten

UI-Genie: 一种自我改进的方法,用于在刺激下促进基于MLLLM的移动图形界面工具

2505.21496v1

845

05-27

Reinforcing General Reasoning without Verifiers

Stärkung des allgemeinen Reasonings ohne Verifizierer

无需验证器即可强化通用推理

2505.21493v1

846

05-27

Be Decisive: Noise-Induced Layouts for Multi-Subject Generation

Entscheidend sein: Lärminduzierte Layouts für die mehrteilige Generierung

Be Decisive: 多主题生成的噪音生成布局

2505.21488v1

847

05-27

Hardware-Efficient Attention for Fast Decoding

Hardware-Effiziente Aufmerksamkeit für schnelle Dekodierung

面向快速解码的硬件高效注意力

2505.21487v1

848

05-27

Algorithms and SQ Lower Bounds for Robustly Learning Real-valued Multi-index Models

Algorithmen und SQ-untere Schranken für das robuste Lernen reellwertiger Multi-Index-Modelle

鲁棒学习实值多指标模型的算法与SQ下界

2505.21475v1

849

05-27

Annealing Flow Generative Models Towards Sampling High-Dimensional and Multi-Modal Distributions

Annealing Flow Generative Modelle zur Probenahme hochdimensionaler und multi-Modalen Verteilungen

面向高维多模态分布采样的退火流生成模型

2409.20547v4

850

05-27

SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge

SOSBENCH: Benchmarking der Sicherheitsausrichtung auf wissenschaftliche Erkenntnisse

SOSBENCH:以科学知识为安全协调基准

2505.21605v1

851

05-27

Guide your favorite protein sequence generative model

Führen Sie Ihre Lieblings-Protein-Sequenz generative Modell

引导你最喜爱的蛋白质序列生成模型

2505.04823v2

852

05-27

When Are Concepts Erased From Diffusion Models?

Wann werden Konzepte von Diffusionsmodellen ausgelöscht?

概念何时才算从扩散模型中被擦除?

2505.17013v3

853

05-27

On the Robustness of Adversarial Training Against Uncertainty Attacks

Über die Robustheit des Adversarial Trainings gegen Unsicherheitsangriffe

论对抗训练对不确定性攻击的鲁棒性

2410.21952v2

854

05-27

Causal Posterior Estimation

Kausale Posterior-Schätzung

因果后验估计

2505.21468v1

855

05-27

GeLLMO: Generalizing Large Language Models for Multi-property Molecule Optimization

GeLLMO: Verallgemeinern von großen Sprachmodellen für Multi-Property-Molekül-Optimierung

GeLLMO:面向多性质分子优化的通用大语言模型

2502.13398v2

856

05-27

High-Dimensional Calibration from Swap Regret

Hochdimensionale Kalibrierung aus Swap-Regret

从 Swap Regret 进行高维校准

2505.21460v1

857

05-27

Designing Cyclic Peptides via Harmonic SDE with Atom-Bond Modeling

Entwurf zyklischer Peptide mittels harmonischer SDE mit Atom-Bindungs-Modellierung

通过带原子-键建模的调和SDE设计环肽

2505.21452v1

858

05-27

Training neural control variates using correlated configurations

Training neuronaler Kontrollvariablen mit korrelierten Konfigurationen

使用相关构型训练神经控制变量

2505.07719v2

859

05-27

When Two LLMs Debate, Both Think They’ll Win

Wenn zwei LLMs diskutieren, denken beide, dass sie gewinnen werden

当两个LLM 辩论, 双方都认为他们会赢

2505.19184v2

860

05-27

Leveraging XP and CRISP-DM for Agile Data Science Projects

Nutzung von XP und CRISP-DM für agile Data Science Projekte

在敏捷数据科学项目中运用XP和CRISP-DM

2505.21603v1

861

05-27

Can Large Reasoning Models Self-Train?

Können sich große Reasoning-Modelle selbst trainieren?

大型推理模型能够自我训练吗?

2505.21444v1

862

05-27

Autoencoding Random Forests

Autoencoding mit Random Forests

自动编码随机森林

2505.21441v1

863

05-27

ANCHOLIK-NER: A Benchmark Dataset for Bangla Regional Named Entity Recognition

ANCHOLIK-NER: Ein Benchmark-Datensatz für Bangla Regional Named Entity Recognition

ANCHOLIK-NER:孟加拉地区命名实体识别基准数据集

2502.11198v3

864

05-27

Measuring Fine-Grained Relatedness in Multitask Learning via Data Attribution

Messung der feinkörnigen Verbundenheit im Multitasking-Lernen über Datenzuweisung

通过数据归责衡量多任务学习中的细微关联

2505.21438v1

865

05-27

Distributional Scaling for Emergent Capabilities

Verteilungsskalierung für Emergent Capabilities

涌现能力的分布式缩放

2502.17356v3

866

05-27

Attribute-Efficient PAC Learning of Sparse Halfspaces with Constant Malicious Noise Rate

Attribut-effizientes PAC-Lernen dünn besetzter Halbräume bei konstanter Rate böswilligen Rauschens

恒定恶意噪声率下稀疏半空间的属性高效PAC学习

2505.21430v1

867

05-27

QuForge: A Library for Qudits Simulation

QuForge: Eine Bibliothek für Qudits Simulation

QuForge:Qudit模拟库

2409.17716v2

868

05-27

Stochastic Online Conformal Prediction with Semi-Bandit Feedback

Stochastische Online-Conformal-Prediction mit Semi-Bandit-Feedback

带半老虎机反馈的随机在线共形预测

2405.13268v3

869

05-27

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

R2R: Effizientes Navigieren divergierender Reasoning-Pfade mit Token-Routing zwischen kleinem und großem Modell

R2R:通过大小模型间的词元路由高效应对分歧的推理路径

2505.21600v1

870

05-27

Policy Induction: Predicting Startup Success via Explainable Memory-Augmented In-Context Learning

Policy Induction: Vorhersage des Startup-Erfolgs durch erklärbares Memory-Augmented In-Context Learning

策略归纳:通过可解释的记忆增强上下文学习预测初创企业的成功

2505.21427v1

871

05-27

Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks

Individuelles Verhalten in agentenbasierten Modellen mit Graph Diffusionsnetzwerken lernen

利用图扩散网络学习基于智能体的模型中的个体行为

2505.21426v1

872

05-27

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

GenPO: Generative Diffusionsmodelle treffen auf On-Policy-Verstärkungs-Lernen

GenPO:生成式扩散模型遇上同策略强化学习

2505.18763v2

873

05-27

A Lightweight Method to Disrupt Memorized Sequences in LLM

Eine leichte Methode zum Disruptieren von gemerkten Sequenzen in LLM

LLM 中破坏记忆序列的轻量方法

2502.05159v2

874

05-27

Can Large Language Models Understand Symbolic Graphics Programs?

Können große Sprachmodelle symbolische Grafikprogramme verstehen?

大语言模型能理解符号图形程序吗?

2408.08313v4

875

05-27

Optimizing Deep Learning for Skin Cancer Classification: A Computationally Efficient CNN with Minimal Accuracy Trade-Off

Deep Learning für die Hautkrebs-Klassifikation optimieren: Ein recheneffizientes CNN mit minimalem Genauigkeitsverlust

优化用于皮肤癌分类的深度学习:一种计算高效且精度损失极小的卷积神经网络

2505.21597v1

876

05-27

Learning optimal treatment strategies for intraoperative hypotension using deep reinforcement learning

Lernen optimaler Behandlungsstrategien für intraoperative Hypotonie mit Deep Reinforcement Learning

利用深度强化学习学习术中低血压的最优治疗策略

2505.21596v1

877

05-27

Relevance-driven Input Dropout: an Explanation-guided Regularization Technique

Relevanz-gesteuerter Input Dropout: eine Erklärungs-geführte Regularisierungstechnik

相关性驱动的输入Dropout:一种解释引导的正则化技术

2505.21595v1

878

05-27

Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges

Benchmarking Spatiotemporal Reasoning in LLMs und Reasoning Models: Fähigkeiten und Herausforderungen

LLM与推理模型的时空推理基准测试:能力与挑战

2505.11618v2

879

05-27

Conflicting Biases at the Edge of Stability: Norm versus Sharpness Regularization

Widerstreitende Biases am Rande der Stabilität: Norm- versus Schärfe-Regularisierung

稳定性边缘的相互冲突的偏差:范数正则化与锐度正则化

2505.21423v1

880

05-27

When Shift Happens - Confounding Is to Blame

Wenn es zu einem Shift kommt - Confounding ist schuld

当分布偏移发生时 - 罪魁祸首是混杂

2505.21422v1

881

05-27

A Physics-Augmented GraphGPS Framework for the Reconstruction of 3D Riemann Problems from Sparse Data

Ein physikgestütztes GraphGPS-Framework für die Rekonstruktion von 3D-Riemann-Problemen aus spärlichen Daten

用于从稀疏数据重建三维黎曼问题的物理增强GraphGPS框架

2505.21421v1

882

05-27

From Continual Learning to SGD and Back: Better Rates for Continual Linear Models

Vom Continual Learning zu SGD und zurück: Bessere Raten für kontinuierlich lernende lineare Modelle

从持续学习到SGD再回到持续学习:持续线性模型的更优速率

2504.04579v2

883

05-27

Efficiently Scaling LLM Reasoning with Certaindex

Effiziente Skalierung des LLM-Reasonings mit Certaindex

利用Certaindex高效扩展LLM推理

2412.20993v2

884

05-27

A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment

Ein Rahmen für die adversariale Analyse von Entscheidungsunterstützungssystemen vor dem Einsatz

部署前对决策支持系统进行对抗性分析的框架

2505.21414v1

885

05-27

Comparison of the Cox proportional hazards model and Random Survival Forest algorithm for predicting patient-specific survival probabilities in clinical trial data

Vergleich des Cox-Proportional-Hazards-Modells und des Random Survival Forest-Algorithmus zur Vorhersage patientenspezifischer Überlebenswahrscheinlichkeiten in klinischen Studiendaten

比较Cox按比例比例危害模型和随机生存森林算法,以预测临床试验数据中特定患者生存概率

2502.03119v2

886

05-27

MRSD: Multi-Resolution Skill Discovery for HRL Agents

MRSD: Multi-Resolution Skill Discovery für HRL-Agenten

MRSD: HRL代理机构多分辨率技能发现

2505.21410v1

887

05-27

Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks

Dual Natural Gradient Descent für skalierbare Ausbildung von physikinformierten Neuronalen Netzwerken

用于物理信息神经网络可扩展训练的对偶自然梯度下降

2505.21404v1

888

05-27

A Convergence Theory for Diffusion Language Models: An Information-Theoretic Perspective

Eine Konvergenztheorie für Diffusions-Sprachmodelle: Eine informationstheoretische Perspektive

扩散语言模型的收敛理论:信息论视角

2505.21400v1

889

05-27

Factual Self-Awareness in Language Models: Representation, Robustness, and Scaling

Factual Self-Awareness in Sprachmodellen: Repräsentation, Robustheit und Skalierung

语言模型中的事实性自我认知:表示、鲁棒性与规模扩展

2505.21399v1

890

05-27

Square$χ$PO: Differentially Private and Robust $χ^2$-Preference Optimization in Offline Direct Alignment

Square$χ$PO: Differentiell private und robuste $χ^2$-Präferenzoptimierung im Offline Direct Alignment

Square$χ$PO:离线直接对齐中差分隐私且鲁棒的$χ^2$偏好优化

2505.21395v1

891

05-27

Foundation Models on a Budget: Approximating Blocks in Large Vision Models

Foundation-Modelle mit kleinem Budget: Approximation von Blöcken in großen Vision-Modellen

低预算的基础模型:近似大型视觉模型中的模块

2410.04941v5

892

05-27

Leveraging the Power of Conversations: Optimal Key Term Selection in Conversational Contextual Bandits

Die Macht der Gespräche nutzen: Optimale Auswahl der Schlüsselbegriffe in konversatorischen Kontextbanditen

利用对话的力量:对话式上下文老虎机中的最优关键词选择

2505.21393v1

893

05-27

Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features

Finite-Sample-Analyse des linearen Temporal-Difference-Lernens mit beliebigen Features

任意特征下线性时序差分学习的有限样本分析

2505.21391v1

894

05-27

DeCAF: Decentralized Consensus-And-Factorization for Low-Rank Adaptation of Foundation Models

DeCAF: Dezentrale Konsens-und-Factorisierung für Low-Rank-Anpassung von Stiftungsmodellen

DeCAF:面向基础模型低秩适配的去中心化共识与分解

2505.21382v1

895

05-27

Securing Federated Learning against Backdoor Threats with Foundation Model Integration

Sichern von Federated Learning gegen Hintertürbedrohungen durch die Integration von Foundation-Modellen

通过集成基础模型保护联邦学习免受后门威胁

2410.17573v3

896

05-27

Linear $Q$-Learning Does Not Diverge in $L^2$: Convergence Rates to a Bounded Set

Lineares $Q$-Lernen divergiert nicht in $L^2$: Konvergenzraten zu einer beschränkten Menge

线性$Q$学习在$L^2$意义下不发散:收敛到有界集的速率

2501.19254v4

897

05-27

Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

Chain-of-Zoom: Extreme Super-Resolution über Scale Autoregression und Preference Alignment

Chain-of-Zoom:通过尺度自回归和偏好对齐实现极限超分辨率

2505.18600v2

898

05-27

Improving LLM-based Global Optimization with Search Space Partitioning

Verbesserung der globalen Optimierung auf LLM-Basis mit Search Space Partitioning

利用搜索空间划分改进基于LLM的全局优化

2505.21372v1

899

05-27

PLANETALIGN: A Comprehensive Python Library for Benchmarking Network Alignment

PLANETALIGN: Eine umfassende Python-Bibliothek für das Benchmarking von Network Alignment

PLANETALIGN:用于网络对齐基准测试的综合性Python库

2505.21366v1

900

05-27

Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders

Auf dem Weg zur Interpretierbarkeit ohne Einbußen: Getreue Zerlegung dichter Schichten mit Mixture of Decoders

迈向无需牺牲的可解释性:基于解码器混合的忠实稠密层分解

2505.21364v1

901

05-27

CRISP-NAM: Competing Risks Interpretable Survival Prediction with Neural Additive Models

CRISP-NAM: Interpretierbare Überlebensvorhersage bei konkurrierenden Risiken mit Neural Additive Models

CRISP-NAM:基于神经加性模型的竞争风险可解释生存预测

2505.21360v1

902

05-27

Learning with Selectively Labeled Data from Multiple Decision-makers

Lernen mit selektiv beschrifteten Daten von mehreren Entscheidungsträgern

学习来自多个决策者的选择性标签数据

2306.07566v4

903

05-27

Leveraging Large Language Models for Bengali Math Word Problem Solving with Chain of Thought Reasoning

Nutzung von großen Sprachmodellen für Bengalische Mathematik-Wort-Probleme bei der Lösung der Kette der Gedankenveranlagung

利用大语言模型解决孟加拉语数学字词与思维链理性的解决问题

2505.21354v1

904

05-27

Diffusion Predictive Control with Constraints

Diffusion Predictive Control mit Einschränkungen

受限制的预测控制

2412.09342v2

905

05-27

An Uncertainty-Aware ED-LSTM for Probabilistic Suffix Prediction

Ein unsicherheitsbewusstes ED-LSTM für probabilistische Suffix-Vorhersage

用于概率后缀预测的不确定性感知 ED-LSTM

2505.21339v1

906

05-27

Controlling Participation in Federated Learning with Feedback

Mit Feedback die Teilnahme am Föderierten Lernen kontrollieren

控制参加有反馈的联邦学习

2411.19242v2

907

05-27

PeerGuard: Defending Multi-Agent Systems Against Backdoor Attacks Through Mutual Reasoning

PeerGuard: Verteidigen von Multi-Agenten-Systemen gegen Hintertürangriffe durch gegenseitige Vernunft

同伴保护:捍卫多机构系统,防止通过相互理由进行后门攻击

2505.11642v2

908

05-27

Adaptive Sample Sharing for Multi Agent Linear Bandits

Adaptive Probenfreigabe für Multi Agent Linear Bandits

多剂线性强盗的适应性样本共享

2309.08710v3

909

05-27

Sign Operator for Coping with Heavy-Tailed Noise in Non-Convex Optimization: High Probability Bounds Under $(L_0, L_1)$-Smoothness

Sign-Operator für den Umgang mit schwerfälligen Geräuschen in Nicht-Konvex-Optimierung: Hohe Wahrscheinlichkeitsgrenzen unter $(L_0, L_1)$-Smoothness

非凸优化中应对重尾噪声的符号算子:$(L_0, L_1)$-光滑性下的高概率界

2502.07923v2

910

05-27

Joint Learning in the Gaussian Single Index Model

Gemeinsames Lernen im Gaussischen Einzelindexmodell

Gaussian单一指数模式联合学习

2505.21336v1

911

05-27

DHP: Discrete Hierarchical Planning for Hierarchical Reinforcement Learning Agents

DHP: Diskrete Hierarchische Planung für Hierarchische Verstärkungs-Learning Agents

DHP: 等级加强学习代理的分级分级规划

2502.01956v2

912

05-27

Structure from Collision

Struktur aus Kollision

来自碰撞的结构

2505.21335v1

913

05-27

Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach

Robustheit und Genauigkeit in der Mischung von Experten optimieren: Ein Dual-Model-Ansatz

优化专家混合中的力量和准确性:双模式办法

2502.06832v3

914

05-27

Wrapped Gaussian on the manifold of Symmetric Positive Definite Matrices

Eingewickelt Gaussian auf der Mannigfaltigkeit der Symmetrischen Positiven Definiten Matrizen

对称正定矩阵流形上的包裹高斯分布

2502.01512v3

915

05-27

Scheduling with Uncertain Holding Costs and its Application to Content Moderation

Planung mit unsicheren Holdingkosten und deren Anwendung auf Content Moderation

与不确定的控股成本及其对内容调节应用的时间安排

2505.21331v1

916

05-27

UGCE: User-Guided Incremental Counterfactual Exploration

UGCE: Nutzergeführte inkrementelle kontrafaktische Exploration

UGCE: 用户指导的递增反事实探索

2505.21330v1

917

05-27

Bencher: Simple and Reproducible Benchmarking for Black-Box Optimization

Bencher: Einfaches und reproduzierbares Benchmarking für Black-Box-Optimierung

座谈人: 简化和可复制的黑箱优化基准

2505.21321v1

918

05-27

A Cross Modal Knowledge Distillation & Data Augmentation Recipe for Improving Transcriptomics Representations through Morphological Features

Ein Cross Modal Knowledge Destillation & Data Augmentation Rezept zur Verbesserung von Transkriptionsdarstellungen durch morphologische Merkmale

一种交叉模式知识蒸馏和数据增强休息室,以通过生理特征改进转基因医学的表现形式

2505.21317v1

919

05-27

It’s complicated. The relationship of algorithmic fairness and non-discrimination regulations for high-risk systems in the EU AI Act

Es ist kompliziert. Das Verhältnis algorithmischer Fairness- und Nichtdiskriminierungsvorschriften für Hochrisikosysteme im EU-AI-Gesetz

这很复杂,在欧盟的AI法案中, 高风险系统的算法公正和不歧视规定之间的关系。

2501.12962v3

920

05-27

Item Cluster-aware Prompt Learning for Session-based Recommendation

Artikel Cluster-aware Prompt Learning für sitzungsbasierte Empfehlung

项目集群意识快速学习促进基于会议的建议

2410.04756v2

921

05-27

Overcoming Spurious Solutions in Semi-Dual Neural Optimal Transport: A Smoothing Approach for Learning the Optimal Transport Plan

Überwinden von Scheinlösungen im semi-dualen neuronalen optimalen Transport: Ein Glättungsansatz zum Lernen des optimalen Transportplans

克服半对偶神经最优传输中的虚假解:学习最优传输计划的平滑方法

2502.04583v2

922

05-27

Interlocking-free Selective Rationalization Through Genetic-based Learning

Interlocking-free Selektive Rationalisierung durch gentechnisch-basiertes Lernen

通过基于遗传的学习实现互连、无互闭和无互换的选择性合理化

2412.10312v2

923

05-27

Optimizing fMRI Data Acquisition for Decoding Natural Speech with Limited Participants

Optimierung der fMRI-Datenerfassung für die Dekodierung von Natural Speech mit begrenzten Teilnehmern

优化FMRI数据获取,以便与有限参加者进行自然演讲

2505.21304v1

924

05-27

Large Language Models Miss the Multi-Agent Mark

Große Sprachmodelle vermissen das Multi-Agent Mark

大语言模型未达到多智能体的标准

2505.21298v1

925

05-27

Towards Adapting Open-Source Large Language Models for Expert-Level Clinical Note Generation

Auf dem Weg zur Anpassung von Open Source großen Sprachmodellen für die Erstellung klinischer Notizen auf Expertenebene

努力调整用于专家级临床笔记制作的开放源大语言模型

2405.00715v6

926

05-27

LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning

LoFT: Low-Rank-Anpassung, die sich wie Full-Fine-Tuning verhält

LOFT: 行为如完全精美调整的低朗适应

2505.21289v1

927

05-27

GSAT: Graph Structure Attention Networks

GSAT: Graphstruktur-Aufmerksamkeitsnetzwerke

GSAT: 图表结构关注网络

2505.21288v1

928

05-27

Learnable Kernel Density Estimation for Graphs

Erlernbare Kerneldichteschätzung für Graphen

面向图的可学习核密度估计

2505.21285v1

929

05-27

Optimal Pricing for Data-Augmented AutoML Marketplaces

Optimale Preise für datengesteigerte AutoML-Märkte

数据增强 AutoML 市场的最优定价

2310.17843v2

930

05-27

Accelerated Parallel Tempering via Neural Transports

Beschleunigung des parallelen Temperierens über neurale Transporte

通过神经运输加速平行探险

2502.10328v2

931

05-27

Dual-Directed Algorithm Design for Efficient Pure Exploration

Dual-Directed-Algorithm-Design für effizientes Pure-Exploring

高效纯勘探的双重稀释算法设计

2310.19319v3

932

05-27

Taylor expansion-based Kolmogorov-Arnold network for blind image quality assessment

Taylor-expansionsbasiertes Kolmogorov-Arnold-Netzwerk für blinde Bildqualitätsbewertung

以泰勒为扩展基地的Kolmogorov-Arnold盲人图像质量评估网络

2505.21592v1

933

05-27

Minimizing False-Positive Attributions in Explanations of Non-Linear Models

Minimierung falsch-positiver Attribute in Erklärungen nicht-linearer Modelle

尽量减少非线性模型解释中的假阳性归因

2505.11210v2

934

05-27

ResKoopNet: Learning Koopman Representations for Complex Dynamics with Spectral Residuals

ResKoopNet: Koopman-Repräsentanzen für komplexe Dynamiken mit Spektralresidualen lernen

ResKoopNet:学习 Koopman 代表器, 用于使用光谱残余物的复杂动态

2501.00701v4

935

05-27

Mitigating Molecular Aggregation in Drug Discovery with Predictive Insights from Explainable AI

Mildernde molekulare Aggregation in der Drogenentdeckung mit vorausschauenden Erkenntnissen von erklärbarer KI

利用可解释的人工智能的预测洞察力减轻药物发现中的分子聚合

2306.02206v2

936

05-27

BindEnergyCraft: Casting Protein Structure Predictors as Energy-Based Models for Binder Design

BindEnergyCraft: Proteinstrukturvorhersagen als energiebasierte Modelle für Binder-Design

Bind EnergyCraft: 将蛋白结构预测器作为Binder设计以能源为基础的模型

2505.21241v1

937

05-27

Breaking the Performance Ceiling in Complex Reinforcement Learning requires Inference Strategies

Breaking the Performance Ceiling in komplexen Verstärkungs-Lernen erfordert Inferenz-Strategien

综合加强学习中业绩上限的打破需要推断战略

2505.21236v1

938

05-27

STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization

STRAP: Spatio-Temporal Pattern Retrieval für Out-of-Distribution-Verallgemeinerung

STRAP: 普遍分发的Spadio-Temporal 样板回收

2505.19547v2

939

05-27

FRIREN: Beyond Trajectories – A Spectral Lens on Time

FRIREN: Jenseits von Trajektorien – Eine Spektrallinse auf Zeit

FRIREN: 超越轨迹:时间的谱视角

2505.17370v2

940

05-27

Is Hyperbolic Space All You Need for Medical Anomaly Detection?

Ist hyperbolischer Raum alles, was Sie für medizinische Anomalie-Erkennung benötigen?

超双曲空间是否所有你需要的医疗异常检测?

2505.21228v1

941

05-27

Why Do More Experts Fail? A Theoretical Analysis of Model Merging

Warum scheitern weitere Experten? Eine theoretische Analyse der Modellzusammenführung

为何有更多的专家失败?对模式合并的理论分析

2505.21226v1

942

05-27

The dark side of the forces: assessing non-conservative force models for atomistic machine learning

Die dunkle Seite der Kräfte: Bewertung nicht konservativer Kraftmodelle für atomistisches maschinelles Lernen

部队的黑暗面:评估非保守力量模型,以进行原子学机器学习

2412.11569v3

943

05-27

Wavelet Flow For Extragalactic Foreground Simulations

Wavelet Flow für extragalaktische Foreground Simulationen

用于外星际前景模拟的波浪流

2505.21220v1

944

05-27

Addressing Data Quality Decompensation in Federated Learning via Dynamic Client Selection

Adressierung von Datenqualitätsentkompensation im Federated Learning über Dynamic Client Selection

通过动态客户选择解决联邦学习中的数据质量补偿问题

2505.21219v1

945

05-27

Transfer learning for multifidelity simulation-based inference in cosmology

Transfer-Lernen für Multifidelity-Simulationsbasierte Schlussfolgerungen in der Kosmologie

宇宙学中基于多保真度模拟推断的迁移学习

2505.21215v1

946

05-27

Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning

Auf dem Weg zur Enthüllung der Wirksamkeit von Klein-Scale-Fine-Tuning im R1-Stil Verstärktes Lernen

提高R1型强化学习中小规模微调的效力

2505.17988v2

947

05-27

Input Convex Kolmogorov Arnold Networks

Eingabekonvexe Kolmogorov-Arnold-Netzwerke

输入凸 Kolmogorov-Arnold 网络

2505.21208v1

948

05-27

Towards Identifiability of Interventional Stochastic Differential Equations

Zur Identifizierbarkeit interventioneller stochastischer Differentialgleichungen

实现干预性斯托卡差异等同的可识别性

2505.15987v2

949

05-27

Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs

Universal Reasoner: Ein einfacher, komponierbarer Plug-and-Play-Reasoner für gefrorene LLMs

Universal Reasoner: 面向冻结LLM的单一、可组合、即插即用推理器

2505.19075v2

950

05-27

Developing hybrid mechanistic and data-driven personalized prediction models for platelet dynamics

Entwicklung hybrider mechanistischer und datengesteuerter personalisierter Vorhersagemodelle für Thrombozytendynamik

开发混合机械和数据驱动的小板板动力学混合机械和个人化预测模型

2505.21204v1

951

05-27

Implicit Dynamical Flow Fusion (IDFF) for Generative Modeling

Implizite Dynamische Flussfusion (IDFF) für generative Modellierung

用于产生建模的隐含动态流动融合(IDFF)

2409.14599v4

952

05-27

Crop recommendation with machine learning: leveraging environmental and economic factors for optimal crop selection

Kulturempfehlung mit maschinellem Lernen: Nutzung ökologischer und wirtschaftlicher Faktoren für eine optimale Ernteauswahl

采用机械学习的作物建议:利用环境和经济因素优化作物选择

2505.21201v1

953

05-27

Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning

Pioniere 4-Bit FP-Quantisierung für Diffusionsmodelle: Mixup-Sign-Quantisierung und Timestep-Aware Feintuning

开创面向扩散模型的4位浮点量化:混合符号量化与时间步感知微调

2505.21591v1

954

05-27

Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM’s Instruction-Following Capabilities

Enthüllen von instruction-spezifischen Neuronen & Experten: Ein analytischer Rahmen für die instruction-following Fähigkeiten von LLM

揭示指令特定的神经元与专家:LLM指令遵循能力的分析框架

2505.21191v1

955

05-27

Exploring the Latent Capacity of LLMs for One-Step Text Generation

Erforschung der Latent-Kapazität von LLMs für die einstufige Textgenerierung

探索单步制文本生成LLMs的原始能力

2505.21189v1

956

05-27

Equivariant Representation Learning for Symmetry-Aware Inference with Guarantees

Äquivariantes Repräsentationslernen für symmetriebewusste Inferenz mit Garantien

具有理论保证的对称感知推断的等变表示学习

2505.19809v2

957

05-27

PoisonSwarm: Universal Harmful Information Synthesis via Model Crowdsourcing

PoisonSwarm: Universelle Synthese schädlicher Informationen mittels Modell-Crowdsourcing

毒物群:通过示范众包普及有害信息合成

2505.21184v1

958

05-27

Learning What to Do and What Not To Do: Offline Imitation from Expert and Undesirable Demonstrations

Lernen, was zu tun ist und was nicht: Offline-Imitation von Experten und unerwünschten Demonstrationen

学会做什么做什么和不做什么:专家的脱线模仿和不受欢迎的示威

2505.21182v1

959

05-27

Latent label distribution grid representation for modeling uncertainty

Latent Label Distribution Grid Darstellung für Modellierung Unsicherheit

用于模拟不确定性模型的延迟标签分配网格代表

2505.21180v1

960

05-27

Improved Online Confidence Bounds for Multinomial Logistic Bandits

Verbesserte Online-Konfidenzgrenzen für multinomiale Logistische Banditen

提高多军后勤大盗的在线信任度

2502.10020v4

961

05-27

Topological Deep Learning for Speech Data

Topologisches Deep Learning für Sprachdaten

为语音数据进行地形深层学习

2505.21173v1

962

05-27

Parameter Efficient Continual Learning with Dynamic Low-Rank Adaptation

Parameter Effizientes kontinuierliches Lernen mit dynamischer Low-Rank-Anpassung

具有动态低Rank适应性的持续学习

2505.11998v2

963

05-27

STEB: In Search of the Best Evaluation Approach for Synthetic Time Series

STEB: Auf der Suche nach dem besten Bewertungsansatz für die Synthetische Zeitreihe

STEB:寻求合成时间系列的最佳评价方法

2505.21160v1

964

05-27

Model as Loss: A Self-Consistent Training Paradigm

Modell als Verlust: Ein selbstkonsistentes Trainingsparadigma

损失模型:自我协调培训模型

2505.21156v1

965

05-27

FlexiReg: Flexible Urban Region Representation Learning

FlexiReg: Flexibles Stadtraum-Repräsentanz-Lernen

FlexiReg: 灵活的城市区域表示学习

2503.09128v2

966

05-27

Predicate Invention for Bilevel Planning

Prädikat Erfindung für Bilevel-Planung

双级规划预发明

2203.09634v3

967

05-27

Semi-Supervised Conformal Prediction With Unlabeled Nonconformity Score

Halbüberwachte konforme Vorhersage mit nicht markiertem Nonkonformity Score

带有未贴标签的不合规分数的半超半常规预测

2505.21147v1

968

05-27

A Distributional Treatment of Real2Sim2Real for Object-Centric Agent Adaptation in Vision-Driven Deformable Linear Object Manipulation

Eine distributive Behandlung von Real2Sim2Real für die Anpassung an Objekt-Zentrische Agenten in visionsgetriebener, deformierbarer linearer Objektmanipulation

在视觉-驱动式可变线性物体操纵中用于物体中心剂适应的Real2Sim2Real的分布式处理法

2502.18615v2

969

05-27

Hallucinations are inevitable but can be made statistically negligible. The “innate” inevitability of hallucinations cannot explain practical LLM issues

Halluzinationen sind unvermeidlich, können aber statistisch vernachlässigbar gemacht werden. Die “angeborene” Unvermeidbarkeit von Halluzinationen kann praktische LLM-Probleme nicht erklären

幻觉不可避免,但可以使其在统计上可忽略不计。幻觉的“内在”不可避免性无法解释实际的LLM问题

2502.12187v2

970

05-27

A Predicting Phishing Websites Using Support Vector Machine and MultiClass Classification Based on Association Rule Techniques

Eine Vorhersage Phishing-Websites mit Unterstützung Vektor-Maschine und Multi-Klasse Klassifizierung basierend auf Assoziation Regel Techniken

基于协会规则技术的利用辅助病媒机和多类分类的预测钓鱼网站

2505.21141v1

971

05-27

HeteroBA: A Structure-Manipulating Backdoor Attack on Heterogeneous Graphs

HeteroBA: Ein strukturmanipulierender Backdoor-Angriff auf Heterogene Graphen

异型BA:结构调节式后门对异种图的后门攻击

2505.21140v1

972

05-27

Identifying Heart Attack Risk in Vulnerable Population: A Machine Learning Approach

Identifikation von Herzinfarktrisiko in gefährdeter Bevölkerung: Ein Ansatz zum maschinellen Lernen

查明弱势人口中的心脏攻击风险:机械学习方法

2505.21139v1

973

05-27

Learning Single Index Models with Diffusion Priors

Einzelindexmodelle mit Diffusion Priors lernen

具有传播前版本的学习单一指数模式

2505.21135v1

974

05-27

Robust and Computation-Aware Gaussian Processes

Robuste und rechnergestützte Gaußsche Prozesse

鲁棒且计算感知的高斯过程

2505.21133v1

975

05-27

Backpropagation-free Spiking Neural Networks with the Forward-Forward Algorithm

Rückpropagierungsfreie Spiking-Neural-Netzwerke mit dem vorwärts-vorwärts-Algorithmus

基于前向-前向算法的无反向传播脉冲神经网络

2502.20411v2

976

05-27

MetaGS: A Meta-Learned Gaussian-Phong Model for Out-of-Distribution 3D Scene Relighting

MetaGS: Ein meta-erlerntes Gaussian-Phong-Modell für 3D-Szenen-Erhellung im Out-of-Distribution-Bereich

MetaGS: 用于分布外三维场景重光照的元学习高斯-Phong模型

2405.20791v2

977

05-27

Universal Value-Function Uncertainties

Universelle Wert-Funktions-Unsicherheiten

通用价值-功能不确定性

2505.21119v1

978

05-27

A Lightweight Multi-Expert Generative Language Model System for Engineering Information and Knowledge Extraction

Ein leichtes Multi-Expert Generatives Sprachmodellsystem für Engineering Information and Knowledge Extraction

工程信息和知识采掘轻量多专家生成语言示范系统

2505.21109v1

979

05-27

Conditional Diffusion Models with Classifier-Free Gibbs-like Guidance

Bedingte Diffusionsmodelle mit klassifikatorfreier Gibbs-ähnlicher Anleitung

具有无分类器类吉布斯引导的条件扩散模型

2505.21101v1

980

05-27

Random Walk Diffusion for Efficient Large-Scale Graph Generation

Random Walk Diffusion für effiziente großformatige Graphengeneration

高效大型图表生成的随机漫步扩散

2408.04461v2

981

05-27

Do you see what I see? An Ambiguous Optical Illusion Dataset exposing limitations of Explainable AI

Sehen Sie, was ich sehe? Ein Ambiguous Optical Illusion Dataset, das Beschränkungen der erklärbaren KI aufdeckt

你看到我所看到的吗?一个模糊的光学幻影数据集暴露了可解释的人工智能的局限性。

2505.21589v1

982

05-27

Sequential Function-Space Variational Inference via Gaussian Mixture Approximation

Sequentielle Funktions-Raum Variationelle Schlussfolgerung über Gaußsche Mischungsannäherung

基于高斯混合近似的序贯函数空间变分推断

2503.07114v2

983

05-27

Thinker: Learning to Think Fast and Slow

Denker: Schnell und langsam denken lernen

思考者:学会快速和缓慢思考

2505.21097v1

984

05-27

Improved Impossible Tuning and Lipschitz-Adaptive Universal Online Learning with Gradient Variations

Verbessertes Unmögliches Tuning und Lipschitz-Adaptives Universal Online-Lernen mit gradienten Variationen

改进不可能的图金和利普施维茨-适应性通用在线学习,有渐进变异

2505.21095v1

985

05-27

Recurrent Memory for Online Interdomain Gaussian Processes

Recurrent Speicher für Online-Interdomain Gaussian Prozesse

Gaussian 在线内部进程经常性内存

2502.08736v3

986

05-27

Out of the Shadows: Exploring a Latent Space for Neural Network Verification

Out of the Shadows: Erforschen eines latenten Raumes für neurale Netzwerkverifizierung

暗影外:探索神经网络的原始空间核查

2505.17854v2

987

05-27

Efficient Large Language Model Inference with Neural Block Linearization

Effiziente großsprachige Modellinferenz mit neuraler Blocklinearisierung

高效大语言模型与神经区块线性线性结合的推断

2505.21077v1

988

05-27

Red-Teaming Text-to-Image Systems by Rule-based Preference Modeling

Red-Teaming Text-to-Image-Systeme durch regelbasiertes Preference-Modelling

通过基于规则的首选模式建立红色团队式文本到图像系统

2505.21074v1

989

05-27

A domain adaptation neural network for digital twin-supported fault diagnosis

Ein neuronales Netzwerk für die Domänenanpassung für die digitale Doppel-unterstützte Fehlerdiagnose

数字双支持缺陷诊断领域适应性神经神经网络

2505.21046v1

990

05-27

Scalable and adaptive prediction bands with kernel sum-of-squares

Skalierbare und adaptive Vorhersagebänder mit Kernel-Summe von Quadraten

基于核平方和的可扩展自适应预测带

2505.21039v1

991

05-27

Unraveling Indirect In-Context Learning Using Influence Functions

Indirektes In-Context-Lernen mit Einflussfunktionen entschlüsseln

利用影响功能进行分散的间接间接内文学习

2501.01473v2

992

05-27

CellCLAT: Preserving Topology and Trimming Redundancy in Self-Supervised Cellular Contrastive Learning

CellCLAT: Topologie und Trimming Redundanz im selbstüberwachten zellulären Kontrastiven Lernen erhalten

CellCLAT: 在自监督胞腔对比学习中保持拓扑并修剪冗余

2505.21587v1

993

05-27

Directed Semi-Simplicial Learning with Applications to Brain Activity Decoding

Direktes Semi-Simplizielles Lernen mit Anwendungen zur Entschlüsselung der Gehirnaktivität

定向半简化学习,应用脑活动解码

2505.17939v2

994

05-27

LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms

LLaMEA-BO: Ein evolutionärer Algorithmus mit großen Sprachmodellen zur automatischen Generierung Bayesscher Optimierungsalgorithmen

LLaMEA-BO: 用于自动生成贝叶斯优化算法的大语言模型进化算法

2505.21034v1

995

05-27

Optimizing Case-Based Reasoning System for Functional Test Script Generation with Large Language Models

Optimierung des Case-Based-Reasoning-Systems für die Generierung funktionaler Testskripte mit großen Sprachmodellen

为具有大语言模型的功能测试脚本生成优化基于个案的理由说明系统

2503.20576v3

996

05-27

Generalizable and Robust Spectral Method for Multi-view Representation Learning

Verallgemeinerbare und robuste Spektralmethode für Multi-View Representative Learning

多视角代表制学习通用和强力光谱方法

2411.02138v3

997

05-27

FeatInv: Spatially resolved mapping from feature space to input space using conditional diffusion models

FeatInv: Räumlich aufgelöstes Mapping vom Feature Space zum Input Space mit bedingten Diffusionsmodellen

FeatInv: 使用条件扩散模型实现从特征空间到输入空间的空间分辨映射

2505.21032v1

998

05-27

TabAttackBench: A Benchmark for Adversarial Attacks on Tabular Data

TabAttackBench: Ein Benchmark für feindliche Angriffe auf Tabellendaten

TabAttackBench: 表格数据对抗性攻击基准

2505.21027v1

999

05-27

PaSa: An LLM Agent for Comprehensive Academic Paper Search

PaSa: Ein LLM-Agent für umfassende wissenschaftliche Papiersuche

PaSa: 用于全面学术论文搜索的LLM智能体

2501.10120v2

1000

05-27

Multi-Mode Process Control Using Multi-Task Inverse Reinforcement Learning

Multi-Mode-Prozesssteuerung mit Multi-Task Inverse Verstärkungslernen

利用多任务反向强化学习进行多模式程序控制

2505.21026v1

1001

05-27

Text-Queried Audio Source Separation via Hierarchical Modeling

Textbefragte Audioquelle Trennung über Hierarchische Modellierung

通过等级制建模模式对文本查询的音频源分离

2505.21025v1

1002

05-27

Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers

Pause Tokens erhöhen streng die Expressivität der konstant-tiefen Transformer

暂停词元严格提高常数深度 Transformer 的表达能力

2505.21024v1

1003

05-27

NeuralOM: Neural Ocean Model for Subseasonal-to-Seasonal Simulation

NeuralOM: Neurales Ozeanmodell für die Simulation von Subsaisonal-zu-Seasonal

神经力OM:次季节到季节模拟神经海洋模型

2505.21020v1

1004

05-27

Cardiac Digital Twins at Scale from MRI: Open Tools and Representative Models from ~55000 UK Biobank Participants

Cardiac Digital Twins auf Scale von MRI: Offene Werkzeuge und repräsentative Modelle von ~55000 britischen Biobank-Teilnehmern

来自MRI的大规模心脏病数字双对:来自~55000英国生物库参与者的开放工具和代表模型

2505.21019v1

1005

05-27

Federated Instrumental Variable Analysis via Federated Generalized Method of Moments

Föderierte Instrumentalvariablenanalyse mittels föderierter verallgemeinerter Momentenmethode

基于联邦广义矩方法的联邦工具变量分析

2505.21012v1

1006

05-27

Unified Alignment Protocol: Making Sense of the Unlabeled Data in New Domains

Unified Alignment Protocol: Die ungelabelten Daten in neuen Domänen sinnvoll nutzen

统一对齐协议: 在新域域中感知无标签数据

2505.21010v1

1007

05-27

Transformers in Protein: A Survey

Transformer in Protein: Eine Umfrage

蛋白质变换器:调查

2505.20098v2

1008

05-27

Fairness in Federated Learning: Fairness for Whom?

Fairness im Federated Learning: Fairness für wen?

联邦学习中的公平性:谁的公平性?

2505.21584v1

1009

05-27

Efficient and Unbiased Sampling from Boltzmann Distributions via Variance-Tuned Diffusion Models

Effiziente und unvoreingenommene Probenahme von Boltzmann Distributionen über Variance-Tuned Diffusion Modelle

Boltzmann分销公司通过差异传播模型进行高效和无偏见的抽样

2505.21005v1

1010

05-27

BIPNN: Learning to Solve Binary Integer Programming via Hypergraph Neural Networks

BIPNN: Lernen, Binäre Integer-Programmierung über Hypergraph Neuronale Netzwerke zu lösen

BIPNN: 学习通过超光速神经网络解决二元整数编程

2505.20997v1

1011

05-27

Efficient Identity and Position Graph Embedding via Spectral-Based Random Feature Aggregation

Effiziente Einbettung von Identitäts- und Positionsdiagrammen über spektralbasierte Random Feature Aggregation

通过光谱-基于随机地物聚合的高效身份和位置图嵌入

2505.20992v1

1012

05-27

Identifying Super Spreaders in Multilayer Networks

Identifizieren von Superspreizern in Multilayer-Netzwerken

识别多层网络中的超级传播器

2505.20980v1

1013

05-27

Deep k-grouping: An Unsupervised Learning Framework for Combinatorial Optimization on Graphs and Hypergraphs

Deep k-grouping: Ein unüberwachter Lernrahmen für die kombinatorische Optimierung von Graphen und Hypergraphen

深 k 组: 图形和高光谱组合优化的无人监督的学习框架

2505.20972v1

1014

05-27

Semantic Communication meets System 2 ML: How Abstraction, Compositionality and Emergent Languages Shape Intelligence

Semantische Kommunikation trifft System 2 ML: Wie Abstraktion, Kompositionalität und Emergente Sprachen Formintelligenz

语义通信满足系统2 ML:如何抽象、组成和新兴语言形式情报

2505.20964v1

1015

05-27

Resampling Filter Design for Multirate Neural Audio Effect Processing

Resampling Filter Design für Multirate Neural Audio Effect Processing

多立体神经音频效果处理的抽取过滤器设计

2501.18470v2

1016

05-27

Efficient and Microphone-Fault-Tolerant 3D Sound Source Localization

Effiziente und Mikrofon-Fehler-Tolerante 3D-Soundquelle Lokalisierung

高效且容忍麦克风故障的三维声源定位

2505.20961v1

1017

05-27

Personalized Clustering via Targeted Representation Learning

Personalisiertes Clustering über gezieltes Repräsentationslernen

通过有针对性的代表学习进行个性化集群组合

2412.13690v3

1018

05-27

Unveiling Impact of Frequency Components on Membership Inference Attacks for Diffusion Models

Auswirkungen von Frequenzkomponenten auf Mitgliedschafts-Inferenzangriffe für Diffusionsmodelle enthüllen

频率组成部分对传播模型的传播成员推断攻击的不懈影响

2505.20955v1

1019

05-27

More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives

Mehr ist nicht immer besser? Viel-Shot-In-Context-Lernen mit differenzierten und neugewichtigen Zielen verbessern

越多并不总是越好?以差异化和重加权目标增强多样本上下文学习

2501.04070v3

1020

05-27

Double Descent Meets Out-of-Distribution Detection: Theoretical Insights and Empirical Analysis on the role of model complexity

Doppelter Abstieg trifft auf Out-of-Distribution Detection: Theoretische Erkenntnisse und empirische Analyse zur Rolle der Modellkomplexität

双下降遇上分布外检测:关于模型复杂度作用的理论见解与实证分析

2411.02184v2

1021

05-27

Recovering Fairness Directly from Modularity: a New Way for Fair Community Partitioning

Fairness direkt aus Modularität zu gewinnen: ein neuer Weg für faire Gemeinschaftspartitionierung

直接从模式中恢复公平:公平社区分割的新途径

2505.22684v1

1022

05-27

Scattering Networks on Noncommutative Finite Groups

Streunetze für nichtkommutative Finite-Gruppen

关于非调解性有限集团的散射网络

2505.20950v1

1023

05-27

shapr: Explaining Machine Learning Models with Conditional Shapley Values in R and Python

shapr: Erklären von Machine Learning-Modellen mit bedingten Shapley-Werten in R und Python

shapr: 在 R 和 Python 中用条件 Shapley 值解释机器学习模型

2504.01842v2

1024

05-27

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

Zwei Experten sind alles, was Sie zum Lenken Denken brauchen: Kognitive Bemühungen in MoE-Reasoning-Modellen ohne zusätzliches Training verstärken

引导思考只需两个专家:无需额外训练即可增强MoE推理模型中的认知努力

2505.14681v2

1025

05-27

Efficient Spectral Control of Partially Observed Linear Dynamical Systems

Effiziente Spektralsteuerung teilweise beobachteter linearer dynamischer Systeme

局部观察线性动态系统的有效光谱控制

2505.20943v1

1026

05-27

Towards Training One-Step Diffusion Models Without Distillation

Auf dem Weg zum Training von Ein-Schritt-Diffusionsmodellen ohne Destillation

培训不蒸馏的单级传播模型

2502.08005v3

1027

05-27

Revisiting Sparsity Constraint Under High-Rank Property in Partial Multi-Label Learning

Neubetrachtung der Sparsity-Beschränkung unter der Hochrang-Eigenschaft im partiellen Multi-Label-Lernen

重新审视部分多标签学习中高秩性质下的稀疏性约束

2505.20938v1

1028

05-27

EPIC: Efficient Position-Independent Caching for Serving Large Language Models

EPIC: Effizientes positionsunabhängiges Caching für das Servieren großer Sprachmodelle

EPIC: 高效的、独立定位的为大语言模式服务的工作

2410.15332v3

1029

05-27

Linear Bandits with Non-i.i.d. Noise

Lineare Banditen mit Non-i.i.d. Lärm

带有非i.i.d. 噪音的线形强盗

2505.20017v2

1030

05-27

NatADiff: Adversarial Boundary Guidance for Natural Adversarial Diffusion

NatADiff: Adversariale Grenzführung für natürliche Adversariale Diffusion

NatADiff: 面向自然对抗扩散的对抗边界引导

2505.20934v1

1031

05-27

MLMC-based Resource Adequacy Assessment with Active Learning Trained Surrogate Models

MLMC-basierte Ressourcenadäquatitätsbewertung mit aktiven Learning-Trained-Surrogate-Modellen

基于MLMC并采用主动学习训练代理模型的资源充足性评估

2505.20930v1

1032

05-27

Label Leakage in Federated Inertial-based Human Activity Recognition

Label-Leakage in Föderated Inertial-based Human Activity Recognition

联邦式基于惯性传感器的人体活动识别中的标签泄漏

2505.20924v1

1033

05-27

Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective

Multi-Agenten-Weltmodellierung aus einer diffusionsinspirierten Perspektive Revue passieren

从传播启发的视角重新审视多股权世界建模

2505.20922v1

1034

05-27

Humble AI in the real-world: the case of algorithmic hiring

Humble KI in der realen Welt: der Fall der algorithmischen Einstellung

现实世界中的黄土人工智能:算法雇用案例

2505.20918v1

1035

05-27

A Kernelised Stein Discrepancy for Assessing the Fit of Inhomogeneous Random Graph Models

Eine kernelisierte Stein-Diskrepanz für die Beurteilung der Anpassungsgüte inhomogener Zufallsgraphenmodelle

用于评估非齐次随机图模型拟合度的核化 Stein 差异

2505.21580v1

1036

05-27

Exploring the Boundary of Diffusion-based Methods for Solving Constrained Optimization

Erforschung der Grenzen von diffusionsbasierten Methoden zur Lösung eingeschränkter Optimierung

探索以传播为基础的解决受限制的优化的解决方法的界限

2502.10330v3

1037

05-27

A data augmentation strategy for deep neural networks with application to epidemic modelling

Eine Datenvergrößerungsstrategie für tiefe neuronale Netzwerke mit Anwendung in der Epidemiemodellierung

用于流行病建模的深层神经网络数据增强战略

2502.21033v2

1038

05-27

“Oh LLM, I’m Asking Thee, Please Give Me a Decision Tree”: Zero-Shot Decision Tree Induction and Embedding with Large Language Models

“Oh LLM, ich bitte dich, gib mir bitte einen Entscheidungsbaum”: Zero-Shot-Entscheidungsbauminduktion und Einbettung mit großen Sprachmodellen

“哦,LLM,我请求你,请给我一棵决策树”:基于大语言模型的零样本决策树归纳与嵌入

2409.18594v2

1039

05-27

Music Foundation Model as Generic Booster for Music Downstream Tasks

Music Foundation Modell als Generic Booster für Downstream-Aufgaben

音乐基金会模式,作为音乐下流任务通用推进器

2411.01135v3

1040

05-27

Simple Relative Deviation Bounds for Covariance and Gram Matrices

Einfache relative Abweichungen für Kovarianz und Gram Matrices

协方差矩阵和格拉姆矩阵的简单相对偏差界

2410.05754v3

1041

05-27

Enhancing Performance of Explainable AI Models with Constrained Concept Refinement

Leistungssteigerung erklärbarer KI-Modelle mit eingeschränkter Konzeptverfeinerung

增强可解释的AI 概念改进模型的绩效

2502.06775v2

1042

05-27

Achieving binary weight and activation for LLMs using Post-Training Quantization

Erreichen des binären Gewichts und Aktivierung für LLMs mit Post-Training Quantization

利用训练后量化实现LLM的二值权重和激活

2504.05352v2

1043

05-27

Frequency-Aware Masked Autoencoders for Human Activity Recognition using Accelerometers

Frequency-Aware Maskierte Autoencoder für die Erkennung menschlicher Aktivität mit Beschleunigungsmessern

基于加速度计的人体活动识别的频率感知掩码自编码器

2502.17477v2

1044

05-27

How Do Transformers Learn Variable Binding in Symbolic Programs?

Wie lernen Transformer variable Bindungen in Symbolischen Programmen?

变换者如何在符号程序中学习变数绑定 ?

2505.20896v1

1045

05-27

DeepConvContext: A Multi-Scale Approach to Timeseries Classification in Human Activity Recognition

DeepConvContext: Ein mehrstufiger Ansatz zur Zeitreihenklassifizierung in der Anerkennung menschlicher Aktivität

DeepConvContext: 人体活动识别中时间序列分类的多尺度方法

2505.20894v1

1046

05-27

One-Time Soft Alignment Enables Resilient Learning without Weight Transport

One-Time Soft Alignment ermöglicht resilientes Lernen ohne Gewicht Transport

一次性软对齐实现无需权重传输的鲁棒学习

2505.20892v1

1047

05-27

ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention

ComplexFormer: Disruptive Verbesserung der Inferenzfähigkeit von Transformern durch kopfspezifische komplexe Vektor-Aufmerksamkeit

ComplexFormer: 通过头部特定的复向量注意力颠覆性地提升 Transformer 推理能力

2505.10222v2

1048

05-27

Power-Law Decay Loss for Large Language Model Finetuning: Focusing on Information Sparsity to Enhance Generation Quality

Power-Law-Decay-Verlust für das Finetuning großer Sprachmodelle: Fokussierung auf Informationssparsität zur Verbesserung der Generierungsqualität

用于大语言模型微调的幂律衰减损失:关注信息稀疏性以提升生成质量

2505.16900v3

1049

05-27

Towards Analyzing and Understanding the Limitations of VAPO: A Theoretical Perspective

Auf dem Weg zur Analyse und dem Verständnis der Grenzen von VAPO: Eine theoretische Perspektive

分析和理解VAPO的局限性:理论视角

2505.17997v2

1050

05-27

Fedivertex: a Graph Dataset based on Decentralized Social Networks for Trustworthy Machine Learning

Fedivertex: ein Graph Dataset auf Basis dezentralisierter sozialer Netzwerke für vertrauenswürdiges maschinelles Lernen

Fedivertex:基于分散社会网络的图表数据集,用于可信赖的机器学习

2505.20882v1

1051

05-27

Generalizable Heuristic Generation Through Large Language Models with Meta-Optimization

Generalisierbare Heuristische Generation durch große Sprachmodelle mit Meta-Optimierung

通过带元优化的大语言模型实现可泛化的启发式生成

2505.20881v1

1052

05-27

Conditional Distribution Compression via the Kernel Conditional Mean Embedding

Conditional Distribution Compression über den Kernel Conditional Mean Embedding

通过内核有条件平均嵌入式压缩有条件分发

2504.10139v2

1053

05-27

Machine Learning - Driven Materials Discovery: Unlocking Next-Generation Functional Materials – A minireview

Machine Learning - Driven Materials Discovery: Unlocking Next-Generation Functional Materials – Eine Minireview

机器学习 – – 驱动材料发现:解锁下一轮启动功能材料 – – 小型审查

2503.18975v2

1054

05-27

In Context Learning with Vision Transformers: Case Study

Im Kontext Lernen mit Vision Transformers: Fallstudie

与愿景变异者进行背景学习:案例研究

2505.20872v1

1055

05-27

RL-SPH: Learning to Achieve Feasible Solutions for Integer Linear Programs

RL-SPH: Lernen, um durchführbare Lösungen für Integer-Lineare-Programme zu erreichen

RL-SPH:学习为整数线性方案找到可行的解决办法

2411.19517v5

1056

05-27

Leveraging Diffusion Models for Parameterized Quantum Circuit Generation

Nutzung von Diffusionsmodellen für die parameterisierte Quantum Circuit Generation

利用可计量量子电路生成的传播模型

2505.20863v1

1057

05-27

Model Agnostic Differentially Private Causal Inference

Modellagnostische differentiell private kausale Inferenz

模型无关的差分隐私因果推断

2505.19589v2

1058

05-27

UOD: Unseen Object Detection in 3D Point Cloud

UOD: Unsichtbare Objekterkennung in 3D-Punkt-Cloud

UOD: 3D点云中未见物体探测

2401.03846v2

1059

05-27

Aggregation Buffer: Revisiting DropEdge with a New Parameter Block

Aggregation Buffer: DropEdge mit einem neuen Parameterblock erneut aufrufen

聚合缓冲:用新参数块重新检查下坡面

2505.20840v1

1060

05-27

Tuning LLM Judge Design Decisions for 1/1000 of the Cost

Tuning LLM Richter Design Entscheidungen für 1/1000 der Kosten

以 1/1000 的成本调优 LLM 评判器的设计决策

2501.17178v4

1061

05-27

HAD: Hybrid Architecture Distillation Outperforms Teacher in Genomic Sequence Modeling

HAD: Hybride Architektur Destillation übertrifft Lehrer in genomischer Sequenzmodellierung

HAD:混合架构蒸馏在基因组序列建模中超越教师模型

2505.20836v1

1062

05-27

Beyond Semantics: The Unreasonable Effectiveness of Reasonless Intermediate Tokens

Jenseits von Semantik: Die unvernünftige Wirksamkeit begründungsloser Zwischen-Tokens

超越语义:无推理中间词元的不合理有效性

2505.13775v2

1063

05-27

Concentration Distribution Learning from Label Distributions

Konzentrationsverteilung Lernen von Etikettenverteilungen

从标签分布中学习浓度分布

2505.21576v1

1064

05-27

The Third Pillar of Causal Analysis? A Measurement Perspective on Causal Representations

Die dritte Säule der Kausalanalyse? Eine Messperspektive auf Kausaldarstellungen

因果分析的第三支柱?因果表征的测量视角

2505.17708v2

1065

05-27

HybridLinker: Topology-Guided Posterior Sampling for Enhanced Diversity and Validity in 3D Molecular Linker Generation

HybridLinker: Topologie-geführte hintere Probenahme für verbesserte Diversität und Validität in der 3D-Molekularlinker-Generation

HybridLinker:拓扑引导的后验采样,用于提升 3D 分子连接子生成的多样性和有效性

2502.17349v3

1066

05-27

Do We Need All the Synthetic Data? Towards Targeted Synthetic Image Augmentation via Diffusion Models

Brauchen wir alle synthetischen Daten? Auf dem Weg zu einer gezielten Synthetischen Bildvergrößerung über Diffusionsmodelle

我们需要所有合成数据吗?通过扩散模型实现有针对性的合成图像增强

2505.21574v1

1067

05-27

Spectral-inspired Neural Operator for Data-efficient PDE Simulation in Physics-agnostic Regimes

Spektral-inspirierter Neuraloperator für dateneffiziente PDE-Simulation in physik-agnostischen Regimes

物理 – – 不可知系统数据高效PDE模拟光导神经操作器

2505.21573v1

1068

05-27

Convergence of Clipped-SGD for Convex $(L_0,L_1)$-Smooth Optimization with Heavy-Tailed Noise

Konvergenz von Clipped-SGD für konvexe $(L_0,L_1)$-glatte Optimierung mit Heavy-Tailed-Rauschen

重尾噪声下凸 $(L_0,L_1)$-光滑优化的 Clipped-SGD 收敛性

2505.20817v1

1069

05-27

Mixture of Low Rank Adaptation with Partial Parameter Sharing for Time Series Forecasting

Mischung aus Low-Rank-Anpassung mit Teilparameter-Sharing für Zeitreihen-Prognose

低级别适应与时间序列预测部分参数共享混合

2505.17872v2

1070

05-27

Interpretable Credit Default Prediction with Ensemble Learning and SHAP

Interpretierbare Credit Default Vorhersage mit Ensemble Learning und SHAP

组合学习和SHAP的可解释信用默认预测

2505.20815v1

1071

05-27

Geometry Aware Operator Transformer as an Efficient and Accurate Neural Surrogate for PDEs on Arbitrary Domains

Geometry Aware Operator Transformer als effizientes und präzises Neural Surrogate für PDEs auf willkürlichen Domains

操作者变异器作为任意域中PDEs的高效和准确神经外壳

2505.18781v2

1072

05-27

Thickness-aware E(3)-Equivariant 3D Mesh Neural Networks

Dicke bewusst E(3)-Equivariante 3D-Mesh-Neurale Netze

厚度感知的 E(3)-等变 3D 网格神经网络

2505.21572v1

1073

05-27

Step-wise Adaptive Integration of Supervised Fine-tuning and Reinforcement Learning for Task-Specific LLMs

Schrittweise adaptive Integration von überwachtem Feinabstimmungs- und Verstärkungslernen für aufgabenspezifische LLMs

监督特定任务专责性微调和强化学习的渐进式适应性整合

2505.13026v2

1074

05-27

Simple yet Effective Graph Distillation via Clustering

Einfache und dennoch effektive Graphendestillation über Clustering

通过集群进行简单而有效的图形蒸馏

2505.20807v1

1075

05-27

FCOS: A Two-Stage Recoverable Model Pruning Framework for Automatic Modulation Recognition

FCOS: Ein zweistufiges, wiederherstellbares Modell-Beschneidungs-Framework für die automatische Modulationserkennung

FCOS: 自动调整识别的双层可回收模型保护框架

2505.21571v1

1076

05-27

Quantum Machine Learning in Healthcare: Evaluating QNN and QSVM Models

Quantum Machine Learning in Healthcare: Bewertung von QNN- und QSVM-Modellen

医疗保健中的量子机器学习:评估 QNN 和 QSVM 模型

2505.20804v1

1077

05-27

Sentiment Reasoning for Healthcare

Sentiment Reasoning für die Gesundheitsversorgung

保健的情感理由

2407.21054v4

1078

05-27

Leaner Transformers: More Heads, Less Depth

Leaner Transformer: Mehr Köpfe, weniger Tiefe

更精简的 Transformer:更多的头,更少的深度

2505.20802v1

1079

05-27

Multi-VQC: A Novel QML Approach for Enhancing Healthcare Classification

Multi-VQC: Ein neuartiger QML-Ansatz zur Verbesserung der Gesundheitsklassifikation

多VQC:加强保健分类的新QML方法

2505.20797v1

1080

05-27

A Graph Perspective to Probe Structural Patterns of Knowledge in Large Language Models

Eine Graphenperspektive zur Untersuchung struktureller Wissensmuster in großen Sprachmodellen

《大语言模式知识结构模式研究图示展望》

2505.19286v2

1081

05-27

Amortized Bayesian Workflow

Amortisierter Bayesischer Workflow

摊还的贝耶斯人工作流量

2409.04332v2

1082

05-27

Where You Place the Norm Matters: From Prejudiced to Neutral Initializations

Wo Sie die Norm platzieren, ist entscheidend: Von voreingenommenen zu neutralen Initialisierungen

归一化放在哪里很重要:从有偏到中性的初始化

2505.11312v3

1083

05-27

Enhancing Wearable Tap Water Audio Detection through Subclass Annotation in the HD-Epic Dataset

Verbesserung der tragbaren Wasserhahn-Audioerkennung durch Unterklasse-Annotation im HD-Epic-Datensatz

通过在HD-Epic数据集中分级注解,加强穿戴式塔普水音频探测

2505.20788v1

1084

05-27

LIB-KD: Learning Inductive Bias, Not Just Parameters A New Perspective on Knowledge Distillations

LIB-KD: Induktive Bias lernen, nicht nur Parameter Eine neue Perspektive auf Wissensdestillationen

LIB-KD:学习感性偏见,而不仅仅是知识蒸馏的新视角参数

2310.00369v3

1085

05-27

Low-Rank Adapting Models for Sparse Autoencoders

Low-Rank Anpassungsmodelle für Sparse Autoencoder

普通自动解析器低 Rank 适应模型

2501.19406v2

1086

05-27

STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation

STITCH-OPE: Trajektorienstiche mit geführter Diffusion für Off-Policy-Bewertung

STITCH-OPE:用于离策略评估的引导扩散轨迹拼接

2505.20781v1

1087

05-27

SpecExtend: A Drop-in Enhancement for Speculative Decoding of Long Sequences

SpecExtend: Ein Drop-in-Enhancement für spekulative Decoding von langen Sequenzen

SpecExtend:面向长序列推测解码的即插即用增强

2505.20776v1

1088

05-27

T-REX: Mixture-of-Rank-One-Experts with Semantic-aware Intuition for Multi-task Large Language Model Finetuning

T-REX: Mixture-of-Rank-One-Experts mit semantischer Intuition für Multi-Task Large Language Model Finetuning

T-REX:多任务大语言模型微调中具有语义认知度的多任务大语言模型微调混合型兰克单方专家

2404.08985v2

1089

05-27

Non-invasive maturity assessment of iPSC-CMs based on optical maturity characteristics using interpretable AI

Nicht-invasive Bewertung der Reife von iPSC-CMs auf der Grundlage optischer Reifemerkmale unter Verwendung interpretierbarer KI

使用可解释的AI根据光学成熟度特性对iPSC-CMs进行非侵入性成熟度评估

2505.20775v1

1090

05-27

TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state

TimePro: Effiziente Multivariate Langzeit-Zeitreihen-Prognose mit variabler und zeitversetzter Hyperstate

TimePro:基于变量与时间感知超状态的高效多变量长期时间序列预测

2505.20774v1

1091

05-27

MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning

MetaSlot: Durchbruch durch die feste Anzahl von Slots im Objekt-Zentrischen Lernen

MetaSlot: 打破对象中心学习中的固定空格数

2505.20772v1

1092

05-27

ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools

ChemHAS: Hierarchische Agenzien-Stacking zur Verbesserung von Chemiewerkzeugen

ChemHAS:加强化学工具的等级代理人

2505.21569v1

1093

05-27

Divide-Fuse-Conquer: Eliciting “Aha Moments” in Multi-Scenario Games

Divide-Fuse-Conquer: Eliciting “Aha Momente” in Multi-Szenario-Spiele

Divide-Fuse-Conquer:在多场景游戏中激发“顿悟时刻”

2505.16401v2

1094

05-27

Robust and Explainable Detector of Time Series Anomaly via Augmenting Multiclass Pseudo-Anomalies

Robuster und erklärbarer Detektor der Zeitreihenanomalie durch Augmenting-Multiclass-Pseudoanomalien

通过增强多级优度反射器反射反射器,对时间序列时间序列进行强力和可解释的探测器

2505.20765v1

1095

05-27

ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval

ConText-CIR: Von Konzepten lernen im Text für das komponierte Bild-Retrieval

ConText-CIR:从合成图像检索文本中的概念学习

2505.20764v1

1096

05-27

Learning to Explain Air Traffic Situation

Erklären der Lage im Luftverkehr

学习解释空中交通状况

2502.10764v2

1097

05-27

Practical estimation of the optimal classification error with soft labels and calibration

Praktische Schätzung des optimalen Klassifizierungsfehlers mit Softlabels und Kalibrierung

用软标签和校准校准对最佳分类错误的实际估计

2505.20761v1

1098

05-27

Multi-Stage Speaker Diarization for Noisy Classrooms

Mehrstufige Speaker-Diarisierung für Lärmklassenräume

面向嘈杂教室的多阶段说话人日志

2505.10879v2

1099

05-27

Pairwise Optimal Transports for Training All-to-All Flow-Based Condition Transfer Model

Paarweise Optimale Transporte für Training All-to-All Flow-Based Condition Transfer Modell

以对等方式最佳运输培训全到所有流动条件转让模式

2504.03188v2

1100

05-27

Scalable Model Merging with Progressive Layer-wise Distillation

Skalierbares Modell Zusammenführen mit progressiver schichtweiser Destillation

可缩放模型与递进图层蒸馏法合并

2502.12706v2

1101

05-27

Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction

Uni-Instruct: Einstufiges Diffusionsmodell durch Unified Diffusion Divergence Instruction

Uni- Instruct: 通过统一扩散分散指令单步扩散模型

2505.20755v1

1102

05-27

Stationary MMD Points for Cubature

Stationäre MMD-Punkte für Kubature

用于 Cubature(数值求积)的平稳 MMD 点

2505.20754v1

1103

05-27

EaqVLA: Encoding-aligned Quantization for Vision-Language-Action Models

EaqVLA: Kodierungsorientierte Quantisierung für Vision-Language-Action-Modelle

EaqVLA: 愿景-语言-行动模式的编码和一致的量化

2505.21567v1

1104

05-27

Map Space Belief Prediction for Manipulation-Enhanced Mapping

Karte Raum Glaube Vorhersage für manipulations-verbesserte Mapping

人工-增强绘图的地图空间信仰预测

2502.20606v2

1105

05-27

MOLLM: Multi-Objective Large Language Model for Molecular Design – Optimizing with Experts

MOLLM: Multi-Objective Large Language Model for Molecular Design – Optimierung mit Experten

MOLLM: 分子设计多目标大语言模型 – – 与专家优化

2502.12845v2

1106

05-27

‘Hello, World!’: Making GNNs Talk with LLMs

“Hallo, Welt!”: GNNs mit LLMs sprechen zu lassen

“你好,世界!” “让GNNs和LLMs说话”

2505.20742v1

1107

05-27

Can Small Language Models Learn, Unlearn, and Retain Noise Patterns?

Können kleine Sprachmodelle Geräuschmuster lernen, nicht lernen und erhalten?

小语言模型能够学习、不学习和保留噪音模式吗?

2407.00996v3

1108

05-27

Detecting Informative Channels: ActionFormer

Informative Kanäle erkennen: ActionFormer

检测信息通道:ActionFormer

2505.20739v1

1109

05-27

Adversarial bandit optimization for approximately linear functions

Adversariale Bandit-Optimierung für etwa lineare Funktionen

近似线性函数的对抗性赌博机优化

2505.20734v1

1110

05-27

SPA-RL: Reinforcing LLM Agents via Stepwise Progress Attribution

SPA-RL: Verstärkung der LLM-Agenten durch schrittweise Fortschrittszuweisung

SPA-RL:通过逐步推进加强LLM代理

2505.20732v1

1111

05-27

Semi-supervised Clustering Through Representation Learning of Large-scale EHR Data

Halbüberwachtes Clustering durch Repräsentationslernen von EHR-Großdaten

通过代表学习大规模电子人力资源数据,进行半监督的集群组合

2505.20731v1

1112

05-27

What LLMs Miss in Recommendations: Bridging the Gap with Retrieval-Augmented Collaborative Signals

Was LLMs in Empfehlungen vermissen: Die Lücke mit retrieval-Augmented Collaborative Signals überbrücken

LLM 在推荐中遗漏了什么:用检索增强的协同信号弥合差距

2505.20730v1

1113

05-27

Energy-based generator matching: A neural sampler for general state space

Energiebasierte Generator-Matching: Ein neuronaler Sampler für den allgemeinen Zustandsraum

基于能源的发电机匹配:一般状态空间的神经取样器

2505.19646v2

1114

05-27

A reinforcement learning agent for maintenance of deteriorating systems with increasingly imperfect repairs

Ein Verstärkungs-Lernmittel für die Instandhaltung von verschlechternden Systemen mit zunehmend unvollkommenen Reparaturen

强化学习代理,用于维护修理越来越不完善的恶化系统

2505.20725v1

1115

05-27

LeDiFlow: Learned Distribution-guided Flow Matching to Accelerate Image Generation

LeDiFlow: Erlernter, verteilungsgeführter Fluss passend zur beschleunigten Bildgenerierung

LeDiFlow:学习分布引导的流匹配以加速图像生成

2505.20723v1

1116

05-27

Diffusion Model-based Activity Completion for AI Motion Capture from Videos

Diffusion Modellbasierte Aktivitätsvervollständigung für AI Motion Capture aus Videos

AI 从视频中抓取 AI 运动的传播示范活动完成

2505.21566v1

1117

05-27

Recurrent Neural Operators: Stable Long-Term PDE Prediction

Recurrent Neural Operators: Stabile Langzeit-PDE-Vorhersage

经常性神经操作员:稳定的长期PDE预测

2505.20721v1

1118

05-27

ProgCo: Program Helps Self-Correction of Large Language Models

ProgCo: Programm hilft bei der Selbstkorrektur großer Sprachmodelle

ProgCo:程序辅助大语言模型自我纠正

2501.01264v2

1119

05-27

LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multimodal Large Language Models

LatentExplainer: Erklären von latenten Darstellungen in tiefgenerativen Modellen mit multimodalen großen Sprachmodellen

LatentExplainer:利用多模态大语言模型解释深度生成模型中的潜在表示

2406.14862v6

1120

05-27

PCDCNet: A Surrogate Model for Air Quality Forecasting with Physical-Chemical Dynamics and Constraints

PCDCNet: Ein Surrogate-Modell für die Luftqualitätsprognose mit physikalisch-chemischer Dynamik und Einschränkungen

PCDCNet:利用物理化学动态和制约因素进行空气质量预测的替代模型

2505.19842v2

1121

05-27

What is Fair? Defining Fairness in Machine Learning for Health

Was ist fair? Fairness im maschinellen Lernen für die Gesundheit definieren

什么是公平?界定机器保健学习的公平性

2406.09307v5

1122

05-27

Are Data Embeddings effective in time series forecasting?

Sind Daten-Embeddings in der Zeitreihenvorhersage wirksam?

数据嵌入在时间序列预测中是否有效?

2505.20716v1

1123

05-27

Wideband RF Radiance Field Modeling Using Frequency-embedded 3D Gaussian Splatting

Wideband RF Radiance Field Modellierung mit Frequenz eingebettet 3D Gaussian Splatting

使用频率组合的 3D 高斯平面

2505.20714v1

1124

05-27

Does Graph Prompt Work? A Data Operation Perspective with Theoretical Analysis

Funktioniert Graph Prompt? Eine Datenbetriebsperspektive mit theoretischer Analyse

《图表迅速工作吗? 带有理论分析的数据操作视角》

2410.01635v2

1125

05-27

Time-Series Learning for Proactive Fault Prediction in Distributed Systems with Deep Neural Structures

Time-Series Learning für proaktive Fehlervorhersage in verteilten Systemen mit tiefen neuralen Strukturen

深心神经结构分布系统预发性故障预测时间序列学习

2505.20705v1

1126

05-27

NeUQI: Near-Optimal Uniform Quantization Parameter Initialization

NeUQI: Beinahe-optimale einheitliche Quantisierung Parameter Initialisierung

NeUQI: 近最佳统一量化参数初始化

2505.17595v2

1127

05-27

Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases

Zwischen Circuits und Chomsky: Pre-Pretraining auf Formal Languages Imparts Linguistic Biases

巡回巡回和乔姆斯基之间:正式语言语言语言预科培训

2502.19249v2

1128

05-27

vCache: Verified Semantic Prompt Caching

vCache: Verifizierter semantischer Prompt-Caching

vCache: 校验语义快速缓冲

2502.03771v3

1129

05-27

Multi-instance Learning as Downstream Task of Self-Supervised Learning-based Pre-trained Model

Multi-Instance-Lernen als Downstream-Aufgabe des selbstüberwachten Learning-basierten vortrainierten Modells

将多机构学习作为自监督学习模式培训前模式的下游任务

2505.21564v1

1130

05-27

Sparsified State-Space Models are Efficient Highway Networks

Sparsifizierte State-Space-Modelle sind effiziente Highway-Netzwerke

稀疏化的状态空间模型是高效的 Highway 网络

2505.20698v1

1131

05-27

Token-level Accept or Reject: A Micro Alignment Approach for Large Language Models

Token-Level Akzeptieren oder ablehnen: Ein Micro Alignment-Ansatz für große Sprachmodelle

接受或拒绝时肯级别:大语言模式微调整方法

2505.19743v2

1132

05-27

Generating Hypotheses of Dynamic Causal Graphs in Neuroscience: Leveraging Generative Factor Models of Observed Time Series

Generieren von Hypothesen dynamischer Kausalgraphen in der Neurowissenschaft: Nutzung generativer Faktorenmodelle beobachteter Zeitreihen

在神经科学中生成动态因果图的假设:利用观测时间序列的生成因数模型

2505.20697v1

1133

05-27

Navigate the Unknown: Enhancing LLM Reasoning with Intrinsic Motivation Guided Exploration

Navigieren Sie das Unbekannte: Verbesserung der LLM-Vernunft mit intrinsischer Motivation geführte Exploration

导航未知:利用内在动力性引导探索加强LLM

2505.17621v2

1134

05-27

Temporal Saliency-Guided Distillation: A Scalable Framework for Distilling Video Datasets

Temporale Saliency-geführte Destillation: Ein skalierbares Framework für die Destillierung von Videodatensätzen

时间性盐度-指导蒸馏:用于蒸馏视频数据集的可缩放框架

2505.20694v1

1135

05-27

Phir Hera Fairy: An English Fairytaler is a Strong Faker of Fluent Speech in Low-Resource Indian Languages

Phir Hera Fairy: Ein englisches Märchen ist ein starker Faker der fließenden Rede in Low-Resource indischen Sprachen

Phir Hera Fairy:英国仙女是印度低资源语言流利流利的有力名人

2505.20693v1

1136

05-27

Evidential Deep Active Learning for Semi-Supervised Classification

Evidentielles tiefes aktives Lernen für semi-überwachte Klassifikation

半监督分类的证明深层积极学习

2505.20691v1

1137

05-27

Accelerating RL for LLM Reasoning with Optimal Advantage Regression

Beschleunigung von RL für LLM-Reasoning mit Optimal Advantage Regression

以最优优势回归加速面向 LLM 推理的强化学习

2505.20686v1

1138

05-27

A Survey of LLM $\times$ DATA

Eine Umfrage über LLM $\times$ DATEN

LLM $\times$ DATA 综述

2505.18458v2

1139

05-27

MODULI: Unlocking Preference Generalization via Diffusion Models for Offline Multi-Objective Reinforcement Learning

MODULI: Erschließung der Präferenzgeneralisierung durch Diffusionsmodelle für Offline-Multi-Objective-Reinforcement-Learning

MODULI:通过离线多目标强化学习扩散模型解锁普及

2408.15501v2

1140

05-27

SELF-PERCEPT: Introspection Improves Large Language Models’ Detection of Multi-Person Mental Manipulation in Conversations

SELF-PERCEPT: Introspection verbessert die Erkennung von Multi-Person-Gedankenmanipulation in Gesprächen durch große Sprachmodelle

SELF-PERCEPT: 调查改进大语言模型在对话中探测多人心理操纵

2505.20679v1

1141

05-27

Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System

Viele Köpfe sind besser als eins: Verbesserte wissenschaftliche Idee-Generation durch ein LLM-basiertes Multi-Agent-System

许多领导人比一个领导人好得多:由以LLM为基础的多种机构系统改进科学思想的一代

2410.09403v4

1142

05-27

LLM-Guided Reinforcement Learning: Addressing Training Bottlenecks through Policy Modulation

LLM-geführtes Stärkungslernen: Bewältigung von Ausbildungsengpässen durch politische Modulation

LLM-LLM-指导强化学习:通过政策调整解决培训瓶颈问题

2505.20671v1

1143

05-27

From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation

Vom Sehen zum Tun: Überbrücken von Vernunft und Entscheidung für die Robotermanipulation

从看到做:机器人操纵的搭桥理由和决定

2505.08548v2

1144

05-27

RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts

RE-Bench: Bewertung der KI-FuE-Fähigkeiten von Sprachmodellagenten gegen menschliche Experten

RE-BENCH: 对照人类专家评估语言模范代理商的AI研究与开发的前沿能力

2411.15114v2

1145

05-27

Predicting and Understanding College Student Mental Health with Interpretable Machine Learning

Vorhersagen und Verständnis College Student Mental Health mit Interpretable Machine Learning

预测和理解学院学生心理健康与可解释机器学习

2503.08002v2

1146

05-27

Continuous-Time Attention: PDE-Guided Mechanisms for Long-Sequence Transformers

Continuous-Time-Achtung: PDE-geführte Mechanismen für lange Sequenztransformatoren

持续关注:长序列变换者PDE-指导机制

2505.20666v1

1147

05-27

Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond

Auf dem Weg zu LLM Unlearning Resilient to Relearning Attacks: Eine scharfsinnige Minimierungsperspektive und darüber hinaus

走向LLM 学会学会学会学会重新学习攻击的不学习能力:锐化-尽量减少知识的视角及展望

2502.05374v4

1148

05-27

BLAST: Balanced Sampling Time Series Corpus for Universal Forecasting Models

BLAST: Ausgewogene Zeitreihen für universelle Vorhersagemodelle

BLAST: 通用预测模型平衡抽样时间序列

2505.17871v2

1149

05-27

Generalized and Personalized Federated Learning with Foundation Models via Orthogonal Transformations

Generalisiertes und personalisiertes Federated Learning mit Gründungsmodellen über Orthogonale Transformationen

通过矫形转变形成基础模型的通用和个性化联邦学习

2505.19888v2

1150

05-27

ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning

ReMA: Meta-Denken lernen für LLMs mit Multi-Agenten-Verstärkungs-Lernen

ReMA:学习多机构强化学习的LLMLM的元思维

2503.09501v3

1151

05-27

How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines

Wie können neurale Netzwerke mit Skalierungsgesetzen ausgebaut werden? Eine Umfrage und praktische Leitlinien

如何提升具有扩展法的神经网络?

2502.12051v3

1152

05-27

Enhancing Time Series Forecasting via a Parallel Hybridization of ARIMA and Polynomial Classifiers

Verbesserung der Zeitreihenprognose über eine parallele Hybridisierung von ARIMA und Polynom-Klassifikatoren

通过ARIMA和多边分类的平行混合预测增强时间序列

2505.06874v2

1153

05-27

An Optimisation Framework for Unsupervised Environment Design

Ein Rahmen für die Optimierung des unbeaufsichtigten Umweltdesigns

无人监督环境设计优化框架

2505.20659v1

1154

05-27

When More is Less: Understanding Chain-of-Thought Length in LLMs

Wenn mehr weniger ist: Verständnis der Chain-of-Thought-Länge in LLMs

越少越多: 了解LLM 中所寻求的链条长度

2502.07266v3

1155

05-27

Prompting Decision Transformers for Zero-Shot Reach-Avoid Policies

Prompting Decision Transformers für Zero-Shot-Reach-Avoid-Politiken

提示决策 Transformer 以实现零样本到达-规避策略

2505.19337v2

1156

05-27

New Paradigm of Adversarial Training: Releasing Accuracy-Robustness Trade-Off via Dummy Class

Neuer Paradigma der Adversarial Training: Freigabe von Genauigkeit-Robustheit-Trade-Off über Dummy-Klasse

对抗训练新范式:通过虚拟类缓解准确性与鲁棒性的权衡

2410.12671v2

1157

05-27

FRABench and GenEval: Scaling Fine-Grained Aspect Evaluation across Tasks, Modalities

FRABench und GenEval: Skalierung feinkörniger Aspekte Bewertung über Aufgaben, Modalitäten hinweg

FRABench 和 GenEval:跨任务、跨模态扩展细粒度方面评估

2505.12795v2

1158

05-27

Voronoi-grid-based Pareto Front Learning and Its Application to Collaborative Federated Learning

Voronoi-Grid-basiertes Pareto-Front-Lernen und seine Anwendung auf kollaboratives Federated Learning

以Voronoi-Grid为基础的Pareto阵线学习及其在联邦学习合作组织中的应用

2505.20648v1

1159

05-27

Moment Expansions of the Energy Distance

Momenterweiterungen der Energieentfernung

扩大能源距离时间

2505.20647v1

1160

05-27

Evaluating Training in Binarized Neural Networks Through the Lens of Algorithmic Information Theory

Bewertung der Ausbildung in Binarized Neural Networks durch die Linse der algorithmischen Informationstheorie

通过分析信息理论的透镜评估神经网络的觉测培训

2505.20646v1

1161

05-27

Task-Optimized Convolutional Recurrent Networks Align with Tactile Processing in the Rodent Brain

Aufgabenoptimierte rekurrente Faltungsnetzwerke stimmen mit der taktilen Verarbeitung im Nagetierhirn überein

任务优化的卷积循环网络与啮齿动物大脑的触觉处理相一致

2505.18361v2

1162

05-27

Can Past Experience Accelerate LLM Reasoning?

Kann vergangene Erfahrung LLM Reasoning beschleunigen?

以往经验能否加快LLM理由解释?

2505.20643v1

1163

05-27

PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation

PosterO: Strukturierung von Layout-Strukturen zur Aktivierung von Sprachmodellen in der Generierung von generalisierten Content-Aware-Layouts

PosterO: 构建布局树以在通用内容软件布局生成中启用语言模型

2505.07843v2

1164

05-27

Rethinking MUSHRA: Addressing Modern Challenges in Text-to-Speech Evaluation

Rethinking MUSHRA: Bewältigung moderner Herausforderungen in der Text-zu-Speech-Bewertung

重新思考MUSHRA:应对文本到语音评价中的现代挑战

2411.12719v3

1165

05-27

Pointing the Way: Refining Radar-Lidar Localization Using Learned ICP Weights

Den Weg weisen: Verfeinerung der Radar-Lidar-Lokalisierung mit erfahrenen ICP-Gewichten

指向方向:利用比较方案所积累的重量改进雷达-里达尔的本地化

2309.08731v4

1166

05-27

GMoE: Empowering LLMs Fine-Tuning via MoE Graph Collaboration

GMoE: Stärkung von LLMs Feinsteuerung über MoE Graph Collaboration

GMoE:通过 MoE 图协作增强 LLM 微调

2412.16216v3

1167

05-27

Non-identifiability distinguishes Neural Networks among Parametric Models

Nicht-Identifizierbarkeit unterscheidet neurale Netzwerke zwischen parametrischen Modellen

不可识别性将神经网络区分为参数模型

2504.18017v2

1168

05-27

Scintillation pulse characterization with spectrum-inspired temporal neural networks: case studies on particle detector signals

Scintillation-Pulscharakterisierung mit spektruminspirierten zeitlichen neuronalen Netzwerken: Fallstudien zu Partikeldetektor-Signalen

与受频谱启发的时时神经网络的闪烁脉冲定性:粒子探测器信号案例研究

2410.07267v3

1169

05-27

Policy Design for Two-sided Platforms with Participation Dynamics

Politikgestaltung für zweiseitige Plattformen mit Partizipationsdynamik

具有参与动态的双面平台政策设计

2502.01792v2

1170

05-27

Explaining Concept Shift with Interpretable Feature Attribution

Erklären von Konzeptverschiebungen mit interpretierbarer Eigenschaftszuweisung

解释解释概念转变与可解释性地物归属

2505.20634v1

1171

05-27

Adaptive Backtracking Line Search

Adaptive Rückverfolgungszeilensuche

适应性后回跟踪线搜索

2408.13150v2

1172

05-27

Test-Time Learning for Large Language Models

Test-Time Learning für große Sprachmodelle

大语言模型试验时间学习

2505.20633v1

1173

05-27

Incorporating Flexible Image Conditioning into Text-to-Video Diffusion Models without Training

Einschließlich flexibler Bildkonditionierung in Text-zu-Video-Diffusionsmodelle ohne Training

将灵活的图像条件纳入无培训的文本到视频传播模型

2505.20629v1

1174

05-27

Position: Adopt Constraints Over Penalties in Deep Learning

Position: Nebenbedingungen statt Straftermen im Deep Learning verwenden

立场:在深度学习中采用约束而非惩罚项

2505.20628v1

1175

05-27

JaxRobotarium: Training and Deploying Multi-Robot Policies in 10 Minutes

JaxRobotarium: Schulung und Einsatz von Multi-Roboter-Politik in 10 Minuten

JaxRobotarium:10分钟内训练和部署多机器人策略

2505.06771v2

1176

05-27

Knowledge Distillation Approach for SOS Fusion Staging: Towards Fully Automated Skeletal Maturity Assessment

Wissensdestillationsansatz für SOS-Fusionsstaging: Auf dem Weg zu einer vollautomatischen Skeletalreifebewertung

利用知识蒸馏方法解决求求求融合问题:全面自动化骨骼成熟期评估

2505.21561v1

1177

05-27

SeqPO-SiMT: Sequential Policy Optimization for Simultaneous Machine Translation

SeqPO-SiMT: Sequentielle Politikoptimierung für die gleichzeitige maschinelle Übersetzung

SeqPO-SIMT:同步机器翻译的序列政策优化

2505.20622v1

1178

05-27

Multi-level Certified Defense Against Poisoning Attacks in Offline Reinforcement Learning

Mehrstufige Zertifizierte Verteidigung gegen vergiftende Angriffe im Offline-Verstärkungslernen

多级认证防卫,防止在离线强化学习中进行毒物攻击

2505.20621v1

1179

05-27

An Inexact Halpern Iteration with Application to Distributionally Robust Optimization

Eine ungenaue Halpern-Iteration mit Anwendung zur distributiv robusten Optimierung

用于分布强力优化优化的不精确 Halpern 迭代

2402.06033v3

1180

05-27

SoftPQ: Robust Instance Segmentation Evaluation via Soft Matching and Tunable Thresholds

SoftPQ: Robuste Instance Segmentierungsbewertung über Soft Matching und Tunable Thresholds

SoftPQ:通过软匹配和可调阈值进行鲁棒的实例分割评估

2505.12155v2

1181

05-27

Real-Time Stress Monitoring, Detection, and Management in College Students: A Wearable Technology and Machine-Learning Approach

Echtzeit-Stress-Monitoring, Detection und Management in College-Studenten: Ein Wearable-Technologie- und Machine-Learning-Ansatz

大学生实时应力监测、检测和管理:穿戴技术和机械学习方法

2505.15974v2

1182

05-27

LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers

LLM-FE: Automatisiertes Feature Engineering für Tabellendaten mit LLMs als Evolutionsoptimierer

LLM-FE: 制表数据的自动地貌工程,LLMM作为进化优化器

2503.14434v2

1183

05-27

PhySense: Sensor Placement Optimization for Accurate Physics Sensing

PhySense: Sensor-Platzierungs-Optimierung für präzise Physik Sensing

感应:精确物理遥感传感器定位优化

2505.18190v2

1184

05-27

Intelligent Incident Hypertension Prediction in Obstructive Sleep Apnea

Intelligente Hypertonie-Vorhersage bei obstruktiver Schlafapnoe

阻碍睡眠的智能性事件超强度预测

2505.20615v1

1185

05-27

A Concentration Bound for TD(0) with Function Approximation

Ein Konzentrationsbund für TD(0) mit Funktionsannäherung

具有函数接近度的 TD(0) 的浓度界值

2312.10424v3

1186

05-27

REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning

REAL-Prover:用于数学推理的检索增强 Lean 证明器

2505.20613v1

1187

05-27

Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models

Roboflow100-VL: Ein Multi-Domain-Objekterkennungs-Benchmark für Vision-Language-Modelle

机器人流100-VL:愿景-语言模型多功能物体探测基准

2505.20612v1

1188

05-27

Hierarchical Mamba Meets Hyperbolic Geometry: A New Paradigm for Structured Language Embeddings

Hierarchische Mamba trifft auf Hyperbolische Geometrie: Ein neues Paradigma für strukturierte Spracheinbettungen

等级式 Mamba 相遇超双曲几何: 结构化语言嵌入的新范式

2505.18973v2

1189

05-27

Integral Imprecise Probability Metrics

Integral Ungenaue Wahrscheinlichkeits-Metriken

积分非精确概率度量

2505.16156v2

1190

05-27

Improving Generative Inverse Design of Rectangular Patch Antennas with Test Time Optimization

Verbesserung des generativen Inversen Designs von rechteckigen Patchantennen mit Testzeitoptimierung

改进带测试时间优化的矩形补边天线的生成反向设计

2505.18188v2

1191

05-27

InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling

InstGenIE: Generative Bildbearbeitung mit Mask-aware Caching und Scheduling effizient gemacht

InstGenie: 生成图像编辑, 高效使用防面具图像缓冲和排程

2505.20600v1

1192

05-27

Randomly Sampled Language Reasoning Problems Explain Limits of LLMs

Zufällig gemusterte Sprachbegründungsprobleme erklären Grenzen von LLMs

随机抽样语言原因问题解释LLMM限制

2501.02825v5

1193

05-26 (1)

GenMol: A Drug Discovery Generalist with Discrete Diffusion

GenMol: Ein Drug Discovery Generalist mit diskreter Diffusion

GenMol: 具有分辨扩散作用的药物发现通俗主义者

2501.06158v2

1194

05-26

Prot2Token: A Unified Framework for Protein Modeling via Next-Token Prediction

Prot2Token: Ein einheitliches Framework für Proteinmodellierung über Next-Token-Vorhersage

Prot2Token:通过次声预测建立蛋白模型的统一框架

2505.20589v1

1195

05-26

Bidirectional Variational Autoencoders

Bidirektionale Variationale Autoencoder

双向多向自动自动编码器

2505.16074v2

1196

05-26

Balancing Performance and Costs in Best Arm Identification

Ausgewogene Leistung und Kosten bei der Ermittlung der besten Waffen

平衡最佳武器识别的性能和费用

2505.20583v1

1197

05-26

Training a Generally Curious Agent

Ein allgemein neugieriger Agent ausbilden

训练一般好奇剂

2502.17543v3

1198

05-26

Ctrl-DNA: Controllable Cell-Type-Specific Regulatory DNA Design via Constrained RL

Strg-DNA: Kontrollierbare Zell-Typ-spezifische Regulatorische DNA-Design über eingeschränkte RL

Ctrl-DNA:通过受控RL设计可控细胞-Type-具体监管DNA

2505.20578v1

1199

05-26

Emotion Classification In-Context in Spanish

Emotion Classification In-Context auf Spanisch

西班牙文《情感分类西班牙文内引文》

2505.20571v1

1200

05-26

Bi-Level Unsupervised Feature Selection

Bi-Level-Unüberwachte Feature-Auswahl

双级不受监督的地物选择

2505.20563v1

1201

05-26

Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning

Jenseits von Markovian: Reflektierende Exploration über Bayes-Adaptive RL für LLM-Reasoning

马尔科维安之后:通过Bayes-Adaptive RL进行反射勘探,用于LLM 理由分析

2505.20561v1

1202

05-26

Advancing Molecular Machine Learning Representations with Stereoelectronics-Infused Molecular Graphs

Advancing Molecular Machine Learning Representations mit stereoelectronics-infused Molecular Graphs

具有立体电子成份式分子图的分子机学习演示

2408.04520v2

1203

05-26

Causal Composition Diffusion Model for Closed-loop Traffic Generation

Causal Composition Diffusion Modell für die Closed-Loop-Verkehrserzeugung

闭闭环交通流量生成原因构成传播模式

2412.17920v3

1204

05-26

Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text

Task-informierte Anti-Kurriculum durch Masken verbessert Downstream-Performance auf Text

通过遮罩改进文字下流业绩,以任务化的反文体

2502.12953v2

1205

05-26

Learning a Pessimistic Reward Model in RLHF

Ein pessimistisches Belohnungsmodell in RLHF lernen

在RLHF学习悲观奖励模式

2505.20556v1

1206

05-26

A ZeNN architecture to avoid the Gaussian trap

Eine ZeNN-Architektur, um die Gaussische Falle zu vermeiden

避免高斯陷阱的 ZeNN 建筑

2505.20553v1

1207

05-26

Estimating Motor Symptom Presence and Severity in Parkinson’s Disease from Wrist Accelerometer Time Series using ROCKET and InceptionTime

Abschätzung von Motorsymptome und Schweregrad bei Parkinson-Krankheit aus der Wrist Accelerometer Time Serie mit ROCKET und InceptionTime

利用 ROCKET 和受孕时间从风速计时间序列中估计帕金森氏病的机动症状存在和严重性

2304.11265v3

1208

05-26

TAPIP3D: Tracking Any Point in Persistent 3D Geometry

TAPIP3D: Verfolgung eines beliebigen Punktes in persistenter 3D-Geometrie

TAPIP3D:跟踪持久性三维几何中的任何点

2504.14717v2

1209

05-26

Achieving adaptivity and optimality for multi-armed bandits using Exponential-Kullback Leibler Maillard Sampling

Erreichen von Anpassungsfähigkeit und Optimalität für mehrarmige Banditen mit Exponential-Kullback Leibler Maillard Sampling

利用Exponential-Kullback Leibler Maillard抽样,实现多武装强盗的适应性和最佳性

2502.14379v2

1210

05-26

Quantum Speedups in Regret Analysis of Infinite Horizon Average-Reward Markov Decision Processes

Quantum Speedups bei der Bedauernsanalyse von Unendlichen Horizon durchschnittlichen Markov-Entscheidungsprozessen

对无限地平地平平线平均回报Markov决定程序进行遗憾分析时的量量加速

2310.11684v4

1211

05-26

RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs

RL nur im Namen? Analyse der strukturellen Annahmen im RL-Post-Training für LLMs

仅限名称的RL?分析在RL为LLMs提供的培训后培训中的结构假设

2505.13697v2

1212

05-26

Covariate-Adjusted Deep Causal Learning for Heterogeneous Panel Data Models

Kovariate-adjusted Deep Causal Learning für heterogene Panel-Datenmodelle

异质小组数据模型的共变调整深因学习

2505.20536v1

1213

05-26

Rotary Masked Autoencoders are Versatile Learners

Rotary Masked Autoencoder sind vielseitige Lerner

扶轮式遮罩自动算术员是多功能学习者

2505.20535v1

1214

05-26

HiPoNet: A Multi-View Simplicial Complex Network for High Dimensional Point-Cloud and Single-Cell Data

HiPoNet: Ein Multi-View-Komplexnetzwerk für hochdimensionale Point-Cloud- und Single-Cell-Daten

HiPoNet:高多面点和单细胞数据多视图简易复杂的网络

2502.07746v2

1215

05-26

One-shot Robust Federated Learning of Independent Component Analysis

强力学习独立构成部分分析

2505.20532v1

1216

05-26

Prediction-Enhanced Monte Carlo: A Machine Learning View on Control Variate

Vorhersage-erweitert Monte Carlo: Eine Machine-Learning-Ansicht auf Steuerungsvariate

预测增强的蒙特卡洛:关于控制Variatte的机械学习观点

2412.11257v2

1217

05-26

Fast Calculation of Feature Contributions in Boosting Trees

Schnelle Berechnung von Feature-Beiträgen bei der Förderung von Bäumen

快速计算推动树的特性贡献

2407.03515v2

1218

05-26

Training Articulatory Inversion Models for Inter-Speaker Consistency

Training Artikulatorische Inversionsmodelle für die Konsistenz zwischen den Lautsprechern

供发言者间和谐使用的培训用人工转换模型

2505.20529v1

1219

05-26

DYMAG: Rethinking Message Passing Using Dynamical-systems-based Waveforms

DYMAG: Nachricht neu denken Passieren mit Dynamisch-Systeme-basierten Wellenformen

DYMAG: 利用动态系统波形重新思考信息传递方式

2309.09924v5

1220

05-26

Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks

Lernpolitische Ausschüsse für effektive Personalisierung in MDPs mit unterschiedlichen Aufgaben

在有不同任务的多边发展方案中促进有效个性化的学习政策委员会

2503.01885v2

1221

05-26

Towards Fully FP8 GEMM LLM Training at Scale

Auf dem Weg zum vollständigen FP8 GEMM LLM Training auf Scale

全 FP8 GEMM LLM 大规模培训

2505.20524v1

1222

05-26

Scaling over Scaling: Exploring Test-Time Scaling Pareto in Large Reasoning Models

Skalierung über Skalierung: Untersuchung von Test-Zeit-Skalierung Pareto in großen vernünftigen Modellen

缩放过缩放: 探索大型理由模型中的测试时间缩放派

2505.20522v1

1223

05-26

Semi-Explicit Neural DAEs: Learning Long-Horizon Dynamical Systems with Algebraic Constraints

Halbexplizite neurale DAEs: Lernen von langhorizontigen dynamischen Systemen mit algebraischen Einschränkungen

半显性神经DAEs:学习具有代数限制的长毛利区动态系统

2505.20515v1

1224

05-26

On a Neural Implementation of Brenier’s Polar Factorization

Über eine neurale Umsetzung von Breniers Polarfaktorisierung

布赖尼尔极地化的神经实施

2403.03071v4

1225

05-26

A Novel Convolutional Neural Network-Based Framework for Complex Multiclass Brassica Seed Classification

Ein neuartiges konvolutionäres neurales Netzwerk-basiertes Framework für die komplexe Klassifizierung von mehrstufigen Brassica-Samen

复杂多级巴西种子种子分类新革命神经网络框架

2505.21558v1

1226

05-26

Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures

Beispiel und Karte aus einem einzigen Convex-Potential: Erzeugung mit konjugierenden Momenten

单一汇合潜能的样本和地图:使用协同时间措施生成

2503.10576v2

1227

05-26

Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review

Verkörperte KI mit Basismodellen für mobile Serviceroboter: Ein Systematischer Test

与 “ 移动服务机器人:系统审查 “ 基金会模型

2505.20503v1

1228

05-26

Retrieve to Explain: Evidence-driven Predictions for Explainable Drug Target Identification

Erklären Sie: Evidenz-getriebene Vorhersagen für erklärbare Drogenziel-Identifikation

寻求解释:对可解释药物目标识别的由证据驱动的预测

2402.04068v4

1229

05-26

CLEVRER-Humans: Describing Physical and Causal Events the Human Way

CLEVRER-Mensch: Physikalische und kausale Ereignisse auf menschliche Weise beschreiben

CLEVRER-人类:将自然和因果事件描述为人类道路

2310.03635v2

1230

05-26

Distributionally Robust Optimization

Verteilungsstarke Optimierung

分布强力优化

2411.02549v3

1231

05-26

Avoid Forgetting by Preserving Global Knowledge Gradients in Federated Learning with Non-IID Data

Vermeiden Sie das Vergessen, indem Sie globale Wissensgradienten im Föderierten Lernen mit Nicht-IID-Daten bewahren

避免在使用非IID数据进行联邦学习时因保留全球知识进步而被遗忘

2505.20485v1

1232

05-26

Towards Efficient Training of Graph Neural Networks: A Multiscale Approach

Auf dem Weg zu einer effizienten Ausbildung von Graphen-Neuralen Netzwerken: Ein multiskaliger Ansatz

争取对图形神经网络进行有效培训:一种多部门办法

2503.19666v3

1233

05-26

CardioPatternFormer: Pattern-Guided Attention for Interpretable ECG Classification with Transformer Architecture

CardioPatternFormer: Mustergeführte Aufmerksamkeit für die Interpretierbare EKG-Klassifikation mit Transformer-Architektur

卡尔迪·皮德·皮德罗·弗德:对具有变形结构的可解释的ECG分类的典型引导关注

2505.20481v1

1234

05-26

Leveraging Sparsity for Sample-Efficient Preference Learning: A Theoretical Perspective

Sparsamkeit für stichprobeneffizientes Preference-Lernen: Eine theoretische Perspektive

利用差距促进抽样有效优先学习:理论视角

2501.18282v3

1235

05-26

From learnable objects to learnable random objects

Von lernbaren Objekten zu lernbaren zufälligen Objekten

从可学习对象到可学习随机对象

2504.00847v2

1236

05-26

Stochastic Preconditioning for Neural Field Optimization

Stochastische Vorkonditionierung für die Neuralfeldoptimierung

神经场优化的斯托克预设设备

2505.20473v1

1237

05-26

WeatherEdit: Controllable Weather Editing with 4D Gaussian Field

WeatherEdit: Kontrollierbare Wetterbearbeitung mit 4D Gaussian Field

气象编辑: 4D Gaussian 字段的可控天气编辑

2505.20471v1

1238

05-26

Recursive Deep Inverse Reinforcement Learning

Rekursives tiefes Inverse-Verstärkung-Lernen

递归深反向强化学习

2504.13241v4

1239

05-26

Learning with Expected Signatures: Theory and Applications

Lernen mit erwarteten Signaturen: Theorie und Anwendungen

学习与预期签名:理论和应用

2505.20465v1

1240

05-26

Federated Learning-Distillation Alternation for Resource-Constrained IoT

Federated Learning-Destillation Alternative für ressourcengebundenes IoT

资源培训型IOT 资源培训型IOT替代物

2505.20456v1

1241

05-26

Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection

Skalierungsgesetze für das Vergessen beim Finetuning mit Vorschulungs-Dateninjektion

调整前数据输入时遗忘法律的扩大范围

2502.06042v2

1242

05-26

BlastOFormer: Attention and Neural Operator Deep Learning Methods for Explosive Blast Prediction

BlastOFormer: Aufmerksamkeit und neuraler Operator Deep Learning Methoden zur explosiven Blast-Vorhersage

BlastOFormer: 爆炸性爆炸预测的注意和神经操作员深学习方法

2505.20454v1

1243

05-26

Active Learning for Multiple Change Point Detection in Non-stationary Time Series with Deep Gaussian Processes

Aktives Lernen für Multiple Change Point Detection in nicht-stationären Zeitreihen mit tiefen Gauß-Prozessen

与深高斯进程一起在非静止时间序列中进行多变点探测活动学习

2505.20452v1

1244

05-26

Symmetry constrained neural networks for detection and localization of damage in metal plates

Symmetrie eingeschränkte neuronale Netze zur Erkennung und Lokalisierung von Schäden in Metallplatten

用于金属板块损害探测和定位的对称约束神经网络

2409.06084v3

1245

05-26

Time Series Generation Under Data Scarcity: A Unified Generative Modeling Approach

Zeitreihenerstellung unter Datenknappheit: Ein einheitlicher generativer Modellierungsansatz

数据缺乏情况下的时间序列生成:统一生成模式方法

2505.20446v1

1246

05-26

HoPE: Hybrid of Position Embedding for Length Generalization in Vision-Language Models

HoPE: Hybrid der Positionseinbettung für die Längenverallgemeinerung in Vision-Language-Modelle

HoPE:愿景-语言模型中长期通用化所嵌入的立场组合

2505.20444v1

1247

05-26

AI Learning Algorithms: Deep Learning, Hybrid Models, and Large-Scale Model Integration

KI-Learning-Algorithmen: Deep Learning, hybride Modelle und großformatige Modellintegration

AI 学习等级:深学习、混合模型和大型模型整合

2410.09186v3

1248

05-26

Holes in Latent Space: Topological Signatures Under Adversarial Influence

Löcher im latenten Raum: Topologische Signaturen unter dem Einfluss von Adversarien

低空空洞:在对立影响下的地形签名

2505.20435v1

1249

05-26

Kernel Quantile Embeddings and Associated Probability Metrics

Kernel-Quantile-Embeddings und zugehörige Wahrscheinlichkeits-Metriken

内核量量嵌入器及相关概率

2505.20433v1

1250

05-26

Differentiable Quadratic Optimization For The Maximum Independent Set Problem

Unterschiedliche quadratische Optimierung für das maximale unabhängige Set-Problem

最大独立集集问题可区别的二次二次曲线优化

2406.19532v6

1251

05-26

Self-reflective Uncertainties: Do LLMs Know Their Internal Answer Distribution?

Selbstreflektierende Unsicherheiten: Kennen LLMs ihre interne Antwortverteilung?

自我反感的不确定性:LLMs知道他们的内部答案分布吗?

2505.20295v1

1252

05-26

Reasoning LLMs are Wandering Solution Explorers

Grundlegende LLMs sind wandernde Lösungs-Explorer

理据LLMs是游荡的解决方案探索者

2505.20296v1

1253

05-26

Lorentz Local Canonicalization: How to Make Any Network Lorentz-Equivariant

Lorentz lokale Canonicalization: Wie man jedes Netzwerk Lorentz-äquivariant macht

Lorentz 本地 Canonicalization : 如何使任何网络 Lorentz-Equivariant

2505.20280v1

1254

05-26

Solving Hidden Monotone Variational Inequalities with Surrogate Losses

Lösen versteckter monotoner Variationsungleichheiten mit Surrogatverlusten

解决与代谢损失的隐藏单式单体差异性不平等

2411.05228v3

1255

05-26

The Coverage Principle: A Framework for Understanding Compositional Generalization

Das Coverage-Prinzip: Ein Rahmen für das Verständnis der kompositorischen Verallgemeinerung

覆盖范围原则:理解普遍组成框架

2505.20278v1

1256

05-26

Probabilistic Kernel Function for Fast Angle Testing

Probabilistische Kernel-Funktion für schnelle Winkelprüfung

用于快速角测试的概率内核函数

2505.20274v1

1257

05-26

Comparing Neural Network Encodings for Logic-based Explainability

Vergleich von Neural Network Encodings für Logic-basierte Erklärbarkeit

比较基于逻辑的解释性神经网络编码

2505.20269v1

1258

05-26

Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits

Ergebnisbasiertes Online-Verstärkungslernen: Algorithmen und grundlegende Grenzen

基于成果的在线强化学习:等级和基本限制

2505.20268v1

1259

05-26

syftr: Pareto-Optimal Generative AI

syftr: Pareto-Optimal Generative KI

syftr: Pareto-Optimal 生成 AI

2505.20266v1

1260

05-26

Lifelong Safety Alignment for Language Models

Lebenslange Sicherheitsausrichtung für Sprachmodelle

语言模型终身安全比对

2505.20259v1

1261

05-26

GRAPE: Optimize Data Mixture for Group Robust Multi-target Adaptive Pretraining

GRAPE: Optimierung der Datenmischung für ein robustes Multi-Target-Adaptives Vortraining

GRAPE: 优化集体强力多目标适应性预备培训的数据混合

2505.20380v1

1262

05-26

Position: Mechanistic Interpretability Should Prioritize Feature Consistency in SAEs

Position: Mechanische Dolmetschbarkeit sollte Feature-Konsistenz in SAEs priorisieren

位置: 机械可解释性:应优先考虑高级专业环境评估中的地物一致性

2505.20254v1

1263

05-26

Unveiling AI’s Blind Spots: An Oracle for In-Domain, Out-of-Domain, and Adversarial Errors

Enthüllen der Blind-Spots von KI: Ein Oracle für In-Domain-, Out-of-Domain- und Adversarial-Fehler

大赦国际不懈的《盲人点:内地、外地和反向错误的甲骨文》

2410.02384v3

1264

05-26

Learning Extrapolative Sequence Transformations from Markov Chains

Extrapolative Sequenztransformationen von Markov-Ketten lernen

来自Markov 链条的学习外推序列变换

2505.20251v1

1265

05-26

On the Guidance of Flow Matching

Über die Anleitung von Flow Matching

流动配对指南

2502.02150v3

1266

05-26

TACO: Training-free Sound Prompted Segmentation via Semantically Constrained Audio-visual CO-factorization

TACO: Schulungsfreie Klang-Prompt-Segmentierung über semantisch eingeschränkte Audio-visuelle CO-Fabrizierung

TACO:通过模拟压缩培训的视听共同推动因素,进行无培训、无培训的音频快速分割

2412.01488v3

1267

05-26

Efficient Optimization Accelerator Framework for Multistate Ising Problems

Effizientes Optimierungs-Beschleuniger-Framework für Multistate Ising-Probleme

高效高效优化多州化问题加速加速框架

2505.20250v1

1268

05-26

RedAHD: Reduction-Based End-to-End Automatic Heuristic Design with Large Language Models

RedAHD: Reduktionsbasiertes, End-to-End-Automatisches Heuristisches Design mit großen Sprachmodellen

RedAHD: 具有大语言模型的后端至后端自动超量设计

2505.20242v1

1269

05-26

DreamPRM: Domain-Reweighted Process Reward Model for Multimodal Reasoning

DreamPRM: Domain-regewichtetes Prozess-Reward-Modell für multimodale Vernunft

DreamPRM: 多边理由解释的负重评分进程奖励模式

2505.20241v1

1270

05-26

SITCOM: Step-wise Triple-Consistent Diffusion Sampling for Inverse Problems

SITCOM: Triple-Consistent Diffusions-Probenahme für inverse Probleme

SITCOM: 反问题递进三联扩散抽样

2410.04479v2

1271

05-26

A Temporal Difference Method for Stochastic Continuous Dynamics

Eine zeitliche Differenzmethode für stochastische kontinuierliche Dynamik

存储连续动态的时差方法

2505.15544v3

1272

05-26

RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

RAGEN: Selbst-Evolution in LLM-Agenten durch Multi-Turn-Verstärkungs-Lernen verstehen

RAGEN: 通过多阶段强化学习了解LLM代理商的自我演变

2504.20073v2

1273

05-26

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

SFT-Erinnerungen, RL Generalisiert: Eine vergleichende Studie des Stiftungsmodells nach der Ausbildung

SFT Memorizes,RL一般化:基金会培训模式模型比较研究

2501.17161v2

1274

05-26

Variational Deep Learning via Implicit Regularization

Variationales Deep Learning durch Implizite Regularisierung

通过隐性规范化进行不同的深层学习

2505.20235v1

1275

05-26

Multimodal Federated Learning With Missing Modalities through Feature Imputation Network

Multimodales Federated Learning mit fehlenden Modalitäten durch Feature Imputation Network

通过特征截肢网络以失踪模式进行多模式联邦学习

2505.20232v1

1276

05-26

From What to How: Attributing CLIP’s Latent Components Reveals Unexpected Semantic Reliance

Von was zu wie: Zuweisen von CLIPs latenten Komponenten zeigt ungeahnte semantische Zuverlässigkeit

从何到如何: 将 CLIP 的内部部件流出异常的语义依赖性归结为 CLIP 的内部批量。

2505.20229v1

1277

05-26

FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models

FLAME-MoE: Eine transparente End-to-End-Forschungsplattform für Mixture-of-Experts-Sprachmodelle

FLAME-MoE:混合专家语言模型透明端对端研究平台

2505.20225v1

1278

05-26

Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects

Chain-of-Thought für autonomes Fahren: Eine umfassende Umfrage und Zukunftsaussichten

寻求自主驾驶:全面调查和未来前景

2505.20223v1

1279

05-26

Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Rollen Sie die Würfel & Blick, bevor Sie springen: Gehen über die kreativen Grenzen der Next-Token-Vorhersage

跳跃前的骰子滚动和看一看:超越了次声预测的创造性极限

2504.15266v2

1280

05-26

Gradient Flow Matching for Learning Update Dynamics in Neural Network Training

Gradient Flow Passend zum Lernen von Update-Dynamik im neuralen Netzwerktraining

神经网络培训中学习更新动态动态的渐进流程匹配

2505.20221v1

1281

05-26

Open the Eyes of MPNN: Vision Enhances MPNN in Link Prediction

Öffnen Sie die Augen von MPNN: Vision verbessert MPNN in Link Prediction

MPNN的 “ 睁开眼 “ :愿景在 “ 连结预测 “ 中加强MPNN

2505.08266v2

1282

05-26

New Perspectives on the Polyak Stepsize: Surrogate Functions and Negative Results

Neue Perspektiven auf die Polyak Stepsize: Surrogate-Funktionen und negative Ergebnisse

关于 “ 多边步骤的新观点:代理功能和消极结果 “

2505.20219v1

1283

05-26

Fine-grained List-wise Alignment for Generative Medication Recommendation

Feinkörnige List-Wise-Ausrichtung für Generative Medikamente Empfehlung

生产用药建议精制清单调整

2505.20218v1

1284

05-26

Parameter-Efficient Fine-Tuning with Column Space Projection

Parameter-Effizient Feintuning mit Säulenraumprojektion

带有列空间投射的高效参数精密设计

2505.20211v1

1285

05-26

FedECA: A Federated External Control Arm Method for Causal Inference with Time-To-Event Data in Distributed Settings

FedECA: Eine Federated External Control Arm Methode für ursächliche Schlussfolgerungen mit Zeit-bis-Event-Daten in verteilten Einstellungen

FedECA:在分布环境中利用时间到时间的数据进行因果关系推断的联邦外部控制武器法

2311.16984v9

1286

05-26

Temporal Sampling for Forgotten Reasoning in LLMs

Zeitliche Probenahme für vergessene Vernunft in LLMs

LLM 被遗忘原因的时间抽样

2505.20196v1

1287

05-26

FunReason: Enhancing Large Language Models’ Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement

FunReason: Erweiterung der Funktion großer Sprachmodelle durch Multiscale-Verluste und automatisierte Datenverfeinerung durch Selbst-Refinement

FunReason:通过自我改进、多尺度损失和数据自动化改进加强大语言模型功能

2505.20192v1

1288

05-26

Private Geometric Median in Nearly-Linear Time

Private Geometrische Medien in fast linearer Zeit

近利时私人几何中位数

2505.20189v1

1289

05-26

Research on feature fusion and multimodal patent text based on graph attention network

Forschungsarbeiten über Feature Fusion und multimodalen Patenttext auf der Grundlage von Graphen Aufmerksamkeit Netzwerk

根据图示关注网络研究地物聚合和多式专利法

2505.20188v1

1290

05-26

UniMoMo: Unified Generative Modeling of 3D Molecules for De Novo Binder Design

UniMoMo: Unified Generative Modellierung von 3D-Molekülen für De Novo Binder Design

UniMoMo:De Novo Binder 设计3D Molecules的统一生成模型

2503.19300v3

1291

05-26

Linearization of ReLU Activation Function for Neural Network-Embedded Optimization: Optimal Day-Ahead Energy Scheduling

Linearisierung der ReLU-Aktivierungsfunktion für neurale Netzwerk-Embedded-Optimierung: Optimale Day-Ahead-Energieplanung

ReLU神经网络激活功能的线性化

2310.01758v2

1292

05-26

Bayesian Optimisation Against Climate Change: Applications and Benchmarks

Bayesische Optimierung gegen den Klimawandel: Anwendungen und Benchmarks

Bayesian最佳应对气候变化:应用和基准

2306.04343v2

1293

05-26

On the Volatility of Shapley-Based Contribution Metrics in Federated Learning

Über die Volatilität von Shapley-Based Contribution Metrics im Federated Learning

联邦学习中基于毛质的贡献度量变化无常

2405.08044v4

1294

05-26

No Free Lunch: Non-Asymptotic Analysis of Prediction-Powered Inference

Kein kostenloses Mittagessen: Nicht-asymptotische Analyse von Vorhersage-Powered Inferenz

无免费午餐:预测力推论的非心理分析

2505.20178v1

1295

05-26

The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination

Die Macht des iterativen Filterns für überwachtes Lernen mit (schwerer) Kontaminierung

受监督学习(重)污染的迭代过滤功能

2505.20177v1

1296

05-26

“KAN you hear me?” Exploring Kolmogorov-Arnold Networks for Spoken Language Understanding

“KAN hörst du mich?” Kolmogorov-Arnold-Netzwerke für gesprochenes Sprachverständnis erkunden

探索科尔莫戈洛夫-阿诺尔德语言理解网络

2505.20176v1

1297

05-26

mPOLICE: Provable Enforcement of Multi-Region Affine Constraints in Deep Neural Networks

mPOLICE: Beweisbare Durchsetzung von Multi-Region Affine-Constraints in tiefen neuralen Netzwerken

mPOLICE: 在深神经网络中以可行方式执行多种区域同系限制

2502.02434v2

1298

05-26

Virtual Cells: Predict, Explain, Discover

Virtuelle Zellen: Vorhersagen, Erklären, Entdecken

虚拟细胞: 预测、解释、发现

2505.14613v2

1299

05-26

A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation

Ein theoretischer Rahmen für Grokking: Interpolation gefolgt von Riemannsche Norm Minimierung

Grokking理论框架:内插,然后是Riemannian Norm 最小化

2505.20172v1

1300

05-26

From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data

Von der Ausrichtung zur Weiterentwicklung: Bootstrapping Audio-Language Alignment mit synthetischen Daten

从对齐到推进: 用合成数据推动音频语言对齐

2505.20166v1

1301

05-26

Capability-Based Scaling Laws for LLM Red-Teaming

Capability-Based Scaling-Gesetze für LLM Red-Teaming

LLM 红色团队合作以能力为基础的增强法律

2505.20162v1

1302

05-26

Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning

Prismatische Synthese: Gradientenbasierte Datendiversifizierung steigert Generalisierung in LLM-Reasoning

理论综合:基于逐步的数据多样化促进LLM理由说明的概括化

2505.20161v1

1303

05-26

Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities

Gedachte politische Optimierung: Überwindung externer Leitlinien und interner Fähigkeiten

优化政策:将外部指导和内部能力结合起来

2505.15692v2

1304

05-26

Polynomial, trigonometric, and tropical activations

Polynomische, trigonometrische und tropische Aktivierungen

多边、三角和热带活性

2502.01247v2

1305

05-26

On the (Non) Injectivity of Piecewise Linear Janossy Pooling

Auf der (Nicht-)Injektivität der stückweise linearen Janossy-Pooling

论分段线性 Janossy 池化的(非)单射性

2505.20150v1

1306

05-26

SeMe: Training-Free Language Model Merging via Semantic Alignment

SeMe: Training-freies Sprachmodell Zusammenführen über semantische Ausrichtung

SeMe:通过语义一致合并的无培训语言模式

2505.20144v1

1307

05-26

Model Stitching by Functional Latent Alignment

Modellstitching durch funktionale Latent Alignment

通过功能性前端对齐进行模型切换

2505.20142v1

1308

05-26

GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models

GUARD: Rollenspiel zur Generierung von Jailbreakings in natürlicher Sprache zur Prüfung der Einhaltung der Leitlinie für große Sprachmodelle

GUARD: 利用《大语言模式遵守试验准则准则》创造以自然语言破门破门

2402.03299v5

1309

05-26

Error Optimization: Overcoming Exponential Signal Decay in Deep Predictive Coding Networks

Fehler-Optimierung: Überwindung exponentieller Signaldekay in tiefen vorausschauenden Codierungsnetzwerken

错误优化 : 克服深预报编码网络中的指数信号衰减

2505.20137v1

1310

05-26

P$^2$ Law: Scaling Law for Post-Training After Model Pruning

P$^2$ Gesetz: Skalierungsgesetz für Post-Training nach Modell-Pruning

P$^2$ 法则:模型剪枝后进行后训练的缩放法则

2411.10272v3

1311

05-26

AweDist: Attention-aware Embedding Distillation for New Input Token Embeddings

AweDist: Aufmerksamkeitsbewusste Einbettung Destillation für neue Eingabe-Token-Einbettungen

AweDist: 新的输入式嵌入式嵌入器的注意嵌入蒸馏

2505.20133v1

1312

05-26

InfoBridge: Mutual Information estimation via Bridge Matching

InfoBridge: Gegenseitige Informationsschätzung über Bridge Matching

InfoBridge:通过桥梁匹配进行相互信息估计

2502.01383v2

1313

05-26

Outcome-based Reinforcement Learning to Predict the Future

Ergebnisbasiertes Bewehrungslernen zur Vorhersage der Zukunft

基于成果的强化学习,以预测未来

2505.17989v2

1314

05-26

Tensorization is a powerful but underexplored tool for compression and interpretability of neural networks

Tensorisierung ist ein leistungsfähiges, aber unerforschtes Werkzeug zur Kompression und Interpretationsfähigkeit neuronaler Netzwerke

电温是压缩和解释神经网络的强大但探索不足的工具

2505.20132v1

1315

05-26

MolEditRL: Structure-Preserving Molecular Editing via Discrete Diffusion and Reinforcement Learning

MolEditRL: Strukturschonende molekulare Bearbeitung durch diskretes Diffusions- und Verstärkungslernen

MolEditRL:通过分解分解和扩散及强化学习保持结构的分子编辑

2505.20131v1

1316

05-26

Balancing Interference and Correlation in Spatial Experimental Designs: A Causal Graph Cut Approach

Balance zwischen Interferenz und Korrelation in räumlichen Experimentaldesigns: Ein ursächlicher Graphenschnitt-Ansatz

空间实验设计中平衡干扰和关联:因果图表切割法

2505.20130v1

1317

05-26

Uncertainty Quantification for LLM-Based Survey Simulations

Ungewissheitsquantifizierung für LLM-basierte Umfragesimulationen

以LLM为基础的LLM调查模拟器的不确定性定量

2502.17773v3

1318

05-26

From Tables to Time: How TabPFN-v2 Outperforms Specialized Time Series Forecasting Models

Von Tabellen zur Zeit: Wie TabPFN-v2 Modelle der speziellen Zeitreihenvorhersage übertrifft

从表格到时间: TabPFN-v2 如何表现超过专门时间序列预测模型

2501.02945v3

1319

05-26

Understanding Generalization in Diffusion Models via Probability Flow Distance

Verallgemeinerung in Diffusionsmodellen über Wahrscheinlichkeitsflussentfernung verstehen

通过概率流动远距离理解扩散模型的通用化

2505.20123v1

1320

05-26

Likelihood-Ratio Regularized Quantile Regression: Adapting Conformal Prediction to High-Dimensional Covariate Shifts

Likelihood-Ratio Regularized Quantile Regression: Anpassung der konformen Vorhersage an hochdimensionale Kovariate Verschiebungen

常规量化递减:调整对高多元共变变化的正规预测

2502.13030v2

1321

05-26

Algorithmic Control Improves Residential Building Energy and EV Management when PV Capacity is High but Battery Capacity is Low

Algorithmische Steuerung verbessert Wohngebäude Energie-und EV-Management, wenn PV-Kapazität ist hoch, aber Batterie-Kapazität ist gering

当光电池容量高但电池容量低时,控制电量控制改进住宅建筑的能源和EV管理

2505.20377v1

1322

05-26

Generative diffusion for perceptron problems: statistical physics analysis and efficient algorithms

Generative Diffusion für Perceptronprobleme: statistische Physikanalyse und effiziente Algorithmen

生成感官问题扩散:统计物理分析和有效算法

2502.16292v2

1323

05-26

Proxy-Free GFlowNet

Proxy-freies GFlowNet

无代理的GFlowNet

2505.20110v1

1324

05-26

Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning

Verfeinerung von Text-zu-Multiview-Diffusion durch Verstärkungslernen

通过强化学习改进微小的中文本到多视图传播

2505.20107v1

1325

05-26

Preference-Based Gradient Estimation for ML-Guided Approximate Combinatorial Optimization

Präferenzbasierte Gradientenschätzung für ML-geführte annähernde Kombinator-Optimierung

ML- Guided 近似组合优化的基于优惠的渐进式测算

2502.19377v2

1326

05-26

Spurious Privacy Leakage in Neural Networks

Spurious Privacy Leakage in neuralen Netzwerken

神经网络中的净隐私渗漏

2505.20095v1

1327

05-26

A fast sound power prediction tool for genset noise using machine learning

Ein schnelles Sound-Power-Prognose-Tool für Genset-Rausch mit maschinellem Lernen

利用机器学习的genset噪音快速声功率预测工具

2505.20079v1

1328

05-26

Grokking ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior

Grokking ExPLAIND: Vereinheitlichung von Modell, Daten und Trainingszuweisung zum Studieren von Modellverhalten

Grokking ExPLAIND: 用于研究模型行为的统一模型、数据和培训归属

2505.20076v1

1329

05-26

An Out-Of-Distribution Membership Inference Attack Approach for Cross-Domain Graph Attacks

Ein Out-Of-Distribution-Mitgliedschaft Inferenz Angriff Ansatz für Cross-Domain Graph Attacks

跨领域石块袭击的批外分配成员推推攻击方法

2505.20074v1

1330

05-26

SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety

SafeDPO: Ein einfacher Ansatz zur direkten Preference-Optimierung mit erhöhter Sicherheit

SafeDPO: 以强化安全方式直接优化优惠的简单办法

2505.20065v1

1331

05-26

SAEs Are Good for Steering – If You Select the Right Features

SAEs sind gut für das Lenken – wenn Sie die richtigen Funktionen auswählen

SAEs 有利于指导 – – 如果您选择了正确的特性

2505.20063v1

1332

05-26

Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting

Time-VLM: Erforschung multimodaler Vision-Sprachenmodelle für Augmented Time Series Forecasting

时间-VLM:探索扩大时间序列预测的多模式愿景-语言模型

2502.04395v2

1333

05-26

Sable: a Performant, Efficient and Scalable Sequence Model for MARL

Sable: ein leistungsfähiges, effizientes und skalierbares Sequenzmodell für MARL

电缆:MARL的性能、高效和可缩放序列模型

2410.01706v5

1334

05-26

Ankh3: Multi-Task Pretraining with Sequence Denoising and Completion Enhances Protein Representations

Ankh3: Multi-Task Pretraining mit Sequenz Denoisieren und Vollendung verbessert Proteindarstellungen

Ankh3: 具有序列取消和完成的多任务预先培训,加强蛋白质代表制

2505.20052v1

1335

05-26

Catoni-Style Change Point Detection for Regret Minimization in Non-Stationary Heavy-Tailed Bandits

Catoni-Style Change Point Detection für Reue Minimierung in nicht-stationären schwer-gefährdeten Banditen

用于在非连续重型重航匪徒中最遗憾最小化的卡特托尼- 轮式变速点探测

2505.20051v1

1336

05-26

Synthetic Time Series Forecasting with Transformer Architectures: Extensive Simulation Benchmarks

Synthetische Zeitreihenprognosen mit Transformer-Architekturen: Umfangreiche Simulations-Benchmarks

利用变形建筑结构预测合成时间序列:广泛模拟基准

2505.20048v1

1337

05-26

Convex Approximation of Two-Layer ReLU Networks for Hidden State Differential Privacy

Convex-Annäherung von Zwei-Layer-ReLU-Netzwerken für versteckte staatliche differentielle Privatsphäre

隐藏式国家差异隐私双线雷路网络的连接近似

2407.04884v3

1338

05-26

Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning

Kontrolle des neuralen Zusammenbruchs verbessert Out-of-Distribution Detection und Transfer Learning

控制神经崩溃增强传播外探测和转让学习

2502.10691v2

1339

05-26

Beyond Simple Concatenation: Fairly Assessing PLM Architectures for Multi-Chain Protein-Protein Interactions Prediction

Beyond Simple Concatenation: Fairly Assessing PLM Architectures for Multi-Chain Protein-Protein Interaktionen Prediction

超越简单星系:公平评估多沙因蛋白因-蛋白因相互作用预测的PLM结构

2505.20036v1

1340

05-26

TeleSparse: Practical Privacy-Preserving Verification of Deep Neural Networks

TeleSparse: Praktische Datenschutz-Bewahrung von Tiefen-Neural-Netzwerken

远程分离:深海神经网络的实际隐私保护核查

2504.19274v2

1341

05-26

ViTaPEs: Visuotactile Position Encodings for Cross-Modal Alignment in Multimodal Transformers

ViTaPEs: Visuotaktile Positionskodierungen für die modulübergreifende Ausrichtung in multimodalen Transformatoren

ViTAPEs:多式变换器中跨模式对齐的变量定位位置编码

2505.20032v1

1342

05-26

Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions

Mehrere Abstiege im Deep Learning als Folge von Order-Chaos-Übergängen

作为有秩序的赵国过渡的一个序列的深层学习中的多种族后裔

2505.20030v1

1343

05-26

Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)

Korrelation von Instruktions-Tuning (in multimodalen Modellen) mit visionssprachlicher Verarbeitung (im Gehirn)

与视觉语言处理(大脑中)相交校正(多式联运模式)

2505.20029v1

1344

05-26

Multi-modal brain encoding models for multi-modal stimuli

Multimodale Gehirnkodierungsmodelle für multimodale Reize

多模式刺激多模式大脑编码模型

2505.20027v1

1345

05-26

Gradient Inversion Transcript: Leveraging Robust Generative Priors to Reconstruct Training Data from Gradient Leakage

Gradient Inversion Transcript: Leveraging Robust Generative Priors to Reconstruct Trainingsdaten von Gradient Leakage

梯度反转轨迹:从梯度渗漏中重新构建培训数据的杠杆化强力生成前程

2505.20026v1

1346

05-26

Human-Aligned Image Models Improve Visual Decoding from the Brain

Menschlich ausgerichtete Imagemodelle verbessern die visuelle Dekodierung aus dem Gehirn

人与人之间的图像模型改进大脑的视觉解码

2502.03081v2

1347

05-26

Ontology- and LLM-based Data Harmonization for Federated Learning in Healthcare

Ontologie- und LLM-basierte Datenharmonisierung für das Federated Learning in Healthcare

以本体学和LLM为基础的保健方面联邦学习数据统一

2505.20020v1

1348

05-26

ProcessBench: Identifying Process Errors in Mathematical Reasoning

ProcessBench: Identifizierung von Prozessfehlern in mathematischer Reasoning

进程快节: 识别数学原因中的进程错误

2412.06559v4

1349

05-26

Kernel-based estimators for functional causal effects

kernbasierte Schätzwerte für funktionelle kausale Effekte

功能因果效应的内核核心估计值

2503.05024v3

1350

05-26

Data-Dependent Regret Bounds for Constrained MABs

Datendependent Regret Bounds for Constrained MABs

受约束 MAB 的受控数据依赖的 Regret Bounds

2505.20010v1

1351

05-26

Prediction-Powered E-Values

Voraussichtliche E-Werte

预测力电子价值

2502.04294v2

1352

05-26

TabPFN: One Model to Rule Them All?

TabPFN: Ein Modell, um sie alle zu beherrschen?

TabPFN: 一种模式来统治他们吗?

2505.20003v1

1353

05-26

Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents

Unvollkommenheit: Simulieren von Studenten mit unterschiedlichen kognitiven Ebenen mit LLM-basierten Agenten

普及缺陷:利用基于LLM的代理物模拟具有不同认知水平的学生

2505.19997v1

1354

05-26

Learning Optimal Multimodal Information Bottleneck Representations

Optimales Lernen multimodaler Informationen Engpässe Vertretungen

学习最佳最佳多模式信息

2505.19996v1

1355

05-26

Distortion Resilience for Goal-Oriented Semantic Communication

Distortion Resilienz für zielorientierte semantische Kommunikation

目标导向语义交流的扭曲复原力

2309.14587v2

1356

05-26

Federated Domain Generalization with Data-free On-server Matching Gradient

Föderierte Domain-Verallgemeinerung mit datenfreiem On-Server-Zustimmungs-Gradient

具有无数据观测站上与渐变匹配的无数据观测器的联邦通用域

2501.14653v2

1357

05-26

Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach

Bedauerliche Analyse von durchschnittlichen Unichain-MDPs über einen actor-Critic-Ansatz

通过“行动者-批评办法”对平均回报单链式微DP的遗憾分析

2505.19986v1

1358

05-26

Bridging The Multi-Modality Gaps of Audio, Visual and Linguistic for Speech Enhancement

Überbrückung der Multi-Modalitätslücken von Audio, Visual und Linguistik zur Sprachverbesserung

弥合视听和语言的多模式差距,加强语言、视听能力

2501.13375v2

1359

05-26

Rethinking Probabilistic Circuit Parameter Learning

Probabilistisches Parameter-Lernen neu denken

重新思考概率电路参数学习

2505.19982v1

1360

05-26

Differential Privacy Analysis of Decentralized Gossip Averaging under Varying Threat Models

Differential Privacy Analyse dezentralisierter Gossip Average unter unterschiedlichen Bedrohungsmodellen

对不同威胁模式下分散的流民的隐私差异分析

2505.19969v1

1361

05-26

Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)

Position: Löse schichtweise lineare Modelle, um neurale dynamische Phänomene zu verstehen (Neuraler Kollaps, Emergence, Lazy/Rich Regime und Grokking)

位置:首先理解神经动态现象的解层图层线性模型(神经崩溃、新出现、Lazy/Rich制度和Grokking)

2502.21009v2

1362

05-26

Learning to Select In-Context Demonstration Preferred by Large Language Model

Lernen, In-Kontext-Demonstration zu wählen Bevorzugt nach großen Sprachmodellen

学习选择大语言模式首选的文本内演示

2505.19966v1

1363

05-26

The Limits of Preference Data for Post-Training

Die Grenzen der Präferenzdaten für das Post-Training

培训后优先数据限值

2505.19964v1

1364

05-26

Robustly optimal dynamics for active matter reservoir computing

Robust optimale Dynamik für das Recreservoir Computing mit aktiven Materien

活性物质储油层计算强有力的最佳动态

2505.05420v2

1365

05-26

Explanatory Summarization with Discourse-Driven Planning

Erklärende Zusammenfassung mit diskursgetriebener Planung

与 “ 分流规划 “ 结合的解释性总结

2504.19339v3

1366

05-26

RAP: Runtime-Adaptive Pruning for LLM Inference

RAP: Runtime-Adaptive Pruning für LLM-Inferenz

RAP:LLM 推断的运行时间-适应性节制

2505.17138v2

1367

05-26

Multi-Type Point Cloud Autoencoder: A Complete Equivariant Embedding for Molecule Conformation and Pose

Multi-Type-Punkt-Cloud-Autoencoder: Ein komplettes Equivariant-Embedding für Molekülkonformation und Pose

多类型点云云自动编码器:分子构造和脉冲的完全等同嵌入

2405.13791v3

1368

05-26

MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research

MLR-Bench: Bewertung von KI-Agenten auf Open-Ended Machine Learning Research

MLR-Bench:评估AI公司在开放式机械学习研究方面的代理机构

2505.19955v1

1369

05-26

An Explainable Diagnostic Framework for Neurodegenerative Dementias via Reinforcement-Optimized LLM Reasoning

Ein erklärbares Diagnose-Framework für neurodegenerative Dementias durch Verstärkungsoptimierte LLM-Reasoning

通过强化-优化LLM解释性理疗理由的神经医学性痴呆症可解释的诊断框架

2505.19954v1

1370

05-26

Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions

Welche Datenattribute stimulieren die Mathe- und Code-Reasoning? Eine Untersuchung über Einflussfunktionen

哪些数据属性刺激数学和代码理由? 通过影响函数进行调查

2505.19949v1

1371

05-26

SaSi: A Self-augmented and Self-interpreted Deep Learning Approach for Few-shot Cryo-ET Particle Detection

SaSi: Ein selbst-augmentierter und selbst-interpretierter Deep-Learning-Ansatz für die wenige Schuss Cryo-ET Partikelerkennung

SaSi:对几近的Cryo-ET粒子探测自增强和自我解释的深层学习方法

2505.19948v1

1372

05-26

Dynamically Learned Test-Time Model Routing in Language Model Zoos with Service Level Guarantees

Dynamisch gelerntes Test-Time-Modell-Routing in Sprachmodell Zoos mit Service-Level-Garantien

具有服务级保障的语文示范动物园动态学习测试时间模型运行

2505.19947v1

1373

05-26

Inverse Q-Learning Done Right: Offline Imitation Learning in $Q^π$-Realizable MDPs

Inverse Q-Learning Done Right: Offline-Imitation Lernen in $Q^π$-realisierbaren MDPs

逆向 Q-学习的正确做法:$Q^π$-可实现 MDP 中的离线模仿学习

2505.19946v1

1374

05-26

RefinedFields: Radiance Fields Refinement for Planar Scene Representations

Verfeinerte Felder: Strahlungsfelder Verfeinerung für planare Szenendarstellungen

精炼田地: 辐射田地

2312.00639v4

1375

05-26

Can Visual Encoder Learn to See Arrows?

Kann Visual Encoder lernen, Pfeile zu sehen?

视觉编码器能学会看到箭头吗 ?

2505.19944v1

1376

05-26

Beyond Freezing: Sparse Tuning Enhances Plasticity in Continual Learning with Pre-Trained Models

Beyond Freezing: Sparse Tuning verbessert Plastizität im kontinuierlichen Lernen mit vortrainierten Modellen

超出冻结范围:在继续学习过程中,采用培训前模式,粗略的加注可增强可塑性

2505.19943v1

1377

05-26

Task-Oriented Low-Label Semantic Communication With Self-Supervised Learning

Aufgabenorientierte kabelarme semantische Kommunikation mit selbstüberwachtem Lernen

以任务为导向的低标签低标签语义交流与自控学习

2505.19940v1

1378

05-26

Efficient Time Series Processing for Transformers and State-Space Models through Token Merging

Effiziente Zeitreihenverarbeitung für Transformatoren und State-Space-Modelle durch Token Merging

通过 Token 合并对变形器和国家空间模型的有效时间序列处理

2405.17951v2

1379

05-26

Constructing a BPE Tokenization DFA

Aufbau einer BPE Tokenization DFA

正在构建 BPE 磁盘化 DFA

2405.07671v2

1380

05-26

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Modellierung von Multi-Task-Modellen, die als adaptives projektives Gradientenabsinken zusammenwachsen

模拟多任务模式模型合并为适应性预测梯度下层

2501.01230v3

1381

05-26

Logic Gate Neural Networks are Good for Verification

Logic Gate Neural Networks sind gut für die Verifikation

逻辑门神经网络有利于核查

2505.19932v1

1382

05-26

JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs

JailbreakRadar: Umfassende Bewertung von Jailbreak Attacken gegen LLMs

JailbreakRadar:全面评估对LLMs的越狱袭击

2402.05668v3

1383

05-26

Semantic-Aware Resource Management for C-V2X Platooning via Multi-Agent Reinforcement Learning

Semantic-Aware Ressourcenmanagement für C-V2X Platooning über Multi-Agent Verstärkungslernen

通过多机构强化学习进行 C-V2X 等离子处理的语义软件资源管理

2411.04672v2

1384

05-26

Cellwise and Casewise Robust Covariance in High Dimensions

Cellwise und Casewise Robuste Kovarianz in hohen Abmessungen

高维度的单元格和大小写常量

2505.19925v1

1385

05-26

Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL

Bellman-Updates vertrauen lernen: Selektive State-Adaptive Regularisierung für Offline RL

学习信任 Bellman 更新信息: 选择性国家适应性离线转线常规化

2505.19923v1

1386

05-26

(Un)supervised Learning of Maximal Lyapunov Functions

(Un)überwachtes Lernen von maximalen Lyapunov-Funktionen

(无)监督的 Maximal Lyapunov 函数的学习

2408.17246v2

1387

05-26

A Probabilistic Model for Non-Contrastive Learning

Ein probabilistisches Modell für nicht kontrastives Lernen

非交流性学习概率模型

2501.13031v2

1388

05-26

APE: A Data-Centric Benchmark for Efficient LLM Adaptation in Text Summarization

APE: Ein datenzentrischer Benchmark für effiziente LLM-Anpassung in der Textzusammenfassung

APE: 文本摘要中高效LLM适应数据中心基准

2505.19912v1

1389

05-26

Inverse Problem Sampling in Latent Space Using Sequential Monte Carlo

Inverse Problem-Sampling im Latent Space mit Sequential Monte Carlo

利用定序蒙特卡洛在低层空间进行逆向问题抽样

2502.05908v2

1390

05-26

ESLM: Risk-Averse Selective Language Modeling for Efficient Pretraining

ESLM: Risiko-Averse Selective Language Modeling für effizientes Vortraining

ESLM: 有效培训前风险-反风险选择语言建模

2505.19893v1

1391

05-26

APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs

APB: Beschleunigen des verteilten Long-Context-Schlussfolgerungens durch Übergeben von komprimierten Kontextblöcken über GPUs

APB: 通过横跨 GPU 传递压缩的上下文区块加速分布式长文字推文

2502.12085v2

1392

05-26

A Langevin sampling algorithm inspired by the Adam optimizer

Ein Langevin-Sampling-Algorithmus, inspiriert vom Adam-Optimierer

由亚当优化器启发的Langevin取样算法

2504.18911v2

1393

05-26

Learning mechanical systems from real-world data using discrete forced Lagrangian dynamics

Mechanische Systeme aus realen Daten mit diskreter, erzwungener Lagrange-Dynamik lernen

使用离散强制拉格朗江动力从真实世界数据中学习机械系统

2505.20370v1

1394

05-26

Single-Agent vs. Multi-Agent LLM Strategies for Automated Student Reflection Assessment

Single-Agent vs. Multi-Agent LLM-Strategien für die automatisierte Bewertung von Studentenreflexionen

学生自动反省评估战略

2504.05716v2

1395

05-26

Target Specific De Novo Design of Drug Candidate Molecules with Graph Transformer-based Generative Adversarial Networks

Zielspezifisches De Novo-Design von Wirkstoff-Kandidatenmolekülen mit Graph Transformer-basierten Generativen Adversarial-Netzwerken

配有基于图形变形器的成形反转基因网络的药物候选分子具体新设计

2302.07868v7

1396

05-26

Risk-Averse Reinforcement Learning with Itakura-Saito Loss

Risiko-Averse Verstärkungs-Lernen mit Itakura-Saito-Verlust

以Itakura-Saito损失进行反风险强化学习

2505.16925v2

1397

05-26

Explaining the role of Intrinsic Dimensionality in Adversarial Training

Erklärung der Rolle der Intrinsischen Dimensionalität im Adversarial Training

解释内在多面性在相互培训中的作用

2405.17130v2

1398

05-26

Multi-Graph Inductive Representation Learning for Large-Scale Urban Rail Demand Prediction under Disruptions

Multi-Graph Induktives Representationslernen für großflächige Nachfragevorhersage für die Stadtbahn unter Störungen

大型城市铁路需求预测中断下的大型城市铁路需求预测

2408.15619v2

1399

05-26

Deep Active Inference Agents for Delayed and Long-Horizon Environments

Tiefe aktive Inferenz-Agenten für verzögerte und lang-Horizonte Umgebungen

延迟和长-Horizon环境的深海活性推断剂

2505.19867v1

1400

05-26

HS-STAR: Hierarchical Sampling for Self-Taught Reasoners via Difficulty Estimation and Budget Reallocation

HS-STAR: Hierarchische Probenahme für selbstlernende Vernunfter über Schwierigkeitsschätzung und Budget-Umverteilung

HS-STAR:通过难以估计和预算重新定位为自学理性者进行等级抽样

2505.19866v1

1401

05-26

Information-theoretic Generalization Analysis for Expected Calibration Error

Informationstheoretische Generalisierungsanalyse für erwarteten Kalibrierungsfehler

预期校准错误信息理论概括分析

2405.15709v2

1402

05-26

FruitNeRF++: A Generalized Multi-Fruit Counting Method Utilizing Contrastive Learning and Neural Radiance Fields

FruitNeRF++: Eine generalisierte Multi-Fruit-Counting-Methode, die kontrastives Lernen und neurale Strahlungsfelder nutzt

水果NeRF++:通用的多功能计数方法,利用矛盾学习和神经辐射场

2505.19863v1

1403

05-26

KAN we improve on HEP classification tasks? Kolmogorov-Arnold Networks applied to an LHC physics example

KAN verbessern wir die HEP-Klassifizierungsaufgaben? Kolmogorov-Arnold Networks für ein LHC-Physikbeispiel

KAN我们改进了HEP分类任务? Kolmogorov-Arnold网络应用到一个LHC物理范例

2408.02743v2

1404

05-26

Variance-Reduced Cascade Q-learning: Algorithms and Sample Complexity

Varianzreduziertes Kaskade Q-Lernen: Algorithmen und Probenkomplexität

差异减少的连级学习:等级和抽样复杂性

2408.06544v2

1405

05-26

REA-RL: Reflection-Aware Online Reinforcement Learning for Efficient Large Reasoning Models

REA-RL: Reflection-Aware Online-Verstärkungs-Lernen für effiziente große Vernunftmodelle

REA-RL:为高效大型理由模型进行反思-软件在线强化学习

2505.19862v1

1406

05-26

Editing as Unlearning: Are Knowledge Editing Methods Strong Baselines for Large Language Model Unlearning?

Editing as Unlearning: Sind Methoden der Wissensbearbeitung starke Grundlagen für großes Sprachmodell Unlearning?

编辑为 “ 重新学习:知识编辑方法是否为大语言模式的 “ 退出学习 “ 的 “ 大语言模式 “ 的 “ 坚实基线 “ ?

2505.19855v1

1407

05-26

DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning

DISCOVER: Automatisiertes Curricula für Sparse-Reward-Verstärkungs-Lernen

DISCOVER: 稀疏奖励强化学习的自动化课程

2505.19850v1

1408

05-26

Efficient Deconvolution in Populational Inverse Problems

Effiziente Dekonvolution in inversen Bevölkerungsproblemen

人口逆向问题的有效演变

2505.19841v1

1409

05-26

One Surrogate to Fool Them All: Universal, Transferable, and Targeted Adversarial Attacks with CLIP

Ein Surrogat, um sie alle zu täuschen: Universelle, übertragbare und gezielte Widersacherangriffe mit CLIP

以CLIP取代 “ 愚人Them all “ :通用、可转移和有针对性的对立攻击

2505.19840v1

1410

05-26

Multi-Agent Reinforcement Learning in Cybersecurity: From Fundamentals to Applications

Multi-Agenten-Verstärkung Lernen in Cybersicherheit: Von Grundlagen zu Anwendungen

网络安全多机构强化多机构网络安全学习:从基础到应用

2505.19837v1

1411

05-26

DiffNMR: Advancing Inpainting of Randomly Sampled Nuclear Magnetic Resonance Signals

DiffNMR: Advancing Inpainting von zufällig gemusterten Kernmagnetresonanzsignalen

DiffNMR:推进随机抽样核磁共振信号的油漆

2505.20367v1

1412

05-26

Revisiting Glorot Initialization for Long-Range Linear Recurrences

Wiederbesuch der Glorot-Initialisierung für langanhaltende lineare Wiederholungen

重新审查长频线性线性重现的地球初始化

2505.19827v1

1413

05-26

Foundation Models for Tabular Data within Systemic Contexts Need Grounding

Basismodelle für tabellarische Daten in systemischen Kontexten benötigen Erdung

系统环境中需要依据的表格数据基础模型

2505.19825v1

1414

05-26

An Introductory Survey to Autoencoder-based Deep Clustering – Sandboxes for Combining Clustering with Deep Learning

Eine Einführungsstudie zum Autoencoder-basierten Deep Clustering – Sandboxen für die Kombination von Clustering mit Deep Learning

以自动编码器为基础的深层集束 – – 将集束与深层学习相结合的沙箱的介绍性调查

2504.02087v2

1415

05-26

LAPA-based Dynamic Privacy Optimization for Wireless Federated Learning in Heterogeneous Environments

LAPA-basierte Dynamic Privacy Optimization for Wireless Federated Learning in heterogenen Umgebungen

以LAPA为基础的在多种不同环境无线联邦学习的动态隐私优化

2505.19823v1

1416

05-26

Poison in the Well: Feature Embedding Disruption in Backdoor Attacks

Gift im Brunnen: Feature Einbetten von Disruption in Backdoor-Angriffe

井中毒:幕后袭击中的特异性嵌入干扰

2505.19821v1

1417

05-26

InfoCons: Identifying Interpretable Critical Concepts in Point Clouds via Information Theory

InfoCons: Identifizieren von interpretierbaren kritischen Konzepten in Punktwolken über Informationstheorie

信息库:通过信息理论确定点云中可解释的关键概念

2505.19820v1

1418

05-26

Fast Differentiable Modal Simulation of Non-linear Strings, Membranes, and Plates

Schnelle differenzierbare Modale Simulation von nichtlinearen Strings, Membranen und Platten

非线性字符串、膜和平板等非线性字符串的快速可区分模式模拟

2505.05940v2

1419

05-26

Jailbreak-AudioBench: In-Depth Evaluation and Analysis of Jailbreak Threats for Large Audio Language Models

Jailbreak-AudioBench: In-Depth-Bewertung und Analyse von Jailbreak-Bedrohungen für große Audio-Sprachenmodelle

Jailbreak-AudioBench:对大型音频语言模型的越狱威胁进行深入评价和分析

2501.13772v2

1420

05-26

Density Ratio-Free Doubly Robust Proxy Causal Learning

Dichte Verhältnis-frei doppelt robust Proxy Kausal Lernen

低密度比率-无杜布利强力代理原因学习

2505.19807v1

1421

05-26

Continuous Simplicial Neural Networks

Kontinuierliche simplizielle Neuralnetze

简单连续神经网络

2503.12919v2

1422

05-26

Modulated differentiable STFT and balanced spectrum metric for freight train wheelset bearing cross-machine transfer monitoring under speed fluctuations

Modulierte differenzierbare STFT und symmetrische Spektralmetrik für Güterzug-Radsatzlager-Übertragungsüberwachung unter Geschwindigkeitsschwankungen

根据速度波动情况对具有跨机械转移监测的货运火车轮轮车采用机动机动的可机动机动式STFT和平衡频谱度指标

2406.11917v3

1423

05-26

Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks

Erforschung des Bewusstseins in LLMs: Eine systematische Untersuchung von Theorien, Implementierungen und Grenzrisiken

探索LLMs中的觉悟:对理论、实施和前沿风险的系统调查

2505.19806v1

1424

05-26

GraphAU-Pain: Graph-based Action Unit Representation for Pain Intensity Estimation

GraphAU-Pain: Darstellung der Graph-basierten Aktionseinheit für Schmerzintensitätsabschätzung

图AAU-Pain: 以图表为基础的行动股疼痛强度估计代表

2505.19802v1

1425

05-26

Non-asymptotic convergence analysis of the stochastic gradient Hamiltonian Monte Carlo algorithm with discontinuous stochastic gradient with applications to training of ReLU neural networks

Nicht-asymptotische Konvergenzanalyse des stochastischen Gradienten Hamiltonian Monte Carlo Algorithmus mit diskontinuierlichem stochastischem Gradienten mit Anwendungen zum Training von ReLU-Neuralnetzwerken

对随机梯度汉密尔顿蒙特卡洛算法进行非渐近收敛分析,使用不连续的随机梯度,并用于ReLU神经网络培训

2409.17107v2

1426

05-26

The Missing Point in Vision Transformers for Universal Image Segmentation

Der fehlende Punkt in Vision Transformers für die universelle Bildsegmentierung

通用图像分割的愿景变异器中的缺失点

2505.19795v1

1427

05-26

What Can RL Bring to VLA Generalization? An Empirical Study

Was kann RL zur VLA-Verallgemeinerung bringen? Eine empirische Studie

RL能带给VLA的概括化带来什么?经验研究。

2505.19789v1

1428

05-26

MedDreamer: Model-Based Reinforcement Learning with Latent Imagination on Complex EHRs for Clinical Decision Support

MedDreamer: Modellbasiertes Verstärkungslernen mit latenter Imagination auf komplexen EHRs für die klinische Entscheidungsunterstützung

MedDreamer:以模型为基础的强化学习,对临床决定支助的复杂电子人力资源进行中层想象

2505.19785v1

1429

05-26

Out-of-distribution Reject Option Method for Dataset Shift Problem in Early Disease Onset Prediction

Out-of-Distribution Ablehnung der Option Methode für Datensatz Verschiebung Problem bei Früherkrankungen Beginn Vorhersage

用于早期疾病上移预测中数据集移位问题的不分发拒绝选项方法

2405.19864v2

1430

05-26

Mol-LLM: Multimodal Generalist Molecular LLM with Improved Graph Utilization

Mol-LLM: Multimodaler Generalist Molecular LLM mit verbesserter Graphenverwendung

Mol-LLM:利用改进图表的多式通用主义分子有限力M

2502.02810v2

1431

05-26

Advancements in Medical Image Classification through Fine-Tuning Natural Domain Foundation Models

Fortschritte bei der Klassifikation medizinischer Bilder durch Modelle der Fine-Tuning Natural Domain Foundation

通过精美开发自然域基金会模型提高医学图像分类

2505.19779v1

1432

05-26

Query Performance Prediction using Relevance Judgments Generated by Large Language Models

Abfrage der Leistungsvorhersage anhand von Relevanzurteilen, die von großen Sprachmodellen erzeugt werden

使用大语言模型产生的相关性判断的查询性绩效预测

2404.01012v3

1433

05-26

Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO

Verständnis der Leistungslücke im Preference Learning: Eine Dichotomie von RLHF und DPO

了解优先学习方面的绩效差距:RLHF和DPO的二分切开术

2505.19770v1

1434

05-26

Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases

Diff-Def: Diffusionsgenerierte Deformationsfelder für Bedingte Atlase

Diff-Def: 用于条件图集的扩散生成形变场

2403.16776v2

1435

05-26

Agentic Predictor: Performance Prediction for Agentic Workflows via Multi-View Encoding

Agentic Predictor: Leistungsvorhersage für Agentic Workflows über Multi-View-Encoding

AG 预测员:通过多查看编码对AG-工作流程的性能预测

2505.19764v1

1436

05-26

Unfolding AlphaFold’s Bayesian Roots in Probability Kinematics

AlphaFolds Bayesische Wurzeln in der Wahrscheinlichkeitskinematik entfalten

将 AlphaFold 的贝叶根在概率 Kinematics 中卸载

2505.19763v1

1437

05-26

In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement

In-Context-Demonstrationsfragen: Zur Prompt-Optimierung für Pseudo-Supervision-Verfeinerung

内文示范事项:关于Pseudo-监督改进的迅速优化

2410.03124v2

1438

05-26

Semantic-Aware Interpretable Multimodal Music Auto-Tagging

Semantic-Aware Interpretierbare multimodale Musik Auto-Tagging

解析多式音乐自动调制

2505.17233v2

1439

05-26

CIDRe: A Reference-Free Multi-Aspect Criterion for Code Comment Quality Measurement

CIDRe: Ein referenzfreies Multi-Aspekt-Kriterium für die Qualitätsmessung von Code Comment

CIDRe: 守则评论质量衡量的无参考性、无参考性、多特征的多标准标准

2505.19757v1

1440

05-26

Discrete Markov Bridge

Diskretierte Markov-Brücke

分立马尔科夫桥

2505.19752v1

1441

05-26

Machine Learning Algorithm for Noise Reduction and Disease-Causing Gene Feature Extraction in Gene Sequencing Data

Maschinelles Lernen Algorithmen zur Lärmreduzierung und krankheitsverursachende Gen-Feature-Extraktion in Gensequenzierungsdaten

用于减少噪音和在基因测序数据中进行疾病传播的基因特征采掘的机器学习算法

2505.19740v1

1442

05-26

Weighted Leave-One-Out Cross Validation

Gewichtete Leave-One-Out Cross-Validierung

加权请假一次性离职后交叉验证

2505.19737v1

1443

05-26

Using Time Structure to Estimate Causal Effects

Zeitstruktur zur Schätzung von Kausalitätseffekten verwenden

利用时间结构估计因果关系

2504.11076v2

1444

05-26

Accelerating Nash Learning from Human Feedback via Mirror Prox

Beschleunigendes Nash-Lernen aus menschlichem Feedback über Spiegelprox

通过镜像Prox从人类反馈中加快学习

2505.19731v1

1445

05-26

Stuffed Mamba: Oversized States Lead to the Inability to Forget

Gefüllte Mamba: Übergroße Staaten führen zu der Unfähigkeit zu vergessen

马姆巴:国家规模过大,导致无法忘却

2410.07145v2

1446

05-26

A Structured Tour of Optimization with Finite Differences

Eine strukturierte Tour der Optimierung mit endlichen Unterschieden

结构化优化与有限差异旅游

2505.19720v1

1447

05-26

OCN: Effectively Utilizing Higher-Order Common Neighbors for Better Link Prediction

OCN: Höhere Ordnung effektiv nutzen gemeinsame Nachbarn für bessere Link-Vorhersage

OCN:有效利用高端共同邻居改善联系预测

2505.19719v1

1448

05-26

Graceful Forgetting in Generative Language Models

Anmutiges Vergessen in generativen Sprachmodellen

在创用语言模型中优雅地忘却

2505.19715v1

1449

05-26

MT$^{3}$: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning

MT$^{3}$: Skalierung von MLLM-basierten Textbildmaschinenübersetzungen über Multi-Task-Verstärkungslernen

MT$^{3}$:通过多任务强化学习,扩大基于MLLM的文本图像机翻译

2505.19714v1

1450

05-26

On the Relation between Rectified Flows and Optimal Transport

Über die Beziehung zwischen rektifizierten Strömungen und optimalem Verkehr

纠正性流动与最佳运输之间的关系

2505.19712v1

1451

05-26

Automated Scientific Discovery: From Equation Discovery to Autonomous Discovery Systems

Automatisierte wissenschaftliche Entdeckung: Von der Gleichungserkundung zu autonomen Entdeckungssystemen

自动科学发现:从赤道发现到自主发现系统

2305.02251v2

1452

05-26

Solving Euler equations with Multiple Discontinuities via Separation-Transfer Physics-Informed Neural Networks

Lösen von Euler-Gleichungen mit mehreren Diskontinuitäten über Separation-Transfer-Physik-informierte Neuronale Netzwerke

通过分离-传输、物理内建神经网络解决多断裂的电动方程式

2505.20361v1

1453

05-26

Future-Oriented Navigation: Dynamic Obstacle Avoidance with One-Shot Energy-Based Multimodal Motion Prediction

Zukunftsorientierte Navigation: Dynamische Hindernisvermeidung mit einer heißen energiebasierten Multimodal-Bewegungsvorhersage

面向未来的导航:以单热能源为基础的多模式动力预测,动态障碍避免动态障碍

2505.00237v2

1454

05-26

HRP: High-Rank Preheating for Superior LoRA Initialization

HRP: Hochanker Vorwärmung für die Superior LoRA Initialisierung

HRP: 高级LoRA初始化的高秩预热

2502.07739v3

1455

05-26

Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments

Mosaic: Datenfreies Wissen Destillieren über Mixture-of-Experts für Heterogene verteilte Umgebungen

Mosaic:通过混合专家进行无数据知识蒸馏,促进异基因分布式环境

2505.19699v1

1456

05-26

Graph Guided Diffusion: Unified Guidance for Conditional Graph Generation

向导扩散:有条件图形生成统一指南

2505.19685v1

1457

05-26

CauSkelNet: Causal Representation Learning for Human Behaviour Analysis

CauSkelNet: Kausales Repräsentationslernen für die menschliche Verhaltensanalyse

CauSkelNet: 人类行为分析的因果关系学习

2409.15564v3

1458

05-26

Deep Actor-Critics with Tight Risk Certificates

Deep Actor-Critics mit engen Risikozertifikaten

具有严格风险证书的深行为者-批评者

2505.19682v1

1459

05-26

Cut out and Replay: A Simple yet Versatile Strategy for Multi-Label Online Continual Learning

Cut out und Replay: Eine einfache, aber vielseitige Strategie für Multi-Label Online Continual Learning

剪切和重放:一个简单但通俗易懂的多标签在线持续学习战略

2505.19680v1

1460

05-26

Optimal Multi-Fidelity Best-Arm Identification

Optimale Multi-Fidelity Best-Arm-Identifikation

最佳最佳多纤维最佳武器标识

2406.03033v2

1461

05-26

Bridging Privacy and Robustness for Trustworthy Machine Learning

Überbrückung von Privatsphäre und Robustheit für vertrauenswürdiges maschinelles Lernen

连接隐私和强力,促进可信赖的机器学习

2403.16591v4

1462

05-26

Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling

Zero-Shot-Streaming-Text zur Sprachsynthese mit Transducer und Auto-Regressive Modellierung

零热流文本,用于带有传感器和自动递减建模的语音合成

2505.19669v1

1463

05-26

GTR: Graph-Table-RAG for Cross-Table Question Answering

GTR: Graph-Table-RAG für Cross-Table-Frageantworten

GTR:用于跨表问题解答的图表表-RAG

2504.01346v3

1464

05-26

Unveil Multi-Picture Descriptions for Multilingual Mild Cognitive Impairment Detection via Contrastive Learning

Mehrbildbeschreibungen für mehrsprachige, leichte Kognitive Impairment-Erkennung durch kontrastives Lernen enthüllen

通过差异学习发现多语种轻视认知缺陷的单形多语种描述

2505.17067v2

1465

05-26

Best-Arm Identification in Unimodal Bandits

Best-Arm-Identifikation in unimodalen Banditen

统一强盗中的最佳武器识别

2411.01898v2

1466

05-26

MoESD: Unveil Speculative Decoding’s Potential for Accelerating Sparse MoE

MoESD: Spekulatives Decoding-Potential zur Beschleunigung von Sparse MoE enthüllen

MoESD: 揭示推测解码加速稀疏 MoE 的潜力

2505.19645v1

1467

05-26

Navigating Conflicting Views: Harnessing Trust for Learning

Navigieren gegensätzlicher Ansichten: Vertrauen fürs Lernen gewinnen

引导冲突观点:利用信任学习

2406.00958v3

1468

05-26

When fractional quasi p-norms concentrate

Wenn sich fraktionale Quasi-p-Normen konzentrieren

当分微分准微调集中时

2505.19635v1

1469

05-26

Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs

Entkoppelung Spatio-Temporale Vorhersage: Wenn leichte große Modelle adaptive Hypergraphen treffen

解耦时空预测:当轻量大模型与自适应超图相遇时

2505.19620v1

1470

05-26

SESaMo: Symmetry-Enforcing Stochastic Modulation for Normalizing Flows

SESaMo: Symmetrie-verstärkende stochastische Modulation für normalisierende Strömungen

SESaMo: 正常流动的对称性-强化斯托调动

2505.19619v1

1471

05-26

When the Left Foot Leads to the Right Path: Bridging Initial Prejudice and Trainability

Wenn der linke Fuß auf den rechten Weg führt: Überbrückung von anfänglichen Vorurteilen und Trainingsfähigkeit

当左脚引向右路时:弥合最初的偏见和可训练性

2505.12096v2

1472

05-26

Learning and Interpreting Gravitational-Wave Features from CNNs with a Random Forest Approach

Erlernen und Dolmetschen von Gravitational-Wave-Features von CNNs mit einem zufälligen Waldansatz

使用随机森林方法从有线电视新闻网读取和解释引力维学特征

2505.20357v1

1473

05-26

Diagnosing and Mitigating Modality Interference in Multimodal Large Language Models

Diagnostizieren und Abmildern von Modalitätsstörungen in multimodalen großen Sprachmodellen

多式联运大语言模型中的诊断和减缓模式干预

2505.19616v1

1474

05-26

Multiplicity is an Inevitable and Inherent Challenge in Multimodal Learning

Vielfältigkeit ist eine unvermeidliche und inhärente Herausforderung im multimodalen Lernen

多重性是多模式学习中不可避免和内在的挑战。

2505.19614v1

1475

05-26

Skrull: Towards Efficient Long Context Fine-tuning through Dynamic Data Scheduling

Skrull: Auf dem Weg zu einem effizienten langen Kontext Feinabstimmung durch Dynamic Data Scheduling

Skrull:通过动态数据安排,实现高效长处微调

2505.19609v1

1476

05-26

Energy-based Preference Optimization for Test-time Adaptation

Energiebasierte Preference-Optimierung für die Testzeitanpassung

以能源为基础的试验时间适应最佳应用

2505.19607v1

1477

05-26

Kuramoto-FedAvg: Using Synchronization Dynamics to Improve Federated Learning Optimization under Statistical Heterogeneity

Kuramoto-FedAvg: Synchronisationsdynamik zur Verbesserung der Federated Learning Optimization unter statistischer Heterogenität

Kuramoto-FedAvg:利用同步动态改善统计多样性下的联邦学习优化

2505.19605v1

1478

05-26

Evaluating Machine Translation Models for English-Hindi Language Pairs: A Comparative Analysis

Machine Translation Models für Englisch-Hindi Sprachpaare bewerten: Eine vergleichende Analyse

英文-中文语文配对评价机器翻译模型:比较分析

2505.19604v1

1479

05-26

Distributional Reinforcement Learning with Dual Expectile-Quantile Regression

Verstärktes Lernen mit Dual Expectile-Quantile Regression

双预期量递减分布强化学习

2305.16877v4

1480

05-26

Rep3D: Re-parameterize Large 3D Kernels with Low-Rank Receptive Modeling for Medical Imaging

Rep3D: Große 3D-Kernel mit Low-Rank-Empfangsmodellierung für die medizinische Bildgebung neu parametrieren

Rep3D: 医疗成像低射感应模型的大型 3D 内核再修复

2505.19603v1

1481

05-26

Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression

Speichereffiziente visuelle Autoregressive Modellierung mit Scale-Aware-KV-Cache-Kompression

KV缓存压缩的内存有效视觉自动递减模型

2505.19602v1

1482

05-26

Preference Optimization by Estimating the Ratio of the Data Distribution

Präferenzoptimierung durch Schätzung des Verhältnisses der Datenverteilung

通过估计数据分配比率实现最佳优化

2505.19601v1

1483

05-26

Inconsistent Tokenizations Cause Language Models to be Perplexed by Japanese Grammar

Inkonsistente Tokenisierungen führen dazu, dass Sprachmodelle von japanischer Grammatik verblüfft werden.

前后不一致的招数导致语言模式被日语语法所混淆

2505.19599v1

1484

05-26

Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs

Residual Connections und Normalisierung können eine Übersäuerung in GNNs wahrscheinlich verhindern

残留连接和正常化可可可避免防止全球NN的过度移动

2406.02997v3

1485

05-26

How Well Can Differential Privacy Be Audited in One Run?

Wie gut kann die Privatsphäre in einem einzigen Lauf überprüft werden?

如何在单一运行中对差异隐私进行审计?

2503.07199v2

1486

05-26

Learning to Reason without External Rewards

Vernunft lernen ohne externe Belohnungen

学习没有外部奖励的理性

2505.19590v1

1487

05-26

WQLCP: Weighted Adaptive Conformal Prediction for Robust Uncertainty Quantification Under Distribution Shifts

WQLCP: Gewichtete adaptive konforme Vorhersage für robuste Unsicherheit Quantifizierung unter Verteilungsverschiebungen

WQLCP: 分配变化下强势不确定性量化的加权适应性统一预测

2505.19587v1

1488

05-26

Accelerating Prefilling for Long-Context LLMs via Sparse Pattern Sharing

Beschleunigung der Vorfüllung für Langkontext-LLMs über Sparse Pattern Sharing

通过 Sparse 模式共享加速预填长文本 LLMs

2505.19578v1

1489

05-26

GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning

GraLoRA: Granulare Low-Rank-Anpassung für den Parameter-Effizient Feintuning

GraLoRA:用于参数高效微调的细粒度低秩适配

2505.20355v1

1490

05-26

Situationally-Aware Dynamics Learning

Situational-Aware Dynamics Learning

情况认知动态学习

2505.19574v1

1491

05-26

Truncated Kernel Stochastic Gradient Descent on Spheres

Trunkierter stochastischer Kernel-Gradientenabstieg auf Sphären

球面上的截断核随机梯度下降

2410.01570v5

1492

05-26

MSD-LLM: Predicting Ship Detention in Port State Control Inspections with Large Language Model

MSD-LLM: Schiffshaft in Hafenstaatkontrolle mit großem Sprachmodell vorhersagen

MSD-LLM:用大语言模型预测港口国控制检查中船舶扣留情况

2505.19568v1

1493

05-26

BackSlash: Rate Constrained Optimized Training of Large Language Models

对大语言模式优化培训

2504.16968v3

1494

05-26

Lego Sketch: A Scalable Memory-augmented Neural Network for Sketching Data Streams

Lego Sketch: Ein skalierbares speichererweitertes neuronales Netzwerk für das Sketching von Datenströmen

Lego Sketch:一种用于数据流草图的可扩展记忆增强神经网络

2505.19561v1

1495

05-26

EuroCon: Benchmarking Parliament Deliberation for Political Consensus Finding

EuroCon: Benchmarking Parlament Beratung für politische Konsensfindung

EuroCon:确定议会审议政治共识结果的基准

2505.19558v1

1496

05-26

Aligning Multiclass Neural Network Classifier Criterion with Task Performance Metrics

Ausrichten von Multiclass Neural Network Klassifikator Kriterium mit Task Performance Metrics

将多等神经网络分类标准与任务性性能计量对齐

2405.20954v2

1497

05-26

On scalable and efficient training of diffusion samplers

Zum skalierbaren und effizienten Training von Diffusions-Samplern

扩散采样器的可扩展高效训练

2505.19552v1

1498

05-26

Unlocking the Power of Diffusion Models in Sequential Recommendation: A Simple and Effective Approach

Entsperren der Macht von Diffusionsmodellen in der sequentiellen Empfehlung: Ein einfacher und effektiver Ansatz

在 “ 序列建议:简单而有效办法 “ 中解锁扩散模型扩散能力

2505.19544v1

1499

05-26

Cuff-KT: Tackling Learners’ Real-time Learning Pattern Adjustment via Tuning-Free Knowledge State Guided Model Updating

Cuff-KT: Anpassung von Lernmustern in Echtzeit durch Tuning-Free Knowledge State Guided Model Aktualisieren

Cuff-KT:通过免调优的知识状态引导模型更新应对学习者实时学习模式调整

2505.19543v1

1500

05-26

FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation

FastCache: Schnelles Caching für Diffusionstransformer durch erlernbare lineare Approximation

快速缓存: 通过可学习的线性近似化快速缓存扩散变异器

2505.20353v1

1501

05-26

R3: Robust Rubric-Agnostic Reward Models

R3: Robuste Rubric-Agnostische Belohnungsmodelle

R3:坚固的Rubric-不可知奖赏模型

2505.13388v2

1502

05-26

Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs

Amulet: Neuausrichtung zur Testzeit für die personalisierte Präferenzanpassung von LLMs

Amulet:在测试时重新对齐,实现LLM的个性化偏好适配

2502.19148v2

1503

05-26

CITRAS: Covariate-Informed Transformer for Time Series Forecasting

CITRAS: Kovariat-informierter Transformer für die Zeitreihenprognose

CITRAS: 用于时间序列预测的共变-内建变换器

2503.24007v2

1504

05-26

Continuous-Time Analysis of Heavy Ball Momentum in Min-Max Games

Zeitkontinuierliche Analyse des Heavy-Ball-Momentums in Min-Max-Spielen

极小极大博弈中重球动量的连续时间分析

2505.19537v1

1505

05-26

Training-Free Multi-Step Audio Source Separation

Schulungsfreie Mehrstufen-Audio-Quellentrennung

免训练的多步音频源分离

2505.19534v1

1506

05-26

ExAnte: A Benchmark for Ex-Ante Inference in Large Language Models

ExAnte: Ein Benchmark für Ex-Ante-Schlussfolgerungen in großen Sprachmodellen

ExAnte:大语言模型前推定基准

2505.19533v1

1507

05-26

Fox in the Henhouse: Supply-Chain Backdoor Attacks Against Reinforcement Learning

Fuchs im Hühnerstall: Supply-Chain-Backdoor-Angriffe auf Reinforcement Learning

鸡舍里的狐狸:针对强化学习的供应链后门攻击

2505.19532v1

1508

05-26

Minimalist Softmax Attention Provably Learns Constrained Boolean Functions

Minimalistische Softmax-Attention lernt beweisbar eingeschränkte boolesche Funktionen

极简Softmax注意力可证明地学习受约束的布尔函数

2505.19531v1

1509

05-26

SLOT: Sample-specific Language Model Optimization at Test-time

SLOT: Stichprobenspezifische Sprachmodelloptimierung zur Testzeit

SLOT:测试时针对特定样本的语言模型优化

2505.12392v2

1510

05-26

Navigating loss manifolds via rigid body dynamics: A promising avenue for robustness and generalisation

Navigieren auf Verlustmannigfaltigkeiten mittels Starrkörperdynamik: Ein vielversprechender Weg für Robustheit und Generalisierung

通过刚体动力学在损失流形上导航:实现鲁棒性和泛化的有希望的途径

2505.19527v1

1511

05-26

Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate

Rethinking Gating Mechanism in Sparse MoE: Arbiträre Modalitätsinputs mit vertrauensgeführtem Tor bearbeiten

微粒MOE中的重新思考定位机制:用信任引导门处理任意模式投入

2505.19525v1

1512

05-26

Semi-Supervised Model-Free Bayesian State Estimation from Compressed Measurements

Halbüberwachte modellfreie Bayessche Zustandsschätzung aus komprimierten Messungen

基于压缩测量的半监督无模型贝叶斯状态估计

2407.07368v5

1513

05-26

Applications and Effect Evaluation of Generative Adversarial Networks in Semi-Supervised Learning

Anwendungen und Wirkungsbewertung generativer adversarialer Netzwerke im semi-überwachten Lernen

半监测学习中产生反效果网络的应用和效果评价

2505.19522v1

1514

05-26

Learning Dynamics under Environmental Constraints via Measurement-Induced Bundle Structures

Dynamisches Lernen unter Umweltauflagen durch messinduzierte Bundle-Strukturen

通过衡量产生的捆绑结构,在环境制约因素下学习动力

2505.19521v1

1515

05-26

SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback

SIPDO: Closed-Loop Prompt Optimierung über Synthetic Data Feedback

SIPDO:通过合成数据反馈,通过闭闭电话快速优化

2505.19514v1

1516

05-26

Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models

Benchmarking multimodaler Wissenskonflikt für große multimodale Modelle

确定大型多式联运模式多模式知识冲突基准

2505.19509v1

1517

05-26

Multimodal Machine Translation with Visual Scene Graph Pruning

Multimodale maschinelle Übersetzung mit visuellen Szenendiagrammen

带有视觉场景图的多式机器翻译

2505.19507v1

1518

05-26

Understanding Why Large Language Models Can Be Ineffective in Time Series Analysis: The Impact of Modality Alignment

Verständnis, warum große Sprachmodelle in der Zeitreihenanalyse unwirksam sein können: Die Auswirkungen der Modalitätsausrichtung

理解为何大语言模型在时间序列分析中无效:方式调整的影响

2410.12326v2

1519

05-26

DOGe: Defensive Output Generation for LLM Protection Against Knowledge Distillation

DOGe: Defensive Output Generation für LLM-Schutz vor Wissensdestillation

DOGe: 防知识蒸馏保护LLM的防御性产出产生

2505.19504v1

1520

05-26

Differentially private ratio statistics

Differentiell private Verhältnisstatistiken

差异性私人比率统计

2505.20351v1

1521

05-26

Learning for Dynamic Combinatorial Optimization without Training Data

Lernen für dynamische kombinatorische Optimierung ohne Trainingsdaten

没有培训数据的动态组合优化学习

2505.19497v1

1522

05-26

MetaSTNet: Multimodal Meta-learning for Cellular Traffic Conformal Prediction

MetaSTNet: Multimodales Meta-Learning für zellulären Verkehr Konforme Vorhersage

MetaSTNet: 细胞交通预测的多模式元学习

2505.21553v1

1523

05-26

Discounted Online Convex Optimization: Uniform Regret Across a Continuous Interval

Discounted Online Convex-Optimierung: Einheitlicher Bedauern über einen kontinuierlichen Intervall

贴现的在线 Convex 优化: 连续间隔的统一遗憾

2505.19491v1

1524

05-26

Understanding Transformer from the Perspective of Associative Memory

Transformer aus der Perspektive des assoziativen Gedächtnisses verstehen

从共同记忆的角度理解变异器

2505.19488v1

1525

05-26

VLMLight: Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning

VLMLight: Verkehrssignalsteuerung über Vision-Language Meta-Control und Dual-Branch-Reasoning

VLMLight:通过视觉语言、超控制和双层理由解释控制交通信号控制

2505.19486v1

1526

05-26

Understanding the learned look-ahead behavior of chess neural networks

Das gelernte Look-Ahead-Verhalten von neuronalen Schachnetzwerken verstehen

了解国际象棋神经网络所学的直视行为

2505.21552v1

1527

05-26

Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs

Gewinnen Sie schnell oder verlieren Sie langsam: Ausgleichende Geschwindigkeit und Genauigkeit in Latenz-Sensitive Entscheidungen von LLMs

快赢或慢输:在LLM的延迟敏感决策中平衡速度与准确性

2505.19481v1

1528

05-26

Revolutionizing Wildfire Detection with Convolutional Neural Networks: A VGG16 Model Approach

Revolutionierung der Waldbranderkennung mit faltenden neuronalen Netzwerken: Ein VGG16-Modellansatz

利用卷积神经网络革新野火检测:VGG16模型方法

2505.19479v1

1529

05-26

Weighted quantization using MMD: From mean field to mean shift via gradient flows

Gewichtete Quantisierung mit MMD: Vom mittleren Feld zur mittleren Verschiebung über Gradientenströme

使用 MMD 加权量化: 从平均字段到通过梯度流转移

2502.10600v2

1530

05-26

Information-theoretic Generalization Analysis for VQ-VAEs: A Role of Latent Variables

Informationstheoretische Generalisierungsanalyse für VQ-VAEs: Eine Rolle latenter Variablen

VQ-VAEs 信息理论概括分析:隐性变量的作用

2505.19470v1

1531

05-26

Diversity-Driven Generative Dataset Distillation Based on Diffusion Model with Self-Adaptive Memory

Diversity-getriebene Generative Datensatzdestillation basierend auf Diffusionsmodell mit selbstadaptivem Speicher

基于带有自适应内存的传播模型的传播模型的多样化生成数据集蒸馏

2505.19469v1

1532

05-26

Parrot: Multilingual Visual Instruction Tuning

Parrot: Mehrsprachiges visuelles Instruction Tuning

Parrot:多语言视觉指令微调

2406.02539v3

1533

05-26

Towards End-to-End Training of Automatic Speech Recognition for Nigerian Pidgin

Auf dem Weg zum End-to-End-Training der automatischen Spracherkennung für nigerianisches Pidgin

走向尼日利亚皮吉纳自动语音识别的端至端培训

2010.11123v2

1534

05-26

Decision Flow Policy Optimization

Optimierung der Entscheidungsflusspolitik

优化决策流程政策

2505.20350v1

1535

05-26

Origin Tracer: A Method for Detecting LoRA Fine-Tuning Origins in LLMs

Herkunfts-Tracer: Eine Methode zur Erkennung von LoRA-Feinabstimmungs-Ursprungen in LLMs

Origin Tracer:一种检测LLM中LoRA微调来源的方法

2505.19466v1

1536

05-26

Residual Cross-Attention Transformer-Based Multi-User CSI Feedback with Deep Joint Source-Channel Coding

Residual Cross-Attention Transformer-basierte Multi-User CSI Feedback mit Deep Joint Source-Channel Coding

基于残差交叉注意力Transformer的多用户CSI反馈与深度联合信源信道编码

2505.19465v1

1537

05-26

Your Classifier Can Do More: Towards Bridging the Gaps in Classification, Robustness, and Generation

Ihr Klassifikator kann mehr: Auf dem Weg zur Überbrückung der Lücken in Klassifizierung, Robustheit und Generation

您的分类员可以做更多的事情: 缩小分类、强健和代际差距

2505.19459v1

1538

05-26

Recurrent Self-Attention Dynamics: An Energy-Agnostic Perspective from Jacobians

Rekurrente Self-Attention-Dynamik: Eine energie-agnostische Perspektive anhand von Jacobi-Matrizen

循环自注意力动力学:基于雅可比矩阵的能量无关视角

2505.19458v1

1539

05-26

MM-Prompt: Cross-Modal Prompt Tuning for Continual Visual Question Answering

MM-Prompt: Cross-Modal Prompt Tuning zur kontinuierlichen visuellen Fragestellung

MM-Prompt: 用于持续视觉问答的跨模式快速测试

2505.19455v1

1540

05-26

MetaGMT: Improving Actionable Interpretability of Graph Multilinear Networks via Meta-Learning Filtration

MetaGMT: Durch Meta-Learning Filtration die Durchführbarkeit von Graphen-Multilinearen Netzwerken verbessern

MetaGMT:通过元学习过滤提升图多线性网络的可操作可解释性

2505.19445v1

1541

05-26

Discovering Forbidden Topics in Language Models

Verbotene Themen in Sprachmodellen entdecken

发现语言模型中的禁止专题

2505.17441v2

1542

05-26

MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding

MORE-Brain:可解释和可通用跨主题FMRI视觉解码专家有条不紊混合

2505.15946v2

1543

05-26

RDI: An adversarial robustness evaluation metric for deep neural networks based on model statistical features

RDI: Eine gegnerische Robustheitsbewertungsmetrik für tiefe neuronale Netzwerke basierend auf modellstatistischen Merkmalen

RDI:基于示范统计特征的深神经网络对抗性强力评价标准

2504.18556v2

1544

05-26

Fairness Practices in Industry: A Case Study in Machine Learning Teams Building Recommender Systems

Fairness Practices in der Industrie: Eine Fallstudie in Machine Learning Teams Bau von Recommender Systemen

工业公平做法:机械学习小组建立建议系统个案研究

2505.19441v1

1545

05-26

The Birth of Knowledge: Emergent Features across Time, Space, and Scale in Large Language Models

Die Geburt des Wissens: Emergente Funktionen über Zeit, Raum und Maßstab in großen Sprachmodellen

知识的诞生:跨越时间、空间和大语言模型规模的新兴特征

2505.19440v1

1546

05-26

Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression

Kann komprimierte LLMs wirklich handeln? Eine empirische Bewertung der Agentischen Fähigkeiten in der LLM-Kompression

压缩后的LLM真的能行动吗?对LLM压缩中智能体能力的实证评估

2505.19433v1

1547

05-26

Advanced long-term earth system forecasting by learning the small-scale nature

Fortschrittliche Langzeitprognosen des Erdsystems durch Erlernen der kleinmaßstäblichen Natur

学习小规模性质,进行高级长期地球系统预测

2505.19432v1

1548

05-26

Importance Weighted Score Matching for Diffusion Samplers with Enhanced Mode Coverage

Bedeutung Gewichteter Score passend für Diffusion Sampler mit erweiterten Modus Abdeckung

具有强化模式覆盖率的传播采样器比对重要加权分数

2505.19431v1

1549

05-26

MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision

MAS-ZERO: Konzipieren von Multi-Agenten-Systemen mit Zero Supervision

MAS-ZERO: 设计无监督的多机构系统

2505.14996v2

1550

05-26

WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference

WINA: Gewichtsinformierte Neuronen-Aktivierung zur Beschleunigung der Large Language Model Inferenz

WINA:用于加速大语言模型推理的权重感知神经元激活

2505.19427v1

1551

05-26

The Role of Diversity in In-Context Learning for Large Language Models

Die Rolle der Vielfalt im In-Context-Lernen für große Sprachmodelle

多样性在为大语言模式进行内文学习方面的作用

2505.19426v1

1552

05-26

Structure Disruption: Subverting Malicious Diffusion-Based Inpainting via Self-Attention Query Perturbation

Strukturstörung: Verringern von bösartiger Diffusions-basierter Inpainting durch Selbstaufmerksamkeit Abfrage Störung

结构混乱:通过自控查询干扰来改变恶意扩散的涂漆

2505.19425v1

1553

05-26

Each Graph is a New Language: Graph Learning with LLMs

Jeder Graph ist eine neue Sprache: Graph Learning mit LLMs

每个图都是一种新语言:用LLM进行图学习

2501.11478v3

1554

05-26

Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift

Jetzt richtig, damals falsch: Nichtstationäre Direct Preference Optimization unter Präferenzdrift

此时正确,彼时错误:偏好漂移下的非平稳直接偏好优化

2407.18676v2

1555

05-26

SaVe-TAG: Semantic-aware Vicinal Risk Minimization for Long-Tailed Text-Attributed Graphs

SaVe-TAG: Semantisch-bewusst Vicinal Risk Minimierung für langgestreckte Text-Attribute Graphen

SaVe-TAG: 长途脱轨文本可归图解析相邻风险最小化

2410.16882v3

1556

05-26

Strictly Constrained Generative Modeling via Split Augmented Langevin Sampling

Streng eingeschränkte generative Modellierung über Split Augmented Langevin Sampling

通过分分扩大Langevin抽样进行严格约束的生成模型模拟

2505.18017v2

1557

05-26

Toward Physics-Informed Machine Learning for Data Center Operations: A Tropical Case Study

Auf dem Weg zum physikinformierten maschinellen Lernen für Rechenzentrumsoperationen: Eine Tropische Fallstudie

争取为数据中心业务进行物理一体化机械学习:热带案例研究

2505.19414v1

1558

05-26

Future Link Prediction Without Memory or Aggregation

Zukünftige Link-Vorhersage ohne Gedächtnis oder Aggregation

没有记忆或聚合的未来联系预测

2505.19408v1

1559

05-26

FedHERO: A Federated Learning Approach for Node Classification Task on Heterophilic Graphs

FedHERO: Ein Federated Learning Approach für Knotenklassifikation Aufgaben auf heterophilen Graphen

FedHERO:面向异配图节点分类任务的联邦学习方法

2504.21206v2

1560

05-26

Exploring the Possibility of TypiClust for Low-Budget Federated Active Learning

Erforschung der Möglichkeit des TypiClusts für budgetarmes, föderiertes aktives Lernen

探讨低预算联邦积极学习的TypiClust

2505.19404v1

1561

05-26

KHRONOS: a Kernel-Based Neural Architecture for Rapid, Resource-Efficient Scientific Computation

KHRONOS: Eine Kernel-basierte Neuralarchitektur für schnelle, ressourceneffiziente wissenschaftliche Berechnung

KHRONOS:一个以核心为基础的神经结构,用于快速、资源高效科学计算

2505.13315v2

1562

05-26

Can LLMs Help Uncover Insights about LLMs? A Large-Scale, Evolving Literature Analysis of Frontier LLMs

Können LLMs helfen, Erkenntnisse über LLMs zu enthüllen? Eine groß angelegte, sich entwickelnde Literaturanalyse von Frontier LLMs

LLMs 帮助发现关于LLM的见识? 大型、不断发展的前沿LMS文学分析

2502.18791v3

1563

05-26

Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent

Auf dem Weg zum Verständnis der Generalisierbarkeit des verzögerten stochastischen Gradientenabstiegs

理解延迟随机梯度下降的泛化能力

2308.09430v4

1564

05-26

Are Time-Series Foundation Models Deployment-Ready? A Systematic Study of Adversarial Robustness Across Domains

Sind Zeitreihen-Foundation-Modelle einsatzbereit? Eine systematische Studie zur adversarialen Robustheit über Domänen hinweg

时间序列基础模型是否已可部署?跨领域对抗鲁棒性的系统研究

2505.19397v1

1565

05-26

Uniform convergence of the smooth calibration error and its relationship with functional gradient

Einheitliche Konvergenz des glatten Kalibrierfehlers und seines Verhältnisses mit dem funktionellen Gradienten

平稳校准误差及其与功能梯度的关系统一汇合

2505.19396v1

1566

05-26

Towards the Causal Complete Cause of Multi-Modal Representation Learning

Auf dem Weg zur kausalen vollständigen Ursache des multi-Modalen Repräsentationslernens

走向多模式代表制学习的事业完全原因

2407.14058v6

1567

05-26

Alignment of large language models with constrained learning

Ausrichtung großer Sprachmodelle mit eingeschränktem Lernen

大型语言模式与限制学习的结合

2505.19387v1

1568

05-26

JingFang: An Expert-Level Large Language Model for Traditional Chinese Medicine Clinical Consultation and Syndrome Differentiation-Based Treatment

JingFang: Ein sachverständiges Sprachmodell für die traditionelle chinesische Medizin Klinische Beratung und Syndromdifferenzierungsbasierte Behandlung

JingFang:中国传统医学临床咨询和综合症差别治疗专家级大语言模式

2502.04345v2

1569

05-26

Unsupervised Anomaly Detection Using Diffusion Trend Analysis for Display Inspection

Unüberwachte Anomalieerkennung mit Diffusion Trendanalyse für Display-Inspektion

用于显示检查的利用扩散趋势分析进行无监督异常探测

2407.09578v2

1570

05-25 (7)

SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning

SALSA-RL: Stabilitätsanalyse im Latent Space of Actions zur Stärkung des Lernens

SALSA-RL:加强学习行动空间的稳定分析

2502.15512v2

1571

05-25

Foundations of Top-$k$ Decoding For Language Models

Grundlagen von Top-$k$ Dekodierung für Sprachmodelle

语言模型Top-$k$解码的基础

2505.19371v1

1572

05-25

SETransformer: A Hybrid Attention-Based Architecture for Robust Human Activity Recognition

SETransformer: Eine hybride, auf Aufmerksamkeit basierende Architektur für robuste menschliche Aktivitätserkennung

SETransformer:一种用于鲁棒人体活动识别的基于注意力的混合架构

2505.19369v1

1573

05-25

One Step Diffusion via Shortcut Models

Ein Schritt Diffusion über Shortcut-Modelle

通过快捷键模型进行单步扩散

2410.12557v2

1574

05-25

Adaptive Diffusion Guidance via Stochastic Optimal Control

Adaptive Diffusionsführung über stochastische Optimale Kontrolle

通过斯托卡优化控制进行适应性扩散指导

2505.19367v1

1575

05-25

FD-Bench: A Modular and Fair Benchmark for Data-driven Fluid Simulation

FD-Bench: Modularer und fairer Benchmark für datengetriebene Fluidsimulation

FD-Bench:数据驱动流体模拟的模块化公平基准

2505.20349v1

1576

05-25

Consistency-based Abductive Reasoning over Perceptual Errors of Multiple Pre-trained Models in Novel Environments

Konsistenzbasierte abduktive Begründung über Wahrnehmungsfehler mehrerer vortrainierter Modelle in neuartigen Umgebungen

创新环境中多个未受过培训的多种模式的认知错误的基于一致性的直截力理由

2505.19361v1

1577

05-25

Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval

Optimierte Text-Embedding-Modelle und Benchmarks für die Amharische Passage Retrieval

阿姆光通过通过检索的最佳文本嵌入模型和基准

2505.19356v1

1578

05-25

FlashMD: long-stride, universal prediction of molecular dynamics

FlashMD: Langstride, universelle Vorhersage der molekularen Dynamik

FlashMD:长途、全方位预测分子动态

2505.19350v1

1579

05-25

Communication-Efficient Multi-Device Inference Acceleration for Transformer Models

Kommunikationseffiziente Multi-Device-Inferenzbeschleunigung für Transformer-Modelle

变换模型的通信效率高多变量推推加速

2505.19342v1

1580

05-25

Flow Q-Learning

Fluss Q-Lernen

流动学习

2502.02538v2

1581

05-25

Improving Compositional Generation with Diffusion Models Using Lift Scores

Verbesserung der kompositorischen Generierung mit Diffusionsmodellen mit Lift-Scores

利用使用提升分数的传播模型改善组成型

2505.13740v2

1582

05-25

TRANSIT your events into a new mass: Fast background interpolation for weakly-supervised anomaly searches

Übertragen Sie Ihre Ereignisse in eine neue Masse: Schnelle Hintergrundinterpolation für schwach überwachte Anomaliensuche

将您的事件转换成一个新的质量: 快速背景内插, 用于受微弱监督的异常搜索

2503.04342v2

1583

05-25

WhisperD: Dementia Speech Recognition and Filler Word Detection with Whisper

WhisperD: Dementia Spracherkennung und Filler-Worterkennung mit Whisper

耳语:痴呆症言语识别和用耳语探测填字词

2505.21551v1

1584

05-25

Likert or Not: LLM Absolute Relevance Judgments on Fine-Grained Ordinal Scales

Likert oder nicht: Absolute Relevanzurteile von LLMs auf feinkörnigen Ordinalskalen

Likert 与否:LLM在细粒度序数量表上的绝对相关性判断

2505.19334v1

1585

05-25

Bayesian Comparisons Between Representations

Bayesische Vergleiche zwischen Repräsentationen

代表之间的贝叶比较

2411.08739v3

1586

05-25

Paying Alignment Tax with Contrastive Learning

Die Alignment-Steuer mit kontrastivem Lernen bezahlen

用对比学习支付对齐税

2505.19327v1

1587

05-25

An Adversarial Analysis of Thompson Sampling for Full-information Online Learning: from Finite to Infinite Action Spaces

Eine Adversarial Analyse von Thompson Sampling für Full-Information Online-Lernen: von Finite zu Unendlichen Aktionsräumen

对Thompson网上全面信息学习抽样分析:从有限到无限行动空间

2502.14790v4

1588

05-25

Regress, Don’t Guess – A Regression-like Loss on Number Tokens for Language Models

Regress, nicht raten – Ein Rückschritt-ähnlicher Verlust an Zahlenzeichen für Sprachmodelle

Regress, don’t guess - 语言模型数字调的回归式损失

2411.02083v2

1589

05-25

PIGPVAE: Physics-Informed Gaussian Process Variational Autoencoders

PIGPVAE: Physik-informierte Gauß-Prozessvariationelle Autoencoder

PIGPVAE: 物理化高斯进程变异自动编码器

2505.19320v1

1590

05-25

Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?

Sind Transformer durch die Verbindung getrennter Kenntnisse in Trainingsdaten in der Lage, Vernunft zu erreichen?

将培训数据方面的单独知识连接起来的变换者是否具有理性?

2501.15857v6

1591

05-25

Effort-aware Fairness: Incorporating a Philosophy-informed, Human-centered Notion of Effort into Algorithmic Fairness Metrics

Effort-aware Fairness: Aufnahme einer philosophisch-informierten, menschlich-zentrierten Nennung von Effort in algorithmische Fairness-Metriken

努力做到公平:将了解哲学、以人为中心的努力理念纳入到算法公平度量中

2505.19317v1

1592

05-25

Demand Selection for VRP with Emission Quota

Auswahl der Nachfrage nach VRP mit Emissionsquoten

具有排放配额的VRP需求选择

2505.19315v1

1593

05-25

Concept Reachability in Diffusion Models: Beyond Dataset Constraints

Konzept-Erreichbarkeit in Diffusions-Modellen: Jenseits von Datensatzbeschränkungen

传播模型中可达到的概念:超越数据集的制约

2505.19313v1

1594

05-25

Stochastic Hessian Fittings with Lie Groups

Stochastische Hesse-Anpassungen mit Lie-Gruppen

基于李群的随机Hessian拟合

2402.11858v5

1595

05-25

Fractional-Boundary-Regularized Deep Galerkin Method for Variational Inequalities in Mixed Optimal Stopping and Control

Fraktional-Boundary-Regularized Deep Galerkin-Methode für unterschiedliche Ungleichheiten in gemischten Optimalen Stoppen und Steuern

用于混合最佳制止和控制中差异性不平等的分数-界分- 常规深加热法

2505.19309v1

1596

05-25

From Single Images to Motion Policies via Video-Generation Environment Representations

Von Einzelbildern zu Motion Policies über Video-Generation Umweltvertretungen

从单一图像到通过视频环境代表从单一图像到运动政策

2505.19306v1

1597

05-25

Time Series Embedding Methods for Classification Tasks: A Review

Zeitreihen Einbetten von Methoden für die Klassifizierung Aufgaben: Eine Überprüfung

分类任务所含方法:审查

2501.13392v2

1598

05-25

LLM-Based Emulation of the Radio Resource Control Layer: Towards AI-Native RAN Protocols

LLM-basierte Emulation der Funkressourcenkontrollschicht: Auf dem Weg zu KI-Native RAN-Protokollen

基于LLM的无线电资源控制层模拟模拟无线电资源控制层:迈向AI-NTRAN议定书

2505.16821v2

1599

05-25

On the status of current quantum machine learning software

Zum Status der aktuellen Quantenmaschinen-Lernsoftware

关于当前量子机器学习软件现状

2503.08962v2

1600

05-25

100-LongBench: Are de facto Long-Context Benchmarks Literally Evaluating Long-Context Ability?

100-LongBench: Sind de facto Long-Context-Benchmarks wortwörtlich die Lang-Context-Fähigkeit zu bewerten?

100-LongBench:事实上的长文本基准是否实际评价长文本能力?

2505.19293v1

1601

05-25

Hypercube-RAG: Hypercube-Based Retrieval-Augmented Generation for In-domain Scientific Question-Answering

Hypercube-RAG: Hypercube-based Retrieval-Augmented Generation for In-domain Scientific Question-Answering

Hypercube-RAG: 内地科学问题解答的超立方体回收回溯性养代

2505.19288v1

1602

05-25

Provably Overwhelming Transformer Models with Designed Inputs

Beweisbares Überwältigen von Transformer-Modellen mit gezielt entworfenen Eingaben

用设计的输入可证明地压倒Transformer模型

2502.06038v2

1603

05-25

A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning

Eine Momentaufnahme des Einflusses: Ein lokales Daten-Attributions-Framework für Online-Verstärkungs-Lernen

《影响概览:在线强化学习地方数据归属框架》

2505.19281v1

1604

05-25

Optimal Transport Barycenter via Nonconvex-Concave Minimax Optimization

Optimaler Transport Barycenter über Nonconvex-Concave Minimax-Optimierung

通过非凸-凹极小极大优化求解最优传输重心

2501.14635v2

1605

05-25

Achieving $\tilde{\mathcal{O}}(1/N)$ Optimality Gap in Restless Bandits through Gaussian Approximation

Erreichen einer Optimalitätslücke von $\tilde{\mathcal{O}}(1/N)$ bei Restless Bandits durch Gaußsche Approximation

通过高斯近似在不安分老虎机中实现 $\tilde{\mathcal{O}}(1/N)$ 的最优性差距

2410.15003v2

1606

05-25

Cellular Traffic Prediction via Byzantine-robust Asynchronous Federated Learning

Zelluläre Verkehrsvorhersage über byzantinisches-robustes Asynchrones Federated Learning

通过Byzantine-Robust 亚同步联谊会学习的细胞交通预测

2505.19263v1

1607

05-25

Towards a Spatiotemporal Fusion Approach to Precipitation Nowcasting

Auf dem Weg zu einem raumzeitlichen Fusionsansatz für das Niederschlags-Nowcasting

迈向对降水即时播送采取相向时间融合办法

2505.19258v1

1608

05-25

Learning-Augmented Online Bipartite Fractional Matching

Learning-Augmented Online Bipartite Fraktional Matching

学习增强的在线双两派人数配对

2505.19252v1

1609

05-25

Empirical Privacy Variance

Empirische Datenschutzvarianz

隐私经验差异

2503.12314v2

1610

05-25

Improving Value Estimation Critically Enhances Vanilla Policy Gradient

Verbesserung der Wertschätzung Kritisch verbessert Vanilla Policy Gradient

显著加强香草政策梯度

2505.19247v1

1611

05-25

To CoT or To Loop? A Formal Comparison Between Chain-of-Thought and Looped Transformers

To CoT or To Loop? Ein formaler Vergleich zwischen Ketten-of-Thought und Schleiftransformatoren

CoT还是循环?思维链与循环Transformer的形式化比较

2505.19245v1

1612

05-25

ActiveDPO: Active Direct Preference Optimization for Sample-Efficient Alignment

ActiveDPO: Aktive Direktpräferenzoptimierung für eine stichprobeneffiziente Ausrichtung

主动式DPO:为抽样有效对齐积极直接首选优化

2505.19241v1

1613

05-25

CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling

CLIP-UP: Ein einfaches und effizientes Mixture-of-Experts CLIP Training Rezept mit Sparse Upcycling

CLIP-UP:一种基于稀疏升级改造的简单高效的专家混合CLIP训练方案

2502.00965v2

1614

05-25

LLLMs: A Data-Driven Survey of Evolving Research on Limitations of Large Language Models

LLLMs: Eine datengestützte Untersuchung der sich entwickelnden Forschung über Grenzen großer Sprachmodelle

LLLMs:关于大语言模式限制的不断发展的研究数据驱动调查

2505.19240v1

1615

05-25

Learning Transformer-based World Models with Contrastive Predictive Coding

Transformer-basierte Weltmodelle mit Contrastive Predictive Coding lernen

以学习变换器为基础的世界差异预测编码模式

2503.04416v2

1616

05-25

Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees

Effiziente Politikoptimierung in robusten, eingeschränkten MDPs mit Iterationskomplexitätsgarantien

在强力约束下,在具有迭接复杂度保障的多用途发展方案中提高政策效率的优化

2505.19238v1

1617

05-25

To See a World in a Spark of Neuron: Disentangling Multi-task Interference for Training-free Model Merging

Eine Welt in einem Funken Neuron zu sehen: Entwirren von Multi-Task-Interferenzen für trainingsfreies Modellverschmelzen

《在中世纪的火花中看到世界:为无培训模式合并拆散多任务干预》

2503.05320v2

1618

05-25

CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models

CoreMatching: Co-adaptive Sparse Inference Framework mit Token und Neuron Pruning für eine umfassende Beschleunigung von Vision-Language-Modellen

核心配料:与Token 和Neron Prurning 共同调适的简单推断框架,以全面加速视觉语言模型

2505.19235v1

1619

05-25

Learning Flexible Forward Trajectories for Masked Molecular Diffusion

Flexible Forward-Trajektorien für maskierte molekulare Diffusion lernen

蒙面分子扩散学习灵活前向轨迹

2505.16790v2

1620

05-25

Statistical Collusion by Collectives on Learning Platforms

Statistische Kollusion von Kollektiven über Lernplattformen

学习平台集体统计协作

2502.04879v3

1621

05-25

Imitation Learning via Focused Satisficing

通过有重点的满意度学习模拟学习

2505.14820v2

1622

05-25

CLEVER: A Curated Benchmark for Formally Verified Code Generation

CLEVER: Ein kuratierter Benchmark für die formal verifizierte Codegenerierung

CLEVER:经形式化验证的代码生成的精选基准

2505.13938v3

1623

05-25

Scalarisation-based risk concepts for robust multi-objective optimisation

Scalarisierungsbasierte Risikokonzepte für eine robuste multiobjektive Optimierung

实现稳健的多目标优化的以尺度化为基础的风险风险概念

2405.10221v4

1624

05-25

Dynamic Angle Selection in X-Ray CT: A Reinforcement Learning Approach to Optimal Stopping

Dynamische Winkelauswahl in X-Ray CT: Ein verstärkten Lernansatz zum optimalen Stoppen

X射线CT中的动态角度选择:最优停止的强化学习方法

2503.12688v2

1625

05-25

Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More

Sprachmodelle, Graphsuche und Verfälschung der Supervision: Wann mehr Supervision weniger ist und wie man aus mehr mehr macht

语言模型、图搜索与监督掺杂:何时更多监督反而更少,以及如何让更多变得更多

2503.10542v3

1626

05-25

Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law

Skalierungsgesetze für Gradientenabstieg und Vorzeichenabstieg bei linearen Bigramm-Modellen unter dem Zipfschen Gesetz

齐普夫定律下线性二元语法模型的梯度下降与符号下降的缩放定律

2505.19227v1

1627

05-25

LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models

LLaDA 1.5: Varianzreduzierte Preference-Optimierung für große Sprachdiffusionsmodelle

LLADA 1.5:大语言传播模式差异-减少优惠

2505.19223v1

1628

05-25

A Novel Transformer-Based Self-Supervised Learning Method to Enhance Photoplethysmogram Signal Artifact Detection

Eine neuartige, auf Transformer basierende, selbstüberwachte Lernmethode zur Verbesserung der Photoplethysmogramm-Signal-Artefakt-Erkennung

一种基于新颖变形器的以自我监督为基础的学习方法,用以加强光膜成像信号异形探测

2401.01013v2

1629

05-25

Where Paths Collide: A Comprehensive Survey of Classic and Learning-Based Multi-Agent Pathfinding

Where Paths Collide: Eine umfassende Untersuchung der klassischen und lernbasierten multi-agenten Pathfinding

路径相撞之处:对经典和以学习为基础的多方代理调查的全面调查

2505.19219v1

1630

05-25

Clustering by Nonparametric Smoothing

Clustering durch nichtparametrisches Glätten

以非参数平滑为群集

2503.09134v2

1631

05-25

Symmetries in Overparametrized Neural Networks: A Mean-Field View

Symmetrien in überparametrisierten Neuralen Netzwerken: Eine Mittelfeldansicht

过度对称的神经神经网络的对称性:平均实地观点

2405.19995v3

1632

05-25

Adaptive Cyclic Diffusion for Inference Scaling

Adaptive zyklische Diffusion zur Inferenzskalierung

用于推断力缩放的适应性二次循环传播

2505.14036v2

1633

05-25

SpeakStream: Streaming Text-to-Speech with Interleaved Data

SpeakStream: Streaming von Text-zu-Speech mit interleaved Daten

语音Stream:用断开数据流流流文本到语音

2505.19206v1

1634

05-25

Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety

Gutartige Beispiele zählen! Feinabstimmung auf gutartigen Ausreißer-Beispielen bricht die Sicherheit massiv

良性样本很重要!在离群的良性样本上微调会严重破坏安全性

2505.06843v2

1635

05-25

FedGuCci: Making Local Models More Connected in Landscape for Federated Learning

FedGuCci: Lokale Modelle in der Landschaft für das Federated Learning stärker miteinander verbunden

FedGuCci:使地方模型在全局景观中更紧密地连接起来,促进联邦学习

2402.18949v3

1636

05-25

iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use

iTool: Verstärkte Feinsteuerung mit dynamischer Kalibrierung bei fortgeschrittenem Werkzeugeinsatz

i Tool:加强先进工具使用动态缺乏度校准的精细测试

2501.09766v4

1637

05-25

Diffusion Instruction Tuning

Diffusions-Anleitung Tuning

传播指示图

2502.06814v2

1638

05-25

Curvature Dynamic Black-box Attack: revisiting adversarial robustness via dynamic curvature estimation

Krümmung Dynamischer Black-Box-Angriff: Wiederherstellung der gegnerischen Robustheit durch dynamische Krümmungsschätzung

曲线动态黑盒攻击: 通过动态曲线估计, 重新审视对抗性对称稳健性

2505.19194v1

1639

05-25

Interpretable Graph Learning Over Sets of Temporally-Sparse Data

Interpretable Graph Learning Over Sets von temporär-Spardaten

一组暂时分隔数据上的解释性图表学习

2505.19193v1

1640

05-25

I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts

I2MoE:可解释的多式多式互动意识混合企业专家

2505.19190v1

1641

05-25

Chordless Structure: A Pathway to Simple and Expressive GNNs

Chordless Structure: Ein Weg zu einfachen und expressiven GNNs

无弦结构:通往简单且富有表达力的 GNN 的路径

2505.19188v1

1642

05-25

Heterogeneous networks in drug-target interaction prediction

Heterogene Netzwerke in der Vorhersage von Wirkstoff-Target-Interaktionen

药物目标相互作用预测中的不同类型网络

2504.16152v2

1643

05-25

A Physics-preserved Transfer Learning Method for Differential Equations

Eine physikkonservierte Transfer-Lernmethode für Differentialgleichungen

不同等分法的受物理保留转移学习方法

2505.01281v2

1644

05-25

CAGES: Cost-Aware Gradient Entropy Search for Efficient Local Multi-Fidelity Bayesian Optimization

CAGES: Kostenbewusste Gradienten-Entropie Suche nach effizienter lokaler Multi-Fidelity Bayesian-Optimierung

CAGES: 成本-软件软件渐进式 Entropy 搜索以高效的本地多纤维贝叶斯优化

2405.07760v2

1645

05-25

Federated Learning: From Theory to Practice

Föderiertes Lernen: Von der Theorie zur Praxis

联邦学习:从理论到实践

2505.19183v1

1646

05-25

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

DiTAR: Diffusion Transformer Autoregressive Modellierung für Sprachgenerierung

DITAR: 发声的传播变异器自动递减模型

2502.03930v3

1647

05-25

Towards Graph Foundation Models: Learning Generalities Across Graphs via Task-Trees

Auf dem Weg zu Graph Foundation Models: Allgemeines Lernen über Graphen über Task-Trees

走向图图基础模型:通过TLT-Trees对图的学习概观

2412.16441v3

1648

05-25

Nteasee: Understanding Needs in AI for Health in Africa – A Mixed-Methods Study of Expert and General Population Perspectives

Nteasee: Die Bedürfnisse von KI für die Gesundheit in Afrika verstehen – Eine gemischte Studie von Experten und allgemeinen Bevölkerungsperspektiven

Nteasee:了解非洲健康领域对人工智能的需求 – 专家与普通民众视角的混合方法研究

2409.12197v4

1649

05-25

Beyond Message Passing: Neural Graph Pattern Machine

超过消息传递: 神经图样机

2501.18739v2

1650

05-25

Saliency-guided Emotion Modeling: Predicting Viewer Reactions from Video Stimuli

Saliency-guided Emotion Modeling: Vorhersage von Zuschauerreaktionen aus Video-Stimuli

以色素为指导的情感建模:视频刺激的预测查看器反应

2505.19178v1

1651

05-25

Mixture of Lookup Experts

Mischung von Lookup-Experten

查找专家混合

2503.15798v2

1652

05-25

Computational Inertia as a Conserved Quantity in Frictionless and Damped Learning Dynamics

Computational Inertia als konservierte Menge in friktionsloser und gedämpfter Lerndynamik

计算无损和断裂学习动力学的计算因电量

2505.19171v1

1653

05-25

JEDI: The Force of Jensen-Shannon Divergence in Disentangling Diffusion Models

JEDI: Die Macht der Jensen-Shannon-Divergenz bei entwirrenden Diffusionsmodellen

JEDI: 詹森-夏农分解扩散模型的分解力量

2505.19166v1

1654

05-25

CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter

CORAL: Lerne konsistente Repräsentationen über mehrstufiges Training mit leichterem spekulativen Entwurfer

CORAL: 利用轻型投机性起草者在多阶段培训中学习一致的代表性

2502.16880v3

1655

05-25

Efficient Training of Multi-task Neural Solver for Combinatorial Optimization

Effiziente Schulung von Multi-Task-Neural Solver zur kombinatorischen Optimierung

综合优化多任务神经溶剂高效培训

2305.06361v5

1656

05-25

Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation

Divide-Then-Aggregat: Eine effiziente Tool-Learning-Methode über parallele Tool-Invokation

分离后生成工具:通过平行工具使用使用效率高的工具学习方法

2501.12432v2

1657

05-25

Mean-Shift Distillation for Diffusion Mode Seeking

Mean-Shift-Destillation für den Diffusionsmodus

用于扩散模式搜索的中质蒸馏

2502.15989v2

1658

05-25

Do Large Language Models (Really) Need Statistical Foundations?

Brauchen große Sprachmodelle (wirklich) statistische Grundlagen?

大语言模式(真正)是否需要统计基础?

2505.19145v1

1659

05-25

ADGSyn: Dual-Stream Learning for Efficient Anticancer Drug Synergy Prediction

ADGSyn: Dual-Stream-Lernen für effiziente Anti-Krebs-Arzneimittel-Synergie-Vorhersage

ADGSyn:双层学习促进高效抗癌药物协同效应预测

2505.19144v1

1660

05-25

AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning

AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering durch Verstärkungslernen

AdaCoT:通过强化学习实现帕累托最优的自适应思维链触发

2505.11896v2

1661

05-25

CER: Confidence Enhanced Reasoning in LLMs

CER: Vertrauen in LLMs gestärkte Vernunft

CER: LLM 中增强信任的理由

2502.14634v2

1662

05-25

Uncertainty Quantification for Physics-Informed Neural Networks with Extended Fiducial Inference

Ungewissheitsquantifizierung für physikinformierte Neuronale Netzwerke mit erweiterter fiduzieller Schlussfolgerung

具有扩展影响推断力的物理成形神经网络的不确定性量化

2505.19136v1

1663

05-25

Incentivizing High-Quality Human Annotations with Golden Questions

Anreize für hochwertige menschliche Anmerkungen mit goldenen Fragen

以金质问题激励高品质人文说明

2505.19134v1

1664

05-25

Fast and Accurate Power Load Data Completion via Regularization-optimized Low-Rank Factorization

Schnelle und präzise Leistungslastdatenvervollständigung über Regularisierungsoptimierte Low-Rank-Fabrikisierung

通过正规化、优化低射速电荷因子化完成快速和准确电源负载数据

2505.19133v1

1665

05-25

Rank-One Modified Value Iteration

Rang eins geänderte Wert Iteration

Ran- One 修改值迭代

2505.01828v2

1666

05-25

Natural Language Generation from Visual Events: Challenges and Future Directions

Natürliche Sprachgenerierung aus visuellen Veranstaltungen: Herausforderungen und Zukunftsrichtungen

从视觉活动中产生自然语言:挑战和未来方向

2502.13034v2

1667

05-25

Interacting Large Language Model Agents. Interpretable Models and Social Learning

Interagieren von Large Language Model Agents. Interpretierbare Modelle und soziales Lernen

跨大语言示范工具、可解释模型和社会学习

2411.01271v2

1668

05-25

Adaptive Sensor Steering Strategy Using Deep Reinforcement Learning for Dynamic Data Acquisition in Digital Twins

Adaptive Sensorlenkungsstrategie mit tief greifendem Verstärkungslernen für die dynamische Datenerfassung in digitalen Zwillingen

利用深度强化学习实现数字孪生动态数据采集的自适应传感器转向策略

2504.10248v2

1669

05-25

Birch SGD: A Tree Graph Framework for Local and Asynchronous SGD Methods

Birke SGD: Ein Baumdiagramm-Framework für lokale und asynchrone SGD-Methoden

Birch SGD: 当地和非同步 SGD 方法树图框架

2505.09218v2

1670

05-25

Deep Active Speech Cancellation with Mamba-Masking Network

Deep Active Speech Stornierung mit Mamba-Masking Network

使用 Mamba- Masking 网络的深活动语音取消

2502.01185v2

1671

05-25

Exploring Magnitude Preservation and Rotation Modulation in Diffusion Transformers

Erforschung der Magnitudenerhaltung und Rotationsmodulation in Diffusionstransformatoren

在扩散变异器中探索磁力保护与旋转调节

2505.19122v1

1672

05-25

FP4 All the Way: Fully Quantized Training of LLMs

FP4 All the Way: Vollständig quantisiertes Training von LLMs

FP4 全程:LLM 的全量化训练

2505.19115v1

1673

05-25

Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling

Verwandeln von Müll in Schatz: Beschleunigen von Inferenzen von großen Sprachmodellen mit Token-Recycling

将垃圾垃圾变成宝库:加快使用 Tok 回收利用大语言模型的推论

2408.08696v3

1674

05-25

Stochastic Compositional Optimization with Compositional Constraints

Stochastische kompositorische Optimierung mit kompositorischen Einschränkungen

具有组成限制的斯托具组成优化

2209.04086v2

1675

05-25

An Interpretable Representation Learning Approach for Diffusion Tensor Imaging

Ein interpretierbarer Representations-Lernansatz für Diffusion Tensor Imaging

传播显像成像的可解释代表性学习方法

2505.19110v1

1676

05-25

Optimization-Inspired Few-Shot Adaptation for Large Language Models

Optimization-Inspired Wenig-Shot-Anpassung für große Sprachmodelle

优化- 激发了对大语言模型的微热适应

2505.19107v1

1677

05-25

Statistical inference for Linear Stochastic Approximation with Markovian Noise

Statistische Inferenz für lineare stochastische Approximation mit Markovschem Rauschen

与Markovian噪音的线性斯托口接近的统计推推

2505.19102v1

1678

05-25

Towards Robust Influence Functions with Flat Validation Minima

Auf dem Weg zu robusten Einflussfunktionen mit Flat Validation Minima

以平滑校准微型方式向强力影响函数方向

2505.19097v1

1679

05-25

A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random

Ein einheitliches Framework zur variablen Auswahl im modellbasierten Clustering mit Fehlen nicht zufällig

以模型为基础的集束模式中变量选择的统一框架, 随机不失踪

2505.19093v1

1680

05-25

ReadBench: Measuring the Dense Text Visual Reading Ability of Vision-Language Models

ReadBench: Vermessen der Dichte an Text Visuelle Lesefähigkeit von Vision-Sprachen-Modellen

ReadBench:衡量视觉-语言模型的密集文本视觉阅读能力

2505.19091v1

1681

05-25

CMoS: Rethinking Time Series Prediction Through the Lens of Chunk-wise Spatial Correlations

CMoS: Die Vorhersage der Zeitreihen durch die Linse der spaltweisen räumlichen Korrelationen neu denken

CMoS: 重新思考时间序列,通过整节空间交汇的镜头预测

2505.19090v1

1682

05-25

Temperature is All You Need for Generalization in Langevin Dynamics and other Markov Processes

Temperatur ist alles, was Sie für die Generalisierung in Langevin Dynamics und anderen Markov-Prozessen benötigen

Langevin Dynamics 和其他Markov 进程需要的温度是全部您需要的普遍化

2505.19087v1

1683

05-25

Jodi: Unification of Visual Generation and Understanding via Joint Modeling

Jodi: Vereinheitlichung der visuellen Erzeugung und des Verständnisses durch gemeinsame Modellierung

Jodi:通过联合建模统一视觉生成和理解

2505.19084v1

1684

05-25

Geometric Determinations Of Characteristic Redshifts From DESI-DR2 BAO and DES-SN5YR Observations: Hints For New Expansion Rate Anomalies

Geometrische Bestimmung charakteristischer Rotverschiebungen aus DESI-DR2 BAO und DES-SN5YR Beobachtungen: Hinweise für neue Erweiterungsraten Anomalien

DESI-DR2 BAO 和 DES-SN5YR 观测的特征红移的几何测定:新膨胀率异常的迹象

2505.19083v1

1685

05-25

On Continuity of Robust and Accurate Classifiers

Über die Kontinuität von robusten und präzisen Klassifikatoren

关于强力和准确性分类的连续性

2309.17048v2

1686

05-25

Flow Annealed Importance Sampling Bootstrap meets Differentiable Particle Physics

Flow Annealed Bedeutung Sampling Bootstrap trifft differenzierbare Teilchenphysik

流动的隐形重要性取样器装置符合可区分的粒子物理

2411.16234v2

1687

05-25

Cluster-Aware Multi-Round Update for Wireless Federated Learning in Heterogeneous Environments

Cluster-Aware Multi-Round Update für drahtloses Federated Learning in heterogenen Umgebungen

为不同不同环境无线联邦学习提供多功能集群软件多功能更新

2505.06268v2

1688

05-25

Recalibrating binary probabilistic classifiers

Rekalibrierung von binären probabilistischen Klassifikatoren

重新计算二进制概率分解器

2505.19068v1

1689

05-25

Adversarial Bandit over Bandits: Hierarchical Bandits for Online Configuration Management

Adversarial Bandit über Bandits: Hierarchische Bandits für Online-Konfigurationsmanagement

反强盗强盗: 用于在线配置管理的等级强盗

2505.19061v1

1690

05-25

An Initial Exploration of Fine-tuning Small Language Models for Smart Contract Reentrancy Vulnerability Detection

Eine erste Untersuchung des Feintunings kleiner Sprachmodelle zur Erkennung von Reentrancy-Schwachstellen in Smart Contracts

初步探索智能合同留置率易变性探测智能合同微调小型语言模型

2505.19059v1

1691

05-25

Policy Gradient with Tree Expansion

Policy-Gradient mit Baumerweiterung

随着树树扩张的政策渐变

2301.13236v2

1692

05-25

Distributionally Robust Deep Q-Learning

Verteilungsstarkes tiefes Q-Lernen

分布强力深学习 Q- 学习

2505.19058v1

1693

05-25

An Embarrassingly Simple Defense Against LLM Abliteration Attacks

Eine erschreckend einfache Verteidigung gegen LLM-Abliterationsangriffe

一种令人尴尬的简单防御对付LLM 缩写攻击

2505.19056v1

1694

05-25

Reduce Computational Cost In Deep Reinforcement Learning Via Randomized Policy Learning

Computerische Kosten im Deep-Verstärkung-Lernen durch Randomized Policy Learning reduzieren

通过随机化策略学习降低深度强化学习的计算成本

2505.19054v1

1695

05-25

Structured Reinforcement Learning for Combinatorial Decision-Making

Strukturiertes Stärkungslernen für kombinatorische Entscheidungsfindung

结构强化学习促进综合决策决策

2505.19053v1

1696

05-25

Efficient Data Selection at Scale via Influence Distillation

Effiziente Datenauswahl auf Scale durch Einflussdestillation

通过影响蒸馏在规模上高效数据选择

2505.19051v1

1697

05-25

SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models

SliM-LLM: Salience-getriebene Mixed-Precision-Quantisierung für große Sprachmodelle

SliM-LLM:大语言模型的盐度驱动混合精度量

2405.14917v2

1698

05-25

PII-Scope: A Comprehensive Study on Training Data PII Extraction Attacks in LLMs

PII-Scope: Eine umfassende Studie über Trainingsdaten PII-Extraktionsangriffe in LLMs

PII-Scope:关于 LLM 中训练数据 PII 提取攻击的综合研究

2410.06704v2

1699

05-25

When Models Don’t Collapse: On the Consistency of Iterative MLE

Wenn Modelle nicht zusammenbrechen: Über die Konsistenz iterativer MLE

当模型不折叠时: 在迭代 MLE 一致性上

2505.19046v1

1700

05-25

Offline Clustering of Linear Bandits: Unlocking the Power of Clusters in Data-Limited Environments

Offline-Clustering von linearen Banditen: Entriegelung der Macht von Clustern in datenbeschränkten Umgebungen

线性强盗离线集群:解锁数据限制环境中的群集力量

2505.19043v1

1701

05-25

Turb-L1: Achieving Long-term Turbulence Tracing By Tackling Spectral Bias

Turb-L1: Langfristige Turbulenzen erreichen, die durch das Greifen spektraler Bias verfolgt werden

Turb-L1:通过处理光辉双鱼,实现长期动荡追踪

2505.19038v1

1702

05-25

Optimal Conformal Prediction under Epistemic Uncertainty

Optimale konforme Vorhersage unter epistemischer Unsicherheit

在不确定性下最优化的共变预测

2505.19033v1

1703

05-25

SoK: Dataset Copyright Auditing in Machine Learning Systems

SoK: Datensatz Copyright Auditing in Machine Learning Systemen

SoK:机器学习系统中的数据集版权审计

2410.16618v2

1704

05-25

Learn Beneficial Noise as Graph Augmentation

Benefitial Noise als Graph Augmentation lernen

学习以图增益为受益噪音

2505.19024v1

1705

05-25

A Smart Healthcare System for Monkeypox Skin Lesion Detection and Tracking

Ein intelligentes Gesundheitssystem für Monkeypox-Hautläsionserkennung und -verfolgung

用于探测和跟踪猴子天花皮肤皮层的智能保健系统

2505.19023v1

1706

05-25

Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs

Unbestimmte Quantifizierung auf Funktionsebene für die Kalibrierung von Feinabstimmungen auf LLMs

对LLMML进行校准微调的不确定性定量

2410.06431v3

1707

05-25

AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer

AnchorFormer: Differentielle Anker-Achtung für effizienten Vision Transformer

Anchor Former: 高效愿景变异器的可区别的锁定器注意

2505.16463v2

1708

05-25

When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers

Wann ist Task Vector für die Modellbearbeitung beweisbar wirksam? Eine Generalisierungsanalyse von nichtlinearen Transformern

任务矢量何时对模式编辑有效? 非线性变换器的概括分析

2504.10957v3

1709

05-25

Fractured Chain-of-Thought Reasoning

Fragmentiertes Chain-of-Thought-Schlussfolgern

断裂的思维链推理

2505.12992v2

1710

05-25

Lorentzian Graph Isomorphic Network

Lorentzian 图形异形网络

2504.00142v4

1711

05-25

Querying Kernel Methods Suffices for Reconstructing their Training Data

Abfrage von Kernel-Methoden Möglichkeiten zur Wiederherstellung ihrer Trainingsdaten

查询重新构建其培训数据所需的核心内核方法

2505.19019v1

1712

05-25

Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering

Genaue und effiziente Multivariate Zeitreihenprognose über Offline-Clustering

通过离线群集预测准确而高效的多变量时间序列

2505.05738v2

1713

05-25

Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis

Ausbildung nichtlinearer Transformer für den Schlussfolgerungsketten-of-Thought: Eine theoretische Generalisierungsanalyse

培训非线性非线性变换器,用于研究链推论:理论一般分析

2410.02167v3

1714

05-25

Understanding the Robustness of Graph Neural Networks against Adversarial Attacks

Verständnis der Robustheit von Graphen-Neuralen Netzwerken gegen feindliche Angriffe

理解反对反向攻击的平面神经网络的强大力

2406.13920v2

1715

05-25

WorldEval: World Model as Real-World Robot Policies Evaluator

WorldEval: Weltmodell als Real-World-Roboterpolitik Evaluator

WorldEval:世界作为真实世界机器人政策评价人的世界模式

2505.19017v1

1716

05-25

Tokenizing Electron Cloud in Protein-Ligand Interaction Learning

Tokenizing Electron Cloud in Protein-Ligand Interaktion Lernen

将电云投入蛋白碱的相互作用学习

2505.19014v1

1717

05-25

Faithful Group Shapley Value

Treue Gruppe Shapley Wert

忠实的群群形状值

2505.19013v1

1718

05-25

Alberta Wells Dataset: Pinpointing Oil and Gas Wells from Satellite Imagery

Alberta Wells Datensatz: Pinpointing Öl- und Gasquellen aus Satellitenbildern

艾伯塔·韦尔斯数据集:从卫星图象中点出石油和天然气井

2410.09032v3

1719

05-25

FERGI: Automatic Scoring of User Preferences for Text-to-Image Generation from Spontaneous Facial Expression Reaction

FERGI: Automatische Bewertung von Benutzereinstellungen für die Text-zu-Bild-Erzeugung aus spontaner Gesichtsausdrucksreaktion

FERGI: 自动自发面性表达反应生成文本到图像的用户首选项自动排序

2312.03187v4

1720

05-25

Handling Label Noise via Instance-Level Difficulty Modeling and Dynamic Optimization

Handhabung von Etikettengeräuschen über Instance-Level-Schwierigkeitsmodellierung und dynamische Optimierung

通过实度难度建模和动态优化处理标签噪音

2505.00812v2

1721

05-25

Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding

Galaxy Walker: Geometry-aware VLMs für Galaxy-Skala Verständnis

Galaxy Walker: 用于银河系统系统理解的几何觉测甚低LMS

2503.18578v3

1722

05-25

Inductive Gradient Adjustment For Spectral Bias In Implicit Neural Representations

Induktive Gradientenanpassung für Spektralbien in impliziten Neuraldarstellungen

隐含神经表层旁观生物的感应梯度调整

2410.13271v2

1723

05-25

Semi-pessimistic Reinforcement Learning

Halbpessimistisches Erlernen der Verstärkung

半悲观强化学习

2505.19002v1

1724

05-25

Automatic and Structure-Aware Sparsification of Hybrid Neural ODEs

Automatische und strukturschonende Sparsifikation von Hybrid-Neural-ODEs

混合神经代码的自动和结构软件分离

2505.18996v1

1725

05-25

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Verstärktes Lernen zur Vernunft in großen Sprachmodellen mit einem Trainingsbeispiel

采用 “ 一个培训实例 “ 采用大语言模式强化学习

2504.20571v2

1726

05-25

PDFBench: A Benchmark for De novo Protein Design from Function

PDFBench: Ein Benchmark für De novo Protein Design von der Funktion

PDFBench:从函数调出新蛋白设计基准

2505.20346v1

1727

05-25

STRICT: Stress Test of Rendering Images Containing Text

STRICT: Stresstest von Rendering-Bildern mit Text

STRICT: 含有文字的图像的显示压力测试

2505.18985v1

1728

05-25

AmorLIP: Efficient Language-Image Pretraining via Amortization

AmorLIP: Effizientes Sprach-Bild-Vortraining über Amortisation

AmorLIP:通过摊销进行高效的语文图像预培训

2505.18983v1

1729

05-25

Learning Mamba as a Continual Learner: Meta-learning Selective State Space Models for Efficient Continual Learning

Mamba als Continual Learner lernen: Meta-Learning Selective State Space Models für effizientes Continual Learning

Mamba作为不断学习者学习Mamba:高效持续学习的元学习选择性国家空间模型

2412.00776v4

1730

05-25

LLMScan: Causal Scan for LLM Misbehavior Detection

LLMScan: Kausalscan zur Erkennung von LLM-Missverhalten

LLMScan:用于LLM Misbehavavor探测的成因扫描

2410.16638v4

1731

05-25

FedSKC: Federated Learning with Non-IID Data via Structural Knowledge Collaboration

FedSKC: Föderiertes Lernen mit Nicht-IID-Daten über strukturelle Wissenskooperation

FedSKC:通过结构性知识协作,采用非IID数据的联邦学习

2505.18981v1

1732

05-25

GhostPrompt: Jailbreaking Text-to-image Generative Models based on Dynamic Optimization

GhostPrompt: Jailbreaking Text-to-image Generative Modelle basierend auf dynamischer Optimierung

GhostPrompt:基于动态优化的文本到图像生成模型越狱

2505.18979v1

1733

05-25

ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting

ScaleBiO: Skalierbare Bilevel-Optimierung für LLM-Datenumgewichtung

缩放 BIO: LLM 数据重新加权的可缩放双级优化

2406.19976v2

1734

05-25

GraSS: Scalable Influence Function with Sparse Gradient Compression

GraSS: Skalierbare Einflussfunktion mit Sparse Gradient Compression

GraSS: 带有微缩梯度压缩的可缩放影响函数

2505.18976v1

1735

05-25

The Final Layer Holds the Key: A Unified and Efficient GNN Calibration Framework

Die letzte Ebene hält den Schlüssel: Ein einheitliches und effizientes GNN-Kalibrierungssystem

最后一层掌握着关键:统一高效的 GNN 校准框架

2505.11335v2

1736

05-25

MoLAE: Mixture of Latent Experts for Parameter-Efficient Language Models

MoLAE: Mischung aus latenten Experten für Parameter-Effiziente Sprachmodelle

MoLAE:参数有效语言模型原始专家混合

2503.23100v2

1737

05-25

Multi-Step Consistency Models: Fast Generation with Theoretical Guarantees

Multi-Step-Konsistenzmodelle: Schnelle Generation mit theoretischen Garantien

多层次一致性模式:有理论保障的快速一代

2505.01049v2

1738

05-25

Genetic Influences on Brain Aging: Analyzing Sex Differences in the UK Biobank using Structural MRI

Genetische Einflüsse auf das Altern des Gehirns: Analyse von Geschlechtsunterschieden in der britischen Biobank mittels struktureller MRT

对大脑老龄化的遗传基因影响:利用结构MRI分析联合王国生物库中的性别差异

2505.20344v1

1739

05-25

Protein Design with Dynamic Protein Vocabulary

Protein Design mit dynamischem Protein Vokabular

配有动态蛋白质词汇词典的蛋白因设计

2505.18966v1

1740

05-25

Expansion Span: Combining Fading Memory and Retrieval in Hybrid State Space Models

Expansion Span: Kombinieren von Fading Memory und Retrieval in Hybrid State Space Models

扩展空间:在混合国家空间模型中将平缓内存和检索合并

2412.13328v2

1741

05-25

How Do Images Align and Complement LiDAR? Towards a Harmonized Multi-modal 3D Panoptic Segmentation

Wie richten und ergänzen Bilder LiDAR? Auf dem Weg zu einer harmonisierten multimodalen 3D-Panoptischen Segmentierung

图像如何对齐并补充 LiDAR?迈向协调的多模态三维全景分割

2505.18956v1

1742

05-25

Online Knowledge Distillation with Reward Guidance

Online-Wissensdestillation mit lohnender Anleitung

网上知识蒸馏与奖励指导

2505.18952v1

1743

05-25

The Price of Format: Diversity Collapse in LLMs

Der Preis des Formats: Diversity Collapse in LLMs

格式价格:多样化在LLMM中崩溃

2505.18949v1

1744

05-25

Exact Expressive Power of Transformers with Padding

Exakte Expressive Kraft von Transformatoren mit Padding

带有斜面的变形器的精确表达力

2505.18948v1

1745

05-25

Minimax Optimal Reinforcement Learning with Quasi-Optimism

Minimax Optimales Stärkungslernen mit Quasi-Optimismus

以准适应主义进行最优化强化学习

2503.00810v2

1746

05-25

Efficient Pauli channel estimation with logarithmic quantum memory

Effiziente Pauli-Kanalschätzung mit logarithmischem Quantenspeicher

具有对数量内存的高效保利频道估计

2309.14326v4

1747

05-25

Structural Alignment Improves Graph Test-Time Adaptation

Struktural Alignment verbessert Graph Test-Time Anpassung

结构调整改进图示测试时间适应

2502.18334v2

1748

05-25

Chi-Square Wavelet Graph Neural Networks for Heterogeneous Graph Anomaly Detection

Chi-Square Wavelet Graph Neural Networks für Heterogene Graph Anomalie Detection

用于异源图异常异常图探测的千平方波浪图神经网络

2505.18934v1

1749

05-25

Can Large Language Models Infer Causal Relationships from Real-World Text?

Können große Sprachmodelle Kausalbeziehungen aus Real-World Text ableiten?

大语言模型能否从真实世界文本中推断出因果关系?

2505.18931v1

1750

05-25

Hybrid Neural-MPM for Interactive Fluid Simulations in Real-Time

Hybrid-Neural-MPM für interaktive Fluidsimulationen in Echtzeit

用于实时交互流力模拟的神经-MPM混合神经-MPM

2505.18926v1

1751

05-25

Graph-Based Operator Learning from Limited Data on Irregular Domains

Graph-based Operator Lernen von begrenzten Daten über irreguläre Domains

以图图为基础的操作员学习关于非常规域域的有限数据

2505.18923v1

1752

05-25

ALPCAHUS: Subspace Clustering for Heteroscedastic Data

ALPCAHUS: Subraum-Clustering für heteroskedastische Daten

ALPCAHUS: 面向异方差数据的子空间聚类

2505.18918v1

1753

05-25

Behavior Injection: Preparing Language Models for Reinforcement Learning

Verhaltensinjektion: Vorbereitung von Sprachmodellen für verstärktes Lernen

行为注射:为强化学习准备语言模式

2505.18917v1

1754

05-25

PySAD: A Streaming Anomaly Detection Framework in Python

PySAD: Ein Streaming-Anomaly Detection-Framework in Python

PySAD: Python 流动异常检测框架

2009.02572v2

1755

05-25

Understanding Multimodal LLMs Under Distribution Shifts: An Information-Theoretic Approach

Multimodale LLMs unter Verteilungsverschiebungen verstehen: Ein informationstheoretischer Ansatz

在分销变更下理解多式LLMs:信息理论方法

2502.00577v2

1756

05-25

On the Role of Label Noise in the Feature Learning Process

Über die Rolle von Etikettengeräuschen im Feature-Learning-Prozess

关于标签噪音在专题学习过程中的作用

2505.18909v1

1757

05-25

Stronger Enforcement of Instruction Hierarchy via Augmented Intermediate Representations

Stärkere Durchsetzung der Instruktionshierarchie durch Augmented Intermediate Representations

通过扩大中级代表,加强执行指示分级制度

2505.18907v1

1758

05-24 (6)

Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services

Pre-trained Encoder-Schlussfolgerung: Enthüllen Upstream-Encoder in Downstream Machine Learning Services

培训前编码器推断:在下游机器学习服务中向上游编码器

2408.02814v2

1759

05-24

PromptWise: Online Learning for Cost-Aware Prompt Assignment in Generative Models

PromptWise: Online-Lernen für kostenbewusste Prompt-Zuweisung in generativen Modellen

快速Wise:在创用模型中进行成本-软件快速指派在线学习

2505.18901v1

1760

05-24

Beyond Domain Randomization: Event-Inspired Perception for Visually Robust Adversarial Imitation from Videos

Beyond Domain Randomization: Event-inspirierte Wahrnehmung für visuell robuste Adversarial Imitation aus Videos

超出域随机化: 视频中视觉强力反逆模仿受事件启发的感知

2505.18899v1

1761

05-24

Marginal Fairness: Fair Decision-Making under Risk Measures

Marginal Fairness: Faire Entscheidungsfindung im Rahmen von Risikomaßnahmen

边际公平:风险措施下的公平决策

2505.18895v1

1762

05-24

Conformal Prediction for Uncertainty Estimation in Drug-Target Interaction Prediction

Konforme Vorhersage für Unsicherheitsschätzungen in der Drogen-Ziel-Interaktionsvorhersage

药物-目标相互作用预测中不确定性估计的非正式预测

2505.18890v1

1763

05-24

Enabling Unstructured Sparse Acceleration on Structured Sparse Accelerators

Ermöglichung unstrukturierter Spars-Beschleunigung bei strukturierten Spars-Beschleunigern

启用结构散开加速器, 启用无结构的分散加速器

2403.07953v3

1764

05-24

Neural Encoding and Decoding at Scale

Neurale Enkodierung und Dekodierung auf Scale

缩放时神经编码和解码

2504.08201v4

1765

05-24

Data Augmentation for Time-Series Classification: An Extensive Empirical Study and Comprehensive Survey

Datenvergrößerung für die Zeitreihenklassifikation: Eine umfangreiche empirische Studie und umfassende Umfrage

时间-系列分类数据扩充:广泛经验研究和全面调查

2310.10060v6

1766

05-24

KerZOO: Kernel Function Informed Zeroth-Order Optimization for Accurate and Accelerated LLM Fine-Tuning

KerZOO: Kernel-Funktion informierte Zeroth-Order-Optimierung für präzise und beschleunigte LLM-Feinsteuerung

KerZOO:面向准确且加速的 LLM 微调的核函数引导零阶优化

2505.18886v1

1767

05-24

LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders

LORE: Lagrangian-optimierte robuste Einbettungen für visuelle Encoder

LORE:面向视觉编码器的拉格朗日优化鲁棒嵌入

2505.18884v1

1768

05-24

LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity

LinGen: Auf dem Weg zur High-Resolution Minute-Length Text-to-Video-Generation mit linearer Computational Complexity

LinGen:迈向具有线性比较复杂度的高分辨率分钟-语言文本到视频的生成

2412.09856v2

1769

05-24

Partition Generative Modeling: Masked Modeling Without Masks

Partition Generative Modellierung: Maskenmodellierung ohne Masken

生成建模:没有遮罩的蒙面建模

2505.18883v1

1770

05-24

RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models

RefLoRA: Refactored Low-Rank-Anpassung für effizientes Feintuning großer Modelle

RefLORA:为对大型模型进行高效微调而进行重构的低Rank适应

2505.18877v1

1771

05-24

Non-Stationary Lipschitz Bandits

Nicht-stationäre Lipschitz Banditen

非固定的利普施奇茨猛匪

2505.18871v1

1772

05-24

Sci-LoRA: Mixture of Scientific LoRAs for Cross-Domain Lay Paraphrasing

Sci-LoRA: Mischung aus wissenschaftlichen LoRAs für Cross-Domain Lay Paraphrasing

Sci-LORA:将科学LORA混合起来,用于跨域地谱图谱绘制

2505.18867v1

1773

05-24

Distribution-Aware Mobility-Assisted Decentralized Federated Learning

Distribution-Aware Mobility-Assisted Dezentrales Federated Learning

分发通知 – – 流动协助 – – 分权力下放的联邦学习

2505.18866v1

1774

05-24

Guided by Guardrails: Control Barrier Functions as Safety Instructors for Robotic Learning

Geführt von Guardrails: Steuerungsbarrierenfunktionen als Sicherheitsinstruktoren für das Roboterlernen

由警卫队指导:作为机器人学习安全教官的控制障碍功能

2505.18858v1

1775

05-24

USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations

USDC: Ein Datensatz von $\underline{U}$ser $\underline{S}$tance und $\underline{D}$ogmatism in langen $\underline{C}$onversations

USDC: 长 $\underline{C}$onversations 中 $\underline{U}$ser $\underline{S}$tance 与 $\underline{D}$ogmatism 的数据集

2406.16833v2

1776

05-24

Toward Malicious Clients Detection in Federated Learning

Auf dem Weg zu bösartigen Kunden Erkennung im Föderierten Lernen

争取在联邦学习中发现恶意客户

2505.09110v2

1777

05-24

Corruption-Aware Training of Latent Video Diffusion Models for Robust Text-to-Video Generation

Korruptionsbewusstes Training latenter Video-Diffusionsmodelle für robuste Text-zu-Video-Generierung

面向鲁棒文本到视频生成的潜在视频扩散模型的损坏感知训练

2505.21545v1

1778

05-24

On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization

Auf die Wirkung des negativen Gradienten in der Gruppe Relative Tiefenverstärkung Optimierung

对群体相对深强化优化中的负梯度效应的影响

2505.18830v1

1779

05-24

Multi-Agent Best Arm Identification in Stochastic Linear Bandits

Multi-Agent Best Arm Identification in stochastische Linear Banditen

斯托切斯定线强盗中多代理最佳武器识别

2411.13690v2

1780

05-24

Improved Regret and Contextual Linear Extension for Pandora’s Box and Prophet Inequality

Verbesserte regret und kontextuelle lineare Erweiterung für Pandora’s Box und Prophet Inequality

改进潘多拉盒子和先知不平等的遗憾和背景扩展线性扩展

2505.18828v1

1781

05-24

A Real-World Energy Management Dataset from a Smart Company Building for Optimization and Machine Learning

Ein Echtzeit-Energiemanagement-Datensatz aus einem Smart Company Building für Optimierung und maschinelles Lernen

最佳优化和机器学习智能公司大楼的 “ 现实世界能源管理数据集 “

2503.11469v2

1782

05-24

How to build a consistency model: Learning flow maps via self-distillation

Wie man ein Konsistenzmodell baut: Flusskarten über Selbstdestillation lernen

如何建立一致性模式:通过自我蒸馏学习流程图

2505.18825v1

1783

05-24

Robust multi-coil MRI reconstruction via self-supervised denoising

Robuste Multi-Coil-MRT-Rekonstruktion durch selbstüberwachte Denoisierung

通过自我监督的自监管的去注水进行强有力的多石油MRI重建

2411.12919v4

1784

05-24

Fully tensorial approach to hypercomplex neural networks

Voller Tensoransatz für hyperkomplexe neuronale Netzwerke

对超复合性神经神经网络采取完全强制的全方位方法

2407.00449v3

1785

05-24

Stealing Training Graphs from Graph Neural Networks

Stealing Training Graphen aus Graph Neural Networks

图表神经网络中的偷窃培训图

2411.11197v2

1786

05-24

GRoQ-LoCO: Generalist and Robot-agnostic Quadruped Locomotion Control using Offline Datasets

GRoQ-LoCO: Generalist und Roboter-agnostische Quadruped Locomotion Control mit Offline-Datensätzen

GROQ-LoCO:使用离线数据集的通用和机器人-不可知性四分流移动控制

2505.10973v3

1787

05-24

Preference Leakage: A Contamination Problem in LLM-as-a-judge

Bevorzugte Leckage: Ein Kontaminierungsproblem im LLM-as-a-Richter

优先渗漏:LLM-作为法官的LLM中的污染问题

2502.01534v2

1788

05-24

Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis

Erforschung der QUIC-Dynamik: Ein großformatiger Datensatz für verschlüsselte Verkehrsanalyse

探索 QUIC 动态动态:加密流量分析的大型数据集

2410.03728v6

1789

05-24

DiSCo: Device-Server Collaborative LLM-Based Text Streaming Services

DiSCo: Geräte-Server Kollaborative LLM-basierte Text-Streaming-Dienste

DiSCo: 设备-服务器协作的基于LLM的文本流服务

2502.11417v2

1790

05-24

Operator-Informed Score Matching for Markov Diffusion Models

Operator-Informed Score Matching für Markov Diffusion Modelle

Markov 扩散模型的操作员不完善的评分匹配

2406.09084v2

1791

05-24

Expert-Agnostic Learning to Defer

Expertenagnostisches Learning to Defer

专家无关的延迟决策学习

2502.10533v2

1792

05-24

Partial Distribution Matching via Partial Wasserstein Adversarial Networks

Teilverteilung Passend über Teilwasserstein Adversarial Networks

通过部分瓦森斯坦对冲网络进行部分配配

2409.10499v2

1793

05-24

MAPLE: Enhancing Review Generation with Multi-Aspect Prompt LEarning in Explainable Recommendation

MAPLE: Verbesserung der Review Generation mit Multi-Aspect Prompt Learning in erklärbarer Empfehlung

MAPLE: 在可解释推荐中通过多方面提示学习增强评论生成

2408.09865v2

1794

05-24

Governing Equation Discovery from Data Based on Differential Invariants

Regulierende Gleichungs-Entdeckung aus Daten basierend auf unterschiedlichen Invarianten

从基于差异内在变量的数据中分离出来的数据

2505.18798v1

1795

05-24

Guarding Graph Neural Networks for Unsupervised Graph Anomaly Detection

Überwachung von Graphen-Neuralnetzwerken für unbeaufsichtigte Graphenanomalienerkennung

用于不受监督的异常图图探测的保护图形神经网络

2404.16366v2

1796

05-24

Leveraging Per-Instance Privacy for Machine Unlearning

Per-Instance-Leveraging-Privatsphäre für das maschinelle Lernen

利用个人隐私促进机器脱学

2505.18786v1

1797

05-24

A physics-guided smoothing method for material modeling with digital image correlation (DIC) measurements

Ein physikgeführtes Glättverfahren für die Materialmodellierung mit Messungen der digitalen Bildkorrelation (DIC)

采用物理制导平滑法进行数字图像相关测量材料建模

2505.18784v1

1798

05-24

Soft Weighted Machine Unlearning

Weichgewichtete Maschine nicht lernen

软加权机器脱学

2505.18783v1

1799

05-24

One Policy but Many Worlds: A Scalable Unified Policy for Versatile Humanoid Locomotion

Eine Politik, aber viele Welten: Eine skalierbare, einheitliche Politik für vielseitige humanoide Lokomotion

一个政策,但许多世界:一个可扩展的统一政策,促进有生命力的人类活动

2505.18780v1

1800

05-24

HD-PiSSA: High-Rank Distributed Orthogonal Adaptation

HD-PiSSA: High-Rank verteilte Orthogonalanpassung

HD-PiSSA: 高射分散的正心调整适应

2505.18777v1

1801

05-24

Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models

Starke Mitgliedschafts-Inferenzangriffe auf massive Datensätze und (Moderate) große Sprachmodelle

对大规模数据集和(口头)大语言模型的强烈成员推论攻击

2505.18773v1

1802

05-24

CageNet: A Meta-Framework for Learning on Wild Meshes

CageNet: Ein Meta-Rahmen für das Lernen auf Wild Meshes

CageNet:野生动物类学习的元框架

2505.18772v1

1803

05-24

Dual-Path Stable Soft Prompt Generation for Domain Generalization

Dual-Path stabile Soft Prompt Generation für Domain-Verallgemeinerung

两平面稳定软软生成域通用化快速生成

2505.18770v1

1804

05-24

Multiple Wasserstein Gradient Descent Algorithm for Multi-Objective Distributional Optimization

Vielfacher Wasserstein Gradient Descent Algorithmus für Multi-Objective Distributional Optimization

多目标分布优化多瓦森斯坦梯度底源值

2505.18765v1

1805

05-24

Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model

Textgeführte Multi-Property-Molekularoptimierung mit einem Diffusions-Sprachenmodell

带有传播语言模型的文本引导多财产分子优化

2410.13597v2

1806

05-24

How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark

Wie wird LLM-Reasoning vom irrelevanten Kontext abgelenkt? Eine Analyse mit einem kontrollierten Benchmark

LLM 推理如何被无关上下文干扰?基于受控基准的分析

2505.18761v1

1807

05-24

The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation

Die Suche nach einer effizienten Begründung: Ein datenzentrischer Benchmark zur CoT-Destillation

有效合理理由的查询:COT蒸馏的数据中心基准

2505.18759v1

1808

05-24

Lean and Mean Adaptive Optimization via Subset-Norm and Subspace-Momentum with Convergence Guarantees

Lean and Mean Adaptive Optimization via Subset-Norm und Subspace-Momentum mit Konvergenzgarantien

通过具有聚合担保的子元和子空间动力及子空间动力进行皮和平均适应性优化

2411.07120v2

1809

05-24

Reducing Storage of Pretrained Neural Networks by Rate-Constrained Quantization and Entropy Coding

Reduzierung der Speicherung vortrainierter neuraler Netzwerke durch ratenkontrainierte Quantisierung und Entropiecodierung

通过受费率限制的量化和元件编码减少储存预培训神经网络

2505.18758v1

1810

05-24

Smart Energy Guardian: A Hybrid Deep Learning Model for Detecting Fraudulent PV Generation

Smart Energy Guardian: Ein hybrides Deep-Learning-Modell zur Erkennung betrügerischer PV-Generation

智能能源守护者:发现欺诈性光电池发电的混合深学习模式

2505.18755v1

1811

05-24

HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting

HiMoE: Heterogenitäts-informierte Mixture-of-Experts für faire räumlich-zeitliche Vorhersagen

HiMoE:面向公平时空预测的异质性感知专家混合

2412.00316v3

1812

05-24

Season-Independent PV Disaggregation Using Multi-Scale Net Load Temporal Feature Extraction and Weather Factor Fusion

Saisonunabhängige PV-Disaggregation mittels Multi-Scale Net Load Temporal Feature Extraktion und Wetterfaktor Fusion

使用多种规模净负荷时间特征抽取和天气因素融合的季节独立光电池拆分

2505.18747v1

1813

05-24

C3R: Channel Conditioned Cell Representations for unified evaluation in microscopy imaging

C3R: Kanalkonditionierte Zelldarstellungen zur einheitlichen Auswertung in der Mikroskopie-Bildgebung

C3R:用于对显微镜成像进行统一评价的有条件细胞代表的频道

2505.18745v1

1814

05-24

Interpretable Company Similarity with Sparse Autoencoders

Interpretierbare Firmenähnlichkeit mit Sparse Autoencodern

基于稀疏自编码器的可解释公司相似性

2412.02605v3

1815

05-24

Feature Extraction and Steering for Enhanced Chain-of-Thought Reasoning in Language Models

Feature-Extraktion und -Lenkung für eine verbesserte Kettenbildung in Sprachmodellen

语言模型中强化研究链理由的特征采掘和指南

2505.15634v2

1816

05-24

An Interpretable Deep-Learning Framework for Predicting Hospital Readmissions From Electronic Health Records

Ein interpretierbarer Deep-Learning-Rahmen für die Vorhersage von Krankenhausrückübernahmen aus elektronischen Gesundheitsakten

预测医院从电子健康记录中读取的医院可解释的深学习框架

2310.10187v2

1817

05-24

AuroRA: Breaking Low-Rank Bottleneck of LoRA with Nonlinear Mapping

AuroRA: Breaking Low-Rank Engpass von LoRA mit nichtlinearer Kartierung

AuroRA:用非线性绘图法打破LORA的低兰克瓶尾裂

2505.18738v1

1818

05-24

Graph Neural Networks for Knowledge Enhanced Visual Representation of Paintings

Graph Neural Networks für wissensgestützte visuelle Darstellung von Gemälden

知识强化画画视觉表现神经网络

2105.08190v2

1819

05-24

MADCAT: Combating Malware Detection Under Concept Drift with Test-Time Adaptation

MADCAT: Bekämpfung der Malware-Erkennung unter Konzept Drift mit Test-Zeit-Anpassung

MADCAT: 在 “ 漂流 “ 概念下,通过测试-时间适应来打击 “ 恶意探测 “

2505.18734v1

1820

05-24

ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search

ReGUIDE: Dateneffizientes GUI Grounding über räumliche Vernunft und Suche

数据高效界面:通过空间理性和搜索进行数据高效界面定位

2505.15259v2

1821

05-24

Reward-Driven Interaction: Enhancing Proactive Dialogue Agents through User Satisfaction Prediction

Reward-Driven Interaction: Verbesserung proaktiver Dialog-Agenten durch Nutzerzufriedenheitsvorhersage

回报率互动:通过用户满意度预测加强积极主动的对话机构

2505.18731v1

1822

05-24

Influence Functions for Scalable Data Attribution in Diffusion Models

Einflussfunktionen für skalierbare Datenzuweisungen in Diffusionsmodellen

扩散模型中可缩放数据归属的影响函数

2410.13850v5

1823

05-24

Message-Passing State-Space Models: Improving Graph Learning with Modern Sequence Modeling

Message-Passing State-Space-Modelle: Verbesserung des Graphen-Lernens mit moderner Sequenzmodellierung

传递信息的国家空间模型:利用现代序列模型改进图表学习

2505.18728v1

1824

05-24

Length independent generalization bounds for deep SSM architectures via Rademacher contraction and stability constraints

Längenunabhängige Verallgemeinerungsgrenzen für tiefe SSM-Architekturen über Rademacher Kontraktion und Stabilitätsbeschränkungen

通过Rademacher收缩和稳定性约束得到深层SSM架构的长度无关泛化界

2405.20278v3

1825

05-24

Audio Geolocation: A Natural Sounds Benchmark

Audio Geolocation: Ein natürlicher Klang Benchmark

音频地理定位:自然声音基准

2505.18726v1

1826

05-24

LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning

LoTA-QAF: Lossless Ternary Adaptation für Quantization-Aware Fine-Tuning

LoTA-QAF:量化软件微调的无损失田间适应

2505.18724v1

1827

05-24

Optimal Transport-Based Token Weighting scheme for Enhanced Preference Optimization

Optimales Transport-basiertes Token-Gewichtungssystem für verbesserte Preference-Optimierung

增强优惠优化的优化运输托肯加权计划

2505.18720v1

1828

05-24

Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer

Neurale Parameter Suche nach schlankeren Modellen und besserer Übertragung

搜索细微精制模型和更好传输的神经参数

2505.18713v1

1829

05-24

Learning on LLM Output Signatures for gray-box Behavior Analysis

Lernen auf LLM-Ausgangssignaturen für graue Verhaltensanalyse

学习用于灰箱行为分析的 LLM 输出签名

2503.14043v2

1830

05-24

Steering LLM Reasoning Through Bias-Only Adaptation

Steuerung der LLM-Vernunft durch Bias-Only-Anpassung

仅有的偏差调整导致的偏差调整

2505.18706v1

1831

05-24

(Implicit) Ensembles of Ensembles: Epistemic Uncertainty Collapse in Large Models

(Implizit) Ensembles von Ensembles: Epistemische Ungewissheit bricht in großen Modellen zusammen

群集集合:大型模型中的不确定性粒子折叠

2409.02628v2

1832

05-24

Data Overvaluation Attack and Truthful Data Valuation in Federated Learning

Datenüberbewertung Angriff und Truthful Data Bewertung im Föderierten Lernen

联邦学习联盟的数据评价高估攻击和真实数据估值

2502.00494v3

1833

05-24

MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention

MonarchAttention: Zero-Shot-Umwandlung in schnelle, hardwarebewusste strukturierte Aufmerksamkeit

MonarchAttention: 零热转换为快速硬件软件

2505.18698v1

1834

05-24

Can LLMs Alleviate Catastrophic Forgetting in Graph Continual Learning? A Systematic Study

Kann LLMs in Graph Continual Learning Katastrophisches Vergessen lindern? Eine systematische Studie

LLM 能够减轻图持续学习中的灾难性遗忘吗?系统研究

2505.18697v1

1835

05-24

Revisiting Model Inversion Evaluation: From Misleading Standards to Reliable Privacy Assessment

Revisiting Model Inversion Evaluation: Von irreführenden Standards zur zuverlässigen Datenschutzbewertung

重新审视示范反向评价:从错误领导标准到可靠隐私评估

2505.03519v3

1836

05-24

Simultaneous Optimization of Efficiency and Degradation in Tunable HTL-Free Perovskite Solar Cells with MWCNT-Integrated Back Contact Using a Machine Learning-Derived Polynomial Regressor

Gleichzeitige Optimierung von Effizienz und Degradation in Tunablen HTL-freien Perovskite-Solarzellen mit MWCNT-Integriert Zurück Kontakt mit einem maschinenlernenden Polynom-Regressor

利用机械学习多面制反转器,与MWCNT综合后退联系,同时优化金枪鱼可HTL-无 Perovskite的无Perovskite太阳能电池的效率和退化

2505.18693v1

1837

05-24

Variational Schrödinger Diffusion Models

Variationelle Schrödinger-Diffusionsmodelle

变分薛定谔扩散模型

2405.04795v5

1838

05-24

Large Language Models in the Task of Automatic Validation of Text Classifier Predictions

Große Sprachmodelle in der Aufgabe der automatischen Validierung von Textklassifikatoren Vorhersagen

文本分类自动验证任务中的大语言模型

2505.18688v1

1839

05-24

Predictive Performance of Deep Quantum Data Re-uploading Models

Predictive Performance von Deep Quantum Data Re-Uploading-Modellen

深量量数据数据重新加载模型的预测性性能

2505.20337v1

1840

05-24

A fast algorithm to minimize prediction loss of the optimal solution in inverse optimization problem of MILP

Ein schneller Algorithmus zur Minimierung des Vorhersageverlusts der optimalen Lösung im inversen Optimierungsproblem von MILP

快速算法,以尽量减少MILP反优化问题最佳解决办法的预测损失

2405.14273v3

1841

05-24

Thinking like a CHEMIST: Combined Heterogeneous Embedding Model Integrating Structure and Tokens

Wie ein CHEMIST denken: Kombiniertes Heterogenes Einbetten von Modellintegrationsstrukturen und Tokens

思考像CHEMIST: 混合异基因嵌入模型集成结构和调子

2502.17986v2

1842

05-24

Augmenting the action space with conventions to improve multi-agent cooperation in Hanabi

Erweiterung des Aktionsraums mit Konventionen zur Verbesserung der Multi-Agenten-Kooperation in Hanabi

与公约扩大行动空间,以改进哈纳比多剂合作

2412.06333v3

1843

05-24

COPA: Comparing the incomparable in multi-objective model evaluation

COPA: Vergleich des Unvergleichbaren in der multiobjektiven Modellauswertung

COPA: 在多目标模型评价中比较无法比较的事物

2503.14321v2

1844

05-24

End-to-End Framework for Predicting the Remaining Useful Life of Lithium-Ion Batteries

End-to-End-Framework zur Vorhersage der verbleibenden Nutzungsdauer von Lithium-Ionen-Batterien

预测锂-碘电池剩余使用寿命的端至端框架

2505.16664v2

1845

05-24

A Quantum Approximation Scheme for k-Means

Ein Quantenannäherungsprogramm für k-Means

k- Means 的量接近量计划

2308.08167v3

1846

05-24

Generating Full-field Evolution of Physical Dynamics from Irregular Sparse Observations

Erzeugen der Vollfeld-Evolution der physikalischen Dynamik aus irregulären Sparse-Beobachtungen

从不定期的偏差观测中生成物理动态全场演变

2505.09284v2

1847

05-24

Does Representation Intervention Really Identify Desired Concepts and Elicit Alignment?

Findet Repräsentationsintervention wirklich Wunschvorstellungen und Ausgeglichenheit wieder?

代表权干预是否真正确定了理想概念和目的一致?

2505.18672v1

1848

05-24

Flat-LoRA: Low-Rank Adaptation over a Flat Loss Landscape

Flat-LoRA: Low-Rank Anpassung über eine flache verlorene Landschaft

Flat-LORA: 适应平坦损失地貌的低Rank适应

2409.14396v2

1849

05-24

DeCaFlow: A Deconfounding Causal Generative Model

DeCaFlow: Ein entkonfoundierendes Kausalgeneratives Modell

DeCaFlow:一个破碎的因果创造模型

2503.15114v2

1850

05-24

Self-Supervised Evolution Operator Learning for High-Dimensional Dynamical Systems

Selbstüberwachtes Evolutionsoperator-Lernen für hochdimensionelle dynamische Systeme

高维动力系统的自监督演化算子学习

2505.18671v1

1851

05-24

Memory-Efficient Super-Resolution of 3D Micro-CT Images Using Octree-Based GANs: Enhancing Resolution and Segmentation Accuracy

Speichereffiziente Super-Resolution von 3D-Mikro-CT-Bildern mit oktree-basierten GANs: Verbesserung der Auflösung und Segmentierung Genauigkeit

使用基于八叉树的GAN对3D显微CT图像进行内存高效的超分辨率:提升分辨率和分割精度

2505.18664v1

1852

05-24

Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees

Adaptive Vorhersage-Powered AutoEval mit Zuverlässigkeit und Effizienzgarantien

具有可靠性和效率保障的适应性预测力自动评估

2505.18659v1

1853

05-24

Robustness in Large Language Models: A Survey of Mitigation Strategies and Evaluation Metrics

Robustheit in großen Sprachmodellen: Eine Umfrage zu Mitigationsstrategien und Evaluationsmetrics

大语言模式的强强力:减轻战略调查和评价

2505.18658v1

1854

05-24

LLM-QFL: Distilling Large Language Model for Quantum Federated Learning

LLM-QFL: Destillieren eines großen Sprachmodells für Quantum-Federated Learning

LLM-QFL:为量子联邦学习保留大语言模式

2505.18656v1

1855

05-24

On the Emergence of Linear Analogies in Word Embeddings

Zur Entstehung linearer Analogien in Word-Embeddings

单线模拟在文字嵌入中的出现

2505.18651v1

1856

05-24

Flow Matching for Geometric Trajectory Simulation

Flow Matching für geometrische Trajektoriensimulation

几何轨迹模拟流程匹配

2505.18647v1

1857

05-24

Randomized Midpoint Method for Log-Concave Sampling under Constraints

Randomisierte Midpoint-Methode für Log-Concave-Sampling unter Einschränkungen

制约下对日志集点取样的随机中点方法

2405.15379v2

1858

05-24

STaRFormer: Semi-Supervised Task-Informed Representation Learning via Dynamic Attention-Based Regional Masking for Sequential Data

STaRFormer: Halbüberwachtes Task-Informiertes Representation-Lernen über dynamisches, aufmerksamkeitsbasiertes regionales Masking für sequentielle Daten

STARFormer:通过动态关注-基于关注的区域按顺序数据区域掩码,进行半超常任务化代表性学习

2504.10097v2

1859

05-24

ThanoRA: Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation

ThanoRA: Aufgabe Heterogenität bewusst Multi-Task Low-Rank-Anpassung

塔诺拉:任务差异性-软件多功能、多任务、低风险适应

2505.18640v1

1860

05-24

Graph-Supported Dynamic Algorithm Configuration for Multi-Objective Combinatorial Optimization

Graphunterstützte dynamische Algorithmenkonfiguration für multi-objektive Kombinator-Optimierung

多目标组合优化多目标组合优化支持的图形支持动态算法配置

2505.16471v2

1861

05-24

DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection

DitHub: Modulares Framework zur inkrementellen Open-Vocabulary-Objekterkennung

DitHub: 递增开放词汇物体探测模块框架

2503.09271v2

1862

05-24

Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees

Multi-Step Alignment als Markov Games: Ein optimaler Online-Gradient-Abstieg mit Konvergenzgarantien

作为Markov运动会的多步对齐:带有一致保障的乐观的在线逐渐递增人种方法

2502.12678v2

1863

05-24

Leveraging Structural Knowledge in Diffusion Models for Source Localization in Data-Limited Graph Scenarios

Nutzung struktureller Kenntnisse in Diffusionsmodellen für die Quellenlokalisierung in datenbeschränkten Graphenszenarien

利用传播模型中的结构性知识,在数据限制的图表假设情景中实现源本地化

2502.17928v2

1864

05-24

Asymmetric Duos: Sidekicks Improve Uncertainty

Asymmetrische Duos: Sidekicks verbessern Unsicherheit

非对称 Duos: 侧边icks 改善不确定性

2505.18636v1

1865

05-24

You Can Wash Hands Better: Accurate Daily Handwashing Assessment with a Smartwatch

Sie können Hände besser waschen: Genaue tägliche Handwäsche Bewertung mit einer Smartwatch

你可以更好地洗手:用智能观察准确进行每日洗手评估

2112.06657v5

1866

05-24

Think Before You Accept: Semantic Reflective Verification for Faster Speculative Decoding

Denken Sie, bevor Sie akzeptieren: Semantische Reflektierende Verifizierung für schnellere spekulative Dekodierung

在你接受之前先想想: 快速投机代号的语义反省校验

2505.18629v1

1867

05-24

HARP: Hesitation-Aware Reframing in Transformer Inference Pass

HARP: Hezitation-Aware Reframing in Transformer Inferenz Pass

HARP: 变压器推断通过中的偏移-软件重新配置

2412.07282v2

1868

05-24

QUCE: The Minimisation and Quantification of Path-Based Uncertainty for Generative Counterfactual Explanations

QUCE: Die Minimierung und Quantifizierung pfadbasierter Unsicherheiten für generative gegenfaktische Erklärungen

QUCE: 产生反事实解释的路径不确定性的最小化和量化

2402.17516v5

1869

05-24

Mind The Gap: Deep Learning Doesn’t Learn Deeply

Mind The Gap: Deep Learning lernt nicht tief

思想差距:深学习不深入学习

2505.18623v1

1870

05-24

Trust, or Don’t Predict: Introducing the CWSA Family for Confidence-Aware Model Evaluation

Vertrauen oder nicht voraussagen: Einführung der CWSA-Familie für vertrauensbewusste Modellbewertung

信任或不要预测:介绍CWSA家庭促进信任-了解模型评价

2505.18622v1

1871

05-24

Neural Solver Selection for Combinatorial Optimization

Neural Solver Selection zur kombinatorischen Optimierung

组合优化的神经溶剂选择

2410.09693v2

1872

05-24

Federated Class-Incremental Learning with Hierarchical Generative Prototypes

Föderiertes Klassen-Inkrementelles Lernen mit Hierarchischen Generativen Prototypen

具有等级制起源原型的联邦高级高等程度学习

2406.02447v4

1873

05-24

MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation

MAVL: Ein mehrsprachiger Audio-Video-Text Datensatz für animierte Song-Übersetzung

MAVL: 动动歌曲翻译多语种视听歌词数据集

2505.18614v1

1874

05-24

MLRan: A Behavioural Dataset for Ransomware Analysis and Detection

MLRan: Ein Verhaltensdatensatz für Ransomware Analyse und Erkennung

MLRan:用于勒索软件分析和检测的行为数据集

2505.18613v1

1875

05-24

An Artificial Intelligence Model for Early Stage Breast Cancer Detection from Biopsy Images

Ein Modell der Künstlichen Intelligenz zur Früherkennung von Brustkrebs aus Biopsiebildern

早期从生物心理图像中检测乳腺癌的人工智能模型

2505.20332v1

1876

05-24

Exemplar-Free Continual Learning for State Space Models

Beispielfreies kontinuierliches Lernen für Staatsraummodelle

国家空间模型免税免费持续学习

2505.18604v1

1877

05-24

LLM-Meta-SR: Learning to Evolve Selection Operators for Symbolic Regression

LLM-Meta-SR: Lernen, Auswahloperatoren für symbolische Regression zu entwickeln

LLM-Meta-SR:学习如何向演进中的反射反射选择操作员学习

2505.18602v1

1878

05-24

Learning to Program Quantum Measurements for Machine Learning

Lernen, Quantenmessungen für maschinelles Lernen zu programmieren

学习机器学习量度方案

2505.13525v2

1879

05-24

Sum of Squares Circuits

Summe der Quadrate Schaltungen

平方电路总和

2408.11778v3

1880

05-24

LLMs for Supply Chain Management

LLMs für Supply Chain Management

供应链管理LLMs

2505.18597v1

1881

05-24

MisoDICE: Multi-Agent Imitation from Unlabeled Mixed-Quality Demonstrations

MisoDICE: Multi-Agent-Imitation aus nicht gekennzeichneten Mixed-Quality-Demonstrationen

MisoDICE:从未贴标签的混合质量示范中多机构吸收

2505.18595v1

1882

05-24

Bayesian Meta-Reinforcement Learning with Laplace Variational Recurrent Networks

Bayesian Meta-Reinforcement Learning mit Laplace Variational Recurrent Networks

采用拉位变换经常网络加强Bayesian Met-加强学习

2505.18591v1

1883

05-24

CiRL: Open-Source Environments for Reinforcement Learning in Circular Economy and Net Zero

CiRL: Open-Source-Umgebungen für verstärktes Lernen in der Kreislaufwirtschaft und Net Zero

CIRL: 在循环经济和净零中加强学习的开放源环境

2505.21536v1

1884

05-24

Model Extrapolation Expedites Alignment

Modell Extrapolation Expeditionen Ausrichtung

模型外推快速调整

2404.16792v4

1885

05-24

Continuous Multi-Task Pre-training for Malicious URL Detection and Webpage Classification

Kontinuierliches Multi-Task-Vortraining für bösartige URL-Erkennung und Webpage-Klassifikation

恶意URL探测和网页分类连续多任务连续培训

2402.11495v2

1886

05-24

REAL: Representation Enhanced Analytic Learning for Exemplar-free Class-incremental Learning

REAL: Darstellungsverstärktes analytisches Lernen für exemplarisch-freies Klassen-inkrementelles Lernen

实际:为免世禁初级入门学习加强代表性分析学习

2403.13522v2

1887

05-24

AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

AFL: Ein eingleisiger analytischer Ansatz für das Federated Learning mit vortrainierten Modellen

AFL: 采用预训练模型的联邦学习单轮解析方法

2405.16240v2

1888

05-24

Mechanical in-sensor computing: a programmable meta-sensor for structural damage classification without external electronic power

Mechanische In-Sensor-Computing: ein programmierbarer Meta-Sensor für die Klassifizierung von Strukturschäden ohne externe elektronische Leistung

传感器中的机械内传感器计算:可编程的元传感器,用于结构损害分类,无外部电子电源

2505.18579v1

1889

05-24

Trust-Region Twisted Policy Improvement

Vertrauensregion verdrehte politische Verbesserung

信赖域扭曲策略改进

2504.06048v3

1890

05-24

TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

TabICL: Ein tabellarisches Grundlagenmodell für das In-Context-Lernen mit großen Datenmengen

TabICL: 大型数据内部知识学习表示基础模型

2502.05564v2

1891

05-24

DAL: A Practical Prior-Free Black-Box Framework for Non-Stationary Bandit Environments

DAL: Ein praktisches Prior-Free Black-Box Framework für nicht-stationäre Bandit-Umgebungen

DAL:非高度强盗环境实际的、事先免费的黑盒框架

2501.19401v2

1892

05-24

Convergence Analysis of Natural Gradient Descent for Over-parameterized Physics-Informed Neural Networks

Konvergenzanalyse des natürlichen Gradientenabstiegs für überparameterisierte physikinformierte neurale Netzwerke

超参数物理内成形神经神经网络的自然梯分源相趋同分析

2408.00573v3

1893

05-24

Autocomp: LLM-Driven Code Optimization for Tensor Accelerators

Autocomp: LLM-gesteuerte Code-Optimierung für Tensor-Beschleuniger

自动comp: LLM- Driven 代码对 Tensor 加速器的优化

2505.18574v1

1894

05-24

Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs

Steigerung der Effizienz und Exploration bei der Stärkung des Lernens für LLMs

提高LLM强化学习的效率和探索

2505.18573v1

1895

05-24

VISTA: Vision-Language Inference for Training-Free Stock Time-Series Analysis

VISTA: Vision-Language-Schlussfolgerung für eine trainingsfreie Analyse der Stock-Zeitreihen

VISTA:无培训-库存无培训-时间-系列分析的远景-语言推断

2505.18570v1

1896

05-24

Learning without Isolation: Pathway Protection for Continual Learning

Lernen ohne Isolation: Pfadschutz für kontinuierliches Lernen

无孤立的学习:持续学习的路径保护

2505.18568v1

1897

05-24

ReflectDiffu:Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework

ReflectDiffu: Reflect zwischen emotional-intent Ansteckung und Mimicry für Empathetic Response Generation über ein RL-Diffusion Framework

反省:通过RL-扩散框架,对情感-情感内聚变和Mmimimicry之间的反射,以便产生同情性反应

2409.10289v3

1898

05-24

Learning Fluid-Structure Interaction Dynamics with Physics-Informed Neural Networks and Immersed Boundary Methods

Learning Fluid-Struktur-Interaktion Dynamik mit physikinformierten Neuronalen Netzwerken und eingetauchten Grenzmethoden

与物理内成形神经网络和混合边界方法的互动动态

2505.18565v1

1899

05-24

Joint-stochastic-approximation Random Fields with Application to Semi-supervised Learning

Gelenk-Stochastische-Annäherung Random Fields mit Anwendung auf semi-überwachtes Lernen

应用到半监督学习的混合随机场

2505.20330v1

1900

05-24

Joint-stochastic-approximation Autoencoders with Application to Semi-supervised Learning

Gelenkstochastische Approximation Autoencoder mit Anwendung auf semi-überwachtes Lernen

应用到半监督学习的联合研究- 接近自动校方

2505.18558v1

1901

05-24

LAMDA: A Longitudinal Android Malware Benchmark for Concept Drift Analysis

LAMDA: Ein Longitudinal Android Malware Benchmark für Konzept Drift Analyse

LAMDA: 关于概念漂流分析的纵向和机器人毛毛虫基准

2505.18551v1

1902

05-24

ReflectGAN: Modeling Vegetation Effects for Soil Carbon Estimation from Satellite Imagery

ReflectGAN: Modellierung von Vegetationseffekten für Bodenkohlenstoffschätzungen aus Satellitenbildern

反射GAN:从卫星图像中模拟土壤碳估计的植被效应

2505.18546v1

1903

05-24

B-score: Detecting biases in large language models using response history

B-Score: Voreingenommenheit in großen Sprachmodellen anhand der Antworthistorie erkennen

B-序号:利用回应历史在大型语言模型中发现偏见

2505.18545v1

1904

05-24

Benchmarking Poisoning Attacks against Retrieval-Augmented Generation

Benchmarking von Giftangriffen gegen retrieval-angereicherte Generation

制定基准,确定对回收一代人进行中毒袭击的基准

2505.18543v1

1905

05-24

Mind Your Vision: Multimodal Estimation of Refractive Disorders Using Electrooculography and Eye Tracking

Denken Sie an Ihre Vision: Multimodale Abschätzung refraaktiver Störungen mittels Elektrookulographie und Eye Tracking

思考你的愿景:利用电光学和眼视跟踪对折发性失常进行多模式估计

2505.18538v1

1906

05-24

Convergence, Sticking and Escape: Stochastic Dynamics Near Critical Points in SGD

Konvergenz, Haft und Flucht: Stochastische Dynamik in der Nähe kritischer Punkte in SGD

聚合、粘合和逃离:SGD中近临界点的斯托卡动态

2505.18535v1

1907

05-24

CMoE: Converting Mixture-of-Experts from Dense to Accelerate LLM Inference

CMoE: Konvertieren von Mischungen von Experten aus Dense zu beschleunigter LLM-Inferenz

CMoE: 将混合专家从高能转换为加速LLM推理

2502.04416v2

1908

05-24

Preserving AUC Fairness in Learning with Noisy Protected Groups

AUC Fairness beim Lernen mit geräuschgeschützten Gruppen bewahren

维护AUC在与噪音保护群体学习中的公平公平

2505.18532v1

1909

05-24

SMART: Self-Aware Agent for Tool Overuse Mitigation

SMART: Self-Aware Agent für Tool Overuse Mitigation

SMART: 减少工具过度使用自智能剂

2502.11435v2

1910

05-24

Compositional Generalization via Forced Rendering of Disentangled Latents

Zusammensetzungelle Verallgemeinerung durch Zwangsverleumdung entwirrter Latente

通过强迫拆散的内流流流体

2501.18797v2

1911

05-24

CLaDMoP: Learning Transferrable Models from Successful Clinical Trials via LLMs

CLaDMoP: Übertragbare Modelle aus erfolgreichen klinischen Studien über LLMs lernen

CLADMOP:通过LLMs成功临床试验学习可转让模型

2505.18527v1

1912

05-24

Scalable Gaussian Processes with Low-Rank Deep Kernel Decomposition

Skalierbare Gauß-Prozesse mit niederrassiger Tiefenkernzersetzung

可缩放高斯进程,且低射深内核内核分解

2505.18526v1

1913

05-24

LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs

LiSTEN: Soft Token-Embeddings für neurale Audio-LLMs lernen

LiSTEN: 为神经音频LLM学习软令牌嵌入

2505.18517v1

1914

05-24

Test-Time Adaptation with Binary Feedback

Test-Zeit-Anpassung mit Binär-Feedback

带有二进制反馈的测试时间适应

2505.18514v1

1915

05-24

Enhancing Training Data Attribution with Representational Optimization

Verbesserung der Schulungsdatenzuweisung mit repräsentativer Optimierung

提高培训数据分配,优化代表性

2505.18513v1

1916

05-24

AcuRank: Uncertainty-Aware Adaptive Computation for Listwise Reranking

AcuRank: Ungewissheits-Bewusst-Adaptive-Computation für Listwise-Reranking

AcuRank: 列表排序的不确定性- 软件适应性计算

2505.18512v1

1917

05-24

SPDEBench: An Extensive Benchmark for Learning Regular and Singular Stochastic PDEs

SPDEBench: Ein umfassender Benchmark für das Lernen regelmäßiger und singulärer stochastischer PDEs

SPDEBENCH: 定期学习和单声速学项目的广泛基准

2505.18511v1

1918

05-24

How Particle System Theory Enhances Hypergraph Message Passing

Wie Partikelsystemtheorie die Hypergraph-Nachricht verbessert

粒子系统理论如何增强超光速消息传递

2505.18505v1

1919

05-24

Representation Learning with Mutual Influence of Modalities for Node Classification in Multi-Modal Heterogeneous Networks

Repräsentationslernen mit gegenseitigem Einfluss von Modalitäten für die Knotenklassifikation in multimodalen Heterogenen Netzwerken

多模式不同形式网络节点分类方式相互影响,代表学习

2505.07895v2

1920

05-24

LiDAR-EDIT: LiDAR Data Generation by Editing the Object Layouts in Real-World Scenes

LiDAR-EDIT: LiDAR-Datenerstellung durch Bearbeiten der Objektlayouts in realen Szenen

LiDAR-EDIT:通过在真实世界景点中编辑对象布局生成LIDAR数据

2412.00592v3

1921

05-24

EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents

EscapeBench: Auf dem Weg zu mehr kreativer Intelligenz von Sprachmodell-Agenten

逃避:努力推进语言示范代理的创意智能

2412.13549v2

1922

05-24

Perception-Informed Neural Networks: Beyond Physics-Informed Neural Networks

Wahrnehmungs-informierte neurale Netzwerke: Jenseits physikinformierter neuraler Netzwerke

感知内化神经网络:超越物理内化神经网络

2505.03806v2

1923

05-24

Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection

Gruppenadaptive Schwellenoptimierung für robuste KI-generierte Texterkennung

强力AI-发光的文本探测的集团-适应性阈值优化

2502.04528v4

1924

05-24

Knowledge Grafting of Large Language Models

Wissen Graften von großen Sprachmodellen

大语言模式知识转让

2505.18502v1

1925

05-24

MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning

MENTOR: Mixture-of-Experts-Netzwerk mit Task-Oriented Perturbation für visuelles Verstärkungslernen

MENTOR: 用于视觉强化学习的带任务导向扰动的专家混合网络

2410.14972v2

1926

05-24

G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning

G1: LLMs zur Vernunft bringen bei Diagrammen mit Verstärkungslernen

G1:在加强学习的图表方面向理性者传授法学硕士

2505.18499v1

1927

05-24

Quantum Feature Space of a Qubit Coupled to an Arbitrary Bath

Quanten-Feature-Raum eines Qubits in Verbindung mit einem willkürlichen Bad

与任意浴室结合的Qubit夫妇的量量地貌空间

2505.03397v3

1928

05-24

FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers

FuseGPT: Lernbare Ebenen Fusion generativer vortrainierter Transformer

FuseGPT: 训练前改造器的产生型先导变异器的可学习层融合

2411.14507v2

1929

05-24

Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking

Beyond Masked and Unmasked: Diskrete Diffusion Models via Partial Masking

超越遮盖和无遮盖:通过部分遮盖分解扩散模型

2505.18495v1

1930

05-24

FedHL: Federated Learning for Heterogeneous Low-Rank Adaptation via Unbiased Aggregation

FedHL: Föderiertes Lernen für heterogene Low-Rank-Anpassung durch unvoreingenommene Aggregation

FedHL:通过无偏聚合实现异构低秩适应的联邦学习

2505.18494v1

1931

05-24

TextArena

TextArena

TextArena 文本

2504.11442v2

1932

05-24

Statistical Inference under Performativity

Statistische Schlussfolgerung unter Performativität

性能下统计推断值

2505.18493v1

1933

05-24

Synthesizing and Adapting Error Correction Data for Mobile Large Language Model Applications

Synchronisieren und Anpassen von Fehlerkorrekturdaten für mobile Großsprachen-Modellanwendungen

合成和调整移动大语言模型应用错误校正数据

2505.18488v1

1934

05-24

Grounding Bodily Awareness in Visual Representations for Efficient Policy Learning

Bodily Bewusstsein in visuellen Darstellungen für effizientes politisches Lernen geerdet

提高政策学习效率的视觉表现方面的共同认识

2505.18487v1

1935

05-24

The Prompt is Mightier than the Example

Die Aufforderung ist mächtiger als das Beispiel

火急比例子更强

2505.18485v1

1936

05-24

DiffPuter: Empowering Diffusion Models for Missing Data Imputation

DiffPuter: Empowering Diffusion Modelle für fehlende Daten-Imputation

DiffPuter:增强扩散模型用于缺失数据插补

2405.20690v2

1937

05-24

Change Point Detection in the Frequency Domain with Statistical Reliability

Änderungspunkterkennung im Frequenzbereich mit statistischer Zuverlässigkeit

具有统计可靠性的频率域的更改点探测

2502.03062v2

1938

05-24

Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective

Sigmoid-Selbstaufmerksamkeit hat eine geringere Stichprobenkomplexität als Softmax-Selbstaufmerksamkeit: Eine Mixture-of-Experts-Perspektive

Sigmoid自注意力的样本复杂度低于Softmax自注意力:混合专家视角

2502.00281v2

1939

05-24

Provably Robust Training of Quantum Circuit Classifiers Against Parameter Noise

Beweisbar robustes Training von Quantenschaltkreis-Klassifikatoren gegen Parameterrauschen

针对参数噪声的量子电路分类器的可证明鲁棒训练

2505.18478v1

1940

05-24

CAPE: Covariate-Adjusted Pre-Training for Generalized Epidemic Time Series Forecasting

CAPE: Kovariaten-adjustiertes Vortraining für generalisierte epidemische Zeitreihenprognosen

CAPE: 通用流行病时间序列预测共同调整前培训

2502.03393v3

1941

05-24

Using Large Language Models to Tackle Fundamental Challenges in Graph Learning: A Comprehensive Survey

Große Sprachmodelle nutzen, um grundlegende Herausforderungen im Graphenlernen zu bewältigen: Eine umfassende Übersicht

使用大语言模式应对图表学习中的基本挑战:全面调查

2505.18475v1

1942

05-24

Performance and Generalizability Impacts of Incorporating Geolocation into Deep Learning for Dynamic PM2.5 Estimation

Leistung und Verallgemeinerbarkeit Auswirkungen der Einbeziehung von Geolocation in Deep Learning für dynamische PM2.5 Abschätzung

将地理定位纳入深度学习以进行动态PM2.5估算对性能和泛化能力的影响

2505.18461v1

1943

05-24

EdgeAgentX: A Novel Framework for Agentic AI at the Edge in Military Communication Networks

EdgeAgentX: Ein neuartiges Framework für Agentische KI am Rand in militärischen Kommunikationsnetzwerken

EdgeAgentX:军事通信网络边缘智能体AI的新框架

2505.18457v1

1944

05-24

On the Limitations and Possibilities of Nash Regret Minimization in Zero-Sum Matrix Games under Noisy Feedback

Über die Einschränkungen und Möglichkeiten der Nash Regret Minimierung in Zero-Sum Matrix Games unter Noisy Feedback

噪声反馈下零和矩阵博弈中纳什遗憾最小化的局限与可能

2306.13233v3

1945

05-24

Reinforcement Learning for Stock Transactions

Verstärkungslernen für Aktientransaktionen

证券交易强化学习

2505.16099v2

1946

05-24

Anchored Diffusion Language Model

Verankertes Diffusions-Sprachenmodell

锚定扩散语言模型

2505.18456v1

1947

05-24

On Minimax Estimation of Parameters in Softmax-Contaminated Mixture of Experts

Zur Minimax-Abschätzung von Parametern in Softmax-kontaminierter Mischung von Experten

关于Softmax 被污染的专家混合体参数最小估计

2505.18455v1

1948

05-24

$μ$-MoE: Test-Time Pruning as Micro-Grained Mixture-of-Experts

$μ$-MoE: Test-Time Pruning als Mikro-Grained Mixture-of-Experts

$μ$-MoE:将测试时剪枝视为微粒度混合专家

2505.18451v1

1949

05-24

Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting

Breaking Silos: Adaptive Modellfusion ermöglicht bessere Zeitreihenprognosen

打破孤岛:自适应模型融合实现更好的时间序列预测

2505.18442v1

1950

05-24

DB-KSVD: Scalable Alternating Optimization for Disentangling High-Dimensional Embedding Spaces

DB-KSVD: Skalierbare alternierende Optimierung für das Entwirren hochdimensionaler Einbettungsräume

DB-KSVD: 拆分高多元嵌入空间的可缩放变换最佳优化

2505.18441v1

1951

05-24

Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methods for Decentralized Multi-Agent Reinforcement Learning

Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methoden für dezentralisiertes Mehr-Agenten-Verstärkungs-Lernen

去中心化多智能体强化学习中深度神经Actor-Critic方法的有限时间全局最优性收敛

2505.18433v1

Article 0

Title@2025-05-29 (4): From Chat Logs to Collective Insights: Aggregative Question Answering

Title: From Chat Logs to Collective Insights: Aggregative Question Answering

Von Chat Logs zu Collective Insights: Aggregative Question Answering

从聊天日志到集体透视:聚合问题解答 2505.23765v1

Authors: Wentao Zhang, Woojeong Kim, Yuntian Deng

Conversational agents powered by large language models (LLMs) are rapidly becoming integral to our daily interactions, generating unprecedented amounts of conversational data. Such datasets offer a powerful lens into societal interests, trending topics, and collective concerns. Yet, existing approaches typically treat these interactions as independent and miss critical insights that could emerge from aggregating and reasoning across large-scale conversation logs. In this paper, we introduce Aggregative Question Answering, a novel task requiring models to reason explicitly over thousands of user-chatbot interactions to answer aggregative queries, such as identifying emerging concerns among specific demographics. To enable research in this direction, we construct a benchmark, WildChat-AQA, comprising 6,027 aggregative questions derived from 182,330 real-world chatbot conversations. Experiments show that existing methods either struggle to reason effectively or incur prohibitive computational costs, underscoring the need for new approaches capable of extracting collective insights from large-scale conversational data.

由大型语言模型(LLMs)驱动的交汇代理机构正在迅速成为我们日常互动的有机组成部分,产生前所未有的对话数据数量。这类数据集为社会利益、趋势话题和集体关注提供了强大的透镜。然而,现有方法通常将这些互动视为独立和缺乏从大规模对话日志的汇总和推理中可能产生的关键洞察力。在本文中,我们引入了聚合问题回答,这是一项新颖的任务,要求模型明确解释数千个用户-聊天机器人互动,以解答聚合问题,例如确定特定人口群中新出现的关切问题。为了能够进行这方面的研究,我们建立了一个基准,即WildChat-AQA,由182,330个实时聊天室对话产生的6,027个汇总问题组成。实验表明,现有的方法要么是试图有效解释,要么是产生令人望而望而却望而却步的计算成本,这突出表明需要采取新的方法,能够从大规模对话数据中获取集体见解。

Article 1

Title@2025-05-29 (4): Differential Information: An Information-Theoretic Perspective on Preference Optimization

Title: Differential Information: An Information-Theoretic Perspective on Preference Optimization

Differentialinformation: Eine informationstheoretische Perspektive zur Preference-Optimierung

差别信息:关于首选优化的信息理论观点 2505.23761v1

Authors: Yunjae Won, Hyunji Lee, Hyeonbin Hwang, Minjoon Seo

Direct Preference Optimization (DPO) has become a standard technique for aligning language models with human preferences in a supervised manner. Despite its empirical success, the theoretical justification behind its log-ratio reward parameterization remains incomplete. In this work, we address this gap by utilizing the Differential Information Distribution (DID): a distribution over token sequences that captures the information gained during policy updates. First, we show that when preference labels encode the differential information required to transform a reference policy into a target policy, the log-ratio reward in DPO emerges as the uniquely optimal form for learning the target policy via preference optimization. This result naturally yields a closed-form expression for the optimal sampling distribution over rejected responses. Second, we find that the condition for preferences to encode differential information is fundamentally linked to an implicit assumption regarding log-margin ordered policies, an inductive bias widely used in preference optimization yet previously unrecognized. Finally, by analyzing the entropy of the DID, we characterize how learning low-entropy differential information reinforces the policy distribution, while high-entropy differential information induces a smoothing effect, which explains the log-likelihood displacement phenomenon. We validate our theoretical findings in synthetic experiments and extend them to real-world instruction-following datasets. Our results suggest that learning high-entropy differential information is crucial for general instruction-following, while learning low-entropy differential information benefits knowledge-intensive question answering. Overall, our work presents a unifying perspective on the DPO objective, the structure of preference data, and resulting policy behaviors through the lens of differential information.

直接偏好优化(DPO)已成为以监督方式使语言模型与人类偏好对齐的标准技术。尽管在经验上取得了成功,其对数比率奖励参数化背后的理论依据仍不完整。在这项工作中,我们利用差分信息分布(DID)来填补这一空白:这是一个定义在词元序列上的分布,刻画了策略更新过程中获得的信息。首先,我们证明,当偏好标签编码了将参考策略转化为目标策略所需的差分信息时,DPO中的对数比率奖励是通过偏好优化学习目标策略的唯一最优形式。这一结果自然给出了被拒绝回复的最优采样分布的闭式表达式。其次,我们发现偏好编码差分信息的条件与一个关于对数间隔有序策略的隐含假设有根本联系,这是一种在偏好优化中被广泛使用但此前未被认识到的归纳偏置。最后,通过分析DID的熵,我们刻画了学习低熵差分信息如何强化策略分布,而高熵差分信息则产生平滑效应,这解释了对数似然位移现象。我们在合成实验中验证了理论发现,并将其推广到真实世界的指令跟随数据集。结果表明,学习高熵差分信息对通用指令跟随至关重要,而学习低熵差分信息有利于知识密集型问答。总体而言,我们的工作从差分信息的视角,为DPO目标、偏好数据的结构以及由此产生的策略行为提供了统一的观点。
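For reference, the log-ratio reward parameterization mentioned in the abstract is usually written as below. This is the standard DPO form rather than notation taken from this paper; the symbols $\beta$, $\pi_\theta$, $\pi_{\mathrm{ref}}$, $y_w$ (chosen response), and $y_l$ (rejected response) are assumptions introduced here for illustration.

```latex
r(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)},
\qquad
\mathcal{L}_{\mathrm{DPO}}
  = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}
    \left[ \log \sigma\big( r(x, y_w) - r(x, y_l) \big) \right]
```

The abstract's first claim concerns when this particular form of $r$ is the uniquely optimal choice; the sampling distribution over $y_l$ that the paper derives is not reproduced here.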

Article 2

Title@2025-05-29 (4): Model Immunization from a Condition Number Perspective

Title: Model Immunization from a Condition Number Perspective

Modellimmunisierung aus der Perspektive der Konditionszahl

从条件数角度看模型免疫 2505.23760v1

Authors: Amber Yijia Zheng, Cedar Site Bai, Brian Bullins, Raymond A. Yeh

Model immunization aims to pre-train models that are difficult to fine-tune on harmful tasks while retaining their utility on other non-harmful tasks. Though prior work has shown empirical evidence for immunizing text-to-image models, the key understanding of when immunization is possible and a precise definition of an immunized model remain unclear. In this work, we propose a framework, based on the condition number of a Hessian matrix, to analyze model immunization for linear models. Building on this framework, we design an algorithm with regularization terms to control the resulting condition numbers after pre-training. Empirical results on linear models and non-linear deep-nets demonstrate the effectiveness of the proposed algorithm on model immunization. The code is available at https://github.com/amberyzheng/model-immunization-cond-num.

示范免疫旨在为难以微调有害任务,同时保留其用于其他非有害任务的训练前模型。虽然先前的工作已经表明对文本到图像模型进行免疫的经验证据,但对于何时可能进行免疫的关键理解以及对免疫模式的准确定义仍然不明确。在这项工作中,我们提议了一个框架,以赫森矩阵的条件编号为基础,分析线性模型的免疫模式。在这个框架的基础上,我们设计了一个算法,以规范条款控制培训前产生的条件数字。线性模型和非线性深网的经验结果显示了拟议模式免疫算法的有效性。该代码可在https://github.com/amberyzheng/model-immunization-cond-num上查阅。
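To make the "condition number of a Hessian matrix" concrete, here is a minimal sketch for a linear model with a squared-error loss, where the Hessian is $X^\top X / n$ up to a constant factor. The loss, the synthetic data, and the function name `hessian_condition_number` are assumptions for illustration; the paper's regularization terms are not shown.

```python
import numpy as np


def hessian_condition_number(features):
    """Condition number of H = X^T X / n, the Hessian (up to a constant)
    of the squared-error loss of a linear model. Assumed loss, not the paper's."""
    num_samples = features.shape[0]
    hessian = features.T @ features / num_samples
    eigenvalues = np.linalg.eigvalsh(hessian)  # ascending order; H is symmetric
    return eigenvalues[-1] / eigenvalues[0]


rng = np.random.default_rng(0)
well_scaled = rng.standard_normal((500, 5))
# Shrinking one feature makes the Hessian badly conditioned.
badly_scaled = well_scaled * np.array([1.0, 1.0, 1.0, 1.0, 0.01])

kappa_well = hessian_condition_number(well_scaled)
kappa_bad = hessian_condition_number(badly_scaled)
print(f"well-scaled kappa: {kappa_well:.2f}, badly scaled kappa: {kappa_bad:.2f}")
```

Gradient descent on a quadratic objective converges more slowly as this number grows, which is one way to read "difficult to fine-tune" in the linear setting.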

Article 3

Title@2025-05-29 (4): Puzzled by Puzzles: When Vision-Language Models Can’t Take a Hint

Title: Puzzled by Puzzles: When Vision-Language Models Can’t Take a Hint

Verwirrt von Rätseln: Wenn Vision-Language-Modelle einen Hinweis nicht verstehen

由谜题拼取的谜题: 当视觉语言模型无法使用提示时 2505.23759v1

Authors: Heekyung Lee, Jiaxin Ge, Tsung-Han Wu, Minwoo Kang, Trevor Darrell, David M. Chan

Rebus puzzles, visual riddles that encode language through imagery, spatial arrangement, and symbolic substitution, pose a unique challenge to current vision-language models (VLMs). Unlike traditional image captioning or question answering tasks, rebus solving requires multi-modal abstraction, symbolic reasoning, and a grasp of cultural, phonetic and linguistic puns. In this paper, we investigate the capacity of contemporary VLMs to interpret and solve rebus puzzles by constructing a hand-generated and annotated benchmark of diverse English-language rebus puzzles, ranging from simple pictographic substitutions to spatially-dependent cues (“head” over “heels”). We analyze how different VLMs perform, and our findings reveal that while VLMs exhibit some surprising capabilities in decoding simple visual clues, they struggle significantly with tasks requiring abstract reasoning, lateral thinking, and understanding visual metaphors.

通过图像、空间安排和符号替代将语言编码成像的Rebus 拼图、视觉拼图、视觉拼图,对当前的视觉语言模型(VLM)构成了独特的挑战。与传统的图像字幕或问答任务不同,变复解决需要多式抽象、象征性推理以及掌握文化、语音和语言标语。在本文中,我们调查当代VLMs通过构建一个手动生成的和附加注释的多种英语复交拼图的基准来解释和解决变现拼图的能力,从简单的图像替代到空间依赖的提示(“头”到“耳目 ” 。我们分析了不同的VLMs是如何运作的,我们的发现表明,虽然VLMs在解码简单的视觉线索方面表现出一些惊人的能力,但是他们与需要抽象推理、横向思维和理解视觉比喻的任务进行了巨大的斗争。

Article 4

Title@2025-05-29 (4): REOrdering Patches Improves Vision Models

Title: REOrdering Patches Improves Vision Models

REOrdering Patches verbessert Vision Modelle

重新排列补丁改进愿景模式 2505.23751v1

Authors: Declan Kutscher, David M. Chan, Yutong Bai, Trevor Darrell, Ritwik Gupta

Sequence models such as transformers require inputs to be represented as one-dimensional sequences. In vision, this typically involves flattening images using a fixed row-major (raster-scan) order. While full self-attention is permutation-equivariant, modern long-sequence transformers increasingly rely on architectural approximations that break this invariance and introduce sensitivity to patch ordering. We show that patch order significantly affects model performance in such settings, with simple alternatives like column-major or Hilbert curves yielding notable accuracy shifts. Motivated by this, we propose REOrder, a two-stage framework for discovering task-optimal patch orderings. First, we derive an information-theoretic prior by evaluating the compressibility of various patch sequences. Then, we learn a policy over permutations by optimizing a Plackett-Luce policy using REINFORCE. This approach enables efficient learning in a combinatorial permutation space. REOrder improves top-1 accuracy over row-major ordering on ImageNet-1K by up to 3.01% and Functional Map of the World by 13.35%.

变压器等序列模型需要输入作为一维序列。在视觉中, 这通常涉及使用固定的行主( raster- scan) 排序平整图像。虽然完全自省是异位的, 现代长序变压器越来越依赖建筑近似, 打破这种偏差并引入修补顺序的灵敏度。我们显示补丁顺序会大大影响这种环境中的模型性能, 简单替代物, 如列主或希尔伯特曲线, 产生显著的精确性变。我们为此提议了 REODER, 是一个发现任务优化补丁排序的两阶段框架。首先, 我们先通过评估各种补差序列的可压缩性来获得信息理论性。然后, 我们通过利用 REINFORCE 优化 Plackett-Luce 政策来学习一种对调的政策。这种方法可以使调色空间中的有效学习。 REOrorder 能够提高图像Net-1 K 的顶级精度, 最高为3. 01% , 和 World 地图 13. 35% 。
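The two ingredients named in the abstract, fixed patch orderings and a Plackett-Luce distribution over permutations, can be sketched as follows. This is only an illustration under assumptions: the function names, the 2x3 grid, and the uniform scores are hypothetical, and the REINFORCE training loop and the compressibility prior are not shown.

```python
import math
import random


def row_major_order(rows, cols):
    """Patch indices in raster-scan order (left to right, top to bottom)."""
    return [r * cols + c for r in range(rows) for c in range(cols)]


def column_major_order(rows, cols):
    """Patch indices read column by column instead."""
    return [r * cols + c for c in range(cols) for r in range(rows)]


def sample_plackett_luce(scores, rng):
    """Sample a permutation by repeatedly picking one remaining item
    with probability proportional to exp(score)."""
    remaining = list(range(len(scores)))
    order = []
    while remaining:
        weights = [math.exp(scores[i]) for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        order.append(pick)
        remaining.remove(pick)
    return order


rng = random.Random(0)
grid_rows, grid_cols = 2, 3
row_order = row_major_order(grid_rows, grid_cols)
col_order = column_major_order(grid_rows, grid_cols)

# Equal scores give a uniform distribution over permutations; a learned
# policy would raise the scores of patches that should come earlier.
scores = [0.0] * (grid_rows * grid_cols)
sampled_order = sample_plackett_luce(scores, rng)
print(row_order, col_order, sampled_order)
```

Because the sampling probability of each permutation is a product of softmax terms, its log-probability is differentiable in the scores, which is what makes a policy-gradient method applicable.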

Article 5

Title@2025-05-29 (4): Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences?

Title: Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences?

Verzerrung der AI Alignment: Optimiert Preference Optimization für Preferences?

AI对齐的扭曲:偏好优化是否优化优惠? 2505.23749v1

Authors: Paul Gölz, Nika Haghtalab, Kunhe Yang

After pre-training, large language models are aligned with human preferences based on pairwise comparisons. State-of-the-art alignment methods (such as PPO-based RLHF and DPO) are built on the assumption of aligning with a single preference model, despite being deployed in settings where users have diverse preferences. As a result, it is not even clear that these alignment methods produce models that satisfy users on average – a minimal requirement for pluralistic alignment. Drawing on social choice theory and modeling users’ comparisons through individual Bradley-Terry (BT) models, we introduce an alignment method’s distortion: the worst-case ratio between the optimal achievable average utility, and the average utility of the learned policy. The notion of distortion helps draw sharp distinctions between alignment methods: Nash Learning from Human Feedback achieves the minimax optimal distortion of $(\frac{1}{2} + o(1)) \cdot \beta$ (for the BT temperature $\beta$), robustly across utility distributions, distributions of comparison pairs, and permissible KL divergences from the reference policy. RLHF and DPO, by contrast, suffer $\geq (1 - o(1)) \cdot \beta$ distortion already without a KL constraint, and $e^{\Omega(\beta)}$ or even unbounded distortion in the full setting, depending on how comparison pairs are sampled.

预训练之后,大型语言模型会基于成对比较与人类偏好对齐。最先进的对齐方法(如基于PPO的RLHF和DPO)建立在与单一偏好模型对齐的假设之上,尽管它们被部署在用户偏好各异的环境中。因此,甚至无法确定这些对齐方法得到的模型能否在平均意义上让用户满意,而这是多元对齐的最低要求。借助社会选择理论,并通过个体Bradley-Terry(BT)模型对用户的比较进行建模,我们引入了对齐方法的扭曲度(distortion):可达到的最优平均效用与所学策略的平均效用之间的最坏情况比值。扭曲度的概念有助于清晰地区分不同的对齐方法:Nash Learning from Human Feedback 达到了极小极大最优扭曲度 $(\frac{1}{2} + o(1)) \cdot \beta$(其中 $\beta$ 为BT温度),并且对效用分布、比较对的分布以及相对于参考策略所允许的KL散度均保持稳健。相比之下,RLHF和DPO在没有KL约束时就已经有 $\geq (1 - o(1)) \cdot \beta$ 的扭曲度,而在完整设定下扭曲度为 $e^{\Omega(\beta)}$ 甚至无界,具体取决于比较对的采样方式。
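A small sketch may help with the two quantities in the abstract: a Bradley-Terry comparison probability, and the ratio between the best achievable average utility and a policy's average utility. It rests on assumptions: $\beta$ is taken to multiply the utility gap, the population and policy are hypothetical, and the ratio is computed for a single instance without a KL constraint, whereas the paper's distortion is a worst case over instances.

```python
import math


def bt_preference_probability(utility_a, utility_b, beta):
    """Bradley-Terry probability that a user prefers a over b.
    Assumption: beta multiplies the utility gap."""
    return 1.0 / (1.0 + math.exp(-beta * (utility_a - utility_b)))


def average_utility(policy, user_utilities):
    """Mean over users of the expected utility of a policy,
    given as a dict mapping response -> probability."""
    per_user = [
        sum(prob * utilities[response] for response, prob in policy.items())
        for utilities in user_utilities
    ]
    return sum(per_user) / len(per_user)


def utility_ratio(policy, user_utilities):
    """Best achievable average utility divided by the policy's average utility.
    Average utility is linear in the policy, so a single response attains the best."""
    best = max(
        average_utility({response: 1.0}, user_utilities) for response in policy
    )
    return best / average_utility(policy, user_utilities)


# Hypothetical population: one user likes A, two users like B.
users = [{"A": 1.0, "B": 0.0}, {"A": 0.0, "B": 1.0}, {"A": 0.0, "B": 1.0}]
minority_policy = {"A": 1.0, "B": 0.0}

ratio = utility_ratio(minority_policy, users)
tie_probability = bt_preference_probability(1.0, 1.0, beta=2.0)
print(ratio, tie_probability)
```

A policy that always serves the minority's favorite response earns half of what the best policy would on this population; distortion asks how large that gap can become in the worst case for a given alignment method.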

Article 6

Title@2025-05-29 (4): Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Title: Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Raum-MLLM: Steigerung der MLLM-Kapazitäten in visueller räumlicher Intelligenz

空间-MLLM:增强以视觉为基础的空间情报中的MLLM能力 2505.23747v1

Authors: Diankun Wu, Fangfu Liu, Yi-Hsin Hung, Yueqi Duan

Recent advancements in Multimodal Large Language Models (MLLMs) have significantly enhanced performance on 2D visual tasks. However, improving their spatial intelligence remains a challenge. Existing 3D MLLMs always rely on additional 3D or 2.5D data to incorporate spatial awareness, restricting their utility in scenarios with only 2D inputs, such as images or videos. In this paper, we present Spatial-MLLM, a novel framework for visual-based spatial reasoning from purely 2D observations. Unlike conventional video MLLMs, which rely on CLIP-based visual encoders optimized for semantic understanding, our key insight is to unleash the strong structure prior from the feed-forward visual geometry foundation model. Specifically, we propose a dual-encoder architecture: a pretrained 2D visual encoder to extract semantic features, and a spatial encoder, initialized from the backbone of the visual geometry model, to extract 3D structure features. A connector then integrates both features into unified visual tokens for enhanced spatial understanding. Furthermore, we propose a space-aware frame sampling strategy at inference time, which selects the spatially informative frames of a video sequence, ensuring that even under limited token length, the model focuses on frames critical for spatial reasoning. Beyond architecture improvements, we construct the Spatial-MLLM-120k dataset and train the model on it using supervised fine-tuning and GRPO. Extensive experiments on various real-world datasets demonstrate that Spatial-MLLM achieves state-of-the-art performance in a wide range of visual-based spatial understanding and reasoning tasks. Project page: https://diankun-wu.github.io/Spatial-MLLM/.

多模态大语言模型(MLLM)的最新进展显著提升了其在2D视觉任务上的表现。然而,提升其空间智能仍是一项挑战。现有的3D MLLM总是依赖额外的3D或2.5D数据来引入空间感知,这限制了它们在只有2D输入(如图像或视频)的场景中的实用性。在本文中,我们提出Spatial-MLLM,这是一个仅从2D观测进行基于视觉的空间推理的新框架。传统视频MLLM依赖为语义理解而优化的基于CLIP的视觉编码器,与之不同,我们的核心思路是释放前馈视觉几何基础模型中的强结构先验。具体而言,我们提出一种双编码器架构:一个预训练的2D视觉编码器用于提取语义特征,一个由视觉几何模型主干初始化的空间编码器用于提取3D结构特征。随后,连接器将两类特征整合为统一的视觉词元,以增强空间理解。此外,我们提出一种在推理时使用的空间感知帧采样策略,它选取视频序列中富含空间信息的帧,确保即使在词元长度受限的情况下,模型也能关注对空间推理至关重要的帧。除架构改进之外,我们构建了Spatial-MLLM-120k数据集,并使用监督微调和GRPO在其上训练模型。在多个真实世界数据集上的大量实验表明,Spatial-MLLM在广泛的基于视觉的空间理解与推理任务中达到了最先进的性能。项目页面:https://diankun-wu.github.io/Spatial-MLLM/

Article 7

Title@2025-05-29 (4): To Trust Or Not To Trust Your Vision-Language Model’s Prediction

Title: To Trust Or Not To Trust Your Vision-Language Model’s Prediction

Vertrauen oder nicht Vertrauen in die Vorhersage Ihres Vision-Sprache-Modells

相信或不相信你的视觉语言模型的预测 2505.23745v1

Authors: Hao Dong, Moru Liu, Jian Liang, Eleni Chatzi, Olga Fink

Vision-Language Models (VLMs) have demonstrated strong capabilities in aligning visual and textual modalities, enabling a wide range of applications in multimodal understanding and generation. While they excel in zero-shot and transfer learning scenarios, VLMs remain susceptible to misclassification, often yielding confident yet incorrect predictions. This limitation poses a significant risk in safety-critical domains, where erroneous predictions can lead to severe consequences. In this work, we introduce TrustVLM, a training-free framework designed to address the critical challenge of estimating when VLM’s predictions can be trusted. Motivated by the observed modality gap in VLMs and the insight that certain concepts are more distinctly represented in the image embedding space, we propose a novel confidence-scoring function that leverages this space to improve misclassification detection. We rigorously evaluate our approach across 17 diverse datasets, employing 4 architectures and 2 VLMs, and demonstrate state-of-the-art performance, with improvements of up to 51.87% in AURC, 9.14% in AUROC, and 32.42% in FPR95 compared to existing baselines. By improving the reliability of the model without requiring retraining, TrustVLM paves the way for safer deployment of VLMs in real-world applications. The code will be available at https://github.com/EPFL-IMOS/TrustVLM.

视觉语言模型(VLM)在调和视觉和文字模型(VLM)方面展示了强大的能力,使视觉和文字模型(VLM)能够适应多种多式理解和生成的多种应用。虽然VLM在零射和传输学习情景中表现优异,但它们仍然容易被错误分类,往往产生自信但不正确的预测。这种限制在安全关键领域构成了巨大的风险,错误预测可能导致严重后果。在这项工作中,我们引入了信任VLM(VLLM),这是一个没有培训的框架,旨在应对在VLM预测可以信任时进行估算的重大挑战。受到VLMS中观察到的模式差距的激励,以及一些概念在图像嵌入空间中更明显地代表了某些概念的洞察力。我们提出一个新的信任分级功能,利用这一空间来改进对错误分类的检测。我们在17个不同的数据集中严格评价我们的方法,使用4个架构和2 VLMM(VLM),并展示最先进的表现,在AURC的51.87%、AURO(9.14%)和FPRM(M)中的32.42%(FM)比现有的基准更安全的部署要更加可靠。

Article 8

Title@2025-05-29 (4): On the Convergence Analysis of Muon

Title: On the Convergence Analysis of Muon

Zur Konvergenzanalyse von Muon

Muon的趋同分析 2505.23737v1

Authors: Wei Shen, Ruichuan Huang, Minhui Huang, Cong Shen, Jiawei Zhang

The majority of parameters in neural networks are naturally represented as matrices. However, most commonly used optimizers treat these matrix parameters as flattened vectors during optimization, potentially overlooking their inherent structural properties. Recently, an optimizer called Muon has been proposed, specifically designed to optimize matrix-structured parameters. Extensive empirical evidence shows that Muon can significantly outperform traditional optimizers when training neural networks. Nonetheless, the theoretical understanding of Muon’s convergence behavior and the reasons behind its superior performance remain limited. In this work, we present a comprehensive convergence rate analysis of Muon and its comparison with Gradient Descent (GD). We further characterize the conditions under which Muon can outperform GD. Our theoretical results reveal that Muon can benefit from the low-rank and approximate blockwise diagonal structure of Hessian matrices – phenomena widely observed in practical neural network training. Our experimental results support and corroborate the theoretical findings.

神经网络中的大多数参数自然以矩阵形式呈现。然而,最常用的优化器将这些矩阵参数视为优化过程中的平坦矢量,可能忽略了它们固有的结构特性。最近,提出了称为Muon的优化器,专门设计以优化矩阵结构参数。广泛的实证证据表明,Muon在培训神经网络时可以大大优于传统优化器。然而,对Muon趋同行为的理论理解及其优异性能背后的原因仍然有限。在这项工作中,我们对Muon的趋同率进行了全面分析,并与GD进行了比较。我们进一步确定了Muon能够超越GD的条件。我们的理论结果表明,Muon可以受益于海珊矩阵的低端和近乎成块的对角结构,这是在实际神经网络培训中广泛观察到的现象。我们的实验结果支持和证实了理论结论。

Article 9

Title@2025-05-29 (4): EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast

Title: EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast

EmotionRankCLAP: Bridging Natural Language Speaking Styles und Ordinal Speech Emotion via Rank-N-Contrast

EmotionRankCLAP:通过Rank-N-Contrast连接自然语言说话风格与序数语音情感 2505.23732v1

Authors: Shreeram Suresh Chandra, Lucas Goncalves, Junchen Lu, Carlos Busso, Berrak Sisman

Current emotion-based contrastive language-audio pretraining (CLAP) methods typically learn by naively aligning audio samples with corresponding text prompts. Consequently, this approach fails to capture the ordinal nature of emotions, hindering inter-emotion understanding and often resulting in a wide modality gap between the audio and text embeddings due to insufficient alignment. To handle these drawbacks, we introduce EmotionRankCLAP, a supervised contrastive learning approach that uses dimensional attributes of emotional speech and natural language prompts to jointly capture fine-grained emotion variations and improve cross-modal alignment. Our approach utilizes a Rank-N-Contrast objective to learn ordered relationships by contrasting samples based on their rankings in the valence-arousal space. EmotionRankCLAP outperforms existing emotion-CLAP methods in modeling emotion ordinality across modalities, measured via a cross-modal retrieval task.

以情感为基础的对比性语言-语言-语言预培训(CLAP)方法通常通过“将音频样本与相应的文本提示相匹配”来学习。因此,这一方法未能捕捉情绪的规律性,妨碍了情感之间的理解,并常常由于调整不力而导致音频和文字嵌入之间的模式差异很大。为了处理这些缺陷,我们引入了情感-Rank-语言预培训(CLAP),这是一种监督式的对比性学习方法,它使用情感言语和自然语言的维性属性,促进共同捕捉细微的情感变异,改善跨模式的对齐。我们的方法利用一个Rank-N-Contrast目标,通过对比其在价值-繁荣空间的排名来学习定型关系。情感- CLAP(EMERRank-CAP)比现有的情感- CLAP方法在模式上超越了现有的情感-常态模式模型化方法,通过跨模式的检索任务来衡量。

Article 10

Title@2025-05-29 (4): Keep Everyone Happy: Online Fair Division of Numerous Items with Few Copies

Title: Keep Everyone Happy: Online Fair Division of Numerous Items with Few Copies

Halten Sie alle glücklich: Online Fair Division von zahlreichen Artikeln mit wenigen Kopien

让人人快乐:许多物品的在线公平分会,只有很少的影印件。 2408.12845v2

Authors: Arun Verma, Indrajit Saha, Makoto Yokoo, Bryan Kian Hsiang Low

This paper considers a novel variant of the online fair division problem involving multiple agents in which a learner sequentially observes an indivisible item that has to be irrevocably allocated to one of the agents while satisfying a fairness and efficiency constraint. Existing algorithms assume a small number of items with a sufficiently large number of copies, which ensures a good utility estimation for all item-agent pairs from noisy bandit feedback. However, this assumption may not hold in many real-life applications, for example, an online platform that has a large number of users (items) who use the platform’s service providers (agents) only a few times (a few copies of items), which makes it difficult to accurately estimate utilities for all item-agent pairs. To address this, we assume utility is an unknown function of item-agent features. We then propose algorithms that model online fair division as a contextual bandit problem, with sub-linear regret guarantees. Our experimental results further validate the effectiveness of the proposed algorithms.

本文审议了在线公平分配问题的一个新变体,其中涉及多个代理商,学习者依次观察一个不可分割的项目,必须不可撤销地分配给其中的一个代理商,同时满足公平和效率方面的限制;现有的算法假设少数项目,其副本数量足够多,确保了对来自吵闹的土匪反馈的所有物品代理对的有用性评估;然而,这一假设可能在许多现实应用中并不具备,例如,一个使用平台服务提供商(代理商)的用户数量众多的在线平台(项目)只有几次(项目份数不多),因此难以准确估计所有物品代理商的公用事业。为了解决这一问题,我们假设物品代理商的功能是未知的。我们然后提出一种算法,将在线公平划分模式作为背景的土匪问题,并附带线性遗憾保证。我们的实验结果进一步验证了拟议算法的有效性。

Article 11

Title@2025-05-29 (4): MuLoCo: Muon is a practical inner optimizer for DiLoCo

Title: MuLoCo: Muon is a practical inner optimizer for DiLoCo

MuLoCo: Muon ist ein praktischer Innenoptimierer für DiLoCo

MuLoCo: Muon 是 DiLoCo 的实用内部优化器 2505.23725v1

Authors: Benjamin Thérien, Xiaolong Huang, Irina Rish, Eugene Belilovsky

DiLoCo is a powerful framework for training large language models (LLMs) under networking constraints with advantages for increasing parallelism and accelerator utilization in data center settings. Despite significantly reducing communication frequency, however, DiLoCo’s communication steps still involve all-reducing a complete copy of the model’s parameters. While existing works have explored ways to reduce communication in DiLoCo, the role of error feedback accumulators and the effect of the inner-optimizer on compressibility remain under-explored. In this work, we investigate the effectiveness of standard compression methods including Top-k sparsification and quantization for reducing the communication overhead of DiLoCo when paired with two local optimizers (AdamW and Muon). Our experiments pre-training decoder-only transformer language models (LMs) reveal that leveraging Muon as the inner optimizer for DiLoCo along with an error-feedback accumulator allows the communicated delta to be aggressively compressed to 2 bits with next to no performance degradation. Crucially, MuLoCo (Muon inner optimizer DiLoCo) significantly outperforms DiLoCo while communicating 8x less and having identical memory complexity.

DILOCO是一个强大的框架,用于在网络制约下培训大型语言模型(LLMS),其优势在于增加在数据中心环境中的平行和加速利用。尽管通信频率显著降低,但DILOCO的通信步骤仍然涉及全面减少该模型参数的完整副本。虽然现有工作探索了减少DILOCO的通信的方法,但错误反馈累积器的作用以及内装节能器对压缩作用的影响仍然未得到充分探讨。在这项工作中,我们调查标准压缩方法的有效性,包括高空透析和量化,以减少DILOCO与两个当地优化器(AdamW和Muon)的通信管理费。我们的实验前训练只使用变压器语言模型(LMS)显示,利用Muon作为DILOCO的内部优化器,以及错误反馈累积器的作用,能够将传送的三角盘压缩到2位,而下一个是无性能退化。关键是, Muloco(MUon 内部优化存储器与DLOLO的复杂度小于DLO),大大超越了DLOC。
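Top-k sparsification with an error-feedback accumulator, the combination the abstract studies, can be sketched in a few lines. The class and function names here are hypothetical, and the sketch covers only a single worker's compression step; the all-reduce, the quantization path, and the inner optimizers (AdamW, Muon) are not shown.

```python
import numpy as np


def top_k_sparsify(vector, k):
    """Keep the k largest-magnitude entries and zero out the rest."""
    sparse = np.zeros_like(vector)
    if k > 0:
        keep = np.argpartition(np.abs(vector), -k)[-k:]
        sparse[keep] = vector[keep]
    return sparse


class ErrorFeedbackCompressor:
    """Adds back what earlier rounds dropped before compressing again."""

    def __init__(self, size, k):
        self.residual = np.zeros(size)
        self.k = k

    def compress(self, delta):
        corrected = delta + self.residual
        sent = top_k_sparsify(corrected, self.k)
        self.residual = corrected - sent  # remember what was not communicated
        return sent


compressor = ErrorFeedbackCompressor(size=4, k=1)
first_sent = compressor.compress(np.array([0.5, -2.0, 0.3, 0.1]))
second_sent = compressor.compress(np.array([0.5, 0.0, 0.3, 0.1]))
print(first_sent, second_sent, compressor.residual)
```

In the second round the first coordinate is sent with value 1.0 rather than 0.5, because the 0.5 dropped in the first round was carried over in the residual. This is how error feedback avoids permanently discarding small updates.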

Article 12

Title@2025-05-29 (4): SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA

Title: SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA

SC-LoRA: Ausbalancieren von effizientem Fine-Tuning und Wissenserhaltung über Subraum-beschränktes LoRA

SC-LoRA:通过子空间约束LoRA平衡高效微调与知识保留 2505.23724v1

Authors: Minrui Luo, Fuhang Kuang, Yu Wang, Zirui Liu, Tianxing He

Parameter-Efficient Fine-Tuning (PEFT) methods, particularly Low-Rank Adaptation (LoRA), are indispensable for efficiently customizing Large Language Models (LLMs). However, vanilla LoRA suffers from slow convergence speed and knowledge forgetting problems. Recent studies have leveraged the power of designed LoRA initialization, to enhance the fine-tuning efficiency, or to preserve knowledge in the pre-trained LLM. However, none of these works can address the two cases at the same time. To this end, we introduce Subspace-Constrained LoRA (SC-LoRA), a novel LoRA initialization framework engineered to navigate the trade-off between efficient fine-tuning and knowledge preservation. We achieve this by constraining the output of trainable LoRA adapters in a low-rank subspace, where the context information of fine-tuning data is most preserved while the context information of preserved knowledge is least retained, in a balanced way. Such constraint enables the trainable weights to primarily focus on the main features of fine-tuning data while avoiding damaging the preserved knowledge features. We provide theoretical analysis on our method, and conduct extensive experiments including safety preservation and world knowledge preservation, on various downstream tasks. In our experiments, SC-LoRA succeeds in delivering superior fine-tuning performance while markedly diminishing knowledge forgetting, surpassing contemporary LoRA initialization methods.

参数高效微调(PEFT)方法,特别是低秩适应(LoRA),对于高效定制大语言模型(LLM)不可或缺。然而,原始LoRA存在收敛速度慢和知识遗忘的问题。近期研究利用精心设计的LoRA初始化来提高微调效率,或保留预训练LLM中的知识,但这些工作都无法同时兼顾两者。为此,我们提出子空间约束LoRA(SC-LoRA),这是一个新的LoRA初始化框架,旨在权衡高效微调与知识保留。我们将可训练LoRA适配器的输出约束在一个低秩子空间中,在该子空间内,微调数据的上下文信息被最大程度地保留,而待保留知识的上下文信息被最小程度地保留,二者以平衡的方式兼顾。这种约束使可训练权重主要关注微调数据的主要特征,同时避免破坏待保留的知识特征。我们对方法进行了理论分析,并在多种下游任务上开展了大量实验,包括安全性保留和世界知识保留。实验中,SC-LoRA在取得更优微调性能的同时显著减少了知识遗忘,超越了当前的LoRA初始化方法。

Article 13

Title@2025-05-29 (4): ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering

Title: ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering

ML-Agent: Verstärkung von LLM-Agenten für autonomes Machine-Learning-Engineering

ML-Agent:强化LLM智能体以实现自主机器学习工程 2505.23723v1

Authors: Zexi Liu, Jingyi Chai, Xinyu Zhu, Shuo Tang, Rui Ye, Bo Zhang, Lei Bai, Siheng Chen

The emergence of large language model (LLM)-based agents has significantly advanced the development of autonomous machine learning (ML) engineering. However, most existing approaches rely heavily on manual prompt engineering, failing to adapt and optimize based on diverse experimental experiences. Focusing on this, for the first time, we explore the paradigm of learning-based agentic ML, where an LLM agent learns through interactive experimentation on ML tasks using online reinforcement learning (RL). To realize this, we propose a novel agentic ML training framework with three key components: (1) exploration-enriched fine-tuning, which enables LLM agents to generate diverse actions for enhanced RL exploration; (2) step-wise RL, which enables training on a single action step, accelerating experience collection and improving training efficiency; (3) an agentic ML-specific reward module, which unifies varied ML feedback signals into consistent rewards for RL optimization. Leveraging this framework, we train ML-Agent, driven by a 7B-sized Qwen-2.5 LLM for autonomous ML. Remarkably, despite being trained on merely 9 ML tasks, our 7B-sized ML-Agent outperforms the 671B-sized DeepSeek-R1 agent. Furthermore, it achieves continuous performance improvements and demonstrates exceptional cross-task generalization capabilities.

基于大语言模型(LLM)的智能体的出现显著推动了自主机器学习(ML)工程的发展。然而,现有方法大多严重依赖人工提示工程,无法根据多样的实验经验进行适应和优化。针对这一点,我们首次探索了基于学习的智能体式ML范式,其中LLM智能体通过在线强化学习(RL)在ML任务上进行交互式实验来学习。为此,我们提出了一个新的智能体式ML训练框架,包含三个关键组成部分:(1)探索增强的微调,使LLM智能体能够生成多样的动作以增强RL探索;(2)逐步RL,支持在单个动作步骤上进行训练,从而加速经验收集并提高训练效率;(3)面向智能体式ML的奖励模块,将各种ML反馈信号统一为一致的奖励用于RL优化。基于这一框架,我们训练了ML-Agent,它由7B规模的Qwen-2.5 LLM驱动,用于自主ML。值得注意的是,尽管仅在9个ML任务上训练,我们7B规模的ML-Agent仍优于671B规模的DeepSeek-R1智能体。此外,它还能实现持续的性能提升,并展现出出色的跨任务泛化能力。

Article 14

Title@2025-05-29 (4): Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields

Title: Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields

Verteilungsverschiebungen bei Kraftfeldern des maschinellen Lernens verstehen und abmildern

理解并缓解机器学习力场的分布偏移 2503.08674v2

Authors: Tobias Kreiman, Aditi S. Krishnapriyan

Machine Learning Force Fields (MLFFs) are a promising alternative to expensive ab initio quantum mechanical molecular simulations. Given the diversity of chemical spaces that are of interest and the cost of generating new data, it is important to understand how MLFFs generalize beyond their training distributions. In order to characterize and better understand distribution shifts in MLFFs, we conduct diagnostic experiments on chemical datasets, revealing common shifts that pose significant challenges, even for large foundation models trained on extensive data. Based on these observations, we hypothesize that current supervised training methods inadequately regularize MLFFs, resulting in overfitting and learning poor representations of out-of-distribution systems. We then propose two new methods as initial steps for mitigating distribution shifts for MLFFs. Our methods focus on test-time refinement strategies that incur minimal computational cost and do not use expensive ab initio reference labels. The first strategy, based on spectral graph theory, modifies the edges of test graphs to align with graph structures seen during training. Our second strategy improves representations for out-of-distribution systems at test-time by taking gradient steps using an auxiliary objective, such as a cheap physical prior. Our test-time refinement strategies significantly reduce errors on out-of-distribution systems, suggesting that MLFFs are capable of and can move towards modeling diverse chemical spaces, but are not being effectively trained to do so. Our experiments establish clear benchmarks for evaluating the generalization capabilities of the next generation of MLFFs. Our code is available at https://tkreiman.github.io/projects/mlff_distribution_shifts/.

机器学习力场(MLFFs)是昂贵的从头算量子力学分子模拟的一种有前景的替代方案。鉴于所关注的化学空间的多样性以及生成新数据的成本,了解MLFFs如何泛化到训练分布之外十分重要。为了刻画并更好地理解MLFFs中的分布偏移,我们在化学数据集上进行诊断性实验,揭示了一些常见的偏移,即使对于在大量数据上训练的大型基础模型,它们也构成重大挑战。基于这些观察,我们假设当前的监督训练方法对MLFFs的正则化不足,导致过拟合,并对分布外系统学到较差的表示。随后,我们提出两种新方法,作为缓解MLFFs分布偏移的初步尝试。我们的方法侧重于测试时优化策略,这些策略计算成本极低,且不使用昂贵的从头算参考标签。第一种策略基于谱图理论,修改测试图的边,使其与训练中见过的图结构对齐。第二种策略在测试时利用辅助目标(例如廉价的物理先验)进行梯度更新,以改进分布外系统的表示。我们的测试时优化策略显著降低了分布外系统上的误差,这表明MLFFs有能力并且可以朝着建模多样化学空间的方向发展,但目前尚未得到有效的训练。我们的实验为评估下一代MLFFs的泛化能力建立了明确的基准。我们的代码发布于 https://tkreiman.github.io/projects/mlff_distribution_shifts/

Article 15

Title@2025-05-29 (4): DiffER: Categorical Diffusion for Chemical Retrosynthesis

Title: DiffER: Categorical Diffusion for Chemical Retrosynthesis

DiffER: Kategorische Diffusion für chemische Retrosynthese

DiffER: 用于化学逆合成的类别扩散 2505.23721v1

Authors: Sean Current, Ziqi Chen, Daniel Adu-Ampratwum, Xia Ning, Srinivasan Parthasarathy

Methods for automatic chemical retrosynthesis have found recent success through the application of models traditionally built for natural language processing, primarily through transformer neural networks. These models have demonstrated significant ability to translate between the SMILES encodings of chemical products and reactants, but are constrained as a result of their autoregressive nature. We propose DiffER, an alternative template-free method for retrosynthesis prediction in the form of categorical diffusion, which allows the entire output SMILES sequence to be predicted in unison. We construct an ensemble of diffusion models which achieves state-of-the-art performance for top-1 accuracy and competitive performance for top-3, top-5, and top-10 accuracy among template-free methods. We prove that DiffER is a strong baseline for a new class of template-free model, capable of learning a variety of synthetic techniques used in laboratory settings and outperforming a variety of other template-free methods on top-k accuracy metrics. By constructing an ensemble of categorical diffusion models with a novel length prediction component with variance, our method is able to approximately sample from the posterior distribution of reactants, producing results with strong metrics of confidence and likelihood. Furthermore, our analyses demonstrate that accurate prediction of the SMILES sequence length is key to further boosting the performance of categorical diffusion models.

通过应用传统上为自然语言处理而建造的模型,主要是通过变压器神经网络,自动化学复古法方法最近取得了成功。这些模型展示了在化学产品和反应器SMILES编码之间翻译化学产品和反应器SMILES编码的巨大能力,但因其自反性质而受到限制。我们提议DiffER,一种不使用模板的替代反转合成预测方法,即以绝对扩散的形式进行反转合成预测的替代方法,它使得整个输出SMILES序列能够以一致的方式预测。我们建造了一个综合的传播模型,能够达到顶层3、顶层5和顶层10级无模板方法的最先进的精确性能。我们证明,DiffER是新型无模板模型的强大基准,能够学习实验室环境中使用的各种合成技术,在顶层精确度度测量仪上,超过其他不使用模板的方法。通过构建一个带有新长度预测组件的绝对扩散模型,我们的方法能够从反应器的后端分布到顶端3级的竞争性性能,我们用精确度模型的精确度分析结果进一步展示。
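
As a rough illustration of the categorical diffusion idea in the abstract above, the following sketch corrupts a tokenized SMILES string with a uniform-transition forward process: each token is kept with probability `keep_prob` and otherwise resampled uniformly from the vocabulary. The function name, the toy vocabulary, and the uniform-noise choice are assumptions made for illustration; the abstract does not say which transition kernel DiffER uses.

```python
import random

# Toy SMILES-like vocabulary (hypothetical; a real tokenizer would be larger).
VOCAB = ["C", "N", "O", "(", ")", "=", "1"]

def forward_noise(tokens, keep_prob, rng, vocab=VOCAB):
    """Sample x_t given x_0 under uniform categorical noise.

    Each position independently keeps its token with probability keep_prob
    and is otherwise replaced by a token drawn uniformly from vocab.
    """
    return [tok if rng.random() < keep_prob else rng.choice(vocab)
            for tok in tokens]

rng = random.Random(0)
x0 = list("CC(=O)O")                    # acetic acid, one character per token
x_clean = forward_noise(x0, 1.0, rng)   # keep_prob = 1: nothing is corrupted
x_noisy = forward_noise(x0, 0.0, rng)   # keep_prob = 0: every token is resampled
```

Because this kind of corruption acts position by position, the sequence length never changes during noising or denoising, which is consistent with the abstract treating length prediction as a separate component.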

Article 16

Title@2025-05-29 (4): COBRA: Contextual Bandit Algorithm for Ensuring Truthful Strategic Agents

Title: COBRA: Contextual Bandit Algorithm for Ensuring Truthful Strategic Agents

COBRA: Kontextueller Bandit-Algorithmus für die Sicherung wahrheitsgetreuer strategischer Agenten

COBRA: 确保策略性智能体如实报告的上下文赌博机算法 2505.23720v1

Authors: Arun Verma, Indrajit Saha, Makoto Yokoo, Bryan Kian Hsiang Low

This paper considers a contextual bandit problem involving multiple agents, where a learner sequentially observes the contexts and the agents’ reported arms, and then selects the arm that maximizes the system’s overall reward. Existing work in contextual bandits assumes that agents truthfully report their arms, which is unrealistic in many real-life applications. For instance, consider an online platform with multiple sellers; some sellers may misrepresent product quality to gain an advantage, such as having the platform preferentially recommend their products to online users. To address this challenge, we propose COBRA, an algorithm for contextual bandit problems involving strategic agents, which disincentivizes their strategic behavior without using any monetary incentives while providing incentive compatibility and a sub-linear regret guarantee. Our experimental results also validate the different performance aspects of our proposed algorithm.

本文考虑了涉及多个代理商的背景强盗问题, 学习者按顺序观察背景和代理商报告的武器,然后选择能最大限度地提高系统总体报酬的手臂。背景强盗的现有工作假设代理商真实地报告其武器,这在许多现实生活中是不现实的。例如, 考虑一个有多个销售者的在线平台; 一些销售者可能会为了获得好处而歪曲产品质量, 比如让平台优先向在线用户推荐产品。为了应对这一挑战, 我们建议使用一种算法, COBRA, 用于涉及战略代理商的背景强盗问题, 这些战略代理商在不使用任何货币奖励的情况下不鼓励其战略行为,同时具有激励兼容性和亚线性遗憾保证。我们的实验结果还验证了我们提议的算法的不同性。

Article 17

Title@2025-05-29 (4): FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control

Title: FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control

FastTD3: Einfaches, schnelles und fähiges Verstärkungslernen für die humanoide Kontrolle

FastTD3: 面向人形机器人控制的简单、快速且强大的强化学习 2505.22642v2

Authors: Younggyo Seo, Carmelo Sferrazza, Haoran Geng, Michal Nauman, Zhao-Heng Yin, Pieter Abbeel

Reinforcement learning (RL) has driven significant progress in robotics, but its complexity and long training times remain major bottlenecks. In this report, we introduce FastTD3, a simple, fast, and capable RL algorithm that significantly speeds up training for humanoid robots in popular suites such as HumanoidBench, IsaacLab, and MuJoCo Playground. Our recipe is remarkably simple: we train an off-policy TD3 agent with several modifications – parallel simulation, large-batch updates, a distributional critic, and carefully tuned hyperparameters. FastTD3 solves a range of HumanoidBench tasks in under 3 hours on a single A100 GPU, while remaining stable during training. We also provide a lightweight and easy-to-use implementation of FastTD3 to accelerate RL research in robotics.

强化学习(RL)催生了机器人方面的重大进步,但其复杂性和漫长的培训时间仍然是主要的瓶颈。在本报告中,我们引入了快速TD3, 一种简单、快速、有能力的RL算法,大大加快了人类机器人在流行套房的培训,如人形堡、IsaacLab和Mujoco游乐场。我们的配方非常简单:我们培训了一个脱离政策的TD3代理,并进行了若干修改 – – 平行模拟、大批量更新、分布式评论和仔细调控的超参数。快速TD3在3小时内解决了单个A100 GPU上的一系列人形堡任务,同时在培训期间保持稳定。我们还提供了轻量和易于使用的快速TD3,以加速机器人的RL研究。
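
For readers unfamiliar with the TD3 base algorithm named in the abstract, here is a minimal sketch of the standard TD3 critic target (clipped double-Q learning with target policy smoothing) for a scalar action. It does not reproduce FastTD3's distributional critic, parallel simulation, or large-batch updates; the callables `actor_target`, `q1_target`, and `q2_target` and all default hyperparameters are placeholders chosen for illustration.

```python
import random

def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))

def td3_target(reward, done, next_state, actor_target, q1_target, q2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0,
               rng=random):
    """Standard TD3 bootstrap target for one transition (scalar action)."""
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(actor_target(next_state) + noise, -act_limit, act_limit)
    # Clipped double-Q: bootstrap from the smaller of the two target critics.
    q_min = min(q1_target(next_state, next_action),
                q2_target(next_state, next_action))
    # No bootstrapping past a terminal transition (done = 1).
    return reward + gamma * (1.0 - done) * q_min

# Toy usage with constant critics so the result is easy to verify by hand.
actor = lambda s: 0.0
q1 = lambda s, a: 2.0
q2 = lambda s, a: 3.0
y_running = td3_target(1.0, 0.0, None, actor, q1, q2, gamma=0.5)
y_terminal = td3_target(1.0, 1.0, None, actor, q1, q2, gamma=0.5)
```

Taking the minimum of two critics is the part of TD3 that counters value overestimation; the modifications listed in the abstract sit on top of this target rather than replacing it.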

Article 18

Title@2025-05-29 (4): TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning

Title: TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning

TiRex: Zero-Shot-Prognosen über lange und kurze Horizonte mit verbessertem In-Context-Lernen

TiRex: 利用增强的上下文学习实现长短时域的零样本预测 2505.23719v1

Authors: Andreas Auer, Patrick Podest, Daniel Klotz, Sebastian Böck, Günter Klambauer, Sepp Hochreiter

In-context learning, the ability of large language models to perform tasks using only examples provided in the prompt, has recently been adapted for time series forecasting. This paradigm enables zero-shot prediction, where past values serve as context for forecasting future values, making powerful forecasting tools accessible to non-experts and increasing the performance when training data are scarce. Most existing zero-shot forecasting approaches rely on transformer architectures, which, despite their success in language, often fall short of expectations in time series forecasting, where recurrent models like LSTMs frequently have the edge. Conversely, while LSTMs are well-suited for time series modeling due to their state-tracking capabilities, they lack strong in-context learning abilities. We introduce TiRex that closes this gap by leveraging xLSTM, an enhanced LSTM with competitive in-context learning skills. Unlike transformers, state-space models, or parallelizable RNNs such as RWKV, TiRex retains state-tracking, a critical property for long-horizon forecasting. To further facilitate its state-tracking ability, we propose a training-time masking strategy called CPM. TiRex sets a new state of the art in zero-shot time series forecasting on the HuggingFace benchmarks GiftEval and Chronos-ZS, outperforming significantly larger models including TabPFN-TS (Prior Labs), Chronos Bolt (Amazon), TimesFM (Google), and Moirai (Salesforce) across both short- and long-term forecasts.

上下文学习,即大型语言模型仅凭提示中给出的示例执行任务的能力,最近已被用于时间序列预测。这一范式支持零样本预测,即以过去的数值作为预测未来数值的上下文,使非专家也能使用强大的预测工具,并在训练数据稀缺时提升性能。现有的零样本预测方法大多依赖Transformer架构,尽管其在语言领域取得了成功,但在时间序列预测中往往达不到预期,而LSTM等循环模型在该领域常常占优。反过来,LSTM虽因具备状态跟踪能力而非常适合时间序列建模,却缺乏强大的上下文学习能力。我们提出TiRex,它利用xLSTM(一种具有有竞争力的上下文学习能力的增强型LSTM)来弥补这一差距。与Transformer、状态空间模型或RWKV等可并行化的RNN不同,TiRex保留了状态跟踪能力,这是长时域预测的关键特性。为进一步增强其状态跟踪能力,我们提出了一种名为CPM的训练时掩码策略。TiRex在HuggingFace基准GiftEval和Chronos-ZS上刷新了零样本时间序列预测的最新水平,在短期和长期预测上均优于规模大得多的模型,包括TabPFN-TS(Prior Labs)、Chronos Bolt(Amazon)、TimesFM(Google)和Moirai(Salesforce)。

Article 19

Title@2025-05-29 (4): Foundation Model Hidden Representations for Heart Rate Estimation from Auscultation

Title: Foundation Model Hidden Representations for Heart Rate Estimation from Auscultation

Verborgene Repräsentationen von Foundation-Modellen für die Herzfrequenzschätzung aus der Auskultation

用于听诊心率估计的基础模型隐藏表示 2505.20745v2

Authors: Jingping Nie, Dung T. Tran, Karan Thakkar, Vasudha Kowtha, Jon Huang, Carlos Avendano, Erdrin Azemi, Vikramjit Mitra

Auscultation, particularly heart sound, is a non-invasive technique that provides essential vital sign information. Recently, self-supervised acoustic representation foundation models (FMs) have been proposed to offer insights into acoustics-based vital signs. However, there has been little exploration of the extent to which auscultation is encoded in these pre-trained FM representations. In this work, using a publicly available phonocardiogram (PCG) dataset and a heart rate (HR) estimation model, we conduct a layer-wise investigation of six acoustic representation FMs: HuBERT, wav2vec2, wavLM, Whisper, Contrastive Language-Audio Pretraining (CLAP), and an in-house CLAP model. Additionally, we implement the baseline method from Nie et al., 2024 (which relies on acoustic features) and show that overall, representation vectors from pre-trained foundation models (FMs) offer comparable performance to the baseline. Notably, HR estimation using the representations from the audio encoder of the in-house CLAP model outperforms the results obtained from the baseline, achieving a lower mean absolute error (MAE) across various train/validation/test splits despite the domain mismatch.

听诊,尤其是心音听诊,是一种提供重要生命体征信息的非侵入性技术。最近,自监督声学表示基础模型(FM)被提出用于洞察基于声学的生命体征。然而,对于听诊信息在这些预训练FM表示中被编码到何种程度,目前探索甚少。在这项工作中,我们使用公开的心音图(PCG)数据集和心率(HR)估计模型,对六种声学表示FM进行逐层研究:HuBERT、wav2vec2、wavLM、Whisper、对比语言-音频预训练(CLAP)以及一个内部CLAP模型。此外,我们实现了Nie等人(2024)的基线方法(依赖声学特征),并表明总体而言,预训练基础模型(FM)的表示向量可提供与基线相当的性能。值得注意的是,尽管存在领域不匹配,使用内部CLAP模型音频编码器的表示进行HR估计的效果优于基线,在多种训练/验证/测试划分上取得了更低的平均绝对误差(MAE)。

Article 20

Title: Skin Lesion Phenotyping via Nested Multi-modal Contrastive Learning

Hautläsions-Phänotypisierung mittels verschachteltem multimodalem kontrastivem Lernen

通过嵌套多模态对比学习进行皮肤病变表型分析 2505.23709v1

Authors: Dionysis Christopoulos, Sotiris Spanos, Eirini Baltzi, Valsamis Ntouskos, Konstantinos Karantzalos

We introduce SLIMP (Skin Lesion Image-Metadata Pre-training) for learning rich representations of skin lesions through a novel nested contrastive learning approach that captures complex relationships between images and metadata. Melanoma detection and skin lesion classification based solely on images pose significant challenges due to large variations in imaging conditions (lighting, color, resolution, distance, etc.) and the lack of clinical and phenotypical context. Clinicians typically follow a holistic approach for assessing the risk level of the patient and for deciding which lesions may be malignant and need to be excised, by considering the patient’s medical history as well as the appearance of the patient’s other lesions. Inspired by this, SLIMP combines the appearance and the metadata of individual skin lesions with patient-level metadata relating to their medical record and other clinically relevant information. By fully exploiting all available data modalities throughout the learning process, the proposed pre-training strategy improves performance over other pre-training strategies on downstream skin lesion classification tasks, highlighting the quality of the learned representations.

我们引入了SLIMP(皮肤悬浮图像-元数据预培训),以便通过一种新颖的巢状对比式学习方法来了解丰富的皮肤损伤表现,该方法捕捉到图像和元数据之间的复杂关系;仅仅基于图像的皮肤瘤检测和皮肤损伤分类,由于成像条件(亮光、颜色、分辨率、距离等)和缺乏临床和临床信息等)的巨大差异而构成重大挑战;临床医生通常采取综合办法,评估病人的风险程度,确定哪些损伤可能是恶性,哪些需要切除,考虑到病人的医学史以及病人其他损伤的外观;受此启发,SLIMP将个别皮肤损伤的外观和元数据与病人病历和其他临床相关信息的元数据结合起来;在整个学习过程中,充分利用所有可用的数据模式,拟议的培训前战略将提高业绩,而与其他培训前战略相比,提高下游皮肤损伤分类工作的业绩,而培训前战略则强调所了解的面貌质量。

Article 21

Title@2025-05-29 (4): Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better

Title: Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better

Wissensisolierende Vision-Sprache-Action-Modelle: Schnell trainieren, schnell laufen, besser generalisieren

知识隔离的视觉-语言-动作模型:训练快、运行快、泛化更好 2505.23705v1

Authors: Danny Driess, Jost Tobias Springenberg, Brian Ichter, Lili Yu, Adrian Li-Bell, Karl Pertsch, Allen Z. Ren, Homer Walke, Quan Vuong, Lucy Xiaoyang Shi, Sergey Levine

Vision-language-action (VLA) models provide a powerful approach to training control policies for physical systems, such as robots, by combining end-to-end learning with transfer of semantic knowledge from web-scale vision-language model (VLM) training. However, the constraints of real-time control are often at odds with the design of VLMs: the most powerful VLMs have tens or hundreds of billions of parameters, presenting an obstacle to real-time inference, and operate on discrete tokens rather than the continuous-valued outputs that are required for controlling robots. To address this challenge, recent VLA models have used specialized modules for efficient continuous control, such as action experts or continuous output heads, which typically require adding new untrained parameters to the pretrained VLM backbone. While these modules improve real-time and control capabilities, it remains an open question whether they preserve or degrade the semantic knowledge contained in the pretrained VLM, and what effect they have on the VLA training dynamics. In this paper, we study this question in the context of VLAs that include a continuous diffusion or flow matching action expert, showing that naively including such experts significantly harms both training speed and knowledge transfer. We provide an extensive analysis of various design choices, their impact on performance and knowledge transfer, and propose a technique for insulating the VLM backbone during VLA training that mitigates this issue. Videos are available at https://pi.website/research/knowledge_insulation.

视觉-语言-动作(VLA)模型将端到端学习与来自网络规模视觉-语言模型(VLM)训练的语义知识迁移相结合,为训练机器人等物理系统的控制策略提供了一种强有力的方法。然而,实时控制的约束往往与VLM的设计相冲突:最强大的VLM拥有数百亿乃至数千亿参数,这给实时推理带来障碍,而且它们处理的是离散词元,而不是控制机器人所需的连续值输出。为应对这一挑战,近期的VLA模型采用了用于高效连续控制的专用模块,例如动作专家或连续输出头,这通常需要在预训练的VLM骨干上添加新的未经训练的参数。虽然这些模块提升了实时性和控制能力,但它们是保留还是损害了预训练VLM中的语义知识,以及它们对VLA训练动态有何影响,仍是一个悬而未决的问题。在本文中,我们针对包含连续扩散或流匹配动作专家的VLA研究这一问题,表明简单地加入此类专家会显著损害训练速度和知识迁移。我们对各种设计选择及其对性能和知识迁移的影响进行了广泛分析,并提出了一种在VLA训练期间隔离VLM骨干的技术来缓解这一问题。视频见 https://pi.website/research/knowledge_insulation

Article 22

Title@2025-05-29 (4): (U)NFV: Supervised and Unsupervised Neural Finite Volume Methods for Solving Hyperbolic PDEs

Title: (U)NFV: Supervised and Unsupervised Neural Finite Volume Methods for Solving Hyperbolic PDEs

(U)NFV: Überwachte und unüberwachte neurale Finite-Volume-Methoden zur Lösung hyperbolischer PDEs

(U) NFV: 被监督和不受监督的解决双曲 PDE 的神经有限量方法 2505.23702v1

Authors: Nathan Lichtlé, Alexi Canesse, Zhe Fu, Hossein Nick Zinat Matin, Maria Laura Delle Monache, Alexandre M. Bayen

We introduce (U)NFV, a modular neural network architecture that generalizes classical finite volume (FV) methods for solving hyperbolic conservation laws. Hyperbolic partial differential equations (PDEs) are challenging to solve, particularly conservation laws whose physically relevant solutions contain shocks and discontinuities. FV methods are widely used for their mathematical properties: convergence to entropy solutions, flow conservation, or total variation diminishing, but often lack accuracy and flexibility in complex settings. Neural Finite Volume addresses these limitations by learning update rules over extended spatial and temporal stencils while preserving conservation structure. It supports both supervised training on solution data (NFV) and unsupervised training via weak-form residual loss (UNFV). Applied to first-order conservation laws, (U)NFV achieves up to 10x lower error than Godunov’s method, outperforms ENO/WENO, and rivals discontinuous Galerkin solvers with far less complexity. On traffic modeling problems, both from PDEs and from experimental highway data, (U)NFV captures nonlinear wave dynamics with significantly higher fidelity and scalability than traditional FV approaches.

我们引入了(U)NFV, 这是一种模块式神经网络结构,它概括了解决双曲养护法的经典有限体积(FV)方法。双曲部分偏差方程式(PDE)是难以解决的难题,特别是其物理相关解决方案包含冲击和不连续性的养护法。FV方法被广泛用于其数学属性:与恒温解决方案的趋同、流动保护或整体变异减少,但在复杂环境中往往缺乏准确性和灵活性。神经中量量解决这些局限性的方法是:在保存保护结构的同时,学习更新关于超长空间和时空超时短体积的规则。它既支持关于解决方案数据(NFV)的监督培训,又支持通过弱形残余损失(UNFV)进行不受监督的培训。适用于一阶保护法,(U)NFVV达到比Godunov方法低10倍的错误,优于ENO/WENO和不连续的加勒金溶剂,其复杂性要小得多。关于交通建模的问题,来自PDEs和实验性高速公路数据,(U)NFV捕捉取非直线波波波波波波动力,其传统和可变性方法远。
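
The classical finite volume update that (U)NFV is said to generalize can be sketched for the inviscid Burgers equation with Godunov's flux. This is only the hand-designed baseline, not the learned update rule from the paper; the choice of Burgers' equation, the periodic boundary, and all function names are assumptions made for illustration.

```python
def burgers_flux(u):
    """Physical flux f(u) = u^2 / 2 of the inviscid Burgers equation."""
    return 0.5 * u * u

def godunov_flux(u_left, u_right):
    """Godunov numerical flux for Burgers' equation (convex flux, minimum at 0)."""
    return max(burgers_flux(max(u_left, 0.0)), burgers_flux(min(u_right, 0.0)))

def fv_step(u, dt, dx):
    """One conservative finite volume step on a periodic grid of cell averages."""
    n = len(u)
    # flux[i] is the flux through the interface between cell i-1 and cell i;
    # u[-1] wraps around to the last cell, which gives the periodic boundary.
    flux = [godunov_flux(u[i - 1], u[i]) for i in range(n)]
    return [u[i] - dt / dx * (flux[(i + 1) % n] - flux[i]) for i in range(n)]

# Step initial data: a shock forms on the right edge of the plateau and a
# rarefaction on the left edge. The CFL number is max|u| * dt / dx = 0.5.
dx, dt = 0.1, 0.05
u0 = [1.0] * 5 + [0.0] * 5
u = list(u0)
for _ in range(10):
    u = fv_step(u, dt, dx)
```

Each interface flux is subtracted from one cell and added to its neighbour, so the total of the cell averages is preserved exactly up to rounding. The abstract states that the learned update keeps this conservation structure while using larger stencils.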

Article 23

Title@2025-05-29 (4): DiCoFlex: Model-agnostic diverse counterfactuals with flexible control

Title: DiCoFlex: Model-agnostic diverse counterfactuals with flexible control

DiCoFlex: Modell-agnostische diverse Gegenfakten mit flexibler Steuerung

DiCoFlex:具有灵活控制的模型 – – 不可知性多元反事实 2505.23700v1

Authors: Oleksii Furman, Ulvi Movsum-zada, Patryk Marszalek, Maciej Zięba, Marek Śmieja

Counterfactual explanations play a pivotal role in explainable artificial intelligence (XAI) by offering intuitive, human-understandable alternatives that elucidate machine learning model decisions. Despite their significance, existing methods for generating counterfactuals often require constant access to the predictive model, involve computationally intensive optimization for each instance and lack the flexibility to adapt to new user-defined constraints without retraining. In this paper, we propose DiCoFlex, a novel model-agnostic, conditional generative framework that produces multiple diverse counterfactuals in a single forward pass. Leveraging conditional normalizing flows trained solely on labeled data, DiCoFlex addresses key limitations by enabling real-time user-driven customization of constraints such as sparsity and actionability at inference time. Extensive experiments on standard benchmark datasets show that DiCoFlex outperforms existing methods in terms of validity, diversity, proximity, and constraint adherence, making it a practical and scalable solution for counterfactual generation in sensitive decision-making domains.

反事实解释在可解释的人工智能(XAI)中发挥着关键作用,它提供了直观的、人所无法理解的替代方法,阐明机器学习模式决定。尽管这些方法很重要,但现有的反事实方法往往需要不断访问预测模型,涉及对每种情况进行计算密集的优化,缺乏适应新的用户定义的限制而不进行再培训的灵活性。在本文中,我们提议DicoFlex,这是一个在单一前方传递过程中产生多种反事实的新颖的、不易理解的、有条件的模型化框架。DicoFlex利用仅以标签数据培训的有条件的正常流动,通过实时用户驱动的定制限制(如在推论时间的宽度和可操作性)来解决关键限制。关于标准基准数据集的广泛实验表明,DicoFlex在有效性、多样性、近距离和约束性方面超越了现有方法,使其成为敏感决策领域反事实生成的一个实用和可扩展的解决办法。

Article 24

Title@2025-05-29 (4): Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms

Title: Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms

Computeralgebra mit Attention: Transformer-Orakel für Randbasis-Algorithmen

带注意力的计算代数:用于边界基算法的Transformer预言机 2505.23696v1

Authors: Hiroshi Kera, Nico Pelleriti, Yuki Ishihara, Max Zimmer, Sebastian Pokutta

Solving systems of polynomial equations, particularly those with finitely many solutions, is a crucial challenge across many scientific fields. Traditional methods like Gröbner and Border bases are fundamental but suffer from high computational costs, which have motivated recent Deep Learning approaches to improve efficiency, albeit at the expense of output correctness. In this work, we introduce the Oracle Border Basis Algorithm, the first Deep Learning approach that accelerates Border basis computation while maintaining output guarantees. To this end, we design and train a Transformer-based oracle that identifies and eliminates computationally expensive reduction steps, which we find to dominate the algorithm’s runtime. By selectively invoking this oracle during critical phases of computation, we achieve substantial speedup factors of up to 3.5x compared to the base algorithm, without compromising the correctness of results. To generate the training data, we develop a sampling method and provide the first sampling theorem for border bases. We construct a tokenization and embedding scheme tailored to monomial-centered algebraic computations, resulting in a compact and expressive input representation, which reduces the number of tokens to encode an $n$-variate polynomial by a factor of $O(n)$. Our learning approach is data efficient, stable, and a practical enhancement to traditional computer algebra algorithms and symbolic computation.

求解多项式方程组,尤其是只有有限多个解的方程组,是许多科学领域中的一项关键挑战。Gröbner基和边界基等传统方法是基础性的,但计算成本高昂,这促使近期出现了以深度学习提高效率的方法,但代价是牺牲输出的正确性。在这项工作中,我们提出了Oracle Border Basis Algorithm,这是第一个在保持输出保证的同时加速边界基计算的深度学习方法。为此,我们设计并训练了一个基于Transformer的预言机,用于识别并消除计算代价高昂的约化步骤,我们发现这些步骤主导了算法的运行时间。通过在计算的关键阶段有选择地调用该预言机,我们相比基础算法实现了最高3.5倍的显著加速,且不损害结果的正确性。为了生成训练数据,我们开发了一种采样方法,并给出了首个关于边界基的采样定理。我们构建了一种针对以单项式为中心的代数计算的词元化和嵌入方案,得到紧凑且富有表达力的输入表示,将编码一个$n$元多项式所需的词元数量减少了$O(n)$倍。我们的学习方法数据高效、稳定,是对传统计算机代数算法和符号计算的实用增强。

Article 25

Title@2025-05-29 (4): On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures

Title: On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures

Über die Ausbildungskonvergenz von Transformern für die In-Context-Klassifizierung von Gauß-Mischungen

Gaussian混合物内集成分类变异器培训趋同 2410.11778v3

Authors: Wei Shen, Ruida Zhou, Jing Yang, Cong Shen

Although transformers have demonstrated impressive capabilities for in-context learning (ICL) in practice, theoretical understanding of the underlying mechanism that allows transformers to perform ICL is still in its infancy. This work aims to theoretically study the training dynamics of transformers for in-context classification tasks. We demonstrate that, for in-context classification of Gaussian mixtures under certain assumptions, a single-layer transformer trained via gradient descent converges to a globally optimal model at a linear rate. We further quantify the impact of the training and testing prompt lengths on the ICL inference error of the trained transformer. We show that when the lengths of training and testing prompts are sufficiently large, the prediction of the trained transformer approaches the ground truth distribution of the labels. Experimental results corroborate the theoretical findings.

虽然变压器在实践中表现出令人印象深刻的内流学习能力(ICL),但对于使变压器能够执行ICL的基本机制的理论理解仍处于初级阶段,这项工作旨在从理论上研究变压器在内流分类任务的培训动态,我们证明,对于某些假设对高斯混合物的内流分类,通过梯度下降而培训的单层变压器以线性速度与全球最佳模式相融合。我们进一步量化培训和测试快速长度对ICL所培训变压器的推断错误的影响。我们表明,在培训和测试时间足够长的情况下,经过培训的变压器的预测将接近标签的地面真实分布。实验结果证实了理论结论。

Article 26

Title@2025-05-29 (4): From Individual Experience to Collective Evidence: A Reporting-Based Framework for Identifying Systemic Harms

Title: From Individual Experience to Collective Evidence: A Reporting-Based Framework for Identifying Systemic Harms

Von der individuellen Erfahrung zu kollektiven Beweisen: Ein meldepflichtiger Rahmen für die Identifizierung systemischer Schäden

从个人经验到集体证据:查明系统危害的报告框架 2502.08166v2

Authors: Jessica Dai, Paula Gradu, Inioluwa Deborah Raji, Benjamin Recht

When an individual reports a negative interaction with some system, how can their personal experience be contextualized within broader patterns of system behavior? We study the reporting database problem, where individual reports of adverse events arrive sequentially, and are aggregated over time. In this work, our goal is to identify whether there are subgroups, defined by any combination of relevant features, that are disproportionately likely to experience harmful interactions with the system. We formalize this problem as a sequential hypothesis test, and identify conditions on reporting behavior that are sufficient for making inferences about disparities in true rates of harm across subgroups. We show that algorithms for sequential hypothesis tests can be applied to this problem with a standard multiple testing correction. We then demonstrate our method on real-world datasets, including mortgage decisions and vaccine side effects; on each, our method (re-)identifies subgroups known to experience disproportionate harm using only a fraction of the data that was initially used to discover them.

当个人报告与某个系统的负面互动时,他们的个人经验如何在更广泛的系统行为模式中被联系到背景?我们研究报告数据库问题,即关于不利事件的个别报告按顺序出现,并随着时间的推移加以汇总。在这项工作中,我们的目标是确定是否有由相关特征组合来界定的分组,这些分组极有可能与系统发生有害互动。我们将此问题正式确定为顺序假设测试,并查明报告行为的条件,足以推断各分组之间实际伤害率的差异。我们显示,连续假设测试的算法可以用标准的多次测试校正来适用于这一问题。我们然后在现实世界数据集中展示我们的方法,包括按揭决定和疫苗副作用;在每种数据中,我们的方法(重新)仅使用最初用于发现这些数据的一部分,将已知遭受过度伤害的分组确定为已知的分组。
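
To make "a sequential hypothesis test with a standard multiple testing correction" concrete, the sketch below runs one likelihood-ratio test per subgroup over a stream of reports and applies a Bonferroni correction across subgroups. It is a generic textbook construction, not the algorithm from the paper: the function name, the disjoint subgroup labels, the fixed alternative `alt_ratio * p0`, and the example base rates are all assumptions.

```python
import math

def sequential_flags(reports, base_rates, alt_ratio=2.0, alpha=0.05):
    """Flag subgroups that are over-represented in a stream of reports.

    reports:    iterable of subgroup labels in arrival order.
    base_rates: dict mapping subgroup -> expected share p0 of reports if
                harm were spread proportionally (the null hypothesis).
    Returns a dict mapping each flagged subgroup to the 1-based index of the
    report at which its log-likelihood ratio first crossed the threshold.
    """
    # Bonferroni: each of the m subgroup tests runs at level alpha / m.
    # Rejecting when the likelihood ratio reaches m / alpha keeps each test's
    # false-alarm probability at most alpha / m at any stopping time.
    threshold = math.log(len(base_rates) / alpha)
    llr = {g: 0.0 for g in base_rates}
    flagged = {}
    for t, label in enumerate(reports, start=1):
        for g, p0 in base_rates.items():
            if g in flagged:
                continue
            p1 = min(alt_ratio * p0, 0.99)   # hypothetical alternative share
            if label == g:
                llr[g] += math.log(p1 / p0)
            else:
                llr[g] += math.log((1.0 - p1) / (1.0 - p0))
            if llr[g] >= threshold:
                flagged[g] = t
    return flagged

# Toy stream: every report comes from subgroup "a", which is expected to
# account for only 10% of reports under the null.
flags = sequential_flags(["a"] * 20, {"a": 0.1, "b": 0.1})
```

In this toy stream subgroup "a" is flagged well before all 20 reports have arrived, which mirrors the abstract's point that disparities can be identified from a fraction of the data.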

Article 27

Title@2025-05-29 (4): Mobi-$π$: Mobilizing Your Robot Learning Policy

Title: Mobi-$π$: Mobilizing Your Robot Learning Policy

Mobi-$π$: Mobilisierung Ihrer Roboter-Lernpolitik

Mobi-$π$: 调动你的机器人学习策略 2505.23692v1

Authors: Jingyun Yang, Isabella Huang, Brandon Vu, Max Bajracharya, Rika Antonova, Jeannette Bohg

Learned visuomotor policies are capable of performing increasingly complex manipulation tasks. However, most of these policies are trained on data collected from limited robot positions and camera viewpoints. This leads to poor generalization to novel robot positions, which limits the use of these policies on mobile platforms, especially for precise tasks like pressing buttons or turning faucets. In this work, we formulate the policy mobilization problem: find a mobile robot base pose in a novel environment that is in distribution with respect to a manipulation policy trained on a limited set of camera viewpoints. Compared to retraining the policy itself to be more robust to unseen robot base pose initializations, policy mobilization decouples navigation from manipulation and thus does not require additional demonstrations. Crucially, this problem formulation complements existing efforts to improve manipulation policy robustness to novel viewpoints and remains compatible with them. To study policy mobilization, we introduce the Mobi-$\pi$ framework, which includes: (1) metrics that quantify the difficulty of mobilizing a given policy, (2) a suite of simulated mobile manipulation tasks based on RoboCasa to evaluate policy mobilization, (3) visualization tools for analysis, and (4) several baseline methods. We also propose a novel approach that bridges navigation and manipulation by optimizing the robot’s base pose to align with an in-distribution base pose for a learned policy. Our approach utilizes 3D Gaussian Splatting for novel view synthesis, a score function to evaluate pose suitability, and sampling-based optimization to identify optimal robot poses. We show that our approach outperforms baselines in both simulation and real-world environments, demonstrating its effectiveness for policy mobilization.

学习得到的视觉运动策略能够执行越来越复杂的操作任务。然而,这些策略大多是在从有限的机器人位置和相机视角收集的数据上训练的。这导致其对新的机器人位置泛化能力较差,从而限制了这些策略在移动平台上的使用,尤其是按按钮或拧水龙头这类精细任务。在这项工作中,我们提出了策略移动化(policy mobilization)问题:在新环境中找到一个移动机器人底座位姿,使其相对于在有限相机视角上训练的操作策略处于分布内。与重新训练策略本身以使其对未见过的机器人底座位姿初始化更加稳健相比,策略移动化将导航与操作解耦,因此不需要额外的演示。至关重要的是,这一问题表述与现有的提升操作策略对新视角稳健性的工作互为补充,并与之兼容。为了研究策略移动化,我们引入了Mobi-$\pi$框架,其中包括:(1) 量化移动化给定策略难度的指标,(2) 一套基于RoboCasa的模拟移动操作任务,用于评估策略移动化,(3) 用于分析的可视化工具,以及 (4) 若干基线方法。我们还提出了一种新方法,通过优化机器人的底座位姿,使其与所学策略的分布内底座位姿对齐,从而衔接导航与操作。我们的方法利用3D Gaussian Splatting进行新视角合成,使用评分函数评估位姿的适宜性,并通过基于采样的优化来确定最优的机器人位姿。我们表明,我们的方法在仿真和真实环境中均优于基线,证明了其在策略移动化方面的有效性。

Article 28

Title@2025-05-29 (4): Unifying Perspectives: Plausible Counterfactual Explanations on Global, Group-wise, and Local Levels

Title: Unifying Perspectives: Plausible Counterfactual Explanations on Global, Group-wise, and Local Levels

Vereinheitlichende Perspektiven: Plausible gegenfaktische Erklärungen auf globaler, gruppenweiser und lokaler Ebene

统一视角:全局、分组和局部层面的合理反事实解释 2405.17642v2

Authors: Oleksii Furman, Patryk Wielopolski, Łukasz Lenkiewicz, Jerzy Stefanowski, Maciej Zięba

The growing complexity of AI systems has intensified the need for transparency through Explainable AI (XAI). Counterfactual explanations (CFs) offer actionable “what-if” scenarios on three levels: Local CFs providing instance-specific insights, Global CFs addressing broader trends, and Group-wise CFs (GWCFs) striking a balance and revealing patterns within cohesive groups. Despite the availability of methods for each granularity level, the field lacks a unified method that integrates these complementary approaches. We address this limitation by proposing a gradient-based optimization method for differentiable models that generates Local, Global, and Group-wise Counterfactual Explanations in a unified manner. We especially enhance GWCF generation by combining instance grouping and counterfactual generation into a single efficient process, replacing traditional two-step methods. Moreover, to ensure trustworthiness, we innovatively introduce the integration of plausibility criteria into the GWCF domain, making explanations both valid and realistic. Our results demonstrate the method’s effectiveness in balancing validity, proximity, and plausibility while optimizing group granularity, with practical utility validated through practical use cases.

通过可解释的AI(XAI),AI系统日益复杂,增加了透明度的必要性; 反事实解释(CFS)在以下三个层面提供了可操作的“如果是什么”情景:地方CFS提供具体实例的洞察力,全球CFS处理更广泛的趋势,以及集体CFs在具有凝聚力的群体中取得平衡和揭示模式; 尽管为每个微粒层面提供了方法,但实地缺乏一种统一的方法,将这些互补方法结合起来。我们通过提出一种基于梯度的优化方法来解决这一局限性,以统一的方式为产生地方、全球和集团之间反事实解释的不同模型提出一种基于梯度的优化方法。我们特别通过将实例分组和反事实生成合并到一个单一的有效过程,以取代传统的两步方法。此外,为了确保信任性,我们创新地将可信赖性标准纳入GWCF领域,同时作出合理和现实的解释。我们的结果表明,在优化群体粒子性的同时,在优化群体有效性、近近和可信赖性方面,同时通过实际使用案例来验证实用性,从而增强GWCF的生成。

Article 29

Title@2025-05-29 (4): Learning Compositional Functions with Transformers from Easy-to-Hard Data

Title: Learning Compositional Functions with Transformers from Easy-to-Hard Data

Kompositionelle Funktionen mit Transformern aus Easy-to-Hard-Daten lernen

利用Transformer从由易到难的数据中学习组合函数 2505.23683v1

Authors: Zixuan Wang, Eshaan Nichani, Alberto Bietti, Alex Damian, Daniel Hsu, Jason D. Lee, Denny Wu

Transformer-based language models have demonstrated impressive capabilities across a range of complex reasoning tasks. Prior theoretical work exploring the expressive power of transformers has shown that they can efficiently perform multi-step reasoning tasks involving parallelizable computations. However, the learnability of such constructions, particularly the conditions on the data distribution that enable efficient learning via gradient-based optimization, remains an open question. Towards answering this question, in this work we study the learnability of the $k$-fold composition task, which requires computing an interleaved composition of $k$ input permutations and $k$ hidden permutations, and can be expressed by a transformer with $O(\log k)$ layers. On the negative front, we prove a Statistical Query (SQ) lower bound showing that any SQ learner that makes only polynomially-many queries to an SQ oracle for the $k$-fold composition task distribution must have sample size exponential in $k$, thus establishing a statistical-computational gap. On the other hand, we show that this function class can be efficiently learned, with runtime and sample complexity polynomial in $k$, by gradient descent on an $O(\log k)$-depth transformer via two different curriculum learning strategies: one in which data consists of $k'$-fold composition functions with $k' \le k$ presented in increasing difficulty, and another in which all such data is presented simultaneously. Our work sheds light on the necessity and sufficiency of having both easy and hard examples in the data distribution for transformers to learn complex compositional tasks.

以变换器为基础的语言模型在一系列复杂的推理任务中表现出了令人印象深刻的能力。先前的探索变压器显性力量的理论工作表明, 变压器能够高效地执行包含平行计算在内的多步推理任务。然而, 这样的构造, 特别是数据分配条件的学习性, 以便通过基于梯度的优化来有效学习, 仍然是个未决问题。在回答这个问题时, 我们研究美元倍数构成任务的学习性, 需要计算美元输入值的跨端构成和美元隐藏的平价, 并且可以通过一个具有$(logal)的变压层来表达。在负面上, 我们证明一个统计质(SQ) , 特别是数据分配条件, 使得通过基于渐变法的数据结构中, 任何SQ 类的简单质查询, 都必须以 $( $) 为单位, 来显示一个复杂的统计- 计算差距。在另一方面, 我们显示, 这个函数类可以高效地学习, 运行时间和数据变压变法中, 通过一个变压式的变压法, 。

Article 30

Title@2025-05-29 (4): Understanding Mode Connectivity via Parameter Space Symmetry

Title: Understanding Mode Connectivity via Parameter Space Symmetry

Mode-Konnektivität über Parameter Raumsymmetrie verstehen

通过参数空间对称法理解模式连通性 2505.23681v1

Authors: Bo Zhao, Nima Dehmamy, Robin Walters, Rose Yu

Neural network minima are often connected by curves along which train and test loss remain nearly constant, a phenomenon known as mode connectivity. While this property has enabled applications such as model merging and fine-tuning, its theoretical explanation remains unclear. We propose a new approach to exploring the connectedness of minima using parameter space symmetry. By linking the topology of symmetry groups to that of the minima, we derive the number of connected components of the minima of linear networks and show that skip connections reduce this number. We then examine when mode connectivity and linear mode connectivity hold or fail, using parameter symmetries which account for a significant part of the minimum. Finally, we provide explicit expressions for connecting curves in the minima induced by symmetry. Using the curvature of these curves, we derive conditions under which linear mode connectivity approximately holds. Our findings highlight the role of continuous symmetries in understanding the neural network loss landscape.
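A toy illustration of the idea, under assumptions that are not from the paper: a two-layer scalar linear network with loss $(w_2 w_1 - 1)^2$. Its minima form the hyperbola $w_1 w_2 = 1$, and the rescaling symmetry $(w_1, w_2) \mapsto (g w_1, w_2 / g)$ with $g > 0$ moves along one branch at zero loss, while the straight line between the same two minima leaves the minimum set.

```python
def loss(w1, w2):
    """Squared error of a two-layer scalar linear network fitting y = x."""
    return (w2 * w1 - 1.0) ** 2

def symmetry_curve(w1, w2, g):
    """Rescaling symmetry: (w1, w2) -> (g * w1, w2 / g) keeps w2 * w1 fixed."""
    return g * w1, w2 / g

def linear_path(a, b, t):
    """Straight-line interpolation between parameter vectors a and b."""
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))

a = (2.0, 0.5)  # a minimum: 2.0 * 0.5 == 1
b = (0.5, 2.0)  # another minimum, reached from a with g = 0.25

# Loss along the symmetry-induced curve from a to b (g goes from 1 to 0.25).
curve_losses = [loss(*symmetry_curve(*a, g=1 - 0.75 * i / 10)) for i in range(11)]
# Loss along the straight line from a to b.
line_losses = [loss(*linear_path(a, b, i / 10)) for i in range(11)]
```

In this toy case the curve stays at (numerically) zero loss while the straight line has a loss barrier at its midpoint, so the two minima are mode connected but not linearly mode connected. The minima with $w_1 < 0$ lie on the other branch of the hyperbola, which cannot be reached with $g > 0$.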

神经网络迷宫往往通过曲线连接, 火车和测试损失几乎保持不变, 这是一种被称为模式连接的现象。虽然此属性使得模型合并和微调等应用得以进行, 但其理论解释仍然不清楚。我们提出一种新的方法, 利用参数空间对称来探索微型连接性。通过将对称组的地形学与微型对称组联系起来, 我们得出线性网络微型网的连接部分的数量, 并显示跳过连接会减少这个数量。然后我们用参数对称来检查模式连接和线性模式连接在最小值中占相当一部分的值时, 我们用参数对称来检查模式连接性连接性连接性。最后, 我们用这些曲线的曲线的曲线来得出线性连接性模式连接性条件。我们的发现突出了持续对称性连接在理解神经网络损失景观中所起的作用。

Article 31

Title@2025-05-29 (4): SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem

Title: SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem

SVRPBench: Ein realistischer Maßstab für stochastisches Fahrzeugrouting-Problem

SVRPBench: 蒸汽车辆流出问题的现实基准 2505.21887v2

Authors: Ahmed Heakl, Yahia Salaheldin Shaaban, Martin Takac, Salem Lahlou, Zangir Iklassov

Robust routing under uncertainty is central to real-world logistics, yet most benchmarks assume static, idealized settings. We present SVRPBench, the first open benchmark to capture high-fidelity stochastic dynamics in vehicle routing at urban scale. Spanning more than 500 instances with up to 1000 customers, it simulates realistic delivery conditions: time-dependent congestion, log-normal delays, probabilistic accidents, and empirically grounded time windows for residential and commercial clients. Our pipeline generates diverse, constraint-rich scenarios, including multi-depot and multi-vehicle setups. Benchmarking reveals that state-of-the-art RL solvers like POMO and AM degrade by over 20% under distributional shift, while classical and metaheuristic methods remain robust. To enable reproducible research, we release the dataset and evaluation suite. SVRPBench challenges the community to design solvers that generalize beyond synthetic assumptions and adapt to real-world uncertainty.
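The delivery conditions listed in the abstract can be pictured with a small sampler. This is only a sketch: the congestion curve, the noise scale `sigma`, the accident probability, and the fixed accident penalty are all hypothetical values chosen for illustration and are not the benchmark's actual parameters.

```python
import math
import random

def congestion_factor(hour):
    """Hypothetical time-of-day multiplier with morning and evening peaks."""
    morning = math.exp(-((hour - 8.0) ** 2) / 4.0)
    evening = math.exp(-((hour - 17.5) ** 2) / 4.0)
    return 1.0 + 0.8 * (morning + evening)

def sample_travel_time(base_minutes, hour, rng, sigma=0.3, accident_prob=0.02):
    """Draw one stochastic travel time in minutes (all parameters are assumed)."""
    delay = rng.lognormvariate(0.0, sigma)  # multiplicative log-normal noise
    travel = base_minutes * congestion_factor(hour) * delay
    if rng.random() < accident_prob:  # rare accident adds a fixed penalty
        travel += 30.0
    return travel
```

Calling `sample_travel_time(10.0, 8.0, random.Random(0))` draws one rush-hour travel time for an edge whose free-flow time is 10 minutes.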

不确定情况下的强力航向是现实世界物流的核心,但大多数基准是静态、理想化的环境。我们介绍了SVRPBench,这是第一个在城市规模车辆航道中捕捉高纤维性随机动态的开放基准。它覆盖了500多例,有多达1000名客户,模拟了现实的交付条件:根据时间的拥堵、逻辑正常的延误、概率性事故,以及基于经验的住宅和商业客户时间窗口。我们的输油管道产生了多种多样的、限制性强的情景,包括多功能和多车辆设置。基准显示,在分布式转换中,POM和AM等最先进的RL解答器在20 % , 而传统和计量方法依然健全。为了进行再生研究,我们发布了数据集和评价套件。SVRPBench挑战社区设计超越合成假设并适应现实世界不确定性的解决方案。

Article 32

Title@2025-05-29 (4): Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds

Title: Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds

Bayesische Optimierung durch menschliches Feedback: Nah-optimale Reue-Bounds

Bayesian 人体反馈的优化:接近最佳的冷却环 2505.23673v1

Authors: Aya Kayal, Sattar Vakili, Laura Toni, Da-shan Shiu, Alberto Bernacchia

Bayesian optimization (BO) with preference-based feedback has recently garnered significant attention due to its emerging applications. We refer to this problem as Bayesian Optimization from Human Feedback (BOHF), which differs from conventional BO by learning the best actions from a reduced feedback model, where only the preference between two actions is revealed to the learner at each time step. The objective is to identify the best action using a limited number of preference queries, typically obtained through costly human feedback. Existing work, which adopts the Bradley-Terry-Luce (BTL) feedback model, provides regret bounds for the performance of several algorithms. In this work, within the same framework, we develop tighter performance guarantees. Specifically, we derive regret bounds of $\tilde{\mathcal{O}}(\sqrt{\Gamma(T)T})$, where $\Gamma(T)$ represents the maximum information gain (a kernel-specific complexity term) and $T$ is the number of queries. Our results significantly improve upon existing bounds. Notably, for common kernels, we show that the order-optimal sample complexities of conventional BO, achieved with richer feedback models, are recovered. In other words, the same number of preferential samples as scalar-valued samples is sufficient to find a nearly optimal solution.
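The reduced feedback model can be sketched as follows. Under the Bradley-Terry-Luce model, the probability that action $x$ is preferred over $y$ is the logistic function of the utility difference $f(x) - f(y)$. The function names and the idea of passing the latent utility `f` as a callable are assumptions for illustration; the learner itself never observes `f`, only the binary outcome.

```python
import math
import random

def btl_preference_prob(f_x, f_y):
    """P(x preferred over y) under Bradley-Terry-Luce: sigmoid(f(x) - f(y))."""
    return 1.0 / (1.0 + math.exp(-(f_x - f_y)))

def query_preference(f, x, y, rng):
    """Return 1 if x is preferred over y in one noisy comparison, else 0."""
    return 1 if rng.random() < btl_preference_prob(f(x), f(y)) else 0
```

Each call to `query_preference` corresponds to one of the $T$ queries counted in the regret bound.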

通过基于优惠的反馈,巴伊西亚优化(BO)最近因其新出现的应用而引起极大关注。我们提到这一问题,即巴伊西亚优化来自人类反馈(BOHF),它与传统BO不同,它从一个减少的反馈模式中学习了最佳行动,每个步骤都向学习者透露了两种行动之间的偏好。目标是利用有限的优惠查询来确定最佳行动,通常通过昂贵的人类反馈获得。采用布拉德-泰鲁斯(BTL)的反馈模式(BBTL)为若干算法的性能提供了遗憾界限。在这项工作中,我们在同一个框架内发展了更严格的绩效保障。具体地说,我们得出了 $\tilde{\mathcal{O}}(\sqrt{\Gamma(T)T})$ 的遗憾界限。

Article 33

Title@2025-05-29 (4): GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents

Title: GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents

GSO: Herausfordernde Software-Optimierungsaufgaben zur Bewertung von SWE-Agenten

GSO:评估SWE-Agentics的有挑战的软件优化任务 2505.23671v1

Authors: Manish Shetty, Naman Jain, Jinjian Liu, Vijay Kethanaboyina, Koushik Sen, Ion Stoica

Developing high-performance software is a complex task that requires specialized expertise. We introduce GSO, a benchmark for evaluating language models’ capabilities in developing high-performance software. We develop an automated pipeline that generates and executes performance tests to analyze repository commit histories, identifying 102 challenging optimization tasks across 10 codebases that span diverse domains and programming languages. An agent is provided with a codebase and a performance test as a precise specification, and is tasked with improving the runtime efficiency, which is measured against the expert developer optimization. Our quantitative evaluation reveals that leading SWE-Agents struggle significantly, achieving less than a 5% success rate, with limited improvements even with inference-time scaling. Our qualitative analysis identifies key failure modes, including difficulties with low-level languages, practicing lazy optimization strategies, and challenges in accurately localizing bottlenecks. We release the code and artifacts of our benchmark along with agent trajectories to enable future research.

开发高性能软件是一项复杂的任务,需要专门知识。我们引入了GSO,这是评价语言模型开发高性能软件能力的基准。我们开发了一个自动管道,生成和执行绩效测试,以分析存储库,承诺历史查明10个代码库的102项挑战性优化任务,涵盖不同的领域和编程语言。向代理商提供了一个代码库和性能测试,作为精确的规格,并负责提高运行时间效率,以专家开发师的优化为衡量标准。我们的定量评估显示,领先的SWE-Agency 进行了巨大的斗争,取得了不到5%的成功率,即便在推论时间上也有有限的改进。我们的质量分析确定了关键的失败模式,包括使用低度语言的困难、采用懒惰性优化战略,以及在准确定位瓶颈方面存在的挑战。我们发布了基准的代码和工艺以及代理轨迹,以利今后的研究。

Article 34

Title@2025-05-29 (4): Maximizing Confidence Alone Improves Reasoning

Title: Maximizing Confidence Alone Improves Reasoning

Maximierung des Vertrauens allein verbessert die Vernunft

使信心最大化单独提高合理性 2505.22660v2

Authors: Mihir Prabhudesai, Lili Chen, Alex Ippoliti, Katerina Fragkiadaki, Hao Liu, Deepak Pathak

Reinforcement learning (RL) has enabled machine learning models to achieve significant advances in many fields. Most recently, RL has empowered frontier language models to solve challenging math, science, and coding problems. However, central to any RL algorithm is the reward function, and reward engineering is a notoriously difficult problem in any domain. In this paper, we propose RENT: Reinforcement Learning via Entropy Minimization – a fully unsupervised RL method that requires no external reward or ground-truth answers, and instead uses the model’s entropy of its underlying distribution as an intrinsic reward. We find that by reinforcing the chains of thought that yield high model confidence on its generated answers, the model improves its reasoning ability. In our experiments, we showcase these improvements on an extensive suite of commonly-used reasoning benchmarks, including GSM8K, MATH500, AMC, AIME, and GPQA, and models of varying sizes from the Qwen and Mistral families. The generality of our unsupervised learning method lends itself to applicability in a wide range of domains where external supervision is unavailable.
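The intrinsic reward can be sketched in a few lines. This assumes the reward is the negative mean Shannon entropy of the model's next-token distributions over a generated sequence; which tokens are included in the average is an assumption here, and the function names are illustrative.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_reward(step_distributions):
    """Intrinsic reward: negative mean token entropy over a generated sequence."""
    entropies = [token_entropy(p) for p in step_distributions]
    return -sum(entropies) / len(entropies)
```

A confident generation (peaked distributions) receives a higher reward than an uncertain one (near-uniform distributions), so no ground-truth answer is needed to compute it.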

强化学习(RL)使机器学习模式在许多领域取得了显著进步。最近,RL授权前沿语言模式解决具有挑战性的数学、科学和编码问题。然而,任何RL算法的核心是奖赏功能,而奖赏工程则是任何领域一个臭名昭著的困难问题。在本文中,我们提议RENT:通过最小化强化学习(Entropy最小化) – – 一种完全不受监督的RL方法,不需要外部奖赏或地面真相回答,而是使用模型基本分布的螺旋状作为内在奖赏。我们发现,通过加强能够对其生成的答案产生高度模型信心的思维链,模型提高了其推理能力。我们在实验中展示了这些改进之处,展示了一套广泛通用的推理基准,包括GSM8K、MATH500、AMC、AIME和GPQA,以及来自Quen和Mistral家庭不同大小的模式。我们未超超超的学习方法的通用性适用于无法进行外部监督的广泛领域。

Article 35

Title@2025-05-29 (4): SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression

Title: SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression

SLiM: Ein-Schuss-Quantisierung und Sparsamkeit mit Low-Rank-Annäherung für LLM-Gewichtskompression

SLiM: LLM 重量压缩的单射量和与低级别近似相近的分数 2410.09615v3

Authors: Mohammad Mozaffari, Amir Yazdanbakhsh, Maryam Mehri Dehnavi

Conventional model compression techniques for LLMs address high memory consumption and slow inference challenges but typically require computationally expensive retraining to preserve accuracy. In contrast, one-shot compression methods eliminate retraining cost, but struggle to achieve accuracy comparable to dense models. This paper presents SLiM, a new one-shot compression framework that holistically integrates hardware-friendly quantization, sparsity, and low-rank approximation into a unified process. First, we formulate the quantization process using a probabilistic approach (SLiM-Quant) that enables us to apply uniform quantization. Then, we use an existing one-shot pruning method to apply semi-structured sparsity on top of the quantized weights. Finally, to compensate for the introduced aggregated quantization and sparsity error, we use a novel saliency function with unique invertible and additive features that enables us to mathematically compute the value of low-rank adapters. SLiM improves model accuracy by up to 5.66% (LLaMA-2-7B) for 2:4 sparsity with 4-bit weight quantization, outperforming prior methods. Models compressed with SLiM achieve up to 4.3x and 3.8x speedup on Nvidia RTX3060 and A100 GPUs, respectively. Additionally, they achieve up to 0.23x end-to-end memory reduction in comparison to their dense counterparts. We also propose an optional PEFT recipe that further improves accuracy by up to 1.66% (LLaMA-2-13B) compared to SLiM without fine-tuning.
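The first two stages of such a pipeline can be sketched on a flat list of weights. This is not the paper's method: plain round-to-nearest quantization stands in for the probabilistic quantizer, and magnitude pruning stands in for the one-shot pruning method. The residual computed at the end is the aggregated quantization and sparsity error that low-rank adapters would then be fitted to.

```python
def quantize_uniform(weights, bits=4):
    """Symmetric uniform quantization onto a grid of 2**bits - 1 levels."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax or 1.0  # avoid a zero scale
    return [round(w / scale) * scale for w in weights]

def prune_2_of_4(weights):
    """2:4 semi-structured sparsity: zero the 2 smallest magnitudes per group of 4."""
    pruned = list(weights)
    for start in range(0, len(pruned) - len(pruned) % 4, 4):
        group = range(start, start + 4)
        drop = sorted(group, key=lambda i: abs(pruned[i]))[:2]
        for i in drop:
            pruned[i] = 0.0
    return pruned

def compression_error(weights, bits=4):
    """Residual between the original weights and quantized-then-pruned weights."""
    compressed = prune_2_of_4(quantize_uniform(weights, bits))
    return [w - c for w, c in zip(weights, compressed)]
```

Pruning after quantization follows the order described in the abstract (sparsity applied on top of the quantized weights).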

LLMS的常规模型压缩技术解决了高内存消耗和缓慢发酵的挑战,但通常需要计算昂贵的再培训才能保持准确性。相比之下,一发压缩方法消除了再培训成本,但努力达到与密度模型相近的精确度。本文展示了SLiM,这是一个新的一发压缩框架,在整体上将硬件友好的量化、宽度和低调近似值整合到一个统一的进程中。首先,我们采用概率化方法(SLiM-Quant)来制定量化进程,这使我们能够应用统一的量化方法。然后,我们使用现有的一发剪枝方法,在量化权重之上应用半结构的松散度。最后,为了补偿引入的复合量化和宽度错误,我们使用新的显眼功能,对低端适应器的价值进行数学的计算。在2:4稀疏度和4位权重量化下,SLiM将模型准确性最多提高5.66%(LLaMA-2-7B)。

Article 36

Title@2025-05-29 (4): LoLA: Low-Rank Linear Attention With Sparse Caching

Title: LoLA: Low-Rank Linear Attention With Sparse Caching

LoLA: Low-Rank Lineare Aufmerksamkeit mit Sparse Caching

LoLA: 低兰克线性注意, 以粗糙的缓存 2505.23666v1

Authors: Luke McDermott, Robert W. Heath Jr., Rahul Parhi

Transformer-based large language models suffer from quadratic complexity at inference on long sequences. Linear attention methods are efficient alternatives; however, they fail to provide an accurate approximation of softmax attention. By additionally incorporating sliding window attention into each linear attention head, this gap can be closed for short context-length tasks. Unfortunately, these approaches cannot recall important information from long contexts due to “memory collisions”. In this paper, we propose LoLA: Low-rank Linear Attention with sparse caching. LoLA separately stores additional key-value pairs that would otherwise interfere with past associative memories. Moreover, LoLA further closes the gap between linear attention models and transformers by distributing past key-value pairs into three forms of memory: (i) recent pairs in a local sliding window; (ii) difficult-to-memorize pairs in a sparse, global cache; and (iii) generic pairs in the recurrent hidden state of linear attention. As an inference-only strategy, LoLA enables pass-key retrieval on up to 8K context lengths on needle-in-a-haystack tasks from RULER. It boosts the accuracy of the base subquadratic model from 0.6% to 97.4% at 4K context lengths, with a 4.6x smaller cache than that of Llama-3.1 8B. LoLA demonstrates strong performance on zero-shot commonsense reasoning tasks among 1B and 8B parameter subquadratic models. Finally, LoLA is an extremely lightweight approach: nearly all of our results can be reproduced on a single consumer GPU.

以变换器为基础的大型语言模型在长序列的推论中具有二次复杂性。线性关注方法是高效的替代方法, 但是它们无法提供精确的软体关注近似值。此外, 线性关注方法通过将滑动窗口关注点纳入每个线性关注头, 这一差距可以因短期上下文任务而缩小。不幸的是, 由于“ 模拟碰撞” , 这些方法无法回忆长背景下的重要信息。在本文中, 我们提议 LoLA: 低端线性关注, 且缓冲不小。 LoLA 单独存储了额外的关键值配对, 否则会干扰过去的关联记忆。此外, LoLA 进一步缩小线性关注模型和变异器之间的差距, 将过去的键性值配对分配成三种记忆形式 :(i) 本地滑动窗口中的最近一对; (ii) 难以在“ 全球缓冲器” 中进行模拟的双对; (iii) 经常隐藏线性关注状态下的通用对配对。作为一种仅用于推理的策略, LoLA 能够在 RULER 的大海捞针任务上实现最长 8K 上下文的密钥检索。

Article 37

Title@2025-05-29 (4): AMBER: Adaptive Mesh Generation by Iterative Mesh Resolution Prediction

Title: AMBER: Adaptive Mesh Generation by Iterative Mesh Resolution Prediction

AMBER: Adaptive Mesh-Generierung durch iterative Mesh-Auflösungsvorhersage

以迭代网目分辨率预测的适应性代谢代谢 2505.23663v1

Authors: Niklas Freymuth, Tobias Würth, Nicolas Schreiber, Balazs Gyenes, Andreas Boltres, Johannes Mitsch, Aleksandar Taranovic, Tai Hoang, Philipp Dahlinger, Philipp Becker, Luise Kärger, Gerhard Neumann

The cost and accuracy of simulating complex physical systems using the Finite Element Method (FEM) scale with the resolution of the underlying mesh. Adaptive meshes improve computational efficiency by refining resolution in critical regions, but typically require task-specific heuristics or cumbersome manual design by a human expert. We propose Adaptive Meshing By Expert Reconstruction (AMBER), a supervised learning approach to mesh adaptation. Starting from a coarse mesh, AMBER iteratively predicts the sizing field, i.e., a function mapping from the geometry to the local element size of the target mesh, and uses this prediction to produce a new intermediate mesh using an out-of-the-box mesh generator. This process is enabled through a hierarchical graph neural network, and relies on data augmentation by automatically projecting expert labels onto AMBER-generated data during training. We evaluate AMBER on 2D and 3D datasets, including classical physics problems, mechanical components, and real-world industrial designs with human expert meshes. AMBER generalizes to unseen geometries and consistently outperforms multiple recent baselines, including ones using Graph and Convolutional Neural Networks, and Reinforcement Learning-based approaches.

使用精密元素法(FEM)尺度模拟复杂的物理系统,其成本和准确性与基本网格的分辨率相仿。适应性 meshes通过在关键区域改进分辨率来提高计算效率,但通常需要由一位人类专家进行任务特定的超光度或烦琐的手工设计。我们提议通过专家重建(AMBER)进行适应性模拟,这是对网状适应的一种监督的学习方法。从粗略的网格开始,AMBER迭接地预测了缩放场,即从几何到目标网格的本地元件大小的函数映射,并利用这一预测来利用一个箱外网格生成一个新的中间网格。这一过程通过一个等级式的图形神经网络来启动,并依靠通过在培训期间将专家标签自动投射到AMBER生成的数据上来增强数据。我们从粗略的网格中对2D和3D数据集进行了评估,其中包括经典物理学问题、机械部件和与人类专家模拟的实界工业设计。 AMBER 将一般地分为看不见的和连续的超缓度。

Article 38

Title@2025-05-29 (4): Bayesian Perspective on Memorization and Reconstruction

Title: Bayesian Perspective on Memorization and Reconstruction

Bayesische Perspektive auf Erinnerung und Wiederaufbau

Bayes人对记忆和重建的看法 2505.23658v1

Authors: Haim Kaplan, Yishay Mansour, Kobbi Nissim, Uri Stemmer

We introduce a new Bayesian perspective on the concept of data reconstruction, and leverage this viewpoint to propose a new security definition that, in certain settings, provably prevents reconstruction attacks. We use our paradigm to shed new light on one of the most notorious attacks in the privacy and memorization literature: fingerprinting code attacks (FPC). We argue that these attacks are really a form of membership inference attacks, rather than reconstruction attacks. Furthermore, we show that if the goal is solely to prevent reconstruction (but not membership inference), then in some cases the impossibility results derived from FPC no longer apply.

我们从新的贝叶斯人的角度看待数据重建的概念,并利用这个观点提出一个新的安全定义,在某些环境下,可以明显地防止重建攻击。我们利用我们的范式,对隐私和记忆文献中最臭名昭著的攻击之一——指纹代码攻击(FCC ) —— 提供新的信息。我们争辩说,这些攻击实际上是成员推论攻击的一种形式,而不是重建攻击。此外,我们表明,如果目标仅仅在于防止重建(而不是成员推论),那么在某些情况下,从FPC产生的不可能的结果不再适用。

Article 39

Title@2025-05-29 (4): Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation

Title: Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation

Aktives Layer-Kontrastives Decodieren reduziert Halluzination bei der Generierung von Großsprachenmodellen

大型语言模式生成中活性多语言解层解码减少幻觉 2505.23657v1

Authors: Hongxiang Zhang, Hao Chen, Tianyi Zhang, Muhao Chen

Recent decoding methods improve the factuality of large language models (LLMs) by refining how the next token is selected during generation. These methods typically operate at the token level, leveraging internal representations to suppress superficial patterns. Nevertheless, LLMs remain prone to hallucinations, especially over longer contexts. In this paper, we propose Active Layer-Contrastive Decoding (ActLCD), a novel decoding strategy that actively decides when to apply contrasting layers during generation. By casting decoding as a sequential decision-making problem, ActLCD employs a reinforcement learning policy guided by a reward-aware classifier to optimize factuality beyond the token level. Our experiments demonstrate that ActLCD surpasses state-of-the-art methods across five benchmarks, showcasing its effectiveness in mitigating hallucinations in diverse generation scenarios.
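A single decoding step with optional layer contrast can be sketched as follows. The contrast score used here (log-probability under the final layer minus log-probability under an earlier layer) is the generic layer-contrastive form and omits refinements such as plausibility filtering. The boolean `apply_contrast` is a stand-in for the decision of the learned policy, which is not modeled here; all names are illustrative.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def contrastive_scores(final_logits, early_logits):
    """Layer-contrastive score per token: log p_final - log p_early."""
    lf, le = log_softmax(final_logits), log_softmax(early_logits)
    return [a - b for a, b in zip(lf, le)]

def decode_step(final_logits, early_logits, apply_contrast):
    """Pick the next token id; contrast layers only when the policy says so."""
    if apply_contrast:
        scores = contrastive_scores(final_logits, early_logits)
    else:
        scores = final_logits
    return max(range(len(scores)), key=scores.__getitem__)
```

With contrast switched on, a token whose probability grows between the early and final layer can overtake a token that both layers already rate highly.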

最近解码方法通过精炼代代代中如何选择下一个符号来提高大型语言模型(LLMs)的实际情况质量。这些方法通常在象征性层面运作,利用内部代表来压制表面模式。尽管如此,LLMs仍然容易产生幻觉,特别是在较长的环境下。在本文中,我们提议了一种新的解码战略,即积极的多层调解码战略,即积极决定代中何时应用对比层。通过将解码作为一个相继的决策问题,ActLCD采用了一种强化学习政策,由有奖分的分类师指导,使事实质量在象征性层面之外达到最佳水平。我们的实验表明,ActLCD超越了五个基准的最新方法,显示了它在减少不同代中幻觉方面的有效性。

Article 40

Title@2025-05-29 (4): How does Transformer Learn Implicit Reasoning?

Title: How does Transformer Learn Implicit Reasoning?

Wie lernt Transformer Implizite Vernunft?

变形者如何学习隐含理由? 2505.23653v1

Authors: Jiaran Ye, Zijun Yao, Zhidian Huang, Liangming Pan, Jinxin Liu, Yushi Bai, Amy Xin, Liu Weichuan, Xiaoyin Che, Lei Hou, Juanzi Li

Recent work suggests that large language models (LLMs) can perform multi-hop reasoning implicitly – producing correct answers without explicitly verbalizing intermediate steps – but the underlying mechanisms remain poorly understood. In this paper, we study how such implicit reasoning emerges by training transformers from scratch in a controlled symbolic environment. Our analysis reveals a three-stage developmental trajectory: early memorization, followed by in-distribution generalization, and eventually cross-distribution generalization. We find that training with atomic triples is not necessary but accelerates learning, and that second-hop generalization relies on query-level exposure to specific compositional structures. To interpret these behaviors, we introduce two diagnostic tools: cross-query semantic patching, which identifies semantically reusable intermediate representations, and a cosine-based representational lens, which reveals that successful reasoning correlates with the cosine-based clustering in hidden space. This clustering phenomenon in turn provides a coherent explanation for the behavioral dynamics observed across training, linking representational structure to reasoning capability. These findings provide new insights into the interpretability of implicit multi-hop reasoning in LLMs, helping to clarify how complex reasoning processes unfold internally and offering pathways to enhance the transparency of such models.

最近的工作表明,大型语言模型(LLMS)可以隐含地进行多动脉推理 – – 提出正确的答案,而没有明确地解释中间步骤 – – 但基本机制仍然不易理解。在本文中,我们研究这些隐含的推理如何通过在受控制的象征性环境中从零开始培训变压器而产生。我们的分析揭示了一个三阶段的发展轨迹:早期记忆,随后是分布式的概括,最终是跨分布式的概括化。我们发现,用原子三联体进行的培训没有必要,而是加速学习,而第二波的概括化取决于对特定组成结构的询问程度的暴露。为了解释这些行为,我们引入了两种诊断工具:交叉拼写语的语义拼接,它识别了可重新使用的语义中间表达器,以及基于共弦的表达镜,它揭示了成功的推理与隐蔽空间的正基组合有关。这种组合现象反过来为整个培训中观察到的行为动态提供了一致的解释,将代表性结构与推理能力联系起来。这些发现为LLMSMS的隐含多动性多动推理提供了新的理解性解释性提供了新的见解。

Article 41

Title@2025-05-29 (4): Optimization-Free Diffusion Model – A Perturbation Theory Approach

Title: Optimization-Free Diffusion Model – A Perturbation Theory Approach

Optimierungsfreies Diffusionsmodell – Ein Perturbationstheorie-Ansatz

优化-无优化传播模式 – – 扰动理论方法 2505.23652v1

Authors: Yuehaw Khoo, Mathias Oster, Yifan Peng

Diffusion models have emerged as a powerful framework in generative modeling, typically relying on optimizing neural networks to estimate the score function via forward SDE simulations. In this work, we propose an alternative method that is both optimization-free and forward SDE-free. By expanding the score function in a sparse set of eigenbasis of the backward Kolmogorov operator associated with the diffusion process, we reformulate score estimation as the solution to a linear system, avoiding iterative optimization and time-dependent sample generation. We analyze the approximation error using perturbation theory and demonstrate the effectiveness of our method on high-dimensional Boltzmann distributions and real-world datasets.

传播模型已成为基因模型的强大框架,通常依靠优化神经网络,通过前方SDE模拟来估计得分函数。在这项工作中,我们提出了一种替代方法,既无优化,又无前方SDE。通过扩大与扩散过程相关的落后的科尔莫戈罗夫操作员的零星的分数功能,我们重新将得分估计作为线性系统的解决方案,避免迭代优化和根据时间生成样本。我们使用扰动理论分析近似误差,并展示我们在高维波尔茨曼分布和真实世界数据集上的方法的有效性。

Article 42

Title@2025-05-29 (4): Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation

Title: Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation

Merge-Friendly Post-Training Quantization für Multi-Target Domain-Anpassung

多目标域适应培训后量化 2505.23651v1

Authors: Juncheol Shin, Minsang Seok, Seonggon Kim, Eunhyeok Park

Model merging has emerged as a powerful technique for combining task-specific weights, achieving superior performance in multi-target domain adaptation. However, when applied to practical scenarios, such as quantized models, new challenges arise. In practical scenarios, quantization is often applied to target-specific data, but this process restricts the domain of interest and introduces discretization effects, making model merging highly non-trivial. In this study, we analyze the impact of quantization on model merging through the lens of error barriers. Leveraging these insights, we propose a novel post-training quantization, HDRQ - Hessian and distant regularizing quantization - that is designed to consider model merging for multi-target domain adaptation. Our approach ensures that the quantization process incurs minimal deviation from the source pre-trained model while flattening the loss surface to facilitate smooth model merging. To our knowledge, this is the first study on this challenge, and extensive experiments confirm its effectiveness.

合并模型已成为一种强大的技术,可以将特定任务的权重结合起来,在多目标领域的适应中实现优异性能。然而,当应用到实际情景,例如量化模型时,就会出现新的挑战。在实际情景中,量化往往适用于特定目标数据,但这一过程限制了关注领域,并引入了分化效应,使模型高度非三边性合并。在本研究中,我们分析了量化对通过错误障碍透镜合并模型的影响。利用这些洞察力,我们提出了一种新的培训后量化(HDHRQ - Hessian和远处常规化量化)方案,旨在考虑将模型合并用于多目标领域的适应。我们的方法确保量化进程在平整损失表面的同时,尽可能避免偏离源源前培训模式,从而便利模型的顺利合并。据我们所知,这是关于这一挑战的首次研究,并且广泛的实验证实了其有效性。

Article 43

Title@2025-05-29 (4): Optimal Bounds for Adversarial Constrained Online Convex Optimization

Title: Optimal Bounds for Adversarial Constrained Online Convex Optimization

Optimale Grenzen für die Online-Konvergenzoptimierung

优化在线电传优化优化 2503.13366v4

Authors: Ricardo N. Ferreira, Cláudia Soares

Constrained Online Convex Optimization (COCO) can be seen as a generalization of the standard Online Convex Optimization (OCO) framework. At each round, a cost function and a constraint function are revealed after a learner chooses an action. The goal is to minimize both the regret and cumulative constraint violation (CCV) against an adaptive adversary. We show for the first time that it is possible to obtain the optimal $O(\sqrt{T})$ bound on both regret and CCV, improving the best known bounds of $O \left( \sqrt{T} \right)$ and $\tilde{O} \left( \sqrt{T} \right)$ for the regret and CCV, respectively. Based on a new surrogate loss function enforcing a minimum penalty on the constraint function, we demonstrate that both the Follow-the-Regularized-Leader and the Online Gradient Descent achieve the optimal bounds.
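For reference, the two quantities being bounded are usually defined as follows in the COCO literature. The notation is an assumption and may differ from the paper's: $f_t$ and $g_t$ are the cost and constraint functions revealed at round $t$, $x_t$ is the learner's action, and $\mathcal{X}^\star = \{x \in \mathcal{X} : g_t(x) \le 0 \text{ for all } t\}$ is the set of actions that are feasible in every round.

```latex
\mathrm{Regret}_T = \sum_{t=1}^{T} f_t(x_t) - \min_{x \in \mathcal{X}^\star} \sum_{t=1}^{T} f_t(x),
\qquad
\mathrm{CCV}_T = \sum_{t=1}^{T} \max\{ g_t(x_t),\, 0 \}
```

An $O(\sqrt{T})$ bound on both means that the average per-round regret and the average per-round violation each vanish at rate $1/\sqrt{T}$.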

受约束的在线 convex 优化( COCO) 可以被视为对标准在线 Convex 优化( OCO) 框架的概括化。在每一回合中, 在学习者选择一个动作后都会显示成本函数和约束功能。目标是尽量减少对适应性对手的遗憾和累积约束性违反( CCV) 。我们第一次显示有可能获得对遗憾和CCV 的约束的最佳美元, 改进已知的美元左( scqrt{ T}\right) 和 $\ tilde{ O} 左(\ lft (\ qrt{ T}\right) 的最佳界限。基于对约束功能实施最低处罚的新套位损失功能, 我们证明后续带和在线梯族都达到了最佳界限。

Article 44

Title@2025-05-29 (4): Continuous Chain of Thought Enables Parallel Exploration and Reasoning

Title: Continuous Chain of Thought Enables Parallel Exploration and Reasoning

Kontinuierliche Gedankenkette ermöglicht parallele Erkundung und Vernunft

连续思考链有助于平行探索和推理 2505.23648v1

Authors: Halil Alperen Gozeten, M. Emrullah Ildiz, Xuechen Zhang, Hrayr Harutyunyan, Ankit Singh Rawat, Samet Oymak

Current language models generate chain-of-thought traces by autoregressively sampling tokens from a finite vocabulary. While this discrete sampling has achieved remarkable success, conducting chain-of-thought with continuously-valued tokens (CoT2) offers a richer and more expressive alternative. Our work examines the benefits of CoT2 through logical reasoning tasks that inherently require search capabilities and provides optimization and exploration methods for CoT2. Theoretically, we show that CoT2 allows the model to track multiple traces in parallel and quantify its benefits for inference efficiency. Notably, a one-layer transformer equipped with CoT2 can provably solve the combinatorial “subset sum problem” given sufficient embedding dimension. These insights lead to a novel and effective supervision strategy where we match the softmax outputs to the empirical token distributions of a set of target traces. Complementing this, we introduce sampling strategies that unlock policy optimization and self-improvement for CoT2. Our first strategy samples and composes $K$ discrete tokens at each decoding step to control the level of parallelism, and reduces to standard CoT when $K=1$. Our second strategy relies on continuous exploration over the probability simplex. Experiments confirm that policy optimization with CoT2 indeed improves the performance of the model beyond its initial discrete or continuous supervision.
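The first sampling strategy can be sketched as below. It assumes that "composing" the $K$ sampled tokens means averaging their embeddings, which is an illustrative choice rather than a detail confirmed by the abstract; the function name and arguments are likewise hypothetical.

```python
import random

def continuous_token(probs, embeddings, k, rng):
    """Sample k token ids from probs and average their embeddings.

    Returns the averaged embedding (the continuous token) and the sampled ids.
    """
    ids = rng.choices(range(len(probs)), weights=probs, k=k)
    dim = len(embeddings[0])
    vector = [sum(embeddings[i][d] for i in ids) / k for d in range(dim)]
    return vector, ids
```

With `k=1` the result is exactly the embedding of one sampled token, which matches ordinary discrete chain-of-thought; larger `k` mixes several candidate tokens into one step.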

当前的语言模型通过从限定词汇中自动递增抽样符号产生思维链痕迹。虽然这种离散的抽样已经取得了显著的成功, 使用持续价值的象征(CoT2) 进行思维链(CoT2) 提供了更丰富和更直观的替代方法。我们的工作通过逻辑推理任务来审查COT2的好处,这些逻辑推理任务必然需要搜索能力,并为CoT2. 提供优化和探索方法。我们从理论上表明, CoT2 允许该模型平行跟踪多种痕迹并量化其推论效率的好处。值得注意的是, 一个配有 CoT2 的层变异器可以在足够嵌入层面之后, 解决组合“ 子集问题 ” 。这些洞察力导致一种新颖和有效的监督战略, 我们通过将软负载输出与一组目标痕迹的经验符号分布相匹配。作为补充, 我们引入采样战略, 解开政策优化和自我简化为CO2, 我们的第一个战略样本在控制平行水平的每个解码步骤中都配置$K=1美元, 并降低为标准的COT 标准值, 当模型显示其连续的精确度时, 我们的优化战略依赖于持续探索政策。

Article 45

Title@2025-05-29 (4): Are Reasoning Models More Prone to Hallucination?

Title: Are Reasoning Models More Prone to Hallucination?

Sind vernünftigere Modelle eher halluzinierend?

理性模型更能让人产生幻觉吗? 2505.23646v1

Authors: Zijun Yao, Yantao Liu, Yanxu Chen, Jianhui Chen, Junfeng Fang, Lei Hou, Juanzi Li, Tat-Seng Chua

Recently evolved large reasoning models (LRMs) show powerful performance in solving complex tasks with long chain-of-thought (CoT) reasoning capability. As these LRMs are mostly developed by post-training on formal reasoning tasks, whether they generalize the reasoning capability to help reduce hallucination in fact-seeking tasks remains unclear and debated. For instance, DeepSeek-R1 reports increased performance on SimpleQA, a fact-seeking benchmark, while OpenAI-o3 observes even more severe hallucination. This discrepancy naturally raises the following research question: Are reasoning models more prone to hallucination? This paper addresses the question from three perspectives. (1) We first conduct a holistic evaluation of hallucination in LRMs. Our analysis reveals that LRMs that undergo a full post-training pipeline with cold start supervised fine-tuning (SFT) and verifiable reward RL generally alleviate their hallucination. In contrast, both distillation alone and RL training without cold start fine-tuning introduce more nuanced hallucinations. (2) To explore why different post-training pipelines alter the impact on hallucination in LRMs, we conduct behavior analysis. We characterize two critical cognitive behaviors that directly affect the factuality of an LRM: Flaw Repetition, where the surface-level reasoning attempts repeatedly follow the same underlying flawed logic, and Think-Answer Mismatch, where the final answer fails to faithfully match the previous CoT process. (3) Further, we investigate the mechanism behind the hallucination of LRMs from the perspective of model uncertainty. We find that increased hallucination of LRMs is usually associated with the misalignment between model uncertainty and factual accuracy. Our work provides an initial understanding of the hallucination in LRMs.

最近发展起来的大型推理模型(LRMs)显示,在解决复杂任务时,有长期思维链推理能力(CoT)推理能力(LRMs)的强大表现。由于这些LRM多数是通过正式推理任务的培训后开发的,因此,它们是否广泛运用推理能力来帮助减少寻求事实的任务中的幻觉,现在仍然不清楚和辩论。例如,DeepSeek-RS1报告提高了简单QA(一个寻求事实的基准)的性能,而OpenAI-o3则观察到了更严重的幻觉。这种差异自然引起以下研究问题:推理模型更易产生幻觉吗?本文从三个角度处理问题。(1) 我们首先对LRMMs的幻觉进行整体评价。我们的分析显示,LRMRMs在培训后的全面演练中,经过寒冷监督的微调(SFT)和可核查的奖励RLL通常会减轻其幻觉。相比之下,光学和RM的L培训后演算过程通常也会改变我们错觉的正确性结果。

Article 46

Title@2025-05-29 (4): Towards Unified Attribution in Explainable AI, Data-Centric AI, and Mechanistic Interpretability

Title: Towards Unified Attribution in Explainable AI, Data-Centric AI, and Mechanistic Interpretability

Auf dem Weg zu einer einheitlichen Attribution in erklärbarer KI, datenzentraler KI und mechanistischer Interpretierbarkeit

实现可解释的AI、数据集中AI和机械可解释性的统一归属 2501.18887v3

Authors: Shichang Zhang, Tessa Han, Usha Bhalla, Himabindu Lakkaraju

The increasing complexity of AI systems has made understanding their behavior critical. Numerous interpretability methods have been developed to attribute model behavior to three key aspects: input features, training data, and internal model components, which emerged from explainable AI, data-centric AI, and mechanistic interpretability, respectively. However, these attribution methods are studied and applied rather independently, resulting in a fragmented landscape of methods and terminology. This position paper argues that feature, data, and component attribution methods share fundamental similarities, and a unified view of them benefits both interpretability and broader AI research. To this end, we first analyze popular methods for these three types of attributions and present a unified view demonstrating that these seemingly distinct methods employ similar techniques (such as perturbations, gradients, and linear approximations) over different aspects and thus differ primarily in their perspectives rather than techniques. Then, we demonstrate how this unified view enhances understanding of existing attribution methods, highlights shared concepts and evaluation criteria among these methods, and leads to new research directions both in interpretability research, by addressing common challenges and facilitating cross-attribution innovation, and in AI more broadly, with applications in model editing, steering, and regulation.

AI系统越来越复杂,因此理解它们的行为至关重要。许多解释方法已经发展成许多,将模型行为分为三个关键方面:投入特征、培训数据和内部模型组成部分,这些组成部分分别来自可解释的AI、以数据为中心的AI和机械解释。然而,这些归属方法是相当独立的研究和应用,造成方法和术语的不成体系。本立场文件认为,特征、数据和组成部分归属方法具有基本相似性,统一看待这些方法既有利于解释性,又有利于更广泛的AI研究。为此,我们首先分析这三类属性的流行方法,提出统一的观点,表明这些似乎截然不同的方法在不同方面采用相似的技术(如扰动、梯度和线性近似),因此主要在它们的观点上不同,而不是在技术上不同。然后,我们展示这种统一的观点如何增进对现有归属方法的理解,突出这些方法的共同概念和评价标准,并导致在解释性研究方面找到新的研究方向,解决共同的挑战,促进交叉归属创新,在AI中更为广泛地应用模式编辑、指导和监管。

Article 47

Title@2025-05-29 (4): Global optimization of graph acquisition functions for neural architecture search

Title: Global optimization of graph acquisition functions for neural architecture search

Globale Optimierung von Graphen-Erfassungsfunktionen für die neuronale Architektursuche

用于神经架构搜索的图采集函数的全局优化 2505.23640v1

Authors: Yilin Xie, Shiqiang Zhang, Jixiang Qing, Ruth Misener, Calvin Tsay

Graph Bayesian optimization (BO) has shown potential as a powerful and data-efficient tool for neural architecture search (NAS). Most existing graph BO works focus on developing graph surrogate models, i.e., metrics of networks and/or different kernels to quantify the similarity between networks. However, acquisition optimization, as a discrete optimization task over graph structures, is not well studied due to the complexity of formulating the graph search space and acquisition functions. This paper presents explicit optimization formulations for the graph input space, including properties such as reachability and shortest paths, which are used later to formulate graph kernels and the acquisition function. We theoretically prove that the proposed encoding is an equivalent representation of the graph space and provide restrictions for the NAS domain with either node or edge labels. Numerical results over several NAS benchmarks show that our method efficiently finds the optimal architecture in most cases, highlighting its efficacy.
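To make the two graph properties named in the abstract concrete, here is a minimal sketch that computes shortest-path lengths and reachability for one fixed architecture graph. It only evaluates these quantities for a given adjacency matrix; the paper instead encodes them as constraints inside an optimization formulation, which this example does not attempt. The function names and the small four-node cell are assumptions for illustration.

```python
import math
from typing import List


def shortest_paths(adj: List[List[int]]) -> List[List[float]]:
    """All-pairs shortest path lengths (Floyd-Warshall) on a directed graph."""
    n = len(adj)
    dist = [
        [0 if i == j else (1 if adj[i][j] else math.inf) for j in range(n)]
        for i in range(n)
    ]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def reachability(dist: List[List[float]]) -> List[List[bool]]:
    """Node j is reachable from node i exactly when the distance is finite."""
    return [[d < math.inf for d in row] for row in dist]


# Hypothetical 4-node cell: input 0 feeds nodes 1 and 2, both feed output 3.
adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
dist = shortest_paths(adj)
reach = reachability(dist)
print(dist[0][3], reach[1][2])
```

Shortest-path lengths like these are the raw ingredient of shortest-path style graph kernels, which is presumably why the formulation tracks them explicitly.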

图表 Bayesian 优化(BO) 显示了作为神经结构搜索(NAS)的强大和数据效率工具的潜力。多数现有图表BO 侧重于开发图形替代模型,即网络和/或不同内核的量度,以量化网络之间的相似性。然而,由于绘制图形搜索空间和获取功能的复杂性,作为与图形结构的单独优化任务,对获取优化没有进行很好研究。本文展示了图形输入空间的清晰优化配方,包括可达性和最短路径等属性,这些属性后来被用于绘制图形内核和获取功能。我们理论上证明,拟议的编码相当于图形空间的等量度,并为NAS 域提供了限制,有节点或边缘标签。几个NAS 基准的数值结果显示,我们的方法在多数情况下都有效地找到了最佳结构,突出其功效。

Article 48

Title@2025-05-29 (4): Position: Scaling LLM Agents Requires Asymptotic Analysis with LLM Primitives

Title: Position: Scaling LLM Agents Requires Asymptotic Analysis with LLM Primitives

Position: Skalierung von LLM-Agenten erfordert asymptotische Analyse mit LLM-Primitiven

立场:扩展 LLM 智能体需要基于 LLM 原语的渐近分析 2502.04358v2

Authors: Elliot Meyerson, Xin Qiu

Decomposing hard problems into subproblems often makes them easier and more efficient to solve. With large language models (LLMs) crossing critical reliability thresholds for a growing slate of capabilities, there is an increasing effort to decompose systems into sets of LLM-based agents, each of whom can be delegated sub-tasks. However, this decomposition (even when automated) is often intuitive, e.g., based on how a human might assign roles to members of a human team. How close are these role decompositions to optimal? This position paper argues that asymptotic analysis with LLM primitives is needed to reason about the efficiency of such decomposed systems, and that insights from such analysis will unlock opportunities for scaling them. By treating the LLM forward pass as the atomic unit of computational cost, one can separate out the (often opaque) inner workings of a particular LLM from the inherent efficiency of how a set of LLMs are orchestrated to solve hard problems. In other words, if we want to scale the deployment of LLMs to the limit, instead of anthropomorphizing LLMs, asymptotic analysis with LLM primitives should be used to reason about and develop more powerful decompositions of large problems into LLM agents.
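One way to read the argument is as ordinary cost counting with the LLM call as the unit. The sketch below compares two hypothetical ways of combining `n_items` partial results when a single agent call can merge at most `fan_in` inputs: a sequential chain and a balanced tree. Counting each agent call as one unit is a simplifying assumption of this example (the abstract uses the forward pass as the atomic unit, and a call that generates many tokens costs many passes); the function names are illustrative.

```python
from typing import Tuple


def tree_reduction_cost(n_items: int, fan_in: int) -> Tuple[int, int]:
    """Return (total calls, sequential depth) for a balanced tree of merges.

    Assumes fan_in >= 2. A leftover group with a single item is still
    counted as one call, to keep the sketch short.
    """
    calls, depth, remaining = 0, 0, n_items
    while remaining > 1:
        groups = -(-remaining // fan_in)  # ceiling division
        calls += groups
        depth += 1
        remaining = groups
    return calls, depth


def chain_reduction_cost(n_items: int, fan_in: int) -> Tuple[int, int]:
    """Return (total calls, sequential depth) when one agent folds items in order.

    Each call takes the running result plus up to fan_in - 1 new items.
    """
    if n_items <= 1:
        return 0, 0
    calls = -(-(n_items - 1) // (fan_in - 1))
    return calls, calls  # every call waits for the previous one


print(tree_reduction_cost(16, 2), chain_reduction_cost(16, 2))
```

With pairwise merges over 16 items, both layouts use the same number of calls but the tree has logarithmic depth, so the difference is in latency rather than total work. That kind of distinction is what an asymptotic treatment would make explicit.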

将难题分解为子问题往往使其更容易、更高效地求解。随着大语言模型(LLM)在越来越多的能力上跨越关键的可靠性阈值,人们正越来越多地尝试将系统分解为若干基于 LLM 的智能体,每个智能体都可以被委派子任务。然而,这种分解(即使是自动完成的)往往凭直觉进行,例如参照人类如何为团队成员分配角色。这些角色分解距离最优有多近?本立场论文认为,需要基于 LLM 原语的渐近分析来推理此类分解系统的效率,而这种分析得到的洞见将为扩展这些系统带来机会。通过将 LLM 前向传播视为计算成本的原子单位,可以把特定 LLM(往往不透明)的内部工作机制与一组 LLM 如何被编排来解决难题的内在效率区分开来。换句话说,如果我们想把 LLM 的部署规模扩展到极限,就不应将 LLM 拟人化,而应使用基于 LLM 原语的渐近分析来推理并开发把大问题分解为 LLM 智能体的更强大方法。

Article 49

Title@2025-05-29 (4): MCP Safety Training: Learning to Refuse Falsely Benign MCP Exploits using Improved Preference Alignment

Title: MCP Safety Training: Learning to Refuse Falsely Benign MCP Exploits using Improved Preference Alignment

MCP Safety Training: Lernen, fälschlich harmlos wirkende MCP-Exploits mit verbesserter Präferenzausrichtung abzulehnen

MCP 安全训练:利用改进的偏好对齐学会拒绝伪良性 MCP 攻击 2505.23634v1

Authors: John Halloran

The model context protocol (MCP) has been widely adopted as an open standard enabling the seamless integration of generative AI agents. However, recent work has shown the MCP is susceptible to retrieval-based “falsely benign” attacks (FBAs), allowing malicious system access and credential theft, but requiring that users download compromised files directly to their systems. Herein, we show that the threat model of MCP-based attacks is significantly broader than previously thought, i.e., attackers need only post malicious content online to deceive MCP agents into carrying out their attacks on unsuspecting victims’ systems. To improve alignment guardrails against such attacks, we introduce a new MCP dataset of FBAs and (truly) benign samples to explore the effectiveness of direct preference optimization (DPO) for the refusal training of large language models (LLMs). While DPO improves model guardrails against such attacks, we show that the efficacy of refusal learning varies drastically depending on the model’s original post-training alignment scheme; e.g., GRPO-based LLMs learn to refuse extremely poorly. Thus, to further improve FBA refusals, we introduce Retrieval Augmented Generation for Preference alignment (RAG-Pref), a novel preference alignment strategy based on RAG. We show that RAG-Pref significantly improves the ability of LLMs to refuse FBAs, particularly when combined with DPO alignment, thus drastically improving guardrails against MCP-based attacks.
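For readers unfamiliar with DPO, the sketch below computes the standard per-example DPO loss from sequence log-probabilities. In a refusal-training setting like the one described, the "chosen" response would plausibly be the refusal and the "rejected" response the compliance with a falsely benign request; that mapping, the argument names, and the numeric values are assumptions for illustration, and the paper's dataset and training setup are not reproduced here.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one preference pair: -log(sigmoid(beta * margin)).

    The margin is the policy-vs-reference log-ratio of the chosen response
    minus the same log-ratio for the rejected response.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    z = beta * margin
    # Numerically stable form of -log(sigmoid(z)).
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Hypothetical log-probabilities: the policy already prefers the refusal.
loss_good = dpo_loss(-1.0, -5.0, -2.0, -2.0, beta=0.5)
# Same numbers with the roles swapped: the policy prefers the unsafe completion.
loss_bad = dpo_loss(-5.0, -1.0, -2.0, -2.0, beta=0.5)
print(loss_good, loss_bad)
```

When the policy and reference agree exactly, the margin is zero and the loss is log 2; the loss falls below that as the policy shifts probability toward the chosen response relative to the reference.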

模型上下文协议(MCP)已被广泛采用,作为实现生成式 AI 智能体无缝集成的开放标准。然而,最近的工作表明,MCP 容易受到基于检索的“伪良性”攻击(FBA),这类攻击可导致恶意系统访问和凭证窃取,但需要用户将被篡改的文件直接下载到自己的系统中。本文表明,基于 MCP 的攻击的威胁模型比此前认为的要广泛得多,即攻击者只需在网上发布恶意内容,就能欺骗 MCP 智能体对毫无戒备的受害者系统实施攻击。为了加强针对此类攻击的对齐护栏,我们引入了一个包含 FBA 和(真正)良性样本的新 MCP 数据集,以探索直接偏好优化(DPO)在大语言模型(LLM)拒绝训练中的有效性。虽然 DPO 改进了模型针对此类攻击的护栏,但我们发现拒绝学习的效果会因模型原有的训练后对齐方案而有很大差异,例如基于 GRPO 的 LLM 学习拒绝的效果极差。因此,为进一步改进对 FBA 的拒绝,我们提出了用于偏好对齐的检索增强生成(RAG-Pref),这是一种基于 RAG 的新型偏好对齐策略。我们表明,RAG-Pref 显著提升了 LLM 拒绝 FBA 的能力,在与 DPO 对齐结合时尤为明显,从而大幅加强了针对基于 MCP 的攻击的护栏。

Article 50

Title@2025-05-29 (4): Prompting Whisper for Improved Verbatim Transcription and End-to-end Miscue Detection

Title: Prompting Whisper for Improved Verbatim Transcription and End-to-end Miscue Detection

Prompting Whisper für verbesserte wörtliche Transkription und End-to-End-Miscue-Erkennung

提示 Whisper 以改进逐字转录和端到端误读检测 2505.23627v1

Authors: Griffin Dietz Smith, Dianna Yee, Jennifer King Chen, Leah Findlater

Identifying mistakes (i.e., miscues) made while reading aloud is commonly approached post-hoc by comparing automatic speech recognition (ASR) transcriptions to the target reading text. However, post-hoc methods perform poorly when ASR inaccurately transcribes verbatim speech. To improve on current methods for reading error annotation, we propose a novel end-to-end architecture that incorporates the target reading text via prompting and is trained for both improved verbatim transcription and direct miscue detection. Our contributions include: first, demonstrating that incorporating reading text through prompting benefits verbatim transcription performance over fine-tuning, and second, showing that it is feasible to augment speech recognition tasks for end-to-end miscue detection. We conducted two case studies – children’s read-aloud and adult atypical speech – and found that our proposed strategies improve verbatim transcription and miscue detection compared to current state-of-the-art.
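The post-hoc baseline that the abstract contrasts against can be sketched as a word-level alignment between the target text and an ASR transcript. The example below uses Python's standard `difflib.SequenceMatcher` for the alignment; the function name `posthoc_miscues`, the miscue labels, and the sample sentences are assumptions for illustration. It is not the end-to-end method proposed in the paper, and it inherits exactly the weakness described: if the ASR output is not verbatim, the detected miscues are wrong.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

Miscue = Tuple[str, List[str], List[str]]  # (label, target words, spoken words)


def posthoc_miscues(target_text: str, asr_transcript: str) -> List[Miscue]:
    """Align target and transcript word by word and label the differences."""
    target = target_text.lower().split()
    spoken = asr_transcript.lower().split()
    matcher = SequenceMatcher(a=target, b=spoken, autojunk=False)
    miscues: List[Miscue] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            miscues.append(("substitution", target[i1:i2], spoken[j1:j2]))
        elif tag == "delete":
            miscues.append(("omission", target[i1:i2], []))
        elif tag == "insert":
            miscues.append(("insertion", [], spoken[j1:j2]))
    return miscues


miscues = posthoc_miscues("the cat sat on the mat", "the cat sit on the big mat")
print(miscues)
```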

阅读声响时的错误(即错误)通常会通过将自动语音识别(ASR)记录抄录与目标读取文本进行比较,来识别在读出声响时产生的错误(即错误),从而在事后发现时通常会发现错误。然而,当ASR不准确地抄录逐字记录稿时,事后方法效果不佳。为了改进当前读出错误注释的方法,我们提议了一个新的端对端结构,通过提示和训练将目标阅读文本纳入,从而改进逐字记录誊写和直接检测错误。我们的贡献包括:第一,表明通过在微调后推动效益逐字抄录功能纳入阅读文本,第二,表明加强语音识别任务以发现端到端错误是可行的。我们进行了两个案例研究:儿童读音和成人非典型的演讲。我们发现,我们提出的战略改善了逐字记录和错误检测,而与当前的最新技术相比,我们提出的战略则改善了逐字记录和错误检测。

Article 51

Title@2025-05-29 (4): Quartet: Native FP4 Training Can Be Optimal for Large Language Models

Title: Quartet: Native FP4 Training Can Be Optimal for Large Language Models

Quartett: Native FP4 Training kann für große Sprachmodelle optimal sein

Quartet:原生 FP4 训练对大语言模型可以是最优的 2505.14669v2

Authors: Roberto L. Castro, Andrei Panferov, Soroush Tabesh, Oliver Sieberling, Jiale Chen, Mahdi Nikdan, Saleh Ashkboos, Dan Alistarh

Training large language models (LLMs) directly in low precision offers a way to address computational costs by improving both throughput and energy efficiency. For those purposes, NVIDIA’s recent Blackwell architecture facilitates very low-precision operations using FP4 variants. Yet, current algorithms for training LLMs in FP4 precision face significant accuracy degradation and often rely on mixed-precision fallbacks. In this paper, we investigate hardware-supported FP4 training and introduce a new approach for accurate, end-to-end FP4 training with all the major computations (i.e., linear layers) in low precision. Through extensive evaluations on Llama-type models, we reveal a new low-precision scaling law that quantifies performance trade-offs across bit-widths and training setups. Guided by this investigation, we design an “optimal” technique in terms of accuracy-vs-computation, called Quartet. We implement Quartet using optimized CUDA kernels tailored for Blackwell, demonstrating that fully FP4-based training is a competitive alternative to FP16 half-precision and to FP8 training. Our code is available at https://github.com/IST-DASLab/Quartet.
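To give a feel for how coarse FP4 is, the sketch below simulates round-to-nearest quantization onto the eight magnitudes of the E2M1 format (0, 0.5, 1, 1.5, 2, 3, 4, 6), with one absmax scale per block. This is a plain simulation in Python floats under assumptions chosen for this example (a single unrestricted scale per block, nearest rounding); it is not Quartet's algorithm, and it does not model how hardware formats store their block scales.

```python
import math
from typing import List, Tuple

# Representable magnitudes of a 4-bit float with 2 exponent and 1 mantissa bits.
FP4_E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def quantize_fp4(block: List[float]) -> Tuple[List[float], float]:
    """Fake-quantize a block: scale so absmax maps to 6, round to the grid, rescale."""
    absmax = max(abs(v) for v in block)
    if absmax == 0.0:
        return [0.0] * len(block), 1.0
    scale = absmax / FP4_E2M1_GRID[-1]  # hypothetical per-block scale
    out = []
    for v in block:
        magnitude = min(FP4_E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append(math.copysign(magnitude, v) * scale)
    return out, scale


values, scale = quantize_fp4([6.0, -3.0, 0.4, 2.4])
print(values, scale)
```

In this toy block, 0.4 and 2.4 move to the nearest grid points 0.5 and 2.0. Rounding error of this size on every weight and activation is one way to see why naive FP4 training degrades accuracy and why dedicated methods are studied.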

直接以低精度训练大语言模型(LLM)可以通过提高吞吐量和能效来降低计算成本。为此,NVIDIA 最新的 Blackwell 架构支持使用 FP4 变体进行极低精度运算。然而,目前以 FP4 精度训练 LLM 的算法面临显著的精度下降,并且往往依赖混合精度回退。在本文中,我们研究硬件支持的 FP4 训练,并提出一种新方法,实现准确的端到端 FP4 训练,所有主要计算(即线性层)均以低精度进行。通过对 Llama 类模型的广泛评估,我们揭示了一条新的低精度缩放定律,用于量化不同位宽和训练设置下的性能权衡。在这一研究的指导下,我们设计了一种在精度与计算量权衡意义上“最优”的技术,称为 Quartet。我们使用为 Blackwell 定制的优化 CUDA 内核实现了 Quartet,表明完全基于 FP4 的训练是 FP16 半精度和 FP8 训练的有力替代方案。我们的代码见 https://github.com/IST-DASLab/Quartet。

Article 52

Title@2025-05-29 (4): SPACE: SPike-Aware Consistency Enhancement for Test-Time Adaptation in Spiking Neural Networks

Title: SPACE: SPike-Aware Consistency Enhancement for Test-Time Adaptation in Spiking Neural Networks

SPACE: SPike-Aware Consistency Enhancement für Test-Time-Anpassung in Spiking Neuronal Networks

SPACE:面向脉冲神经网络测试时自适应的脉冲感知一致性增强 2504.02298v2

Authors: Xinyu Luo, Kecheng Chen, Pao-Sheng Vincent Sun, Chris Xing Tian, Arindam Basu, Haoliang Li

Spiking Neural Networks (SNNs), as a biologically plausible alternative to Artificial Neural Networks (ANNs), have demonstrated advantages in terms of energy efficiency, temporal processing, and biological plausibility. However, SNNs are highly sensitive to distribution shifts, which can significantly degrade their performance in real-world scenarios. Traditional test-time adaptation (TTA) methods designed for ANNs often fail to address the unique computational dynamics of SNNs, such as sparsity and temporal spiking behavior. To address these challenges, we propose SPike-Aware Consistency Enhancement (SPACE), the first source-free and single-instance TTA method specifically designed for SNNs. SPACE leverages the inherent spike dynamics of SNNs to maximize the consistency of spike-behavior-based local feature maps across augmented versions of a single test sample, enabling robust adaptation without requiring source data. We evaluate SPACE on multiple datasets. Furthermore, SPACE exhibits robust generalization across diverse network architectures, consistently enhancing the performance of SNNs on CNNs (such as VGG and ResNet), Transformer models, and ConvLSTM architectures. Experimental results show that SPACE outperforms state-of-the-art methods, highlighting its effectiveness and robustness in real-world settings.

脉冲神经网络(SNN)作为人工神经网络(ANN)的一种生物学上合理的替代方案,在能效、时间处理和生物合理性方面展现了优势。然而,SNN 对分布偏移高度敏感,这会显著降低其在真实场景中的性能。为 ANN 设计的传统测试时自适应(TTA)方法往往无法处理 SNN 独特的计算动态,例如稀疏性和时间脉冲行为。为应对这些挑战,我们提出脉冲感知一致性增强(SPACE),这是首个专为 SNN 设计的无源、单实例 TTA 方法。SPACE 利用 SNN 固有的脉冲动态,最大化单个测试样本的各增强版本之间基于脉冲行为的局部特征图的一致性,从而无需源数据即可实现稳健的自适应。我们在多个数据集上评估了 SPACE。此外,SPACE 在多种网络架构上表现出稳健的泛化能力,持续提升 SNN 在 CNN(如 VGG 和 ResNet)、Transformer 模型和 ConvLSTM 架构上的性能。实验结果表明,SPACE 优于最先进的方法,凸显了其在真实环境中的有效性和稳健性。

Article 53

Title@2025-05-29 (4): Instance-Optimality for Private KL Distribution Estimation

Title: Instance-Optimality for Private KL Distribution Estimation

Instanz-Optimalität für private KL-Verteilungsabschätzung

隐私保护下 KL 分布估计的实例最优性 2505.23620v1

Authors: Jiayuan Ye, Vitaly Feldman, Kunal Talwar

We study the fundamental problem of estimating an unknown discrete distribution $p$ over $d$ symbols, given $n$ i.i.d. samples from the distribution. We are interested in minimizing the KL divergence between the true distribution and the algorithm’s estimate. We first construct minimax optimal private estimators. Minimax optimality, however, fails to shed light on an algorithm’s performance on individual (non-worst-case) instances $p$, and simple minimax-optimal DP estimators can have poor empirical performance on real distributions. We then study this problem from an instance-optimality viewpoint, where the algorithm’s error on $p$ is compared to the minimum achievable estimation error over a small local neighborhood of $p$. Under natural notions of local neighborhood, we propose algorithms that achieve instance-optimality up to constant factors, with and without a differential privacy constraint. Our upper bounds rely on (private) variants of the Good-Turing estimator. Our lower bounds use additive local neighborhoods that more precisely capture the hardness of distribution estimation in KL divergence, compared to ones considered in prior works.
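The setting can be made concrete with a small non-private sketch: an add-constant estimate of a discrete distribution, the KL divergence used to score it, and the classic Good-Turing estimate of the probability mass on unseen symbols (the fraction of samples whose symbol appears exactly once). These are textbook building blocks, not the instance-optimal or differentially private estimators constructed in the paper; the function names and the sample data are illustrative assumptions.

```python
import math
from collections import Counter
from typing import List, Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in nats; requires q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def add_constant_estimate(samples: Sequence[int], d: int, c: float = 1.0) -> List[float]:
    """Smoothed empirical distribution over symbols 0..d-1 (never assigns zero)."""
    counts = Counter(samples)
    n = len(samples)
    return [(counts.get(s, 0) + c) / (n + c * d) for s in range(d)]


def good_turing_missing_mass(samples: Sequence[int]) -> float:
    """Good-Turing estimate of the total probability of symbols not yet seen."""
    counts = Counter(samples)
    singletons = sum(1 for v in counts.values() if v == 1)
    return singletons / len(samples)


samples = [0, 0, 0, 1, 1, 2]          # hypothetical draws over d = 4 symbols
estimate = add_constant_estimate(samples, d=4)
missing = good_turing_missing_mass(samples)
print(estimate, missing)
```

Smoothing matters here because KL divergence is infinite whenever the estimate assigns zero probability to a symbol the true distribution can produce, which the raw empirical distribution does for every unseen symbol.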

我们研究的是估算一个未知的离散分配 $p$ 超过 $d 符号的基本问题, 给出的分布样本为美元 i.d. d. 。我们有兴趣将真实分布和算法估计之间的 KL 差异最小化。我们首先建造迷你最大优化的私人估计器。但是, 最小最大性能未能揭示算法在个人( 非最坏情况) 情况下的性能( 美元 ) 和简单小型最大最大最大DP估计器在真实分布上的经验性能差。然后, 我们从实例最佳性角度来研究这一问题, 将 $p$ 的算法误差与当地小区( $p$ ) 的最低可实现估计错误相比较。在本地周围的自然概念下, 我们建议的算法能够达到常数性能, 并且没有差别的隐私限制。我们的上限值依靠( 私人) 良好估计器的变式。我们的下限使用比对本地社区进行添加, 比较之前工程的计算。

Article 54

Title@2025-05-29 (4): Few-Shot Speech Deepfake Detection Adaptation with Gaussian Processes

Title: Few-Shot Speech Deepfake Detection Adaptation with Gaussian Processes

Few-Shot-Anpassung der Sprach-Deepfake-Erkennung mit Gaußschen Prozessen

基于高斯过程的少样本语音深度伪造检测自适应 2505.23619v1

Authors: Neta Glazer, David Chernin, Idan Achituve, Sharon Gannot, Ethan Fetaya

Recent advancements in Text-to-Speech (TTS) models, particularly in voice cloning, have intensified the demand for adaptable and efficient deepfake detection methods. As TTS systems continue to evolve, detection models must be able to efficiently adapt to previously unseen generation models with minimal data. This paper introduces ADD-GP, a few-shot adaptive framework based on a Gaussian Process (GP) classifier for Audio Deepfake Detection (ADD). We show how combining a powerful deep embedding model with the flexibility of Gaussian processes can achieve strong performance and adaptability. Additionally, we show this approach can also be used for personalized detection, with greater robustness to new TTS models and one-shot adaptability. To support our evaluation, a benchmark dataset is constructed for this task using new state-of-the-art voice cloning models.

近来在文本到语音(TTS)模型方面,特别是在语音克隆方面的进步,加强了对适应性和高效深假探测方法的需求。随着TTS系统不断发展,检测模型必须能够有效地适应以极少数据生成的先前不为人知的一代模型。本文介绍了ADD-GP,这是基于音频深藏器探测高山过程(ADD)的几张微小的适应性框架。我们展示了强大的深层嵌入模型与高山过程灵活性的结合如何能够实现强大的性能和适应性。此外,我们展示了这种方法也可以用于个性化检测,对新的TTS模型和一发式适应性更强。为了支持我们的评估,使用新的最新语音克隆模型为这项任务构建了一个基准数据集。

Article 55

Title@2025-05-29 (4): One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory

Title: One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory

Eine Trajektorie, ein Token: Erdliche Video-Tokenisierung über panoptische Sub-Objekt-Trajektorie

一个轨迹, 一个 Token: 通过泛光子物件轨迹, 固定的视频轨迹 2505.23617v1

Authors: Chenhao Zheng, Jieyu Zhang, Mohammadreza Salehi, Ziqi Gao, Vishnu Iyengar, Norimasa Kobori, Quan Kong, Ranjay Krishna

Effective video tokenization is critical for scaling transformer models for long videos. Current approaches tokenize videos using space-time patches, leading to excessive tokens and computational inefficiencies. The best token reduction strategies degrade performance and barely reduce the number of tokens when the camera moves. We introduce grounded video tokenization, a paradigm that organizes tokens based on panoptic sub-object trajectories rather than fixed patches. Our method aligns with fundamental perceptual principles, ensuring that tokenization reflects scene complexity rather than video duration. We propose TrajViT, a video encoder that extracts object trajectories and converts them into semantically meaningful tokens, significantly reducing redundancy while maintaining temporal coherence. Trained with contrastive learning, TrajViT significantly outperforms space-time ViT (ViT3D) across multiple video understanding benchmarks, e.g., TrajViT outperforms ViT3D by a large margin of 6% top-5 recall on average on the video-text retrieval task with a 10x token reduction. We also show that TrajViT is a stronger video encoder than ViT3D for modern VideoLLMs, obtaining an average performance improvement of 5.2% across 6 VideoQA benchmarks while training 4x faster and using 18x fewer inference FLOPs. TrajViT is the first efficient encoder to consistently outperform ViT3D across diverse video analysis tasks, making it a robust and scalable solution.

有效的视频 token 化对于将 Transformer 模型扩展到长视频至关重要。现有方法使用时空图块(patch)对视频进行 token 化,导致 token 数量过多、计算效率低下。最好的 token 缩减策略也会降低性能,并且在相机移动时几乎无法减少 token 数量。我们提出有依据的(grounded)视频 token 化,这一范式基于全景子对象轨迹而非固定图块来组织 token。我们的方法符合基本的感知原则,确保 token 化反映的是场景复杂度而非视频时长。我们提出 TrajViT,这是一种视频编码器,它提取对象轨迹并将其转换为具有语义意义的 token,在保持时间一致性的同时显著减少冗余。经对比学习训练后,TrajViT 在多个视频理解基准上显著优于时空 ViT(ViT3D),例如在视频-文本检索任务上,TrajViT 的 top-5 召回率平均比 ViT3D 高出 6%,同时 token 数量减少 10 倍。我们还表明,作为现代 VideoLLM 的视频编码器,TrajViT 比 ViT3D 更强:在 6 个 VideoQA 基准上平均性能提升 5.2%,同时训练速度快 4 倍,推理 FLOPs 减少 18 倍。TrajViT 是第一个在多种视频分析任务上持续优于 ViT3D 的高效编码器,是一种稳健且可扩展的解决方案。

Article 56

Title@2025-05-29 (4): Causal Machine Learning in IoT-based Engineering Problems: A Tool Comparison in the Case of Household Energy Consumption

Title: Causal Machine Learning in IoT-based Engineering Problems: A Tool Comparison in the Case of Household Energy Consumption

Kausales maschinelles Lernen in IoT-basierten Engineering-Problemen: Ein Tool-Vergleich im Fall des Haushaltsenergieverbrauchs

基于物联网的工程问题中的因果机器学习:以家庭能源消费为例的工具比较 2505.12147v2

Authors: Nikolaos-Lysias Kosioris, Sotirios Nikoletseas, Gavrilis Filios, Stefanos Panagiotou

The rapid increase in computing power and the ability to store Big Data in the infrastructure has enabled predictions in a large variety of domains by Machine Learning. However, in many cases, existing Machine Learning tools are considered insufficient or incorrect since they exploit only probabilistic dependencies rather than inference logic. Causal Machine Learning methods seem to close this gap. In this paper, two prevalent tools based on Causal Machine Learning methods are compared, along with their mathematical underpinnings. The operation of the tools is demonstrated by examining their response to 18 queries, based on the IDEAL Household Energy Dataset, published by the University of Edinburgh. First, it was important to evaluate the causal relations assumption that allowed the use of this approach; this was based on the preexisting scientific knowledge of the domain and was implemented by use of the in-built validation tools. Results were encouraging and may easily be extended to other domains.

计算机能力的迅速增长和在基础设施中存储大数据的能力的迅速增长使得机器学习能够在大量领域作出预测,然而,在许多情况下,现有机器学习工具被认为不充分或不正确,因为它们只利用概率依赖性而不是推论逻辑。由于机器学习方法似乎缩小了这一差距。在本文中,比较了两个基于Causal机器学习方法的常用工具及其数学基础背景。工具的运作表现在审查它们对根据爱丁堡大学出版的IDEL家用能源数据集提出的18个问题的答复。首先,必须评估允许使用这一方法的因果关系假设;这是以领域先前的科学知识为基础,通过使用内部验证工具加以实施的。结果令人鼓舞,而且很容易推广到其他领域。

Article 57

Title@2025-05-29 (4): Learning Interpretable Differentiable Logic Networks for Tabular Regression

Title: Learning Interpretable Differentiable Logic Networks for Tabular Regression

Lernen interpretierbarer differenzierbarer Logiknetzwerke für tabellarische Regression

学习用于表格回归的可解释可微逻辑网络 2505.23615v1

Authors: Chang Yue, Niraj K. Jha

Neural networks (NNs) achieve outstanding performance in many domains; however, their decision processes are often opaque and their inference can be computationally expensive in resource-constrained environments. We recently proposed Differentiable Logic Networks (DLNs) to address these issues for tabular classification based on relaxing discrete logic into a differentiable form, thereby enabling gradient-based learning of networks built from binary logic operations. DLNs offer interpretable reasoning and substantially lower inference cost. We extend the DLN framework to supervised tabular regression. Specifically, we redesign the final output layer to support continuous targets and unify the original two-phase training procedure into a single differentiable stage. We evaluate the resulting model on 15 public regression benchmarks, comparing it with modern neural networks and classical regression baselines. Regression DLNs match or exceed baseline accuracy while preserving interpretability and fast inference. Our results show that DLNs are a viable, cost-effective alternative for regression tasks, especially where model transparency and computational efficiency are important.
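The phrase "relaxing discrete logic into a differentiable form" can be illustrated with the usual probabilistic relaxations of Boolean gates, plus a neuron that mixes candidate gates with softmax weights so that the gate choice itself becomes learnable. This is a generic sketch of that idea; the set of gates, the `relaxed_neuron` name, and the softmax parameterization are assumptions here and may differ from how DLNs are actually parameterized.

```python
import math
from typing import Dict, List

GATE_NAMES = ["AND", "OR", "XOR", "NOT_A"]


def soft_gates(a: float, b: float) -> Dict[str, float]:
    """Real-valued gate outputs that match Boolean logic when a, b are 0 or 1."""
    return {
        "AND": a * b,
        "OR": a + b - a * b,
        "XOR": a + b - 2.0 * a * b,
        "NOT_A": 1.0 - a,
    }


def relaxed_neuron(a: float, b: float, logits: List[float]) -> float:
    """Softmax-weighted mixture of gates; differentiable in a, b, and logits."""
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]  # shift for numerical stability
    total = sum(exps)
    gates = soft_gates(a, b)
    return sum((e / total) * gates[name] for e, name in zip(exps, GATE_NAMES))


uniform = relaxed_neuron(1.0, 1.0, [0.0, 0.0, 0.0, 0.0])
nearly_and = relaxed_neuron(1.0, 1.0, [50.0, 0.0, 0.0, 0.0])
print(uniform, nearly_and)
```

After training, taking the highest-weight gate at each neuron would give a purely discrete circuit, which is where the interpretability and cheap inference claimed for this family of models would come from.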

神经网络在许多领域都取得了杰出的成绩;然而,它们的决策过程往往不透明,在资源受限制的环境中,它们的推论可能计算得非常昂贵。我们最近提议了不同的逻辑网络(DLNs)来解决这些问题,以便根据松散的离散逻辑进行列表分类,将其分为不同的形式,从而能够以梯度为基础学习从二元逻辑操作中建立的网络。DLNs提供了可解释的推理,并大大降低了推论成本。我们把DLN框架扩大到受监督的表格回归。具体地说,我们重新设计了最后产出层,以支持连续目标,并将最初的两阶段培训程序统一到一个可区分的阶段。我们根据15个公共回归基准对由此产生的模型进行了评估,将其与现代神经网络和经典回归基线进行比较。回归DLNs在保存可解释性和快速推断性的同时匹配或超过基线精度。我们的结果表明,DLNs是回归任务的可行、成本效益高的替代方法,特别是在模型透明和计算效率重要的情况下。

Article 58

Title@2025-05-29 (4): Inference-time Scaling of Diffusion Models through Classical Search

Title: Inference-time Scaling of Diffusion Models through Classical Search

Inferenzzeit Skalierung von Diffusionsmodellen durch klassische Suche

通过经典搜索实现扩散模型的推理时扩展 2505.23614v1

Authors: Xiangcheng Zhang, Haowei Lin, Haotian Ye, James Zou, Jianzhu Ma, Yitao Liang, Yilun Du

Classical search algorithms have long underpinned modern artificial intelligence. In this work, we tackle the challenge of inference-time control in diffusion models – adapting generated outputs to meet diverse test-time objectives – using principles from classical search. We propose a general framework that orchestrates local and global search to efficiently navigate the generative space. It employs a theoretically grounded local search via annealed Langevin MCMC and performs compute-efficient global exploration using breadth-first and depth-first tree search. We evaluate our approach on a range of challenging domains, including planning, offline reinforcement learning, and image generation. Across all tasks, we observe significant gains in both performance and efficiency. These results show that classical search provides a principled and practical foundation for inference-time scaling in diffusion models. Project page at diffusion-inference-scaling.github.io.

古典搜索算法长期以来一直是现代人工智能的基础。在这项工作中,我们利用古典搜索的原则,应对传播模型的推论时间控制的挑战 – – 调整产生的产出,以实现不同的测试时间目标。我们提出了一个总体框架,通过本地和全球搜索,以高效地导航基因空间。我们通过annealed Langevin MCMC 进行基于理论的本地搜索,并利用宽度第一和深度第一树搜索进行计算高效的全球探索。我们评估了我们在一系列具有挑战性的领域的做法,包括规划、离线强化学习和图像生成。我们观察了在业绩和效率方面所取得的重大进展。这些结果显示,古典搜索为传播模型的推论时间缩提供了原则和实践基础。

Article 59

Title@2025-05-29 (4): The Generalized Skew Spectrum of Graphs

Title: The Generalized Skew Spectrum of Graphs

Das generalisierte Skew-Spektrum der Graphen

图的广义斜谱(Skew Spectrum) 2505.23609v1

Authors: Armando Bellante, Martin Plávala, Alessandro Luongo

This paper proposes a family of permutation-invariant graph embeddings, generalizing the Skew Spectrum of graphs of Kondor & Borgwardt (2008). Grounded in group theory and harmonic analysis, our method introduces a new class of graph invariants that are isomorphism-invariant and capable of embedding richer graph structures - including attributed graphs, multilayer graphs, and hypergraphs - which the Skew Spectrum could not handle. Our generalization further defines a family of functions that enables a trade-off between computational complexity and expressivity. By applying generalization-preserving heuristics to this family, we improve the Skew Spectrum’s expressivity at the same computational cost. We formally prove the invariance of our generalization, demonstrate its improved expressiveness through experiments, and discuss its efficient computation.

本文提出了一组变异图嵌入, 概括了 Kondor & Borgwardt 和 Borgwardt 图表的Skew Spectrum(2008年) 。我们的方法以群论和和谐分析为基础, 引入了一种新的图形变异物类别, 这些变异体是非正态的, 能够嵌入更丰富的图形结构, 包括可归属图、多层图和高压图, Skew Spectrum 无法处理这些结构。我们的概括化进一步定义了能够平衡计算复杂性和表达性之间的函数组合。通过对这个家庭应用一般化- 保留超自然论, 我们用同样的计算成本改进Skew Spectrum 的表达性。我们正式证明了我们一般化的变异性, 通过实验来显示其更清晰的表达性, 并讨论其高效的计算。

Article 60

Title@2025-05-29 (4): Data Model Design for Explainable Machine Learning-based Electricity Applications

Title: Data Model Design for Explainable Machine Learning-based Electricity Applications

Datenmodell-Design für erklärbare maschinelle Learning-basierte Stromanwendungen

可解释机器学习用电力应用数据模型设计 2505.23607v1

Authors: Carolina Fortuna, Gregor Cerar, Blaz Bertalanic, Andrej Campa, Mihael Mohorcic

The transition from traditional power grids to smart grids, a significant increase in the use of renewable energy sources, and soaring electricity prices have triggered a digital transformation of the energy infrastructure that enables new, data-driven applications, often supported by machine learning models. However, the majority of the developed machine learning models rely on univariate data. To date, a structured study considering the role of meta-data and additional measurements resulting in multivariate data is missing. In this paper, we propose a taxonomy that identifies and structures various types of data related to energy applications. The taxonomy can be used to guide application-specific data model development for training machine learning models. Focusing on a household electricity forecasting application, we validate the effectiveness of the proposed taxonomy in guiding the selection of the features for various types of models. As such, we study the effect of domain, contextual, and behavioral features on the forecasting accuracy of four interpretable machine learning techniques and three openly available datasets. Finally, using feature importance techniques, we explain individual feature contributions to the forecasting accuracy.

传统电网向智能电网的过渡、可再生能源使用量的大幅增加、以及电价的飙升,都引发了能源基础设施的数字化转型,使新的、数据驱动的、往往由机器学习模型支持的应用得以实现。然而,大多数发达的机器学习模型依赖单体数据。迄今为止,尚缺乏一项结构化研究,研究元数据和导致多变量数据的额外测量的作用。在这份文件中,我们建议了一种分类学,确定和构建与能源应用有关的各类数据。分类学可用于指导用于培训机器学习模型的具体数据模型的开发。我们以家庭电力预测应用为重点,验证了拟议的分类学在指导选择各类模型特征方面的有效性。因此,我们研究了域、背景和行为特征对四种可解释的机器学习技术和三种公开提供的数据集预测准确性的影响。最后,我们用一种特征重要技术,解释了对预测准确性所作的个人特征贡献。

Article 61

Title@2025-05-29 (4): Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

Title: Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

Muddit: Befreiende Generation jenseits von Text-zu-Bild mit einem Unified Discrete Diffusion Model

Muddit:以统一离散扩散模型解放文本到图像之外的生成 2505.23606v1

Authors: Qingyu Shi, Jinbin Bai, Zhuoran Zhao, Wenhao Chai, Kaidong Yu, Jianzong Wu, Shuangyong Song, Yunhai Tong, Xiangtai Li, Xuelong Li, Shuicheng Yan

Unified generation models aim to handle diverse tasks across modalities – such as text generation, image generation, and vision-language reasoning – within a single architecture and decoding paradigm. Autoregressive unified models suffer from slow inference due to sequential decoding, and non-autoregressive unified models suffer from weak generalization due to limited pretrained backbones. We introduce Muddit, a unified discrete diffusion transformer that enables fast and parallel generation across both text and image modalities. Unlike prior unified diffusion models trained from scratch, Muddit integrates strong visual priors from a pretrained text-to-image backbone with a lightweight text decoder, enabling flexible and high-quality multimodal generation under a unified architecture. Empirical results show that Muddit achieves competitive or superior performance compared to significantly larger autoregressive models in both quality and efficiency. The work highlights the potential of purely discrete diffusion, when equipped with strong visual priors, as a scalable and effective backbone for unified generation.

单一一代模式旨在在一个单一架构和解码模式中处理不同模式的不同任务 – – 如文本生成、图像生成和视觉语言推理等。自动递减统一模式由于顺序解码而出现缓慢的推论,非自动递增统一模式由于受过训练的骨干有限而出现薄弱的概括化。我们引入了一个统一的离散扩散变压器,即Mudddit,它能够在文字和图像模式之间实现快速和平行的生成。与以前从零开始训练的统一传播模型不同,Mudddit将预先训练过的文本到图像主干网的强直观预感与轻量文本解码相结合,使得在一个统一的架构下能够进行灵活和高质量的多式联运生成。经验性结果显示,Mudddit在质量和效率两方面都具有竞争力或优异性,而自动递增型模型则大得多。工作凸显了纯离散扩散的潜力,在具备强的视觉前导力时,作为统一生成的可缩和有效骨干。

Article 62

Title@2025-05-29 (4): STeCa: Step-level Trajectory Calibration for LLM Agent Learning

Title: STeCa: Step-level Trajectory Calibration for LLM Agent Learning

STeCa: Schritt-Level-Trajektorienkalibrierung für LLM Agent Learning

STeCa:面向 LLM 智能体学习的步骤级轨迹校准 2502.14276v2

Authors: Hanlin Wang, Jian Wang, Chak Tou Leong, Wenjie Li

Large language model (LLM)-based agents have shown promise in tackling complex tasks by interacting dynamically with the environment. Existing work primarily focuses on behavior cloning from expert demonstrations or preference learning through exploratory trajectory sampling. However, these methods often struggle to address long-horizon tasks, where suboptimal actions accumulate step by step, causing agents to deviate from correct task trajectories. To address this, we highlight the importance of timely calibration and the need to automatically construct calibration trajectories for training agents. We propose Step-Level Trajectory Calibration (STeCa), a novel framework for LLM agent learning. Specifically, STeCa identifies suboptimal actions through a step-level reward comparison during exploration. It constructs calibrated trajectories using LLM-driven reflection, enabling agents to learn from improved decision-making processes. We finally leverage these calibrated trajectories with successful trajectories for reinforced training. Extensive experiments demonstrate that STeCa significantly outperforms existing methods. Further analysis highlights that timely calibration enables agents to complete tasks with greater robustness. Our code and data are available at https://github.com/WangHanLinHenry/STeCa.

基于大语言模型(LLM)的智能体通过与环境动态交互,在处理复杂任务方面展现了潜力。现有工作主要侧重于从专家示范中进行行为克隆,或通过探索性轨迹采样进行偏好学习。然而,这些方法往往难以应对长程任务,次优动作在其中逐步累积,导致智能体偏离正确的任务轨迹。为此,我们强调及时校准的重要性,以及自动构建校准轨迹用于训练智能体的必要性。我们提出步骤级轨迹校准(STeCa),这是一个用于 LLM 智能体学习的新框架。具体而言,STeCa 在探索过程中通过步骤级奖励比较来识别次优动作,并利用 LLM 驱动的反思构建校准轨迹,使智能体能够从改进后的决策过程中学习。最后,我们将这些校准轨迹与成功轨迹一起用于强化训练。大量实验表明,STeCa 显著优于现有方法。进一步分析表明,及时校准使智能体能够更稳健地完成任务。我们的代码和数据见 https://github.com/WangHanLinHenry/STeCa。

Article 63

Title@2025-05-29 (4): On Transferring Transferability: Towards a Theory for Size Generalization

Title: On Transferring Transferability: Towards a Theory for Size Generalization

Übertragbarkeit: Auf dem Weg zu einer Theorie der Größenverallgemeinerung

关于转让可转让性:走向一个通用规模理论 2505.23599v1

Authors: Eitan Levin, Yuxin Ma, Mateo Díaz, Soledad Villar

Many modern learning tasks require models that can take inputs of varying sizes. Consequently, dimension-independent architectures have been proposed for domains where the inputs are graphs, sets, and point clouds. Recent work on graph neural networks has explored whether a model trained on low-dimensional data can transfer its performance to higher-dimensional inputs. We extend this body of work by introducing a general framework for transferability across dimensions. We show that transferability corresponds precisely to continuity in a limit space formed by identifying small problem instances with equivalent large ones. This identification is driven by the data and the learning task. We instantiate our framework on existing architectures, and implement the necessary changes to ensure their transferability. Finally, we provide design principles for designing new transferable models. Numerical experiments support our findings.
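A toy case may help with the idea of identifying a small input with an "equivalent" larger one. In the sketch below, a set model with mean pooling returns the same value on a small set and on the same set with every element repeated, while sum pooling does not. Treating duplication as the equivalence is an assumption made for this example (the abstract says the identification depends on the data and the task), and the models are deliberately trivial.

```python
import math
from typing import List


def mean_pool_model(xs: List[float]) -> float:
    """Dimension-independent set model: average of a per-element feature."""
    feats = [x * x for x in xs]
    return sum(feats) / len(feats)


def sum_pool_model(xs: List[float]) -> float:
    """Same per-element feature, but aggregated with a sum."""
    return sum(x * x for x in xs)


small = [1.0, 2.0, 3.0]
large = small * 4  # hypothetical "equivalent" larger instance: each element repeated

print(mean_pool_model(small), mean_pool_model(large))
print(sum_pool_model(small), sum_pool_model(large))
```

Under this identification the mean-pooled model behaves consistently as the input grows, whereas the sum-pooled model's output scales with size, so only the former would be expected to carry its behavior from small to large inputs without modification.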

许多现代学习任务需要能够吸收不同大小投入的模型。因此, 已经为输入为图表、数据集和点云的域提出了维维独立结构。图表神经网络的近期工作探讨了一个接受过低维数据培训的模型能否将其性能转移至高维投入。我们通过引入一个通用的跨维可转移框架扩展了这项工作内容。我们显示,可转移性与通过识别小问题案例和同等大案例而形成的有限空间的连续性完全吻合。这个识别由数据和学习任务驱动。我们对现有结构的框架进行即时化,并进行必要的修改,以确保其可转移性。最后, 我们为设计新的可转移模型提供了设计原则。数字实验支持了我们的调查结果。

Article 64

Title@2025-05-29 (4): LLM Performance for Code Generation on Noisy Tasks

Title: LLM Performance for Code Generation on Noisy Tasks

LLM-Performance für Code-Generierung bei lauten Aufgaben

LLM 噪音任务代码生成的LLM性能 2505.23598v1

Authors: Radzim Sendyka, Christian Cabrera, Andrei Paleyes, Diana Robinson, Neil Lawrence

This paper investigates the ability of large language models (LLMs) to recognise and solve tasks which have been obfuscated beyond recognition. Focusing on competitive programming and benchmark tasks (LeetCode and MATH), we compare performance across multiple models and obfuscation methods, such as noise and redaction. We demonstrate that all evaluated LLMs can solve tasks obfuscated to a level where the text would be unintelligible to human readers, and does not contain key pieces of instruction or context. We introduce the concept of eager pattern matching to describe this behaviour, which is not observed in tasks published after the models’ knowledge cutoff date, indicating strong memorisation or overfitting to training data, rather than legitimate reasoning about the presented problem. We report empirical evidence of distinct performance decay patterns between contaminated and unseen datasets. We discuss the implications for benchmarking and evaluations of model behaviour, arguing for caution when designing experiments using standard datasets. We also propose measuring the decay of performance under obfuscation as a possible strategy for detecting dataset contamination and highlighting potential safety risks and interpretability issues for automated software systems.

本文调查了大型语言模型(LLMS)认识和解决超出认知范围的任务的能力。我们把注意力集中在竞争性编程和基准任务(LeetCode和MATH)上,比较了多种模型和模糊方法(例如噪音和编辑)的性能。我们证明,所有经过评价的LLMS都能够解决被模糊的任务,使其达到对人类读者不易理解的程度,而没有包含关键的指示或背景。我们引入了渴望模式匹配的概念来描述这种行为,在模型知识截止日期之后公布的任务中没有观察到这种行为,表明高度的记忆化或过度适应培训数据,而不是对所提出的问题进行合理的推理。我们报告了被污染的数据集和不可见的数据集之间不同性能衰减模式的经验证据。我们讨论了在使用标准数据集设计实验时对基准和行为评价的影响,我们主张谨慎。我们还提议测量模糊状态下性能的衰败,作为发现数据污染和突出潜在安全风险以及自动化软件系统可解释性问题的可能战略。

Article 65

Title@2025-05-29 (4): Multilook Coherent Imaging: Theoretical Guarantees and Algorithms

Title: Multilook Coherent Imaging: Theoretical Guarantees and Algorithms

Multilook Coherent Imaging: Theoretische Garantien und Algorithmen

多视相协调成像:理论保障和理算 2505.23594v1

Authors: Xi Chen, Soham Jana, Christopher A. Metzler, Arian Maleki, Shirin Jalali

Multilook coherent imaging is a widely used technique in applications such as digital holography, ultrasound imaging, and synthetic aperture radar. A central challenge in these systems is the presence of multiplicative noise, commonly known as speckle, which degrades image quality. Despite the widespread use of coherent imaging systems, their theoretical foundations remain relatively underexplored. In this paper, we study both the theoretical and algorithmic aspects of likelihood-based approaches for multilook coherent imaging, providing a rigorous framework for analysis and method development. Our theoretical contributions include establishing the first theoretical upper bound on the Mean Squared Error (MSE) of the maximum likelihood estimator under the deep image prior hypothesis. Our results capture the dependence of MSE on the number of parameters in the deep image prior, the number of looks, the signal dimension, and the number of measurements per look. On the algorithmic side, we employ projected gradient descent (PGD) as an efficient method for computing the maximum likelihood solution. Furthermore, we introduce two key ideas to enhance the practical performance of PGD. First, we incorporate the Newton-Schulz algorithm to compute matrix inverses within the PGD iterations, significantly reducing computational complexity. Second, we develop a bagging strategy to mitigate projection errors introduced during PGD updates. We demonstrate that combining these techniques with PGD yields state-of-the-art performance. Our code is available at https://github.com/Computational-Imaging-RU/Bagged-DIP-Speckle.

多视相干成像是数字全息、超声成像和合成孔径雷达等应用中广泛使用的技术。这类系统的一个核心挑战是存在乘性噪声(通常称为散斑),它会降低图像质量。尽管相干成像系统应用广泛,其理论基础仍相对缺乏研究。本文研究了多视相干成像中基于似然的方法的理论和算法两个方面,为分析和方法开发提供了严格的框架。我们的理论贡献包括:在深度图像先验假设下,首次建立了最大似然估计器均方误差(MSE)的理论上界。我们的结果刻画了MSE对深度图像先验中的参数数量、视数、信号维度以及每视测量数的依赖关系。在算法方面,我们采用投影梯度下降(PGD)作为计算最大似然解的高效方法。此外,我们引入了两个关键思想来提升PGD的实际性能。第一,我们在PGD迭代中使用Newton-Schulz算法计算矩阵的逆,显著降低了计算复杂度。第二,我们提出了一种bagging策略,以减轻PGD更新过程中引入的投影误差。我们证明,将这些技术与PGD相结合可获得最先进的性能。我们的代码见 https://github.com/Computational-Imaging-RU/Bagged-DIP-Speckle。
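
The Newton-Schulz step mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It shows only the classical iteration $X_{k+1} = X_k (2I - A X_k)$ for approximating $A^{-1}$ with matrix products alone; the function names, the iteration count, and the initialization $X_0 = A^\top / (\|A\|_1 \|A\|_\infty)$ are illustrative assumptions, not taken from the authors' code, and the abstract does not say how the paper embeds the iteration in PGD.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def newton_schulz_inverse(a, iterations=30):
    """Approximate the inverse of a square nonsingular matrix `a`.

    Hypothetical helper for illustration only.
    """
    n = len(a)
    # Standard safe start: X0 = A^T / (||A||_1 * ||A||_inf).
    norm_1 = max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))
    norm_inf = max(sum(abs(v) for v in row) for row in a)
    scale = 1.0 / (norm_1 * norm_inf)
    x = [[a[j][i] * scale for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        ax = matmul(a, x)
        # Residual-correcting factor (2I - A X).
        correction = [[(2.0 if i == j else 0.0) - ax[i][j]
                       for j in range(n)] for i in range(n)]
        x = matmul(x, correction)
    return x
```

Because the residual $I - A X_k$ is squared at every step, a handful of iterations is usually enough once the iterate is close, which is why the method is attractive inside an outer loop that already holds a good estimate from the previous step.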

Article 66

Title@2025-05-29 (4): Position: Federated Foundation Language Model Post-Training Should Focus on Open-Source Models

Title: Position: Federated Foundation Language Model Post-Training Should Focus on Open-Source Models

Position: Federated Foundation Language Model Nachschulung sollte sich auf Open-Source-Modelle konzentrieren

立场:联邦基金会语文示范培训后培训应侧重于开放来源模式 2505.23593v1

Authors: Nikita Agrawal, Simon Mertel, Ruben Mayer

Post-training of foundation language models has emerged as a promising research domain in federated learning (FL) with the goal to enable privacy-preserving model improvements and adaptations to user’s downstream tasks. Recent advances in this area adopt centralized post-training approaches that build upon black-box foundation language models where there is no access to model weights and architecture details. Although the use of black-box models has been successful in centralized post-training, their blind replication in FL raises several concerns. Our position is that using black-box models in FL contradicts the core principles of federation such as data privacy and autonomy. In this position paper, we critically analyze the usage of black-box models in federated post-training, and provide a detailed account of various aspects of openness and their implications for FL.

基础语言模型的后培训已成为联合会学习(FL)中一个有希望的研究领域,目标是使隐私保护模式的改进和适应用户的下游任务。该领域最近的进展是采用集中的培训后培训方法,以黑盒基础语言模型为基础,在没有机会获得模型重量和结构细节的情况下建立这种模式。虽然在集中培训后使用黑盒模式取得了成功,但在FL中盲目复制却引起了若干关切。我们的立场是,在FL中使用黑盒模式与联邦的核心原则,如数据隐私和自主性相矛盾。在本立场文件中,我们严格分析黑盒模式在黑盒培训后培训中的使用情况,并详细说明开放的各个方面及其对FL的影响。

Article 67

Title@2025-05-29 (4): Accelerated Training of Federated Learning via Second-Order Methods

Title: Accelerated Training of Federated Learning via Second-Order Methods

Beschleunigte Ausbildung des Föderierten Lernens über Methoden der zweiten Ordnung

通过二级方法加快联邦学习培训 2505.23588v1

Authors: Mrinmay Sen, Sidhant R Nair, C Krishna Mohan

This paper explores second-order optimization methods in Federated Learning (FL), addressing the critical challenges of slow convergence and the excessive communication rounds required to achieve optimal performance from the global model. While existing surveys in FL primarily focus on challenges related to statistical and device label heterogeneity, as well as privacy and security concerns in first-order FL methods, less attention has been given to the issue of slow model training. This slow training often leads to the need for excessive communication rounds or increased communication costs, particularly when data across clients are highly heterogeneous. In this paper, we examine various FL methods that leverage second-order optimization to accelerate the training process. We provide a comprehensive categorization of state-of-the-art second-order FL methods and compare their performance based on convergence speed, computational cost, memory usage, transmission overhead, and generalization of the global model. Our findings show the potential of incorporating Hessian curvature through second-order optimization into FL and highlight key challenges, such as the efficient utilization of Hessian and its inverse in FL. This work lays the groundwork for future research aimed at developing scalable and efficient federated optimization methods for improving the training of the global model in FL.

本文探讨了联邦学习联合会(FL)的二级优化方法,探讨了缓慢趋同和为达到全球模式最佳业绩所需的过度通信周期等关键挑战。虽然FL的现有调查主要侧重于与统计和装置标签差异有关的挑战,以及一级FL方法的隐私和安全问题,但对模式培训缓慢问题的关注较少。这种缓慢的培训往往导致需要过多的通信回合或增加通信成本,特别是在客户数据高度差异的情况下。我们在本文件中审查了利用第二级优化来加快培训进程的多种FL方法。我们提供了第二级FL方法的全面分类,并根据趋同速度、计算成本、记忆使用、传承间接费用和全球模型的普及,比较其业绩。我们的调查结果显示,通过第二级优化将赫森曲线纳入FL的可能性,并突出了主要挑战,例如赫桑的有效利用及其在FL的反面。这项工作为今后旨在改进FC可升级和高效全球优化方法的示范研究奠定了基础。

Article 68

Title@2025-05-29 (4): PCA for Enhanced Cross-Dataset Generalizability in Breast Ultrasound Tumor Segmentation

Title: PCA for Enhanced Cross-Dataset Generalizability in Breast Ultrasound Tumor Segmentation

PCA für verbesserte Cross-Dataset-Verallgemeinerung in der Brust-Ultraschall-Tumor-Segmentierung

主成分分析(PCA)用于增强乳腺超声肿瘤分割的跨数据集泛化能力 2505.23587v1

Authors: Christian Schmidt, Heinrich Martin Overhoff

In medical image segmentation, limited external validity remains a critical obstacle when models are deployed across unseen datasets, an issue particularly pronounced in the ultrasound image domain. Existing solutions, such as domain adaptation and GAN-based style transfer, are promising but often fall short in the medical domain, where datasets are typically small and diverse. This paper presents a novel application of principal component analysis (PCA) to address this limitation. PCA preprocessing reduces noise and emphasizes essential features by retaining approximately 90% of the dataset variance. We evaluate our approach across six diverse breast tumor ultrasound datasets comprising 3,983 B-mode images and corresponding expert tumor segmentation masks. For each dataset, a corresponding dimensionality-reduced PCA dataset is created, and U-Net-based segmentation models are trained on each of the twelve datasets. Each model trained on an original dataset was evaluated on the remaining five out-of-domain original datasets (baseline results), while each model trained on a PCA dataset was evaluated on five out-of-domain PCA datasets. Our experimental results indicate that using PCA-reconstructed datasets instead of original images improves the model’s recall and Dice scores, particularly for model-dataset pairs where baseline performance was lowest, achieving statistically significant gains in recall (0.57 $\pm$ 0.07 vs. 0.70 $\pm$ 0.05, $p = 0.0004$) and Dice scores (0.50 $\pm$ 0.06 vs. 0.58 $\pm$ 0.06, $p = 0.03$). Our method reduced the decline in recall values due to external validation by 33%. These findings underscore the potential of PCA reconstruction as a safeguard to mitigate declines in segmentation performance, especially in challenging cases, with implications for enhancing external validity in real-world medical applications.

在医学图像分割中,当模型部署到未见过的数据集上时,有限的外部有效性仍然是一个关键障碍,这一问题在超声图像领域尤为突出。现有的解决方案(如域适应和基于GAN的风格迁移)虽然有前景,但在数据集通常较小且多样的医学领域往往效果不足。本文提出了主成分分析(PCA)的一种新应用来解决这一局限。PCA预处理通过保留约90%的数据集方差来降低噪声并突出关键特征。我们在六个不同的乳腺肿瘤超声数据集上评估了该方法,这些数据集共包含3,983幅B模式图像及相应的专家肿瘤分割掩膜。对于每个数据集,我们创建了相应的降维PCA数据集,并在这十二个数据集上分别训练基于U-Net的分割模型。在原始数据集上训练的每个模型在其余五个域外原始数据集上进行推理(基线结果),而在PCA数据集上训练的每个模型则在五个域外PCA数据集上进行推理。实验结果表明,使用PCA重建的数据集代替原始图像可以提高模型的召回率和Dice分数,尤其是对于基线性能最低的模型-数据集组合,召回率(0.57 $\pm$ 0.07 对 0.70 $\pm$ 0.05,$p = 0.0004$)和Dice分数(0.50 $\pm$ 0.06 对 0.58 $\pm$ 0.06,$p = 0.03$)均取得了统计学上显著的提升。我们的方法将外部验证导致的召回率下降减少了33%。这些发现凸显了PCA重建作为缓解分割性能下降的保障措施的潜力,尤其是在具有挑战性的情形下,对提高真实世界医学应用中的外部有效性具有意义。
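
The "retain approximately 90% of the variance" criterion can be made concrete with a small sketch. Given the PCA eigenvalues (per-component variances) of a dataset, it returns the smallest number of leading components whose cumulative share of variance reaches a target. The function name and the example eigenvalues are hypothetical; the abstract does not describe how the authors select or apply the components.

```python
def components_for_variance(eigenvalues, target=0.90):
    """Smallest k such that the top-k eigenvalues explain >= target of variance.

    Hypothetical helper; `eigenvalues` are the per-component variances
    produced by any PCA routine.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return k
    return len(ordered)
```

Reconstructing each image from only those leading components discards the low-variance directions, which is the noise-reduction effect the abstract attributes to PCA preprocessing.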

Article 69

Title@2025-05-29 (4): On-Policy RL with Optimal Reward Baseline

Title: On-Policy RL with Optimal Reward Baseline

On-Policy RL mit optimaler Prämienbasis

具有最佳回报基准的政策性RL 2505.23585v1

Authors: Yaru Hao, Li Dong, Xun Wu, Shaohan Huang, Zewen Chi, Furu Wei

Reinforcement learning algorithms are fundamental to align large language models with human preferences and to enhance their reasoning capabilities. However, current reinforcement learning algorithms often suffer from training instability due to loose on-policy constraints and computational inefficiency due to auxiliary models. In this work, we propose On-Policy RL with Optimal reward baseline (OPO), a novel and simplified reinforcement learning algorithm designed to address these challenges. OPO emphasizes the importance of exact on-policy training, which empirically stabilizes the training process and enhances exploration. Moreover, OPO introduces the optimal reward baseline that theoretically minimizes gradient variance. We evaluate OPO on mathematical reasoning benchmarks. The results demonstrate its superior performance and training stability without additional models or regularization terms. Furthermore, OPO achieves lower policy shifts and higher output entropy, encouraging more diverse and less repetitive responses. These results highlight OPO as a promising direction for stable and effective reinforcement learning in large language model alignment and reasoning tasks. The implementation is provided at https://github.com/microsoft/LMOps/tree/main/opo.

强化学习算法对于使大型语言模型与人类偏好对齐并增强其推理能力至关重要。然而,当前的强化学习算法往往因同策略(on-policy)约束宽松而训练不稳定,并因使用辅助模型而计算效率低下。在这项工作中,我们提出了带最优奖励基线的同策略强化学习(OPO),这是一种旨在应对这些挑战的新颖且简化的强化学习算法。OPO强调严格同策略训练的重要性,这在经验上能够稳定训练过程并增强探索。此外,OPO引入了在理论上使梯度方差最小的最优奖励基线。我们在数学推理基准上评估了OPO。结果表明,它无需额外的模型或正则化项即可取得更优的性能和训练稳定性。此外,OPO的策略偏移更小、输出熵更高,从而鼓励更多样、更少重复的回答。这些结果表明,OPO是在大型语言模型对齐和推理任务中实现稳定而有效的强化学习的一个有前景的方向。实现代码见 https://github.com/microsoft/LMOps/tree/main/opo。
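
As a rough illustration of a variance-minimizing reward baseline, the sketch below uses the classical result that, for a score-function gradient estimator, the scalar baseline minimizing variance is $b^* = \mathbb{E}[\|\nabla \log \pi\|^2 R] / \mathbb{E}[\|\nabla \log \pi\|^2]$, i.e. a weighted mean of rewards. The abstract does not state the exact form OPO uses; treating response length as a proxy for the squared gradient norm is an assumption made here for the example, and all names are hypothetical.

```python
def optimal_baseline(rewards, weights):
    """Weighted mean of rewards: sum(w_i * r_i) / sum(w_i).

    `weights` stand in for squared gradient norms of log-probabilities;
    using response lengths as the proxy is an assumption of this sketch.
    """
    return sum(w * r for w, r in zip(weights, rewards)) / sum(weights)


def advantages(rewards, weights):
    """Reward minus the shared baseline, one value per sampled response."""
    baseline = optimal_baseline(rewards, weights)
    return [r - baseline for r in rewards]
```

With equal weights this reduces to the plain mean-reward baseline, so the weighting only matters when the sampled responses differ in how strongly they move the gradient.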

Article 70

Title@2025-05-29 (4): Improving Time Series Forecasting via Instance-aware Post-hoc Revision

Title: Improving Time Series Forecasting via Instance-aware Post-hoc Revision

Verbesserung der Zeitreihenprognose über Instance-aware Post-hoc-Revision

改进时间序列预测,通过 “ 热后后预测 “ 改进时间序列预测 2505.23583v1

Authors: Zhiding Liu, Mingyue Cheng, Guanhao Zhao, Jiqian Yang, Qi Liu, Enhong Chen

Time series forecasting plays a vital role in various real-world applications and has attracted significant attention in recent decades. While recent methods have achieved remarkable accuracy by incorporating advanced inductive biases and training strategies, we observe that instance-level variations remain a significant challenge. These variations, stemming from distribution shifts, missing data, and long-tail patterns, often lead to suboptimal forecasts for specific instances, even when overall performance appears strong. To address this issue, we propose a model-agnostic framework, PIR, designed to enhance forecasting performance through Post-forecasting Identification and Revision. Specifically, PIR first identifies biased forecasting instances by estimating their accuracy. Based on this, the framework revises the forecasts using contextual information, including covariates and historical time series, from both local and global perspectives in a post-processing fashion. Extensive experiments on real-world datasets with mainstream forecasting models demonstrate that PIR effectively mitigates instance-level errors and significantly improves forecasting reliability.

时间序列预测在现实世界的各种应用中发挥着关键作用,近几十年来吸引了极大关注。虽然最近的方法通过纳入先进的感化偏差和培训战略取得了显著的准确性,但我们注意到,实例层面的差异仍是一个重大挑战。由于分布变化、数据缺失和长尾模式的变化,往往导致对具体实例的预测不尽如人意,即使总体性能看似强劲。为了解决这一问题,我们提议了一个模型-不可知性框架,即PIR,目的是通过预测后识别和订正来提高预测绩效。具体地说,PIR首先通过估计其准确性来查明偏差预测实例。基于这一点,该框架从后处理时的当地和全球角度对预测进行了修改,包括变量和历史时间序列。在现实世界数据集和主流预测模型上进行的广泛实验表明,PIR有效地减轻了实例级错误,并大大提高了预测的可靠性。

Article 71

Title@2025-05-29 (4): Wake-Informed 3D Path Planning for Autonomous Underwater Vehicles Using A* and Neural Network Approximations

Title: Wake-Informed 3D Path Planning for Autonomous Underwater Vehicles Using A* and Neural Network Approximations

Wake-Informierte 3D-Pfadplanung für autonome Unterwasserfahrzeuge mit A*- und Neuralnetzwerk-Annäherungen

使用A* 和神经网络相近的自动水下车辆的觉醒3D路径规划 2502.01918v2

Authors: Zachary Cooper-Baldock, Stephen Turnock, Karl Sammut

Autonomous Underwater Vehicles (AUVs) encounter significant energy, control and navigation challenges in complex underwater environments, particularly during close-proximity operations, such as launch and recovery (LAR), where fluid interactions and wake effects present additional navigational and energy challenges. Traditional path planning methods fail to incorporate these detailed wake structures, resulting in increased energy consumption, reduced control stability, and heightened safety risks. This paper presents a novel wake-informed, 3D path planning approach that fully integrates localized wake effects and global currents into the planning algorithm. Two variants of the A* algorithm - a current-informed planner and a wake-informed planner - are created to assess its validity and two neural network models are then trained to approximate these planners for real-time applications. Both the A* planners and NN models are evaluated using important metrics such as energy expenditure, path length, and encounters with high-velocity and turbulent regions. The results demonstrate a wake-informed A* planner consistently achieves the lowest energy expenditure and minimizes encounters with high-velocity regions, reducing energy consumption by up to 11.3%. The neural network models are observed to offer computational speedup of 6 orders of magnitude, but exhibit 4.51 - 19.79% higher energy expenditures and 9.81 - 24.38% less optimal paths. These findings underscore the importance of incorporating detailed wake structures into traditional path planning algorithms and the benefits of neural network approximations to enhance energy efficiency and operational safety for AUVs in complex 3D domains.

自主水下潜水器(AUV)在复杂的水下环境中,特别是在发射和回收(LAR)等近距离作业中,遇到能源、控制和导航方面的重大挑战,特别是在发射和回收(LAR)等近距离作业中,流动相互作用和后醒效应带来额外的航行和能源挑战;传统路径规划方法未能纳入这些详细的防守结构,导致能源消耗增加、控制稳定性降低、安全风险增加;本文介绍了一种新颖的知情后醒、3D路径规划方法,充分将局部后醒效应和全球潮流纳入规划算法;A* 算法的两个变式,即目前知情的计划者和后醒悟计划设计者,用来评估其有效性,然后对两个神经网络模型进行培训,以近似这些规划者进行实时应用。A* 规划师和NNN模型都使用能源支出增加、路径长度和与高速和动荡地区相遇等重要指标进行评估。结果显示,知情的A* 规划员始终实现最起码的能源支出,并将高速度区域遇到的能源消耗减少至11.3%,但晚知情规划。在19个神经网络模型显示,将更高速度的域域域域域域域域段的计算结果为24显示速度的进度为24的进度。

Article 72

Title@2025-05-29 (4): BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model

Title: BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model

BioReason: Förderung multimodaler biologischer Vernunft innerhalb eines DNA-LLM-Modells

BioReason:在DNA-LLM模型中激励多式生物理由 2505.23579v1

Authors: Adibvafa Fallahpour, Andrew Magnuson, Purav Gupta, Shihao Ma, Jack Naimer, Arnav Shah, Haonan Duan, Omar Ibrahim, Hani Goodarzi, Chris J. Maddison, Bo Wang

Unlocking deep, interpretable biological reasoning from complex genomic data is a major AI challenge hindering scientific discovery. Current DNA foundation models, despite strong sequence representation, struggle with multi-step reasoning and lack inherent transparent, biologically intuitive explanations. We introduce BioReason, a pioneering architecture that, for the first time, deeply integrates a DNA foundation model with a Large Language Model (LLM). This novel connection enables the LLM to directly process and reason with genomic information as a fundamental input, fostering a new form of multimodal biological understanding. BioReason’s sophisticated multi-step reasoning is developed through supervised fine-tuning and targeted reinforcement learning, guiding the system to generate logical, biologically coherent deductions. On biological reasoning benchmarks including KEGG-based disease pathway prediction - where accuracy improves from 88% to 97% - and variant effect prediction, BioReason demonstrates an average 15% performance gain over strong single-modality baselines. BioReason reasons over unseen biological entities and articulates decision-making through interpretable, step-by-step biological traces, offering a transformative approach for AI in biology that enables deeper mechanistic insights and accelerates testable hypothesis generation from genomic data. Data, code, and checkpoints are publicly available at https://github.com/bowang-lab/BioReason

从复杂的基因组数据中解开的深层、可解释的生物推理是妨碍科学发现的一项重大挑战。目前的DNA基础模型,尽管有很强的顺序代表,却与多步推理斗争,缺乏内在的透明、生物学直观的解释。我们引入了BioReason,这是一个开创性架构,首次将DNA基础模型与大语言模型(LLM)深入结合。这种新颖的连接使LLLM能够直接处理和解释基因组信息,将其作为一种基本投入,促进一种新形式的多式联运生物理解。BioReason的尖端多步推理是通过监督的微调和有针对性的强化学习来发展,指导系统产生符合逻辑、生物一致性的推理。关于生物推理基准,包括基于KEGG的疾病路径预测,其精确率从88%提高到97%,以及变异效应预测,BioReason显示平均15%的性能超过强的单一模式基线。关于无形生物实体的理由,并通过可解释、逐步的生物追踪来解释决策,为生物学的系统提供变革方法,在生物学上提供改变性方法,使数据/基因系统生成的模型的模型的模型能够更深层次上得到的数据。

Article 73

Title@2025-05-29 (4): CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring

Title: CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring

CoT Red-Handed: Stresstesting Chain-of-Thought-Überwachung

COT 红手:压力测试研究链监测 2505.23575v1

Authors: Benjamin Arnav, Pablo Bernabeu-Pérez, Nathan Helm-Burger, Tim Kostolansky, Hannes Whittingham, Mary Phuong

As AI models are deployed with increasing autonomy, it is important to ensure they do not take harmful actions unnoticed. As a potential mitigation, we investigate Chain-of-Thought (CoT) monitoring, wherein a weaker trusted monitor model continuously oversees the intermediate reasoning steps of a more powerful but untrusted model. We compare CoT monitoring to action-only monitoring, where only final outputs are reviewed, in a red-teaming setup where the untrusted model is instructed to pursue harmful side tasks while completing a coding problem. We find that CoT monitoring improves detection by up to 27 percentage points in scenarios where action-only monitoring fails to reliably identify sabotage. However, CoT traces can also contain misleading rationalizations that deceive the monitor, reducing performance in more obvious sabotage cases. To address this, we introduce a hybrid protocol that independently scores both reasoning and final outputs and combines them using a weighted average. This hybrid monitor consistently outperforms both CoT and action-only monitors across all tested models and tasks, with detection rates over four times higher than action-only monitoring for subtle deception scenarios.

由于大赦国际模式的部署越来越具有自主性,因此必须确保它们不会不注意有害行动。作为可能的缓解措施,我们调查“努力链”监测,其中受信任程度较低的监测模式持续监督一个更强大但不信任的模式的中间推理步骤。我们比较“信任度较低的监测”和“只行动”监测,因为只有最后产出才接受审查的“行动”监测,在红色组合中,不受信任的模式被指示执行有害的侧面任务,同时完成编码问题。我们发现,“信任度”监测在只采取行动的监测无法可靠地查明破坏的情景中提高了高达27个百分点的检测。然而,“信任度”跟踪还包含误导性合理化,欺骗了监测,降低了更明显的破坏性案例的绩效。为了解决这一问题,我们引入了一个混合协议,独立地计算推理和最终产出,并使用加权平均数将其组合在一起。这种混合监测始终超越“信任”和“行动”监测,在所有测试的模式和任务中,探测率比“行动”监测率高出4倍多于“行动”监测。
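
The hybrid protocol described above, which scores the reasoning trace and the final output independently and combines them with a weighted average, can be sketched in a few lines. The function name, the score scale, and the default weight are hypothetical; the abstract does not report which weight the authors use.

```python
def hybrid_suspicion(cot_score, action_score, cot_weight=0.5):
    """Weighted average of two independently produced monitor scores.

    `cot_score` rates the chain-of-thought, `action_score` rates the final
    output; both are assumed to share one scale (for example 0 to 100).
    """
    if not 0.0 <= cot_weight <= 1.0:
        raise ValueError("cot_weight must lie in [0, 1]")
    return cot_weight * cot_score + (1.0 - cot_weight) * action_score
```

Setting `cot_weight` to 1.0 or 0.0 recovers the pure CoT monitor or the pure action-only monitor, so the two baselines in the abstract are the end points of this one-parameter family.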

Article 74

Title@2025-05-29 (4): Maximum Likelihood Learning of Latent Dynamics Without Reconstruction

Title: Maximum Likelihood Learning of Latent Dynamics Without Reconstruction

Maximale Wahrscheinlichkeit Lernen von latenten Dynamiken ohne Rekonstruktion

学习没有重建的原始动力学 2505.23569v1

Authors: Samo Hromadka, Kai Biegun, Lior Fox, James Heald, Maneesh Sahani

We introduce a novel unsupervised learning method for time series data with latent dynamical structure: the recognition-parametrized Gaussian state space model (RP-GSSM). The RP-GSSM is a probabilistic model that learns Markovian Gaussian latents explaining statistical dependence between observations at different time steps, combining the intuition of contrastive methods with the flexible tools of probabilistic generative models. Unlike contrastive approaches, the RP-GSSM is a valid probabilistic model learned via maximum likelihood. Unlike generative approaches, the RP-GSSM has no need for an explicit network mapping from latents to observations, allowing it to focus model capacity on inference of latents. The model is both tractable and expressive: it admits exact inference thanks to its jointly Gaussian latent prior, while maintaining expressivity with an arbitrarily nonlinear neural network link between observations and latents. These qualities allow the RP-GSSM to learn task-relevant latents without ad-hoc regularization, auxiliary losses, or optimizer scheduling. We show how this approach outperforms alternatives on problems that include learning nonlinear stochastic dynamics from video, with or without background distractors. Our results position the RP-GSSM as a useful foundation model for a variety of downstream applications.

我们对具有潜伏动态结构的时间序列数据采用了一种新的不受监督的学习方法:识别和平衡高斯州空间模型(RP-GSSM)。RP-GSSM是一个概率模型,它学习Markovian Gaussian潜伏,解释不同时间步骤观测之间的统计依赖性,将对比方法的直觉与概率基因模型的灵活工具结合起来。与对比方法不同,RP-GSSM是一种通过最大可能性学习的有效的概率模型。与基因化方法不同,RP-GSSM不需要从潜层到观测的清晰网络绘图,使其将模型能力集中在潜层的推断上。这个模型既具有可移植性和可表达性:它承认精确推导出不同时间步骤,同时保持与观测和潜层之间任意的非线性神经网络联系的直观性。这些特性使RP-GSSM能够学习与任务相关的潜层,而没有自动规范、辅助性损失或优化的定位。我们展示了模型在下游应用中如何将模型的定位定位定位作为不具有背景的图像基础,我们如何学习了在下流流式模型上的替代方法。

Article 75

Title@2025-05-29 (4): DRO: A Python Library for Distributionally Robust Optimization in Machine Learning

Title: DRO: A Python Library for Distributionally Robust Optimization in Machine Learning

DRO: Eine Python-Bibliothek für Distributional Robuste Optimierung im maschinellen Lernen

DRO: 一个用于在机器学习中进行分配式强力优化的 Python 图书馆 2505.23565v1

Authors: Jiashuo Liu, Tianyu Wang, Henry Lam, Hongseok Namkoong, Jose Blanchet

We introduce dro, an open-source Python library for distributionally robust optimization (DRO) for regression and classification problems. The library implements 14 DRO formulations and 9 backbone models, enabling 79 distinct DRO methods. Furthermore, dro is compatible with both scikit-learn and PyTorch. Through vectorization and optimization approximation techniques, dro reduces runtime by 10x to over 1000x compared to baseline implementations on large-scale datasets. Comprehensive documentation is available at https://python-dro.org.

我们引入了Dro,这是一个开放源码的Python图书馆,用于对回归和分类问题进行分布式强力优化(DRO),该图书馆安装了14个DRO配方和9个主干模型,使79种不同的DRO方法成为可能,此外,Dro与Scikit-learn和PyTorrch兼容,通过矢量化和优化近似技术,Dro将运行时间比大规模数据集的基准实施时间减少10x至1000x以上,综合文件可在https://python-dro.org上查阅。

Article 76

Title@2025-05-29 (4): Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models

Title: Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models

Segment Policy Optimization: Effektive Segment-Level-Kreditvergabe in RL für große Sprachmodelle

政策优化优化:大语言模式RL中有效的分部一级信用分配 2505.23564v1

Authors: Yiran Guo, Lijie Xu, Jie Liu, Dan Ye, Shuang Qiu

Enhancing the reasoning capabilities of large language models effectively using reinforcement learning (RL) remains a crucial challenge. Existing approaches primarily adopt two contrasting advantage estimation granularities: Token-level methods (e.g., PPO) aim to provide the fine-grained advantage signals but suffer from inaccurate estimation due to difficulties in training an accurate critic model. On the other extreme, trajectory-level methods (e.g., GRPO) solely rely on a coarse-grained advantage signal from the final reward, leading to imprecise credit assignment. To address these limitations, we propose Segment Policy Optimization (SPO), a novel RL framework that leverages segment-level advantage estimation at an intermediate granularity, achieving a better balance by offering more precise credit assignment than trajectory-level methods and requiring fewer estimation points than token-level methods, enabling accurate advantage estimation based on Monte Carlo (MC) without a critic model. SPO features three components with novel strategies: (1) flexible segment partition; (2) accurate segment advantage estimation; and (3) policy optimization using segment advantages, including a novel probability-mask strategy. We further instantiate SPO for two specific scenarios: (1) SPO-chain for short chain-of-thought (CoT), featuring novel cutpoint-based partition and chain-based advantage estimation, achieving $6$-$12$ percentage point improvements in accuracy over PPO and GRPO on GSM8K. (2) SPO-tree for long CoT, featuring novel tree-based advantage estimation, which significantly reduces the cost of MC estimation, achieving $7$-$11$ percentage point improvements over GRPO on MATH500 under 2K and 4K context evaluation. We make our code publicly available at https://github.com/AIFrameResearch/SPO.

如何利用强化学习(RL)有效提升大型语言模型的推理能力,仍然是一项关键挑战。现有方法主要采用两种截然不同的优势估计粒度:词元级方法(例如PPO)旨在提供细粒度的优势信号,但由于难以训练出准确的评论家(critic)模型,估计往往不准确;另一个极端是轨迹级方法(例如GRPO),它们仅依赖来自最终奖励的粗粒度优势信号,导致信用分配不精确。为了解决这些局限,我们提出了片段策略优化(SPO),这是一种新的RL框架,它在中间粒度上进行片段级优势估计,从而取得更好的平衡:信用分配比轨迹级方法更精确,所需的估计点又比词元级方法更少,因此无需评论家模型即可基于蒙特卡洛(MC)进行准确的优势估计。SPO包含三个采用新策略的组件:(1)灵活的片段划分;(2)准确的片段优势估计;(3)利用片段优势进行策略优化,其中包括一种新的概率掩码策略。我们进一步针对两种具体场景对SPO进行实例化:(1)面向短思维链(CoT)的SPO-chain,采用新的基于切分点的划分和基于链的优势估计,在GSM8K上的准确率比PPO和GRPO提高$6$-$12$个百分点;(2)面向长思维链的SPO-tree,采用新的基于树的优势估计,显著降低了MC估计的成本,在2K和4K上下文评估下,在MATH500上比GRPO提高$7$-$11$个百分点。我们的代码公开于 https://github.com/AIFrameResearch/SPO。
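
One way to picture critic-free, segment-level advantage estimation is sketched below. It assumes that the value at each segment boundary is estimated by averaging the final rewards of Monte Carlo rollouts started from that boundary, and that a segment's advantage is the difference between the values at its end and its start. This is an illustrative reading of the abstract, not the authors' implementation; all function names are hypothetical.

```python
def mc_value(rollout_rewards):
    """Monte Carlo value estimate: mean final reward over rollouts."""
    return sum(rollout_rewards) / len(rollout_rewards)


def segment_advantages(boundary_rollouts):
    """Advantage of segment k = V(boundary k+1) - V(boundary k).

    `boundary_rollouts[k]` holds final rewards of rollouts started at the
    k-th segment boundary (index 0 is the prompt itself).
    """
    values = [mc_value(rewards) for rewards in boundary_rollouts]
    return [values[k + 1] - values[k] for k in range(len(values) - 1)]


def broadcast_to_tokens(advantages, segment_lengths):
    """Give every token the advantage of the segment it belongs to."""
    per_token = []
    for advantage, length in zip(advantages, segment_lengths):
        per_token.extend([advantage] * length)
    return per_token
```

The segment advantages telescope to the difference between the last and first boundary values, so the number of rollout start points, rather than the number of tokens, determines the estimation cost.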

Article 77

Title@2025-05-29 (4): LEXam: Benchmarking Legal Reasoning on 340 Law Exams

Title: LEXam: Benchmarking Legal Reasoning on 340 Law Exams

LEXam: Benchmarking der rechtlichen Begründung von 340 Rechtsprüfungen

LEXam:340项法律考试的法律依据基准 2505.12864v2

Authors: Yu Fan, Jingwei Ni, Jakob Merane, Etienne Salimbeni, Yang Tian, Yoan Hermstrüwer, Yinya Huang, Mubashara Akhtar, Florian Geering, Oliver Dreyer, Daniel Brunner, Markus Leippold, Mrinmaya Sachan, Alexander Stremitzer, Christoph Engel, Elliott Ash, Joel Niklaus

Long-form legal reasoning remains a key challenge for large language models (LLMs) in spite of recent advances in test-time scaling. We introduce LEXam, a novel benchmark derived from 340 law exams spanning 116 law school courses across a range of subjects and degree levels. The dataset comprises 4,886 law exam questions in English and German, including 2,841 long-form, open-ended questions and 2,045 multiple-choice questions. Besides reference answers, the open questions are also accompanied by explicit guidance outlining the expected legal reasoning approach, such as issue spotting, rule recall, or rule application. Our evaluation shows that both open-ended and multiple-choice questions present significant challenges for current LLMs; in particular, they struggle with open questions that require structured, multi-step legal reasoning. Moreover, our results underscore the effectiveness of the dataset in differentiating between models with varying capabilities. Adopting an LLM-as-a-Judge paradigm with rigorous human expert validation, we demonstrate how model-generated reasoning steps can be evaluated consistently and accurately. Our evaluation setup provides a scalable method to assess legal reasoning quality beyond simple accuracy metrics. Project page: https://lexam-benchmark.github.io/

尽管测试时扩展(test-time scaling)近期取得了进展,长篇法律推理仍然是大型语言模型(LLM)面临的一项关键挑战。我们提出了LEXam,这是一个新的基准,源自340场法律考试,涵盖不同学科和学位层次的116门法学院课程。数据集包含4,886道英语和德语法律考试题,其中有2,841道长篇开放式问题和2,045道多项选择题。除参考答案外,开放式问题还附有明确的指导,概述了预期的法律推理方法,例如争点识别、规则回忆或规则适用。我们的评估表明,开放式问题和多项选择题都对当前的LLM构成重大挑战;尤其是,它们在需要结构化、多步骤法律推理的开放式问题上表现不佳。此外,我们的结果凸显了该数据集在区分不同能力模型方面的有效性。通过采用LLM-as-a-Judge范式并辅以严格的人类专家验证,我们展示了如何一致而准确地评估模型生成的推理步骤。我们的评估方案提供了一种可扩展的方法,可在简单准确率指标之外评估法律推理质量。项目页面:https://lexam-benchmark.github.io/

Article 78

Title@2025-05-29 (4): Qwen Look Again: Guiding Vision-Language Reasoning Models to Re-attention Visual Information

Title: Qwen Look Again: Guiding Vision-Language Reasoning Models to Re-attention Visual Information

Qwen Look Again: Leitende Vision-Sprachen-Reasoning-Modelle, um visuelle Informationen erneut zu speichern

再看一遍:指导视觉信息重新阅读的视觉-语言定位依据模式 2505.23558v1

Authors: Xu Chu, Xinrong Chen, Guanyu Wang, Zhijie Tan, Kui Huang, Wenyu Lv, Tong Mo, Weiping Li

Inference-time scaling drives extended reasoning to enhance the performance of Vision-Language Models (VLMs), thus forming powerful Vision-Language Reasoning Models (VLRMs). However, long reasoning dilutes visual tokens, so visual information receives less attention, which may trigger hallucinations. Although introducing text-only reflection processes shows promise in language models, we demonstrate that it is insufficient to suppress hallucinations in VLMs. To address this issue, we introduce Qwen-LookAgain (Qwen-LA), a novel VLRM designed to mitigate hallucinations by incorporating a vision-text reflection process that guides the model to re-attend to visual information during reasoning. We first propose a reinforcement learning method, Balanced Reflective Policy Optimization (BRPO), which guides the model to decide on its own when to generate vision-text reflection and to balance the number and length of reflections. Then, we formally prove that VLRMs lose attention to visual tokens as reasoning progresses, and demonstrate that supplementing visual information during reflection enhances visual attention. Therefore, during training and inference, Visual Token COPY and Visual Token ROUTE are introduced to force the model to re-attend to visual information at the visual level, addressing the limitations of text-only reflection. Experiments on multiple visual QA datasets and hallucination metrics indicate that Qwen-LA achieves leading accuracy while reducing hallucinations. Our code is available at: https://github.com/Liar406/Look_Again.

推理时扩展(inference-time scaling)通过延长推理过程来提升视觉语言模型(VLMs)的性能,从而形成强大的视觉语言推理模型(VLRMs)。然而,长推理会稀释视觉词元,使视觉信息获得的注意力减少,并可能引发幻觉。虽然引入纯文本反思过程在语言模型中显示出前景,但我们证明它不足以抑制VLMs中的幻觉。为解决这一问题,我们提出了Qwen-LookAgain(Qwen-LA),这是一种新的VLRM,通过引入视觉-文本反思过程,引导模型在推理过程中重新关注视觉信息,从而减轻幻觉。我们首先提出一种强化学习方法,即平衡反思策略优化(BRPO),引导模型自主决定何时生成视觉-文本反思,并平衡反思的次数和长度。随后,我们形式化地证明了VLRMs会随着推理的推进而失去对视觉词元的注意力,并表明在反思过程中补充视觉信息可以增强视觉注意力。因此,在训练和推理过程中,我们引入了Visual Token COPY和Visual Token ROUTE,迫使模型在视觉层面重新关注视觉信息,以弥补纯文本反思的局限。在多个视觉问答数据集和幻觉指标上的实验表明,Qwen-LA在减少幻觉的同时取得了领先的准确率。我们的代码见:https://github.com/Liar406/Look_Again。

Article 79

Title@2025-05-29 (4): Learning Parametric Distributions from Samples and Preferences

Title: Learning Parametric Distributions from Samples and Preferences

Parametrische Verteilungen aus Proben und Präferenzen lernen

抽样和优惠制的学习参数分布 2505.23557v1

Authors: Marc Jourdan, Gizem Yüce, Nicolas Flammarion

Recent advances in language modeling have underscored the role of preference feedback in enhancing model performance. This paper investigates the conditions under which preference feedback improves parameter estimation in classes of continuous parametric distributions. In our framework, the learner observes pairs of samples from an unknown distribution along with their relative preferences, which depend on the same unknown parameter. We show that preference-based M-estimators achieve a better asymptotic variance than sample-only M-estimators, and that deterministic preferences improve it further. Leveraging the hard constraints revealed by deterministic preferences, we propose an estimator achieving an estimation error scaling of $\mathcal{O}(1/n)$ – a significant improvement over the $\Theta(1/\sqrt{n})$ rate attainable with samples alone. Next, we establish a lower bound that matches this accelerated rate, up to dimension and problem-dependent constants. While the assumptions underpinning our analysis are restrictive, they are satisfied by notable cases such as Gaussian or Laplace distributions for preferences based on the log-probability reward.

语言建模方面的最新进展凸显了偏好反馈在提升模型性能方面的作用。本文研究了在连续参数分布族中,偏好反馈在何种条件下能够改进参数估计。在我们的框架中,学习者观察到来自未知分布的成对样本,以及依赖于同一未知参数的相对偏好。我们证明,基于偏好的M估计量比仅使用样本的M估计量具有更好的渐近方差,而确定性偏好可使其进一步改善。利用确定性偏好所揭示的硬约束,我们提出了一种估计量,其估计误差以 $\mathcal{O}(1/n)$ 的速率缩减,显著优于仅凭样本所能达到的 $\Theta(1/\sqrt{n})$ 速率。接着,我们建立了一个与这一加速速率相匹配的下界(至多相差维度和问题相关的常数)。虽然我们的分析所依赖的假设较为严格,但一些重要情形满足这些假设,例如偏好基于对数概率奖励的高斯分布或拉普拉斯分布。
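
As a rough illustration of why deterministic preferences can beat the $1/\sqrt{n}$ rate, here is a sketch (not the paper's estimator) for a unit-variance Gaussian with unknown mean `theta`. Under a log-probability reward, the preferred sample in a pair is the one closer to `theta`, so each pair reveals on which side of the pair's midpoint `theta` lies. Intersecting these constraints gives a shrinking feasible interval. The function name, the centre-of-interval estimate, and the simulation setup are assumptions made for illustration only.

```python
import random

def simulate(theta, n, seed=0):
    """Draw n sample pairs from N(theta, 1) and compare two estimates of theta."""
    rng = random.Random(seed)
    lo, hi = float("-inf"), float("inf")  # feasible interval implied by preferences
    total = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)
        y = rng.gauss(theta, 1.0)
        total += x + y
        # Deterministic preference under a log-density reward: the sample closer
        # to theta wins (the simulator knows theta, the learner does not).
        winner = x if abs(x - theta) < abs(y - theta) else y
        mid = 0.5 * (x + y)
        # theta lies on the winner's side of the midpoint.
        if winner < mid:
            hi = min(hi, mid)
        else:
            lo = max(lo, mid)
    sample_mean = total / (2 * n)    # uses the samples only
    pref_estimate = 0.5 * (lo + hi)  # centre of the feasible interval
    return sample_mean, pref_estimate, lo, hi

mean_est, pref_est, lo, hi = simulate(theta=1.5, n=2000)
print(f"sample-mean error:       {abs(mean_est - 1.5):.5f}")
print(f"preference-based error:  {abs(pref_est - 1.5):.5f}")
print(f"feasible interval width: {hi - lo:.5f}")
```

In this toy setting the interval width is set by the midpoints nearest to `theta` on either side, which is why it tends to shrink on the order of $1/n$, while the sample mean's error shrinks on the order of $1/\sqrt{n}$.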

Article 80

Title@2025-05-29 (4): Adaptive Federated LoRA in Heterogeneous Wireless Networks with Independent Sampling

Title: Adaptive Federated LoRA in Heterogeneous Wireless Networks with Independent Sampling

Adaptives Federated LoRA in heterogenen drahtlosen Netzwerken mit unabhängiger Probenahme

具有独立抽样调查的多源无线网络中的联邦适应性 2505.23555v1

Authors: Yanzhao Hou, Jiaxiang Geng, Boyu Li, Xiaofeng Tao, Juncheng Wang, Xiaodong Xu, Bing Luo

Federated LoRA has emerged as a promising technique for efficiently fine-tuning large language models (LLMs) on distributed devices by reducing the number of trainable parameters. However, existing approaches often overlook the theoretical and practical implications of system and data heterogeneity, thereby failing to optimize the overall training efficiency, particularly in terms of wall-clock time. In this paper, we propose an adaptive federated LoRA strategy with independent client sampling to minimize the convergence wall-clock time of federated fine-tuning under both computation and communication heterogeneity. We first derive a new convergence bound for federated LoRA with arbitrary and independent client sampling, notably without requiring the stringent bounded gradient assumption. Then, we introduce an adaptive bandwidth allocation scheme that accounts for heterogeneous client resources and system bandwidth constraints. Based on the derived theory, we formulate and solve a non-convex optimization problem to jointly determine the LoRA sketching ratios and sampling probabilities, aiming to minimize wall-clock convergence time. An efficient and low-complexity algorithm is developed to approximate the solution. Finally, extensive experiments demonstrate that our approach significantly reduces wall-clock training time compared to state-of-the-art methods across various models and datasets.

通过减少可训练参数的数量,联邦洛拉联盟已成为高效微调分布式设备上大型语言模型(LLMs)的一个很有希望的技术,通过减少可训练参数的数量,可以有效地微调分布式设备上的大型语言模型(LLMs),但是,现有的方法往往没有适当地忽视系统和数据差异的理论和实践影响,从而未能优化总体培训效率,特别是墙时时段的培训效率。在本文件中,我们提出了一个适应性的联邦洛拉联盟战略,通过独立客户抽样,尽量减少计算和通信差异性两种情况下联合微调的同步时间。我们首先为具有任意和独立客户抽样的联邦洛拉公司找到新的趋同点,特别是不需要严格的封闭梯度假设。然后,我们引入了适应性带宽分配计划,考虑到各种客户资源和系统带宽限制。根据推理,我们制定并解决非凝固型优化问题,共同确定洛拉的草图比例和取样概率,目的是最大限度地减少墙时段的趋同时间。我们制定了高效和低兼容性的算法,以近解决方案。最后,广泛的实验表明我们的做法大大缩短了各种壁点培训时间和不同状态的数据。

Article 81

Title@2025-05-29 (4): Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters

Title: Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters

Nachhaltiges CO2-bewusstes und wassereffizientes LLM-Scheduling in geo-verteilten Cloud-Rechenzentren

地球分布云数据中心的可持续碳软件和水效率高的LLM 2505.23554v1

Authors: Hayden Moore, Sirui Qi, Ninad Hogade, Dejan Milojicic, Cullen Bash, Sudeep Pasricha

In recent years, Large Language Models (LLMs) such as ChatGPT, CoPilot, and Gemini have been widely adopted in different areas. As the use of LLMs continues to grow, many efforts have focused on reducing the massive training overheads of these models. But it is the environmental impact of handling user requests to LLMs that is increasingly becoming a concern. Recent studies estimate that the costs of operating LLMs in their inference phase can exceed training costs by 25x per year. As LLMs are queried incessantly, the cumulative carbon footprint for the operational phase has been shown to far exceed the footprint during the training phase. Further, estimates indicate that 500 ml of fresh water is expended for every 20-50 requests to LLMs during inference. To address these important sustainability issues with LLMs, we propose a novel framework called SLIT to co-optimize LLM quality of service (time-to-first-token), carbon emissions, water usage, and energy costs. The framework utilizes a machine learning (ML)-based metaheuristic to enhance the sustainability of LLM hosting across geo-distributed cloud datacenters. Such a framework will become increasingly vital as LLMs proliferate.

近年来,大语言模型(LLM),如ChatGPT、CoPilot和Gemini等,在不同领域被广泛采用。随着LLMs的使用继续增加,许多努力集中于减少这些模型的大量培训间接费用。但是,处理用户对LLMs的要求对环境的影响日益引起关注。最近的研究估计,在推论阶段操作LMs的成本每年可超过培训成本25x。LMs不断被问及,运行阶段的累积碳足迹显示远远超过培训阶段的足迹。此外,估计表明,在推断过程中,每20-50个LMs提出的LMs申请中,就有500毫升的淡水花费。为了与LMS解决这些重要的可持续性问题,我们提出了一个名为SLIT的新框架,以共同优化LMs服务质量(时间到头等)、碳排放、水使用和能源成本。框架利用基于机器的MLAEuric来提高LM公司在地理分布式云中托管服务的可持续性。这一框架将日益成为至关重要的一个框架。

Article 82

Title@2025-05-29 (4): Comparing the Moore-Penrose Pseudoinverse and Gradient Descent for Solving Linear Regression Problems: A Performance Analysis

Title: Comparing the Moore-Penrose Pseudoinverse and Gradient Descent for Solving Linear Regression Problems: A Performance Analysis

Vergleich der Moore-Penrose Pseudoinverse und Gradient Descent zur Lösung linearer Regressionsprobleme: Eine Leistungsanalyse

将摩尔-彭罗斯-普塞多温和梯底比较以解决线性倒退问题:绩效分析 2505.23552v1

Authors: Alex Adams

This paper investigates the comparative performance of two fundamental approaches to solving linear regression problems: the closed-form Moore-Penrose pseudoinverse and the iterative gradient descent method. Linear regression is a cornerstone of predictive modeling, and the choice of solver can significantly impact efficiency and accuracy. I review and discuss the theoretical underpinnings of both methods, analyze their computational complexity, and evaluate their empirical behavior on synthetic datasets with controlled characteristics, as well as on established real-world datasets. My results delineate the conditions under which each method excels in terms of computational time, numerical stability, and predictive accuracy. This work aims to provide practical guidance for researchers and practitioners in machine learning when selecting between direct, exact solutions and iterative, approximate solutions for linear regression tasks.

本文件调查了解决线性回归问题的两种基本方法的比较性能:封闭式摩尔-彭罗斯伪反射和迭代梯度下降法。线性回归是预测型模型的基石,而求解器的选择可以极大地影响效率和准确性。我审查并讨论这两种方法的理论基础,分析其计算复杂性,评价其在具有受控特性的合成数据集和既定真实世界数据集方面的实证行为。我的结果描述了每种方法在计算时间、数字稳定性和预测准确性方面优异的条件。这项工作旨在为研究人员和从业者在选择直线回归任务的直接、精确解决方案和迭接性、近似近似解决方案时,提供机器学习的实际指导。
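
To make the contrast between the direct and iterative solvers concrete, here is a minimal sketch for the one-feature model $y = wx + b$. It is not taken from the paper: the synthetic data, learning rate, and step count are assumptions. The closed-form fit solves the normal equations, which coincides with the Moore-Penrose solution $X^{+}y$ whenever the design matrix has full column rank.

```python
import random

def fit_closed_form(xs, ys):
    """Least squares for y = w*x + b via the 2x2 normal equations.

    For a full-column-rank design matrix this equals pinv(X) @ y.
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    w = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return w, b

def fit_gradient_descent(xs, ys, lr=0.1, steps=2000):
    """Batch gradient descent on the mean squared error."""
    n = len(xs)
    w = b = 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            residual = w * x + b - y
            grad_w += 2.0 * residual * x / n
            grad_b += 2.0 * residual / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [3.0 * x - 2.0 + rng.gauss(0.0, 0.1) for x in xs]

print("closed form:     ", fit_closed_form(xs, ys))
print("gradient descent:", fit_gradient_descent(xs, ys))
```

On a well-conditioned problem like this one, both solvers reach essentially the same coefficients. The trade-offs the abstract refers to show up as the feature count grows or the design matrix becomes ill-conditioned.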

Article 83

Title@2025-05-29 (4): Diffusion Sampling Correction via Approximately 10 Parameters

Title: Diffusion Sampling Correction via Approximately 10 Parameters

Diffusions-Probenahmekorrektur über ca. 10 Parameter

通过大约10个参数校正传播抽样校正 2411.06503v3

Authors: Guangyi Wang, Wei Peng, Lijiang Li, Wenyu Chen, Yuren Cai, Songzhi Su

While powerful for generation, Diffusion Probabilistic Models (DPMs) face slow sampling challenges, for which various distillation-based methods have been proposed. However, they typically require significant additional training costs and model parameter storage, limiting their practicality. In this work, we propose PCA-based Adaptive Search (PAS), which optimizes existing solvers for DPMs with minimal additional costs. Specifically, we first employ PCA to obtain a few basis vectors to span the high-dimensional sampling space, which enables us to learn just a set of coordinates to correct the sampling direction; furthermore, based on the observation that the cumulative truncation error exhibits an “S”-shape, we design an adaptive search strategy that further enhances the sampling efficiency and reduces the number of stored parameters to approximately 10. Extensive experiments demonstrate that PAS can significantly enhance existing fast solvers in a plug-and-play manner with negligible costs. E.g., on CIFAR10, PAS optimizes DDIM’s FID from 15.69 to 4.37 (NFE=10) using only 12 parameters and sub-minute training on a single A100 GPU. Code is available at https://github.com/onefly123/PAS.

扩散概率模型(DPMs)虽然生成能力强大,但面临采样缓慢的问题,为此已有多种基于蒸馏的方法被提出。然而,这些方法通常需要大量额外的训练成本和模型参数存储,限制了其实用性。在这项工作中,我们提出了基于PCA的自适应搜索(PAS),以极小的额外成本优化现有的DPM求解器。具体而言,我们首先利用PCA获得少量基向量来张成高维采样空间,从而只需学习一组坐标即可校正采样方向;此外,基于累积截断误差呈“S”形的观察,我们设计了一种自适应搜索策略,进一步提高采样效率,并将存储的参数数量减少到约10个。大量实验表明,PAS能够以即插即用的方式、以可忽略的成本显著增强现有的快速求解器。例如,在CIFAR10上,PAS仅使用12个参数,在单块A100 GPU上训练不到一分钟,即可将DDIM的FID从15.69改善到4.37(NFE=10)。代码见:https://github.com/onefly123/PAS。

Article 84

Title@2025-05-29 (4): Fast Large Language Model Collaborative Decoding via Speculation

Title: Fast Large Language Model Collaborative Decoding via Speculation

Schnelles Large Language Model Kollaboratives Decodieren über Spekulation

通过投机进行快速大语言合作示范模式 2502.01662v2

Authors: Jiale Fu, Yuchu Jiang, Junkai Chen, Jiaming Fan, Xin Geng, Xu Yang

Large Language Model (LLM) collaborative decoding techniques improve output quality by combining the outputs of multiple models at each generation step, but they incur high computational costs. In this paper, we introduce Collaborative decoding via Speculation (CoS), a novel framework that accelerates collaborative decoding without compromising performance. Inspired by Speculative Decoding, in which a small proposal model generates tokens sequentially and a larger target model verifies them in parallel, our approach builds on two key insights: (1) the verification distribution can be the combined distribution of both the proposal and target models, and (2) alternating each model as the proposer and verifier can further enhance efficiency. We generalize this method to collaboration among n models and theoretically prove that CoS is never slower than standard collaborative decoding, typically achieving faster speed. Extensive experiments demonstrate that CoS is 1.11x-2.23x faster than standard collaborative decoding without compromising generation quality. Our code is available at https://github.com/Kamichanw/CoS/.

大型语言模型(LLM)合作解码技术(LLM)通过将多种模型的输出在每一代阶段结合起来,提高了产出质量,但计算成本很高。在本文中,我们引入了通过投机(COS)协作解码(COS),这是一个在不损害性能的情况下加速协作解码的新框架。受一个小型提案模型依次生成代号的投机解码(LLLM)的启发,而一个更大的目标模型平行核查,我们的方法基于两个主要的洞察力:(1) 核查分配可以是提案和目标模型的混合分布,以及(2) 作为提议方和核查方可以进一步提高效率,对每一种模型进行交替。我们将这种方法推广到n模式之间的合作,理论上证明COS从未比标准的合作解码慢过,通常能更快。广泛的实验显示COS比标准的协作解码速度快1.1x-2.23x比标准的代码在不影响生成质量的情况下更快。我们的代码可以在https://github.com/Kamichaw/COS/上查阅。
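
The first insight can be checked numerically with the standard speculative-sampling acceptance rule: a token drafted from `q` is accepted with probability `min(1, p/q)`, and on rejection a token is resampled from the normalized residual `max(0, p - q)`. The sketch below computes the exact output distribution of that rule and uses an equal-weight average of two toy models as the verification distribution. The averaging weights, the toy probabilities, and the function name are assumptions for illustration, not details from the paper.

```python
def speculative_output_distribution(q, p):
    """Exact distribution of one token produced by speculative sampling.

    q: proposal (draft) distribution; p: verification distribution.
    """
    # P(token i is drafted AND accepted) = q_i * min(1, p_i / q_i) = min(q_i, p_i).
    accepted = [min(qi, pi) for qi, pi in zip(q, p)]
    reject_mass = 1.0 - sum(accepted)
    # On rejection, resample from the normalized positive part of (p - q).
    residual = [max(0.0, pi - qi) for qi, pi in zip(q, p)]
    z = sum(residual)
    if z == 0.0:  # p equals q, so every draft is accepted
        return accepted
    return [a + reject_mass * r / z for a, r in zip(accepted, residual)]

proposal = [0.7, 0.2, 0.1]  # toy draft-model probabilities over 3 tokens
target = [0.1, 0.3, 0.6]    # toy second-model probabilities
# Hypothetical combined (ensemble) distribution used for verification.
combined = [0.5 * a + 0.5 * b for a, b in zip(proposal, target)]

print(speculative_output_distribution(proposal, combined))
print(combined)
```

The two printed distributions agree up to floating-point rounding: verifying against the combined distribution yields samples from the ensemble itself, which is what allows collaborative decoding to be accelerated without changing its output distribution.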

Article 85

Title@2025-05-29 (4): Domain-Aware Tensor Network Structure Search

Title: Domain-Aware Tensor Network Structure Search

Domain-Aware Tensor Netzwerkstruktur Suche

域- 软件显示器网络网络结构搜索 2505.23537v1

Authors: Giorgos Iacovides, Wuyang Zhou, Chao Li, Qibin Zhao, Danilo Mandic

Tensor networks (TNs) provide efficient representations of high-dimensional data, yet identification of the optimal TN structures, the so-called tensor network structure search (TN-SS) problem, remains a challenge. Current state-of-the-art (SOTA) algorithms are computationally expensive as they require extensive function evaluations, which is prohibitive for real-world applications. In addition, existing methods ignore valuable domain information inherent in real-world tensor data and lack transparency in their identified TN structures. To this end, we propose a novel TN-SS framework, termed tnLLM, which incorporates domain information about the data and harnesses the reasoning capabilities of large language models (LLMs) to directly predict suitable TN structures. The proposed framework involves a domain-aware prompting pipeline which instructs the LLM to infer suitable TN structures based on the real-world relationships between tensor modes. In this way, our approach is capable of not only iteratively optimizing the objective function, but also generating domain-aware explanations for the identified structures. Experimental results demonstrate that tnLLM achieves comparable TN-SS objective function values with much fewer function evaluations compared to SOTA algorithms. Furthermore, we demonstrate that the LLM-enabled domain information can be used to find good initializations in the search space for sampling-based SOTA methods to accelerate their convergence while preserving theoretical performance guarantees.

电线网络(TNS)能够有效地反映高维数据,然而,确定最佳的TN结构,即所谓的高频网络结构搜索(TN-SS)问题,仍然是一项挑战。目前的先进(SOTA)算法在计算上成本很高,因为它们需要广泛的功能评估,而对于现实世界的应用来说,这种评估是令人望而却步的。此外,现有的方法忽视了现实世界数据所固有的宝贵域信息,而且其查明的TN结构缺乏透明度。为此,我们提议了一个新型的TN-SS框架,称为TnLLM,它包含数据域域信息并利用大型语言模型(LLMS)的推理能力直接预测适当的TN结构。拟议的框架涉及一种对域有觉的快速管道,它要求LM根据现实世界关系推导出适当的TN结构结构。我们的方法不仅能够反复优化目标功能,而且还能够为所确定的结构产生域觉悟解释。实验结果表明,TNLLM在S-SS的初始搜索功能上实现了可比较的TN-S-M-LTA的快速搜索功能,而我们使用的域域域域级搜索功能则可以少于SO-MA。

Article 86

Title: It’s a (Blind) Match! Towards Vision-Language Correspondence without Parallel Data

Es ist ein (Blind) Match! Richtung Vision-Sprache Korrespondenz ohne Paralleldaten

这是一个( Blind) 匹配! 向没有平行数据的视觉语言对应函授 2503.24129v2

Authors: Dominik Schnaus, Nikita Araslanov, Daniel Cremers

The platonic representation hypothesis suggests that vision and language embeddings become more homogeneous as model and dataset sizes increase. In particular, pairwise distances within each modality become more similar. This suggests that as foundation models mature, it may become possible to match vision and language embeddings in a fully unsupervised fashion, i.e. without parallel data. We present the first feasibility study, and investigate conformity of existing vision and language foundation models in the context of unsupervised, or “blind”, matching. First, we formulate unsupervised matching as a quadratic assignment problem and introduce a novel heuristic that outperforms previous solvers. We also develop a technique to find optimal matching problems, for which a non-trivial match is very likely. Second, we conduct an extensive study deploying a range of vision and language models on four datasets. Our analysis reveals that for many problem instances, vision and language representations can be indeed matched without supervision. This finding opens up the exciting possibility of embedding semantic knowledge into other modalities virtually annotation-free. As a proof of concept, we showcase an unsupervised classifier, which achieves non-trivial classification accuracy without any image-text annotation.

柏拉图代表假设表明,随着模型和数据集大小的增加,视觉和语言嵌入会变得更加单一。特别是,每种模式内部的相近距离会变得更加相似。这表明,随着基础模型成熟,有可能以完全不受监督的方式,即没有平行数据,匹配视觉和语言嵌入。我们提出第一次可行性研究,调查现有视觉和语言基建模型在不受监督或“盲”匹配背景下的兼容性。首先,我们将不受监督的匹配作为二次分配问题,并引入比以往解决者更相近的新型超模范。我们还开发了一种找到最佳匹配问题的技术,而非三角匹配的可能性很大。第二,我们在四个数据集上进行广泛的研究,运用了一系列的视觉和语言模型。我们的分析表明,对于许多问题的情况,视觉和语言表达确实可以在没有监督的情况下相匹配。这打开了将语义知识嵌入其他模式的令人兴奋的可能性,几乎是无注释的。作为概念的证明,我们展示了一种不受监督的图像分类的准确性,我们展示了一种不精确性。

Article 87

Title@2025-05-29 (4): NACHOS: Neural Architecture Search for Hardware Constrained Early Exit Neural Networks

Title: NACHOS: Neural Architecture Search for Hardware Constrained Early Exit Neural Networks

NACHOS: Neurale Architektur Suche nach Hardware eingeschränkt Early Exit Neural Networks

NACHOS: 早期外出神经网络硬件控制系统神经结构搜索 2401.13330v2

Authors: Matteo Gambella, Jary Pomponi, Simone Scardapane, Manuel Roveri

Early Exit Neural Networks (EENNs) endow a standard Deep Neural Network (DNN) with Early Exit Classifiers (EECs), to provide predictions at intermediate points of the processing when enough confidence in classification is achieved. This leads to many benefits in terms of effectiveness and efficiency. Currently, the design of EENNs is carried out manually by experts, a complex and time-consuming task that requires accounting for many aspects, including the correct placement, the thresholding, and the computational overhead of the EECs. For this reason, research is exploring the use of Neural Architecture Search (NAS) to automate the design of EENNs. Currently, few comprehensive NAS solutions for EENNs have been proposed in the literature, and a fully automated, joint design strategy taking into consideration both the backbone and the EECs remains an open problem. To this end, this work presents Neural Architecture Search for Hardware Constrained Early Exit Neural Networks (NACHOS), the first NAS framework for the design of optimal EENNs satisfying constraints on the accuracy and the number of Multiply and Accumulate (MAC) operations performed by the EENNs at inference time. In particular, this provides the joint design of backbone and EECs to select a set of admissible (i.e., respecting the constraints) Pareto Optimal Solutions in terms of best tradeoff between the accuracy and number of MACs. The results show that the models designed by NACHOS are competitive with the state-of-the-art EENNs. Additionally, this work investigates the effectiveness of two novel regularization terms designed for the optimization of the auxiliary classifiers of the EENN.

早期出国神经网络(EENNs)在早期出国分类(EECs)下设置了标准的深心神经网络(DNN),在达到对分类足够信任时,在处理的中间点提供预测,这在有效性和效率方面带来许多好处。目前,EENNes的设计是由专家手工完成的,这是一项复杂和耗时的任务,需要考虑许多方面,包括正确定位、门槛设置和欧亚经济共同体的计算间接费用。因此,研究正在探索利用神经建筑搜索(NAS)实现EENNes设计自动化。目前,文献中为EENNNNes提出了很少的全面的ENAS解决方案,而考虑到骨干和欧亚经济共同体的完全自动化的联合设计战略仍然是一个尚未解决的问题。为此,这项工作提出了为硬体骨架经过训练的早期退出神经网络(NACHOS)进行神经建筑结构搜索,这是国家空间建筑结构中第一个用于设计最佳环境指标的升级框架,满足了对EENNES的精确度和数量的限制。

Article 88

Title@2025-05-29 (4): Subgraph Gaussian Embedding Contrast for Self-Supervised Graph Representation Learning

Title: Subgraph Gaussian Embedding Contrast for Self-Supervised Graph Representation Learning

Subgraph Gaussian Einbettungskontrast für selbstüberwachtes Graphen-Darstellungslernen

自支持图表代表制学习的 Subgraph Gaussian 嵌入式对比对比度 2505.23529v1

Authors: Shifeng Xie, Aref Einizade, Jhony H. Giraldo

Graph Representation Learning (GRL) is a fundamental task in machine learning, aiming to encode high-dimensional graph-structured data into low-dimensional vectors. Self-Supervised Learning (SSL) methods are widely used in GRL because they can avoid expensive human annotation. In this work, we propose a novel Subgraph Gaussian Embedding Contrast (SubGEC) method. Our approach introduces a subgraph Gaussian embedding module, which adaptively maps subgraphs to a structured Gaussian space, ensuring the preservation of input subgraph characteristics while generating subgraphs with a controlled distribution. We then employ optimal transport distances, more precisely the Wasserstein and Gromov-Wasserstein distances, to effectively measure the similarity between subgraphs, enhancing the robustness of the contrastive learning process. Extensive experiments across multiple benchmarks demonstrate that SubGEC outperforms or presents competitive performance against state-of-the-art approaches. Our findings provide insights into the design of SSL methods for GRL, emphasizing the importance of the distribution of the generated contrastive pairs.

图表示学习(GRL)是机器学习中的一项基础任务,旨在将高维的图结构数据编码为低维向量。自监督学习(SSL)方法因能够避免昂贵的人工标注而在GRL中被广泛使用。在这项工作中,我们提出了一种新的子图高斯嵌入对比(SubGEC)方法。我们的方法引入了子图高斯嵌入模块,将子图自适应地映射到结构化的高斯空间,在生成具有受控分布的子图的同时保留输入子图的特征。随后,我们采用最优传输距离,更确切地说是Wasserstein距离和Gromov-Wasserstein距离,来有效度量子图之间的相似性,从而增强对比学习过程的鲁棒性。在多个基准上的大量实验表明,SubGEC优于最先进的方法或取得了与之相当的性能。我们的研究结果为GRL的SSL方法设计提供了启示,强调了所生成对比样本对的分布的重要性。

Article 89

Title@2025-05-29 (4): Comparative assessment of fairness definitions and bias mitigation strategies in machine learning-based diagnosis of Alzheimer’s disease from MR images

Title: Comparative assessment of fairness definitions and bias mitigation strategies in machine learning-based diagnosis of Alzheimer’s disease from MR images

Vergleichende Bewertung von Fairness-Definitionen und Bias-Minderungsstrategien in der maschinellen Lern-basierten Diagnose der Alzheimer-Krankheit aus MR-Bildern

对利用MR图像对阿尔茨海默氏病进行机器学习诊断的公平定义和减少偏见战略的比较评估 2505.23528v1

Authors: Maria Eleftheria Vlontzou, Maria Athanasiou, Christos Davatzikos, Konstantina S. Nikita

The present study performs a comprehensive fairness analysis of machine learning (ML) models for the diagnosis of Mild Cognitive Impairment (MCI) and Alzheimer’s disease (AD) from MRI-derived neuroimaging features. Biases associated with age, race, and gender in a multi-cohort dataset, as well as the influence of proxy features encoding these sensitive attributes, are investigated. The reliability of various fairness definitions and metrics in the identification of such biases is also assessed. Based on the most appropriate fairness measures, a comparative analysis of widely used pre-processing, in-processing, and post-processing bias mitigation strategies is performed. Moreover, a novel composite measure is introduced to quantify the trade-off between fairness and performance by considering the F1-score and the equalized odds ratio, making it appropriate for medical diagnostic applications. The obtained results reveal the existence of biases related to age and race, while no significant gender bias is observed. The deployed mitigation strategies yield varying improvements in terms of fairness across the different sensitive attributes and studied subproblems. For race and gender, Reject Option Classification improves equalized odds by 46% and 57%, respectively, and achieves harmonic mean scores of 0.75 and 0.80 in the MCI versus AD subproblem, whereas for age, in the same subproblem, adversarial debiasing yields the highest equalized odds improvement of 40% with a harmonic mean score of 0.69. Insights are provided into how variations in AD neuropathology and risk factors, associated with demographic characteristics, influence model fairness.

本研究对利用MRI衍生的神经影像特征诊断轻度认知障碍(MCI)和阿尔茨海默病(AD)的机器学习(ML)模型进行了全面的公平性分析。研究考察了多队列数据集中与年龄、种族和性别相关的偏差,以及编码这些敏感属性的代理特征的影响,并评估了各种公平性定义和指标在识别此类偏差方面的可靠性。基于最合适的公平性度量,对广泛使用的预处理、处理中和后处理偏差缓解策略进行了比较分析。此外,研究引入了一种新的综合度量,通过结合F1分数和均等化几率比(equalized odds ratio)来量化公平性与性能之间的权衡,使其适用于医学诊断应用。结果表明存在与年龄和种族相关的偏差,而未观察到显著的性别偏差。所采用的缓解策略在不同敏感属性和所研究的子问题上带来了程度不一的公平性改善。对于种族和性别,在MCI与AD的子问题中,拒绝选项分类(Reject Option Classification)分别将均等化几率提高了46%和57%,调和平均得分分别达到0.75和0.80;对于年龄,在同一子问题中,对抗去偏(adversarial debiasing)带来了最高的均等化几率改善(40%),调和平均得分为0.69。研究还就与人口统计特征相关的AD神经病理和风险因素的差异如何影响模型公平性提供了见解。
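
To show how such a fairness-performance trade-off score might be computed, the sketch below combines the F1-score with the equalized odds ratio through a harmonic mean. The abstract reports "harmonic mean scores", but the exact form of the paper's composite measure is an assumption here, as are the function names and the toy labels. The equalized odds ratio is taken as the smaller of the between-group TPR ratio and FPR ratio, and each group is assumed to contain both classes.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def min_max_ratio(values):
    """min/max of a list of rates; 1.0 when all rates are zero."""
    top = max(values)
    return 1.0 if top == 0 else min(values) / top

def equalized_odds_ratio(groups):
    """groups: dict mapping group name -> (y_true, y_pred)."""
    tprs, fprs = [], []
    for y_true, y_pred in groups.values():
        tp, fp, fn, tn = confusion(y_true, y_pred)
        tprs.append(tp / (tp + fn))  # assumes each group has positives
        fprs.append(fp / (fp + tn))  # assumes each group has negatives
    return min(min_max_ratio(tprs), min_max_ratio(fprs))

def f1_score(y_true, y_pred):
    tp, fp, fn, _ = confusion(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)

def harmonic_mean(a, b):
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)

# Toy predictions for two demographic groups (illustrative only).
groups = {
    "A": ([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 1, 0, 0, 0]),
    "B": ([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 1, 1, 0, 0]),
}
y_true = [t for yt, _ in groups.values() for t in yt]
y_pred = [p for _, yp in groups.values() for p in yp]

eor = equalized_odds_ratio(groups)
f1 = f1_score(y_true, y_pred)
print(eor, f1, harmonic_mean(f1, eor))
```

A harmonic mean penalizes imbalance: a model scores well only when both predictive performance and fairness are high, which is the property that makes such a score suitable for comparing mitigation strategies.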

Article 90

Title@2025-05-29 (4): Normalizing Flows are Capable Models for RL

Title: Normalizing Flows are Capable Models for RL

Normalisierende Strömungen sind fähige Modelle für RL

正常流动是RL的能力模型 2505.23527v1

Authors: Raj Ghugare, Benjamin Eysenbach

Modern reinforcement learning (RL) algorithms have found success by using powerful probabilistic models, such as transformers, energy-based models, and diffusion/flow-based models. To this end, RL researchers often choose to pay the price of accommodating these models into their algorithms – diffusion models are expressive, but are computationally intensive due to their reliance on solving differential equations, while autoregressive transformer models are scalable but typically require learning discrete representations. Normalizing flows (NFs), by contrast, seem to provide an appealing alternative, as they enable likelihoods and sampling without solving differential equations or relying on autoregressive architectures. However, their potential in RL has received limited attention, partly due to the prevailing belief that normalizing flows lack sufficient expressivity. We show that this is not the case. Building on recent work in NFs, we propose a single NF architecture which integrates seamlessly into RL algorithms, serving as a policy, Q-function, and occupancy measure. Our approach leads to much simpler algorithms, and achieves higher performance in imitation learning, offline RL, goal-conditioned RL, and unsupervised RL.

现代强化学习(RL)算法通过使用强大的概率模型(如变压器、能源基模型以及扩散/流基模型)获得了成功。为此,RL研究人员往往选择支付将这些模型纳入其算法的代价 – – 扩散模型具有表达性,但由于他们依赖解决差异方程式,而自动反向变压器模型是可伸缩的,但通常需要学习离散的表达方式。相比之下,正常流动(NFs)似乎提供了一种有吸引力的替代方法,因为它们使得可能性和取样能够不解决差异方程式或自动反向结构。然而,他们在RL中的潜力受到的注意有限,部分原因是普遍认为正常化流程缺乏足够的表达性。我们表明情况并非如此。根据NFs最近的工作,我们提议了一个单一的NF结构,将无缝合地纳入RL算法,作为政策、Q-功能和占用度尺度。我们的方法导致更简单的算法,并在模仿学习、离线、目标性、目标性、条件和不超光性RL方面实现更高的性功能。
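
The claim that normalizing flows give exact likelihoods and sampling without differential equations can be illustrated with a single affine coupling layer in two dimensions. This is a generic textbook construction, not the architecture proposed in the paper. The `scale` and `shift` functions are hypothetical stand-ins for learned networks.

```python
import math

def scale(x1):
    """Hypothetical conditioner; a neural network in practice."""
    return math.tanh(x1)

def shift(x1):
    """Hypothetical conditioner; a neural network in practice."""
    return 0.5 * x1

def forward(x):
    """Map data x to latent z; returns z and log|det dz/dx|."""
    x1, x2 = x
    z2 = x2 * math.exp(scale(x1)) + shift(x1)
    return (x1, z2), scale(x1)

def inverse(z):
    """Map latent z back to data x in closed form (used for sampling)."""
    z1, z2 = z
    x2 = (z2 - shift(z1)) * math.exp(-scale(z1))
    return (z1, x2)

def log_prob(x):
    """Exact log-density of x under a standard-normal base distribution."""
    (z1, z2), log_det = forward(x)
    log_base = -0.5 * (z1 * z1 + z2 * z2) - math.log(2.0 * math.pi)
    return log_base + log_det

x = (0.3, -1.2)
z, _ = forward(x)
print("round trip:", inverse(z))
print("log p(x):  ", log_prob(x))
```

Sampling draws `z` from the base distribution and applies `inverse`; evaluating a likelihood applies `forward` once. Both are single passes, in contrast to the iterative solvers that diffusion models need.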

Article 91

Title@2025-05-29 (4): Accelerating AllReduce with a Persistent Straggler

Title: Accelerating AllReduce with a Persistent Straggler

AllReduce mit einem persistenten Straggler beschleunigen

使用持久性斯特拉格驱动器加速全部拖动 2505.23523v1

Authors: Arjun Devraj, Eric Ding, Abhishek Vijaya Kumar, Robert Kleinberg, Rachee Singh

Distributed machine learning workloads use data and tensor parallelism for training and inference, both of which rely on the AllReduce collective to synchronize gradients or activations. However, bulk-synchronous AllReduce algorithms can be delayed by a persistent straggler that is slower to reach the synchronization barrier required to begin the collective. To address this challenge, we propose StragglAR: an AllReduce algorithm that accelerates distributed training and inference in the presence of persistent stragglers. StragglAR implements a ReduceScatter among the remaining GPUs during the straggler-induced delay, and then executes a novel collective algorithm to complete the AllReduce once the straggler reaches the synchronization barrier. StragglAR achieves a 2x theoretical speedup over popular bandwidth-efficient AllReduce algorithms (e.g., Ring) for large GPU clusters with persistent stragglers. On an 8-GPU server, our implementation of StragglAR yields a 22% speedup over state-of-the-art AllReduce algorithms.

分散的机器学习工作量在培训和推论方面使用数据和分解的平行法,两者都依靠 AllReduce 集体组合来同步梯度或激活。但是, 散装同步的全Reduce 算法可能会被一个持久性的分解器延缓, 而这种分解速度要慢到启动集体所需的同步屏障。为了应对这一挑战, 我们提议 StragglAR : 一种全Reduce 算法, 加速在持久性排挤者面前的分布式培训和推论。 StragglAR 在 strggler 引发的延缓期间, 在其余的 GPU 中实施一个减少分解器, 然后执行一种新的集体算法, 以在拖动器到达同步屏障后完成全Reduce 。 StragglAR 实现2x理论速度, 超过流行的带宽效率的全Reduce 算法( 如 Ring) 。在 8- GGPU 服务器上, 我们的 StragglAR 将产生一个超过 22% 的全局-Art- allRuedudes 算法。
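
The two-phase idea can be sketched with plain Python lists standing in for GPU buffers. The fast ranks run a ReduceScatter among themselves while the straggler is delayed, and the result is completed once the straggler's buffer arrives. The completion step below is a deliberately naive stand-in (add the straggler's chunks, then all-gather) and is not the paper's collective algorithm. All function names are hypothetical.

```python
def reduce_scatter(buffers):
    """Rank i ends up holding the element-wise sum of chunk i across all ranks."""
    ranks, length = len(buffers), len(buffers[0])
    assert length % ranks == 0, "buffer length must divide evenly into chunks"
    chunk = length // ranks
    return [
        [sum(buf[i * chunk + j] for buf in buffers) for j in range(chunk)]
        for i in range(ranks)
    ]

def all_gather(chunks):
    """Every rank receives the concatenation of all chunks."""
    full = [value for chunk in chunks for value in chunk]
    return [list(full) for _ in chunks]

def allreduce_with_straggler(fast_buffers, straggler_buffer):
    # Phase 1 (during the straggler's delay): reduce-scatter among fast ranks.
    partial = reduce_scatter(fast_buffers)
    chunk = len(partial[0])
    # Phase 2 (straggler has arrived): naive completion, NOT the paper's method.
    completed = [
        [partial[i][j] + straggler_buffer[i * chunk + j] for j in range(chunk)]
        for i in range(len(partial))
    ]
    results = all_gather(completed)
    results.append(list(results[0]))  # the straggler also receives the result
    return results

fast = [
    [1, 2, 3, 4, 5, 6],
    [10, 20, 30, 40, 50, 60],
    [100, 200, 300, 400, 500, 600],
]
slow = [1000, 2000, 3000, 4000, 5000, 6000]
for rank_result in allreduce_with_straggler(fast, slow):
    print(rank_result)
```

The point of the split is that phase 1 overlaps with time that would otherwise be spent idle at the synchronization barrier, so only the completion step sits on the critical path after the straggler shows up.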

Article 92

Title@2025-05-29 (4): Of Mice and Machines: A Comparison of Learning Between Real World Mice and RL Agents

Title: Of Mice and Machines: A Comparison of Learning Between Real World Mice and RL Agents

Von Mäusen und Maschinen: Ein Vergleich des Lernens zwischen Real World Mäusen und RL Agenten

Mice和Mings:真实世界Mice和RL代理商之间的学习比较 2505.12204v2

Authors: Shuo Han, German Espinosa, Junda Huang, Daniel A. Dombeck, Malcolm A. MacIver, Bradly C. Stadie

Recent advances in reinforcement learning (RL) have demonstrated impressive capabilities in complex decision-making tasks. This progress raises a natural question: how do these artificial systems compare to biological agents, which have been shaped by millions of years of evolution? To help answer this question, we undertake a comparative study of biological mice and RL agents in a predator-avoidance maze environment. Through this analysis, we identify a striking disparity: RL agents consistently demonstrate a lack of self-preservation instinct, readily risking “death” for marginal efficiency gains. These risk-taking strategies are in contrast to biological agents, which exhibit sophisticated risk-assessment and avoidance behaviors. Towards bridging this gap between the biological and artificial, we propose two novel mechanisms that encourage more naturalistic risk-avoidance behaviors in RL agents. Our approach leads to the emergence of naturalistic behaviors, including strategic environment assessment, cautious path planning, and predator avoidance patterns that closely mirror those observed in biological systems.

在强化学习(RL)方面最近取得的进展表明,在复杂的决策任务方面,能力令人印象深刻。这一进展提出了一个自然的问题:这些人工系统如何与生物剂进行比较,生物剂是成百上千万年演变形成的?为了帮助回答这个问题,我们对捕食者避险的迷宫环境中的生物小鼠和RL剂进行了比较研究。我们通过这一分析发现了一个显著的差别:RL代理物一贯表明缺乏自我保护本能,很容易冒着“死亡”的风险来提高效率。这些冒险战略与生物剂不同,生物剂表现出复杂的风险评估和避免行为。为缩小生物剂与人工剂之间的这一差距,我们提出了两个新机制,鼓励生物剂中更自然的避免风险行为。我们的方法导致自然行为的出现,包括战略环境评估、谨慎的路径规划以及密切反映生物系统所观察到的捕食者避免模式。

Article 93

Title@2025-05-29 (4): An AI System for Continuous Knee Osteoarthritis Severity Grading Using Self-Supervised Anomaly Detection with Limited Data

Title: An AI System for Continuous Knee Osteoarthritis Severity Grading Using Self-Supervised Anomaly Detection with Limited Data

Ein KI-System für kontinuierliche Knie-Osteoarthritis Schweregraduierung mittels selbstüberwachter Anomalieerkennung mit begrenzten Daten

AI 使用有限数据的自超异常检测系统 2407.11500v2

Authors: Niamh Belton, Aonghus Lawlor, Kathleen M. Curran

The diagnostic accuracy and subjectivity of existing Knee Osteoarthritis (OA) ordinal grading systems have been a subject of ongoing debate and concern. Existing automated solutions are trained to emulate these imperfect systems, whilst also being reliant on large annotated databases for fully-supervised training. This work proposes a three-stage approach for automated continuous grading of knee OA that is built upon the principles of Anomaly Detection (AD): learning a robust representation of healthy knee X-rays and grading disease severity based on its distance to the centre of normality. In the first stage, SS-FewSOME is proposed, a self-supervised AD technique that learns the ‘normal’ representation, requiring only examples of healthy subjects and <3% of the labels that existing methods require. In the second stage, this model is used to pseudo-label a subset of unlabelled data as ‘normal’ or ‘anomalous’, followed by denoising of pseudo labels with CLIP. The final stage involves retraining on labelled and pseudo-labelled data using the proposed Dual Centre Representation Learning (DCRL), which learns the centres of two representation spaces: normal and anomalous. Disease severity is then graded based on the distance to the learned centres. The proposed methodology outperforms existing techniques by margins of up to 24% in terms of OA detection, and the disease severity scores correlate with the Kellgren-Lawrence grading system at the same level as human expert performance. Code is available at https://github.com/niamhbelton/SS-FewSOME_Disease_Severity_Knee_Osteoarthritis.

现有Knee Osteoarthrates(OA) 或dinal等级系统(OA) 的诊断准确性和主观性一直是持续辩论和关注的主题。现有的自动化解决方案经过培训,以效仿这些不完善的系统,同时依靠大型附加说明的数据库进行充分监督的培训。这项工作提出了基于异常检测(AD) 原则的膝盖 OA 自动连续定级的三阶段方法; 学习健康的膝盖X光和定级疾病严重程度的强有力表现。在第一阶段, 提出了SS- FewSOME, 一种自我监督的AD技术, 以学习“ 正常” 代表, 仅需要健康科目的实例和 < 3% 现有方法所需的标签。在第二阶段, 该模型用于将一组未贴标签的数据贴上“ 正常” 或“ 恶性 ” 标签, 并随后与 CLIP 的假标签进行分级。最后阶段涉及使用拟议的 DIML 内部研究中心(DCRL) 级学习“正常” 标准/ Ralderalalalalalalalbalalalalal 系统进行再培训。该模型用于当前正常的常规中心, 和历史中心。的常规检测中心, 以现有的标准和现有标准中心为常规。

Article 94

Title@2025-05-29 (4): SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning

Title: SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning

SimBa: Einfachheit Bias für das Skalieren von Parametern im Deep Reinforcement Learning

SimBA: 深强化学习中增强参数的简单比值 2410.09754v2

Authors: Hojoon Lee, Dongyoon Hwang, Donghu Kim, Hyunseung Kim, Jun Jet Tai, Kaushik Subramanian, Peter R. Wurman, Jaegul Choo, Peter Stone, Takuma Seno

Recent advances in CV and NLP have been largely driven by scaling up the number of network parameters, despite traditional theories suggesting that larger networks are prone to overfitting. These large networks avoid overfitting by integrating components that induce a simplicity bias, guiding models toward simple and generalizable solutions. However, in deep RL, designing and scaling up networks have been less explored. Motivated by this opportunity, we present SimBa, an architecture designed to scale up parameters in deep RL by injecting a simplicity bias. SimBa consists of three components: (i) an observation normalization layer that standardizes inputs with running statistics, (ii) a residual feedforward block to provide a linear pathway from the input to output, and (iii) a layer normalization to control feature magnitudes. By scaling up parameters with SimBa, the sample efficiency of various deep RL algorithms-including off-policy, on-policy, and unsupervised methods-is consistently improved. Moreover, solely by integrating SimBa architecture into SAC, it matches or surpasses state-of-the-art deep RL methods with high computational efficiency across DMC, MyoSuite, and HumanoidBench. These results demonstrate SimBa’s broad applicability and effectiveness across diverse RL algorithms and environments.

近年来,CV和NLP的进展在很大程度上由扩大网络参数数量所驱动,尽管传统理论认为更大的网络容易过拟合。这些大型网络通过引入能够诱导简单性偏置的组件来避免过拟合,引导模型走向简单且可泛化的解。然而,在深度强化学习中,网络的设计与扩展较少被探索。基于这一机会,我们提出SimBa,一种通过注入简单性偏置来扩大深度强化学习参数规模的架构。SimBa由三个组件组成:(i)观测归一化层,用运行统计量对输入进行标准化;(ii)残差前馈块,提供从输入到输出的线性通路;(iii)层归一化,用于控制特征幅度。通过用SimBa扩大参数规模,各类深度强化学习算法(包括离策略、在策略和无监督方法)的样本效率得到一致提升。此外,仅将SimBa架构集成到SAC中,即可在DMC、MyoSuite和HumanoidBench上以较高的计算效率达到或超过最先进的深度强化学习方法。这些结果表明SimBa在不同强化学习算法和环境中具有广泛的适用性和有效性。
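The three SimBa components listed in the abstract can be sketched in NumPy roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `RunningNorm`, `residual_block`, and `simba_like_encoder` are hypothetical, the pre-normalized ReLU MLP inside the block is an assumed design, and all weights are plain arrays rather than trained parameters.

```python
import numpy as np

class RunningNorm:
    """Observation normalizer that tracks a running mean and variance (hypothetical helper)."""

    def __init__(self, dim, eps=1e-8):
        self.mean = np.zeros(dim)
        self.var = np.ones(dim)
        self.count = 0
        self.eps = eps

    def update(self, batch):
        # Merge the batch statistics into the running statistics.
        n = batch.shape[0]
        b_mean, b_var = batch.mean(axis=0), batch.var(axis=0)
        total = self.count + n
        delta = b_mean - self.mean
        m2 = self.var * self.count + b_var * n + delta**2 * self.count * n / total
        self.mean = self.mean + delta * n / total
        self.var = m2 / total
        self.count = total

    def __call__(self, obs):
        return (obs - self.mean) / np.sqrt(self.var + self.eps)


def layer_norm(x, eps=1e-5):
    # Normalize each feature vector to zero mean and unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)


def residual_block(x, w1, w2):
    # Identity skip path plus an MLP applied to the normalized input (assumed layout).
    hidden = np.maximum(layer_norm(x) @ w1, 0.0)
    return x + hidden @ w2


def simba_like_encoder(obs, norm, w_in, blocks):
    # (i) running-statistics normalization, (ii) residual blocks, (iii) final layer norm.
    h = norm(obs) @ w_in
    for w1, w2 in blocks:
        h = residual_block(h, w1, w2)
    return layer_norm(h)
```

Because each block only adds to its input, setting the second weight matrix of every block to zero leaves a purely linear map from the normalized observation to the pre-normalization features, which is one way to read the "linear pathway" mentioned in the abstract.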

Article 95

Title@2025-05-29 (4): OmniEarth-Bench: Towards Holistic Evaluation of Earth’s Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data

Title: OmniEarth-Bench: Towards Holistic Evaluation of Earth’s Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data

OmniEarth-Bench: Auf dem Weg zu einer ganzheitlichen Bewertung der sechs Sphären und der Wechselwirkungen zwischen der Erde und multimodalen Erddaten

Omni地球环境:争取全面评价地球六层和与多模式对地观测地球数据交互作用 2505.23522v1

Authors: Fengxiang Wang, Mingshuo Chen, Xuming He, YiFan Zhang, Feng Liu, Zijie Guo, Zhenghao Hu, Jiong Wang, Jingyi Xu, Zhangrui Li, Fenghua Ling, Ben Fei, Weijia Li, Long Lan, Wenjing Yang, Wenlong Zhang, Lei Bai

Existing benchmarks for Earth science multimodal learning exhibit critical limitations in systematic coverage of geosystem components and cross-sphere interactions, often constrained to isolated subsystems (only the Human-activities sphere or the atmosphere) with limited evaluation dimensions (fewer than 16 tasks). To address these gaps, we introduce OmniEarth-Bench, the first comprehensive multimodal benchmark spanning all six Earth science spheres (atmosphere, lithosphere, Oceansphere, cryosphere, biosphere and Human-activities sphere) and cross-spheres with one hundred expert-curated evaluation dimensions. Leveraging observational data from satellite sensors and in-situ measurements, OmniEarth-Bench integrates 29,779 annotations across four tiers: perception, general reasoning, scientific knowledge reasoning and chain-of-thought (CoT) reasoning. Construction involved 2-5 experts per sphere, who established authoritative evaluation dimensions and curated relevant observational datasets, and 40 crowd-sourcing annotators, who assisted the experts with annotation; finally, OmniEarth-Bench was validated via hybrid expert-crowd workflows to reduce label ambiguity. Experiments on 9 state-of-the-art MLLMs reveal that even the most advanced models struggle with our benchmark: none of them reaches 35% accuracy. In some cross-sphere tasks in particular, the performance of leading models such as GPT-4o drops to 0.0%. OmniEarth-Bench sets a new standard for geosystem-aware AI, advancing both scientific discovery and practical applications in environmental monitoring and disaster prediction. The dataset, source code, and trained models have been released.

现有的地球科学多模态学习基准在系统覆盖地球系统各组成部分和跨圈层相互作用方面存在明显局限,往往局限于孤立的子系统(仅涉及人类活动圈或大气圈),评估维度也有限(少于16项任务)。为弥补这些不足,我们提出OmniEarth-Bench,这是首个覆盖全部六个地球科学圈层(大气圈、岩石圈、海洋圈、冰冻圈、生物圈和人类活动圈)及跨圈层的综合多模态基准,包含一百个由专家整理的评估维度。OmniEarth-Bench利用卫星传感器和实地测量的观测数据,整合了29,779条标注,分为四个层级:感知、一般推理、科学知识推理和思维链(CoT)推理。每个圈层由2至5名专家确立权威的评估维度并整理相关观测数据集,40名众包标注员协助专家完成标注,最后通过专家与众包相结合的流程对OmniEarth-Bench进行验证,以减少标签歧义。在9个最先进的多模态大语言模型上的实验表明,即使是最先进的模型在该基准上也表现不佳,没有一个模型的准确率达到35%。尤其在某些跨圈层任务中,GPT-4o等领先模型的性能降至0.0%。OmniEarth-Bench为具备地球系统意识的AI设立了新标准,推动了环境监测和灾害预测中的科学发现与实际应用。数据集、源代码和训练好的模型均已发布。

Article 96

Title@2025-05-29 (4): AnchorAttention: Difference-Aware Sparse Attention with Stripe Granularity

Title: AnchorAttention: Difference-Aware Sparse Attention with Stripe Granularity

AnkerAchtung: Differenz-Bewusst Sparse Achtung mit Streifen Granularität

锁定目标: 带条形颗粒的差别- 软件分散注意 2505.23520v1

Authors: Yu Zhang, Dong Guo, Fang Wu, Guoliang Zhu, Dian Ding, Yiming Zhang

Large Language Models (LLMs) with extended context lengths face significant computational challenges during the pre-filling phase, primarily due to the quadratic complexity of self-attention. Existing methods typically employ dynamic pattern matching and block-sparse low-level implementations. However, their reliance on local information for pattern identification fails to capture global contexts, and the coarse granularity of blocks leads to persistent internal sparsity, resulting in suboptimal accuracy and efficiency. To address these limitations, we propose AnchorAttention, a difference-aware, dynamic sparse attention mechanism that efficiently identifies critical attention regions at a finer stripe granularity while adapting to global contextual information, achieving superior speed and accuracy. AnchorAttention comprises three key components: (1) Pattern-based Anchor Computation, leveraging the commonalities present across all inputs to rapidly compute a set of near-maximum scores as the anchor; (2) Difference-aware Stripe Sparsity Identification, performing difference-aware comparisons with the anchor to quickly obtain discrete coordinates of significant regions in a stripe-like sparsity pattern; (3) Fine-grained Sparse Computation, replacing the traditional contiguous KV block loading approach with simultaneous discrete KV position loading to maximize sparsity rates while preserving full hardware computational potential. With its finer-grained sparsity strategy, AnchorAttention achieves higher sparsity rates at the same recall level, significantly reducing computation time. Compared to previous state-of-the-art methods, at a text length of 128k, it achieves a speedup of 1.44× while maintaining higher recall rates.

具有长上下文的大型语言模型(LLM)在预填充阶段面临显著的计算挑战,这主要源于自注意力的二次复杂度。现有方法通常采用动态模式匹配和块稀疏的底层实现。然而,它们依赖局部信息进行模式识别,无法捕捉全局上下文,而且块的粗粒度导致块内仍存在稀疏性,造成精度和效率欠佳。为解决这些局限,我们提出AnchorAttention,一种差异感知的动态稀疏注意力机制,能够在更细的条带粒度上高效识别关键注意力区域,同时适应全局上下文信息,从而取得更优的速度和精度。AnchorAttention包含三个关键组件:(1)基于模式的锚点计算,利用所有输入共有的特性快速计算一组接近最大值的分数作为锚点;(2)差异感知的条带稀疏性识别,通过与锚点进行差异感知的比较,快速获得条带状稀疏模式中重要区域的离散坐标;(3)细粒度稀疏计算,用同时加载离散KV位置取代传统的连续KV块加载,在保持硬件全部计算潜力的同时最大化稀疏率。凭借更细粒度的稀疏策略,AnchorAttention在相同召回率下实现了更高的稀疏率,显著减少了计算时间。与此前最先进的方法相比,在128k的文本长度下,它实现了1.44×的加速,同时保持更高的召回率。

Article 97

Title@2025-05-29 (4): Hyperspherical Normalization for Scalable Deep Reinforcement Learning

Title: Hyperspherical Normalization for Scalable Deep Reinforcement Learning

Hypersphärische Normalisierung für skalierbares Deep Reinforcement Learning

可缩放深强化学习超球常规化 2502.15280v2

Authors: Hojoon Lee, Youngdo Lee, Takuma Seno, Donghu Kim, Peter Stone, Jaegul Choo

Scaling up the model size and computation has brought consistent performance improvements in supervised learning. However, this lesson often fails to apply to reinforcement learning (RL) because training the model on non-stationary data easily leads to overfitting and unstable optimization. In response, we introduce SimbaV2, a novel RL architecture designed to stabilize optimization by (i) constraining the growth of weight and feature norm by hyperspherical normalization; and (ii) using a distributional value estimation with reward scaling to maintain stable gradients under varying reward magnitudes. Using the soft actor-critic as a base algorithm, SimbaV2 scales up effectively with larger models and greater compute, achieving state-of-the-art performance on 57 continuous control tasks across 4 domains. The code is available at https://dojeon-ai.github.io/SimbaV2.

扩大模型规模和计算使受监督的学习得到一致的绩效改进。然而,这一教训往往不适用于强化学习(RL),因为培训非静止数据模型很容易导致超常和不稳定的优化。作为回应,我们引入了SimbaV2,这是一个新的RL结构,旨在通过超球正常化来稳定优化,(一) 限制重量和特征规范的增长;以及(二) 使用分配价值估计,并给予一定的奖励,以维持不同奖励程度下的稳定梯度。利用软性行为者-批评作为基本算法,SimbaV2 有效地与更大的模型进行升级,并进行更大的计算,在4个领域的57项连续控制任务上实现最先进的业绩。该代码可在https://dojeon-ai.github.io/SimbaV2上查阅。
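As a rough illustration of what "hyperspherical normalization" can mean in practice, the sketch below keeps weight rows and feature vectors on the unit sphere. It is an assumption-laden toy, not the SimbaV2 code: the helper names `l2_normalize` and `sgd_step_on_sphere` are hypothetical, and projecting after a plain gradient step is only one possible way to constrain norm growth.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Rescale vectors along `axis` to (approximately) unit Euclidean norm.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def sgd_step_on_sphere(weights, grad, lr):
    # Plain gradient step, then project every weight row back onto the unit sphere
    # so that weight norms cannot grow over training (assumed scheme).
    return l2_normalize(weights - lr * grad, axis=-1)


def normalized_linear(features, weights):
    # Unit-norm features times unit-norm weight rows: each output is a cosine
    # similarity, hence bounded in [-1, 1] regardless of model size.
    return l2_normalize(features) @ l2_normalize(weights).T
```

With both inputs normalized, the pre-activations stay bounded however wide the layer is, which matches the abstract's stated goal of constraining weight and feature norms.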

Article 98

Title@2025-05-29 (4): SGD Jittering: A Training Strategy for Robust and Accurate Model-Based Architectures

Title: SGD Jittering: A Training Strategy for Robust and Accurate Model-Based Architectures

SGD Jittering: Eine Schulungsstrategie für robuste und präzise modellbasierte Architekturen

SGD Jittering: 面向稳健且准确的基于模型架构的训练策略 2410.14667v2

Authors: Peimeng Guan, Mark A. Davenport

Inverse problems aim to reconstruct unseen data from corrupted or perturbed measurements. While most work focuses on improving reconstruction quality, generalization accuracy and robustness are equally important, especially for safety-critical applications. Model-based architectures (MBAs), such as loop unrolling methods, are considered more interpretable and achieve better reconstructions. Empirical evidence suggests that MBAs are more robust to perturbations than black-box solvers, but the accuracy-robustness tradeoff in MBAs remains underexplored. In this work, we propose a simple yet effective training scheme for MBAs, called SGD jittering, which injects noise iteration-wise during reconstruction. We theoretically demonstrate that SGD jittering not only generalizes better than standard mean squared error training but is also more robust to average-case attacks. We validate SGD jittering using denoising toy examples, seismic deconvolution, and single-coil MRI reconstruction. Both SGD jittering and its SPGD extension yield cleaner reconstructions for out-of-distribution data and demonstrate enhanced robustness against adversarial attacks.

反面问题旨在从腐败或扰动测量中重建隐蔽数据。虽然大多数工作的重点是提高重建质量,但普遍化的准确性和稳健性同样重要,特别是对于安全关键应用而言。基于模型的建筑(MBAs),例如环状无滚动方法,被认为更容易解释,并实现更好的重建。经验性证据表明,MBAs比黑箱解决问题者更能进行扰动,但MBAs的准确性-紫色交易仍未得到充分探讨。在这项工作中,我们提出了一个简单而有效的MBAs培训计划,称为SGD振动,在重建过程中注入噪音。我们理论上证明SGD的振动不仅比标准的平均平方错误培训更好,而且比普通攻击更强。我们用不注意的玩具、地震变电和单原油MRI等例子来验证SGD振动。SGD的震动及其SPGD扩展为MD提供了更清洁的重建,用于向外分配数据,并显示对对抗性攻击的加强力度。
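The idea of injecting noise at every iteration of an unrolled reconstruction can be sketched on a plain least-squares problem. This is a simplified illustration, not the paper's training scheme: the function `unrolled_gd` is hypothetical, the learned components of a model-based architecture are omitted, and adding Gaussian noise to the iterate after each gradient step is an assumption about where the jitter enters.

```python
import numpy as np

def unrolled_gd(A, y, n_iters=50, step=None, jitter_std=0.0, rng=None):
    """Unrolled gradient descent on 0.5 * ||A x - y||^2 with optional per-iteration jitter."""
    rng = np.random.default_rng() if rng is None else rng
    if step is None:
        # 1 / (largest singular value)^2 guarantees a non-expansive gradient step.
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y)   # data-fidelity gradient
        x = x - step * grad        # one unrolled iteration
        if jitter_std > 0:
            # Jittering (assumed form): perturb the iterate during training.
            x = x + jitter_std * rng.standard_normal(x.shape)
    return x
```

In a training loop one would presumably call this with `jitter_std > 0` and switch the noise off at test time by passing `jitter_std=0.0`; that split is part of the assumption above.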

Article 99

Title@2025-05-29 (4): Joint Localization and Activation Editing for Low-Resource Fine-Tuning

Title: Joint Localization and Activation Editing for Low-Resource Fine-Tuning

Gemeinsame Lokalisierungs- und Aktivierungsbearbeitung für Low-Resource Fine-Tuning

低资源微调联合定位和启动编辑 2502.01179v4

Authors: Wen Lai, Alexander Fraser, Ivan Titov

Parameter-efficient fine-tuning (PEFT) methods, such as LoRA, are commonly used to adapt LLMs. However, the effectiveness of standard PEFT methods is limited in low-resource scenarios with only a few hundred examples. Recent advances in interpretability research have inspired the emergence of activation editing (or steering) techniques, which modify the activations of specific model components. Due to their extremely small parameter counts, these methods show promise for small datasets. However, their performance is highly dependent on identifying the correct modules to edit and often lacks stability across different datasets. In this paper, we propose Joint Localization and Activation Editing (JoLA), a method that jointly learns (1) which heads in the Transformer to edit, (2) whether the intervention should be additive, multiplicative, or both, and (3) the intervention parameters themselves, that is, the vectors applied as additive offsets or multiplicative scalings to the head output. Through evaluations on three benchmarks spanning commonsense reasoning, natural language understanding, and natural language generation, we demonstrate that JoLA consistently outperforms existing methods. The code for the method is released at https://github.com/wenlai-lavine/jola.

参数高效微调(PEFT)方法(如LoRA)常用于适配大型语言模型。然而,在只有几百个样本的低资源场景中,标准PEFT方法的效果有限。可解释性研究的最新进展催生了激活编辑(或引导)技术,这类技术修改特定模型组件的激活。由于参数量极小,这些方法在小数据集上显示出潜力。然而,它们的性能高度依赖于找到应编辑的正确模块,并且在不同数据集之间往往缺乏稳定性。本文提出联合定位与激活编辑(JoLA),该方法联合学习:(1)编辑Transformer中的哪些注意力头;(2)干预应为加性、乘性还是两者兼有;(3)干预参数本身,即作为加性偏移或乘性缩放施加到注意力头输出上的向量。通过在涵盖常识推理、自然语言理解和自然语言生成的三个基准上的评估,我们证明JoLA始终优于现有方法。该方法的代码发布于 https://github.com/wenlai-lavine/jola。
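The three things JoLA is described as learning (which heads, which type of intervention, and the intervention vectors) suggest a gated edit of each attention head's output. The sketch below is one plausible form under explicit assumptions: the function name `edit_head_outputs` is hypothetical, the gates are given as fixed numbers in [0, 1] rather than learned, and the exact parameterization in the paper may differ.

```python
import numpy as np

def edit_head_outputs(head_out, add_vec, mul_vec, gate_add, gate_mul):
    """Apply gated additive and multiplicative edits to attention-head outputs.

    head_out, add_vec, mul_vec: arrays of shape (n_heads, d_head)
    gate_add, gate_mul:         arrays of shape (n_heads,), values in [0, 1]
    """
    # Multiplicative scaling: a closed gate (0) leaves the head's scale at exactly 1.
    scale = 1.0 + gate_mul[:, None] * mul_vec
    # Additive offset: a closed gate (0) contributes nothing.
    offset = gate_add[:, None] * add_vec
    return scale * head_out + offset
```

A head whose two gates are both zero passes through unchanged, so selecting heads and selecting the intervention type reduce to learning the gate values.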

Article 100

Title@2025-05-29 (4): DeepFilterGAN: A Full-band Real-time Speech Enhancement System with GAN-based Stochastic Regeneration

Title: DeepFilterGAN: A Full-band Real-time Speech Enhancement System with GAN-based Stochastic Regeneration

DeepFilterGAN: Ein Full-Band-Real-Time-Speech Enhancement-System mit GAN-basierter stochastischer Regeneration

DeepFilterGAN:全频实时语音增强系统,以GAN为基础进行蒸汽再生 2505.23515v1

Authors: Sanberk Serbest, Tijana Stojkovic, Milos Cernak, Andrew Harper

In this work, we propose a full-band real-time speech enhancement system with GAN-based stochastic regeneration. Predictive models focus on estimating the mean of the target distribution, whereas generative models aim to learn the full distribution. This behavior of predictive models may lead to over-suppression, i.e., the removal of speech content. Prior work has shown that combining a predictive model with a generative one within the stochastic regeneration framework can reduce the distortion in the output. We use this framework to obtain a real-time speech enhancement system. With 3.58M parameters and low latency, our system is designed for real-time streaming with a lightweight architecture. Experiments show that our system improves over the first stage in terms of the NISQA-MOS metric. Finally, through an ablation study, we show the importance of noisy conditioning in our system. We participated in the 2025 Urgent Challenge with our model and later made further improvements.

在这项工作中,我们建议采用基于GAN的随机再生功能全波实时语音增强系统。预测模型侧重于估算目标分布的平均值, 而基因模型则旨在学习完全分布。这种预测模型的行为可能导致过度压抑, 即删除语音内容。在文献中, 文献显示, 将预测模型与随机再生框架内的基因模型相结合, 可以减少产出的扭曲。我们使用这个框架来获取实时语音增强系统。由于有3.58M参数和低耐久性, 我们的系统是设计用于使用轻量结构实时流的。实验显示,我们的系统在第一阶段里在新QA- MOS 衡量标准方面有所改进。最后, 我们通过模拟研究, 展示了我们系统中噪音调节的重要性。我们参加了2025年的紧急挑战, 并随后做了进一步的改进。

Article 101

Title@2025-05-29 (4): Spectrotemporal Modulation: Efficient and Interpretable Feature Representation for Classifying Speech, Music, and Environmental Sounds

Title: Spectrotemporal Modulation: Efficient and Interpretable Feature Representation for Classifying Speech, Music, and Environmental Sounds

Spektrotemporale Modulation: Effiziente und interpretierbare Feature-Darstellung für die Klassifizierung von Sprach-, Musik- und Umweltgeräuschen

时速变化:演讲、音乐和环境声音的分类化演讲、音乐和环境声音的高效和可解释的地物代表 2505.23509v1

Authors: Andrew Chang, Yike Li, Iran R. Roman, David Poeppel

Audio DNNs have demonstrated impressive performance on various machine listening tasks; however, most of their representations are computationally costly and uninterpretable, leaving room for optimization. Here, we propose a novel approach centered on spectrotemporal modulation (STM) features, a signal processing method that mimics the neurophysiological representation in the human auditory cortex. The classification performance of our STM-based model, without any pretraining, is comparable to that of pretrained audio DNNs across diverse naturalistic speech, music, and environmental sounds, which are essential categories for both human cognition and machine perception. These results show that STM is an efficient and interpretable feature representation for audio classification, advancing the development of machine listening and unlocking exciting new possibilities for basic understanding of speech and auditory sciences, as well as developing audio BCI and cognitive computing.

音频DNN在各种机器监听任务上表现出了令人印象深刻的表现;然而,它们的大部分陈述都是计算成本高且无法解释的,留下优化的空间。在这里,我们建议了一种以光谱时温特征为核心的新颖方法,一种模仿人类听觉皮层中神经生理特征的信号处理方法。我们基于STM的模型的分类性能,没有经过任何培训,与经过预先训练的来自各种自然学语言、音乐和环境声音的音频DNN的分类性能相似,而后者是人类认知和机器认知的基本类别。这些结果表明,STM是一种高效且可解释的音频分类特征代表,推动了机器监听的发展,为基本理解语音和听觉科学以及发展音频 BCI 和认知计算提供了令人兴奋的新的可能性。

Article 102

Title@2025-05-29 (4): Why Machine Learning Models Fail to Fully Capture Epistemic Uncertainty

Title: Why Machine Learning Models Fail to Fully Capture Epistemic Uncertainty

Warum Modelle des maschinellen Lernens die epistemische Unsicherheit nicht vollständig erfassen

机器学习模型为何不能完全捕捉宇宙的不确定性 2505.23506v1

Authors: Sebastián Jiménez, Mira Jürgens, Willem Waegeman

In recent years various supervised learning methods that disentangle aleatoric and epistemic uncertainty based on second-order distributions have been proposed. We argue that these methods fail to capture critical components of epistemic uncertainty, particularly due to the often-neglected component of model bias. To show this, we make use of a more fine-grained taxonomy of epistemic uncertainty sources in machine learning models, and analyse how the classical bias-variance decomposition of the expected prediction error can be decomposed into different parts reflecting these uncertainties. By using a simulation-based evaluation protocol which encompasses epistemic uncertainty due to both procedural- and data-driven uncertainty components, we illustrate that current methods rarely capture the full spectrum of epistemic uncertainty. Through theoretical insights and synthetic experiments, we show that high model bias can lead to misleadingly low estimates of epistemic uncertainty, and common second-order uncertainty quantification methods systematically blur bias-induced errors into aleatoric estimates, thereby underrepresenting epistemic uncertainty. Our findings underscore that meaningful aleatoric estimates are feasible only if all relevant sources of epistemic uncertainty are properly represented.

近些年来,提出了基于二阶分布的分解偏向性和共感性不确定性的各种受监督的学习方法。我们争辩说,这些方法未能捕捉成瘾性不确定性的关键组成部分,特别是由于模型偏向往往被忽略的成分。为了表明这一点,我们使用了一种在机器学习模型中更精细的成瘾性不确定性来源分类法,并分析了如何将预期的预测错误的典型偏差差异分解分解成反映这些不确定性的不同部分。我们通过使用模拟评价协议,包括基于程序和数据不确定性组成部分的共感性不确定性,我们说明目前的方法很少能捕捉成瘾性不确定性的全部范围。我们通过理论见解和合成实验,我们表明,高模型偏差可能导致对成瘾性不确定性的错误估计偏差,以及常见的二阶级不确定性量化方法,系统模糊偏差导致的误差,从而低估了这些不确定性。我们的调查结果强调,只有在所有有关的成瘾来源都得到适当体现的情况下,才可行。

Article 103

Title@2025-05-29 (4): Hijacking Large Language Models via Adversarial In-Context Learning

Title: Hijacking Large Language Models via Adversarial In-Context Learning

Entführen von großen Sprachmodellen über das adversarische In-Context-Lernen

通过对抗性内书学习劫持大语言模式 2311.09948v3

Authors: Xiangyu Zhou, Yao Qiang, Saleh Zare Zade, Prashant Khanduri, Dongxiao Zhu

In-context learning (ICL) has emerged as a powerful paradigm leveraging LLMs for specific downstream tasks by utilizing labeled examples as demonstrations (demos) in the preconditioned prompts. Despite its promising performance, crafted adversarial attacks pose a notable threat to the robustness of LLMs. Existing attacks are either easy to detect, require a trigger in user input, or lack specificity towards ICL. To address these issues, this work introduces a novel transferable prompt injection attack against ICL, aiming to hijack LLMs to generate the target output or elicit harmful responses. In our threat model, the hacker acts as a model publisher who leverages a gradient-based prompt search method to learn and append imperceptible adversarial suffixes to the in-context demos via prompt injection. We also propose effective defense strategies using a few shots of clean demos, enhancing the robustness of LLMs during ICL. Extensive experimental results across various classification and jailbreak tasks demonstrate the effectiveness of the proposed attack and defense strategies. This work highlights the significant security vulnerabilities of LLMs during ICL and underscores the need for further in-depth studies.

理论内学(ICL)已成为一种强有力的范例,通过在先决条件的提示下,利用标记的例子作为示范(演示),利用LLM执行具体的下游任务,使LLM发挥杠杆作用。尽管其表现令人充满希望,但精心策划的对抗性攻击对LLM的强健性构成了显著的威胁。现有的攻击要么容易发现,需要用户投入,或者对ICL缺乏具体性。为解决这些问题,这项工作引入了针对ICL的新型可转移的迅速注射攻击,目的是劫持LLMS,以产生目标产出或引起有害反应。在我们的威胁模式中,黑客充当了利用基于梯度的快速搜索方法学习和通过迅速注射将无法察觉的对抗性后遗症附在文本内演示中的一种示范出版商。我们还提出使用几支干净的演示镜头的有效防御战略,加强LLMs在ICL期间的强健性。各种分类和破监狱任务的广泛实验结果表明拟议的攻击和防御战略的有效性。这项工作突出了LMS在ICL期间的重大安全脆弱性,并强调需要进一步深入研究。

Article 104

Title@2025-05-29 (4): Epistemic Errors of Imperfect Multitask Learners When Distributions Shift

Title: Epistemic Errors of Imperfect Multitask Learners When Distributions Shift

Epistemische Fehler von unvollkommenen Multitask Learner bei Verteilungsverschiebungen

分布偏移下不完美多任务学习者的认知误差 2505.23496v1

Authors: Sabina J. Sloman, Michele Caprio, Samuel Kaski

When data are noisy, a statistical learner’s goal is to resolve epistemic uncertainty about the data it will encounter at test-time, i.e., to identify the distribution of test (target) data. Many real-world learning settings introduce sources of epistemic uncertainty that cannot be resolved on the basis of training (source) data alone: the source data may arise from multiple tasks (multitask learning), the target data may differ systematically from the source data tasks (distribution shift), and/or the learner may not arrive at an accurate characterization of the source data (imperfect learning). We introduce a principled definition of epistemic error, and provide a generic, decompositional epistemic error bound. Our error bound is the first to (i) consider epistemic error specifically, (ii) accommodate all the sources of epistemic uncertainty above, and (iii) separately attribute the error to each of multiple aspects of the learning procedure and environment. As corollaries of the generic result, we provide (i) epistemic error bounds specialized to the settings of Bayesian transfer learning and distribution shift within $\epsilon$-neighborhoods, and (ii) a set of corresponding generalization bounds. Finally, we provide a novel definition of negative transfer, and validate its insights in a synthetic experimental setting.

当数据含噪声时,统计学习者的目标是消除对测试时将遇到的数据的认知不确定性,即识别测试(目标)数据的分布。许多现实学习场景引入了仅凭训练(源)数据无法消除的认知不确定性来源:源数据可能来自多个任务(多任务学习),目标数据可能与源数据任务存在系统性差异(分布偏移),和/或学习者可能无法准确刻画源数据(不完美学习)。我们给出认知误差的原则性定义,并提供一个通用的、可分解的认知误差界。该误差界首次做到:(i)专门考虑认知误差;(ii)涵盖上述所有认知不确定性来源;(iii)将误差分别归因于学习过程和环境的多个方面。作为该通用结果的推论,我们给出:(i)针对贝叶斯迁移学习和$\epsilon$-邻域内分布偏移这两种设定的认知误差界;(ii)一组相应的泛化界。最后,我们给出负迁移的新定义,并在合成实验设定中验证其见解。

Article 105

Title@2025-05-29 (4): Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

Title: Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

Diagnose und Bewältigung von Pitfalls in KG-RAG-Datensätzen: Zu zuverlässigerem Benchmarking

分析和处理KG-RAG数据集的缺陷:争取更可靠的基准 2505.23495v1

Authors: Liangliang Zhang, Zhuorui Jiang, Hongliang Chi, Haoyang Chen, Mohammed Elkoumy, Fali Wang, Qiong Wu, Zhengyi Zhou, Shirui Pan, Suhang Wang, Yao Ma

Knowledge Graph Question Answering (KGQA) systems rely on high-quality benchmarks to evaluate complex multi-hop reasoning. However, despite their widespread use, popular datasets such as WebQSP and CWQ suffer from critical quality issues, including inaccurate or incomplete ground-truth annotations, poorly constructed questions that are ambiguous, trivial, or unanswerable, and outdated or inconsistent knowledge. Through a manual audit of 16 popular KGQA datasets, including WebQSP and CWQ, we find that the average factual correctness rate is only 57%. To address these issues, we introduce KGQAGen, an LLM-in-the-loop framework that systematically resolves these pitfalls. KGQAGen combines structured knowledge grounding, LLM-guided generation, and symbolic verification to produce challenging and verifiable QA instances. Using KGQAGen, we construct KGQAGen-10k, a ten-thousand-scale benchmark grounded in Wikidata, and evaluate a diverse set of KG-RAG models. Experimental results demonstrate that even state-of-the-art systems struggle on this benchmark, highlighting its ability to expose limitations of existing models. Our findings advocate for more rigorous benchmark construction and position KGQAGen as a scalable framework for advancing KGQA evaluation.

知识图谱问答(KGQA)系统依赖高质量基准来评估复杂的多跳推理。然而,尽管WebQSP和CWQ等流行数据集被广泛使用,它们仍存在严重的质量问题,包括不准确或不完整的真值标注,构造不当的问题(含糊、琐碎或无法回答),以及过时或不一致的知识。通过对包括WebQSP和CWQ在内的16个流行KGQA数据集进行人工审计,我们发现平均事实正确率仅为57%。为解决这些问题,我们提出KGQAGen,一个系统性解决这些缺陷的LLM-in-the-loop框架。KGQAGen结合结构化知识锚定、LLM引导的生成和符号验证,生成具有挑战性且可验证的问答实例。利用KGQAGen,我们构建了KGQAGen-10k,一个基于Wikidata的万级规模基准,并评估了多种KG-RAG模型。实验结果表明,即使是最先进的系统在该基准上也表现不佳,凸显了它揭示现有模型局限的能力。我们的发现主张更严格的基准构建,并将KGQAGen定位为推进KGQA评估的可扩展框架。

Article 106

Title@2025-05-29 (4): Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning

Title: Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning

Kurzbefehle in audio-visuellen Deepfake-Erkennungsdatensätzen mit unüberwachtem Lernen

在未经监督的学习的视听深假发现数据集中绕过捷径 2412.00175v3

Authors: Stefan Smeu, Dragos-Alexandru Boldisor, Dan Oneata, Elisabeta Oneata

Good datasets are essential for developing and benchmarking any machine learning system. Their importance is even greater for safety-critical applications such as deepfake detection, the focus of this paper. Here we reveal that two of the most widely used audio-video deepfake datasets suffer from a previously unidentified spurious feature: the leading silence. Fake videos start with a very brief moment of silence, and based on this feature alone we can separate the real and fake samples almost perfectly. As such, previous audio-only and audio-video models exploit the presence of silence in the fake videos and consequently perform worse when the leading silence is removed. To avoid latching onto this unwanted artifact, and possibly other unrevealed ones, we propose a shift from supervised to unsupervised learning by training models exclusively on real data. We show that by aligning self-supervised audio-video representations we remove the risk of relying on dataset-specific biases and improve robustness in deepfake detection.

良好的数据集对于开发和设定任何机器学习系统的基准至关重要。它们的重要性对于安全关键应用来说甚至更为极端, 比如深假检测 — — 本文的焦点。我们在这里揭示了两个最广泛使用的音像深假数据集存在一个先前不明的虚假特征: 领先的沉默。假的视频从一个非常短暂的沉默开始, 仅以这一特征为基础, 我们就可以完美地分离真实和假的样本。因此, 以前的只听音和视频模型利用了假的视频中的沉默, 从而在消除主要沉默时表现得更差。为了绕过对此类不需要的文物和其他可能未销毁的文物的悬念, 我们提议从仅靠真实数据的培训模型进行监管的学习转向不受监督的学习。我们表明,通过调整自我监控的视频演示,我们可以消除依赖数据集特定偏差的风险, 并改进深底片探测的稳健性。

Article 107

Title@2025-05-29 (4): A False Discovery Rate Control Method Using a Fully Connected Hidden Markov Random Field for Neuroimaging Data

Title: A False Discovery Rate Control Method Using a Fully Connected Hidden Markov Random Field for Neuroimaging Data

Eine falsche Discovery Rate Control-Methode mit einem vollständig verbundenen versteckten Markov Random Field für Neuroimaging-Daten

假发现率控制方法, 使用完全连接的隐藏 Markov 随机字段来生成 Neuroimage 数据 2505.20688v2

Authors: Taehyo Kim, Qiran Jia, Mony J. de Leon, Hai Shu

False discovery rate (FDR) control methods are essential for voxel-wise multiple testing in neuroimaging data analysis, where hundreds of thousands or even millions of tests are conducted to detect brain regions associated with disease-related changes. Classical FDR control methods (e.g., BH, q-value, and LocalFDR) assume independence among tests and often lead to high false non-discovery rates (FNR). Although various spatial FDR control methods have been developed to improve power, they still fall short of jointly addressing three major challenges in neuroimaging applications: capturing complex spatial dependencies, maintaining low variability in both false discovery proportion (FDP) and false non-discovery proportion (FNP) across replications, and achieving computational scalability for high-resolution data. To address these challenges, we propose fcHMRF-LIS, a powerful, stable, and scalable spatial FDR control method for voxel-wise multiple testing. It integrates the local index of significance (LIS)-based testing procedure with a novel fully connected hidden Markov random field (fcHMRF) designed to model complex spatial structures using a parsimonious parameterization. We develop an efficient expectation-maximization algorithm incorporating mean-field approximation, the Conditional Random Fields as Recurrent Neural Networks (CRF-RNN) technique, and permutohedral lattice filtering, reducing the time complexity from quadratic to linear in the number of tests. Extensive simulations demonstrate that fcHMRF-LIS achieves accurate FDR control, lower FNR, reduced variability in FDP and FNP, and a higher number of true positives compared to existing methods. Applied to an FDG-PET dataset from the Alzheimer’s Disease Neuroimaging Initiative, fcHMRF-LIS identifies neurobiologically relevant brain regions and offers notable advantages in computational efficiency.

错误发现率(FDR)控制方法对于神经影像数据分析中的体素级多重检验至关重要,此类分析需要进行数十万甚至数百万次检验,以检测与疾病相关变化有关的脑区。经典FDR控制方法(如BH、q值和LocalFDR)假设各检验相互独立,往往导致较高的错误未发现率(FNR)。尽管已有多种空间FDR控制方法用于提高检验功效,但它们仍未能同时应对神经影像应用中的三大挑战:刻画复杂的空间依赖性,在重复实验中保持错误发现比例(FDP)和错误未发现比例(FNP)的低变异性,以及对高分辨率数据实现计算可扩展性。为应对这些挑战,我们提出fcHMRF-LIS,一种强大、稳定且可扩展的空间FDR控制方法,用于体素级多重检验。它将基于局部显著性指数(LIS)的检验程序与一种新的全连接隐马尔可夫随机场(fcHMRF)相结合,后者以简约的参数化来建模复杂的空间结构。我们开发了一种高效的期望最大化算法,结合平均场近似、条件随机场作为循环神经网络(CRF-RNN)技术以及permutohedral lattice滤波,将时间复杂度从检验数的二次降为线性。大量模拟表明,与现有方法相比,fcHMRF-LIS实现了准确的FDR控制、更低的FNR、更小的FDP和FNP变异性以及更多的真阳性。应用于阿尔茨海默病神经影像计划的FDG-PET数据集时,fcHMRF-LIS识别出具有神经生物学意义的脑区,并在计算效率上具有显著优势。
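For context, the classical BH baseline named in the abstract can be written in a few lines. The sketch below implements the standard Benjamini-Hochberg step-up procedure, which ignores spatial structure; it is the baseline the paper compares against, not fcHMRF-LIS itself, and the function name `benjamini_hochberg` is chosen here for illustration.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean mask of rejected hypotheses at nominal FDR level alpha."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices of p-values, ascending
    thresholds = alpha * np.arange(1, m + 1) / m   # i * alpha / m for rank i
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up rule: find the largest rank whose p-value clears its threshold,
        # then reject every hypothesis ranked at or below it.
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject
```

The step-up rule means a p-value may be rejected even when it misses its own threshold, as long as some larger p-value clears a later one; each voxel is still treated independently of its neighbours, which is the limitation spatial methods aim to address.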

Article 108

Title@2025-05-29 (4): Learning to Poison Large Language Models for Downstream Manipulation

Title: Learning to Poison Large Language Models for Downstream Manipulation

Große Sprachmodelle für Downstream-Manipulation zu vergiften

学习下游操作毒物大语言模式 2402.13459v3

Authors: Xiangyu Zhou, Yao Qiang, Saleh Zare Zade, Mohammad Amin Roshani, Prashant Khanduri, Douglas Zytko, Dongxiao Zhu

The advent of Large Language Models (LLMs) has marked significant achievements in language processing and reasoning capabilities. Despite their advancements, LLMs face vulnerabilities to data poisoning attacks, where the adversary inserts backdoor triggers into training data to manipulate outputs. This work further identifies additional security risks in LLMs by designing a new data poisoning attack tailored to exploit the supervised fine-tuning (SFT) process. We propose a novel gradient-guided backdoor trigger learning (GBTL) algorithm to identify adversarial triggers efficiently, ensuring an evasion of detection by conventional defenses while maintaining content integrity. Through experimental validation across various language model tasks, including sentiment analysis, domain generation, and question answering, our poisoning strategy demonstrates a high success rate in compromising various LLMs’ outputs. We further propose two defense strategies against data poisoning attacks, including in-context learning (ICL) and continuous learning (CL), which effectively rectify the behavior of LLMs and significantly reduce the decline in performance. Our work highlights the significant security risks present during SFT of LLMs and the necessity of safeguarding LLMs against data poisoning attacks.

大语言模型(LLMS)的出现在语言处理和推理能力方面取得了显著成就。尽管取得了进步,LLMS面临数据中毒袭击的脆弱性,因为对手将后门触发器插入培训数据以操纵产出。这项工作进一步确定了LMS的额外安全风险,为此设计了新的数据中毒袭击,专门利用监管的微调(SFT)程序。我们提出了一个新的梯度引导后门触发学习算法,以有效识别对抗性触发器,确保常规防御在保持内容完整性的同时逃避发现。通过对各种语言模型任务(包括情绪分析、域生成和回答问题)的实验性验证,我们的中毒战略在损害LMS的各种产出方面表现出高成功率。我们进一步提出了两种防范数据中毒袭击的防御战略,包括文体内学习(ICL)和持续学习(CLF),以有效纠正LMs的行为并显著降低性能下降。我们的工作突出了SFTM公司在维持内容完整性方面所面临的重大安全风险,以及保护LMS公司免受数据中毒袭击的必要性。

Article 109

Title@2025-05-29 (4): SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training

Title: SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training

SGD als Freie Energie Minimierung: Ein thermodynamischer Blick auf neurales Netzwerktraining

SGD作为自由能源最小化:关于神经网络培训的热动力学观点 2505.23489v1

Authors: Ildus Sadrtdinov, Ivan Klimov, Ekaterina Lobacheva, Dmitry Vetrov

We present a thermodynamic interpretation of the stationary behavior of stochastic gradient descent (SGD) under fixed learning rates (LRs) in neural network training. We show that SGD implicitly minimizes a free energy function $F=U-TS$, balancing training loss $U$ and the entropy of the weights distribution $S$, with temperature $T$ determined by the LR. This perspective offers a new lens on why high LRs prevent training from converging to the loss minima and how different LRs lead to stabilization at different loss levels. We empirically validate the free energy framework on both underparameterized (UP) and overparameterized (OP) models. UP models consistently follow free energy minimization, with temperature increasing monotonically with LR, while for OP models, the temperature effectively drops to zero at low LRs, causing SGD to minimize the loss directly and converge to an optimum. We attribute this mismatch to differences in the signal-to-noise ratio of stochastic gradients near optima, supported by both a toy example and neural network experiments.

我们对神经网络培训中固定学习率(LRs)下的悬浮梯度下降(SGD)的固定行为进行热力学解释。我们表明,SGD隐含地最大限度地减少免费能源功能$F=U-TS$,平衡培训损失美元和重量分布的酶值S$的平衡,温度由LR确定。这个角度提供了一个新的透视点,说明高LD为何阻止培训与损失迷你相融合,以及不同LD如何在不同损失水平上稳定。我们从经验中验证了测量不足(UP)和超分度(OP)模型的免费能源框架。UPD模型始终遵循免费能源最小化,温度与LR单调,而对于OP模型来说,温度在低LR值时实际下降到零,使SGD直接将损失降至最小,并趋于最佳。我们将此错配对称,这归因于在选择系统附近的信号到噪音梯度梯度梯度梯度比率的差异,同时得到一个实例和神经网络实验的支持。
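
The free energy picture in this abstract can be illustrated on a toy discrete system. The following is a minimal sketch, not the authors' code: it assumes a finite set of weight configurations with hypothetical loss values, uses the Shannon entropy for $S$, and relies on the standard fact that the Gibbs distribution $p_i \propto e^{-U_i/T}$ minimizes $F=U-TS$. All function names are illustrative.

```python
import math

def free_energy(p, losses, T):
    """F = U - T*S for a discrete distribution p over states with given losses."""
    U = sum(pi * ui for pi, ui in zip(p, losses))
    S = -sum(pi * math.log(pi) for pi in p if pi > 0)
    return U - T * S

def gibbs(losses, T):
    """Distribution p_i proportional to exp(-U_i / T), the minimizer of F."""
    m = min(losses)  # subtract the minimum for numerical stability
    w = [math.exp(-(u - m) / T) for u in losses]
    z = sum(w)
    return [wi / z for wi in w]

def mean_loss(losses, T):
    """Expected loss U under the Gibbs distribution at temperature T."""
    return sum(pi * ui for pi, ui in zip(gibbs(losses, T), losses))

losses = [0.1, 0.5, 1.0, 2.0]  # hypothetical loss values, for illustration only
for T in (0.05, 0.5, 5.0):
    p = gibbs(losses, T)
    print(f"T={T}: mean loss={mean_loss(losses, T):.3f}, "
          f"F={free_energy(p, losses, T):.3f}")
```

In this toy setting a higher temperature spreads probability onto higher-loss states, which is the qualitative behavior the abstract associates with larger learning rates.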

Article 110

Title@2025-05-29 (4): Federated Granger Causality Learning for Interdependent Clients with State Space Representation

Title: Federated Granger Causality Learning for Interdependent Clients with State Space Representation

Föderiertes Granger-Causality-Lernen für interdependente Kunden mit staatlicher Raumdarstellung

为具有国家空间代表制的相互依存客户提供 2501.13890v4

Authors: Ayush Mohanty, Nazal Mohamed, Paritosh Ramanan, Nagi Gebraeel

Advanced sensors and IoT devices have improved the monitoring and control of complex industrial enterprises. They have also created an interdependent fabric of geographically distributed process operations (clients) across these enterprises. Granger causality is an effective approach to detect and quantify interdependencies by examining how one client’s state affects others over time. Understanding these interdependencies captures how localized events, such as faults and disruptions, can propagate throughout the system, possibly causing widespread operational impacts. However, the large volume and complexity of industrial data pose challenges in modeling these interdependencies. This paper develops a federated approach to learning Granger causality. We utilize a linear state space system framework that leverages low-dimensional state estimates to analyze interdependencies. This addresses bandwidth limitations and the computational burden commonly associated with centralized data processing. We propose augmenting the client models with the Granger causality information learned by the server through a Machine Learning (ML) function. We examine the co-dependence between the augmented client and server models and reformulate the framework as a standalone ML algorithm providing conditions for its sublinear and linear convergence rates. We also study the convergence of the framework to a centralized oracle model. Moreover, we include a differential privacy analysis to ensure data security while preserving causal insights. Using synthetic data, we conduct comprehensive experiments to demonstrate the robustness of our approach to perturbations in causality, the scalability to the size of communication, number of clients, and the dimensions of raw data. We also evaluate the performance on two real-world industrial control system datasets by reporting the volume of data saved by decentralization.

先进的传感器和IoT装置改善了对复杂工业企业的监测和控制,它们也形成了这些企业之间地理分布的流程操作(客户)的相互依存结构; 危险的因果关系是一种有效的方法,通过审查一个客户的状态如何长期影响他人来检测和量化相互依存关系; 理解这些相互依存关系可以捕捉整个系统如何传播本地事件,例如故障和干扰,从而可能造成广泛的业务影响; 然而,工业数据的数量和复杂性在模拟这些相互依存关系方面构成了挑战。本文开发了一种在地理上分布的流程操作(客户)的相互依存结构; 我们利用一个直线式的产业空间系统框架,利用低维度状态估计数分析相互依存关系。这解决了带宽限制和计算负担,通常与集中数据处理相联系。我们建议利用服务器通过机器学习(ML)功能学习(ML)所学所学的 “ 重大因果关系 “ 信息来增强客户模式和服务器模型之间的相互依存关系,并将框架重新配置为独立的ML算法,为其次线性和线性因果关系提供了条件。我们还利用一个在线性状态估算数据整合率进行在线性评估。我们还通过使用一个数据分析系统来进行数据整合数据整合数据分析,我们用一个数据分析, 将数据质量分析。

Article 111

Title@2025-05-29 (4): TimePoint: Accelerated Time Series Alignment via Self-Supervised Keypoint and Descriptor Learning

Title: TimePoint: Accelerated Time Series Alignment via Self-Supervised Keypoint and Descriptor Learning

TimePoint: Beschleunigte Zeitreihenausrichtung über selbstüberwachtes Keypoint- und Descriptor-Lernen

时间点:通过自上调关键点和描述学习加速时间序列调整 2505.23475v1

Authors: Ron Shapira Weber, Shahar Ben Ishay, Andrey Lavrinenko, Shahaf E. Finder, Oren Freifeld

Fast and scalable alignment of time series is a fundamental challenge in many domains. The standard solution, Dynamic Time Warping (DTW), struggles with poor scalability and sensitivity to noise. We introduce TimePoint, a self-supervised method that dramatically accelerates DTW-based alignment while typically improving alignment accuracy by learning keypoints and descriptors from synthetic data. Inspired by 2D keypoint detection but carefully adapted to the unique challenges of 1D signals, TimePoint leverages efficient 1D diffeomorphisms, which effectively model nonlinear time warping, to generate realistic training data. This approach, along with fully convolutional and wavelet convolutional architectures, enables the extraction of informative keypoints and descriptors. Applying DTW to these sparse representations yields major speedups and typically higher alignment accuracy than standard DTW applied to the full signals. TimePoint demonstrates strong generalization to real-world time series when trained solely on synthetic data, and further improves with fine-tuning on real data. Extensive experiments demonstrate that TimePoint consistently achieves faster and more accurate alignments than standard DTW, making it a scalable solution for time-series analysis. Our code is available at https://github.com/BGU-CS-VIL/TimePoint

在许多领域,时间序列的快速和可伸缩一致是一个根本性的挑战。标准解决方案,即动态时间扭曲(DTW),与不易缩放和对噪音的敏感度进行斗争。我们引入了时间定位(TimePoint),这是一个自我监督的方法,它大大加快了DTW的调整速度,同时通过学习合成数据的关键点和描述器来提高调整准确性。受2D关键点探测的启发,但经过仔细调整,以适应1D信号的独特挑战。TimePoint利用高效的1D二变形(它们有效地模拟非线性时间扭曲)来生成现实的培训数据。这个方法,加上全面进化和波盘旋的组合结构,使得能够提取信息化关键点和描述器。将DTW应用于这些稀疏少的表示方式可以产生重大加速,通常比对完整信号应用的标准DTW更精确。时间序列显示在仅接受合成数据培训时对现实世界时间序列进行强烈的概括化,并且通过对真实数据进行微调进一步改进。广泛的实验表明,Timerpoint 能够持续实现比标准的更快和更加精确的校正的校准。
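
To see why running DTW on a sparse representation is cheaper, here is a minimal sketch of classic DTW in plain Python. It is not the TimePoint implementation: the `subsample` helper is a hypothetical stand-in for learned keypoint selection and simply keeps every `step`-th sample, which is enough to show how the number of dynamic-programming cells shrinks.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1D sequences, O(len(a) * len(b)) time."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def subsample(x, step):
    """Hypothetical stand-in for keypoint selection: keep every step-th sample."""
    return x[::step]

full_a = [float(i % 20) for i in range(200)]
full_b = [float((i + 3) % 20) for i in range(200)]
sparse_a, sparse_b = subsample(full_a, 10), subsample(full_b, 10)
print("full DP cells:  ", len(full_a) * len(full_b))
print("sparse DP cells:", len(sparse_a) * len(sparse_b))
```

Because the cost is proportional to the product of the two lengths, keeping one point in ten reduces the table size by a factor of one hundred in this example. Whether accuracy is preserved depends on how the points are chosen, which is the part TimePoint learns.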

Article 112

Title@2025-05-29 (4): Refining Labeling Functions with Limited Labeled Data

Title: Refining Labeling Functions with Limited Labeled Data

Verfeinerung von Beschriftungsfunktionen mit begrenzten beschrifteten Daten

用有限标签数据改进标签功能 2505.23470v1

Authors: Chenjie Li, Amir Gilad, Boris Glavic, Zhengjie Miao, Sudeepa Roy

Programmatic weak supervision (PWS) significantly reduces human effort for labeling data by combining the outputs of user-provided labeling functions (LFs) on unlabeled datapoints. However, the quality of the generated labels depends directly on the accuracy of the LFs. In this work, we study the problem of fixing LFs based on a small set of labeled examples. Towards this goal, we develop novel techniques for repairing a set of LFs by minimally changing their results on the labeled examples such that the fixed LFs ensure that (i) there is sufficient evidence for the correct label of each labeled datapoint and (ii) the accuracy of each repaired LF is sufficiently high. We model LFs as conditional rules which enables us to refine them, i.e., to selectively change their output for some inputs. We demonstrate experimentally that our system improves the quality of LFs based on surprisingly small sets of labeled datapoints.

方案薄弱监督(PWS)通过将用户提供的标签功能(LF)的产出与未贴标签的数据点结合起来,大大减少了人类在标签数据上的努力。然而,产生的标签的质量直接取决于LF的准确性。在这项工作中,我们根据一小组贴标签的例子来研究确定LF的问题。为实现这一目标,我们开发了修复一组LF的新技术,在标签例子上尽可能地改变其结果,以确保(一) 有足够的证据证明每个标签数据点的正确标签;(二) 每一所修理的LF的准确性足够高。我们将LF作为有条件规则,使我们能够改进这些规则,即有选择地改变其产出,用于某些投入。我们实验性地表明,我们的系统根据惊人的小组标签数据点改进了LF的质量。
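
The idea of modeling labeling functions as conditional rules that can be selectively overridden can be sketched as follows. This is an illustrative toy, not the authors' system: the two LFs, the `refine` helper, and the hand-written repair condition are all assumptions, whereas the paper derives repairs automatically from labeled examples.

```python
ABSTAIN = None

def lf_contains_free(text):
    """Toy LF: votes 'spam' when the word 'free' appears."""
    return "spam" if "free" in text.lower() else ABSTAIN

def lf_short(text):
    """Toy LF: votes 'ham' for very short messages."""
    return "ham" if len(text.split()) < 4 else ABSTAIN

def refine(lf, condition, new_label):
    """Return a repaired LF: identical to lf except where condition(text) holds."""
    def repaired(text):
        return new_label if condition(text) else lf(text)
    return repaired

def majority_vote(lfs, text):
    """Combine LF outputs; ties are broken alphabetically for determinism."""
    votes = [v for v in (lf(text) for lf in lfs) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(sorted(set(votes)), key=votes.count)

# One labeled example exposes an error in lf_contains_free.
example, gold = "feel free to call me", "ham"
print("before:", majority_vote([lf_contains_free, lf_short], example))

# Hand-written repair: change the output only on the offending inputs.
fixed = refine(lf_contains_free, lambda t: "feel free" in t.lower(), "ham")
print("after: ", majority_vote([fixed, lf_short], example))
```

The repaired function keeps its original behavior everywhere else, which matches the abstract's goal of changing LF results minimally.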

Article 113

Title@2025-05-29 (4): Surveying the space of descriptions of a composite system with machine learning

Title: Surveying the space of descriptions of a composite system with machine learning

Vermessung des Raumes der Beschreibungen eines Verbundsystems mit maschinellem Lernen

勘查机器学习综合系统说明的空间 2411.18579v2

Authors: Kieran A. Murphy, Yujing Zhang, Dani S. Bassett

Multivariate information theory provides a general and principled framework for understanding how the components of a complex system are connected. Existing analyses are coarse in nature – built up from characterizations of discrete subsystems – and can be computationally prohibitive. In this work, we propose to study the continuous space of possible descriptions of a composite system as a window into its organizational structure. A description consists of specific information conveyed about each of the components, and the space of possible descriptions is equivalent to the space of lossy compression schemes of the components. We introduce a machine learning framework to optimize descriptions that extremize key information-theoretic quantities used to characterize organization, such as total correlation and O-information. Through case studies on spin systems, sudoku boards, and letter sequences from natural language, we identify extremal descriptions that reveal how system-wide variation emerges from individual components. By integrating machine learning into a fine-grained information-theoretic analysis of composite random variables, our framework opens new avenues for probing the structure of real-world complex systems.

多变量信息理论为理解复杂系统的各个组成部分是如何连接起来提供了一个一般性和原则性的框架。现有的分析性质粗糙,是由离散子子系统的特征所建立,而且可能无法进行计算。在这项工作中,我们提议研究合成系统作为窗口进入其组织结构的可能描述的连续空间。描述包含就每个组成部分传递的具体信息,而可能的描述空间相当于各组成部分损失压缩计划的空间。我们引入了一个机器学习框架,优化描述,将用于描述组织特征的关键信息理论量(如总相关性和O-信息)加以扩展。通过对旋转系统、苏杜库板和自然语言字母序列的案例研究,我们确定了显示各个组成部分如何出现全系统差异的极端描述。通过将机器学习合成随机变量的精细信息理论分析,我们的框架为验证现实世界复杂系统的结构开辟了新的途径。

Article 114

Title@2025-05-29 (4): Retrieval Visual Contrastive Decoding to Mitigate Object Hallucinations in Large Vision-Language Models

Title: Retrieval Visual Contrastive Decoding to Mitigate Object Hallucinations in Large Vision-Language Models

Retrieval Visuelle Kontrastive Dekodierung zu Mitigate-Objekt-Halluzinationen in großen Vision-Sprachen-Modellen

在大型视觉-语言模型中,将检索视觉对抗性脱钩作为稀释物体幻觉的大型视觉-语言模型 2505.20569v2

Authors: Jihoon Lee, Min Song

Despite significant advancements in Large Vision-Language Models, Object Hallucination (OH) remains a persistent challenge. Building upon prior studies on contrastive decoding that address this issue without requiring additional model training, we introduce RVCD (Retrieval Visual Contrastive Decoding), an advanced method to suppress OH. RVCD leverages both negative and positive images at the logit level, explicitly referencing AI-generated images designed to represent a single concept. Our approach demonstrates substantial improvements over existing decoding-based methods.

尽管大型视觉语言模型(OH)取得了显著进步,但目标幻化(OH)仍然是一个长期挑战。我们以前曾研究过反比解码方法,在不需要额外示范培训的情况下解决这一问题。我们引入了RVCD(RVCD),这是抑制OH的先进方法。RVCD在登录层面利用了负面和正面图像,明确引用了AI生成的图像,目的是代表一个单一的概念。我们的方法表明,与现有的解码方法相比,有了很大的改进。
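
As a rough illustration of contrastive decoding at the logit level, the sketch below adjusts next-token logits using logits conditioned on a negative and a positive reference image. The combination rule and the weights `alpha` and `beta` are assumptions made for illustration, not the formula used by RVCD, and the three-token vocabulary and logit values are invented.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    e = [math.exp(x - m) for x in logits]
    s = sum(e)
    return [x / s for x in e]

def contrast_logits(base, negative, positive, alpha=1.0, beta=0.5):
    """Illustrative rule (an assumption, not the paper's formula):
    move away from the negative-image logits, toward the positive-image ones."""
    return [b + alpha * (b - n) + beta * (p - b)
            for b, n, p in zip(base, negative, positive)]

vocab = ["dog", "cat", "table"]   # invented toy vocabulary
base = [2.0, 1.9, 0.5]            # logits given the real input image
negative = [0.5, 2.5, 0.3]        # logits given a reference image of the hallucinated concept
positive = [2.5, 0.5, 0.4]        # logits given a reference image of the present concept

adjusted = contrast_logits(base, negative, positive)
for tok, p0, p1 in zip(vocab, softmax(base), softmax(adjusted)):
    print(f"{tok}: {p0:.3f} -> {p1:.3f}")
```

In this toy case the probability of the hallucinated token drops after adjustment while no model weights are changed, which is the sense in which such methods are training-free.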

Article 115

Title@2025-05-29 (4): A Tutorial on Meta-Reinforcement Learning

Title: A Tutorial on Meta-Reinforcement Learning

Ein Tutorial zum Meta-Reinforcement-Lernen

关于元加强学习的教学材料 2301.08028v4

Authors: Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson

While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising approach for alleviating these limitations is to cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL. Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible. In this survey, we describe the meta-RL problem setting in detail as well as its major variations. We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task. Using these clusters, we then survey meta-RL algorithms and applications. We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.

虽然深层强化学习(RL)在机器学习方面催生了众多引人注目的成功,但是由于广泛采用这种学习方法,其数据效率往往很差,而且其所产生政策的普遍性有限,因此受到阻碍。一个有希望的减轻这些限制的方法是,将更好的RL算法发展成一个机器学习问题本身,在称为Met-RL的进程中,这是一种机器学习问题本身。Meta-RL最常在一种问题环境下进行研究,因为考虑到任务的分配,我们的目标是学习一种能够适应任务分配所产生的任何新任务的政策,尽可能少有数据。我们在这次调查中详细描述了元-RL问题设置及其主要变异。我们讨论如何在高层次上根据任务分配和每项任务可用的学习预算进行元-RL研究。我们利用这些组,然后调查元-RL的算法和应用。我们最后通过介绍在为深层RL执业者提供标准工具箱的元-RL部分的道路上的公开问题。

Article 116

Title@2025-05-29 (4): Agentic Knowledgeable Self-awareness

Title: Agentic Knowledgeable Self-awareness

Agentisch sachkundiges Selbstbewußtsein

A. 动态知识自觉意识 2504.03553v2

Authors: Shuofei Qiao, Zhisong Qiu, Baochang Ren, Xiaobin Wang, Xiangyuan Ru, Ningyu Zhang, Xiang Chen, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

Large Language Models (LLMs) have achieved considerable performance across various agentic planning tasks. However, traditional agent planning approaches adopt a “flood irrigation” methodology that indiscriminately injects gold trajectories, external feedback, and domain knowledge into agent models. This practice overlooks the fundamental human cognitive principle of situational self-awareness during decision-making, that is, the ability to dynamically assess situational demands and strategically employ resources during decision-making. We propose agentic knowledgeable self-awareness to address this gap, a novel paradigm enabling LLM-based agents to autonomously regulate knowledge utilization. Specifically, we propose KnowSelf, a data-centric approach that applies agents with knowledgeable self-awareness like humans. Concretely, we devise a heuristic situation judgement criterion to mark special tokens on the agent’s self-explored trajectories for collecting training data. Through a two-stage training process, the agent model can switch between different situations by generating specific special tokens, achieving optimal planning effects with minimal costs. Our experiments demonstrate that KnowSelf can outperform various strong baselines on different tasks and models with minimal use of external knowledge. Code is available at https://github.com/zjunlp/KnowSelf.

大型语言模型(LLMS)在各种代理规划任务中取得了相当大的成绩,然而,传统代理规划方法采用了一种“洪水灌溉”方法,不加区别地将金轨、外部反馈和领域知识注入代理模型中,这种做法忽略了在决策过程中对情况自我认识的基本人类认知原则,即动态地评估形势需求和在决策中战略性地利用资源的能力。我们提出了一种具有代理知识的自我意识来解决这一差距的新模式,使以LLM为基础的代理能够自主地规范知识的利用。具体地说,我们提出了一种以数据为中心的方法,将具有了解情况的自我认识的代理人应用到像人类那样有知识的自我意识的代理人。具体地说,我们设计了一种超常状况判断标准,以标志该代理人收集培训数据的自我探索轨迹的特殊标志。通过两阶段的培训过程,该代理模型可以在不同情况之间转换,产生特定的特殊标志,以最低的成本实现最佳的规划效果。我们的实验表明,“了解自我”可以超越不同任务和模型上的各种强的基线,而很少使用外部知识。《准则》可在 https://gimb/commus.

Article 117

Title@2025-05-29 (4): Pessimism Principle Can Be Effective: Towards a Framework for Zero-Shot Transfer Reinforcement Learning

Title: Pessimism Principle Can Be Effective: Towards a Framework for Zero-Shot Transfer Reinforcement Learning

Pessimismus-Prinzip kann wirksam sein: Auf dem Weg zu einem Rahmen für Null-Shot-Transfer-Verstärkungs-Lernen

悲观主义原则可以有效:建立一个零热转移强化学习框架 2505.18447v2

Authors: Chi Zhang, Ziying Jia, George K. Atia, Sihong He, Yue Wang

Transfer reinforcement learning aims to derive a near-optimal policy for a target environment with limited data by leveraging abundant data from related source domains. However, it faces two key challenges: the lack of performance guarantees for the transferred policy, which can lead to undesired actions, and the risk of negative transfer when multiple source domains are involved. We propose a novel framework based on the pessimism principle, which constructs and optimizes a conservative estimation of the target domain’s performance. Our framework effectively addresses the two challenges by providing an optimized lower bound on target performance, ensuring safe and reliable decisions, and by exhibiting monotonic improvement with respect to the quality of the source domains, thereby avoiding negative transfer. We construct two types of conservative estimations, rigorously characterize their effectiveness, and develop efficient distributed algorithms with convergence guarantees. Our framework provides a theoretically sound and practically robust solution for transfer learning in reinforcement learning.

加强转让学习的目的是,通过利用相关来源领域的大量数据,为数据有限的目标环境制定接近最佳的政策,但面临两个主要挑战:对转移的政策缺乏业绩保障,这可能导致不理想的行动,以及在涉及多个来源领域时出现负转移的风险。我们提议基于悲观原则的新框架,该新框架构建并优化对目标领域绩效的保守估计。我们的框架有效地应对了两个挑战,对目标业绩提供了最优化的较低约束,确保了安全可靠的决定,并展示了源领域质量方面的单一改进,从而避免了负面转移。我们构建了两种保守的估算,严格地描述其有效性,并制定了具有趋同保证的高效分布算法。我们的框架为在强化学习中转让学习提供了一种理论上健全和切实可靠的解决方案。

Article 118

Title@2025-05-29 (4): LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection

Title: LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection

LENSLLM: Enthüllen von Feintuning-Dynamik für die LLM-Auswahl

LENSLLLM: 用于选择LLM的连续精细调整动态 2505.03793v2

Authors: Xinyue Zeng, Haohui Wang, Junhong Lin, Jun Wu, Tyler Cody, Dawei Zhou

The proliferation of open-sourced Large Language Models (LLMs) and diverse downstream tasks necessitates efficient model selection, given the impracticality of fine-tuning all candidates due to computational constraints. Despite the recent advances in LLM selection, a fundamental research question largely remains nascent: how can we model the dynamic behaviors of LLMs during fine-tuning, thereby enhancing our understanding of their generalization performance across diverse downstream tasks? In this work, we propose a novel theoretical framework that provides a proper lens to assess the generalization capabilities of LLMs, thereby enabling accurate and efficient LLM selection for downstream applications. In particular, we first derive a PAC-Bayesian Generalization Bound that unveils fine-tuning dynamics of LLMs and then introduce LENSLLM, a Neural Tangent Kernel (NTK)-based Rectified Scaling Model that enables accurate performance predictions across diverse tasks while maintaining computational efficiency. Extensive empirical results on 3 large-scale benchmarks demonstrate that our model achieves up to 91.1% accuracy and reduces up to 88.5% computational cost in LLM selection, outperforming 5 state-of-the-art methods. We open-source our proposed LENSLLM model and corresponding results at LensLLM.io.

开放源码大语言模型(LLMS)和多种下游任务的扩散要求高效的模型选择,因为由于计算限制,对所有候选人进行微调不切实际。尽管LLM的选择最近有所进展,但一个基本研究问题基本上仍然新生:我们如何在微调中模拟LLM的动态行为,从而增进我们对不同下游任务一般表现的理解?在这项工作中,我们提出了一个新的理论框架,为评估LLMS的普及能力提供一个适当的透镜,从而能够准确和高效地选择下游应用的LLM。特别是,我们首先得出一个PAC-Bayesian通用圈,它揭示LLMS的微调动态,然后采用以NENSLLM(NTK)为基础的重新定位模型,它能够准确预测不同任务的业绩,同时保持计算效率。关于3个大规模基准的广泛经验结果显示,我们的模型在LM选择中达到91.1%的准确度,并将计算成本降低到88.5%。我们提议的LENSLLLLLLLA和相应结果。

Article 119

Title@2025-05-29 (4): Broadband Ground Motion Synthesis by Diffusion Model with Minimal Condition

Title: Broadband Ground Motion Synthesis by Diffusion Model with Minimal Condition

Broadband Ground Motion Synthese durch Diffusion Modell mit minimalem Zustand

以最小条件传播模型进行宽带地面移动合成 2412.17333v2

Authors: Jaeheun Jung, Jaehyuk Lee, Changhae Jung, Hanyoung Kim, Bosung Jung, Donghun Lee

Shock waves caused by earthquakes can be devastating. Generating realistic earthquake-caused ground motion waveforms helps reduce losses of life and property, yet generative models for the task tend to generate subpar waveforms. We present the High-fidelity Earthquake Groundmotion Generation System (HEGGS) and demonstrate its superior performance using earthquakes from North American, East Asian, and European regions. HEGGS exploits the intrinsic characteristics of earthquake datasets and learns the waveforms using an end-to-end differentiable generator containing a conditional latent diffusion model and a high-fidelity waveform construction model. We show the learning efficiency of HEGGS by training it on a single GPU machine and validate its performance using earthquake databases from North America, East Asia, and Europe, using diverse criteria from waveform generation tasks and seismology. Once trained, HEGGS can generate three-dimensional E-N-Z seismic waveforms with accurate P/S phase arrivals, envelope correlation, signal-to-noise ratio, GMPE analysis, frequency content analysis, and section plot analysis.

地震引发的冲击波可能是毁灭性的。产生现实的地震引发的地面运动波形有助于减少生命和财产损失,但这项任务的基因模型往往产生亚波形。我们展示了高虚震地震地面变化系统(HEGGS ) , 并用来自北美、东亚和欧洲区域的地震来展示其优异性。高震地震数据组利用地震数据集的内在特征,并利用含有有条件潜伏扩散模型和高频波形构建模型的端到端不同发电机来学习波形。我们用来自北美、东亚和欧洲的地震数据库对高频地震震震动产生系统进行培训,并使用波形生成任务和地震学的不同标准来验证其性能,我们展示了高频GGS的学习效率。高频系统一旦经过培训,可以生成三维E-N-Z地震波状,并配有准确的P/S阶段到达、信封连接、信号到噪音比率、GPEPE分析、频率内容分析和部分绘图分析。

Article 120

Title@2025-05-29 (4): On Global Convergence Rates for Federated Policy Gradient under Heterogeneous Environment

Title: On Global Convergence Rates for Federated Policy Gradient under Heterogeneous Environment

Globale Konvergenzraten für Föderierten politischen Gradienten unter heterogener Umwelt

关于不同不同环境下联邦政策分级制全球趋同率的全球趋同率 2505.23459v1

Authors: Safwan Labbi, Paul Mangold, Daniil Tiapkin, Eric Moulines

Ensuring convergence of policy gradient methods in federated reinforcement learning (FRL) under environment heterogeneity remains a major challenge. In this work, we first establish that heterogeneity, perhaps counter-intuitively, can necessitate optimal policies to be non-deterministic or even time-varying, even in tabular environments. Subsequently, we prove global convergence results for federated policy gradient (FedPG) algorithms employing local updates, under a Łojasiewicz condition that holds only for each individual agent, in both entropy-regularized and non-regularized scenarios. Crucially, our theoretical analysis shows that FedPG attains linear speed-up with respect to the number of agents, a property central to efficient federated learning. Leveraging insights from our theoretical findings, we introduce b-RS-FedPG, a novel policy gradient method that employs a carefully constructed softmax-inspired parameterization coupled with an appropriate regularization scheme. We further demonstrate explicit convergence rates for b-RS-FedPG toward near-optimal stationary policies. Finally, we demonstrate that empirically both FedPG and b-RS-FedPG consistently outperform federated Q-learning on heterogeneous settings.

在这种工作中,我们首先确定,即使在表格环境中,差异性或许是反直觉的,可以要求最佳政策非决定性甚至时间变化性的最佳政策,即使在表格环境中也是如此。随后,我们证明,在使用本地更新的federated 政策梯度(FedPG)算法中,根据一种只对每个个体代理商(在正对式和非正式假设中)保持的 ojjasiewicz 条件,采用当地更新的fredadd point(FedPG) ,全球趋同结果。我们进一步表明,B-RS-FedPG在代理商数量方面实现了线性加速,这是高效联进化学习的中央财产。最后,我们从我们的理论结论中汲取了深刻的见解,我们引入了 b-RS-FedPG,这是一种新的政策梯度方法,采用经过精心构建的软式激励参数化参数化,加上适当的正规化计划。我们进一步证明,B-RS-FDPGG与近优性固定政策之间的明确趋同率率。最后,我们证明,我们不断学习FGPGF和FM-FMFMFMFFFFFFF的不断进化模式。

Article 121

Title@2025-05-29 (4): Diffusion Guidance Is a Controllable Policy Improvement Operator

Title: Diffusion Guidance Is a Controllable Policy Improvement Operator

Diffusion Guidance ist ein kontrollierbarer Politikverbesserungs-Betreiber

传播指导是可控制的政策改进操作员 2505.23458v1

Authors: Kevin Frans, Seohong Park, Pieter Abbeel, Sergey Levine

At the core of reinforcement learning is the idea of learning beyond the performance in the data. However, scaling such systems has proven notoriously tricky. In contrast, techniques from generative modeling have proven remarkably scalable and are simple to train. In this work, we combine these strengths, by deriving a direct relation between policy improvement and guidance of diffusion models. The resulting framework, CFGRL, is trained with the simplicity of supervised learning, yet can further improve on the policies in the data. On offline RL tasks, we observe a reliable trend – increased guidance weighting leads to increased performance. Of particular importance, CFGRL can operate without explicitly learning a value function, allowing us to generalize simple supervised methods (e.g., goal-conditioned behavioral cloning) to further prioritize optimality, gaining performance for “free” across the board.

强化学习的核心是超越数据性能的学习理念。然而,推广这种系统证明是极其棘手的。相反,基因模型的技巧被证明非常可扩展,而且很容易培训。在这项工作中,我们将这些优势结合起来,在政策改进与传播模式指导之间建立直接的关系。由此形成的框架CFGRL经过监督学习的简单培训,但可以进一步改进数据中的政策。在离线的RL任务中,我们观察到一种可靠的趋势 – – 指导权重的增加导致性能的提高。特别重要的是,CFGRL可以在不明确学习价值功能的情况下运作,从而使我们能够将简单的监督方法(如有目标的克隆行为)推广到更优先的层次上,从而获得“免费”的性能。
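
For readers unfamiliar with guidance in diffusion models, the standard classifier-free guidance rule combines an unconditional and a conditional prediction with a scalar weight. The sketch below shows only that generic rule on plain lists. How CFGRL defines the conditioning signal and ties the weight to policy improvement is described in the paper and is not reproduced here. The vectors are invented.

```python
def guided_prediction(uncond, cond, w):
    """Standard classifier-free guidance: uncond + w * (cond - uncond).
    w = 0 returns the unconditional prediction, w = 1 the conditional one,
    and w > 1 extrapolates further along the same direction."""
    return [u + w * (c - u) for u, c in zip(uncond, cond)]

uncond = [0.0, 1.0]   # invented unconditional prediction
cond = [1.0, 3.0]     # invented conditional prediction

for w in (0.0, 1.0, 2.0):
    print(w, guided_prediction(uncond, cond, w))
```

The single scalar `w` is what makes the operator controllable: it can be changed at sampling time without retraining.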

Article 122

Title@2025-05-29 (4): TabReason: A Reinforcement Learning-Enhanced Reasoning LLM for Explainable Tabular Data Prediction

Title: TabReason: A Reinforcement Learning-Enhanced Reasoning LLM for Explainable Tabular Data Prediction

TabReason: Eine verstärkte Lern-verbesserte Begründung LLM für erklärbare tabellarische Datenvorhersage

TabReson: 用于可解释的图表数据预测的强化学习-提高合理理由的强化学习-强化LLMLM 2505.21807v2

Authors: Tommy Xu, Zhitian Zhang, Xiangyu Sun, Lauren Kelly Zung, Hossein Hajimirsadeghi, Greg Mori

Predictive modeling on tabular data is the cornerstone of many real-world applications. Although gradient boosting machines and some recent deep models achieve strong performance on tabular data, they often lack interpretability. On the other hand, large language models (LLMs) have demonstrated powerful capabilities to generate human-like reasoning and explanations, but still underperform on tabular data prediction. In this paper, we propose a new approach that leverages reasoning-based LLMs, trained using reinforcement learning, to perform more accurate and explainable predictions on tabular data. Our method introduces custom reward functions that guide the model not only toward high prediction accuracy but also toward human-understandable reasons for its predictions. Experimental results show that our model achieves promising performance on financial benchmark datasets, outperforming most existing LLMs.

在表格数据上建立预测模型是许多现实世界应用的基石。虽然梯度推动机和最近一些深层模型在表格数据上表现良好,但它们往往缺乏可解释性。另一方面,大型语言模型(LLMs)已经展示出产生人性推理和解释的强大能力,但在表格数据预测方面表现仍然不足。在本文中,我们提出了一种新的方法,利用经过强化学习培训的基于推理的LMs,对列表数据进行更准确和可解释的预测。我们的方法引入了定制奖励功能,不仅引导模型实现高预测准确性,而且引导模型预测的人性难以理解的原因。实验结果显示,我们的模型在财务基准数据集上取得了有希望的业绩,优于大多数现有的LLMs。

Article 123

Title@2025-05-29 (4): Learning Cascade Ranking as One Network

Title: Learning Cascade Ranking as One Network

Kaskaden-Ranking als ein Netzwerk lernen

学习连级安排 “ 一个网络 “ 网络 2503.09492v2

Authors: Yunli Wang, Zhen Zhang, Zhiqiang Wang, Zixuan Yang, Yu Li, Jian Yang, Shiyang Wen, Peng Jiang, Kun Gai

Cascade Ranking is a prevalent architecture in large-scale top-k selection systems like recommendation and advertising platforms. Traditional training methods focus on single-stage optimization, neglecting interactions between stages. Recent advances have introduced interaction-aware training paradigms, but still struggle to 1) align training objectives with the goal of the entire cascade ranking (i.e., end-to-end recall of ground-truth items) and 2) learn effective collaboration patterns for different stages. To address these challenges, we propose LCRON, which introduces a novel surrogate loss function derived from the lower bound probability that ground truth items are selected by cascade ranking, ensuring alignment with the overall objective of the system. According to the properties of the derived bound, we further design an auxiliary loss for each stage to drive the reduction of this bound, leading to a more robust and effective top-k selection. LCRON enables end-to-end training of the entire cascade ranking system as a unified network. Experimental results demonstrate that LCRON achieves significant improvement over existing methods on public benchmarks and industrial applications, addressing key limitations in cascade ranking training and significantly enhancing system performance.

为了应对这些挑战,我们提议LCRON采用新的替代损失功能,即地面真相项目按级联排名的低约束概率选择,确保符合系统的总体目标。根据衍生约束性培训的特性,我们进一步设计每个阶段的附带损失,推动减少这一约束性,导致更有力和更有效的顶级选择。 LCRON能够对整个级联排名系统进行端到端培训,作为一个统一的网络。实验结果显示,LCRON大大改进了公共基准和工业应用的现有方法,解决了级联排名培训的关键限制,并大大加强了系统绩效。
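
The end-to-end objective mentioned in the abstract, recall of ground-truth items after the whole cascade, can be made concrete with a small sketch. It is not LCRON itself: it only evaluates a two-stage top-k cascade on invented scores, to show why optimizing each stage in isolation can miss the overall goal. All names and numbers are illustrative.

```python
def top_k(items, scores, k):
    """Keep the k highest-scoring items."""
    return sorted(items, key=lambda i: scores[i], reverse=True)[:k]

def cascade(items, stage_scores, ks):
    """Apply successive top-k filters; each stage only sees the survivors."""
    survivors = list(items)
    for scores, k in zip(stage_scores, ks):
        survivors = top_k(survivors, scores, k)
    return survivors

def recall(selected, ground_truth):
    """Fraction of ground-truth items that survive the whole cascade."""
    return len(set(selected) & set(ground_truth)) / len(ground_truth)

items = range(6)
retrieval = {0: 0.9, 1: 0.8, 2: 0.7, 3: 0.2, 4: 0.6, 5: 0.1}   # stage 1 scores
ranking = {0: 0.5, 1: 0.1, 2: 0.2, 3: 0.99, 4: 0.3, 5: 0.0}    # stage 2 scores
ground_truth = {0, 3}

final = cascade(items, [retrieval, ranking], ks=[4, 2])
print(final, recall(final, ground_truth))
```

Item 3 is scored highest by the second stage but never reaches it, so end-to-end recall is capped by the first stage. This interaction between stages is what a jointly trained cascade aims to account for.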

Article 124

Title@2025-05-29 (4): DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation

Title: DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation

DynaMem: Online-Dynamischer Raum-Semantischer Speicher für mobile Manipulationen in der offenen Welt

DynaMem: 用于开放世界移动操纵的在线动态空间-空间内存 2411.04999v2

Authors: Peiqi Liu, Zhanqiu Guo, Mohit Warke, Soumith Chintala, Chris Paxton, Nur Muhammad Mahi Shafiullah, Lerrel Pinto

Significant progress has been made in open-vocabulary mobile manipulation, where the goal is for a robot to perform tasks in any environment given a natural language description. However, most current systems assume a static environment, which limits the system’s applicability in real-world scenarios where environments frequently change due to human intervention or the robot’s own actions. In this work, we present DynaMem, a new approach to open-world mobile manipulation that uses a dynamic spatio-semantic memory to represent a robot’s environment. DynaMem constructs a 3D data structure to maintain a dynamic memory of point clouds, and answers open-vocabulary object localization queries using multimodal LLMs or open-vocabulary features generated by state-of-the-art vision-language models. Powered by DynaMem, our robots can explore novel environments, search for objects not found in memory, and continuously update the memory as objects move, appear, or disappear in the scene. We run extensive experiments on the Stretch SE3 robots in three real and nine offline scenes, and achieve an average pick-and-drop success rate of 70% on non-stationary objects, which is more than a 2x improvement over state-of-the-art static systems. Our code as well as our experiment and deployment videos are open sourced and can be found on our project website: https://dynamem.github.io/

在开放词汇移动操作方面已经取得了显著进展, 目标是让机器人在自然语言描述下的任何环境中执行任务。然而, 大多数当前系统都假设一个静态环境, 从而限制系统在现实世界环境中的可应用性, 因为在现实世界环境中, 环境经常因人类干预或机器人自己的行动而变化。在此工作中, 我们展示DynaMem, 这是开放世界移动操作的新方法, 使用动态的spatio- 语义内存来代表机器人的环境。 Dynamem 在三个真实和九个离线场的屏幕上, 构建了一个三维数据结构, 以维持点云的动态记忆, 并用modoral LLMS 或由状态视觉语言模型生成的开放词汇特性解答开放词汇对象本地化查询。 Dynam, 我们的机器人可以探索新的环境, 搜索在记忆中找不到的物体, 并随着物体移动、出现或消失在现场, 不断更新记忆。我们在三个真实和九个离线场的屏幕上进行广泛的实验 Stech SE3机器人, 并实现一个平均选取- lib- 成功率在我们的系统上, 70% 。在非静位系统上找到的系统上, 我们的系统可以找到一个正常的系统, 。

Article 125

Title@2025-05-29 (4): Network Inversion for Uncertainty-Aware Out-of-Distribution Detection

Title: Network Inversion for Uncertainty-Aware Out-of-Distribution Detection

Netzwerk-Inversion für unsichere Out-of-Distribution-Erkennung

用于不确定性软件发送外检测的网络转换 2505.23448v1

Authors: Pirzada Suhail, Rehna Afroz, Amit Sethi

Out-of-distribution (OOD) detection and uncertainty estimation (UE) are critical components for building safe machine learning systems, especially in real-world scenarios where unexpected inputs are inevitable. In this work, we propose a novel framework that combines network inversion with classifier training to simultaneously address both OOD detection and uncertainty estimation. For a standard n-class classification task, we extend the classifier to an (n+1)-class model by introducing a “garbage” class, initially populated with random Gaussian noise to represent outlier inputs. After each training epoch, we use network inversion to reconstruct input images corresponding to all output classes; these initially appear noisy and incoherent and are therefore relegated to the garbage class for retraining the classifier. This cycle of training, inversion, and exclusion continues iteratively until the inverted samples begin to resemble the in-distribution data more closely, suggesting that the classifier has learned to carve out meaningful decision boundaries while sanitising the class manifolds by pushing OOD content into the garbage class. During inference, this training scheme enables the model to effectively detect and reject OOD samples by classifying them into the garbage class. Furthermore, the confidence scores associated with each prediction can be used to estimate uncertainty for both in-distribution and OOD inputs. Our approach is scalable, interpretable, and does not require access to external OOD datasets or post-hoc calibration techniques while providing a unified solution to the dual challenges of OOD detection and uncertainty estimation.

在这项工作中,我们提出了一个新框架,将网络与分类培训相结合,同时处理OOD检测和不确定性估算问题。对于标准的n级分类任务,我们将分类器扩展为(n+1)级模型,引入一个“垃圾”类,最初由随机的Gausian噪音组成,以代表外部投入。在每次培训后,我们使用网络转换来重建与所有产出类别相对应的不确定性图像,这些产出类别最初显得吵闹和不协调,因此被排除在垃圾分类班之外,以便再培训 OOD 检测和不确定性估算同时处理 OO 。这一培训、转换和排斥的周期继续反复进行,直到被倒置的样本开始更接近分发中的数据,表明分类器学会了将有意义的决定界限分离出来,同时通过将 OOD 内容推入外部检测班级,从而将 OOD 的可校准方法推入到垃圾类中。这一培训计划使得该模型能够有效地检测和拒绝所有产出类别中的输入图象,同时将OOD 的样本用于对质量的估算。
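The inference step described in the abstract above (reject an input when the (n+1)-class model assigns it to the garbage class, and report the softmax confidence as an uncertainty signal) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the convention that the garbage class sits at the last index are assumptions made here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_garbage_class(logits):
    """Classify one input from its n+1 logits.

    Assumption: the last index is the garbage class. An input whose
    top class is the garbage class is flagged as OOD and gets no label.
    """
    probs = softmax(logits)
    top = max(range(len(probs)), key=probs.__getitem__)
    is_ood = top == len(probs) - 1
    return {
        "label": None if is_ood else top,
        "is_ood": is_ood,
        "confidence": probs[top],  # usable as a simple uncertainty signal
    }
```

For example, `predict_with_garbage_class([0.1, 0.2, 0.1, 5.0])` flags the input as OOD because the fourth (garbage) logit dominates.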

Article 126

Title@2025-05-29 (4): GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning

Title: GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning

GSQ-Tuning: Group-Shared Exponents integer in einer voll quantifizierten Schulung für LLMs On-Device-Fine-Tuning

GSQ-Tuning:面向LLM设备端微调的全量化训练中的组共享指数整数 2502.12913v3

Authors: Sifan Zhou, Shuo Wang, Zhihang Yuan, Mingjia Shi, Yuzhang Shang, Dawei Yang

Fine-tuning technologies for Large Language Models (LLMs) have achieved remarkable results. However, traditional LLM fine-tuning approaches face significant challenges: they require large Floating Point (FP) computation, raise privacy concerns when handling sensitive data, and are impractical for resource-constrained edge devices. While Parameter-Efficient Fine-Tuning (PEFT) techniques reduce trainable parameters, their reliance on floating-point arithmetic creates fundamental incompatibilities with edge hardware. In this work, we introduce GSQ-Tuning, a novel framework for on-device LLM fine-tuning that eliminates the need for floating-point operations in both inference and training. At its core is the Group-Shared Exponents Integer format, which efficiently represents model parameters in integer format using shared exponents among parameter groups. When combined with LoRA-like adapters, this enables fully integer-based fine-tuning that is both memory and compute efficient. We demonstrate that our approach achieves accuracy comparable to BF16-based fine-tuning while reducing memory usage by 1.85x. Moreover, compared to FP8, our method can reduce power consumption by 5x and chip area by 11x at the same performance, making large-scale model adaptation feasible on edge devices.

大型语言模型(LLMS)的微调技术取得了显著成果。然而,传统的LLM微调方法面临重大挑战:它们需要大型浮点计算,在处理敏感数据时引起隐私问题,对资源限制的边缘设备不切实际。虽然参数-有效精美微调(PEFT)技术减少了可训练参数,但对浮点计算方法的依赖使得对边端硬件的精确性与边端硬件产生根本的不兼容性。在这项工作中,我们引入了一个新型的LLM微调框架,消除了在感应和培训(称为GSQ-Tuning)中进行浮点操作的需要。其核心是群体共享集价 Integer格式,该格式有效地代表了使用各参数组共享的整数格式的模型参数。当它们与类似LORA的适应器相结合时,可以使完全基于整流点的微调既具有记忆性又具有计算效率。我们的方法达到了与基于BF16的微调的精确性能,同时大大减少了1.85x记忆使用。此外,与可操作性平级的平方标准的S-11级吸能装置相比,我们的方法可以降低高能区平位的平段的平段的性能。
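The idea of storing a group of parameters as integers that share a single exponent can be sketched as follows. This is an illustrative block-floating-point style quantizer, not the paper's exact Group-Shared Exponents Integer format: the power-of-two scale, the 8-bit default, and the function names are all assumptions made here.

```python
import math


def quantize_group(values, bits=8):
    """Map a group of floats to signed integers sharing one exponent.

    Each value is approximated as int * 2**exponent, with the exponent
    chosen so the largest magnitude fits in the signed `bits` range.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0] * len(values), 0
    # Smallest exponent e such that max_abs / 2**e <= qmax.
    exponent = math.ceil(math.log2(max_abs / qmax))
    scale = 2.0 ** exponent
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return ints, exponent


def dequantize_group(ints, exponent):
    """Recover approximate floats from the shared-exponent integers."""
    scale = 2.0 ** exponent
    return [i * scale for i in ints]
```

With this sketch the rounding error per element is at most half of the shared scale `2**exponent`, so smaller groups (with tighter dynamic range) lose less precision at the cost of storing more exponents.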

Article 127

Title: SCoTT: Strategic Chain-of-Thought Tasking for Wireless-Aware Robot Navigation in Digital Twins

SCoTT: Strategisches Chain-of-Thought-Tasking für Wireless-Aware-Roboternavigation in digitalen Zwillingen

SCoTT:数字孪生中面向无线感知机器人导航的战略思维链任务 2411.18212v2

Authors: Aladin Djuhera, Amin Seffo, Vlad C. Andrei, Holger Boche, Walid Saad

Path planning under wireless performance constraints is a complex challenge in robot navigation. However, naively incorporating such constraints into classical planning algorithms often incurs prohibitive search costs. In this paper, we propose SCoTT, a wireless-aware path planning framework that leverages vision-language models (VLMs) to co-optimize average path gains and trajectory length using wireless heatmap images and ray-tracing data from a digital twin (DT). At the core of our framework is Strategic Chain-of-Thought Tasking (SCoTT), a novel prompting paradigm that decomposes the exhaustive search problem into structured subtasks, each solved via chain-of-thought prompting. To establish strong baselines, we compare classical A* and wireless-aware extensions of it, and derive DP-WA*, an optimal, iterative dynamic programming algorithm that incorporates all path gains and distance metrics from the DT, but at significant computational cost. In extensive experiments, we show that SCoTT achieves path gains within 2% of DP-WA* while consistently generating shorter trajectories. Moreover, SCoTT’s intermediate outputs can be used to accelerate DP-WA* by reducing its search space, saving up to 62% in execution time. We validate our framework using four VLMs, demonstrating effectiveness across both large and small models, thus making it applicable to a wide range of compact models at low inference cost. We also show the practical viability of our approach by deploying SCoTT as a ROS node within Gazebo simulations. Finally, we discuss data acquisition pipelines, compute requirements, and deployment considerations for VLMs in 6G-enabled DTs, underscoring the potential of natural language interfaces for wireless-aware navigation in real-world applications.

无线性能约束下的路径规划是机器人导航中的一项复杂挑战。然而,将此类约束直接纳入经典规划算法往往会带来过高的搜索开销。本文提出SCoTT,一个无线感知的路径规划框架,它利用视觉语言模型(VLM),借助数字孪生(DT)提供的无线热力图图像和射线追踪数据,联合优化平均路径增益和轨迹长度。该框架的核心是战略思维链任务(SCoTT),这是一种新的提示范式,将穷举搜索问题分解为结构化的子任务,每个子任务通过思维链提示求解。为了建立强基线,我们比较了经典A*及其无线感知扩展,并推导出DP-WA*,这是一种最优的迭代动态规划算法,它纳入了来自DT的全部路径增益和距离度量,但计算成本很高。大量实验表明,SCoTT获得的路径增益与DP-WA*相差不超过2%,同时始终生成更短的轨迹。此外,SCoTT的中间输出可用于缩小DP-WA*的搜索空间,从而将其执行时间最多减少62%。我们使用四个VLM验证了该框架,证明其在大模型和小模型上均有效,因此能以较低的推理成本适用于多种紧凑模型。我们还将SCoTT部署为Gazebo仿真中的ROS节点,展示了该方法的实际可行性。最后,我们讨论了在支持6G的DT中使用VLM的数据采集流程、算力需求和部署考虑,强调了自然语言接口在现实应用的无线感知导航中的潜力。

Article 128

Title@2025-05-29 (4): The Strong, Weak and Benign Goodhart’s law. An independence-free and paradigm-agnostic formalisation

Title: The Strong, Weak and Benign Goodhart’s law. An independence-free and paradigm-agnostic formalisation

The Strong, Weak and Benign Goodharts Gesetz. Eine unabhängigkeitsfreie und paradigmatisch-agnostische Formalisierung

强势、弱弱和本尼·古德哈特法,无独立和无范式、不可知的正规化 2505.23445v1

Authors: Adrien Majka, El-Mahdi El-Mhamdi

Goodhart’s law is a famous adage in policy-making that states that “When a measure becomes a target, it ceases to be a good measure”. As machine learning models and the optimisation capacity to train them grow, mounting empirical evidence has reinforced belief in the validity of this law, although the law itself has not been formalised. Recently, a few attempts were made to formalise Goodhart’s law, either by categorising variants of it, or by looking at how optimising a proxy metric affects the optimisation of an intended goal. In this work, we relax the simplifying independence assumption made in previous works, and the assumption on the learning paradigm made in most of them, to study the effect of the coupling between the proxy metric and the intended goal on Goodhart’s law. Our results show that in the case of a light-tailed goal and a light-tailed discrepancy, dependence does not change the nature of Goodhart’s effect. However, in the light-tailed goal and heavy-tailed discrepancy case, we exhibit an example where over-optimisation occurs at a rate inversely proportional to the heavy-tailedness of the discrepancy between the goal and the metric.

Goodhart的法律是决策中一个著名的格言,它指出“当一项措施成为目标时,它不再是一个好的衡量标准”。随着机器学习模式和训练它们的最佳能力的增长,越来越多的经验证据加强了对该法有效性的信念,而没有正式化。最近,有人试图将Goodhart的法律正规化,或者对它进行分类,或者研究如何优化代用衡量标准影响预期目标的优化。在这项工作中,我们减少了以前作品中所作的简化独立假设,以及其中多数作品中对学习模式所作的假设,以研究代用衡量标准与Goodhart法律预期目标的挂钩的影响。我们的结果显示,在光尾目标和小尾差异的情况下,依赖并不改变Goodhart效应的性质。然而,在浅尾尾目标和严重尾端差异的情况下,我们展示了一个例子,过度优化发生的速度与目标之间严重尾部差异之间的反比率。

Article 129

Title@2025-05-29 (4): Strategic Classification with Non-Linear Classifiers

Title: Strategic Classification with Non-Linear Classifiers

Strategische Klassifizierung mit nicht linearen Klassifikatoren

战略分类与非链分类法战略分类 2505.23443v1

Authors: Benyamin Trachtenberg, Nir Rosenfeld

In strategic classification, the standard supervised learning setting is extended to support the notion of strategic user behavior in the form of costly feature manipulations made in response to a classifier. While standard learning supports a broad range of model classes, the study of strategic classification has, so far, been dedicated mostly to linear classifiers. This work aims to expand the horizon by exploring how strategic behavior manifests under non-linear classifiers and what this implies for learning. We take a bottom-up approach showing how non-linearity affects decision boundary points, classifier expressivity, and model class complexity. A key finding is that universal approximators (e.g., neural nets) are no longer universal once the environment is strategic. We demonstrate empirically how this can create performance gaps even on an unrestricted model class.

在战略分类方面,标准监督的学习环境扩大到支持战略用户行为的概念,其形式是针对分类者进行昂贵的特性操纵。虽然标准学习支持一系列广泛的模型类,但到目前为止,战略分类的研究大多专门针对线性分类者。这项工作旨在扩大视野,探讨非线性分类者的战略行为如何表现,以及这意味着学习什么。我们采取自下而上的方法,表明非线性如何影响决定边界点、分类性直观性和模型类的复杂性。一项关键发现是,一旦环境具有战略意义,通用近似器(例如神经网)就不再具有普遍性。我们从经验上证明,即使在一个不受限制的模型类中,这如何造成业绩差距。
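To make the loss of expressivity under strategic behavior concrete, here is a toy one-dimensional sketch. It is a simplification chosen here, not the paper's model: users are assumed to move their feature to any point within a fixed budget if that earns a positive label, and the classifier is represented as a union of positive intervals. All names are hypothetical.

```python
def distance_to_positive_region(x, positive_intervals):
    """Distance from x to the nearest point the classifier labels positive."""
    return min(
        0.0 if lo <= x <= hi else min(abs(x - lo), abs(x - hi))
        for lo, hi in positive_intervals
    )


def raw_label(x, positive_intervals):
    """Label assigned if the user reports x truthfully."""
    return distance_to_positive_region(x, positive_intervals) == 0.0


def effective_label(x, positive_intervals, budget):
    """Label obtained after the user best-responds within the budget."""
    return distance_to_positive_region(x, positive_intervals) <= budget
```

With positive intervals `[(0, 1), (5, 6)]` and a budget of 2, every point in the negative gap between 1 and 5 can reach a positive region, so the effective classifier is positive on the whole range from -2 to 8. Under these assumptions, no choice of intervals can realise a negative gap narrower than twice the budget, which is one way a flexible non-linear model class can lose expressive power once users respond strategically.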

Article 130

Title@2025-05-29 (4): Rethinking Regularization Methods for Knowledge Graph Completion

Title: Rethinking Regularization Methods for Knowledge Graph Completion

Überdenken von Regularisierungsmethoden für Wissensgraphenvervollständigung

重新思考知识图完成正规化方法 2505.23442v1

Authors: Linyu Li, Zhi Jin, Yuanpeng He, Dongming Jin, Haoran Duan, Zhengwei Tao, Xuan Zhang, Jiandong Li

Knowledge graph completion (KGC) has attracted considerable attention in recent years because it is critical to improving the quality of knowledge graphs. Researchers have continuously explored various models. However, most previous efforts have not examined regularization in depth, so it has not been used to its full potential. This paper rethinks the application of regularization methods in KGC. Through extensive empirical studies on various KGC models, we find that carefully designed regularization not only alleviates overfitting and reduces variance but also enables these models to break through the upper bounds of their original performance. Furthermore, we introduce a novel sparse-regularization method that embeds the concept of rank-based selective sparsity into the KGC regularizer. The core idea is to selectively penalize those components with significant features in the embedding vector, thus effectively ignoring many components that contribute little and may only represent noise. Various comparative experiments on multiple datasets and multiple models show that the SPR regularization method is better than other regularization methods and can enable the KGC model to further break through the performance margin.

近些年来,知识图的完成(KGC)由于对提高知识图的质量至关重要,所以引起了相当大的注意。研究人员不断探索各种模型。然而,以往的多数努力都忽略了从更深的视角利用正规化,因此没有充分利用其潜力。本文件重新思考了在KGC应用正规化方法的问题。通过对各种KGC模型的广泛经验研究,我们发现,经过精心设计的正规化不仅减轻了过度和减少差异,而且使这些模型能够突破其原始性能的上限。此外,我们引入了一种新的稀有常规化方法,将基于等级的选择性聚变概念嵌入KGC正规化器中。核心思想是选择性地惩罚那些在嵌入矢量中具有重要特征的成分,从而实际上忽略了许多很少起作用的成分,而且可能只是代表噪音。关于多个数据集和多个模型的各种比较实验表明,SPR正规化方法比其他正规化方法要好,能够使KGC模型进一步突破性差。
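One way to read "rank-based selective sparsity" is a penalty that ranks embedding components by magnitude and penalizes only the largest ones. The sketch below illustrates that reading; the choice of a top-k cut-off, the squared penalty, and the function name are assumptions made here, and the paper's SPR regularizer may be defined differently.

```python
def topk_magnitude_penalty(embedding, k, power=2):
    """Penalize only the k largest-magnitude components of an embedding.

    Components outside the top k contribute nothing to the penalty,
    so small (possibly noisy) entries are ignored by the regularizer.
    """
    magnitudes = sorted((abs(v) for v in embedding), reverse=True)
    return sum(m ** power for m in magnitudes[:k])
```

For the embedding `[3.0, -0.1, 0.2, -2.0]` with `k=2`, only the components 3.0 and -2.0 are penalized.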

Article 131

Title@2025-05-29 (4): The challenge of hidden gifts in multi-agent reinforcement learning

Title: The challenge of hidden gifts in multi-agent reinforcement learning

Die Herausforderung der versteckten Gaben in Multi-Agenten-Verstärkung Lernen

多试剂强化学习中隐藏礼品的挑战 2505.20579v2

Authors: Dane Malenfant, Blake A. Richards

Sometimes we benefit from actions that others have taken even when we are unaware that they took those actions. For example, if your neighbor chooses not to take a parking spot in front of your house when you are not there, you can benefit, even without being aware that they took this action. These “hidden gifts” represent an interesting challenge for multi-agent reinforcement learning (MARL), since assigning credit when the beneficial actions of others are hidden is non-trivial. Here, we study the impact of hidden gifts with a very simple MARL task. In this task, agents in a grid-world environment have individual doors to unlock in order to obtain individual rewards. In addition, if all the agents unlock their doors, the group receives a larger collective reward. However, there is only one key for all of the doors, such that the collective reward can only be obtained when the agents drop the key for others after they use it. Notably, there is nothing to indicate to an agent that the other agents have dropped the key, so the act of dropping the key for others is a “hidden gift”. We show that several different state-of-the-art RL algorithms, including MARL algorithms, fail to learn how to obtain the collective reward in this simple task. Interestingly, we find that independent model-free policy gradient agents can solve the task when we provide them with information about their own action history, but MARL agents still cannot solve the task with action history. Finally, we derive a correction term for these independent agents, inspired by learning-aware approaches, which reduces the variance in learning and helps them to converge to collective success more reliably. These results show that credit assignment in multi-agent settings can be particularly challenging in the presence of “hidden gifts”, and demonstrate that learning awareness in independent agents can help in these settings.

有时我们从其他人的行动中受益,即使我们不知道他们采取了这些行动。例如,如果邻居选择不在其家中时不在其家门前停泊,即使不知道他们采取了这一行动,也可以受益。这些“隐藏的礼物”代表了多试剂强化学习(MARL)的一个有趣的挑战,因为当其他人的有益行动被隐藏起来时,就分配信用是非三角的。在这里,我们研究隐藏的礼品的影响,任务很简单,MARL的任务非常简单。在这个任务中,网格世界环境中的代理商有单独的门可以打开,以获得个人报酬。同样,如果所有代理商都打开了他们的家门,他们也可以得到更大的集体奖赏。然而,所有这些“隐藏的礼物”只是当代理人在其他人的有益行动被隐藏起来的时候,集体奖赏才能得到。值得注意的是,没有什么可以告诉代理商其他代理商已经放下了钥匙,因此,放弃他人的钥匙的行为就是“隐藏的礼物”。我们用不同的门打开门打开了自己的门来获得个人奖赏。同样,如果所有的代理商都打开他们的门门, 包括MAL 算算,那么,我们就能在他们自己学习了一个真正的历史任务中,我们如何在学习这些任务中,我们如何在学习这些任务中,我们是如何学习了。

Article 132

Title@2025-05-29 (4): LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty

Title: LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty

LoTUS: Großformatige Maschine entlernen mit einem Geschmack von Ungewissheit

LoTUS: 大型机器与不确定性的味道脱钩 2503.18314v4

Authors: Christoforos N. Spartalis, Theodoros Semertzidis, Efstratios Gavves, Petros Daras

We present LoTUS, a novel Machine Unlearning (MU) method that eliminates the influence of training samples from pre-trained models, avoiding retraining from scratch. LoTUS smooths the prediction probabilities of the model up to an information-theoretic bound, mitigating its over-confidence stemming from data memorization. We evaluate LoTUS on Transformer and ResNet18 models against eight baselines across five public datasets. Beyond established MU benchmarks, we evaluate unlearning on ImageNet1k, a large-scale dataset, where retraining is impractical, simulating real-world conditions. Moreover, we introduce the novel Retrain-Free Jensen-Shannon Divergence (RF-JSD) metric to enable evaluation under real-world conditions. The experimental results show that LoTUS outperforms state-of-the-art methods in terms of both efficiency and effectiveness. Code: https://github.com/cspartalis/LoTUS.

我们介绍LotUS,这是消除培训样本对培训前模式的影响、避免从零开始再培训的新颖的机器不学习方法;LotUS将模型的预测概率顺通到信息理论约束,减轻数据记忆的过度信任;我们根据五个公共数据集的八个基线对变异器和ResNet18模型进行评估;除了已经确立的MU基准外,我们还评估在图像Net1k上的未学习,这是一个大规模数据集,在那里,再培训是不切实际的,模拟现实世界条件;此外,我们推出新的Retrain Free Jensen-Shannon Divergence (RF-JSD) 标准,以便能够在现实世界条件下进行评估;实验结果显示,LotUS在效率和有效性方面都超越了最新技术方法。代码:https://github.com/cspartalis/LATUS。
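The RF-JSD metric named above builds on the Jensen-Shannon divergence. The sketch below shows only the textbook JSD between two discrete probability distributions, in bits; which distributions RF-JSD compares, and how they are aggregated, is not stated in the abstract and is not reproduced here.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, symmetric and bounded in [0, 1]."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
```

Identical distributions give 0 and distributions with disjoint support give 1, which makes the value easy to interpret as a normalised distance between two models' output distributions.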

Article 133

Title@2025-05-29 (4): Bounded-Abstention Pairwise Learning to Rank

Title: Bounded-Abstention Pairwise Learning to Rank

Gebundene Abhaltung Pairwise Learning to Rank

学习排名 2505.23437v1

Authors: Antonio Ferrara, Andrea Pugnana, Francesco Bonchi, Salvatore Ruggieri

Ranking systems influence decision-making in high-stakes domains like health, education, and employment, where they can have substantial economic and social impacts. This makes the integration of safety mechanisms essential. One such mechanism is abstention, which enables algorithmic decision-making systems to defer uncertain or low-confidence decisions to human experts. While abstention has been predominantly explored in the context of classification tasks, its application to other machine learning paradigms remains underexplored. In this paper, we introduce a novel method for abstention in pairwise learning-to-rank tasks. Our approach is based on thresholding the ranker’s conditional risk: the system abstains from making a decision when the estimated risk exceeds a predefined threshold. Our contributions are threefold: a theoretical characterization of the optimal abstention strategy, a model-agnostic, plug-in algorithm for constructing abstaining ranking models, and a comprehensive empirical evaluation across multiple datasets, demonstrating the effectiveness of our approach.

分级制度影响保健、教育和就业等高层次领域的决策,可以对这些领域产生巨大的经济和社会影响。这使得整合安全机制至关重要。其中一种机制是美元,使算法决策系统能够将不确定或低信任决定推迟给人类专家。虽然主要在分类任务方面探索了弃权权,但在其他机器学习范式中应用弃权权的情况仍未得到充分探讨。在本文中,我们引入了一种新颖的方法,在对齐学习到排序的任务中弃权权。我们的方法是基于对排级者的有条件风险设定门槛:当估计风险超过预先确定的门槛时,系统不做出决策。我们的贡献有三重:最佳弃权战略的理论定性、构建不排位模型的、插入式算法,以及跨多个数据集的全面经验评估,显示了我们的方法的有效性。
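The thresholding rule described above can be sketched for a single pair of items. The sketch assumes the ranker outputs an estimated probability that item i should be ranked above item j and takes the conditional risk to be the 0-1 error of the better guess, `min(p, 1 - p)`; both the risk definition and the function name are assumptions made here rather than the paper's formulation.

```python
def rank_pair_or_abstain(prob_i_over_j, risk_threshold):
    """Return "i", "j", or "abstain" for one pair of items.

    prob_i_over_j: estimated probability that item i ranks above item j.
    risk_threshold: maximum tolerated conditional risk before deferring.
    """
    # Expected 0-1 error if we commit to the more likely ordering.
    conditional_risk = min(prob_i_over_j, 1.0 - prob_i_over_j)
    if conditional_risk > risk_threshold:
        return "abstain"
    return "i" if prob_i_over_j >= 0.5 else "j"
```

Lowering `risk_threshold` makes the system defer more pairs to a human expert; a bound on the abstention rate could then be met by choosing the threshold on held-out data.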

Article 134

Title@2025-05-29 (4): Train with Perturbation, Infer after Merging: A Two-Stage Framework for Continual Learning

Title: Train with Perturbation, Infer after Merging: A Two-Stage Framework for Continual Learning

Trainieren mit Perturbation, schlussfolgern nach Merging: Ein Zwei-Stufen-Rahmen für kontinuierliches Lernen

接转训练、合并后的推推:持续学习的双阶段框架 2505.22389v2

Authors: Haomiao Qiu, Miao Zhang, Ziyue Qiao, Liqiang Nie

Continual Learning (CL) aims to enable models to continuously acquire new knowledge from a sequence of tasks while avoiding the forgetting of learned information. However, existing CL methods rely only on the parameters of the most recent task for inference, which makes them susceptible to catastrophic forgetting. Inspired by the recent success of model merging techniques, we propose Perturb-and-Merge (P&M), a novel continual learning framework that integrates model merging into the CL paradigm to mitigate forgetting. Specifically, after training on each task, P&M constructs a new model by forming a convex combination of the previous model and the newly trained task-specific model. Through theoretical analysis, we minimize the total loss increase across all tasks and derive an analytical solution for the optimal merging coefficient. To further improve the performance of the merged model, we observe that the degradation introduced during merging can be alleviated by a regularization term composed of the task vector and the Hessian matrix of the loss function. Interestingly, we show that this term can be efficiently approximated using second-order symmetric finite differences, and we accordingly devise a stochastic perturbation strategy along the task vector direction that incurs no additional forward or backward passes while providing an effective approximation of the regularization term. Finally, we combine P&M with LoRA, a parameter-efficient fine-tuning method, to reduce memory overhead. Our proposed approach achieves state-of-the-art performance on several continual learning benchmark datasets.

持续学习(CL)旨在让模型能够不断从一系列任务中获取新知识,避免忘记已学习的信息;然而,现有的CL方法只依靠最近一项推断任务的参数,因此很容易发生灾难性的忘记。受最近成功的模型合并技术的启发,我们提议了\ textbf{Perturb-and-Merge(PM)},这是一个新的持续学习框架,将模式合并到CL模式中,以减少遗忘。具体地说,在对每项任务进行培训之后,PM通过将以前的模型和新培训的具体任务模型结合起来,来构建一个新的模型。通过理论分析,我们最大限度地减少所有任务的总损失,并为最佳合并系数找到分析解决办法。为了进一步改善合并模型的性能,我们观察到,合并过程中引入的退化可以通过由任务矢量组成的正规化术语和损失函数的赫西式矩阵来缓解。有趣的是,在对每项任务进行培训后,我们可以用二级的精细度定的固定差异和新培训的具体任务模式来构建一个新的模型。最后,我们用一个不断调整的轨道化的方法来将一个我们进化的系统化的系统化的系统化方向结合起来。
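Two ingredients named in the abstract above can be sketched on plain parameter lists: the convex combination used for merging, and a second-order symmetric finite difference that approximates the curvature term d^T H d along a direction d without forming the Hessian. This is an illustration under assumptions made here (hypothetical function names, an explicit loss callable evaluated three times); the paper's stochastic perturbation scheme, which avoids extra passes, is not reproduced.

```python
def merge_models(prev_params, new_params, alpha):
    """Convex combination (1 - alpha) * prev + alpha * new, with 0 <= alpha <= 1."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(prev_params, new_params)]


def curvature_along(loss, params, direction, eps=1e-3):
    """Approximate d^T H d via a symmetric second-order finite difference.

    Uses (L(p + eps*d) - 2*L(p) + L(p - eps*d)) / eps**2, which is exact
    for quadratic losses up to floating-point error.
    """
    plus = [p + eps * d for p, d in zip(params, direction)]
    minus = [p - eps * d for p, d in zip(params, direction)]
    return (loss(plus) - 2 * loss(params) + loss(minus)) / eps ** 2
```

As a usage example, for the quadratic loss `lambda t: t[0] ** 2 + 3 * t[1] ** 2` the Hessian is diag(2, 6), so the curvature along the direction `[1.0, 1.0]` is 8, and `curvature_along` recovers that value to within numerical error.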

Article 135

Title@2025-05-29 (4): Emergent Risk Awareness in Rational Agents under Resource Constraints

Title: Emergent Risk Awareness in Rational Agents under Resource Constraints

Emergent Risk Awareness in Rational Agents unter Ressourcenbeschränkungen

资源限制下对合理代理的新兴风险意识 2505.23436v1

Authors: Daniel Jarne Ornia, Nicholas Bishop, Joel Dyer, Wei-Chen Lee, Ani Calinescu, Doyne Farmer, Michael Wooldridge

Advanced reasoning models with agentic capabilities (AI agents) are deployed to interact with humans and to solve sequential decision-making problems under (approximate) utility functions and internal models. When such problems have resource or failure constraints where action sequences may be forcibly terminated once resources are exhausted, agents face implicit trade-offs that reshape their utility-driven (rational) behaviour. Additionally, since these agents are typically commissioned by a human principal to act on their behalf, asymmetries in constraint exposure can give rise to previously unanticipated misalignment between human objectives and agent incentives. We formalise this setting through a survival bandit framework, provide theoretical and empirical results that quantify the impact of survival-driven preference shifts, identify conditions under which misalignment emerges and propose mechanisms to mitigate the emergence of risk-seeking or risk-averse behaviours. As a result, this work aims to increase understanding and interpretability of emergent behaviours of AI agents operating under such survival pressure, and offer guidelines for safely deploying such AI systems in critical resource-limited environments.

运用具有代理能力的高级推理模型(AI代理商)与人类互动,并在(近似)通用功能和内部模型下解决顺序决策问题。当这类问题有资源或失败制约,一旦资源用尽,行动序列可能被迫终止时,代理商面临暗含的权衡,从而改变其使用(理性)行为。此外,由于这些代理商通常由一位人类主委托代表其行事,因此,在受限制的接触中的不对称可能导致人类目标与代理商激励之间先前未曾预料到的不匹配。我们通过生存强盗框架将这一设置正规化,提供理论和经验结果,量化由生存驱动的优惠转移的影响,查明出现不匹配的条件,并提出机制,以减轻在这种生存压力下运作的AI代理商新出现的行为,并提供在关键资源有限的环境中安全部署此类AI系统的指导方针。

Article 136

Title@2025-05-29 (4): Diversity-Aware Policy Optimization for Large Language Model Reasoning

Title: Diversity-Aware Policy Optimization for Large Language Model Reasoning

Diversity-Aware-Politikoptimierung für groß angelegte Sprachmodell-Reasoning

大语言示范理由的多样性政策优化 2505.23433v1

Authors: Jian Yao, Ran Cheng, Xingyu Wu, Jibin Wu, Kay Chen Tan

The reasoning capabilities of large language models (LLMs) have advanced rapidly, particularly following the release of DeepSeek R1, which has inspired a surge of research into data quality and reinforcement learning (RL) algorithms. Despite the pivotal role diversity plays in RL, its influence on LLM reasoning remains largely underexplored. To bridge this gap, this work presents a systematic investigation into the impact of diversity in RL-based training for LLM reasoning, and proposes a novel diversity-aware policy optimization method. Across evaluations on 12 LLMs, we observe a strong positive correlation between the solution diversity and Potential at k (a novel metric quantifying an LLM’s reasoning potential) in high-performing models. This finding motivates our method to explicitly promote diversity during RL training. Specifically, we design a token-level diversity measure, reformulate it into a practical objective, and selectively apply it to positive samples. Integrated into the R1-zero training framework, our method achieves a 3.5 percent average improvement across four mathematical reasoning benchmarks, while generating more diverse and robust solutions.

大型语言模型(LLMs)的推理能力迅速发展,特别是在DeepSeek R1发布后,它激发了数据质量和强化学习算法研究的激增。尽管多样性在RL中发挥着关键作用,但对LLM推理的影响仍然在很大程度上没有得到充分探讨。为弥合这一差距,这项工作对基于RL培训的多样性对LLM推理的影响进行了系统调查,并提出了新的多样性认识政策优化方法。在对12LMs的评价中,我们看到在高绩效模型中,解决办法的多样性和潜力在 k(对LLM推理潜力进行量化的新指标)之间有着强烈的正相关关系。这一发现激励了我们在RL培训中明确促进多样性的方法。具体地说,我们设计了象征性的多样性并将其转化为一个实际目标,然后有选择地将其应用于积极的样本。我们的方法被纳入R1-零培训框架,在四个数学推理基准中实现了平均3.5%的改进,同时产生了更加多样化和有力的解决方案。

Article 137

Title@2025-05-29 (4): Improved Learning via k-DTW: A Novel Dissimilarity Measure for Curves

Title: Improved Learning via k-DTW: A Novel Dissimilarity Measure for Curves

Verbessertes Lernen über k-DTW: Ein neuartiges Maß an Unähnlichkeit für Kurven

通过 k-DTW改进学习:曲线的新差异措施 2505.23431v1

Authors: Amer Krivošija, Alexander Munteanu, André Nusser, Chris Schwiegelshohn

This paper introduces $k$-Dynamic Time Warping ($k$-DTW), a novel dissimilarity measure for polygonal curves. $k$-DTW has stronger metric properties than Dynamic Time Warping (DTW) and is more robust to outliers than the Fréchet distance, which are the two gold standards of dissimilarity measures for polygonal curves. We show interesting properties of $k$-DTW and give an exact algorithm as well as a $(1+\varepsilon)$-approximation algorithm for $k$-DTW by a parametric search for the $k$-th largest matched distance. We prove the first dimension-free learning bounds for curves and further learning-theoretic results. $k$-DTW not only admits a smaller sample size than DTW for the problem of learning the median of curves, where some factors depending on the curves’ complexity $m$ are replaced by $k$, but we also show a surprising separation on the associated Rademacher and Gaussian complexities: $k$-DTW admits strictly smaller bounds than DTW, by a factor $\tilde\Omega(\sqrt{m})$ when $k\ll m$. We complement our theoretical findings with an experimental illustration of the benefits of using $k$-DTW for clustering and nearest neighbor classification.

本文介绍了对多边形曲线的美元- 机动时间扭曲(kk$- DTW) 。美元- DTW比动态时间扭曲(DTW) 具有比动态时间扭曲(DTW) 更强的度量特性,并且比Fr'{e}chet 距离(Fr'{e}chet 距离) 更强的外端学习范围。这是多边形曲线不同计量的两个金标准。我们展示了K美元- DTW 的有趣特性,给出了精确的算法以及美元- DTW 的(1 varepsilon) 和美元- DTW 和美元(美元- 美元) 相联的比方程式的比值( 美元- DDTQ_ m) 。我们证明,对于曲线的计算结果来说,第一个无维度学习的学习范围比 DTW 的样本小。美元- DTW 不仅接受比 DTW 的中位值小的样本大小, 某些取决于曲线的精度值的精度值以美元的精度值。
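The measure can be illustrated by brute force on tiny inputs. The sketch takes k-DTW to be the minimum, over all monotone warping paths, of the sum of the k largest matched distances; this is a reading of the abstract, and the paper's formal definition may differ in details. It enumerates every path, so it is exponential and is not the exact or approximation algorithm mentioned above. Curves are one-dimensional lists of numbers for simplicity, and the function name is hypothetical.

```python
def k_dtw_bruteforce(p, q, k):
    """Minimum over warping paths of the sum of the k largest matched distances.

    p, q: non-empty lists of numbers (1-D curve vertices).
    Exponential time; intended only for very short curves.
    """
    best = None

    def walk(i, j, dists):
        nonlocal best
        dists = dists + [abs(p[i] - q[j])]
        if i == len(p) - 1 and j == len(q) - 1:
            cost = sum(sorted(dists, reverse=True)[:k])
            if best is None or cost < best:
                best = cost
            return
        if i + 1 < len(p):
            walk(i + 1, j, dists)          # advance on p only
        if j + 1 < len(q):
            walk(i, j + 1, dists)          # advance on q only
        if i + 1 < len(p) and j + 1 < len(q):
            walk(i + 1, j + 1, dists)      # advance on both

    walk(0, 0, [])
    return best
```

Under this reading, `k = 1` gives the discrete Fréchet distance (the smallest achievable maximum matched distance) and any `k` at least as large as the path length gives ordinary DTW, so k interpolates between the two measures.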

Article 138

Title@2025-05-29 (4): Proper Dataset Valuation by Pointwise Mutual Information

Title: Proper Dataset Valuation by Pointwise Mutual Information

Richtiger Datensatz Bewertung durch pointwise Gegenseitige Informationen

按点对点相互信息分列的适当数据集估价 2405.18253v3

Authors: Shuran Zheng, Xuan Qi, Rui Ray Chen, Yongchan Kwon, James Zou

Data plays a central role in advancements in modern artificial intelligence, with high-quality data emerging as a key driver of model performance. This has prompted the development of principled and effective data curation methods in recent years. However, existing methods largely rely on heuristics, and whether they are truly effective remains unclear. For instance, standard evaluation methods that assess a trained model’s performance on specific benchmarks may incentivize assigning high scores to data that merely resembles the test set. This issue exemplifies Goodhart’s law: when a measure becomes a target, it ceases to be a good measure. To address this issue, we propose an information-theoretic framework for evaluating data curation methods. We define dataset quality in terms of its informativeness about the true model parameters, formalized using the Blackwell ordering of informativeness. Under this ordering, Blackwell’s theorem ensures that more informative data yields optimal models with lower expected loss on the true underlying distribution. To measure informativeness, we show that the Blackwell order can be determined by the Shannon mutual information between the curated data and the test data. To estimate this mutual information, we introduce a novel method that trains Bayesian models on embedded datasets and computes mutual information from the posteriors of model parameters. Experiments on real-world data demonstrate that our mutual information-based evaluation assigns appropriately lower scores to data curation strategies that reduce dataset informativeness, while traditional test score-based evaluation methods may favor data curation strategies that overfit to the test set but compromise the training data’s informativeness.

数据在现代人工智能的进步中发挥着核心作用, 高品质数据正在成为模型性能的关键驱动力。这促使近年来发展了有原则和有效的数据校正方法。但是, 现有方法在很大程度上依赖脂质学, 并且它们是否真正有效, 仍然不清楚。例如, 评估经过培训的模型在具体基准方面的绩效的标准评价方法, 可能会激励将高分分配到仅仅类似于测试集的数据中。这个问题体现了Goodhart 的法律: 当一项措施成为目标时, 它不再是一个良好的衡量标准。为了解决这个问题, 我们提出了一个用于评价数据校正性方法的信息―― 信息性框架。我们用关于真正模型参数的信息性来定义数据集的质量, 使用Blackwell 的信息性排序正规化。根据此命令, 更丰富的数据性数据性能可以产生最佳模型, 其真实基础分布的预期损失较低。为了衡量基于信息性, 我们显示, 黑well 命令可以由香农数据曲线性数据与测试性数据模型之间的相互信息性评估确定。我们从共同数据测试性数据测试模型中, 引入了一种共同数据级数据测试方法, 以测试性数据性数据性数据性数据性测试模型, 。
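For reference, the quantity in the title can be computed directly when the joint distribution is a small discrete table. The sketch below shows only the textbook definition of pointwise mutual information, PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ); the paper's estimator, which trains Bayesian models on embedded datasets and works from posteriors over model parameters, is not reproduced. The dictionary layout and the function name are assumptions made here.

```python
import math


def pointwise_mutual_information(joint, x, y):
    """PMI in bits for outcome (x, y) of a discrete joint distribution.

    joint: dict mapping (x, y) pairs to probabilities that sum to 1.
    Requires joint[(x, y)] > 0.
    """
    p_xy = joint[(x, y)]
    p_x = sum(p for (a, _), p in joint.items() if a == x)
    p_y = sum(p for (_, b), p in joint.items() if b == y)
    return math.log2(p_xy / (p_x * p_y))
```

Independent variables give a PMI of 0 for every outcome, while positive values indicate that the two outcomes co-occur more often than independence would predict.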

Article 139

Title@2025-05-29 (4): Understanding and Mitigating Overrefusal in LLMs from an Unveiling Perspective of Safety Decision Boundary

Title: Understanding and Mitigating Overrefusal in LLMs from an Unveiling Perspective of Safety Decision Boundary

Überrefusal in LLMs aus Sicht der Sicherheitsentscheidungsgrenze zu verstehen und zu mildern

从安全决策边界的视角理解和缓解LLM的过度拒绝 2505.18325v2

Authors: Licheng Pan, Yongqi Tong, Xin Zhang, Xiaolu Zhang, Jun Zhou, Zhixuan Chu

Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they often refuse to answer legitimate queries, a phenomenon known as overrefusal. Overrefusal typically stems from over-conservative safety alignment, causing models to treat many reasonable prompts as potentially risky. To systematically understand this issue, we probe and leverage the models’ safety decision boundaries to analyze and mitigate overrefusal. Our findings reveal that overrefusal is closely tied to misalignment at these boundary regions, where models struggle to distinguish subtle differences between benign and harmful content. Building on these insights, we present RASS, an automated framework for prompt generation and selection that strategically targets overrefusal prompts near the safety boundary. By harnessing steering vectors in the representation space, RASS efficiently identifies and curates boundary-aligned prompts, enabling more effective and targeted mitigation of overrefusal. This approach not only provides a more precise and interpretable view of model safety decisions but also seamlessly extends to multilingual scenarios. We explore the safety decision boundaries of various LLMs and construct the MORBench evaluation set to facilitate robust assessment of model safety and helpfulness across multiple languages. Code and datasets will be released at https://anonymous.4open.science/r/RASS-80D3.

大型语言模型(LLMS)在一系列广泛的任务中表现出了非凡的能力,然而,它们往往拒绝回答被称为过度反悔的正当问题。过度反悔通常源于过度保守的安全协调,导致许多合理的提示被视为潜在的风险。为了系统地理解这一问题,我们探究并利用模型的安全决定界限来分析和减轻过度反悔。我们的调查结果显示,过度反悔与这些边界区域的错误对立密切相关,这些模型努力区分良性内容和有害内容之间的微妙差异。我们根据这些洞察力提出RASS,这是一个迅速生成和选择的自动框架,其战略目标是在安全边界附近过度反悔。通过在代表空间利用指导矢量,RASS有效地识别和调节边界对立的提示,从而能够更有效和有针对性地减轻过度反悔。这种方法不仅为示范安全决定提供了更准确和可解释的视角,而且无缝地扩展到多语种情景。我们探索了各种LMS的安全决定的界限,并构建了MORBESCSE/ASSADR号多版本评估系统,将便利对安全性和安全/开放性准则进行强有力的评估。

Article 140

Title@2025-05-29 (4): On the Validity of Head Motion Patterns as Generalisable Depression Biomarkers

Title: On the Validity of Head Motion Patterns as Generalisable Depression Biomarkers

Über die Gültigkeit von Head Motion Patterns als Generalisable Depression Biomarkers

头动模式作为可普遍适用的萧条生物标志物的有效性 2505.23427v1

Authors: Monika Gahalawat, Maneesh Bilalpur, Raul Fernandez Rojas, Jeffrey F. Cohn, Roland Goecke, Ramanathan Subramanian

Depression is a debilitating mood disorder negatively impacting millions worldwide. While researchers have explored multiple verbal and non-verbal behavioural cues for automated depression assessment, head motion has received little attention thus far. Further, the common practice of validating machine learning models via a single dataset can limit model generalisability. This work examines the effectiveness and generalisability of models utilising elementary head motion units, termed kinemes, for depression severity estimation. Specifically, we consider three depression datasets from different western cultures (German: AVEC2013, Australian: Blackdog and American: Pitt datasets) with varied contextual and recording settings to investigate the generalisability of the derived kineme patterns via two methods: (i) k-fold cross-validation over individual/multiple datasets, and (ii) model reuse on other datasets. Evaluating classification and regression performance with classical machine learning methods, our results show that: (1) head motion patterns are efficient biomarkers for estimating depression severity, achieving highly competitive performance for both classification and regression tasks on a variety of datasets, including achieving the second best Mean Absolute Error (MAE) on the AVEC2013 dataset, and (2) kineme-based features are more generalisable than (a) raw head motion descriptors for binary severity classification, and (b) other visual behavioural cues for severity estimation (regression).

虽然研究人员探索了多种语言和非语言行为提示,用于自动抑郁症评估,但头部运动迄今很少受到关注。此外,通过单一数据集验证机器学习模型的常见做法可能限制模型的通用性。这项工作审查了使用基本头动单元的模型的有效性和可概括性,即所谓的“直流”,以进行抑郁程度估计。具体地说,我们考虑了来自不同西方文化(德国:AVEC2013、澳大利亚:黑狗和美国:Pittt数据集)的三种抑郁症数据集,它们具有不同的背景和记录设置,以调查衍生的直系血型模式的可通用性。此外,通过两种方法:(一) 个人/多重数据集的千倍交叉校验,以及(二) 利用其他数据集的模型再利用。用经典机器学习方法评估分类和回归性表现。我们的结果显示:(1) 头动模式是评估抑郁症严重程度的高效生物标志,在各种数据集的分类和回归任务上实现高度竞争性的性表现,包括实现第二最佳直系直系模式(MAE) 和基于其他直观的直观性(直观性) AVA) 和直观性分析性(AVARC性(MA) ) 的直观) 和直观性(A-R) 直观) 。

Article 141

Title@2025-05-29 (4): Enhanced DACER Algorithm with High Diffusion Efficiency

Title: Enhanced DACER Algorithm with High Diffusion Efficiency

Verbesserter DACER-Algorithmus mit hoher Diffusionseffizienz

具有高扩散效率的增强型DACER算法 2505.23426v1

Authors: Yinuo Wang, Mining Tan, Wenjun Zou, Haotian Lin, Xujie Song, Wenxuan Wang, Tong Liu, Likun Wang, Guojian Zhan, Tianze Zhu, Shiqi Liu, Jingliang Duan, Shengbo Eben Li

Due to their expressive capacity, diffusion models have shown great promise in offline RL and imitation learning. Diffusion Actor-Critic with Entropy Regulator (DACER) extended this capability to online RL by using the reverse diffusion process as a policy approximator, trained end-to-end with policy gradient methods, achieving strong performance. However, this comes at the cost of requiring many diffusion steps, which significantly hampers training efficiency, while directly reducing the steps leads to noticeable performance degradation. Critically, the lack of inference efficiency becomes a significant bottleneck for applying diffusion policies in real-time online RL settings. To improve training and inference efficiency while maintaining or even enhancing performance, we propose a Q-gradient field objective as an auxiliary optimization target to guide the denoising process at each diffusion step. Nonetheless, we observe that the independence of the Q-gradient field from the diffusion time step negatively impacts the performance of the diffusion policy. To address this, we introduce a temporal weighting mechanism that enables the model to efficiently eliminate large-scale noise in the early stages and refine actions in the later stages. Experimental results on MuJoCo benchmarks and several multimodal tasks demonstrate that the DACER2 algorithm achieves state-of-the-art performance in most MuJoCo control tasks with only five diffusion steps, while also exhibiting stronger multimodality compared to DACER.

扩散模型由于其表达能力,在离线RL和模仿学习中表现出了巨大的潜力。Diffusion Actor-Critic with Entropy Regulator(DACER)将这一能力扩展至在线RL,方法是将反向扩散过程用作策略近似器,通过策略梯度方法进行端到端训练,取得了强劲的性能。然而,这样做的代价是需要许多扩散步骤,严重妨碍训练效率,而直接减少步骤又会导致明显的性能下降。关键是,推理效率不足成为在实时在线RL环境中应用扩散策略的重大瓶颈。为了提高训练和推理效率,同时保持甚至提升性能,我们提出将Q梯度场目标作为辅助优化目标,以指导每个扩散步骤的去噪过程。然而,我们观察到,Q梯度场与扩散时间步无关,这对扩散策略的性能产生了负面影响。为了解决这一问题,我们引入了时间加权机制,使模型能够在早期阶段高效消除大尺度噪声,并在后期阶段细化动作。在MuJoCo基准和若干多模态任务上的实验结果表明,DACER2算法仅用五个扩散步骤就在大多数MuJoCo控制任务中取得了最先进的性能,同时与DACER相比表现出更强的多模态性。

Article 142

Title@2025-05-29 (4): Hierarchical Neuro-Symbolic Decision Transformer

Title: Hierarchical Neuro-Symbolic Decision Transformer

Hierarchischer neuro-symbolischer Entscheidungstransformator

等级性神经-共制决定变换器 2503.07148v3

Authors: Ali Baheri, Cecilia O. Alm

We present a hierarchical neuro-symbolic control framework that tightly couples a classical symbolic planner with a transformer-based policy to address long-horizon decision-making under uncertainty. At the high level, the planner assembles an interpretable sequence of operators that guarantees logical coherence with task constraints, while at the low level each operator is rendered as a sub-goal token that conditions a decision transformer to generate fine-grained actions directly from raw observations. This bidirectional interface preserves the combinatorial efficiency and explainability of symbolic reasoning without sacrificing the adaptability of deep sequence models, and it permits a principled analysis that tracks how approximation errors from both planning and execution accumulate across the hierarchy. Empirical studies in stochastic grid-world domains demonstrate that the proposed method consistently surpasses purely symbolic, purely neural and existing hierarchical baselines in both success and efficiency, highlighting its robustness for sequential tasks.

我们提出了一个等级级神经 – – 精神共振控制框架,将传统的象征性规划师与基于变压器的政策紧紧结合在一起,以便在不确定的情况下处理长视距的决策。在高层,规划员将可解释的操作员序列组合在一起,保证逻辑上与任务限制保持一致,而在低层,每个操作员则作为次级目标象征,为决定变压器提供条件,直接从原始观测中产生细微的分级行动。这个双向界面保存组合的效率和象征性推理的可解释性,同时又不牺牲深层次序列模型的适应性,并允许进行有原则的分析,以跟踪规划和执行过程中的近似差如何在等级结构中积累。在随机化的网域域域的实证研究表明,拟议的方法在成功和效率方面始终超越纯粹的象征性、纯粹的神经性和现有的分级基线,突出其对于连续任务的坚固性。

Article 143

Title@2025-05-29 (4): Risk-aware Direct Preference Optimization under Nested Risk Measure

Title: Risk-aware Direct Preference Optimization under Nested Risk Measure

Risikobewusste Direktpräferenzoptimierung unter verschachtelter Risikomaßnahme

内层风险措施下认识到风险的直接最优化 2505.20359v2

Authors: Lijun Zhang, Lin Li, Yajie Qi, Huizhong Song, Yaodong Yang, Jun Wang, Wei Wei

When fine-tuning pre-trained Large Language Models (LLMs) to align with human values and intentions, maximizing the estimated reward can lead to superior performance, but it also introduces potential risks due to deviations from the reference model’s intended behavior. Most existing methods introduce a KL divergence term to constrain deviations between the trained model and the reference model; however, this may not be sufficient in certain applications that require tight risk control. In this paper, we introduce Risk-aware Direct Preference Optimization (Ra-DPO), a novel approach that incorporates risk-awareness by employing a class of nested risk measures. This approach formulates a constrained risk-aware advantage function maximization problem and then converts the Bradley-Terry model into a token-level representation. The objective function maximizes the likelihood of the policy while suppressing the deviation between a trained model and the reference model using a sequential risk ratio, thereby enhancing the model’s risk-awareness. Experimental results on three open-source datasets (IMDb Dataset, Anthropic HH Dataset, and AlpacaEval) demonstrate the proposed method’s superior performance in balancing alignment performance and model drift. Our code is open-sourced at https://github.com/zlj123-max/Ra-DPO.
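For orientation, the Bradley-Terry preference loss that Ra-DPO starts from can be sketched as follows. This is the standard sequence-level DPO loss for a single preference pair, not the token-level, risk-aware Ra-DPO objective, which the abstract does not spell out. The function name, argument names, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the trained policy and the
    frozen reference model. NOTE: this is plain DPO, shown only as background;
    the nested-risk-measure terms of Ra-DPO are NOT implemented here.
    """
    # Implicit reward margin between the chosen and the rejected response.
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # Numerically stable -log(sigmoid(margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Usage: the loss shrinks as the policy favours the chosen response
# more strongly than the reference model does.
print(dpo_loss(-1.0, -3.0, -2.0, -2.0, beta=1.0))
```

The `beta` coefficient plays the role of the KL-style constraint strength mentioned in the abstract: larger values penalise drift from the reference model more heavily.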

在微调预训练大型语言模型(LLM)以使其与人类价值观和意图对齐时,最大化估计奖励可以带来更优的性能,但也会因偏离参考模型的预期行为而带来潜在风险。大多数现有方法引入KL散度来约束训练模型与参考模型之间的偏差;然而,在某些需要严格风险控制的应用中,这可能并不充分。在本文中,我们提出了风险感知直接偏好优化(Ra-DPO),这是一种通过采用一类嵌套风险度量来纳入风险意识的新方法。该方法构建了一个带约束的风险感知优势函数最大化问题,然后将Bradley-Terry模型转换为词元级表示。目标函数在最大化策略似然的同时,利用序贯风险比抑制训练模型与参考模型之间的偏差,从而增强模型的风险意识。在三个开源数据集(IMDb Dataset、Anthropic HH Dataset和AlpacaEval)上的实验结果表明,所提方法在平衡对齐性能与模型漂移方面表现更优。我们的代码开源于 https://github.com/zlj123-max/Ra-DPO。

Article 144

Title@2025-05-29 (4): OTPTO: Joint Product Selection and Inventory Optimization in Fresh E-commerce Front-End Warehouses

Title: OTPTO: Joint Product Selection and Inventory Optimization in Fresh E-commerce Front-End Warehouses

OTPTO: Gemeinsame Produktauswahl und Bestandsoptimierung in Fresh E-Commerce Front-End Warehouses

OTPTO: 在新的电子商务前端仓库中联合产品选择和清单优化 2505.23421v1

Authors: Zheming Zhang, Yan Jiang, Qingshan Li, Ai Han

In China’s competitive fresh e-commerce market, optimizing operational strategies, especially inventory management in front-end warehouses, is key to enhancing customer satisfaction and gaining a competitive edge. Front-end warehouses are placed in residential areas to ensure the timely delivery of fresh goods and are usually small. This raises the challenge of deciding which goods to stock and in what quantities, taking capacity constraints into account. Traditional predict-then-optimize (PTO) methods, which predict sales and then decide on inventory, often fail to align prediction with inventory goals and fail to prioritize consumer satisfaction. This paper proposes a multi-task Optimize-then-Predict-then-Optimize (OTPTO) approach that jointly optimizes product selection and inventory management, aiming to increase consumer satisfaction by maximizing the full order fulfillment rate. Our method employs a 0-1 mixed integer programming model OM1 to determine historically optimal inventory levels, and then uses a product selection model PM1 and a stocking model PM2 for prediction. The combined results are further refined through a post-processing algorithm OM2. Experimental results from JD.com’s 7Fresh platform demonstrate the robustness and significant advantages of our OTPTO method. Compared to the PTO approach, our OTPTO method substantially enhances the full order fulfillment rate by 4.34% (a relative increase of 7.05%) and narrows the gap to the optimal full order fulfillment rate by 5.27%. These findings substantiate the efficacy of the OTPTO method in managing inventory at front-end warehouses of fresh e-commerce platforms and provide valuable insights for future research in this domain.
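The target metric, the full order fulfillment rate, can be illustrated with a small sketch. The abstract does not give its exact definition, so the sketch assumes that an order counts as fulfilled only if every requested item is in stock in sufficient quantity, and that unfulfilled orders consume no stock. The function name and the SKU names are hypothetical.

```python
from collections import Counter


def full_order_fulfillment_rate(orders, stock):
    """Fraction of orders that can be served completely from `stock`.

    `orders` is a list of {sku: quantity} dicts processed in arrival order;
    `stock` is a {sku: quantity} dict. ASSUMPTION: an order that cannot be
    served in full is dropped entirely and consumes nothing.
    """
    remaining = Counter(stock)  # missing SKUs count as zero stock
    fulfilled = 0
    for order in orders:
        if all(remaining[sku] >= qty for sku, qty in order.items()):
            for sku, qty in order.items():
                remaining[sku] -= qty
            fulfilled += 1
    return fulfilled / len(orders) if orders else 0.0


# Usage: the third order fails because eggs are sold out and tofu is not stocked.
orders = [{"milk": 1, "eggs": 1}, {"milk": 1}, {"eggs": 1, "tofu": 1}]
print(full_order_fulfillment_rate(orders, {"milk": 2, "eggs": 1}))
```

Because one missing item fails the whole order, product selection and stocking quantities interact, which is why the abstract argues for optimizing them jointly.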

在中国具有竞争力的新电子商务市场中,优化业务战略,特别是前端仓库的库存管理,是提高客户满意度和获得竞争优势的关键。前端仓库位于住宅区,以确保及时交付新鲜货物,通常规模较小。这带来了确定哪些货物储存和数量的挑战,同时考虑到能力限制。为了解决这一问题,传统的预测-即时优化(PTO)方法预测销售,然后决定库存,往往不与库存目标的预测保持一致,也没有优先考虑消费者满意度。本文建议采用多任务优化-即时优化-即时优化(OTPTO)方法,共同优化产品选择和库存管理(OTTO)方法,目的是通过最大限度地实现全订单完成率提高消费者满意度。我们的方法采用了0-1混合的OM1编程编程模型来确定历史最佳库存水平,然后使用产品选择模型PM1和库存储存模式PMM2来进行预测。通过后期电子处理算方法进一步改进了消费者满意度。7-OM2的当前最佳优化方法,通过JDOTO方法提升了我们前端汇率的完整排序。

Article 145

Title@2025-05-29 (4): Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition

Title: Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition

Probeneffiziente menschliche Bewertung großer Sprachmodelle durch maximalen Diskrepanzwettbewerb

通过最大差异竞争对大语言模式进行抽样有效人力评价 2404.08008v2

Authors: Kehua Feng, Keyan Ding, Hongzhi Tan, Kede Ma, Zhihua Wang, Shuangquan Guo, Yuzhou Cheng, Ge Sun, Guozhou Zheng, Qiang Zhang, Huajun Chen

Reliable evaluation of large language models (LLMs) is impeded by two key challenges: objective metrics often fail to reflect human perception of natural language, and exhaustive human labeling is prohibitively expensive. Here, we propose a sample-efficient human evaluation method for LLMs based on the principle of MAximum Discrepancy (MAD) Competition. Our method automatically and adaptively selects a compact set of input instructions that maximize semantic discrepancy between pairs of LLM responses. Human evaluators then perform three-alternative forced choices on these paired responses, which are aggregated into a global ranking using Elo rating. We apply our approach to compare eight widely used LLMs across four tasks: scientific knowledge understanding, mathematical reasoning, creative and functional writing, and code generation and explanation. Experimental results show that our sample-efficient evaluation method recovers “gold-standard” model rankings with a handful of MAD-selected instructions, reveals respective strengths and weaknesses of each LLM, and offers nuanced insights to guide future LLM development. Code is available at https://github.com/weiji-Feng/MAD-Eval.
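The aggregation step can be sketched with the classic Elo update. The encoding of the three-alternative forced choice as scores 1.0 (first model preferred), 0.5 (tie), and 0.0 (second model preferred), the K-factor of 32, and the initial rating of 1000 are all assumptions; the abstract does not state the exact Elo variant used.

```python
def expected_score(r_a, r_b):
    """Elo-predicted score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_ratings(outcomes, k=32.0, initial=1000.0):
    """Aggregate pairwise human judgments into ratings.

    `outcomes` is an iterable of (model_a, model_b, score_a) tuples with
    score_a in {1.0, 0.5, 0.0}. K-factor and initial rating are
    illustrative defaults, not values taken from the paper.
    """
    ratings = {}
    for a, b, score_a in outcomes:
        r_a = ratings.setdefault(a, initial)
        r_b = ratings.setdefault(b, initial)
        e_a = expected_score(r_a, r_b)
        # Zero-sum update: what model a gains, model b loses.
        ratings[a] = r_a + k * (score_a - e_a)
        ratings[b] = r_b - k * (score_a - e_a)
    return ratings


# Usage with hypothetical model names.
judgments = [("model_x", "model_y", 1.0), ("model_y", "model_z", 0.5)]
ranking = sorted(elo_ratings(judgments).items(), key=lambda kv: kv[1], reverse=True)
print(ranking)
```

Sequential Elo depends on the order of comparisons, so a practical pipeline would typically shuffle the judgments and average over several passes.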

对大型语言模型(LLMS)的可靠评价受到两大挑战的阻碍:客观指标往往不能反映人类对自然语言的看法,而详尽的人类标签则极其昂贵。在这里,我们提议根据Meximum差异(MAD)竞争原则,对LLMS进行抽样有效的人类评价。我们的方法自动和适应性地选择一套紧凑的投入指示,最大限度地扩大LLM对口答复之间的语义差异。然后,人类评价者对这些配对反应进行三种选择性强迫选择,然后用Elo等级汇总为全球排名。我们采用我们的方法,将八种广泛使用的LLMS对四大任务进行比较:科学知识理解、数学推理、创造性和功能性写作、代码生成和解释。实验结果表明,我们的抽样有效评价方法恢复了“古老标准”模型的排名,并有少数MAD选定的指示,揭示了每个LM的长处和短处,并提供了细微的洞察见解,以指导未来的LM发展。代码可在https://github.com/weiji-Feng/MAD-Eval查阅。

Article 146

Title@2025-05-29 (4): Robustness-Congruent Adversarial Training for Secure Machine Learning Model Updates

Title: Robustness-Congruent Adversarial Training for Secure Machine Learning Model Updates

Robustheitskongruente Adversarial Training für sicheres maschinelles Lernen Modellaktualisierungen

安全机器学习模型更新的强力和共性安全机器学习模型自动培训 2402.17390v2

Authors: Daniele Angioni, Luca Demetrio, Maura Pintor, Luca Oneto, Davide Anguita, Battista Biggio, Fabio Roli

Machine-learning models demand periodic updates to improve their average accuracy, exploiting novel architectures and additional data. However, a newly updated model may commit mistakes the previous model did not make. Such misclassifications are referred to as negative flips, experienced by users as a regression of performance. In this work, we show that this problem also affects robustness to adversarial examples, hindering the development of secure model update practices. In particular, when updating a model to improve its adversarial robustness, previously ineffective adversarial attacks on some inputs may become successful, causing a regression in the perceived security of the system. We propose a novel technique, named robustness-congruent adversarial training, to address this issue. It amounts to fine-tuning a model with adversarial training, while constraining it to retain higher robustness on the samples for which no adversarial example was found before the update. We show that our algorithm and, more generally, learning with non-regression constraints, provides a theoretically-grounded framework to train consistent estimators. Our experiments on robust models for computer vision confirm that both accuracy and robustness, even if improved after model update, can be affected by negative flips, and our robustness-congruent adversarial training can mitigate the problem, outperforming competing baseline methods.
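The notion of a negative flip can be made concrete with a short sketch: a sample counts as a negative flip when the old model handled it correctly and the updated model does not. The function name is illustrative; the same counting applies to robustness if "correct" is read as "no adversarial example was found".

```python
def negative_flip_rate(y_true, old_pred, new_pred):
    """Fraction of samples the old model got right and the new model gets wrong."""
    if not y_true:
        return 0.0
    flips = sum(
        1
        for y, old, new in zip(y_true, old_pred, new_pred)
        if old == y and new != y
    )
    return flips / len(y_true)


# Usage: both models have 75% accuracy here, yet one sample regresses.
y_true = [0, 1, 1, 0]
old_pred = [0, 1, 0, 0]
new_pred = [0, 0, 1, 0]
print(negative_flip_rate(y_true, old_pred, new_pred))
```

The example shows why average accuracy alone hides the problem: the update fixes one sample and breaks another, leaving the headline metric unchanged.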

机器学习模型要求定期更新,以提高其平均准确性,利用新建筑和额外数据。然而,新更新的模型可能犯前一个模型没有犯过的错误。这种错误分类被称作负翻转,用户作为业绩的倒退经历。在这项工作中,我们表明,这一问题还影响到对抗性实例的稳健性,妨碍制定安全的模型更新做法。特别是,在更新一个模型以提高其对抗性强力的模型时,以前对一些投入的无效对抗性攻击可能会成功,导致对系统安全感知的倒退。我们为解决这一问题提出了一种叫作 “ 稳健的对立式培训 “ 的新技术。它相当于用对抗性培训对模型进行微调,同时限制它在更新前没有找到对抗性范例的样本上保持更高的稳健性。我们表明,我们的算法,更一般地说,在不倒退的限制下学习,为培训一致的估算者提供了一个有理论基础的框架。我们关于稳健的计算机视觉模型的实验证实,既准确又稳健又稳健,即使在更新模型后改进了对立性基准,也会受到消极性的影响。

Article 147

Title@2025-05-29 (4): Privacy Amplification by Structured Subsampling for Deep Differentially Private Time Series Forecasting

Title: Privacy Amplification by Structured Subsampling for Deep Differentially Private Time Series Forecasting

Datenschutzverstärkung durch strukturierte Subsampling für tief differential private Zeitreihen Forecasting

以结构化的分抽样对深相异私人时间序列预测进行隐私放大 2502.02410v2

Authors: Jan Schuchardt, Mina Dalirrooyfard, Jed Guzelkabaagac, Anderson Schneider, Yuriy Nevmyvaka, Stephan Günnemann

Many forms of sensitive data, such as web traffic, mobility data, or hospital occupancy, are inherently sequential. The standard method for training machine learning models while ensuring privacy for units of sensitive information, such as individual hospital visits, is differentially private stochastic gradient descent (DP-SGD). However, we observe in this work that the formal guarantees of DP-SGD are incompatible with time-series-specific tasks like forecasting, since they rely on the privacy amplification attained by training on small, unstructured batches sampled from an unstructured dataset. In contrast, batches for forecasting are generated by (1) sampling sequentially structured time series from a dataset, (2) sampling contiguous subsequences from these series, and (3) partitioning them into context and ground-truth forecast windows. We theoretically analyze the privacy amplification attained by this structured subsampling to enable the training of forecasting models with sound and tight event- and user-level privacy guarantees. Towards more private models, we additionally prove how data augmentation amplifies privacy in self-supervised training of sequence models. Our empirical evaluation demonstrates that amplification by structured subsampling enables the training of forecasting models with strong formal privacy guarantees.
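The three-stage batch construction described above can be sketched as follows. This illustrates only the structured subsampling, not the DP-SGD gradient clipping and noising, and it assumes sampling series without replacement and uniformly random window starts; the abstract does not specify these details. All names are illustrative.

```python
import random


def sample_forecast_batch(dataset, batch_size, context_len, forecast_len, rng):
    """Build one training batch of (context, target) pairs.

    (1) sample whole series from the dataset, (2) sample one contiguous
    subsequence per series, (3) split it into context and forecast windows.
    ASSUMPTION: series are drawn without replacement and window starts
    are uniform.
    """
    window = context_len + forecast_len
    eligible = [series for series in dataset if len(series) >= window]
    batch = []
    for series in rng.sample(eligible, batch_size):            # step (1)
        start = rng.randrange(len(series) - window + 1)        # step (2)
        sub = series[start:start + window]
        batch.append((sub[:context_len], sub[context_len:]))   # step (3)
    return batch


# Usage with toy integer series; the too-short third series is skipped.
data = [list(range(10)), list(range(100, 120)), list(range(5))]
for context, target in sample_forecast_batch(data, 2, 4, 2, random.Random(0)):
    print(context, target)
```

Each stage adds randomness about whether a given sensitive element ends up in the batch, which is the source of the amplification the paper analyzes.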

许多形式的敏感数据,如网络流量、流动数据或医院占用等,本质上是相继的。培训机器学习模式的标准方法,在确保敏感信息单位隐私(如个别医院访问)的同时,确保个人隐私的机器学习模式的标准方法,有差异的私人随机梯度下降(DP-SGD)。然而,我们在这项工作中观察到,DP-SGD的正式保障与诸如预测等特定时间序列任务不相容,因为它们依赖从一个非结构化数据集抽样的小型、非结构化批次培训所实现的隐私放大。相比之下,预测的批次是通过以下方式产生的:(1) 从数据集中抽样按顺序结构排列的时间序列,(2) 取样这些序列的连续序列,(3) 将其分割到上下文和地面的预测窗口。我们从理论上分析这种结构化的子样本所实现的隐私扩展,以便能够对预测模型进行稳妥、紧的事件和用户级隐私保障的培训。为了建立更隐秘的模型,我们进一步证明数据扩充如何在自我监督的序列模型培训中增强隐私。我们的经验性评估表明,通过结构化的子模型进行严格的预测。

Article 148

Title@2025-05-29 (4): On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists

Title: On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists

On-Device Collaborative Language Modeling über eine Mischung aus Generalisten und Spezialisten

通过通识主义者和专家混合组合的在线合作语言建模 2409.13931v4

Authors: Dongyang Fan, Bettina Messmer, Nikita Doikov, Martin Jaggi

On-device LLMs have gained increasing attention for their ability to enhance privacy and provide a personalized user experience. To facilitate private learning with scarce data, Federated Learning has become a standard approach. However, it faces challenges such as computational resource heterogeneity and data heterogeneity among end users. We propose CoMiGS ($\textbf{Co}$llaborative learning with a $\textbf{Mi}$xture of $\textbf{G}$eneralists and $\textbf{S}$pecialists), the first approach to address both challenges. A key innovation of our method is the bi-level optimization formulation of the Mixture-of-Experts learning objective, where the router is optimized using a separate validation set to ensure alignment with the target distribution. We solve our objective with alternating minimization, for which we provide a theoretical analysis. Our method shares generalist experts across users while localizing a varying number of specialist experts, thereby adapting to users’ computational resources and preserving privacy. Through extensive experiments, we show that CoMiGS effectively balances general and personalized knowledge for each token generation. We demonstrate that CoMiGS remains robust against overfitting, due to the generalists’ regularizing effect, while adapting to local data through specialist expertise. We open source our codebase for collaborative LLMs.

设备端LLM因其增强隐私和提供个性化用户体验的能力而日益受到关注。为了便于利用稀缺数据进行私有学习,联邦学习已成为一种标准做法。然而,它面临终端用户之间计算资源异构和数据异构等挑战。我们提出CoMiGS(基于通才专家与专才专家混合的协作学习),这是首个同时应对这两项挑战的方法。我们方法的一项关键创新是对混合专家(Mixture-of-Experts)学习目标的双层优化表述,其中路由器使用单独的验证集进行优化,以确保与目标分布保持一致。我们用交替最小化求解该目标,并为此提供了理论分析。我们的方法在用户之间共享通才专家,同时在本地保留数量可变的专才专家,从而适应用户的计算资源并保护隐私。通过大量实验,我们表明CoMiGS在每个词元的生成中有效平衡了通用知识和个性化知识。我们证明,得益于通才专家的正则化作用,CoMiGS对过拟合保持稳健,同时通过专才专家适应本地数据。我们开源了用于协作式LLM的代码库。

Article 149

Title@2025-05-29 (4): KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction

Title: KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction

KVzip: Query-Agnostic KV Cache-Kompression mit Kontext-Rekonstruktion

KVzip: 在背景重建中压缩缓存 2505.23416v1

Authors: Jang-Hyun Kim, Jinuk Kim, Sangwoo Kwon, Jae W. Lee, Sangdoo Yun, Hyun Oh Song

Transformer-based large language models (LLMs) cache context as key-value (KV) pairs during inference. As context length grows, KV cache sizes expand, leading to substantial memory overhead and increased attention latency. This paper introduces KVzip, a query-agnostic KV cache eviction method enabling effective reuse of compressed KV caches across diverse queries. KVzip quantifies the importance of a KV pair using the underlying LLM to reconstruct original contexts from cached KV pairs, subsequently evicting pairs with lower importance. Extensive empirical evaluations demonstrate that KVzip reduces KV cache size by 3-4$\times$ and FlashAttention decoding latency by approximately 2$\times$, with negligible performance loss in question-answering, retrieval, reasoning, and code comprehension tasks. Evaluations include various models such as LLaMA3.1-8B, Qwen2.5-14B, and Gemma3-12B, with context lengths reaching up to 170K tokens. KVzip significantly outperforms existing query-aware KV eviction methods, which suffer from performance degradation even at a 90% cache budget ratio under multi-query scenarios.
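The eviction step can be sketched independently of how importance is computed. In KVzip the scores come from having the LLM reconstruct the original context from the cached pairs; that scoring is not reproduced here, and the scores below are placeholder numbers. The function name and the `keep_ratio` parameter are illustrative.

```python
def evict_kv_pairs(scores, keep_ratio):
    """Return the cache positions to keep, in their original order.

    `scores[i]` is the importance of the KV pair at position i (higher is
    more important). ASSUMPTION: scores are given; the reconstruction-based
    scoring of KVzip is not implemented in this sketch.
    """
    n = len(scores)
    if n == 0:
        return []
    n_keep = max(1, int(n * keep_ratio))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    # Restore positional order so the compressed cache stays sequential.
    return sorted(ranked[:n_keep])


# Usage: keep the top half of four cached positions.
print(evict_kv_pairs([0.9, 0.1, 0.5, 0.7], keep_ratio=0.5))  # -> [0, 3]
```

Because the scores do not depend on any particular query, the same pruned cache can be reused across queries, which is the query-agnostic property the abstract emphasizes.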

以变换器为基础的大语言缓存模型(LLMS)作为关键值(KV)在推断过程中的对等缓存环境。随着上下文长度的扩大,KV缓存规模扩大,导致大量存储管理费用,并增加关注时间。本文件介绍KVzip,这是一个查询-分析KV缓存驱逐方法,可以在不同查询中有效再利用压缩的KV缓存。KVzip量化了KV配对的重要性,使用基本的LLMM重建原始背景,从缓存的KV配对中重建原始背景,随后驱逐重要性较低的双对。广泛的实证评估表明,KVzip将KV缓存规模减少3-4美元,使KV缓存规模减少约2美元/美元,闪电控制解码拉长约2美元,在问答、检索、推理和代码理解任务中可忽略不计的绩效损失。评估包括各种模型,如LLLAMA3.1-8B、Quen2.5-14B和Gemma3-12B,其上的背景长度高达170K表示。 KVzip显著超出现有的查询-awa KV驱逐比例。在预算下,在90-V驱逐假设下,其业绩退化为90。

Article 150

Title@2025-05-29 (4): Bidirectional predictive coding

Title: Bidirectional predictive coding

Bidirektionale vorausschauende Kodierung

双向预测编码 2505.23415v1

Authors: Gaspard Oliviers, Mufeng Tang, Rafal Bogacz

Predictive coding (PC) is an influential computational model of visual learning and inference in the brain. Classical PC was proposed as a top-down generative model, where the brain actively predicts upcoming visual inputs, and inference minimises the prediction errors. Recent studies have also shown that PC can be formulated as a discriminative model, where sensory inputs predict neural activities in a feedforward manner. However, experimental evidence suggests that the brain employs both generative and discriminative inference, while unidirectional PC models show degraded performance in tasks requiring bidirectional processing. In this work, we propose bidirectional PC (bPC), a PC model that incorporates both generative and discriminative inference while maintaining a biologically plausible circuit implementation. We show that bPC matches or outperforms unidirectional models in their specialised generative or discriminative tasks, by developing an energy landscape that simultaneously suits both tasks. We also demonstrate bPC’s superior performance in two biologically relevant tasks including multimodal learning and inference with missing information, suggesting that bPC resembles biological visual inference more closely.

预测编码(PC)是大脑中视觉学习和推断的有影响力的计算模型。古典PC被建议为自上而下的基因模型,大脑积极预测即将出现的视觉输入,推断最小化预测错误。最近的研究还显示,PC可以作为一种歧视模型,感官输入以进食方式预测神经活动。然而,实验证据表明,大脑同时使用基因和歧视性推断,而单向PC模型则显示需要双向处理的任务的性能退化。在这项工作中,我们提出了双向PC(BPC),这是一种包含基因化和歧视性推断的PC(PC),同时保持生物上可信的电路执行。我们表明,BPC通过开发既适合两种任务又适合两种任务的能源景观,其单向性模型与单向性模型相匹配或超出。我们还表明,BPC在两种与生物相关的任务中表现优异,包括多式学习和与缺失信息的推断,表明BPC更接近生物视觉。

Article 151

Title@2025-05-29 (4): Identification and Optimal Nonlinear Control of Turbojet Engine Using Koopman Eigenfunction Model

Title: Identification and Optimal Nonlinear Control of Turbojet Engine Using Koopman Eigenfunction Model

Identifizierung und optimale nichtlineare Steuerung der Turbojet-Engine mit Koopman Eigenfunktionsmodell

使用 Koopman Eigen功能模型对涡轮喷气发动机进行最佳非线性识别和最佳非线性控制 2505.10438v2

Authors: David Grasev

Gas turbine engines are complex, highly nonlinear dynamical systems. Deriving their physics-based models can be challenging because it requires performance characteristics, which are not always available, and one often has to make many simplifying assumptions. In this paper, the limitations of conventional experimental methods used to derive component-level and locally linear parameter-varying models are discussed and addressed by employing identification techniques based on data collected from standard engine operation under closed-loop control. The rotor dynamics were estimated using the sparse identification of nonlinear dynamics. Subsequently, the autonomous part of the dynamics was mapped into an optimally constructed Koopman eigenfunction space. The process included eigenvalue optimization using metaheuristic algorithms and temporal projection, followed by gradient-based eigenfunction identification. The resulting Koopman model was validated against an in-house reference component-level model. A globally optimal nonlinear feedback controller and a Kalman estimator were then designed in the eigenfunction space and compared to the classical and gain-scheduled proportional-integral controllers, as well as a proposed internal model control approach. The eigenmode structure allowed targeting individual modes during the optimization process, resulting in better performance tuning. The results showed that the Koopman-based controller outperformed the other benchmark controllers in both reference tracking and disturbance rejection, under sea-level and varying flight conditions, due to its global nature.

燃气涡轮发动机是复杂的高度非线性动态系统。推导其基于物理的模型可能具有挑战性,因为这需要并非总能获得的性能特性,而且往往必须做出许多简化假设。本文讨论了用于推导部件级模型和局部线性参数变化模型的常规实验方法的局限性,并利用基于闭环控制下标准发动机运行所采集数据的辨识技术加以解决。转子动力学采用非线性动力学稀疏辨识进行估计。随后,动力学的自治部分被映射到最优构造的Koopman本征函数空间。这一过程包括使用元启发式算法和时间投影进行本征值优化,然后进行基于梯度的本征函数辨识。所得的Koopman模型对照内部参考部件级模型进行了验证。随后在本征函数空间中设计了全局最优非线性反馈控制器和卡尔曼估计器,并与经典和增益调度比例积分控制器以及所提出的内模控制方法进行了比较。本征模态结构允许在优化过程中针对单个模态,从而实现更好的性能整定。结果表明,由于其全局性质,基于Koopman的控制器在海平面和变化的飞行条件下,在参考跟踪和扰动抑制方面均优于其他基准控制器。

Article 152

Title@2025-05-29 (4): Buffer-free Class-Incremental Learning with Out-of-Distribution Detection

Title: Buffer-free Class-Incremental Learning with Out-of-Distribution Detection

Pufferfreies Klassen-Inkrementelles Lernen mit Out-of-Distribution Detection

含有扩散外检测检测的无缓缓度免费类级学习 2505.23412v1

Authors: Srishti Gupta, Daniele Angioni, Maura Pintor, Ambra Demontis, Lea Schönherr, Battista Biggio, Fabio Roli

Class-incremental learning (CIL) poses significant challenges in open-world scenarios, where models must not only learn new classes over time without forgetting previous ones but also handle inputs from unknown classes that a closed-set model would misclassify. Recent works address both issues by (i) training multi-head models using the task-incremental learning framework, and (ii) predicting the task identity employing out-of-distribution (OOD) detectors. While effective, the latter mainly relies on joint training with a memory buffer of past data, raising concerns around privacy, scalability, and increased training time. In this paper, we present an in-depth analysis of post-hoc OOD detection methods and investigate their potential to eliminate the need for a memory buffer. We find that these methods, when applied appropriately at inference time, can serve as a strong substitute for buffer-based OOD detection. We show that this buffer-free approach achieves comparable or superior performance to buffer-based methods both in terms of class-incremental learning and the rejection of unknown samples. Experimental results on CIFAR-10, CIFAR-100 and Tiny ImageNet datasets support our findings, offering new insights into the design of efficient and privacy-preserving CIL systems for open-world settings.
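How a post-hoc OOD score can stand in for a task-identity predictor may be sketched as follows. The sketch uses maximum softmax probability, one well-known post-hoc score; the abstract does not say which scores the paper evaluates, so this choice, the function names, and the rejection threshold are assumptions.

```python
import math


def max_softmax(logits):
    """Maximum softmax probability of one head (higher = more in-distribution)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # shifted for numerical stability
    return max(exps) / sum(exps)


def predict_task_and_class(head_logits, reject_threshold):
    """Pick the most confident task head, or reject the input as unknown.

    `head_logits[t]` holds the logits of the head trained on task t.
    Returns (task, class) or None when every head looks out-of-distribution.
    ASSUMPTION: MSP is the OOD score; no memory buffer or extra training
    is involved.
    """
    scores = [max_softmax(logits) for logits in head_logits]
    task = max(range(len(scores)), key=scores.__getitem__)
    if scores[task] < reject_threshold:
        return None
    logits = head_logits[task]
    return task, max(range(len(logits)), key=logits.__getitem__)


# Usage: the second head is far more confident, so it claims the sample.
print(predict_task_and_class([[0.1, 0.2, 0.0], [5.0, 0.0, 0.0]], 0.5))
```

The score is computed from each head's existing outputs at inference time, which is what makes the approach buffer-free.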

类增量学习(CIL)在开放世界情景中带来了重大挑战:模型不仅必须随时间学习新类别而不遗忘以前的类别,还必须处理来自未知类别的输入,而封闭集模型会将这些输入错误分类。最近的工作通过以下方式同时解决这两个问题:(一) 使用任务增量学习框架训练多头模型,以及(二) 利用分布外(OOD)检测器预测任务身份。后者虽然有效,但主要依赖与过去数据的记忆缓冲区联合训练,引发了对隐私、可扩展性和训练时间增加的担忧。在本文中,我们深入分析了事后(post-hoc)OOD检测方法,并研究其消除记忆缓冲区需求的潜力。我们发现,这些方法在推理阶段得到恰当应用时,可以有力地替代基于缓冲区的OOD检测。我们表明,这种无缓冲区方法在类增量学习和拒绝未知样本两方面都取得了与基于缓冲区的方法相当或更优的性能。在CIFAR-10、CIFAR-100和Tiny ImageNet数据集上的实验结果支持了我们的发现,为开放世界环境下高效且保护隐私的CIL系统设计提供了新的见解。

Article 153

Title@2025-05-29 (4): Video Editing for Audio-Visual Dubbing

Title: Video Editing for Audio-Visual Dubbing

Videobearbeitung für Audio-Visual-Dubbing

音像视频编辑 2505.23406v1

Authors: Binyamin Manela, Sharon Gannot, Ethan Fetyaya

Visual dubbing, the synchronization of facial movements with new speech, is crucial for making content accessible across different languages, enabling broader global reach. However, current methods face significant limitations. Existing approaches often generate talking faces, hindering seamless integration into original scenes, or employ inpainting techniques that discard vital visual information like partial occlusions and lighting variations. This work introduces EdiDub, a novel framework that reformulates visual dubbing as a content-aware editing task. EdiDub preserves the original video context by utilizing a specialized conditioning scheme to ensure faithful and accurate modifications rather than mere copying. On multiple benchmarks, including a challenging occluded-lip dataset, EdiDub significantly improves identity preservation and synchronization. Human evaluations further confirm its superiority, achieving higher synchronization and visual naturalness scores compared to the leading methods. These results demonstrate that our content-aware editing approach outperforms traditional generation or inpainting, particularly in maintaining complex visual elements while ensuring accurate lip synchronization.

视觉遮盖,即面部遮盖与新语言同步,对于让不同语言的内容能够无障碍使用至关重要,使全球范围更加广泛。然而,目前的方法面临巨大的限制。现有的方法往往产生说话面孔,阻碍无缝融入原始场景,或采用丢弃重要视觉信息的油漆技术,如部分隔离和照明变异。这项工作引入了EdiDub,这是一个将视觉遮盖重新配置为内容识别编辑任务的新框架。EdiDub通过利用专门设置确保忠实和准确的修改而不是仅仅复制来保存原始视频环境。在多个基准上,包括具有挑战性的隐蔽滑坡数据集,EdiDub显著改进了身份保护和同步。人类评估进一步证实了其优越性,实现了更高的同步性和视觉自然性分数,而与主要方法相比,这些结果表明,我们的内容觉编辑方法超越了传统生成或画面,特别是在保持复杂的视觉元素的同时确保准确的唇同步。

Article 154

Title@2025-05-29 (4): A Refined Analysis of UCBVI

Title: A Refined Analysis of UCBVI

Eine raffinierte Analyse von UCBVI

UCBVI的精细分析 2502.17370v2

Authors: Simone Drago, Marco Mussi, Alberto Maria Metelli

In this work, we provide a refined analysis of the UCBVI algorithm (Azar et al., 2017), improving both the bonus terms and the regret analysis. Additionally, we compare our version of UCBVI with both its original version and the state-of-the-art MVP algorithm. Our empirical validation demonstrates that improving the multiplicative constants in the bounds has significant positive effects on the empirical performance of the algorithms.

在这项工作中,我们提供了对UCBVI算法(Azar等人,2017年)的精细分析,改进了奖金条件和遗憾分析。此外,我们比较了我们的UCBVI版本及其原始版本和最新的MVP算法。我们的经验验证表明,改进界限中的多倍常数对算法的经验性表现具有重大的积极影响。

Article 155

Title@2025-05-29 (4): Closed-form Solutions: A New Perspective on Solving Differential Equations

Title: Closed-form Solutions: A New Perspective on Solving Differential Equations

Closed-form Lösungen: Eine neue Perspektive zur Lösung von Differentialgleichungen

封闭式解决办法:解决差异等量的新视角 2405.14620v3

Authors: Shu Wei, Yanjie Li, Lina Yu, Weijun Li, Min Wu, Linjun Sun, Jufeng Han, Yan Pang

The quest for analytical solutions to differential equations has traditionally been constrained by the need for extensive mathematical expertise. Machine learning methods like genetic algorithms have shown promise in this domain, but are hindered by significant computational time and the complexity of their derived solutions. This paper introduces SSDE (Symbolic Solver for Differential Equations), a novel reinforcement learning-based approach that derives symbolic closed-form solutions for various differential equations. Evaluations across a diverse set of ordinary and partial differential equations demonstrate that SSDE outperforms existing machine learning methods, delivering superior accuracy and efficiency in obtaining analytical solutions.

对不同方程式的分析性解决办法的寻求历来受到对广泛数学专门知识需要的制约,基因算法等机械学习方法在这一领域显示了希望,但受到大量计算时间及其衍生解决方案复杂性的阻碍,本文件介绍了SDE(不同等式的Symbolic Solveer),这是一种新型强化学习法,为各种差异方程式提供象征性的封闭式解决方案。对各种普通和部分差异方程式的评价表明,SSSDE优于现有机器学习方法,在获得分析解决方案方面提供了更高的准确性和效率。

Article 156

Title@2025-05-29 (4): Subgroups Matter for Robust Bias Mitigation

Title: Subgroups Matter for Robust Bias Mitigation

Untergruppen Materie für robuste Bias Mitigation

稳健的Biust Bias 减轻风险的分组事项 2505.21363v2

Authors: Anissa Alloula, Charles Jones, Ben Glocker, Bartłomiej W. Papież

Despite the constant development of new bias mitigation methods for machine learning, no method consistently succeeds, and a fundamental question remains unanswered: when and why do bias mitigation techniques fail? In this paper, we hypothesise that a key factor may be the often-overlooked but crucial step shared by many bias mitigation methods: the definition of subgroups. To investigate this, we conduct a comprehensive evaluation of state-of-the-art bias mitigation methods across multiple vision and language classification tasks, systematically varying subgroup definitions, including coarse, fine-grained, intersectional, and noisy subgroups. Our results reveal that subgroup choice significantly impacts performance, with certain groupings paradoxically leading to worse outcomes than no mitigation at all. Our findings suggest that observing a disparity between a set of subgroups is not a sufficient reason to use those subgroups for mitigation. Through theoretical analysis, we explain these phenomena and uncover a counter-intuitive insight that, in some cases, improving fairness with respect to a particular set of subgroups is best achieved by using a different set of subgroups for mitigation. Our work highlights the importance of careful subgroup definition in bias mitigation and presents it as an alternative lever for improving the robustness and fairness of machine learning models.

尽管不断为机器学习开发新的减少偏见方法,但没有方法始终成功,还有一个根本问题仍未解答:减少偏见技术何时和为何失败?在本文中,我们假设一个关键因素可能是许多减轻偏见方法(即分组的定义)经常被忽略但至关重要的步骤:分组的定义。为了调查这一点,我们全面评估了多种愿景和语言分类任务中最先进的减轻偏见方法,系统化的不同分组定义,包括粗糙、细细的、交叉的和吵闹的分组。我们的结果显示分组选择对业绩产生了重大影响,某些分组的偏差导致的结果反常,而不是完全没有缓解。我们的调查结果表明,观察一组分组之间的差异并不是利用这些分组进行减缓的充分理由。我们通过理论分析,解释这些现象并找出反直觉的洞察力,即在某些情况下,通过使用不同的分组来改进对特定分组的公平性是最好的办法。我们的工作强调了审慎的分组定义在减轻偏见方面的重要性,并把它作为提高机器学习模型的稳健性和公正性的替代杠杆。

Article 157

Title@2025-05-29 (4): Unraveling the Interplay between Carryover Effects and Reward Autocorrelations in Switchback Experiments

Title: Unraveling the Interplay between Carryover Effects and Reward Autocorrelations in Switchback Experiments

Entschlüsselung des Interplays zwischen Übertragungseffekten und Belohnungsautokorrelationen in Switchback-Experimenten

在回转实验中解开结转效应与回转回实验中回调自动关系之间的交互作用 2403.17285v3

Authors: Qianglin Wen, Chengchun Shi, Ying Yang, Niansheng Tang, Hongtu Zhu

A/B testing has become the gold standard for policy evaluation in modern technological industries. Motivated by the widespread use of switchback experiments in A/B testing, this paper conducts a comprehensive comparative analysis of various switchback designs in Markovian environments. Unlike many existing works which derive the optimal design based on specific and relatively simple estimators, our analysis covers a range of state-of-the-art estimators developed in the reinforcement learning (RL) literature. It reveals that the effectiveness of different switchback designs depends crucially on (i) the size of the carryover effect and (ii) the auto-correlations among reward errors over time. Meanwhile, these findings are estimator-agnostic, i.e., they apply to most RL estimators. Based on these insights, we provide a workflow to offer guidelines for practitioners on designing switchback experiments in A/B testing.

A/B测试已成为现代技术产业政策评价的黄金标准,由于在A/B测试中广泛使用回转实验,本文件对Markovian环境中的各种回转设计进行了全面比较分析。与许多现有工程不同,这些工程根据具体和相对简单的估测器得出最佳设计,我们的分析涵盖在强化学习文献(RL)中开发的一系列最先进的估计器。它显示,不同的回转设计的有效性关键取决于(一) 转转效应的大小和(二) 随时间推移奖励错误之间的自动反差。同时,这些结论是估计式的,也就是说,这些结果适用于大多数RL估计器。基于这些认识,我们提供了一个工作流程,为A/B测试中设计回转实验的从业者提供指导方针。
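As a rough illustration of what a switchback design looks like, the sketch below alternates treatment and control every `block_len` periods and computes a naive difference in means. The function names and the estimator are assumptions made for illustration only; the abstract's RL-based estimators are not reproduced here, and a plain difference in means ignores exactly the carryover effects and reward autocorrelations that the paper analyses.

```python
def switchback_assignment(num_periods, block_len, first_arm=1):
    """Alternate between arm 1 (treatment) and arm 0 (control) every block_len periods.

    Hypothetical helper: shorter blocks mean more frequent switching.
    """
    if block_len < 1:
        raise ValueError("block_len must be at least 1")
    return [(first_arm + (t // block_len)) % 2 for t in range(num_periods)]


def difference_in_means(rewards, arms):
    """Naive effect estimate: mean treated reward minus mean control reward."""
    treated = [r for r, a in zip(rewards, arms) if a == 1]
    control = [r for r, a in zip(rewards, arms) if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


arms = switchback_assignment(num_periods=6, block_len=2)
estimate = difference_in_means([2.0, 2.0, 1.0, 1.0, 2.0, 2.0], arms)
```

Varying `block_len` is the design choice under study: the abstract reports that the best choice depends on the size of the carryover effect and on how reward errors correlate over time.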

Article 158

Title@2025-05-29 (4): Dynamic Estimation Loss Control in Variational Quantum Sensing via Online Conformal Inference

Title: Dynamic Estimation Loss Control in Variational Quantum Sensing via Online Conformal Inference

Dynamische Abschätzungsverlustkontrolle bei der variationalen Quantensensing über Online-Konforme Inferenz

通过在线非正式推断在变化量测量中动态估计损失控制 2505.23389v1

Authors: Ivana Nikoloska, Hamdi Joudeh, Ruud van Sloun, Osvaldo Simeone

Quantum sensing exploits non-classical effects to overcome limitations of classical sensors, with applications ranging from gravitational-wave detection to nanoscale imaging. However, practical quantum sensors built on noisy intermediate-scale quantum (NISQ) devices face significant noise and sampling constraints, and current variational quantum sensing (VQS) methods lack rigorous performance guarantees. This paper proposes an online control framework for VQS that dynamically updates the variational parameters while providing deterministic error bars on the estimates. By leveraging online conformal inference techniques, the approach produces sequential estimation sets with a guaranteed long-term risk level. Experiments on a quantum magnetometry task confirm that the proposed dynamic VQS approach maintains the required reliability over time, while still yielding precise estimates. The results demonstrate the practical benefits of combining variational quantum algorithms with online conformal inference to achieve reliable quantum sensing on NISQ devices.

量子遥感利用非古典效应来克服古典传感器的局限性,其应用范围从引力波探测到纳米级成像等,然而,在噪音和取样装置上建立的实际量子传感器面临重大的噪音和取样限制,而目前的变量测量方法缺乏严格的性能保障。本文建议为VQS建立一个在线控制框架,以动态更新变量参数,同时在估算中提供确定性误差栏。通过利用在线符合性推论技术,该方法产生有保障长期风险水平的顺序估算数据集。量子磁测量任务实验证实,拟议的VQS动态方法在一段时间内保持必要的可靠性,同时仍然得出准确的估计数。结果显示,将变量算法与在线一致性推导法相结合,以在 NISQ设备上实现可靠的量子测量的实际好处。
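The long-term risk guarantee mentioned above can be illustrated with a generic online conformal update. This is a minimal sketch of the standard additive threshold update, not the paper's VQS controller: the function name, the scalar radius, and the step size `lr` are all assumptions.

```python
def online_conformal_radii(errors, alpha=0.1, lr=0.05, r0=1.0):
    """Track an interval radius so that the long-run miss rate approaches alpha.

    errors[t] is the absolute estimation error revealed after step t.
    Returns the radius used at each step and the 0/1 miss indicators.
    """
    radius, radii, misses = r0, [], []
    for err in errors:
        radii.append(radius)
        miss = 1.0 if err > radius else 0.0
        misses.append(miss)
        # Widen the set after a miss, shrink it slightly after a hit.
        radius = radius + lr * (miss - alpha)
    return radii, misses


# Deterministic toy error sequence with values in {0.0, 0.1, ..., 0.9}.
errors = [((7 * t) % 10) / 10 for t in range(2000)]
radii, misses = online_conformal_radii(errors)
miss_rate = sum(misses) / len(misses)
```

Because each update adds `lr * (miss - alpha)`, the total of `miss - alpha` over time equals the net change in the radius divided by `lr`. For bounded errors the radius stays bounded, so the average miss rate converges to `alpha` without any distributional assumption.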

Article 159

Title@2025-05-29 (4): BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction

Title: BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction

BatteryLife: Ein umfassender Datensatz und Benchmark für die Vorhersage der Akkulaufzeit

电池寿命:电池寿命预测综合数据集和基准 2502.18807v4

Authors: Ruifeng Tan, Weixiang Hong, Jiayue Tang, Xibin Lu, Ruijun Ma, Xiang Zheng, Jia Li, Jiaqiang Huang, Tong-Yi Zhang

Battery Life Prediction (BLP), which relies on time series data produced by battery degradation tests, is crucial for battery utilization, optimization, and production. Despite impressive advancements, this research area faces three key challenges. Firstly, the limited size of existing datasets impedes insights into modern battery life data. Secondly, most datasets are restricted to small-capacity lithium-ion batteries tested under a narrow range of diversity in labs, raising concerns about the generalizability of findings. Thirdly, inconsistent and limited benchmarks across studies obscure the effectiveness of baselines and leave it unclear if models popular in other time series fields are effective for BLP. To address these challenges, we propose BatteryLife, a comprehensive dataset and benchmark for BLP. BatteryLife integrates 16 datasets, offering a 2.5 times sample size compared to the previous largest dataset, and provides the most diverse battery life resource with batteries from 8 formats, 59 chemical systems, 9 operating temperatures, and 421 charge/discharge protocols, including both laboratory and industrial tests. Notably, BatteryLife is the first to release battery life datasets of zinc-ion batteries, sodium-ion batteries, and industry-tested large-capacity lithium-ion batteries. With the comprehensive dataset, we revisit the effectiveness of baselines popular in this and other time series fields. Furthermore, we propose CyclePatch, a plug-in technique that can be employed in various neural networks. Extensive benchmarking of 18 methods reveals that models popular in other time series fields can be unsuitable for BLP, and CyclePatch consistently improves model performance establishing state-of-the-art benchmarks. Moreover, BatteryLife evaluates model performance across aging conditions and domains. BatteryLife is available at https://github.com/Ruifeng-Tan/BatteryLife.

电池寿命预测(BLP)依赖电池退化测试产生的时间序列数据,对电池的使用、优化和生产至关重要。尽管已取得令人瞩目的进展,这一研究领域仍面临三大挑战。首先,现有数据集规模有限,妨碍了对现代电池寿命数据的深入认识。其次,大多数数据集仅限于在实验室中、多样性范围较窄的条件下测试的小容量锂离子电池,使人们对研究结论的可推广性产生担忧。第三,各项研究的基准不一致且有限,模糊了基线方法的有效性,也使人们不清楚在其他时间序列领域流行的模型是否对BLP有效。为应对这些挑战,我们提出BatteryLife,一个面向BLP的综合数据集和基准。BatteryLife整合了16个数据集,样本量是此前最大数据集的2.5倍,并提供了最为多样化的电池寿命资源,涵盖8种规格、59种化学体系、9种工作温度和421种充放电协议的电池,包括实验室测试和工业测试。值得注意的是,BatteryLife首次发布了锌离子电池、钠离子电池以及经工业测试的大容量锂离子电池的寿命数据集。借助这一综合数据集,我们重新审视了在本领域及其他时间序列领域流行的基线方法的有效性。此外,我们提出CyclePatch,一种可用于多种神经网络的即插即用技术。对18种方法的广泛基准测试表明,在其他时间序列领域流行的模型可能并不适用于BLP,而CyclePatch能够持续提升模型性能,建立了最先进的基准。此外,BatteryLife还评估了模型在不同老化条件和不同领域下的性能。BatteryLife可在 https://github.com/Ruifeng-Tan/BatteryLife 获取。

Article 160

Title@2025-05-29 (4): A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers

Title: A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers

Eine statistische Lernperspektive zur halbdualen Neural Optimal Transport Solvers

半对半对半的神经神经优化运输解决方案的统计学习视角 2502.01310v2

Authors: Roman Tarasov, Petr Mokrov, Milena Gazdieva, Evgeny Burnaev, Alexander Korotin

Neural network-based optimal transport (OT) is a recent and fruitful direction in the generative modeling community. It finds applications in various fields such as domain translation, image super-resolution, and computational biology. Among the existing OT approaches, adversarial minimax solvers based on semi-dual formulations of OT problems are of considerable interest. While promising, these methods lack theoretical investigation from a statistical learning perspective. Our work fills this gap by establishing upper bounds on the generalization error of an approximate OT map recovered by the minimax quadratic OT solver. Importantly, the bounds we derive depend solely on some standard statistical and mathematical properties of the considered functional classes (neural nets). While our analysis focuses on the quadratic OT, we believe that similar bounds could be derived for the general OT case, paving a promising direction for future research.

以神经网络为基础的最佳运输(OT)是基因模型界最近的一个富有成果的方向。它发现它在各个领域的应用,例如域译、图像超分辨率、计算生物学和其他领域。在现有OT方法中,相当感兴趣的是基于半双向的OT问题配方的对抗性微轴求解器。这些方法虽然很有希望,但缺乏从统计学习角度进行理论调查的理论性研究。我们的工作填补了这一空白,确定了迷你麦角四极OT求解器所恢复的近似OT地图的概括性误差的上限。重要的是,我们获得的界限完全取决于所考虑的功能类(内网)的一些标准统计和数学特性。虽然我们的分析侧重于二次式OT,但我们认为,一般OT案例可以得出类似的界限,为未来的研究铺平了有希望的方向。

Article 161

Title@2025-05-29 (4): Automated Modeling Method for Pathloss Model Discovery

Title: Automated Modeling Method for Pathloss Model Discovery

Automatisierte Modellierungsmethode für Pathloss Model Discovery

路径损耗模型发现的自动化建模方法 2505.23383v1

Authors: Ahmad Anaqreh, Shih-Kai Chou, Mihael Mohorčič, Carolina Fortuna

Modeling propagation is the cornerstone for designing and optimizing next-generation wireless systems, with a particular emphasis on 5G and beyond era. Traditional modeling methods have long relied on statistic-based techniques to characterize propagation behavior across different environments. With the expansion of wireless communication systems, there is a growing demand for methods that guarantee the accuracy and interoperability of modeling. Artificial intelligence (AI)-based techniques, in particular, are increasingly being adopted to overcome this challenge, although the interpretability is not assured with most of these methods. Inspired by recent advancements in AI, this paper proposes a novel approach that accelerates the discovery of path loss models while maintaining interpretability. The proposed method automates the model formulation, evaluation, and refinement, facilitating model discovery. We evaluate two techniques: one based on Deep Symbolic Regression, offering full interpretability, and the second based on Kolmogorov-Arnold Networks, providing two levels of interpretability. Both approaches are evaluated on two synthetic and two real-world datasets. Our results show that Kolmogorov-Arnold Networks achieve R^2 values close to 1 with minimal prediction error, while Deep Symbolic Regression generates compact models with moderate accuracy. Moreover, on the selected examples, we demonstrate that automated methods outperform traditional methods, achieving up to 75% reduction in prediction errors, offering accurate and explainable solutions with potential to increase the efficiency of discovering next-generation path loss models.

传播建模是设计和优化下一代无线系统的基石,在5G及后5G时代尤为重要。传统建模方法长期依赖基于统计的技术来刻画不同环境中的传播行为。随着无线通信系统的扩展,对能够保证建模准确性和互操作性的方法的需求日益增长。尤其是基于人工智能(AI)的技术正越来越多地被用于应对这一挑战,尽管其中大多数方法无法保证可解释性。受AI最新进展的启发,本文提出一种新方法,在保持可解释性的同时加速路径损耗模型的发现。所提方法将模型的构建、评估和改进自动化,从而促进模型发现。我们评估了两种技术:一种基于深度符号回归(Deep Symbolic Regression),提供完全的可解释性;另一种基于Kolmogorov-Arnold网络,提供两个层次的可解释性。两种方法均在两个合成数据集和两个真实数据集上进行了评估。结果表明,Kolmogorov-Arnold网络取得了接近1的R^2值,预测误差极小,而深度符号回归生成的模型结构紧凑、精度适中。此外,在所选示例中,我们证明自动化方法优于传统方法,预测误差最多降低75%,提供了准确且可解释的解,有望提高发现下一代路径损耗模型的效率。

Article 162

Title@2025-05-29 (4): Tracking Progress Towards Sustainable Development Goal 6 Using Satellite Imagery

Title: Tracking Progress Towards Sustainable Development Goal 6 Using Satellite Imagery

Fortschritte beim Ziel für nachhaltige Entwicklung 6 mithilfe von Satellitenbildern verfolgen

利用卫星图像跟踪可持续发展目标6的进展情况 2411.19093v2

Authors: Othmane Echchabi, Aya Lahlou, Nizar Talty, Josh Malcolm Manto, Ka Leung Lam

Clean water and sanitation are essential for health, well-being, and sustainable development, yet significant global disparities persist. Although the United Nations’ Sustainable Development Goal (SDG) 6 clearly defines targets for universal access to clean water and sanitation, limitations in data coverage and openness impede accurate tracking of progress in many countries. To bridge these gaps, this study integrates Afrobarometer survey data, satellite imagery from Landsat 8 and Sentinel-2, and advanced deep learning techniques using Meta’s self-supervised Distillation with No Labels (DINO) model to develop a modeling framework for evaluating access to piped water and sewage systems across diverse African regions. The modeling framework achieved notable accuracy, with over 96% for piped water and 97% for sewage system access classification. When combined with geospatial population data, validation against official statistics from the United Nations Joint Monitoring Program demonstrated high concordance at the national scale (R^2 of 0.95 for piped water access and R^2 of 0.85 for sewage system access). The national-level estimates can represent SDG Indicators 6.1.1 and 6.2.1. This approach provides policymakers and stakeholders with an effective, scalable, and cost-efficient tool to pinpoint underserved areas requiring targeted intervention. The methodology developed herein can be adapted for assessing other infrastructure-related SDGs, promoting enhanced monitoring and informed decision-making towards achieving global sustainability objectives.

清洁的水和卫生设施对健康、福祉和可持续发展至关重要,但全球范围内仍存在显著差距。虽然联合国可持续发展目标(SDG)6明确规定了普遍获得清洁饮用水和卫生设施的具体目标,但数据覆盖面和开放程度的局限妨碍了对许多国家进展情况的准确跟踪。为弥合这些差距,本研究整合了非洲晴雨表(Afrobarometer)调查数据、Landsat 8和Sentinel-2的卫星图像,以及基于Meta的自监督无标签蒸馏(DINO)模型的先进深度学习技术,构建了一个建模框架,用于评估非洲不同地区自来水和污水系统的可及性。该建模框架取得了显著的准确率,其中自来水超过96%,污水系统可及性分类达97%。与地理空间人口数据相结合后,对照联合国联合监测方案官方统计数据的验证显示,国家尺度上具有高度一致性(自来水可及性的R^2为0.95,污水系统可及性的R^2为0.85)。国家层面的估计值可用于表示SDG指标6.1.1和6.2.1。这一方法为决策者和利益攸关方提供了一种有效、可扩展且具成本效益的工具,用于精确定位需要有针对性干预的服务不足地区。本文提出的方法可加以调整,用于评估其他与基础设施相关的可持续发展目标,促进加强监测和知情决策,以实现全球可持续性目标。

Article 163

Title@2025-05-29 (4): Meta-Learning Approaches for Speaker-Dependent Voice Fatigue Models

Title: Meta-Learning Approaches for Speaker-Dependent Voice Fatigue Models

Meta-Learning-Ansätze für Sprecher-Abhängige Sprachmüdigkeitsmodelle

说话人相关语音疲劳模型的元学习方法 2505.23378v1

Authors: Roseline Polle, Agnes Norbury, Alexandra Livia Georgescu, Nicholas Cummins, Stefano Goria

Speaker-dependent modelling can substantially improve performance in speech-based health monitoring applications. While mixed-effect models are commonly used for such speaker adaptation, they require computationally expensive retraining for each new observation, making them impractical in a production environment. We reformulate this task as a meta-learning problem and explore three approaches of increasing complexity: ensemble-based distance models, prototypical networks, and transformer-based sequence models. Using pre-trained speech embeddings, we evaluate these methods on a large longitudinal dataset of shift workers (N=1,185, 10,286 recordings), predicting time since sleep from speech as a function of fatigue, a symptom commonly associated with ill-health. Our results demonstrate that all meta-learning approaches tested outperformed both cross-sectional and conventional mixed-effects models, with a transformer-based method achieving the strongest performance.

说话人相关建模可以大幅提升基于语音的健康监测应用的性能。混合效应模型常用于此类说话人自适应,但每有新的观测就需要进行计算代价高昂的重新训练,因而在生产环境中不切实际。我们将该任务重新表述为元学习问题,并探索了三种复杂度递增的方法:基于集成的距离模型、原型网络和基于Transformer的序列模型。我们使用预训练的语音嵌入,在一个大型轮班工人纵向数据集(N=1,185,共10,286条录音)上评估这些方法,根据语音预测自上次睡眠以来的时间,并以此反映疲劳程度,而疲劳是一种常与健康不佳相关的症状。结果表明,所测试的所有元学习方法均优于横断面模型和传统混合效应模型,其中基于Transformer的方法表现最佳。

Article 164

Title@2025-05-29 (4): GWQ: Gradient-Aware Weight Quantization for Large Language Models

Title: GWQ: Gradient-Aware Weight Quantization for Large Language Models

GWQ: Gradient-Aware Weight Quantization für große Sprachmodelle

GWQ:面向大语言模型的梯度感知权重量化 2411.00850v4

Authors: Yihua Shao, Yan Gu, Siyu Chen, Haiyang Liu, Zixian Zhu, Zijian Ling, Minxi Yan, Ziyang Yan, Chenyu Zhang, Michele Magno, Haotong Qin, Yan Wang, Jingcai Guo, Ling Shao, Hao Tang

Large language models (LLMs) show impressive performance in solving complex language tasks. However, their large number of parameters presents significant challenges for deployment. Compressing LLMs to low bit-widths makes it possible to deploy them on resource-constrained devices. To address this problem, we propose gradient-aware weight quantization (GWQ), the first quantization approach for low-bit weight quantization that leverages gradients to localize outliers, requiring only a minimal amount of calibration data for outlier detection. GWQ preferentially retains the top 1% outlier weights at FP16 precision, while the remaining non-outlier weights are stored in a low-bit format. We extensively evaluate GWQ on different tasks, including language modeling, grounding detection, massive multitask language understanding, and vision-language question answering. Results show that models quantized by GWQ perform better than those produced by other quantization methods. During the quantization process, GWQ needs only one calibration set to achieve effective quantization. GWQ also achieves a 1.2x inference speedup compared with the original model and effectively reduces inference memory.

大语言模型(LLM)在解决复杂语言任务方面表现出色,但其庞大的参数量给部署带来了巨大挑战。将LLM压缩到低比特可以使其部署在资源受限的设备上。为此,我们提出梯度感知权重量化(GWQ),这是首个利用梯度定位离群值的低比特权重量化方法,只需极少量的校准数据即可完成离群值检测。GWQ优先以FP16精度保留前1%的离群权重,其余非离群权重则以低比特存储。我们在语言建模、指代定位检测(grounding detection)、大规模多任务语言理解以及视觉语言问答等不同任务上对GWQ进行了广泛评估。结果表明,经GWQ量化的模型优于其他量化方法。在量化过程中,GWQ只需一个校准集即可实现有效量化。此外,与原始模型相比,GWQ实现了1.2倍的推理加速,并有效降低了推理内存占用。
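The mixed-precision idea in this abstract (keep a small fraction of gradient-selected outliers at FP16, store the rest in low-bit form) can be sketched as follows. This is a hypothetical illustration, not the GWQ implementation: ranking by absolute gradient, per-tensor min-max quantization, and the function name are all assumptions.

```python
import numpy as np


def gradient_aware_quantize(weights, grads, outlier_frac=0.01, bits=4):
    """Keep the weights with the largest gradient magnitude at FP16 precision
    and uniformly quantize the remaining weights to `bits` bits (assumed scheme)."""
    w = np.asarray(weights, dtype=np.float32)
    g = np.abs(np.asarray(grads, dtype=np.float32))
    n_out = min(w.size, max(1, int(round(outlier_frac * w.size))))

    # Flag the n_out entries with the largest gradient magnitude as outliers.
    flat_idx = np.argpartition(g.ravel(), -n_out)[-n_out:]
    mask = np.zeros(w.size, dtype=bool)
    mask[flat_idx] = True
    mask = mask.reshape(w.shape)

    # Outliers bypass quantization and are stored at FP16 precision.
    fp16_copy = w.astype(np.float16).astype(np.float32)
    inliers = w[~mask]
    if inliers.size == 0:
        return fp16_copy, mask

    # Uniform min-max quantization of the non-outlier weights.
    lo, hi = float(inliers.min()), float(inliers.max())
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.clip(np.round((w - lo) / scale), 0, levels)
    dequantized = (codes * scale + lo).astype(np.float32)

    return np.where(mask, fp16_copy, dequantized), mask
```

The function returns the dequantized tensor plus the outlier mask, so the reconstruction error on inliers (at most half a quantization step) can be compared directly with the near-lossless FP16 outliers.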

Article 165

Title@2025-05-29 (4): Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

Title: Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

Das Nachdenken über die Auswahlkriterien bei der Stärkung des Lernens für LLM-Reasoning: Eine Kompetenz-Schwierigkeits-Alignment-Perspektive

重新思考在加强学习学习中为LLM 合理性提供强化学习的抽样标准:能力-困难-协调观点 2505.17652v2

Authors: Deyang Kong, Qi Guo, Xiangyu Xi, Wei Wang, Jingang Wang, Xunliang Cai, Shikun Zhang, Wei Ye

Reinforcement learning exhibits potential in enhancing the reasoning abilities of large language models, yet it is hard to scale because of the low sample efficiency during the rollout phase. Existing methods attempt to improve efficiency by scheduling problems based on problem difficulties. However, these approaches suffer from unstable and biased estimations of problem difficulty and fail to capture the alignment between model competence and problem difficulty in RL training, leading to suboptimal results. To tackle these limitations, this paper introduces Competence-Difficulty Alignment Sampling (CDAS), which enables accurate and stable estimation of problem difficulties by aggregating historical performance discrepancies of problems. Then the model competence is quantified to adaptively select problems whose difficulty is in alignment with the model’s current competence using a fixed-point system. Experimental results across a range of challenging mathematical benchmarks show that CDAS achieves great improvements in both accuracy and efficiency. CDAS attains the highest average accuracy against baselines and exhibits significant speed advantages compared to Dynamic Sampling, a competitive strategy in DAPO, which is 2.33 times slower than CDAS.

强化学习在提升大语言模型推理能力方面展现出潜力,但由于rollout阶段样本效率低,难以扩展。现有方法试图根据问题难度来调度问题,以提高效率。然而,这些方法对问题难度的估计不稳定且有偏,也未能刻画RL训练中模型能力与问题难度之间的对齐关系,导致结果次优。为克服这些局限,本文提出能力-难度对齐采样(Competence-Difficulty Alignment Sampling,CDAS),通过聚合问题的历史表现差异,实现对问题难度准确而稳定的估计。随后对模型能力进行量化,并借助一个不动点系统自适应地选择难度与模型当前能力相匹配的问题。在一系列具有挑战性的数学基准上的实验结果表明,CDAS在准确率和效率两方面都取得了显著提升。CDAS的平均准确率高于所有基线,并且与DAPO中的竞争性策略Dynamic Sampling相比具有显著的速度优势,后者比CDAS慢2.33倍。

Article 166

Title@2025-05-29 (4): Dynamic Spectral Backpropagation for Efficient Neural Network Training

Title: Dynamic Spectral Backpropagation for Efficient Neural Network Training

Dynamische Spektral-Backpropagation für effizientes Neural-Netzwerk-Training

面向高效神经网络训练的动态谱反向传播 2505.23369v1

Authors: Mannmohan Muthuraman

Dynamic Spectral Backpropagation (DSBP) enhances neural network training under resource constraints by projecting gradients onto principal eigenvectors, reducing complexity and promoting flat minima. Five extensions are proposed (dynamic spectral inference, spectral architecture optimization, spectral meta learning, spectral transfer regularization, and Lie algebra inspired dynamics) to address challenges in robustness, few-shot learning, and hardware efficiency. Supported by a third order stochastic differential equation (SDE) and a PAC Bayes limit, DSBP outperforms Sharpness Aware Minimization (SAM), Low Rank Adaptation (LoRA), and Model Agnostic Meta Learning (MAML) on CIFAR 10, Fashion MNIST, MedMNIST, and Tiny ImageNet, as demonstrated through extensive experiments and visualizations. Future work focuses on scalability, bias mitigation, and ethical considerations.

在资源限制下,动态光谱反后推进(DSBP)通过预测主要成分器的梯度、降低复杂性和促进平板微型微粒,增强神经网络培训;提议了五个扩展,即动态光谱推断、光谱结构优化、光谱元学习、光谱传输正规化和立叶代数激励动态,以应对稳健性、微小学习和硬件效率方面的挑战。在第三顺序分异差方程和PAC贝耶限制的支持下,DSBP在CIFAR 10、时装MIS、MMDMISIS和Tiny图像网络上,通过广泛的实验和视觉化来显示,DSBP优于敏化意识最小化(SAM)、低品位适应(LORA)和模型Agnictic Muta Learning(MAMML),未来工作的重点是可扩展性、减少偏见和道德考虑。
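Projecting a gradient onto a few principal eigenvectors can be sketched as below. The abstract does not say which matrix supplies the eigenvectors, so this example assumes the covariance of a layer's input activations; the function name and the parameter `k` are likewise hypothetical, and none of the five extensions is covered.

```python
import numpy as np


def project_gradient(grad, activations, k=2):
    """Project a (d_out, d_in) gradient onto the top-k eigenvectors of the
    input-activation covariance (an assumed choice of spectral basis)."""
    cov = activations.T @ activations / activations.shape[0]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(cov)
    top = eigvecs[:, -k:]          # (d_in, k) principal directions
    return grad @ top @ top.T      # orthogonal projection, rank at most k


rng = np.random.default_rng(0)
acts = rng.normal(size=(50, 6))    # 50 samples, 6 input features
grad = rng.normal(size=(4, 6))     # gradient of a 4x6 weight matrix
low_rank_grad = project_gradient(grad, acts, k=2)
```

The projected update has rank at most `k`, so only `k` directions per layer are updated. That restriction is where the reduction in complexity would come from under this reading.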

Article 167

Title@2025-05-29 (4): Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs

Title: Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs

Graph of Records: Steigerung der retrieval Augmented Generation für Langkontext-Zusammenfassung mit Graphen

记录图图:用图表进行长文本摘要的推进检索增量生成器 2410.11001v2

Authors: Haozhen Zhang, Tao Feng, Jiaxuan You

Retrieval-augmented generation (RAG) has revitalized Large Language Models (LLMs) by injecting non-parametric factual knowledge. Compared with long-context LLMs, RAG is considered an effective summarization tool in a more concise and lightweight manner, which can interact with LLMs multiple times using diverse queries to get comprehensive responses. However, the LLM-generated historical responses, which contain potentially insightful information, are largely neglected and discarded by existing approaches, leading to suboptimal results. In this paper, we propose graph of records (GoR), which leverages historical responses generated by LLMs to enhance RAG for long-context global summarization. Inspired by the retrieve-then-generate paradigm of RAG, we construct a graph by establishing an edge between the retrieved text chunks and the corresponding LLM-generated response. To further uncover the intricate correlations between them, GoR features a graph neural network and an elaborately designed BERTScore-based objective for self-supervised model training, enabling seamless supervision signal backpropagation between reference summaries and node embeddings. We comprehensively compare GoR with 12 baselines across four long-context summarization datasets, and the results indicate that our proposed method reaches the best performance (e.g., 15%, 8%, and 19% improvement over retrievers w.r.t. Rouge-L, Rouge-1, and Rouge-2 on the WCEP dataset). Extensive experiments further demonstrate the effectiveness of GoR.

检索增强生成(RAG)通过注入非参数化的事实知识,为大语言模型(LLM)注入了新的活力。与长上下文LLM相比,RAG被认为是一种更简洁、更轻量的有效摘要工具,它可以利用多样化的查询与LLM多次交互,以获得全面的回答。然而,LLM生成的历史回答虽然可能包含有价值的信息,却在很大程度上被现有方法忽视和丢弃,导致结果次优。本文提出记录图(graph of records,GoR),利用LLM生成的历史回答来增强RAG,用于长上下文全局摘要。受RAG“先检索后生成”(retrieve-then-generate)范式的启发,我们在检索到的文本块与相应的LLM生成回答之间建立边,从而构建一张图。为进一步揭示它们之间错综复杂的关联,GoR采用图神经网络,并精心设计了基于BERTScore的目标函数进行自监督模型训练,使监督信号能够在参考摘要与节点嵌入之间无缝地反向传播。我们在四个长上下文摘要数据集上将GoR与12个基线进行了全面比较,结果表明所提方法取得了最佳性能(例如,在WCEP数据集上,Rouge-L、Rouge-1和Rouge-2相对于检索器分别提升15%、8%和19%)。大量实验进一步证明了GoR的有效性。

Article 168

Title@2025-05-29 (4): Guarantees of a Preconditioned Subgradient Algorithm for Overparameterized Asymmetric Low-rank Matrix Recovery

Title: Guarantees of a Preconditioned Subgradient Algorithm for Overparameterized Asymmetric Low-rank Matrix Recovery

Garantien eines vorkonditionierten Subgradienten Algorithmus für überparameterisierte asymmetrische Low-rank Matrix Erholung

保证为超参数化的测量性对称低级矩阵恢复提供先决条件的亚梯分算法的保障 2410.16826v2

Authors: Paris Giampouras, HanQin Cai, Rene Vidal

In this paper, we focus on a matrix factorization-based approach to recover low-rank asymmetric matrices from corrupted measurements. We propose an Overparameterized Preconditioned Subgradient Algorithm (OPSA) and provide, for the first time in the literature, linear convergence rates independent of the rank of the sought asymmetric matrix in the presence of gross corruptions. Our work goes beyond existing results in preconditioned-type approaches, addressing their current limitation, i.e., the lack of convergence guarantees in the case of asymmetric matrices of unknown rank. By applying our approach to (robust) matrix sensing, we highlight its merits when the measurement operator satisfies a mixed-norm restricted isometry property. Lastly, we present extensive numerical experiments that validate our theoretical results and demonstrate the effectiveness of our approach for different levels of overparameterization and outlier corruptions.

本文关注一种基于矩阵分解的方法,用于从受污染的测量中恢复低秩非对称矩阵。我们提出过参数化预条件次梯度算法(Overparameterized Preconditioned Subgradient Algorithm,OPSA),并在文献中首次给出在存在严重污染(gross corruptions)的情况下与待求非对称矩阵的秩无关的线性收敛速率。我们的工作超越了预条件类方法的现有结果,解决了其当前的局限,即在秩未知的非对称矩阵情形下缺乏收敛性保证。通过将该方法应用于(鲁棒)矩阵感知,我们展示了当测量算子满足混合范数受限等距性质时该方法的优点。最后,我们给出了大量数值实验,验证了理论结果,并证明了该方法在不同过参数化程度和离群污染水平下的有效性。

Article 169

Title@2025-05-29 (4): Grower-in-the-Loop Interactive Reinforcement Learning for Greenhouse Climate Control

Title: Grower-in-the-Loop Interactive Reinforcement Learning for Greenhouse Climate Control

Grower-in-the-Loop Interaktives Verstärkungslernen für Greenhouse Climate Control

种植者在Loop-Loop 互动强化学习促进温室气候控制 2505.23355v1

Authors: Maxiu Xiao, Jianglin Lan, Jingxing Yu, Eldert van Henten, Congcong Sun

Climate control is crucial for greenhouse production as it directly affects crop growth and resource use. Reinforcement learning (RL) has received increasing attention in this field, but still faces challenges, including limited training efficiency and high reliance on initial learning conditions. Interactive RL, which combines human (grower) input with the RL agent’s learning, offers a potential solution to overcome these challenges. However, interactive RL has not yet been applied to greenhouse climate control and may face challenges related to imperfect inputs. Therefore, this paper aims to explore the possibility and performance of applying interactive RL with imperfect inputs into greenhouse climate control, by: (1) developing three representative interactive RL algorithms tailored for greenhouse climate control (reward shaping, policy shaping and control sharing); (2) analyzing how input characteristics are often contradicting, and how the trade-offs between them make grower’s inputs difficult to perfect; (3) proposing a neural network-based approach to enhance the robustness of interactive RL agents under limited input availability; (4) conducting a comprehensive evaluation of the three interactive RL algorithms with imperfect inputs in a simulated greenhouse environment. The demonstration shows that interactive RL incorporating imperfect grower inputs has the potential to improve the performance of the RL agent. RL algorithms that influence action selection, such as policy shaping and control sharing, perform better when dealing with imperfect inputs, achieving 8.4% and 6.8% improvement in profit, respectively. In contrast, reward shaping, an algorithm that manipulates the reward function, is sensitive to imperfect inputs and leads to a 9.4% decrease in profit. This highlights the importance of selecting an appropriate mechanism when incorporating imperfect inputs.

气候控制对温室生产至关重要,因为它直接影响作物生长和资源利用。强化学习(RL)在这一领域受到越来越多的关注,但仍面临训练效率有限、高度依赖初始学习条件等挑战。交互式RL将人(种植者)的输入与RL智能体的学习相结合,为克服这些挑战提供了一种可能的解决方案。然而,交互式RL尚未应用于温室气候控制,并且可能面临输入不完美带来的挑战。因此,本文旨在探讨在温室气候控制中应用带有不完美输入的交互式RL的可行性和性能,具体包括:(1)开发三种面向温室气候控制的代表性交互式RL算法(奖励塑形、策略塑形和控制共享);(2)分析输入的各项特性为何常常相互矛盾,以及它们之间的权衡如何使种植者的输入难以完美;(3)提出一种基于神经网络的方法,以增强交互式RL智能体在输入可用性有限时的鲁棒性;(4)在模拟温室环境中对三种带有不完美输入的交互式RL算法进行全面评估。结果表明,融入不完美种植者输入的交互式RL有潜力提升RL智能体的性能。影响动作选择的RL算法(如策略塑形和控制共享)在处理不完美输入时表现更好,利润分别提高8.4%和6.8%。相比之下,奖励塑形这一操纵奖励函数的算法对不完美输入较为敏感,导致利润下降9.4%。这凸显了在融入不完美输入时选择合适机制的重要性。
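The three mechanisms named in this abstract differ in where the grower's input enters the learning loop. The sketch below uses common textbook forms of reward shaping, policy shaping, and control sharing; the function names, the weight `beta`, and the probability `share_prob` are illustrative assumptions, not the paper's greenhouse-specific algorithms.

```python
def shaped_reward(env_reward, grower_signal, beta=0.5):
    """Reward shaping: grower feedback is added to the environment reward."""
    return env_reward + beta * grower_signal


def policy_shaping(agent_probs, grower_probs):
    """Policy shaping: multiply the agent's action distribution by the
    grower-derived distribution and renormalize."""
    combined = [a * g for a, g in zip(agent_probs, grower_probs)]
    total = sum(combined)
    if total == 0:
        return list(agent_probs)  # fall back to the agent's own policy
    return [c / total for c in combined]


def control_sharing(agent_action, grower_action, u, share_prob=0.3):
    """Control sharing: with probability share_prob the grower's action overrides
    the agent's. `u` is a uniform draw in [0, 1) supplied by the caller."""
    if grower_action is not None and u < share_prob:
        return grower_action
    return agent_action
```

Policy shaping and control sharing act on action selection, while reward shaping changes the learning signal itself. That would be one way to read the reported sensitivity of reward shaping to imperfect inputs: a wrong grower signal is folded into every later value estimate instead of affecting a single action.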

Article 170

Title@2025-05-29 (4): ChatHuman: Chatting about 3D Humans with Tools

Title: ChatHuman: Chatting about 3D Humans with Tools

ChatHuman: Chatten über 3D-Menschen mit Tools

ChatHuman:借助工具谈论3D人体 2405.04533v2

Authors: Jing Lin, Yao Feng, Weiyang Liu, Michael J. Black

Numerous methods have been proposed to detect, estimate, and analyze properties of people in images, including 3D pose, shape, contact, human-object interaction, and emotion. While widely applicable in vision and other areas, such methods require expert knowledge to select, use, and interpret the results. To address this, we introduce ChatHuman, a language-driven system that integrates the capabilities of specialized methods into a unified framework. ChatHuman functions as an assistant proficient in utilizing, analyzing, and interacting with tools specific to 3D human tasks, adeptly discussing and resolving related challenges. Built on a Large Language Model (LLM) framework, ChatHuman is trained to autonomously select, apply, and interpret a diverse set of tools in response to user inputs. Our approach overcomes significant hurdles in adapting LLMs to 3D human tasks, including the need for domain-specific knowledge and the ability to interpret complex 3D outputs. The innovations of ChatHuman include leveraging academic publications to instruct the LLM on tool usage, employing a retrieval-augmented generation model to create in-context learning examples for managing new tools, and effectively discriminating between and integrating tool results by transforming specialized 3D outputs into comprehensible formats. Experiments demonstrate that ChatHuman surpasses existing models in both tool selection accuracy and overall performance across various 3D human tasks, and it supports interactive chatting with users. ChatHuman represents a significant step toward consolidating diverse analytical methods into a unified, robust system for 3D human tasks.

已有许多方法被提出,用于检测、估计和分析图像中人物的各种属性,包括3D姿态、体型、接触、人-物交互和情绪。这些方法虽然在视觉及其他领域具有广泛的适用性,但需要专业知识才能选择、使用并解读其结果。为此,我们提出ChatHuman,一个由语言驱动的系统,将各类专用方法的能力整合到统一的框架中。ChatHuman充当一名助手,擅长使用、分析3D人体任务的专用工具并与之交互,能够熟练地讨论和解决相关难题。ChatHuman建立在大语言模型(LLM)框架之上,经过训练后能够根据用户输入自主选择、应用并解读多种工具。我们的方法克服了将LLM适配到3D人体任务时的重大障碍,包括对领域知识的需求以及解读复杂3D输出的能力。ChatHuman的创新之处包括:利用学术论文指导LLM使用工具;采用检索增强生成模型为新工具的使用创建上下文学习示例;以及通过将专用的3D输出转换为易于理解的格式,有效地甄别并整合工具结果。实验表明,ChatHuman在工具选择准确率和各类3D人体任务的整体性能上均超越现有模型,并支持与用户进行交互式对话。ChatHuman是朝着将多种分析方法整合为统一、稳健的3D人体任务系统迈出的重要一步。

Article 171

Title@2025-05-29 (4): BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Behavioural Change

Title: BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Behavioural Change

BAH-Datensatz für Ambivalenz/Hesitanzerkennung in Videos für Verhaltensänderungen

BAH数据集:面向行为改变的视频矛盾/犹豫识别 2505.19328v2

Authors: Manuela González-González, Soufiane Belharbi, Muhammad Osama Zeeshan, Masoumeh Sharafi, Muhammad Haseeb Aslam, Marco Pedersoli, Alessandro Lameiras Koerich, Simon L Bacon, Eric Granger

Recognizing complex emotions linked to ambivalence and hesitancy (A/H) can play a critical role in the personalization and effectiveness of digital behaviour change interventions. These subtle and conflicting emotions are manifested by a discord between multiple modalities, such as facial and vocal expressions, and body language. Although experts can be trained to identify A/H, integrating them into digital interventions is costly and less effective. Automatic learning systems provide a cost-effective alternative that can adapt to individual users and operate seamlessly within real-time and resource-limited environments. However, there are currently no datasets available for the design of ML models to recognize A/H. This paper introduces a first Behavioural Ambivalence/Hesitancy (BAH) dataset collected for subject-based multimodal recognition of A/H in videos. It contains videos from 224 participants captured across 9 provinces in Canada, of different ages and ethnicities. Through our web platform, we recruited participants to answer 7 questions, some of which were designed to elicit A/H, while recording themselves via a webcam with a microphone. BAH amounts to 1,118 videos for a total duration of 8.26 hours with 1.5 hours of A/H. Our behavioural team annotated timestamp segments to indicate where A/H occurs, and provided frame- and video-level annotations with the A/H cues. Video transcripts and their timestamps are also included, along with cropped and aligned faces in each frame, and a variety of participant metadata. We include baseline results for BAH at frame- and video-level recognition in multi-modal setups, in addition to zero-shot prediction, and for personalization using unsupervised domain adaptation. The limited performance of baseline models highlights the challenges of recognizing A/H in real-world videos. The data, code, and pretrained weights are available.

识别与矛盾和犹豫(A/H)相关的复杂情感,可在数字行为改变干预措施的个性化和有效性方面发挥关键作用。这些微妙而相互冲突的情感表现为面部表情、声音表达和身体语言等多种模态之间的不协调。尽管专家可以接受识别A/H的培训,但将其纳入数字干预措施成本高且效果较差。自动学习系统提供了一种具有成本效益的替代方法,可以适应个人用户,并在实时和资源有限的环境中无缝运作。然而,目前没有可用于设计识别A/H的ML模型的数据集。本文介绍了首个行为矛盾/犹豫(BAH)数据集,用于视频中基于受试者的A/H多模态识别。该数据集包含加拿大9个省不同年龄和族裔的224名参与者的视频。通过我们的网络平台,我们招募参与者回答7个问题(其中一些旨在引发A/H),并通过带麦克风的网络摄像头自行录制。BAH共包含1,118个视频,总时长8.26小时,其中A/H时长1.5小时。我们的行为学团队标注了A/H出现的时间段,并提供带有A/H线索的帧级和视频级标注。数据集还包括视频转录文本及其时间戳、每帧中裁剪并对齐的人脸,以及多种参与者元数据。我们给出了BAH在多模态设置下帧级和视频级识别的基线结果,以及零样本预测和基于无监督域适应的个性化结果。基线模型的有限性能凸显了在真实视频中识别A/H的挑战。数据、代码和预训练权重均已公开。

Article 172

Title@2025-05-29 (4): Towards Reward Fairness in RLHF: From a Resource Allocation Perspective

Title: Towards Reward Fairness in RLHF: From a Resource Allocation Perspective

Zur Belohnung Fairness in RLHF: Aus Ressourcenzuweisungsperspektive

走向RLHF的奖励公平:从资源分配角度 2505.23349v1

Authors: Sheng Ouyang, Yulan Hu, Ge Chen, Qingyang Li, Fuzheng Zhang, Yong Liu

Rewards serve as proxies for human preferences and play a crucial role in Reinforcement Learning from Human Feedback (RLHF). However, if these rewards are inherently imperfect and exhibit various biases, they can adversely affect the alignment of large language models (LLMs). In this paper, we collectively define the various biases present in rewards as the problem of reward unfairness. We propose a bias-agnostic method that addresses reward fairness from a resource allocation perspective, mitigating these biases effectively without being designed for any specific type of bias. Specifically, we model preference learning as a resource allocation problem, treating rewards as resources to be allocated while considering the trade-off between utility and fairness in their distribution. We propose two methods, Fairness Regularization and Fairness Coefficient, to achieve fairness in rewards. We apply our methods in both verification and reinforcement learning scenarios to obtain a fairness reward model and a policy model, respectively. Experiments conducted in these scenarios demonstrate that our approach aligns LLMs with human preferences more fairly.

奖赏是人类偏好的代言人,在加强从人类反馈中学习(RLHF)中发挥着关键作用。然而,如果这些奖赏本质上不完美,表现出各种偏见,则会对大型语言模式(LLMS)的匹配产生不利影响。在本文件中,我们共同将奖赏中存在的各种偏见定义为奖赏不公平的问题。我们建议一种不偏颇的方法,从资源分配的角度解决奖赏公平问题,而不具体设计每一种类型的偏见,而是有效地减轻这些偏见。具体地说,我们把优待学习作为一种资源分配问题,在考虑其分配的效用和公平之间取舍时,将奖赏作为应分配的资源处理。我们提出了两种方法,即公平性和公平性,以实现奖赏的公平性。我们用我们的方法来核查和加强学习情景,分别获得公平奖赏模式和政策模式。在这些情景中进行的实验表明,我们的做法以更公平的方式使LMs与人类的偏好相一致。

Article 173

Title@2025-05-29 (4): Sentinel: Scheduling Live Streams with Proactive Anomaly Detection in Crowdsourced Cloud-Edge Platforms

Title: Sentinel: Scheduling Live Streams with Proactive Anomaly Detection in Crowdsourced Cloud-Edge Platforms

Sentinel: Planung von Livestreams mit proaktiver Anomalieerkennung in Crowdsourced Cloud-Edge-Plattformen

哨兵:将现场流排成日程,在人源云源云源平台上进行主动异常探测 2505.23347v1

Authors: Yuting Li, Shaoyuan Huang, Tengwen Zhang, Cheng Zhang, Xiaofei Wang, Victor C. M. Leung

With the rapid growth of live streaming services, Crowdsourced Cloud-edge service Platforms (CCPs) are playing an increasingly important role in meeting growing demand. Although stream scheduling plays a critical role in optimizing CCPs’ revenue, most optimization strategies struggle to achieve practical results due to various anomalies in unstable CCPs. Additionally, the substantial scale of CCPs magnifies the difficulties of anomaly detection in time-sensitive scheduling. To tackle these challenges, this paper proposes Sentinel, a proactive anomaly detection-based scheduling framework. Sentinel models the scheduling process as a two-stage Pre-Post-Scheduling paradigm: in the pre-scheduling stage, Sentinel conducts anomaly detection and constructs a strategy pool; in the post-scheduling stage, upon request arrival, it triggers appropriate scheduling based on a pre-generated strategy. Extensive experiments on realistic datasets show that Sentinel reduces anomaly frequency by 70%, improves revenue by 74%, and doubles the scheduling speed.

随着现场流流服务的迅速增长,众源云端服务平台(CCP)在满足不断增长的需求方面发挥着越来越重要的作用。虽然流流时间安排在优化CP收入方面发挥着关键作用,但大多数优化战略都因不稳定的CP的异常而难以取得实际成果。此外,大量CCP扩大了在时间敏感时间安排中发现异常情况的困难。为了应对这些挑战,本文件建议哨兵,这是一个积极主动的异常检测列表框架。哨兵将时间安排过程作为两个阶段的排期前模式:在排期前阶段,Sentinel进行异常检测,并建立一个战略集合;在排期后阶段,在接到要求后阶段,根据事先制定的战略,启动适当的时间安排,以实施排期进程。关于现实数据集的广泛实验显示,Sentinel显著减少异常频率70%,增加收入74%,并增加时间安排速度的两倍。

Article 174

Title@2025-05-29 (4): Graph Positional Autoencoders as Self-supervised Learners

Title: Graph Positional Autoencoders as Self-supervised Learners

Graphische Positionale Autoencoder als selbstüberwachte Lernende

作为自监管学习者进行定位自动校对的图形图 2505.23345v1

Authors: Yang Liu, Deyu Bo, Wenxuan Cao, Yuan Fang, Yawen Li, Chuan Shi

Graph self-supervised learning seeks to learn effective graph representations without relying on labeled data. Among various approaches, graph autoencoders (GAEs) have gained significant attention for their efficiency and scalability. Typically, GAEs take incomplete graphs as input and predict missing elements, such as masked nodes or edges. While effective, our experimental investigation reveals that traditional node or edge masking paradigms primarily capture low-frequency signals in the graph and fail to learn expressive structural information. To address these issues, we propose Graph Positional Autoencoders (GraphPAE), which employs a dual-path architecture to reconstruct both node features and positions. Specifically, the feature path uses positional encoding to enhance the message-passing process, improving GAE’s ability to predict the corrupted information. The position path, on the other hand, leverages node representations to refine positions and approximate eigenvectors, thereby enabling the encoder to learn diverse frequency information. We conduct extensive experiments to verify the effectiveness of GraphPAE, including heterophilic node classification, graph property prediction, and transfer learning. The results demonstrate that GraphPAE achieves state-of-the-art performance and consistently outperforms baselines by a large margin.

在各种方法中,图形自动解析器(GAE)因其效率和可缩缩性而得到极大关注。通常,GAE采用不完整的图形作为输入,并预测缺失的元素,如掩码节点或边缘。虽然我们实验性调查有效,但发现传统的节点或边缘遮蔽模式主要在图形中捕捉低频信号,无法学习表达式结构信息。为了解决这些问题,我们提议了图形定位自动解析器(GraphPAE),它使用双向结构来重建节点特征和位置。具体地说,功能路径使用定位编码来增强信息传递处理,提高GAE预测腐败信息的能力。在另一方面,定位路径利用节点表达方式来改进定位和近似易位源,从而使得编码器能够学习不同的频率信息。我们进行了广泛的实验,以核实GAPAE的有效性,包括肝脏节点分类、图形属性预测,以及持续地平距转移等。图表展示了通过大规模定位和转移的状态。

Article 175

Title@2025-05-29 (4): A Descriptor Is All You Need: Accurate Machine Learning of Nonadiabatic Coupling Vectors

Title: A Descriptor Is All You Need: Accurate Machine Learning of Nonadiabatic Coupling Vectors

Ein Deskriptor ist alles, was Sie brauchen: Genaues maschinelles Lernen von nichtadiabatischen Kupplungsvektoren

描述符是你需要的:非非异相叠合矢量的精确机器学习 2505.23344v1

Authors: Jakub Martinka, Lina Zhang, Yi-Fan Hou, Mikołaj Martyka, Jiří Pittner, Mario Barbatti, Pavlo O. Dral

Nonadiabatic couplings (NACs) play a crucial role in modeling photochemical and photophysical processes with methods such as the widely used fewest-switches surface hopping (FSSH). There is therefore a strong incentive to machine-learn NACs to accelerate simulations. However, this is challenging due to NACs’ vectorial, double-valued character and the singularity near a conical intersection seam. For the first time, we design NAC-specific descriptors based on our domain expertise and show that they allow learning NACs with never-before-reported accuracy of $R^2$ exceeding 0.99. Our new ML phase-correction procedure is also key to this success. We demonstrate the efficiency and robustness of our approach on a prototypical example of fully ML-driven FSSH simulations of fulvene targeting the SA-2-CASSCF(6,6) electronic structure level. This ML-FSSH dynamics leads to an accurate description of $S_1$ decay while reducing error bars by allowing the execution of a large ensemble of trajectories. Our implementations are available in open-source MLatom.

在模拟光化学和光物理过程方面,非非异性联结(NACs)在模拟光化学和光物理过程方面发挥着关键作用,使用的方法包括广泛使用的最少开关的表面购物(FSSH)等。因此,有强大的动力为加速模拟而机械学习NACs学习NACs。然而,由于NACs的矢量性、双值性格和在锥形交叉接合处附近的独特性,这具有挑战性。我们首次根据我们的领域专长设计了针对NAC的专用描述器,并表明它们允许以从未报告过的精确度超过0.99美元为单位学习NACs,而从未报告精确度超过0.99美元。成功的关键也是我们新的ML阶段校正程序。我们展示了我们在完全ML驱动的FSSHSH模拟针对SA-2-CSSCSF(6,6)电子结构水平的原型例子方面的做法的效率和稳健。ML-FSSHSH的动态导致准确描述$_1美元的腐烂度,同时减少误差条,允许执行大型串径。我们可以在开放的MLS-L的操作中进行。
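The "double-valued character" mentioned in the abstract refers to the fact that a NAC vector is only defined up to an overall sign, so raw values can flip arbitrarily between neighbouring geometries. The following is a minimal sketch of a conventional overlap-based sign alignment along a sequence of vectors. It is not the authors' ML phase-correction procedure, which the abstract does not describe; the function name `align_phases` and the toy data are hypothetical.

```python
def align_phases(vectors):
    """Flip the sign of each vector so that it overlaps positively
    with the (already aligned) vector that precedes it."""
    aligned = [list(vectors[0])]
    for vec in vectors[1:]:
        ref = aligned[-1]
        overlap = sum(a * b for a, b in zip(ref, vec))
        sign = -1.0 if overlap < 0 else 1.0
        aligned.append([sign * x for x in vec])
    return aligned


# Toy 2-component "coupling vectors" with arbitrary sign flips (illustrative only).
raw = [[1.0, 0.5], [-0.9, -0.6], [0.8, 0.7], [-0.7, -0.8]]
smooth = align_phases(raw)
print(smooth)
```

After alignment the sequence varies smoothly, which is the kind of target a regression model can fit; the unaligned sequence would look discontinuous to the learner.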

Article 176

Title@2025-05-29 (4): Matryoshka Model Learning for Improved Elastic Student Models

Title: Matryoshka Model Learning for Improved Elastic Student Models

Matryoshka Model Learning für verbesserte elastische Studentenmodelle

Matryoshka 改进弹性学生模式示范学习模式 2505.23337v1

Authors: Chetan Verma, Aditya Srinivas Timmaraju, Cho-Jui Hsieh, Suyash Damle, Ngot Bui, Yang Zhang, Wen Chen, Xin Liu, Prateek Jain, Inderjit S Dhillon

Industry-grade ML models are carefully designed to meet rapidly evolving serving constraints, which requires significant resources for model development. In this paper, we propose MatTA, a framework for training multiple accurate Student models using a novel Teacher-TA-Student recipe. TA models are larger versions of the Student models with higher capacity, and thus allow Student models to better relate to the Teacher model and also bring in more domain-specific expertise. Furthermore, multiple accurate Student models can be extracted from the TA model. Therefore, despite requiring only one training run, our methodology provides multiple servable options to trade off accuracy for lower serving cost. We demonstrate the proposed method, MatTA, on proprietary datasets and models. Its practical efficacy is underscored by live A/B tests within a production ML system, demonstrating a 20% improvement on a key metric. We also demonstrate our method on GPT-2 Medium, a public model, and achieve relative improvements of over 24% on SAT Math and over 10% on the LAMBADA benchmark.

工业级ML模型经过仔细设计,以适应迅速变化的服务限制,这需要大量资源用于模型开发。在本文件中,我们提议马特塔(MatTA),这是一个使用新型教师-TA-学生食谱培训多种准确学生模型的框架。TA模型是能力较高的学生模型的较大版本,从而使学生模型能够更好地与教师模型相联系,并带来更多的具体领域的专门知识。此外,可以从TA模型中提取多种准确的学生模型。因此,尽管只有一次培训,我们的方法为降低服务成本的准确性提供了多种易用选项。我们展示了拟议的方法,即专有数据集和模型的MatTA,其实际效力体现在生产ML系统内的活A/B测试中,显示关键指标的20%的改进。我们还展示了我们在GPT-2中度(公共模型)上的方法,并在SAT数学基准上实现了超过24%的相对改进,在LAMBADA基准上则超过10%。

Article 177

Title@2025-05-29 (4): X2Graph for Cancer Subtyping Prediction on Biological Tabular Data

Title: X2Graph for Cancer Subtyping Prediction on Biological Tabular Data

X2Graph für Krebs Subtyping Vorhersage auf biologische Tabellendaten

用于对生物表表数据进行癌症子图谱预测的X2Graph 2505.23334v1

Authors: Tu Bui, Mohamed Suliman, Aparajita Haldar, Mohammed Amer, Serban Georgescu

Despite the transformative impact of deep learning on text, audio, and image datasets, its dominance in tabular data, especially in the medical domain where data are often scarce, remains less clear. In this paper, we propose X2Graph, a novel deep learning method that achieves strong performance on small biological tabular datasets. X2Graph leverages external knowledge about the relationships between table columns, such as gene interactions, to convert each sample into a graph structure. This transformation enables the application of standard message passing algorithms for graph modeling. Our X2Graph method demonstrates superior performance compared to existing tree-based and deep learning methods across three cancer subtyping datasets.

尽管深入学习对文本、音频和图像数据集产生了变革性影响,但其在表格数据中的主导地位,特别是在数据往往稀缺的医疗领域,仍然不那么清楚。在本论文中,我们提出了X2Graph,这是在小型生物表格数据集上取得强效的新颖的深层次学习方法。X2Graph利用关于表格列之间关系的外部知识,例如基因互动,将每个样本转换成图表结构。这一转变使得能够应用标准的信息传递算法进行图形建模。我们的X2Graph方法显示,与三个癌症子型数据集相比,与现有的基于树的深层次学习方法相比,其性能更高。

Article 178

Title@2025-05-29 (4): Fine-Tuning Next-Scale Visual Autoregressive Models with Group Relative Policy Optimization

Title: Fine-Tuning Next-Scale Visual Autoregressive Models with Group Relative Policy Optimization

Feintuning Next-Scale Visual Autoregressive Modelle mit gruppenrelativer Politikoptimierung

采用群体相对政策优化优化的下尺度视觉自动递减模型 2505.23331v1

Authors: Matteo Gallici, Haitz Sáez de Ocáriz Borde

Fine-tuning pre-trained generative models with Reinforcement Learning (RL) has emerged as an effective approach for aligning outputs more closely with nuanced human preferences. In this paper, we investigate the application of Group Relative Policy Optimization (GRPO) to fine-tune next-scale visual autoregressive (VAR) models. Our empirical results demonstrate that this approach enables alignment to intricate reward signals derived from aesthetic predictors and CLIP embeddings, significantly enhancing image quality and enabling precise control over the generation style. Interestingly, by leveraging CLIP, our method can help VAR models generalize beyond their initial ImageNet distribution: through RL-driven exploration, these models can generate images aligned with prompts referencing image styles that were absent during pre-training. In summary, we show that RL-based fine-tuning is both efficient and effective for VAR models, benefiting particularly from their fast inference speeds, which are advantageous for online sampling, an aspect that poses significant challenges for diffusion-based alternatives.

通过强化学习(RL),微调培训前的基因化模型已成为使产出更接近于细微人类偏好的一种有效方法。在本文中,我们调查了群体相对政策优化(GRPO)模型的应用,以微调下一个规模的视觉自动递减(VAR)模型。我们的实证结果表明,这一方法使得与从美学预测器和CLIP嵌入中得出的复杂奖赏信号相匹配,大大提高了图像质量,并使得能够精确控制生成风格。有趣的是,通过利用CLIP,我们的方法可以帮助VAR模型在最初的图像网络分布之外加以普及:通过RL驱动的探索,这些模型能够产生与预训练期间缺少的提示图像样式相匹配的速效图像。总之,我们表明,基于RL的微调对于VAR模型既高效又有效,特别是从其快速推断速度中获益,这对在线取样有利,这是对基于传播的替代品构成重大挑战的一个方面。
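For readers unfamiliar with GRPO, its core step is to score each sampled output relative to the other outputs drawn for the same prompt, instead of using a learned value baseline. The sketch below shows that normalisation only, under the assumption that rewards are plain floats (for example, an aesthetic or CLIP-based score per generated image). The function name, the use of the population standard deviation, and the `eps` term are illustrative choices, not details taken from the paper.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalise rewards within one group of samples for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; implementations vary
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four images sampled from the same prompt.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Samples with above-average reward receive positive advantages and are reinforced; the others are pushed down. Because a whole group must be sampled per prompt, fast inference makes this kind of online sampling cheaper, which is the advantage the abstract attributes to VAR models.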

Article 179

Title@2025-05-29 (4): Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism

Title: Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism

Fehlerübertragung und Decorrelation als potenzieller künstlicher und natürlicher Lernmechanismus

错误广播和装饰关系作为一种潜在的人工和自然学习机制 2504.11558v2

Authors: Mete Erdogan, Cengiz Pehlevan, Alper T. Erdogan

We introduce Error Broadcast and Decorrelation (EBD), a novel learning framework for neural networks that addresses credit assignment by directly broadcasting output errors to individual layers, circumventing the weight transport required by backpropagation. EBD is rigorously grounded in the stochastic orthogonality property of Minimum Mean Square Error estimators. This fundamental principle states that the error of an optimal estimator is orthogonal to functions of the input. Guided by this insight, EBD defines layerwise loss functions that directly penalize correlations between layer activations and output errors, thereby establishing a principled foundation for error broadcasting. This theoretically sound mechanism naturally leads to the experimentally observed three-factor learning rule and integrates with biologically plausible frameworks to enhance performance and plausibility. Numerical experiments demonstrate EBD’s competitive or better performance against other error-broadcast methods on benchmark datasets. Our findings establish EBD as an efficient, biologically plausible, and principled alternative for neural network training.

我们引入了错误广播和礼节关系(EBD),这是一个神经网络的新学习框架,它通过将输出错误直接广播到各个层,从而绕过反向传播的重量迁移,解决信用分配问题。EBD严格地植根于最低中平方错误估测器的随机或纵深属性。这一基本原则表明,最佳估测器的错误与输入的函数是正交错的。根据这一洞察,EBD定义了层值损失功能,直接惩罚层激活和输出错误之间的相互关系,从而为错误广播奠定了原则基础。这一理论上的健全机制自然导致实验性观测到的三要素学习规则,并与生物上可信的框架相结合,以提高性能和可信赖性。数字实验表明,EBD相对于基准数据集上的其他错误-路标方法具有竞争性或更好的性能。我们的调查结果将EBD确定为神经网络培训的一种高效、生物上合理和有原则的替代方法。
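The orthogonality property the abstract relies on can be checked numerically. The sketch below uses a toy model, y = x² + noise, which is an assumption made purely for illustration: for the optimal estimator E[y | x] = x² the error is uncorrelated with a function g(x) of the input, while a suboptimal estimator leaves a clearly nonzero correlation. It illustrates the principle only and is not an implementation of EBD.

```python
import random

random.seed(0)
n = 20000
xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
ys = [x * x + random.gauss(0.0, 0.1) for x in xs]  # toy data: y = x^2 + noise


def mean(values):
    return sum(values) / len(values)


def correlation_with_error(estimator, g):
    """Empirical E[(y - estimator(x)) * g(x)]."""
    return mean([(y - estimator(x)) * g(x) for x, y in zip(xs, ys)])


def optimal(x):
    return x * x  # E[y | x] for this toy model


def constant(x):
    return 1.0 / 3.0  # best constant predictor, since E[x^2] = 1/3


def g(x):
    return x * x  # a nonlinear function of the input


corr_optimal = correlation_with_error(optimal, g)
corr_constant = correlation_with_error(constant, g)
print(corr_optimal, corr_constant)
```

The first value is close to zero, while the second approaches E[x⁴] − E[x²]/3 = 1/5 − 1/9 ≈ 0.089. A loss that penalises such correlations therefore vanishes at the optimal estimator, which is the intuition behind the layerwise decorrelation losses described above.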

Article 180

Title@2025-05-29 (4): Combinatorial Rising Bandit

Title: Combinatorial Rising Bandit

Kombinatorial Rising Bandit

混合崛起强盗 2412.00798v3

Authors: Seockbean Song, Youngsik Yoon, Siwei Wang, Wei Chen, Jungseul Ok

Combinatorial online learning is a fundamental task for selecting the optimal action (or super arm) as a combination of base arms in sequential interactions with systems providing stochastic rewards. It is applicable to diverse domains such as robotics, social advertising, network routing, and recommendation systems. In many real-world scenarios, we often encounter rising rewards, where playing a base arm not only provides an instantaneous reward but also contributes to the enhancement of future rewards, e.g., robots enhancing proficiency through practice and social influence strengthening through a history of successful recommendations. Moreover, the enhancement of a single base arm may affect multiple super arms that include it, introducing complex dependencies that are not captured by existing rising bandit models. To address this, we introduce the Combinatorial Rising Bandit (CRB) framework and propose a provably efficient algorithm, Combinatorial Rising Upper Confidence Bound (CRUCB). We establish an upper bound on the regret of CRUCB and show that it is nearly tight by deriving a matching lower bound. In addition, we empirically demonstrate the effectiveness of CRUCB not only in synthetic environments but also in realistic applications of deep reinforcement learning.

混合在线学习是选择最佳行动(或超级臂)的一项基本任务,是选择最佳行动(或超级臂)作为基础臂与提供随机奖励的系统相继互动的一种组合。它适用于机器人、社会广告、网络路由和建议系统等不同领域。在许多现实世界情景中,我们经常遇到不断上升的奖励,玩一个基臂不仅能瞬间提供奖励,而且有助于提高今后的奖励,例如机器人通过实践和社会影响在成功建议的历史中强化了熟练程度。此外,加强一个单一基臂可能会影响多个超级臂,包括它,引入现有的不断上升的强盗模式无法捕捉到的复杂依赖性。为了解决这个问题,我们引入了组合上升强盗(CRB)框架,并提出了一种非常高效的算法,即组合式提高高度自信(CRUCB) 。我们建立了对CRUCB的上层约束,并表明通过得出相匹配的更低约束几乎是紧要紧的。此外,我们从经验上表明CRUCB不仅在合成环境中,而且还在深度强化学习的现实应用中证明了其有效性。

Article 181

Title@2025-05-29 (4): Efficient Parameter Estimation for Bayesian Network Classifiers using Hierarchical Linear Smoothing

Title: Efficient Parameter Estimation for Bayesian Network Classifiers using Hierarchical Linear Smoothing

Effiziente Parameterschätzung für Bayesian Network Klassifikatoren mit Hierarchical Linear Glättung

Bayesian 网络分类器使用等级线性线性平滑法的高效参数参数估测 2505.23320v1

Authors: Connor Cooper, Geoffrey I. Webb, Daniel F. Schmidt

Bayesian network classifiers (BNCs) possess a number of properties desirable for a modern classifier: They are easily interpretable, highly scalable, and offer adaptable complexity. However, traditional methods for learning BNCs have historically underperformed when compared to leading classification methods such as random forests. Recent parameter smoothing techniques using hierarchical Dirichlet processes (HDPs) have enabled BNCs to achieve performance competitive with random forests on categorical data, but these techniques are relatively inflexible, and require a complicated, specialized sampling process. In this paper, we introduce a novel method for parameter estimation that uses a log-linear regression to approximate the behaviour of HDPs. As a linear model, our method is remarkably flexible and simple to interpret, and can leverage the vast literature on learning linear models. Our experiments show that our method can outperform HDP smoothing while being orders of magnitude faster, remaining competitive with random forests on categorical data.

Bayesian网络分类(BNCs)拥有一些适合现代分类器的特性:这些特性易于解释,可高度缩放,具有适应性复杂性。然而,与随机森林等主要分类方法相比,传统的学习BNCs的方法在历史上表现不佳。最近使用分级Dirichlet工艺的参数平滑技术使Benesian网络分类器在绝对数据上实现了与随机森林的性能竞争,但这些技术相对不灵活,需要复杂的专门取样程序。在本文中,我们引入了一种新颖的参数估计方法,该方法使用日志线状回归法来近似HDPs的行为。作为一个线性模型,我们的方法非常灵活和简单,可以利用大量文献来解释线性模型。我们的实验表明,我们的方法可以超越HDP的光滑,同时速度更快,与任意森林的绝对数据保持竞争力。

Article 182

Title@2025-05-29 (4): A Straightforward Gradient-Based Approach for High-Tc Superconductor Design: Leveraging Domain Knowledge via Adaptive Constraints

Title: A Straightforward Gradient-Based Approach for High-Tc Superconductor Design: Leveraging Domain Knowledge via Adaptive Constraints

Ein einfacher gradient-basierter Ansatz für High-Tc-Supraleiter-Design: Nutzung von Domain-Wissen über adaptive Einschränkungen

高Tc超级导体设计的直向渐进式高超导体设计方法:通过适应性制约因素利用域知识 2403.13627v2

Authors: Akihiro Fujii, Anh Khoa Augustin Lu, Koji Shimizu, Satoshi Watanabe

Materials design aims to discover novel compounds with desired properties. However, prevailing strategies face critical trade-offs. Conventional element-substitution approaches readily and adaptively incorporate various domain knowledge but remain confined to a narrow search space. In contrast, deep generative models efficiently explore vast compositional landscapes, yet they struggle to flexibly integrate domain knowledge. To address these trade-offs, we propose a gradient-based material design framework that combines these strengths, offering both efficiency and adaptability. In our method, chemical compositions are optimised to achieve target properties by using property prediction models and their gradients. In order to seamlessly enforce diverse constraints, including those reflecting domain insights such as oxidation states, discretised compositional ratios, types of elements, and their abundance, we apply masks and employ a special loss function, namely the integer loss. Furthermore, we initialise the optimisation using promising candidates from existing datasets, effectively guiding the search away from unfavourable regions and thus helping to avoid poor solutions. Our approach demonstrates a more efficient exploration of superconductor candidates, uncovering candidate materials with higher critical temperatures than conventional element-substitution and generative models. Importantly, it could propose new compositions beyond those found in existing databases, including new hydride superconductors absent from the training dataset but which share compositional similarities with materials found in literature. This synergy of domain knowledge and machine-learning-based scalability provides a robust foundation for rapid, adaptive, and comprehensive materials design for superconductors and beyond.

然而,主流战略面临着关键的权衡。常规元素替代方法很容易和适应性地融合了各种领域知识,但仍然局限于狭小的搜索空间。相比之下,深基因模型有效地探索了广泛的构成景观,但却努力灵活整合领域知识。为了解决这些权衡,我们提议了一个基于梯度的材料设计框架,将这些优势结合起来,既提供效率和适应性,又提供效率和适应性。在我们的方法中,化学成分最理想地通过使用财产预测模型及其梯度来实现目标属性。为了无缝地执行各种限制,包括反映诸如氧化状态、分解的构成比率、元素类型及其丰度等领域洞察力的制约。相比之下,我们应用了隐蔽式模型,并运用了特殊损失功能,即全方位损失。此外,我们更倾向于利用有希望的候选者,有效地指导从不利区域搜索,从而帮助避免问题。我们的方法表明,对超级导体候选者进行了更高效的探索,发现候选材料的温度比常规元素替代和基因化模型要高得多。重要的是,它可以提出超越快速设计模型的适应性结构,而从快速设计数据库中找到的弹性结构,其中包括在现有结构中找到的高级研究数据库中找到的弹性数据。
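The idea of optimising a composition through the gradient of a property predictor, with a mask restricting which elements may appear, can be sketched in a few lines. Everything below is a toy assumption: the "predictor" is a linear function with made-up weights, the composition is parameterised by a masked softmax, and the gradient is written out by hand. It does not reproduce the paper's predictor or its integer loss.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over allowed elements only; masked-out elements get fraction 0."""
    exps = [math.exp(z) if allowed else 0.0 for z, allowed in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def optimise_composition(weights, mask, steps=200, lr=0.5):
    """Gradient ascent on a toy linear property f(p) = sum_i w_i * p_i."""
    logits = [0.0] * len(weights)
    for _ in range(steps):
        p = masked_softmax(logits, mask)
        value = sum(w * pi for w, pi in zip(weights, p))
        # d f / d logit_j = p_j * (w_j - f); zero for masked elements since p_j = 0.
        grads = [pi * (w - value) for w, pi in zip(weights, p)]
        logits = [z + lr * g for z, g in zip(logits, grads)]
    return masked_softmax(logits, mask)


weights = [1.0, 3.0, 2.0, 5.0]    # hypothetical per-element property contribution
mask = [True, True, True, False]  # the last element is disallowed by the mask
composition = optimise_composition(weights, mask)
print(composition)
```

Although the masked element has the largest weight, its fraction stays exactly zero, and the optimiser concentrates the composition on the best allowed element. This is the sense in which masks let domain constraints be imposed without retraining the predictor.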

Article 183

Title@2025-05-29 (4): Enhancing Marker Scoring Accuracy through Ordinal Confidence Modelling in Educational Assessments

Title: Enhancing Marker Scoring Accuracy through Ordinal Confidence Modelling in Educational Assessments

Verbesserung der Genauigkeit der Markerbewertung durch ordinelles Vertrauensmodellierung in Bildungsbewertungen

通过在教育评估中建立常规信任模型,加强标标码的准确度 2505.23315v1

Authors: Abhirup Chakravarty, Mark Brenchley, Trevor Breakspear, Ian Lewin, Yan Huang

A key ethical challenge in Automated Essay Scoring (AES) is ensuring that scores are only released when they meet high reliability standards. Confidence modelling addresses this by assigning a reliability estimate, in the form of a confidence score, to each automated score. In this study, we frame confidence estimation as a classification task: predicting whether an AES-generated score correctly places a candidate in the appropriate CEFR level. While this is a binary decision, we leverage the inherent granularity of the scoring domain in two ways. First, we reformulate the task as an n-ary classification problem using score binning. Second, we introduce a set of novel Kernel Weighted Ordinal Categorical Cross Entropy (KWOCCE) loss functions that incorporate the ordinal structure of CEFR labels. Our best-performing model achieves an F1 score of 0.97, and enables the system to release 47% of scores with 100% CEFR agreement and 99% with at least 95% CEFR agreement, compared to approximately 92% CEFR agreement from the standalone AES model where we release all AM predicted scores.

自动读取系统( AES) 中的一个关键道德挑战是确保分数只有在达到高可靠性标准时才释放出来。信任建模通过给每个自动分分分配一个可靠性估计尺度, 以信任分的形式对每个自动分进行。在本研究中, 我们将信任估测设定为分类任务: 预测 AES 生成的得分是否正确地将候选人置于适当的 CEFR 级别上。虽然这是一个二进制决定, 我们以两种方式利用评分域固有的颗粒性。首先, 我们使用分数宾点将任务重新表述为n- 分类问题。第二, 我们推出一套包含 CEFR 标签的方形结构的新型 Kernelweighted Ordinal Categorical Entropy (KWOCCE) 损失函数。我们最优秀的模型达到F1 0.97 分, 使系统能够以100% CEFR协议和99%的得分数发放47%, 至少95%的CEFRFR 协议 — 约92% (approxx) 。
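The release figures quoted in the abstract (a fraction of scores released at a given level of CEFR agreement) come from thresholding a confidence score. The sketch below shows how such a coverage/agreement pair could be computed. The data are invented, the function name is hypothetical, and nothing here reproduces the paper's KWOCCE loss or its confidence model.

```python
def release_stats(confidences, correct, threshold):
    """Coverage and agreement when only scores with confidence >= threshold are released."""
    released = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
    coverage = len(released) / len(confidences)
    agreement = sum(released) / len(released) if released else float("nan")
    return coverage, agreement


# Hypothetical confidence scores and whether the automated score matched the CEFR level.
confidences = [0.99, 0.95, 0.90, 0.80, 0.60, 0.55]
correct = [True, True, True, False, True, False]

coverage, agreement = release_stats(confidences, correct, threshold=0.9)
print(coverage, agreement)
```

Raising the threshold trades coverage for agreement: in this toy data, releasing everything gives agreement of 4/6, while releasing only the top half gives full agreement.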

Article 184

Title@2025-05-29 (4): Adversarial Semantic and Label Perturbation Attack for Pedestrian Attribute Recognition

Title: Adversarial Semantic and Label Perturbation Attack for Pedestrian Attribute Recognition

Adversariale Semantische und Label-Störung Angriff für Fußgänger Attribute Anerkennung

对抗性语义和Label干扰攻击,以确认佩德斯特属性 2505.23313v1

Authors: Weizhe Kong, Xiao Wang, Ruichong Gao, Chenglong Li, Yu Zhang, Xing Yang, Yaowei Wang, Jin Tang

Pedestrian Attribute Recognition (PAR) is an indispensable task in human-centered research and has made great progress in recent years with the development of deep neural networks. However, its potential vulnerability and anti-interference ability have still not been fully explored. To bridge this gap, this paper proposes the first adversarial attack and defense framework for pedestrian attribute recognition. Specifically, we exploit both global- and patch-level attacks on the pedestrian images, based on the pre-trained CLIP-based PAR framework. It first divides the input pedestrian image into non-overlapping patches and embeds them into feature embeddings using a projection layer. Meanwhile, the attribute set is expanded into sentences using prompts and embedded into attribute features using a pre-trained CLIP text encoder. A multi-modal Transformer is adopted to fuse the obtained vision and text tokens, and a feed-forward network is utilized for attribute recognition. Based on the aforementioned PAR framework, we adopt adversarial semantic and label perturbation to generate the adversarial noise, termed ASL-PAR. We also design a semantic offset defense strategy to suppress the influence of adversarial attacks. Extensive experiments conducted on both digital domains (i.e., PETA, PA100K, MSP60K, RAPv2) and physical domains fully validated the effectiveness of our proposed adversarial attack and defense strategies for pedestrian attribute recognition. The source code of this paper will be released on https://github.com/Event-AHU/OpenPAR.

Pedestrian 属性识别(PAR)是人类核心研究中不可或缺的一项任务,近年来随着深层神经网络的发展取得了巨大进展。然而,潜在的脆弱性和反干预能力仍未得到充分探讨。为弥合这一差距,本文件提出了第一个行人属性识别对抗攻击和防御框架。具体地说,我们利用预先培训的 CLIP PAR 框架对行人图像进行全局和补丁级攻击,首先将行人图像分为非重叠的补丁,然后通过投影层将其嵌入为特征嵌入。与此同时,使用提示将属性集扩展为句子,并使用预训练的 CLIP 文本编码器将其嵌入为属性特征。采用多模态 Transformer 融合所获得的视觉和文本标记,并使用前馈网络进行属性识别。基于上述PAR框架,我们采用对抗性语义和标签扰动来生成对抗性噪声,称为 ASL-PAR。我们还设计了一种语义偏移防御策略,以抑制对抗攻击的影响。在数字域(即 PETA, PA100K, MSP60K, RAPv2)和物理域上进行的大量实验充分验证了所提出的对抗攻击与防御策略在行人属性识别中的有效性。本文的源代码将发布于 https://github.com/Event-AHU/OpenPAR。

Article 185

Title@2025-05-29 (4): Rethinking Gradient-Based Methods: Multi-Property Materials Design Beyond Differentiable Targets

Title: Rethinking Gradient-Based Methods: Multi-Property Materials Design Beyond Differentiable Targets

Rethinking Gradient-Based Methods: Multi-Property Materials Design Beyond Differentiable Targets

重新思考渐进方法:超出可区别目标的多财产材料设计 2410.08562v4

Authors: Akihiro Fujii, Yoshitaka Ushiku, Koji Shimizu, Anh Khoa Augustin Lu, Satoshi Watanabe

Gradient-based methods offer a simple, efficient strategy for materials design by directly optimizing candidates using gradients from pretrained property predictors. However, their use in crystal structure optimization is hindered by two key challenges: handling non-differentiable constraints, such as charge neutrality and structural fidelity, and susceptibility to poor local minima. We revisit and extend the gradient-based methods to address these issues. We propose Simultaneous Multi-property Optimization using Adaptive Crystal Synthesizer (SMOACS), which integrates oxidation-number masks and template-based initialization to enforce non-differentiable constraints, avoid poor local minima, and flexibly incorporate additional constraints without retraining. SMOACS enables multi-property optimization, including exceptional targets such as high-temperature superconductivity, and scales to large crystal systems, both persistent challenges for generative models, even those enhanced with gradient-based guidance from property predictors. In experiments on five target properties and three datasets, SMOACS outperforms generative models and Bayesian optimization methods, successfully designing 135-atom perovskite structures that satisfy multiple property targets and constraints, a task at which the other methods fail entirely.

以梯度为基础的方法为材料设计提供了一种简单、有效的战略,即直接优化候选人使用来自预先培训的财产预测器的梯度来优化材料设计;然而,晶体结构优化中的使用却受到两大挑战的阻碍:处理非差别的限制,如收费中立性和结构忠诚性,以及易受当地低劣微粒的影响。我们重新审视并推广基于梯度的方法,以解决这些问题。我们提议使用适应性水晶合成器(SMOACS),即结合氧化数字面罩和基于模板的初始化,以实施不可区别的限制,避免当地微小的差,灵活地纳入额外的限制,而不进行再培训。SMOACS能够实现多丙型优化,包括高温超导力和尺度等特殊目标,将其推广到大型晶体系统,两者都对基因化模型构成持续的挑战,即使是用基于梯度的预测器(SMOACS)对五个目标特性和三个数据集进行实验,SMOACS优于基因化模型和Bayes最优化方法,成功地设计了135-Atomat系统的其他限制,从而完全满足了多重目标。

Article 186

Title@2025-05-29 (4): Score-based Generative Modeling for Conditional Independence Testing

Title: Score-based Generative Modeling for Conditional Independence Testing

Score-basierte Generative Modellierung für die Prüfung der bedingten Unabhängigkeit

有条件独立测试基于记分率生成模型 2505.23309v1

Authors: Yixin Ren, Chenghou Jin, Yewei Xia, Li Ke, Longtao Huang, Hui Xue, Hao Zhang, Jihong Guan, Shuigeng Zhou

Determining conditional independence (CI) relationships between random variables is a fundamental yet challenging task in machine learning and statistics, especially in high-dimensional settings. Existing generative model-based CI testing methods, such as those utilizing generative adversarial networks (GANs), often struggle with undesirable modeling of conditional distributions and training instability, resulting in subpar performance. To address these issues, we propose a novel CI testing method via score-based generative modeling, which achieves precise Type I error control and strong testing power. Concretely, we first employ a sliced conditional score matching scheme to accurately estimate the conditional score and use Langevin dynamics conditional sampling to generate null hypothesis samples, ensuring precise Type I error control. Then, we incorporate a goodness-of-fit stage into the method to verify generated samples and enhance interpretability in practice. We theoretically establish the error bound of conditional distributions modeled by score-based generative models and prove the validity of our CI tests. Extensive experiments on both synthetic and real-world datasets show that our method significantly outperforms existing state-of-the-art methods, providing a promising way to revitalize generative model-based CI testing.

确定随机变量之间的有条件独立(CI)关系,是机器学习和统计方面,特别是在高维环境中,一项根本性但具有挑战性的任务。现有的基于基因模型的CI测试方法,例如使用基因对抗网络(GANs),常常与不可取的有条件分布模式和培训不稳定性模型进行斗争,从而导致低级性能。为了解决这些问题,我们提议通过基于分数的基因化模型进行新的CI测试方法,实现精确的I型错误控制和强力测试能力。具体地说,我们首先采用切片有条件得分比对方案,准确估计有条件得分,并利用Langevin动态的有条件抽样来生成完全的假设样本,确保准确的I型错误控制。然后,我们把一个合适的阶段纳入核实生成样本和加强实际解释性的方法中。我们理论上确定由基于分谱的基因化模型模型模型模型模型进行有条件分布的错误,并证明我们的CI测试的有效性。对合成和真实世界数据集进行的广泛实验表明,我们的方法大大超越了现有的状态方法,提供了振兴基于基因模型的有希望的方法。
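Langevin dynamics, which the abstract uses to draw null-hypothesis samples, only needs the score (the gradient of the log-density) of the target distribution. The sketch below runs unadjusted Langevin dynamics on a one-dimensional Gaussian whose score is known in closed form. In the paper the score would come from a learned conditional score model; the analytic score, the step size, and the function name here are assumptions for illustration.

```python
import math
import random


def langevin_sample(score, x0, step=0.01, n_steps=300, rng=random):
    """Unadjusted Langevin dynamics: x <- x + step * score(x) + sqrt(2 * step) * noise."""
    x = x0
    for _ in range(n_steps):
        x = x + step * score(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
    return x


mu, sigma = 2.0, 0.5


def gaussian_score(x):
    # Exact score of N(mu, sigma^2): d/dx log p(x) = -(x - mu) / sigma^2.
    return -(x - mu) / sigma ** 2


rng = random.Random(0)
samples = [langevin_sample(gaussian_score, x0=0.0, rng=rng) for _ in range(500)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(sample_mean, sample_var)
```

With a small step size the chain's samples approximately follow the target, here with mean near 2.0 and variance near 0.25. The discretisation introduces a small bias, which is one reason a goodness-of-fit check on the generated samples, as the abstract describes, is useful in practice.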

Article 187

Title@2025-05-29 (4): MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction

Title: MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction

MGE-LDM: Gemeinsame Latente Diffusion für simultane Musikgeneration und Quellenextraktion

MGE-LDM:同时制作音乐和来源采掘联合前期传播 2505.23305v1

Authors: Yunkee Chae, Kyogu Lee

We present MGE-LDM, a unified latent diffusion framework for simultaneous music generation, source imputation, and query-driven source separation. Unlike prior approaches constrained to fixed instrument classes, MGE-LDM learns a joint distribution over full mixtures, submixtures, and individual stems within a single compact latent diffusion model. At inference, MGE-LDM enables (1) complete mixture generation, (2) partial generation (i.e., source imputation), and (3) text-conditioned extraction of arbitrary sources. By formulating both separation and imputation as conditional inpainting tasks in the latent space, our approach supports flexible, class-agnostic manipulation of arbitrary instrument sources. Notably, MGE-LDM can be trained jointly across heterogeneous multi-track datasets (e.g., Slakh2100, MUSDB18, MoisesDB) without relying on predefined instrument categories. Audio samples are available at our project page: https://yoongi43.github.io/MGELDM_Samples/.

我们提出MGE-LDM,这是同步音乐生成、源估算和查询源分离的统一潜在扩散框架。与以往限制固定仪器类别的做法不同,MGE-LDM学会了在单一紧凑潜在扩散模型中对全部混合物、亚混合物和单个源头进行联合分布。根据推断,MGE-LDM使(1) 完全的混合物生成,(2) 部分生成(即源估算)和(3) 任意源的文字提取。通过在潜在空间中将分离和估算作为有条件的油漆任务,我们的方法支持对任意仪器源进行灵活、类分类处理。值得注意的是,MGE-LDM可以在不依赖预定仪器类别的情况下,在多轨数据集(例如,Slakh2100, MUSDB18, MoisesDB)之间联合进行培训。音频样本见我们的项目网页:https://yoongi43.github.io/MGELDM_Samples/。

Article 188

Title@2025-05-29 (4): Understanding and Mitigating Miscalibration in Prompt Tuning for Vision-Language Models

Title: Understanding and Mitigating Miscalibration in Prompt Tuning for Vision-Language Models

Verstehen und Abmildern von Fehlkalibrierung bei sofortiger Tuning für Vision-Language-Modelle

理解和减缓视觉语言模型快速开票时的误差 2410.02681v4

Authors: Shuoyuan Wang, Yixuan Li, Hongxin Wei

Confidence calibration is critical for the safe deployment of machine learning models in the real world. However, this issue has not been fully addressed in vision-language models like CLIP, particularly after fine-tuning. In this work, we demonstrate that existing prompt tuning methods usually lead to a trade-off of calibration between base and new classes: the cross-entropy loss in CoOp causes overconfidence in new classes by increasing textual label divergence, whereas the regularization of KgCoOp maintains the confidence level but results in underconfidence in base classes due to the improved accuracy. Inspired by these observations, we introduce Dynamic Outlier Regularization (DOR) to ensure confidence calibration on both base and new classes after fine-tuning. In particular, we propose to minimize the feature deviation of novel textual labels (instead of base classes) sampled from a large vocabulary. In effect, DOR prevents the increase in textual divergence for new labels while easing restrictions on base classes. Extensive experiments demonstrate that DOR can enhance the calibration performance of current fine-tuning methods on base and new classes.

信任度校准对于在现实世界中安全部署机器学习模型至关重要,然而,CLIP等视觉语言模型中的这类问题,特别是在微调后,还没有得到充分解决。在这项工作中,我们证明现有的快速调试方法通常导致基准和新等级之间校准的权衡:Coop中的交叉有机物损失通过增加文字标签差异造成新类别中的不信任过大,而KgCoOOp的正规化则维持了信任水平,但由于基础等级的准确性提高而导致信任度不足。在观察的启发下,我们引入了动态外部常规化(DOR)以确保在微调后对基础和新等级的校准。特别是,我们建议尽可能减少从大词汇中抽样的新文本标签(而不是基础等级)的特征偏差。实际上,DOR防止新标签的文字差异增加,同时放宽对基础等级的限制。广泛的实验表明,DOR可以提高目前在基础和新等级的校准方法的校准性。

Article 189

Title@2025-05-29 (4): How Does Response Length Affect Long-Form Factuality

Title: How Does Response Length Affect Long-Form Factuality

Wie wirkt sich die Response-Länge auf die Langform-Faktizität aus?

反应时间长度如何影响长期事实质量 2505.23295v1

Authors: James Xu Zhao, Jimmy Z. J. Liu, Bryan Hooi, See-Kiong Ng

Large language models (LLMs) are widely used for long-form text generation. However, factual errors in the responses would undermine their reliability. Despite growing attention to LLM factuality, the effect of response length on factuality remains underexplored. In this work, we systematically investigate this relationship by first introducing an automatic and bi-level long-form factuality evaluation framework, which achieves high agreement with human annotations while being cost-effective. Using this framework, we conduct controlled experiments and find that longer responses exhibit lower factual precision, confirming the presence of length bias. To explain this phenomenon, we empirically examine three hypotheses: error propagation, long context, and facts exhaustion. Our results reveal that facts exhaustion, where the model gradually exhausts more reliable knowledge, is the primary cause of factual degradation, rather than the other two hypotheses.

大型语言模型(LLMs)被广泛用于长式文本的生成,但是,答复中的事实错误会损害其可靠性。尽管人们日益关注LLM的实际情况,但答复时间长度对事实质量的影响仍未得到充分探讨。在这项工作中,我们系统地调查这种关系,首先采用自动和双级长式事实质量评估框架,在符合成本效益的情况下与人的注释取得高度一致。我们利用这一框架,进行有控制的实验,发现较长的答复显示事实准确性较低,证实存在时间偏差。为了解释这一现象,我们从经验上研究了三种假设:错误传播、长背景和事实耗竭。我们的结果显示,在模型逐渐耗尽更可靠的知识的情况下,事实用尽是造成实际退化的主要原因,而不是其他两种假设。

Article 190

Title: Multi-Modal Framing Analysis of News

Multi-Modal Framing Analyse der Nachrichten

新闻多模式结构分析 2503.20960v3

Authors: Arnav Arora, Srishti Yadav, Maria Antoniak, Serge Belongie, Isabelle Augenstein

Automated frame analysis of political communication is a popular task in computational social science that is used to study how authors select aspects of a topic to frame its reception. So far, such studies have been narrow, in that they use a fixed set of pre-defined frames and focus only on the text, ignoring the visual contexts in which those texts appear. Especially for framing in the news, this leaves out valuable information about editorial choices, which include not just the written article but also accompanying photographs. To overcome such limitations, we present a method for conducting multi-modal, multi-label framing analysis at scale using large (vision-) language models. Grounding our work in framing theory, we extract latent meaning embedded in images used to convey a certain point and contrast that to the text by comparing the respective frames used. We also identify highly partisan framing of topics with issue-specific frame analysis found in prior qualitative work. We demonstrate a method for doing scalable integrative framing analysis of both text and image in news, providing a more complete picture for understanding media bias.

在计算社会科学中,政治传播的自动框架分析是一项流行的任务,用于研究作者如何选择一个专题的方方面面来设计其接受范围。迄今为止,这种研究范围很窄,使用一套固定的预设框架,只注重文本,忽视了文本的视觉背景。特别是为了在新闻中进行设计,这留下了关于编辑选择的宝贵信息,其中不仅包括书面文章,也包括相附照片。为了克服这些限制,我们提出了一个方法,用大型(视觉)语言模型进行规模的多式多标签框架分析。我们用构思理论作为我们工作的基础,我们从图像中提取潜含的含义,用来传递一个特定点,并通过比较所使用的相应框架来对比文本。我们还确定了高度偏向性的主题框架,在先前的质量工作中发现了针对具体问题的框架分析。我们展示了对文本和新闻图像进行可扩展的综合框架分析的方法,为理解媒体偏见提供了更完整的图片。

Article 191

Title@2025-05-29 (4): Comparative Analysis of the Land Use and Land Cover Changes in Different Governorates of Oman using Spatiotemporal Multi-spectral Satellite Data

Title: Comparative Analysis of the Land Use and Land Cover Changes in Different Governorates of Oman using Spatiotemporal Multi-spectral Satellite Data

Vergleichende Analyse der Bodennutzungs- und Bodenbedeckungsänderungen in verschiedenen Gouvernements von Oman unter Verwendung spatiotemporaler multispektraler Satellitendaten

利用斯帕蒂多光谱多谱段卫星数据对阿曼不同省份土地利用和土地覆盖变化的比较分析 2505.23285v1

Authors: Muhammad Shafi, Syed Mohsin Bokhari

Monitoring land use and land cover (LULC) changes is a key application of satellite imagery, and it plays a critical role in resource management, urbanization, protection of soils and the environment, and sustainable development. The literature has heavily utilized multispectral spatiotemporal satellite data alongside advanced machine learning algorithms to monitor and predict LULC changes. This study analyzes and compares LULC changes across various governorates (provinces) of the Sultanate of Oman from 2016 to 2021 using annual time steps. For the chosen region, multispectral spatiotemporal data were acquired from the open-source Sentinel-2 satellite dataset. Supervised machine learning algorithms were trained to classify different land covers, such as water bodies, crops, and urban areas. The constructed model was subsequently applied within the study region, allowing for an effective comparative evaluation of LULC changes within the given timeframe.

土地覆盖和土地利用变化是卫星图像的关键应用,在资源管理、城市化、土壤和环境保护以及促进可持续发展方面发挥着关键作用;文献大量利用多光谱空间卫星数据以及先进的机器学习算法来监测和预测土地覆盖和土地利用变化;这项研究利用年度时间步骤分析和比较了阿曼苏丹国各省(省)从2016年至2021年的土地覆盖和土地利用变化;对于选定的区域,从开放源码Sentinel-2卫星数据集获取了多谱波段时数据;利用了超导机学习算法来培训和分类不同的土地覆盖,如水体、作物、城市等;随后在研究区域内应用了所构建的模式,以便能够在规定的时限内对土地覆盖和土地利用变化进行有效的比较评估。

Article 192

Title@2025-05-29 (4): Improving Continual Learning Performance and Efficiency with Auxiliary Classifiers

Title: Improving Continual Learning Performance and Efficiency with Auxiliary Classifiers

Verbesserung der kontinuierlichen Lernleistung und Effizienz mit Hilfsklassifikatoren

提高持续学习成绩和效率,辅级分级 2403.07404v4

Authors: Filip Szatkowski, Yaoyue Zheng, Fei Yang, Bartłomiej Twardowski, Tomasz Trzciński, Joost van de Weijer

Continual learning is crucial for applying machine learning in challenging, dynamic, and often resource-constrained environments. However, catastrophic forgetting - overwriting previously learned knowledge when new information is acquired - remains a major challenge. In this work, we examine the intermediate representations in neural network layers during continual learning and find that such representations are less prone to forgetting, highlighting their potential to accelerate computation. Motivated by these findings, we propose to use auxiliary classifiers (ACs) to enhance performance and demonstrate that integrating ACs into various continual learning methods consistently improves accuracy across diverse evaluation settings, yielding an average 10% relative gain. We also leverage the ACs to reduce the average inference cost by 10-60% without compromising accuracy, enabling the model to return predictions before computing all the layers. Our approach provides a scalable and efficient solution for continual learning.

持续学习对于在具有挑战性、动态性和经常受到资源制约的环境中应用机器学习至关重要。然而,灾难性的遗忘 — — 在获得新信息时超过先前学到的知识 — — 仍然是一个重大挑战。在这项工作中,我们检查了神经网络层在持续学习过程中的中间代表,发现这种代表不太容易忘记,强调其加速计算的潜力。我们建议利用辅助分类器提高业绩,并表明将ACs纳入各种持续学习方法,不断提高不同评价环境的准确性,平均产生10%的相对收益。我们还利用ACs将推断的平均成本降低10-60%,同时不损害准确性,使模型能够在计算所有层次之前返回预测。我们的方法为持续学习提供了可扩展和高效的解决方案。
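
The idea of returning a prediction before all layers are computed can be sketched as confidence-based early exit. The exit rule used here (stop at the first auxiliary classifier whose top softmax probability reaches a threshold) is an assumption for illustration, not necessarily the criterion used in the paper; `early_exit_predict`, `blocks`, `classifiers`, and `threshold` are hypothetical names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(x, blocks, classifiers, threshold=0.9):
    """Run network blocks in order and exit at the first confident classifier.

    blocks[i] maps a hidden state to the next hidden state; classifiers[i]
    maps that hidden state to class logits (the last one is the final head).
    Returns (predicted_class, number_of_blocks_computed).
    """
    if not blocks or len(blocks) != len(classifiers):
        raise ValueError("need one classifier per block and at least one block")
    h = x
    for depth, (block, clf) in enumerate(zip(blocks, classifiers), start=1):
        h = block(h)
        probs = softmax(clf(h))
        confidence = max(probs)
        # Exit early when confident, or fall back to the final classifier.
        if confidence >= threshold or depth == len(blocks):
            return probs.index(confidence), depth
```

Lowering `threshold` trades accuracy for compute: more inputs leave at shallow classifiers and fewer blocks are evaluated on average.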

Article 193

Title@2025-05-29 (4): Optimal Protocols for Continual Learning via Statistical Physics and Control Theory

Title: Optimal Protocols for Continual Learning via Statistical Physics and Control Theory

Optimale Protokolle für kontinuierliches Lernen über statistische Physik und Steuerungstheorie

通过统计物理和控制理论不断学习的最佳最佳协议 2409.18061v3

Authors: Francesco Mori, Stefano Sarao Mannelli, Francesca Mignacco

Artificial neural networks often struggle with catastrophic forgetting when learning multiple tasks sequentially, as training on new tasks degrades the performance on previously learned tasks. Recent theoretical work has addressed this issue by analysing learning curves in synthetic frameworks under predefined training protocols. However, these protocols relied on heuristics and lacked a solid theoretical foundation assessing their optimality. In this paper, we fill this gap by combining exact equations for training dynamics, derived using statistical physics techniques, with optimal control methods. We apply this approach to teacher-student models for continual learning and multi-task problems, obtaining a theory for task-selection protocols maximising performance while minimising forgetting. Our theoretical analysis offers non-trivial yet interpretable strategies for mitigating catastrophic forgetting, shedding light on how optimal learning protocols modulate established effects, such as the influence of task similarity on forgetting. Finally, we validate our theoretical findings with experiments on real-world data.

人工神经网络在连续学习多重任务时往往与灾难性的遗忘作斗争,因为关于新任务的培训会降低以前学到的任务的绩效。最近的理论工作通过分析根据预先确定的培训规程在合成框架中的学习曲线来解决这个问题。然而,这些规程依赖超自然学,缺乏坚实的理论基础来评估其最佳性能。在本文中,我们通过将培训动态的精确方程式、利用统计物理技术的推算和最佳控制方法来填补这一差距。我们将这种方法应用于师生模式,用于持续学习和多任务问题,获取任务选择协议优化绩效的理论,同时尽量减少遗忘。我们的理论分析为减轻灾难性的遗忘提供了非边际但可解释的战略,并展示了最佳学习规程如何调整既定效果,例如任务相似对遗忘的影响。最后,我们用真实世界数据实验来验证我们的理论发现。

Article 194

Title@2025-05-29 (4): LADA: Scalable Label-Specific CLIP Adapter for Continual Learning

Title: LADA: Scalable Label-Specific CLIP Adapter for Continual Learning

LADA: Skalierbarer Label-Spezifischer CLIP Adapter für kontinuierliches Lernen

LADA:用于持续学习的可缩放标签特定CLIP适应器 2505.23271v1

Authors: Mao-Lin Luo, Zi-Hao Zhou, Tong Wei, Min-Ling Zhang

Continual learning with vision-language models like CLIP offers a pathway toward scalable machine learning systems by leveraging its transferable representations. Existing CLIP-based methods adapt the pre-trained image encoder by adding multiple sets of learnable parameters, with each task using a partial set of parameters. This requires selecting the expected parameters for input images during inference, which is prone to error that degrades performance. To address this problem, we introduce LADA (Label-specific ADApter). Instead of partitioning parameters across tasks, LADA appends lightweight, label-specific memory units to the frozen CLIP image encoder, enabling discriminative feature generation by aggregating task-agnostic knowledge. To prevent catastrophic forgetting, LADA employs feature distillation for seen classes, preventing their features from being interfered with by new classes. Positioned after the image encoder, LADA prevents gradient flow to the frozen CLIP parameters, ensuring efficient training. Extensive results show that LADA achieves state-of-the-art performance in continual learning settings. The implementation code is available at https://github.com/MaolinLuo/LADA.

利用CLIP等视觉语言模型不断学习,通过利用其可转移的表示式,为可扩缩机器学习系统提供了一条路径。基于CLIP的现有方法通过添加多套可学习参数,对预培训的图像编码器进行调整,每项任务都使用一套部分参数。这要求在推断过程中选择输入图像的预期参数,这容易导致性能下降的错误。为了解决这一问题,我们引入了LADA(Label-specific ADApter,标签特定适配器)。LADA不是将参数分成不同任务,而是将轻量级、特定标签的记忆单元附加到冻结的CLIP图像编码器上,通过汇总与任务无关的知识,生成具有判别性的特征。为了防止灾难性的遗忘,LADA对已见类别使用特征蒸馏,防止其特征受到新类别的干扰。LADA位于图像编码器之后,阻止梯度流向冻结的CLIP参数,确保有效的培训。广泛的结果显示,LADA在持续学习环境中达到了最先进的性能。执行代码可在 https://github.com/MaolinLuo/LADA 获取。

Article 195

Title@2025-05-29 (4): Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs

Title: Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs

Entfernt Machine Unlearning wirklich Modellwissen? Ein Rahmen für die Prüfung von Unlearning in LLMs

机器取消学习是否真正删除了示范知识? 审计框架是否在LLMM中取消学习? 2505.23270v1

Authors: Haokun Chen, Yueqi Zhang, Yuan Bi, Yao Zhang, Tong Liu, Jinhe Bi, Jian Lan, Jindong Gu, Claudia Grosser, Denis Krompass, Nassir Navab, Volker Tresp

In recent years, Large Language Models (LLMs) have achieved remarkable advancements, drawing significant attention from the research community. Their capabilities are largely attributed to large-scale architectures, which require extensive training on massive datasets. However, such datasets often contain sensitive or copyrighted content sourced from the public internet, raising concerns about data privacy and ownership. Regulatory frameworks, such as the General Data Protection Regulation (GDPR), grant individuals the right to request the removal of such sensitive information. This has motivated the development of machine unlearning algorithms that aim to remove specific knowledge from models without the need for costly retraining. Despite these advancements, evaluating the efficacy of unlearning algorithms remains a challenge due to the inherent complexity and generative nature of LLMs. In this work, we introduce a comprehensive auditing framework for unlearning evaluation, comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods. By using various auditing algorithms, we evaluate the effectiveness and robustness of different unlearning strategies. To explore alternatives beyond prompt-based auditing, we propose a novel technique that leverages intermediate activation perturbations, addressing the limitations of auditing methods that rely solely on model inputs and outputs.

近年来,大语言模型(LLMS)取得了显著进步,引起了研究界的极大关注,其能力主要归功于大型结构,需要大规模数据集的广泛培训,然而,这类数据集往往包含来自公共互联网的敏感或版权内容,引起对数据隐私和所有权的关切。《一般数据保护条例》(GDPR)等监管框架赋予个人要求删除这类敏感信息的权利。这促使开发了旨在将特定知识从模型中去除而无需花费昂贵的再培训的机读算法。尽管取得了这些进步,但由于LLMS的内在复杂性和基因性质,评价未学习算法的功效仍然是一项挑战。在这项工作中,我们引入了一个非学习评价综合审计框架,由三个基准数据集、六个未学习算法和五个快速审计方法组成。我们通过使用各种审计算法,评估不同不学习战略的有效性和稳健性。为了探索超越快速审计的替代方法,我们提出了一种创新技术,即利用中间启动过动,解决仅依赖投入和产出的审计方法的局限性。

Article 196

Title@2025-05-29 (4): Behavior-Regularized Diffusion Policy Optimization for Offline Reinforcement Learning

Title: Behavior-Regularized Diffusion Policy Optimization for Offline Reinforcement Learning

Behavior-Regularized Diffusion Policy Optimierung für Offline-Verstärkung Lernen

离线强化学习的传播政策优化 2502.04778v2

Authors: Chen-Xiao Gao, Chenyang Wu, Mingjun Cao, Chenjun Xiao, Yang Yu, Zongzhang Zhang

Behavior regularization, which constrains the policy to stay close to some behavior policy, is widely used in offline reinforcement learning (RL) to manage the risk of hazardous exploitation of unseen actions. Nevertheless, existing literature on behavior-regularized RL primarily focuses on explicit policy parameterizations, such as Gaussian policies. Consequently, it remains unclear how to extend this framework to more advanced policy parameterizations, such as diffusion models. In this paper, we introduce BDPO, a principled behavior-regularized RL framework tailored for diffusion-based policies, thereby combining the expressive power of diffusion policies and the robustness provided by regularization. The key ingredient of our method is to calculate the Kullback-Leibler (KL) regularization analytically as the accumulated discrepancies in reverse-time transition kernels along the diffusion trajectory. By integrating the regularization, we develop an efficient two-time-scale actor-critic RL algorithm that produces the optimal policy while respecting the behavior constraint. Comprehensive evaluations conducted on synthetic 2D tasks and continuous control tasks from the D4RL benchmark validate its effectiveness and superior performance.

行为正规化政策限制了政策接近某些行为政策,在非在线强化学习(RL)中广泛用于管理危险利用无形行动的风险,然而,关于行为正规化的RL的现有文献主要侧重于明确的政策参数化,如高斯政策,因此仍不清楚如何将这一框架扩展至更先进的政策参数化,如推广模式。在本文件中,我们引入了BDPO,这是为基于传播的政策量身定制的基于行为正规化的原则性RL框架,从而结合了传播政策的表达力和规范化所提供的稳健性。我们方法的关键要素是分析计算Kullback-Leiberr(KL)的正规化,作为沿传播轨道的逆时过渡内圈的累积差异。通过整合正规化,我们开发了一种高效的双时制的演员-critict RL算法,在尊重行为约束的同时产生最佳政策。对合成2D任务进行了全面评价,D4RL基准的连续控制任务验证了其有效性和优劣性。
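
The "accumulated discrepancies in reverse-time transition kernels" can be written out as follows. This is a sketch under assumed notation, not the paper's exact statement: the diffusion policy $\pi$ and the behavior policy $\nu$ both generate an action through reverse steps $a^{N} \to \dots \to a^{0}$ from the same prior, and their step-$n$ reverse kernels are Gaussians with means $\mu^{\pi}_{n}$, $\mu^{\nu}_{n}$ and a shared variance $\sigma_n^{2}$. The first identity is the chain rule for KL divergence over the whole trajectory; the second is the closed-form KL between two Gaussians with equal covariance.

```latex
\mathrm{KL}\big(p^{\pi}_{0:N}(\cdot \mid s)\,\|\,p^{\nu}_{0:N}(\cdot \mid s)\big)
= \mathbb{E}_{a^{0:N}\sim p^{\pi}}\Big[\sum_{n=1}^{N}
\mathrm{KL}\big(p^{\pi}_{n-1\mid n}(\cdot \mid a^{n}, s)\,\|\,
p^{\nu}_{n-1\mid n}(\cdot \mid a^{n}, s)\big)\Big],
\qquad
\mathrm{KL}\big(\mathcal{N}(\mu^{\pi}_{n},\sigma_n^{2}I)\,\|\,
\mathcal{N}(\mu^{\nu}_{n},\sigma_n^{2}I)\big)
= \frac{\|\mu^{\pi}_{n}-\mu^{\nu}_{n}\|^{2}}{2\sigma_n^{2}}.
```

Under these assumptions each per-step term reduces to a scaled squared distance between predicted means, which is why the regularizer can be evaluated analytically along a sampled trajectory.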

Article 197

Title@2025-05-29 (4): Efficiently Access Diffusion Fisher: Within the Outer Product Span Space

Title: Efficiently Access Diffusion Fisher: Within the Outer Product Span Space

Effizienter Zugriff auf Diffusion Fisher: Innerhalb des Outer Product Span Space

有效获取扩散渔渔场:在外生产品空间内 2505.23264v1

Authors: Fangyikang Wang, Hubery Yin, Shaobin Zhuang, Huminhao Zhu, Yinan Li, Lei Qian, Chao Zhang, Hanbin Zhao, Hui Qian, Chen Li

Recent advances in diffusion models (DMs) have explored incorporating the second-order diffusion Fisher information (DF), defined as the negative Hessian of log density, into various downstream tasks and theoretical analyses. However, current practices typically approximate the diffusion Fisher by applying auto-differentiation to the learned score network. This black-box method, though straightforward, lacks any accuracy guarantee and is time-consuming. In this paper, we show that the diffusion Fisher actually resides within a space spanned by the outer products of score and initial data. Based on the outer-product structure, we develop two efficient approximation algorithms to access the trace and matrix-vector multiplication of DF, respectively. These algorithms bypass the auto-differentiation operations with time-efficient vector-product calculations. Furthermore, we establish the approximation error bounds for the proposed algorithms. Experiments in likelihood evaluation and adjoint optimization demonstrate the superior accuracy and reduced computational cost of our proposed algorithms. Additionally, based on the novel outer-product formulation of DF, we design the first numerical verification experiment for the optimal transport property of the general PF-ODE deduced map.

最近的传播模型(DMs)已经探索了将第二阶扩散渔业信息(DF)纳入各种下游任务和理论分析中,目前的做法通常通过对学分网络应用自动差异来接近Fisher的传播。这种黑箱方法虽然简单,缺乏准确性保证,而且耗时。在本文中,我们表明,扩散渔业者实际上生活在分数外产品和初始数据所覆盖的空间之内。根据外产品结构,我们分别开发了两种高效近似算法,以获取DF的痕量和矩阵矢量乘数乘数。这些算法绕过自动差异操作,以具有时间效率的矢量产品计算。此外,我们为拟议的算法确定了近似误差界限。在可能性评估和联合优化方面进行的实验表明,我们提议的算法的精确性和计算成本已经降低。此外,根据DF的新型外产品配方,我们设计了第一次数字核查实验,用于PFO-OD推算的通用地图的最佳运输特性。

Article 198

Title@2025-05-29 (4): Stable Thompson Sampling: Valid Inference via Variance Inflation

Title: Stable Thompson Sampling: Valid Inference via Variance Inflation

Stabile Thompson-Probenahme: Gültige Schlussfolgerung durch Varianz-Inflation

稳定汤普森抽样:因通货膨胀差异而得出的有效推论 2505.23260v1

Authors: Budhaditya Halder, Shubhayan Pan, Koulik Khamaru

We consider the problem of statistical inference when the data is collected via a Thompson Sampling-type algorithm. While Thompson Sampling (TS) is known to be both asymptotically optimal and empirically effective, its adaptive sampling scheme poses challenges for constructing confidence intervals for model parameters. We propose and analyze a variant of TS, called Stable Thompson Sampling, in which the posterior variance is inflated by a logarithmic factor. We show that this modification leads to asymptotically normal estimates of the arm means, despite the non-i.i.d. nature of the data. Importantly, this statistical benefit comes at a modest cost: the variance inflation increases regret by only a logarithmic factor compared to standard TS. Our results reveal a principled trade-off: by paying a small price in regret, one can enable valid statistical inference for adaptive decision-making algorithms.

在通过汤普森抽样类型算法收集数据时,我们考虑统计推论问题。尽管汤普森抽样(TS)已知是非现成的最佳和实证有效的,但其适应性抽样办法对建立模型参数的信任间隔提出了挑战。我们提出并分析了TS的变种,称为Stabable Thompson抽样,其中后方差异因对数因素而膨胀。我们表明,这一修改导致对手臂手段的无现性正常估计,尽管数据是非i.d.性质。重要的是,这一统计效益的代价不大:差价通胀因与标准TS相比的逻辑因素而增加遗憾。我们的结果揭示了一种有原则的权衡:如果支付少量的价钱,人们就可以为适应性决策算法提供有效的统计推论。
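
A minimal sketch of the variance-inflation idea, assuming Gaussian rewards with unit variance and a flat prior, so that the sampling distribution for an arm pulled n times is N(sample mean, factor / n). The specific factor `log(horizon)` is an assumption chosen only to illustrate "a logarithmic factor"; the paper's exact inflation schedule may differ. The function name and parameters are hypothetical.

```python
import math
import random


def thompson_sampling(true_means, horizon, inflate=True, seed=0):
    """Gaussian Thompson Sampling with optional posterior-variance inflation.

    Returns (sample_means, pull_counts) after `horizon` rounds.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k
    # Assumed inflation factor; equals 1.0 for standard Thompson Sampling.
    factor = math.log(horizon) if inflate else 1.0
    for t in range(horizon):
        if t < k:
            arm = t  # pull every arm once to initialise the estimates
        else:
            draws = [
                rng.gauss(sums[a] / counts[a], math.sqrt(factor / counts[a]))
                for a in range(k)
            ]
            arm = max(range(k), key=draws.__getitem__)
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward
    means = [sums[a] / counts[a] for a in range(k)]
    return means, counts
```

With inflation, suboptimal arms keep receiving pulls for longer, which is the source of both the extra regret and the better-behaved arm-mean estimates described in the abstract.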

Article 199

Title@2025-05-29 (4): BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL

Title: BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL

BOFormer: Lernen, Multi-Objektive Bayesian Optimierung über nicht-Markovian RL zu lösen

BOFormer: 学会通过非马尔科维安RL解决多目标巴耶斯最佳利用 2505.21974v2

Authors: Yu-Heng Hung, Kai-Jie Lin, Yu-Heng Lin, Chien-Yi Wang, Cheng Sun, Ping-Chun Hsieh

Bayesian optimization (BO) offers an efficient pipeline for optimizing black-box functions with the help of a Gaussian process prior and an acquisition function (AF). Recently, in the context of single-objective BO, learning-based AFs have shown promising empirical results given their favorable non-myopic nature. Despite this, the direct extension of these approaches to multi-objective Bayesian optimization (MOBO) suffers from the hypervolume identifiability issue, which results from the non-Markovian nature of MOBO problems. To tackle this, inspired by the non-Markovian RL literature and the success of Transformers in language modeling, we present a generalized deep Q-learning framework and propose BOFormer, which substantiates this framework for MOBO via sequence modeling. Through extensive evaluation, we demonstrate that BOFormer consistently outperforms the benchmark rule-based and learning-based algorithms in various synthetic MOBO and real-world multi-objective hyperparameter optimization problems. We have made the source code publicly available to encourage further research in this direction.

贝叶斯优化(BO)为优化黑箱功能提供了高效管道,在Gaussian进程之前和获取功能(AF)的帮助下,优化黑箱功能提供了高效管道。最近,在单一目标BO的背景下,基于学习的AF公司由于其有利的非微型性质,见证了有希望的经验结果。尽管如此,这些方法直接扩展到多目标巴伊西亚优化(MOBO),这得益于MOBO问题的非马尔科维尼亚性质。为了解决这个问题,在非马尔科维尼亚RL文献和变异者在语言建模方面的成功启发下,我们提出了一个普遍深入的Q学习框架,并提议通过序列建模为MOBO提供这种框架。我们通过广泛的评估表明,BOFormer公司不断超越各种合成MOBO和实体-世界多功能化的基于学习的算法。我们公开提供了源代码,以鼓励在这方面进行进一步的研究。

Article 200

Title@2025-05-29 (4): Skywork Open Reasoner 1 Technical Report

Title: Skywork Open Reasoner 1 Technical Report

Skywork Open Reasoner 1 Technischer Bericht

Skywork Open Reasoner 1 技术报告 2505.22312v2

Authors: Jujie He, Jiacai Liu, Chris Yuhao Liu, Rui Yan, Chaojie Wang, Peng Cheng, Xiaoyu Zhang, Fuxiang Zhang, Jiacheng Xu, Wei Shen, Siyuan Li, Liang Zeng, Tianwen Wei, Cheng Cheng, Bo An, Yang Liu, Yahui Zhou

The success of DeepSeek-R1 underscores the significant role of reinforcement learning (RL) in enhancing the reasoning capabilities of large language models (LLMs). In this work, we present Skywork-OR1, an effective and scalable RL implementation for long Chain-of-Thought (CoT) models. Building on the DeepSeek-R1-Distill model series, our RL approach achieves notable performance gains, increasing average accuracy across AIME24, AIME25, and LiveCodeBench from 57.8% to 72.8% (+15.0%) for the 32B model and from 43.6% to 57.5% (+13.9%) for the 7B model. Our Skywork-OR1-32B model surpasses both DeepSeek-R1 and Qwen3-32B on the AIME24 and AIME25 benchmarks, while achieving comparable results on LiveCodeBench. The Skywork-OR1-7B and Skywork-OR1-Math-7B models demonstrate competitive reasoning capabilities among models of similar size. We perform comprehensive ablation studies on the core components of our training pipeline to validate their effectiveness. Additionally, we thoroughly investigate the phenomenon of entropy collapse, identify key factors affecting entropy dynamics, and demonstrate that mitigating premature entropy collapse is critical for improved test performance. To support community research, we fully open-source our model weights, training code, and training datasets.

DeepSeek-R1的成功突显了强化学习(RL)在提高大型语言模型(LLMs)推理能力方面的重要作用。在这项工作中,我们展示了Skywork-OR1,这是面向长思维链(CoT)模型的一种有效且可扩展的RL实现。在DeepSeek-R1-Distill模型系列的基础上,我们的RL方法取得了显著的性能提升,使32B模型在AIME24、AIME25和LiveCodeBench上的平均准确率从57.8%提高到72.8%(+15.0%),7B模型从43.6%提高到57.5%(+13.9%)。我们的Skywork-OR1-32B模型在AIME24和AIME25基准上超过了DeepSeek-R1和Qwen3-32B,同时在LiveCodeBench上取得了相当的结果。Skywork-OR1-7B和Skywork-OR1-Math-7B模型在类似规模的模型中展示了有竞争力的推理能力。我们对训练流程的核心组件进行了全面的消融研究,以验证其有效性。此外,我们深入研究了熵坍缩现象,确定了影响熵动态的关键因素,并证明缓解过早的熵坍缩对提高测试性能至关重要。为支持社区研究,我们完全开源了模型权重、训练代码和训练数据集。

Article 201

Title@2025-05-29 (4): Tensor Product Attention Is All You Need

Title: Tensor Product Attention Is All You Need

Tensor Produkt-Achtung ist alles, was Sie brauchen

色素产品关注是所有你需要的 2501.06425v4

Authors: Yifan Zhang, Yifeng Liu, Huizhuo Yuan, Zhen Qin, Yang Yuan, Quanquan Gu, Andrew C Yao

Scaling language models to handle longer input sequences typically necessitates large key-value (KV) caches, resulting in substantial memory overhead during inference. In this paper, we propose Tensor Product Attention (TPA), a novel attention mechanism that uses tensor decompositions to represent queries, keys, and values compactly, substantially shrinking the KV cache size at inference time. By factorizing these representations into contextual low-rank components and seamlessly integrating with Rotary Position Embedding (RoPE), TPA achieves improved model quality alongside memory efficiency. Based on TPA, we introduce the Tensor Product Attention Transformer (T6), a new model architecture for sequence modeling. Through extensive empirical evaluation on language modeling tasks, we demonstrate that T6 surpasses or matches the performance of standard Transformer baselines, including Multi-Head Attention (MHA), Multi-Query Attention (MQA), Grouped-Query Attention (GQA), and Multi-Head Latent Attention (MLA) across various metrics, including perplexity and a range of established evaluation benchmarks. Notably, TPA’s memory efficiency and computational efficiency at the decoding stage enable processing longer sequences under fixed resource constraints, addressing a critical scalability challenge in modern language models. The code is available at https://github.com/tensorgi/T6.

用于处理较长输入序列的扩缩语言模型通常需要大量键值(KV)缓存,从而在推断过程中产生大量的内存开销。在本文件中,我们提议Tensor产品注意(TPA),这是一个新式注意力机制,它使用张量分解来紧凑地表示查询、键和值,在推断时大大缩小了KV缓存的大小。通过将这些表示分解为上下文相关的低秩组件,并与旋转位置嵌入(RoPE)无缝结合,TPA在提高内存效率的同时提高了模型质量。在TPA的基础上,我们引入了Tensor产品注意变换器(T6),这是一个用于序列建模的新模型架构。通过对语言模型任务进行广泛的经验性评估,我们证明T6在各种指标上超过或匹配标准变换器基线的性能,包括多头注意(MHA)、多查询注意(MQA)、分组查询注意(GQA)和多头潜在注意(MLA),这些指标包括困惑度和一系列既定评价基准。值得注意的是,TPA在解码阶段的内存效率和计算效率使其能够在固定资源约束下处理更长的序列,解决了现代语言模型中的一个关键可扩展性挑战。代码可在 https://github.com/tensorgi/T6 获取。
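
To make the cache saving concrete, here is a minimal sketch of storing a per-token key (or value) as a small set of factor vectors instead of a full heads x head_dim matrix. The reconstruction as an average of rank-1 outer products, the 1/rank scaling, and the names `outer_sum` and `cache_floats_per_token` are assumptions for illustration; the abstract only states that tensor decompositions with contextual low-rank components are used.

```python
def outer_sum(a_factors, b_factors):
    """Rebuild a (heads x head_dim) matrix as the mean of outer products a_r b_r^T.

    a_factors: list of R vectors of length `heads`.
    b_factors: list of R vectors of length `head_dim`.
    """
    rank = len(a_factors)
    heads, dim = len(a_factors[0]), len(b_factors[0])
    out = [[0.0] * dim for _ in range(heads)]
    for a, b in zip(a_factors, b_factors):
        for i in range(heads):
            for j in range(dim):
                out[i][j] += a[i] * b[j] / rank
    return out


def cache_floats_per_token(heads, head_dim, rank=None):
    """Floats cached per token for keys plus values.

    rank=None: full K and V matrices are cached.
    rank=R:    only the R factor pairs for K and for V are cached.
    """
    if rank is None:
        return 2 * heads * head_dim
    return 2 * rank * (heads + head_dim)
```

For example, with 32 heads of dimension 128, caching full keys and values takes 8192 floats per token, while rank-2 factors take 640 under this sketch's accounting.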

Article 202

Title@2025-05-29 (4): Sparseformer: a Transferable Transformer with Multi-granularity Token Sparsification for Medical Time Series Classification

Title: Sparseformer: a Transferable Transformer with Multi-granularity Token Sparsification for Medical Time Series Classification

Sparseformer: ein übertragbarer Transformer mit Multigranularitäts-Tokensparsifikation für die Klassifizierung medizinischer Zeitreihen

分散式分析器:医疗时间序列分类的可转让变异器,具有多管质质调分法 2503.15578v2

Authors: Jiexia Ye, Weiqi Zhang, Ziyue Li, Jia Li, Fugee Tsung

Medical time series (MedTS) classification is crucial for improved diagnosis in healthcare, and yet it is challenging due to the varying granularity of patterns, intricate inter-channel correlation, information redundancy, and label scarcity. While existing transformer-based models have shown promise in time series analysis, they mainly focus on forecasting and fail to fully exploit the distinctive characteristics of MedTS data. In this paper, we introduce Sparseformer, a transformer specifically designed for MedTS classification. We propose a sparse token-based dual-attention mechanism that enables global modeling and token compression, allowing dynamic focus on the most informative tokens while distilling redundant features. This mechanism is then applied to the multi-granularity, cross-channel encoding of medical signals, capturing intra- and inter-granularity correlations and inter-channel connections. The sparsification design allows our model to handle heterogeneous inputs of varying lengths and channels directly. Further, we introduce an adaptive label encoder to address label space misalignment across datasets, equipping our model with cross-dataset transferability to alleviate the medical label scarcity issue. Our model outperforms 12 baselines across seven medical datasets under supervised learning. In the few-shot learning experiments, our model also achieves superior average results. In addition, the in-domain and cross-domain experiments among three diagnostic scenarios demonstrate our model’s zero-shot learning capability. Collectively, these findings underscore the robustness and transferability of our model in various medical applications.

医疗时间序列(MedTS)分类对于改善医疗诊断至关重要,然而,由于模式的颗粒性、渠道间关联的复杂、信息冗余和标签稀缺等不同,这种分类具有挑战性。虽然基于变压器的现有模型在时间序列分析中显示出希望,但主要侧重于预测,未能充分利用MedTS数据的独特特性。在本文中,我们引入了专门为MedTS分类设计的变压器Sprasserector(Sparseexor),这是专门为MedTS分类设计的变压器。我们建议了一种稀疏的象征性双向定位机制,可以进行全球建模和象征性压缩,允许动态地关注信息最丰富的代号,同时蒸馏多余的特性。这个机制随后应用到医疗信号的多频谱性、跨通道编码,捕捉到内部和群体间关联和通道间连接。在模型中,我们模型外加固的标签模型模型,在三个医学标签短缺问题上的交叉传输能力。我们模型外演化了12个实验,在模型中学习了我们的标准级模型。

Article 203

Title@2025-05-29 (4): RiverMamba: A State Space Model for Global River Discharge and Flood Forecasting

Title: RiverMamba: A State Space Model for Global River Discharge and Flood Forecasting

RiverMamba: Ein staatliches Weltraummodell für globale Flussentladung und Hochwasserprognose

RiverMamba:全球河流排泄和洪水预报国家空间模型 2505.22535v2

Authors: Mohamad Hakam Shams Eddin, Yikui Zhang, Stefan Kollet, Juergen Gall

Recent deep learning approaches for river discharge forecasting have improved the accuracy and efficiency in flood forecasting, enabling more reliable early warning systems for risk management. Nevertheless, existing deep learning approaches in hydrology remain largely confined to local-scale applications and do not leverage the inherent spatial connections of bodies of water. Thus, there is a strong need for new deep learning methodologies that are capable of modeling spatio-temporal relations to improve river discharge and flood forecasting for scientific and operational applications. To address this, we present RiverMamba, a novel deep learning model that is pretrained with long-term reanalysis data and that can forecast global river discharge and floods on a $0.05^\circ$ grid up to 7 days lead time, which is of high relevance in early warning. To achieve this, RiverMamba leverages efficient Mamba blocks that enable the model to capture global-scale channel network routing and enhance its forecast capability for longer lead times. The forecast blocks integrate ECMWF HRES meteorological forecasts, while accounting for their inaccuracies through spatio-temporal modeling. Our analysis demonstrates that RiverMamba delivers reliable predictions of river discharge, including extreme floods across return periods and lead times, surpassing both operational AI- and physics-based models.

近年来,用于河流流量预报的深度学习方法提高了洪水预报的准确性和效率,使风险管理的预警系统更加可靠。然而,水文学中现有的深度学习方法仍主要局限于局部尺度的应用,没有利用水体之间固有的空间联系。因此,迫切需要能够建模时空关系的新型深度学习方法,以改进面向科学和业务应用的河流流量与洪水预报。为此,我们提出 RiverMamba,这是一种新的深度学习模型,它使用长期再分析数据进行预训练,能够在 $0.05^\circ$ 网格上预报全球河流流量和洪水,预见期最长为 7 天,这对预警具有重要意义。为实现这一点,RiverMamba 利用高效的 Mamba 模块,使模型能够捕捉全球尺度的河道网络汇流,并增强其在较长预见期下的预报能力。预报模块整合了 ECMWF HRES 气象预报,同时通过时空建模来考虑其不准确性。我们的分析表明,RiverMamba 能够可靠地预测河流流量,包括不同重现期和预见期下的极端洪水,其表现超过了业务化的 AI 模型和基于物理的模型。

Article 204

Title@2025-05-29 (4): Accelerating RLHF Training with Reward Variance Increase

Title: Accelerating RLHF Training with Reward Variance Increase

Beschleunigung des RLHF-Trainings mit Belohnungsvarianzsteigerung

加快RLHF培训,增加奖励差异 2505.23247v1

Authors: Zonglin Yang, Zhexuan Gu, Houduo Qi, Yancheng Yuan

Reinforcement learning from human feedback (RLHF) is an essential technique for ensuring that large language models (LLMs) are aligned with human values and preferences during the post-training phase. As an effective RLHF approach, group relative policy optimization (GRPO) has demonstrated success in many LLM-based applications. However, efficient GRPO-based RLHF training remains a challenge. Recent studies reveal that a higher reward variance of the initial policy model leads to faster RLHF training. Inspired by this finding, we propose a practical reward adjustment model to accelerate RLHF training by provably increasing the reward variance and preserving the relative preferences and reward expectation. Our reward adjustment method inherently poses a nonconvex optimization problem, which is NP-hard to solve in general. To overcome the computational challenges, we design a novel $O(n \log n)$ algorithm to find a global solution of the nonconvex reward adjustment model by explicitly characterizing the extreme points of the feasible set. As an important application, we naturally integrate this reward adjustment model into the GRPO algorithm, leading to a more efficient GRPO with reward variance increase (GRPOVI) algorithm for RLHF training. As an interesting byproduct, we provide an indirect explanation for the empirical effectiveness of GRPO with rule-based reward for RLHF training, as demonstrated in DeepSeek-R1. Experiment results demonstrate that the GRPOVI algorithm can significantly improve the RLHF training efficiency compared to the original GRPO algorithm.

基于人类反馈的强化学习(RLHF)是在后训练阶段确保大型语言模型(LLM)与人类价值观和偏好保持一致的一项关键技术。作为一种有效的 RLHF 方法,组相对策略优化(GRPO)已在许多基于 LLM 的应用中取得成功。然而,高效的基于 GRPO 的 RLHF 训练仍然是一个挑战。最近的研究表明,初始策略模型的奖励方差越大,RLHF 训练越快。受此启发,我们提出一个实用的奖励调整模型,在保持相对偏好和奖励期望的同时可证明地增大奖励方差,从而加速 RLHF 训练。我们的奖励调整方法本质上是一个非凸优化问题,一般情况下求解是 NP 难的。为克服计算上的困难,我们通过显式刻画可行集的极点,设计了一种新的 $O(n \log n)$ 算法来求得该非凸奖励调整模型的全局解。作为一项重要应用,我们将该奖励调整模型自然地融入 GRPO 算法,得到用于 RLHF 训练的更高效的带奖励方差增大的 GRPO(GRPOVI)算法。作为一个有趣的副产品,我们为 DeepSeek-R1 所展示的、采用基于规则奖励的 GRPO 在 RLHF 训练中的经验有效性提供了一种间接解释。实验结果表明,与原始 GRPO 算法相比,GRPOVI 算法能够显著提高 RLHF 训练效率。

Article 205

Title@2025-05-29 (4): Measuring Participant Contributions in Decentralized Federated Learning

Title: Measuring Participant Contributions in Decentralized Federated Learning

Messung der Teilnehmerbeiträge im dezentralisierten Föderierten Lernen

分权联邦学习中的衡量参与者贡献 2505.23246v1

Authors: Honoka Anada, Tatsuya Kaneko, Shinya Takamaeda-Yamazaki

Federated learning (FL) enables multiple clients to collaboratively train models without sharing their data. Measuring participant contributions in FL is crucial for incentivizing clients and ensuring transparency. While various methods have been proposed for contribution measurement, they are designed exclusively for centralized federated learning (CFL), where a central server collects and aggregates client models, along with evaluating their contributions. Meanwhile, decentralized federated learning (DFL), in which clients exchange models directly without a central server, has gained significant attention for mitigating communication bottlenecks and eliminating a single point of failure. However, applying existing contribution measurement methods to DFL is challenging due to the presence of multiple global models and the absence of a central server. In this study, we present novel methodologies for measuring participant contributions in DFL. We first propose DFL-Shapley, an extension of the Shapley value tailored for DFL, adapting this widely used CFL metric to decentralized settings. Given the impracticality of computing the ideal DFL-Shapley in real-world systems, we introduce DFL-MR, a computable approximation that estimates overall contributions by accumulating round-wise Shapley values. We evaluate DFL-Shapley and DFL-MR across various FL scenarios and compare them with existing CFL metrics. The experimental results confirm DFL-Shapley as a valid ground-truth metric and demonstrate DFL-MR’s proximity to DFL-Shapley across various settings, highlighting their effectiveness as contribution metrics in DFL.

联邦学习(FL)使多个客户端能够在不共享数据的情况下协同训练模型。衡量 FL 中参与者的贡献对于激励客户端和确保透明度至关重要。虽然已有多种贡献度量方法被提出,但它们都是专为中心化联邦学习(CFL)设计的,其中由中央服务器收集并聚合客户端模型,同时评估各客户端的贡献。与此同时,去中心化联邦学习(DFL)中客户端无需中央服务器即可直接交换模型,因其能够缓解通信瓶颈并消除单点故障而受到广泛关注。然而,由于存在多个全局模型且没有中央服务器,将现有的贡献度量方法应用于 DFL 具有挑战性。在本研究中,我们提出了衡量 DFL 中参与者贡献的新方法。我们首先提出 DFL-Shapley,它是为 DFL 定制的 Shapley 值扩展,将这一在 CFL 中广泛使用的指标适配到去中心化场景。鉴于在实际系统中计算理想的 DFL-Shapley 并不现实,我们引入 DFL-MR,这是一种可计算的近似方法,通过累积逐轮的 Shapley 值来估计总体贡献。我们在多种 FL 场景下评估了 DFL-Shapley 和 DFL-MR,并将其与现有的 CFL 指标进行比较。实验结果证实 DFL-Shapley 是有效的基准真值指标,并表明 DFL-MR 在各种设置下都接近 DFL-Shapley,凸显了二者作为 DFL 贡献度量指标的有效性。
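The Shapley value that DFL-Shapley extends can be illustrated with a minimal sketch. This is the classic exact Shapley computation for a small set of clients, not the DFL-Shapley or DFL-MR procedure from the paper; the `utility` function and the client gains below are hypothetical stand-ins for "quality of the model trained by this coalition".

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values for a small set of players.

    `utility` maps a frozenset of players to a score. It is a hypothetical
    placeholder here (for example, accuracy of a model trained by that coalition).
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight shared by every coalition of size k that excludes p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Toy additive utility: each client adds a fixed, made-up gain.
gains = {"a": 0.1, "b": 0.3, "c": 0.6}


def toy_utility(coalition):
    return sum(gains[c] for c in coalition)


phi = shapley_values(list(gains), toy_utility)
```

With an additive utility each client's Shapley value equals its own gain, which makes the toy case easy to check by hand. The exact computation enumerates all coalitions, so its cost grows exponentially with the number of clients.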

Article 206

Title@2025-05-29 (4): Are You Using Reliable Graph Prompts? Trojan Prompt Attacks on Graph Neural Networks

Title: Are You Using Reliable Graph Prompts? Trojan Prompt Attacks on Graph Neural Networks

Verwenden Sie zuverlässige Graph-Prompts? Trojanische Prompt-Angriffe auf Graph-Neural-Netzwerke

你用的是可靠图形提示吗? Trojan对图形神经网络的迅速攻击 2410.13974v2

Authors: Minhua Lin, Zhiwei Zhang, Enyan Dai, Zongyu Wu, Yilong Wang, Xiang Zhang, Suhang Wang

Graph Prompt Learning (GPL) has been introduced as a promising approach that uses prompts to adapt pre-trained GNN models to specific downstream tasks without requiring fine-tuning of the entire model. Despite the advantages of GPL, little attention has been given to its vulnerability to backdoor attacks, where an adversary can manipulate the model’s behavior by embedding hidden triggers. Existing graph backdoor attacks rely on modifying model parameters during training, but this approach is impractical in GPL as GNN encoder parameters are frozen after pre-training. Moreover, downstream users may fine-tune their own task models on clean datasets, further complicating the attack. In this paper, we propose TGPA, a backdoor attack framework designed specifically for GPL. TGPA injects backdoors into graph prompts without modifying pre-trained GNN encoders and ensures high attack success rates and clean accuracy. To address the challenge of model fine-tuning by users, we introduce a finetuning-resistant poisoning approach that maintains the effectiveness of the backdoor even after downstream model adjustments. Extensive experiments on multiple datasets under various settings demonstrate the effectiveness of TGPA in compromising GPL models with fixed GNN encoders.

快速化图形学习(GPL)已作为一种很有希望的方法被引入为一种很有希望的方法,它使用快速的来使经过预先训练的GNN模型适应具体的下游任务,而无需对整个模型进行微调。尽管GPL的优点,它却很少注意其易受幕后攻击的脆弱性,即对手可以通过嵌入隐藏的触发器来操纵模型的行为。现有的图形后门攻击依靠的是修改培训期间的模型参数,但在GPL中这种做法是不切实际的,因为GNN编码参数在培训前被冻结。此外,下游用户可能会在清洁数据集方面微调自己的任务模型,使攻击进一步复杂化。在本文件中,我们提议TGPA是专门为GPL设计的后门攻击框架。TGPA将后门注入图形提示,而不修改经过预先训练的GNNNC编码器,确保高攻击成功率和干净的准确性。为了应对模型用户微调的挑战,我们引入了微调耐毒药的方法,即使在下游模型调整之后仍后门的功效。在各种环境下对多个数据集进行广泛的试验,展示了GGPL固定模型。

Article 207

Title@2025-05-29 (4): Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts

Title: Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts

Autonome Datenauswahl mit Zero-shot Generative Klassifikatoren für mathematische Texte

具有数学文本零光生成分类器的自动数据选择 2402.07625v6

Authors: Yifan Zhang, Yifan Luo, Yang Yuan, Andrew C Yao

We present Autonomous Data Selection (AutoDS), a method that leverages base language models themselves as zero-shot “generative classifiers” to automatically curate high-quality mathematical texts. Unlike prior approaches that require human annotations or training a dedicated data filter, AutoDS relies solely on a model’s logits to determine whether a given passage is mathematically informative and educational. By integrating AutoDS into a continual pretraining pipeline, we substantially boost downstream performance on challenging math benchmarks (MATH, GSM8K, and BBH) while using far fewer tokens than previous methods. Empirically, our approach achieves roughly a twofold improvement in pretraining token efficiency over strong baselines, underscoring the potential of self-directed data selection in enhancing mathematical reasoning. We release our curated AutoMathText dataset to facilitate future research in automated domain-specific data curation. The AutoMathText dataset is available at https://huggingface.co/datasets/math-ai/AutoMathText. The code is available at https://github.com/yifanzhang-pro/AutoMathText.

我们推出自动数据选择(AutoDS) , 这是一种将基本语言模型本身作为零光“ 遗传分类器” 来自动翻译高质量数学文本的方法。与以前要求人为说明或培训专用数据过滤器的方法不同, AutoDS 完全依靠模型的登录来确定某一特定通道是否具有数学上的信息和教育性。通过将AutoDS纳入持续的培训前管道,我们大大提升了具有挑战性的数学基准(MATH、GSM8K和BBH)的下游性能,同时使用远比以往少得多的符号。从目前来看,我们的方法在强化基线的预培训标语效率上取得了双重改进,强调了在加强数学推理过程中自行选择数据的潜力。我们发行了我们经过校准的AutoMathText数据集,以促进未来在自动特定域数据曲线上的研究。 AutoMatext数据集可在https://huggingface.co/datasts/math-ai/AutoMatthText上查阅。代码可在 https://github.com/yfanzhah- promatthTextText查阅。

Article 208

Title@2025-05-29 (4): Equivalence of stochastic and deterministic policy gradients

Title: Equivalence of stochastic and deterministic policy gradients

Äquivalenz stochastischer und deterministischer Policy-Gradienten

随机策略梯度与确定性策略梯度的等价性 2505.23244v1

Authors: Emo Todorov

Policy gradients in continuous control have been derived for both stochastic and deterministic policies. Here we study the relationship between the two. In a widely-used family of MDPs involving Gaussian control noise and quadratic control costs, we show that the stochastic and deterministic policy gradients, natural gradients, and state value functions are identical; while the state-control value functions are different. We then develop a general procedure for constructing an MDP with deterministic policy that is equivalent to a given MDP with stochastic policy. The controls of this new MDP are the sufficient statistics of the stochastic policy in the original MDP. Our results suggest that policy gradient methods can be unified by approximating state value functions rather than state-control value functions.

连续控制的政策梯度是用于随机和确定性政策的政策梯度。我们在这里研究两者之间的关系。在涉及高山控制噪音和二次控制成本的多用途产品系列中,我们显示,随机和确定性政策梯度、自然梯度和国家价值功能是相同的;虽然国家控制值功能不同。然后,我们制定了一个一般程序,用以构建一个具有确定性政策的多用途产品多用途产品多用途产品多用途产品,该政策相当于具有随机政策的某个多用途产品多用途产品多用途产品。这一新多用途产品多用途产品的控制是原始多用途产品多用途产品多用途产品的充分统计。我们的结果表明,政策梯度方法可以通过近似国家价值功能而不是国家控制值功能来统一。

Article 209

Title@2025-05-29 (4): Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game

Title: Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game

Sprachagenten mit Verstärkung Lernen für strategisches Spiel im Werwolf Spiel

在狼人游戏中进行战略游戏强化学习的语文代理 2310.18940v4

Authors: Zelai Xu, Chao Yu, Fei Fang, Yu Wang, Yi Wu

Agents built with large language models (LLMs) have shown great potential across a wide range of domains. However, in complex decision-making tasks, pure LLM-based agents tend to exhibit intrinsic bias in their choice of actions, which is inherited from the model’s training data and results in suboptimal performance. To develop strategic language agents, i.e., agents that generate flexible language actions and possess strong decision-making abilities, we propose a novel framework that powers LLM-based agents with reinforcement learning (RL). We consider Werewolf, a popular social deduction game, as a challenging testbed that emphasizes versatile communication and strategic gameplay. To mitigate the intrinsic bias in language actions, our agents use an LLM to perform deductive reasoning and generate a diverse set of action candidates. Then an RL policy trained to optimize the decision-making ability chooses an action from the candidates to play in the game. Extensive experiments show that our agents overcome the intrinsic bias and outperform existing LLM-based agents in the Werewolf game. We also conduct human-agent experiments and find that our agents achieve human-level performance and demonstrate strong strategic play.

然而,在复杂的决策任务中,纯粹的LLM代理商往往在选择行动时表现出内在的偏见,这种偏见是从该模式的培训数据所继承的,其结果不尽人意。为了发展战略语言代理商,即产生灵活语言行动和拥有强大决策能力的代理商,我们提议了一个赋予LLM代理商以强化学习能力的新框架。我们认为Wrewolf是一种流行的社会推理游戏,是一种具有挑战性的试金,它强调多功能的沟通和战略游戏。为了减轻语言行动的内在偏见,我们的代理商利用LLM进行推理推理和产生一套不同的行动候选人。然后,为优化决策能力而培训的RL政策从候选人中选择了在游戏中玩的动作。广泛的实验表明,我们的代理商克服了内在的偏见,超越了在Werewolf游戏中现有的LM代理商。我们还进行人力代理实验,发现我们的代理商取得了人的水平表现并展示了强有力的战略游戏。

Article 210

Title@2025-05-29 (4): Joint estimation of smooth graph signals from partial linear measurements

Title: Joint estimation of smooth graph signals from partial linear measurements

Gemeinsame Schätzung glatter Graphensignale aus partiellen linearen Messungen

对部分线性测量得出的平滑图示信号的联合估计 2505.23240v1

Authors: Hemant Tyagi

Given an undirected and connected graph $G$ on $T$ vertices, suppose each vertex $t$ has a latent signal $x_t \in \mathbb{R}^n$ associated to it. Given partial linear measurements of the signals, for a potentially small subset of the vertices, our goal is to estimate $x_t$’s. Assuming that the signals are smooth w.r.t. $G$, in the sense that the quadratic variation of the signals over the graph is small, we obtain non-asymptotic bounds on the mean squared error for jointly recovering $x_t$’s, for the smoothness penalized least squares estimator. In particular, this implies for certain choices of $G$ that this estimator is weakly consistent (as $T \rightarrow \infty$) under potentially very stringent sampling, where only one coordinate is measured per vertex for a vanishingly small fraction of the vertices. The results are extended to a “multi-layer” ranking problem where $x_t$ corresponds to the latent strengths of a collection of $n$ items, and noisy pairwise difference measurements are obtained at each “layer” $t$ via a measurement graph $G_t$. Weak consistency is established for certain choices of $G$ even when the individual $G_t$’s are very sparse and disconnected.

给定一个具有 $T$ 个顶点的无向连通图 $G$,假设每个顶点 $t$ 都关联一个潜在信号 $x_t \in \mathbb{R}^n$。给定这些信号的部分线性测量(可能只针对一小部分顶点),我们的目标是估计各个 $x_t$。假设信号关于 $G$ 是平滑的,即信号在图上的二次变差很小,我们针对平滑性惩罚最小二乘估计量,得到了联合恢复各个 $x_t$ 的均方误差的非渐近界。特别地,这意味着对于某些 $G$ 的选择,该估计量在可能非常严苛的采样条件下是弱相合的(当 $T \rightarrow \infty$ 时),其中只对比例趋于零的一小部分顶点、每个顶点仅测量一个坐标。这些结果被推广到“多层”排序问题,其中 $x_t$ 对应 $n$ 个项目的潜在强度,并且在每一“层” $t$ 通过测量图 $G_t$ 获得带噪声的成对差值测量。即使各个 $G_t$ 非常稀疏且不连通,对于某些 $G$ 的选择也能建立弱相合性。
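The smoothness assumption in this abstract is stated through the quadratic variation of the signals over the graph. A minimal sketch, assuming the usual definition as the sum of squared differences across edges (the abstract does not spell out the formula); the path graph and the signal values are made up for illustration.

```python
def quadratic_variation(edges, signals):
    """Sum over edges (u, v) of the squared Euclidean distance between signals."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(signals[u], signals[v]))
        for u, v in edges
    )


# Path graph on four vertices, each carrying a signal in R^2.
path_edges = [(0, 1), (1, 2), (2, 3)]
smooth = [[0.0, 1.0], [0.1, 1.0], [0.2, 1.1], [0.3, 1.1]]
rough = [[0.0, 1.0], [2.0, -1.0], [0.0, 1.0], [2.0, -1.0]]

qv_smooth = quadratic_variation(path_edges, smooth)
qv_rough = quadratic_variation(path_edges, rough)
```

A smoothness-penalized least squares estimator would add a multiple of this quantity to the data-fit term, so signals that vary slowly along edges are preferred.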

Article 211

Title@2025-05-29 (4): Learn Singularly Perturbed Solutions via Homotopy Dynamics

Title: Learn Singularly Perturbed Solutions via Homotopy Dynamics

Singulär perturbed Lösungen über Homotopy Dynamics lernen

通过智多基动力学学习单点受扰动的解决方案 2502.00488v3

Authors: Chuqi Chen, Yahong Yang, Yang Xiang, Wenrui Hao

Solving partial differential equations (PDEs) using neural networks has become a central focus in scientific machine learning. Training neural networks for singularly perturbed problems is particularly challenging due to certain parameters in the PDEs that introduce near-singularities in the loss function. In this study, we overcome this challenge by introducing a novel method based on homotopy dynamics to effectively manipulate these parameters. From a theoretical perspective, we analyze the effects of these parameters on training difficulty in these singularly perturbed problems and establish the convergence of the proposed homotopy dynamics method. Experimentally, we demonstrate that our approach significantly accelerates convergence and improves the accuracy of these singularly perturbed problems. These findings present an efficient optimization strategy leveraging homotopy dynamics, offering a robust framework to extend the applicability of neural networks for solving singularly perturbed differential equations.

使用神经网络解决部分差异方程式(PDEs)已成为科学机器学习的中心焦点。由于PDEs中的某些参数在损失函数中引入了近同质元素,因此,为奇特受扰动的问题培训神经网络尤其具有挑战性。在本研究中,我们通过采用基于同质动态的新颖方法来有效操控这些参数,克服了这一挑战。从理论角度看,我们分析了这些参数对这些奇特受扰动问题培训难度的影响,并建立了拟议的同质动态方法的趋同。实验来看,我们证明我们的方法大大加快了这些奇异受扰的问题的趋同,提高了这些问题的准确性。这些研究结果展示了利用同质动态的高效优化战略,提供了一个强大的框架,以扩大神经网络在解决奇受扰动差异方程式方面的适用性。

Article 212

Title@2025-05-29 (4): HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model

Title: HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model

HiDe-LlaVA: Hierarchische Entkopplung zur kontinuierlichen Instruktionstuning von multimodalen Großsprachenmodellen

HIDE-LLALAVA:多式大语言模式连续教学制导的等级脱钩 2503.12941v2

Authors: Haiyang Guo, Fanhu Zeng, Ziwei Xiang, Fei Zhu, Da-Han Wang, Xu-Yao Zhang, Cheng-Lin Liu

Instruction tuning is widely used to improve a pre-trained Multimodal Large Language Model (MLLM) by training it on curated task-specific datasets, enabling better comprehension of human instructions. However, it is infeasible to collect all possible instruction datasets simultaneously in real-world scenarios. Thus, enabling MLLM with continual instruction tuning is essential for maintaining their adaptability. However, existing methods often trade off memory efficiency for performance gains, significantly compromising overall efficiency. In this paper, we propose a task-specific expansion and task-general fusion framework based on the variations in Centered Kernel Alignment (CKA) similarity across different model layers when trained on diverse datasets. Furthermore, we analyze the information leakage present in the existing benchmark and propose a new and more challenging benchmark to rationally evaluate the performance of different methods. Comprehensive experiments showcase a significant performance improvement of our method compared to existing state-of-the-art methods. Code and dataset are released at https://github.com/Ghy0501/HiDe-LLaVA.

教学调整被广泛用来改进经过事先训练的多式大语言模型(MLLM),方法是对它进行关于具体任务数据集的培训,以便更好地了解人类的指令;然而,在现实世界情景中,不可能同时收集所有可能的指令数据集;因此,使MLLM能够不断进行教学调整,对于保持其适应性至关重要;然而,现有方法往往以业绩增益来交换记忆效率,从而大大降低总体效率;在本文件中,我们建议,根据在就不同数据集进行的培训中枢 Kernel对齐(CKA) 不同模型层的变异,建立任务扩展和任务一般融合框架;此外,我们分析现有基准中的信息渗漏情况,并提出新的、更具挑战性的基准,以合理评估不同方法的绩效;全面试验显示,我们的方法与现有“最先进”方法相比,业绩有显著改进。代码和数据集公布在https://github.com/Ghy0501/Hide-LLAVA。
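Centered Kernel Alignment, which the HiDe-LLaVA abstract uses to compare layers, can be sketched in its common linear form. This is the standard linear CKA formula in plain Python, not the paper's pipeline; the activation matrices below are made-up toy values (rows are examples, columns are features).

```python
from math import sqrt


def center_columns(m):
    """Subtract each column's mean so that every feature has zero mean."""
    rows, cols = len(m), len(m[0])
    means = [sum(m[i][j] for i in range(rows)) / rows for j in range(cols)]
    return [[m[i][j] - means[j] for j in range(cols)] for i in range(rows)]


def cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b for row-aligned matrices a and b."""
    total = 0.0
    for j in range(len(a[0])):
        for k in range(len(b[0])):
            dot = sum(a[i][j] * b[i][k] for i in range(len(a)))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA between two activation matrices with the same number of rows."""
    x, y = center_columns(x), center_columns(y)
    denom = sqrt(cross_frobenius_sq(x, x) * cross_frobenius_sq(y, y))
    return cross_frobenius_sq(x, y) / denom


acts = [[1.0, 2.0], [3.0, 1.0], [0.0, 4.0], [2.0, 2.0]]
scaled = [[3.0 * v for v in row] for row in acts]
swapped = [[row[1], row[0]] for row in acts]
other = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

cka_same = linear_cka(acts, acts)
cka_scaled = linear_cka(acts, scaled)
cka_swapped = linear_cka(acts, swapped)
cka_other = linear_cka(acts, other)
```

Linear CKA is unchanged by uniform rescaling and by permuting feature columns, which is why the scaled and swapped copies score the same as the original.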

Article 213

Title@2025-05-29 (4): Graph Random Walk with Feature-Label Space Alignment: A Multi-Label Feature Selection Method

Title: Graph Random Walk with Feature-Label Space Alignment: A Multi-Label Feature Selection Method

Graph Random Walk mit Feature-Label-Raumausrichtung: Eine Multi-Label-Feature-Auswahlmethode

带有地貌标签空间对齐的任意漫步图图 : 多标签特征选择方法 2505.23228v1

Authors: Wanfu Gao, Jun Gao, Qingqi Han, Hanlin Pan, Kunpeng Liu

The rapid growth in feature dimension may introduce implicit associations between features and labels in multi-label datasets, making the relationships between features and labels increasingly complex. Moreover, existing methods often adopt low-dimensional linear decomposition to explore the associations between features and labels. However, linear decomposition struggles to capture complex nonlinear associations and may lead to misalignment between the feature space and the label space. To address these two critical challenges, we propose innovative solutions. First, we design a random walk graph that integrates feature-feature, label-label, and feature-label relationships to accurately capture nonlinear and implicit indirect associations, while optimizing the latent representations of associations between features and labels after low-rank decomposition. Second, we align the variable spaces by leveraging low-dimensional representation coefficients, while preserving the manifold structure between the original high-dimensional multi-label data and the low-dimensional representation space. Extensive experiments and ablation studies conducted on seven benchmark datasets and three representative datasets using various evaluation metrics demonstrate the superiority of the proposed method. Code: https://github.com/Heilong623/-GRW-

多标签数据集的特征和标签的迅速增长可能会导致多标签数据集的特征和标签之间的隐含关联,使特征和标签之间的关系日益复杂;此外,现有方法往往采用低维线性分解法来探索特征和标签之间的关联;然而,线性分解挣扎以捕捉复杂的非线性关联,并可能导致特征空间和标签空间之间的错位。为了应对这两个关键挑战,我们提出了创新的解决办法。首先,我们设计一个随机行进图,将特征-特点、标签-标签和特征-标签关系结合起来,以准确捕捉非线性和非线性间接关联,同时在低级别脱压缩后优化特征和标签之间的潜在关联。第二,我们通过利用低度代表系数来调整变量空间,同时保持原高维多标签数据和低维度代表空间之间的多重结构。我们利用各种评价指标对七个基准数据集和三个具有代表性的数据集进行了广泛的实验和对比研究,展示了拟议方法的优越性:https://github.wcom/Hegry3/HI。

Article 214

Title@2025-05-29 (4): am-ELO: A Stable Framework for Arena-based LLM Evaluation

Title: am-ELO: A Stable Framework for Arena-based LLM Evaluation

am-ELO: Ein stabiles Rahmenwerk für Arena-basierte LLM-Evaluierung

AM-ELO:基于竞技场的LLM评价稳定框架 2505.03475v2

Authors: Zirui Liu, Jiatong Li, Yan Zhuang, Qi Liu, Shuanghong Shen, Jie Ouyang, Mingyue Cheng, Shijin Wang

Arena-based evaluation is a fundamental and significant evaluation paradigm for modern AI models, especially large language models (LLMs). The existing framework based on the ELO rating system suffers from an inevitable instability problem due to ranking inconsistency and a lack of attention to the varying abilities of annotators. In this paper, we introduce a novel stable arena framework to address these issues by enhancing the ELO Rating System. Specifically, we replace the iterative update method with a Maximum Likelihood Estimation (MLE) approach, m-ELO, and provide theoretical proof of the consistency and stability of the MLE approach for model ranking. Additionally, we propose am-ELO, which modifies the Elo Rating’s probability function to incorporate annotator abilities, enabling the simultaneous estimation of model scores and annotator reliability. Experiments demonstrate that this method ensures stability, showing that this framework offers a more robust, accurate, and stable evaluation method for LLMs.

以Arena为基础的评价是现代AI模型,特别是大型语言模型的一个基本而重要的评价范例。基于ELO评级制度的现有框架由于排名不一致和对说明者能力的不同缺乏重视而不可避免地面临不稳定问题。在本文件中,我们引入了一个新的稳定的舞台框架,通过加强ELO评级制度解决这些问题。具体地说,我们用最大相似估计法(MLE)取代迭代更新方法,m-ELO, 并提供理论证据,证明模型排名MLE方法的一致性和稳定性。此外,我们提议了修改Elo Raiting概率功能的AM-ELO, 以纳入说明者能力,使模型分数和说明者可靠性能够同时估算。实验表明,这一方法确保了稳定性,证明这一框架为LLMMS提供了更加可靠、准确和稳定的评价方法。
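The contrast between the iterative Elo update and a maximum-likelihood fit can be sketched as follows. This is a generic illustration under the standard Elo win-probability model, not the m-ELO or am-ELO procedure from the paper; the match list, `k`, the learning rate, and the step count are all illustrative assumptions, and annotator abilities are not modeled.

```python
import math


def expected_score(r_a, r_b):
    """Elo probability that the model rated r_a beats the model rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_iterative(matches, models, k=32.0, init=1000.0):
    """Classic one-match-at-a-time Elo update; the result depends on match order."""
    ratings = {m: init for m in models}
    for winner, loser in matches:
        delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
        ratings[winner] += delta
        ratings[loser] -= delta
    return ratings


def elo_mle(matches, models, steps=5000, lr=0.1, init=1000.0):
    """Maximize the log-likelihood of all matches at once by gradient ascent."""
    scale = math.log(10) / 400.0      # converts rating points to logits
    theta = {m: 0.0 for m in models}  # rating offsets in logit units
    for _ in range(steps):
        grad = {m: 0.0 for m in models}
        for winner, loser in matches:
            p_win = 1.0 / (1.0 + math.exp(theta[loser] - theta[winner]))
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        for m in models:
            theta[m] += lr * grad[m]
    return {m: init + theta[m] / scale for m in models}


models = ["A", "B", "C"]
# (winner, loser) pairs; every model wins and loses at least once.
matches = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "B"), ("A", "C"), ("B", "A")]
reversed_matches = list(reversed(matches))

iter_fwd = elo_iterative(matches, models)
iter_rev = elo_iterative(reversed_matches, models)
mle_fwd = elo_mle(matches, models)
mle_rev = elo_mle(reversed_matches, models)
```

On this toy data the iterative ratings change when the same matches are replayed in reverse, while the likelihood-based fit treats the matches as a set and returns the same ratings up to floating-point error.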

Article 215

Title@2025-05-29 (4): Generalizability vs. Counterfactual Explainability Trade-Off

Title: Generalizability vs. Counterfactual Explainability Trade-Off

Generalisierbarkeit vs. gegenfaktische Erklärbarkeit Trade-Off

通用与反事实解释 2505.23225v1

Authors: Fabiano Veglianti, Flavio Giorgi, Fabrizio Silvestri, Gabriele Tolomei

In this work, we investigate the relationship between model generalization and counterfactual explainability in supervised learning. We introduce the notion of $\varepsilon$-valid counterfactual probability ($\varepsilon$-VCP) – the probability of finding perturbations of a data point within its $\varepsilon$-neighborhood that result in a label change. We provide a theoretical analysis of $\varepsilon$-VCP in relation to the geometry of the model’s decision boundary, showing that $\varepsilon$-VCP tends to increase with model overfitting. Our findings establish a rigorous connection between poor generalization and the ease of counterfactual generation, revealing an inherent trade-off between generalization and counterfactual explainability. Empirical results validate our theory, suggesting $\varepsilon$-VCP as a practical proxy for quantitatively characterizing overfitting.

在这项工作中,我们调查了模型的概括化和在监督的学习中反事实解释之间的关系。我们引入了 $\ varepsilon$-valid evactial objective (\ varepsilon$-VCP) 的概念 – – 在其 $\ varepsilon$-neiborbority内找到一个数据点的扰动的概率,从而导致标签的改变。我们在模型决定边界的几何学上对$\ varepsilon$-VCP 提供了理论分析,表明美元-VCP 往往随着模型的安装而增加。我们的调查结果在差的概括化和反事实生成的容易性之间建立了严格的联系,揭示了一般化和反事实解释之间的内在权衡。实证结果证实了我们的理论,认为 $\ varepslon$-VCP 是量化过度配置的实用代言。
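The quantity $\varepsilon$-VCP described above can be approximated by sampling. A minimal Monte Carlo sketch, assuming perturbations are drawn uniformly from an axis-aligned box of half-width `eps` around the point (the abstract does not fix the norm or the sampling distribution); `estimate_vcp` and the toy linear classifier are hypothetical names introduced for illustration.

```python
import random


def estimate_vcp(predict, x, eps, n_samples=2000, seed=0):
    """Fraction of sampled perturbations within eps of x that change the label."""
    rng = random.Random(seed)
    base_label = predict(x)
    flips = 0
    for _ in range(n_samples):
        # Assumption: uniform noise per coordinate, i.e. an L-infinity ball.
        z = [xi + rng.uniform(-eps, eps) for xi in x]
        if predict(z) != base_label:
            flips += 1
    return flips / n_samples


def linear_classifier(point):
    """Toy decision rule: label 1 above the line x0 + x1 = 0, else 0."""
    return int(point[0] + point[1] > 0.0)


far_from_boundary = estimate_vcp(linear_classifier, [2.0, 2.0], eps=0.5)
near_boundary = estimate_vcp(linear_classifier, [0.05, 0.05], eps=0.5)
```

A point far from the decision boundary never flips within the box, while a point next to the boundary flips for a large share of the samples. The abstract's claim is that this share tends to grow as the model overfits and its boundary bends closer to the data.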

Article 216

Title@2025-05-29 (4): JANET: Joint Adaptive predictioN-region Estimation for Time-series

Title: JANET: Joint Adaptive predictioN-region Estimation for Time-series

JANET: Gemeinsame adaptive Vorhersage-Region Schätzung für Zeitreihen

JANET: 时间序列联合适应性预测N-区域估算 2407.06390v2

Authors: Eshant English, Eliot Wong-Toi, Matteo Fontana, Stephan Mandt, Padhraic Smyth, Christoph Lippert

Conformal prediction provides machine learning models with prediction sets that offer theoretical guarantees, but the underlying assumption of exchangeability limits its applicability to time series data. Furthermore, existing approaches struggle to handle multi-step ahead prediction tasks, where uncertainty estimates across multiple future time points are crucial. We propose JANET (Joint Adaptive predictioN-region Estimation for Time-series), a novel framework for constructing conformal prediction regions that are valid for both univariate and multivariate time series. JANET generalises the inductive conformal framework and efficiently produces joint prediction regions with controlled K-familywise error rates, enabling flexible adaptation to specific application needs. Our empirical evaluation demonstrates JANET’s superior performance in multi-step prediction tasks across diverse time series datasets, highlighting its potential for reliable and interpretable uncertainty quantification in sequential data.

非正式预测为机器学习模型提供了提供理论保障的预测数据集,但互换性的基本假设限制了其对时间序列数据的适用性。此外,现有方法在努力处理多步前的预测任务,而今后多个时间点的不确定性估计至关重要。我们提议JANET(联合适应性预测-N-区域对时间序列的估算),这是构建适用于单体和多变时间序列的符合性预测区域的新框架。JANET概括了进化符合性框架,并有效地生成了带有可控K-家庭误差率的联合预测区域,使得能够灵活适应具体应用需要。我们的经验评估表明,JANET在跨不同时间序列数据集的多步预测任务中表现优异,突出了其在连续数据中可靠和可解释的不确定性量化的潜力。
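The inductive conformal framework that JANET generalises can be illustrated with textbook split conformal prediction for a single scalar forecast. This sketch assumes exchangeable calibration residuals, which is the assumption the abstract says breaks down for time series, and it does not reproduce JANET's joint multi-step regions; the residuals and the forecast value are made up.

```python
import math


def conformal_radius(calibration_residuals, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the absolute residuals."""
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]


def prediction_interval(point_forecast, radius):
    """Symmetric interval around a point forecast."""
    return (point_forecast - radius, point_forecast + radius)


# Hypothetical residuals (observed minus predicted) on a held-out calibration set.
residuals = [0.3, -0.1, 0.5, -0.7, 0.2, 0.4, -0.6, 0.1, -0.2]
radius = conformal_radius(residuals, alpha=0.25)
interval = prediction_interval(10.0, radius)
```

With nine calibration points and `alpha=0.25` the rank is `ceil(10 * 0.75) = 8`, so the radius is the eighth-smallest absolute residual.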

Article 217

Title@2025-05-29 (4): A Signed Graph Approach to Understanding and Mitigating Oversmoothing in GNNs

Title: A Signed Graph Approach to Understanding and Mitigating Oversmoothing in GNNs

Ein Ansatz mit vorzeichenbehafteten Graphen zum Verständnis und zur Minderung von Oversmoothing in GNNs

理解并缓解 GNN 过平滑问题的符号图方法 2502.11394v2

Authors: Jiaqi Wang, Xinyi Wu, James Cheng, Yifei Wang

Deep graph neural networks (GNNs) often suffer from oversmoothing, where node representations become overly homogeneous with increasing depth. While techniques like normalization, residual connections, and edge dropout have been proposed to mitigate oversmoothing, they are typically developed independently, with limited theoretical understanding of their underlying mechanisms. In this work, we present a unified theoretical perspective based on the framework of signed graphs, showing that many existing strategies implicitly introduce negative edges that alter message-passing to resist oversmoothing. However, we show that merely adding negative edges in an unstructured manner is insufficient: the asymptotic behavior of signed propagation depends critically on the strength and organization of positive and negative edges. To address this limitation, we leverage the theory of structural balance, which promotes stable, cluster-preserving dynamics by connecting similar nodes with positive edges and dissimilar ones with negative edges. We propose Structural Balanced Propagation (SBP), a plug-and-play method that assigns signed edges based on either labels or feature similarity to explicitly enhance structural balance in the constructed signed graphs. Experiments on nine benchmarks across both homophilic and heterophilic settings demonstrate that SBP consistently improves classification accuracy and mitigates oversmoothing, even at depths of up to 300 layers. Our results provide a principled explanation for prior oversmoothing remedies and introduce a new direction for signed message-passing design in deep GNNs.

深图形内心网络(GNNS)往往受到过度透透的困扰,因为节点表现随着深度的提高而变得过于相似。虽然有人提议采用正常化、剩余连接和边缘辍学等技术来缓解过度透透析,但它们通常是独立开发的,其基本机制的理论理解有限。在这项工作中,我们以签名图表框架为基础提出一个统一的理论观点,表明许多现有战略隐含着改变信息传递以抵制过度透析的负面边缘。然而,我们表明,仅仅以非结构化方式增加负边缘是不够的,即签名传播的无弹性行为严重依赖正面和负边的力度和组织。为了应对这一局限性,我们利用结构平衡理论,通过将类似节点与正边和反面的偏差联系起来,促进稳定、集束保留动态。我们提出了结构平衡(SBP ) , 一种基于标签或特征为明确加强已签名的图表结构平衡而增加的深度偏差( ) 不够充分。

Article 218

Title@2025-05-29 (4): Daunce: Data Attribution through Uncertainty Estimation

Title: Daunce: Data Attribution through Uncertainty Estimation

Daunce: Datenzuweisung durch Unsicherheitsabschätzung

Daunce:通过不确定性估计数据归属 2505.23223v1

Authors: Xingyuan Pan, Chenlu Ye, Joseph Melkonian, Jiaqi W. Ma, Tong Zhang

Training data attribution (TDA) methods aim to identify which training examples influence a model’s predictions on specific test data most. By quantifying these influences, TDA supports critical applications such as data debugging, curation, and valuation. Gradient-based TDA methods rely on gradients and second-order information, limiting their applicability at scale. While recent random projection-based methods improve scalability, they often suffer from degraded attribution accuracy. Motivated by connections between uncertainty and influence functions, we introduce Daunce - a simple yet effective data attribution approach through uncertainty estimation. Our method operates by fine-tuning a collection of perturbed models and computing the covariance of per-example losses across these models as the attribution score. Daunce is scalable to large language models (LLMs) and achieves more accurate attribution compared to existing TDA methods. We validate Daunce on tasks ranging from vision tasks to LLM fine-tuning, and further demonstrate its compatibility with black-box model access. Applied to OpenAI’s GPT models, our method achieves, to our knowledge, the first instance of data attribution on proprietary LLMs.

培训数据归属(TDA)方法旨在确定哪些培训范例影响具体测试数据模型的预测。通过量化这些影响,TDA支持关键应用,如数据调试、整理和估值。基于梯度的TDA方法依赖梯度和二级信息,限制其规模的适用性。虽然最近的随机预测方法提高了可缩放性,但它们往往会受到可缩放性差的准确性。受不确定性和影响功能之间联系的驱动,我们引入了Daunce(一种简单而有效的数据归属方法,通过不确定性估计。我们的方法是对这些模型的渗透模型进行微调,并计算这些模型中每项损失的共差值,作为属性分。Daunce对大语言模型(LLLMS)具有可缩放性,并比现有的TDA方法实现更准确的归属。我们验证Daunce(Daunce)的任务从愿景任务到LM微调,并进一步证明它与黑箱模型访问的兼容性。我们的方法适用于OpenAI的GPTM模型, 我们的方法达到我们的知识,即拥有LM的数据归属的第一个例子。
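The scoring rule named in the Daunce abstract, the covariance of per-example losses across perturbed models, can be sketched directly. How the perturbed models are obtained is not shown here, and the loss values below are invented; `attribution_scores` is a hypothetical helper name.

```python
def covariance(a, b):
    """Sample covariance of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def attribution_scores(train_losses, test_losses):
    """Score each training example against one test example.

    train_losses[k][i]: loss of perturbed model k on training example i.
    test_losses[k]:     loss of perturbed model k on the test example.
    """
    n_train = len(train_losses[0])
    return [
        covariance([row[i] for row in train_losses], test_losses)
        for i in range(n_train)
    ]


# Four perturbed models, two training examples, one test example (made-up losses).
train_losses = [[0.9, 0.2], [0.5, 0.3], [0.1, 0.2], [0.7, 0.3]]
test_losses = [1.0, 0.6, 0.2, 0.8]
scores = attribution_scores(train_losses, test_losses)
```

The first training example's loss moves together with the test loss across models, so it receives the higher score. The computation only needs loss values, which is consistent with the abstract's note about black-box model access.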

Article 219

Title@2025-05-29 (4): Trajectory Generator Matching for Time Series

Title: Trajectory Generator Matching for Time Series

Trajectory Generator Matching für Zeitreihen

时间序列匹配轨迹生成器 2505.23215v1

Authors: T. Jahn, J. Chemseddine, P. Hagemann, C. Wald, G. Steidl

Accurately modeling time-continuous stochastic processes from irregular observations remains a significant challenge. In this paper, we leverage ideas from generative modeling of image data to push the boundary of time series generation. For this, we find new generators of SDEs and jump processes, inspired by trajectory flow matching, that have the marginal distributions of the time series of interest. Specifically, we can handle discontinuities of the underlying processes by parameterizing the jump kernel densities by scaled Gaussians that allow for closed form formulas of the corresponding Kullback-Leibler divergence in the loss. Unlike most other approaches, we are able to handle irregularly sampled time series.

从非正常观测中精确地模拟持续时间的随机过程仍是一项重大挑战。在本文中,我们利用图像数据基因模型模型的创意来推动时间序列生成的界限。为此,我们发现由轨迹流量匹配所启发的SDEs和跳跃过程的新生成器,这些生成器具有时间序列的边际分布。具体地说,我们可以通过将跳动内核密度作为参数来应对基本过程的不连续性, 由比例尺的高斯人来参数, 从而允许相应的 Kullback- Leibler 差异的封闭形式公式。与大多数其他方法不同, 我们能够处理不规则的抽样时间序列。

Article 220

Title@2025-05-29 (4): Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Title: Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Engere Datenschutzprüfung von DP-SGD im Hidden State Threat Model

隐藏状态威胁模型下对 DP-SGD 的更严格隐私审计 2405.14457v3

Authors: Tudor Cebere, Aurélien Bellet, Nicolas Papernot

Machine learning models can be trained with formal privacy guarantees via differentially private optimizers such as DP-SGD. In this work, we focus on a threat model where the adversary has access only to the final model, with no visibility into intermediate updates. In the literature, this hidden state threat model exhibits a significant gap between the lower bound from empirical privacy auditing and the theoretical upper bound provided by privacy accounting. To challenge this gap, we propose to audit this threat model with adversaries that craft a gradient sequence designed to maximize the privacy loss of the final model without relying on intermediate updates. Our experiments show that this approach consistently outperforms previous attempts at auditing the hidden state model. Furthermore, our results advance the understanding of achievable privacy guarantees within this threat model. Specifically, when the crafted gradient is inserted at every optimization step, we show that concealing the intermediate model updates in DP-SGD does not enhance the privacy guarantees. The situation is more complex when the crafted gradient is not inserted at every step: our auditing lower bound matches the privacy upper bound only for an adversarially-chosen loss landscape and a sufficiently large batch size. This suggests that existing privacy upper bounds can be improved in certain regimes.

通过DP-SGD等不同的私人优化设备,可以对机器学习模式进行正式的隐私保障培训。在这项工作中,我们侧重于一个威胁模式,对手只能使用最后模式,中间更新没有可见度。在文献中,这种隐蔽的国家威胁模式在经验隐私审计的较低约束与隐私会计提供的理论上限之间存在巨大差距。为了挑战这一差距,我们提议与对手一起审计这一威胁模式,这些对手设计了一个梯度序列,目的是在不依赖中间更新的情况下最大限度地减少最后模式的隐私损失。我们的实验表明,这一方法始终优于先前审计隐蔽状态模式的尝试。此外,我们的成果促进了对这一威胁模式中可实现的隐私保障的理解。具体地说,在每次优化步骤插入精心设计的梯度时,我们表明,隐藏DP-SGD的中间模型更新并不增强隐私保障。如果不是每一步都插入精心设计的梯度,那么情况就更加复杂:我们的审计就更低约束了隐私上限,只有对敌对式混合损失场景和足够大批量尺寸。这表明,现有的隐私上限在某些制度中是可以改进的。
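
A minimal sketch of the kind of DP-SGD update this abstract audits, assuming the common formulation (per-example clipping, Gaussian noise added to the summed gradient). The function names, the `crafted_grad` argument, and the normalization by the nominal batch size are illustrative assumptions, not the authors' code.

```python
import numpy as np


def clip_rows(grads, clip_norm):
    """Rescale each row so that its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    return grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))


def dp_sgd_step(params, per_example_grads, clip_norm, noise_multiplier, lr, rng,
                crafted_grad=None):
    """One DP-SGD update; optionally adds a crafted gradient to the batch."""
    batch_size = len(per_example_grads)  # nominal batch size, canary excluded
    grads = clip_rows(per_example_grads, clip_norm)
    if crafted_grad is not None:
        # Hypothetical canary: clipped exactly like a real example's gradient.
        grads = np.vstack([grads, clip_rows(crafted_grad[None, :], clip_norm)])
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    return params - lr * (grads.sum(axis=0) + noise) / batch_size
```

In the hidden state setting, an auditor would only compare the final parameters from runs with and without the crafted gradient, never the intermediate iterates.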

Article 221

Title@2025-05-29 (4): Improving Parallel Program Performance with LLM Optimizers via Agent-System Interfaces

Title: Improving Parallel Program Performance with LLM Optimizers via Agent-System Interfaces

Verbesserung der parallelen Programmleistung mit LLM-Optimierern über Agent-System-Schnittstellen

通过代理-系统接口改进与LLM优化器的平行方案绩效 2410.15625v3

Authors: Anjiang Wei, Allen Nie, Thiago S. F. X. Teixeira, Rohan Yadav, Wonchan Lee, Ke Wang, Alex Aiken

Modern scientific discovery increasingly relies on high-performance computing for complex modeling and simulation. A key challenge in improving parallel program performance is efficiently mapping tasks to processors and data to memory, a process dictated by intricate, low-level system code known as mappers. Developing high-performance mappers demands days of manual tuning, posing a significant barrier for domain scientists without systems expertise. We introduce a framework that automates mapper development with generative optimization, leveraging richer feedback beyond scalar performance metrics. Our approach features the Agent-System Interface, which includes a Domain-Specific Language (DSL) to abstract away the low-level complexity of system code and define a structured search space, as well as AutoGuide, a mechanism that interprets raw execution output into actionable feedback. Unlike traditional reinforcement learning methods such as OpenTuner, which rely solely on scalar feedback, our method finds superior mappers in far fewer iterations. With just 10 iterations, it outperforms OpenTuner even after 1000 iterations, achieving 3.8X faster performance. Our approach finds mappers that outperform expert-written mappers by up to 1.34X across nine benchmarks while reducing tuning time from days to minutes.

现代科学发现日益依赖高性能计算来进行复杂的建模和模拟。提升并行程序性能的一个关键挑战是高效地将任务映射到处理器、将数据映射到内存,这一过程由复杂的底层系统代码(即映射器,mapper)决定。开发高性能映射器需要数天的人工调优,这对缺乏系统专业知识的领域科学家构成了重大障碍。我们提出了一个利用生成式优化自动开发映射器的框架,它利用比标量性能指标更丰富的反馈。我们的方法以代理-系统接口(Agent-System Interface)为核心,其中包括一种领域特定语言(DSL),用于屏蔽系统代码的底层复杂性并定义结构化的搜索空间,以及AutoGuide,一种将原始执行输出解释为可操作反馈的机制。与OpenTuner等仅依赖标量反馈的传统强化学习方法不同,我们的方法能以少得多的迭代次数找到更优的映射器。仅用10次迭代,它就优于运行1000次迭代后的OpenTuner,性能快3.8倍。我们的方法找到的映射器在九个基准测试中相对专家编写的映射器最高可获得1.34倍的加速,同时将调优时间从数天缩短到数分钟。

Article 222

Title@2025-05-29 (4): On the performance of machine-learning-assisted Monte Carlo in sampling from simple statistical physics models

Title: On the performance of machine-learning-assisted Monte Carlo in sampling from simple statistical physics models

Über die Leistung von Monte Carlo mit maschinellem Lernen bei der Probenahme von einfachen Modellen der statistischen Physik

关于机械学习辅助蒙特卡洛利用简单统计物理模型取样的 2505.22598v2

Authors: Luca Maria Del Bono, Federico Ricci-Tersenghi, Francesco Zamponi

Recent years have seen a rise in the application of machine learning techniques to aid the simulation of hard-to-sample systems that cannot be studied using traditional methods. Despite the introduction of many different architectures and procedures, a wide theoretical understanding is still lacking, with the risk of suboptimal implementations. As a first step to address this gap, we provide here a complete analytic study of the widely-used Sequential Tempering procedure applied to a shallow MADE architecture for the Curie-Weiss model. The contribution of this work is twofold: firstly, we give a description of the optimal weights and of the training under Gradient Descent optimization. Secondly, we compare what happens in Sequential Tempering with and without the addition of local Metropolis Monte Carlo steps. We are thus able to give theoretical predictions on the best procedure to apply in this case. This work establishes a clear theoretical basis for the integration of machine learning techniques into Monte Carlo sampling and optimization.

近年来,在应用机器学习技术协助模拟无法使用传统方法研究的难以取样的系统方面,机器学习技术的运用有所增加。尽管采用了许多不同的结构和程序,但仍然缺乏广泛的理论理解,存在执行不理想的风险。作为缩小这一差距的第一步,我们在此对适用于Curie-Weiss模型浅层陶瓷结构的广泛使用的序列诱惑程序进行了全面的分析研究。这项工作的贡献有两个方面:第一,我们描述了最佳重量和在梯层源优化下进行的培训。第二,我们比较了在序列式诱惑中发生的情况,而没有加上Metopolis Monte Carlo的当地步骤。因此,我们能够对本案应用的最佳程序作出理论预测。这项工作为将机器学习技术纳入Monte Carlo采样和优化提供了一个明确的理论基础。
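
For orientation, the local Metropolis Monte Carlo step mentioned in the abstract can be sketched for the Curie-Weiss model as follows. The sketch assumes the standard energy H = -(J/2N)(Σ s_i)² with spins s_i = ±1. The function names and the `coupling` parameter are illustrative, and the Sequential Tempering and MADE components are not shown.

```python
import numpy as np


def energy(spins, coupling=1.0):
    """Curie-Weiss energy H = -(J / 2N) * (sum_i s_i)^2."""
    return -coupling / (2.0 * spins.size) * float(spins.sum()) ** 2


def flip_energy_change(spins, i, magnetization, coupling=1.0):
    """Energy change from flipping spin i, given the current total magnetization."""
    return 2.0 * coupling / spins.size * (spins[i] * magnetization - 1.0)


def metropolis_sweep(spins, beta, rng, coupling=1.0):
    """N single-spin-flip Metropolis proposals at inverse temperature beta (in place)."""
    magnetization = int(spins.sum())
    for _ in range(spins.size):
        i = rng.integers(spins.size)
        delta = flip_energy_change(spins, i, magnetization, coupling)
        if delta <= 0.0 or rng.random() < np.exp(-beta * delta):
            magnetization -= 2 * int(spins[i])
            spins[i] = -spins[i]
    return spins
```

Tracking the magnetization incrementally keeps each proposal O(1), which is possible because the Curie-Weiss energy depends on the configuration only through the total magnetization.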

Article 223

Title@2025-05-29 (4): Towards Robust Overlapping Speech Detection: A Speaker-Aware Progressive Approach Using WavLM

Title: Towards Robust Overlapping Speech Detection: A Speaker-Aware Progressive Approach Using WavLM

Auf dem Weg zu einer robusten, überlappenden Spracherkennung: Ein Lautsprecher-Bewusst-Progressiver Ansatz mit WavLM

争取强劲的超重叠语音探测:使用WavLM 的演讲者-警示渐进方法 2505.23207v1

Authors: Zhaokai Sun, Li Zhang, Qing Wang, Pan Zhou, Lei Xie

Overlapping Speech Detection (OSD) aims to identify regions where multiple speakers overlap in a conversation, a critical challenge in multi-party speech processing. This work proposes a speaker-aware progressive OSD model that leverages a progressive training strategy to enhance the correlation between subtasks such as voice activity detection (VAD) and overlap detection. To improve acoustic representation, we explore the effectiveness of state-of-the-art self-supervised learning (SSL) models, including WavLM and wav2vec 2.0, while incorporating a speaker attention module to enrich features with frame-level speaker information. Experimental results show that the proposed method achieves state-of-the-art performance, with an F1 score of 82.76% on the AMI test set, demonstrating its robustness and effectiveness in OSD.

迭代语音探测(OSD)旨在确定多个发言者在对话中重叠的区域,这是多党语言处理中的一项关键挑战,这项工作提议采用一个语音意识渐进式OSD模式,利用渐进式培训战略,加强语音活动探测(VAD)和重叠探测等子任务之间的相互关系。为了改善声学表现,我们探索最先进的自我监督学习(SSL)模式的有效性,包括WavLM 和 wav2vec 2.0,同时纳入一个语音关注模块,以丰富框架级演讲者信息的特征。实验结果显示,拟议方法取得了最新业绩,在AMI测试集上获得了82.76%的F1分,表明其在OSD中非常健全和有效。

Article 224

Title@2025-05-29 (4): Disentangled Multi-span Evolutionary Network against Temporal Knowledge Graph Reasoning

Title: Disentangled Multi-span Evolutionary Network against Temporal Knowledge Graph Reasoning

Disentangled Multi-Span Evolutionary Network gegen Temporal Knowledge Graph Reasoning

对抗时间知识图表推理的多空间演进网络 2505.14020v2

Authors: Hao Dong, Ziyue Qiao, Zhiyuan Ning, Qi Hao, Yi Du, Pengyang Wang, Yuanchun Zhou

Temporal Knowledge Graphs (TKGs), as an extension of static Knowledge Graphs (KGs), incorporate the temporal feature to express the transience of knowledge by describing when facts occur. TKG extrapolation aims to infer possible future facts based on known history, which has garnered significant attention in recent years. Some existing methods treat TKG as a sequence of independent subgraphs to model temporal evolution patterns, demonstrating impressive reasoning performance. However, they still have limitations: 1) In modeling subgraph semantic evolution, they usually neglect the internal structural interactions between subgraphs, which are actually crucial for encoding TKGs. 2) They overlook the potential smooth features that do not lead to semantic changes, which should be distinguished from the semantic evolution process. Therefore, we propose a novel Disentangled Multi-span Evolutionary Network (DiMNet) for TKG reasoning. Specifically, we design a multi-span evolution strategy that captures local neighbor features while perceiving historical neighbor semantic information, thus enabling internal interactions between subgraphs during the evolution process. To maximize the capture of semantic change patterns, we design a disentangle component that adaptively separates nodes’ active and stable features, which are used to dynamically control the influence of historical semantics on future evolution. Extensive experiments conducted on four real-world TKG datasets show that DiMNet achieves strong performance in TKG reasoning and outperforms the state of the art by up to 22.7% in MRR.

时间知识图谱(TKG)作为静态知识图谱(KG)的扩展,引入时间特征,通过描述事实发生的时间来表达知识的时效性。TKG外推旨在根据已知历史推断未来可能发生的事实,近年来受到广泛关注。一些现有方法将TKG视为一系列独立的子图来建模时间演化模式,并展现出出色的推理性能。然而,它们仍有局限:1)在建模子图语义演化时,通常忽略子图之间的内部结构交互,而这种交互对TKG的编码至关重要。2)它们忽视了不会引起语义变化的潜在平稳特征,而这些特征应当与语义演化过程区分开来。因此,我们提出一种用于TKG推理的新型解耦多跨度演化网络(DiMNet)。具体而言,我们设计了一种多跨度演化策略,在感知历史邻居语义信息的同时捕获局部邻居特征,从而在演化过程中实现子图之间的内部交互。为了最大限度地捕获语义变化模式,我们设计了一个解耦组件,自适应地分离节点的活跃特征与稳定特征,用于动态控制历史语义对未来演化的影响。在四个真实世界TKG数据集上的大量实验表明,DiMNet在TKG推理中表现出色,MRR最高比现有最优方法提升22.7%。

Article 225

Title@2025-05-29 (4): Aligning Text to Image in Diffusion Models is Easier Than You Think

Title: Aligning Text to Image in Diffusion Models is Easier Than You Think

Text an Bild in Diffusions-Modellen ausrichten ist einfacher, als Sie denken

在传播模型中将文本对齐到图像比您想象的容易 2503.08250v4

Authors: Jaa-Yeon Lee, Byunghee Cha, Jeongsol Kim, Jong Chul Ye

While recent advancements in generative modeling have significantly improved text-image alignment, some residual misalignment between text and image representations remains. Some approaches address this issue by fine-tuning models with preference optimization and similar techniques, which require tailored datasets. Orthogonal to these methods, we revisit the challenge from the perspective of representation alignment, an approach that has gained popularity with the success of REPresentation Alignment (REPA). We first argue that conventional text-to-image (T2I) diffusion models, typically trained on paired image and text data (i.e., positive pairs) by minimizing score matching or flow matching losses, are suboptimal from the standpoint of representation alignment. Instead, a better alignment can be achieved through contrastive learning that leverages the existing dataset as both positive and negative pairs. To enable efficient alignment with pretrained models, we propose SoftREPA, a lightweight contrastive fine-tuning strategy that leverages soft text tokens for representation alignment. This approach improves alignment with minimal computational overhead by adding fewer than 1M trainable parameters to the pretrained model. Our theoretical analysis demonstrates that our method explicitly increases the mutual information between text and image representations, leading to enhanced semantic consistency. Experimental results across text-to-image generation and text-guided image editing tasks validate the effectiveness of our approach in improving the semantic consistency of T2I generative models.

虽然最近在变形模型方面的进步大大改善了文字形象的调整,但文本和图像表示形式之间还存在一些残留的不匹配现象。有些方法通过在偏好优化方面微调模型来解决这个问题,这需要量制数据集。对方法的调整,我们从代表调整方法的角度重新审视挑战,这种方法随着降级调整的成功而获得普遍欢迎。我们首先认为,传统文本对比图像(T2I)传播模式,通常通过尽量减少得分匹配或流动匹配损失来培训成对图像和文本数据(即正对对对),从代表调整的角度来说,这种方法并不理想。相反,通过对比性学习,将现有数据集作为正对和负对等工具,可以实现更好的调整。为了能够有效地与预先接受的模型保持一致,我们提议SoftREPA-一种较轻量的对比微调整战略,利用软文本标记来调整代表比例。这一方法通过将低于1M的可培训参数添加到预升级的图象化模型,从而改进了我们的正统性。我们之间的理论分析方法明确地显示了我们改进了在生成图像格式上的一致性。
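
To make the idea of using one dataset as both positive and negative pairs concrete, here is a generic in-batch InfoNCE loss over image and text embeddings. This is a stand-in for illustration only: it is not the SoftREPA objective, which the abstract describes as built on soft text tokens, and the function name and `temperature` default are assumptions.

```python
import numpy as np


def info_nce(image_emb, text_emb, temperature=0.07):
    """In-batch contrastive loss: row i of each matrix is a positive pair,
    every other row in the batch acts as a negative."""
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature               # cosine similarities, scaled
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))       # cross-entropy on the diagonal
```

The loss is small when each image embedding is closest to its own caption and grows when the pairing is shuffled, which is the behaviour a contrastive alignment objective is meant to reward.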

Article 226

Title@2025-05-29 (4): JAPAN: Joint Adaptive Prediction Areas with Normalising-Flows

Title: JAPAN: Joint Adaptive Prediction Areas with Normalising-Flows

JAPAN: Gemeinsame adaptive Vorhersagebereiche mit Normalisierungs-Flows

JAPAN: 联合适应性预测区与标准化花束 2505.23196v1

Authors: Eshant English, Christoph Lippert

Conformal prediction provides a model-agnostic framework for uncertainty quantification with finite-sample validity guarantees, making it an attractive tool for constructing reliable prediction sets. However, existing approaches commonly rely on residual-based conformity scores, which impose geometric constraints and struggle when the underlying distribution is multimodal. In particular, they tend to produce overly conservative prediction areas centred around the mean, often failing to capture the true shape of complex predictive distributions. In this work, we introduce JAPAN (Joint Adaptive Prediction Areas with Normalising-Flows), a conformal prediction framework that uses density-based conformity scores. By leveraging flow-based models, JAPAN estimates the (predictive) density and constructs prediction areas by thresholding on the estimated density scores, enabling compact, potentially disjoint, and context-adaptive regions that retain finite-sample coverage guarantees. We theoretically motivate the efficiency of JAPAN and empirically validate it across multivariate regression and forecasting tasks, demonstrating good calibration and tighter prediction areas compared to existing baselines. We also provide several extensions adding flexibility to our proposed framework.

综合预测为不确定性的量化提供了一个具有有限抽样有效性保证的模型 – – 不可知性框架,使它成为构建可靠预测数据集的有吸引力的工具;然而,现有方法通常依赖基于剩余值的符合性评分,在基本分布为多式联运时,这些评分会施加几何限制和困难;特别是,它们往往产生以平均值为中心的过于保守的预测区,往往不能捕捉复杂预测分布的真实形状;在这项工作中,我们引入了日本航空航天研究所(联合适应性预测区,具有标准化-花样),一个使用基于密度的符合性评分的符合性预测框架。日本航空航天研究所通过利用基于流量的模型,估算(预测性)密度和构建预测区,方法是对估计密度评分进行阈值,使保持有限抽样覆盖的紧凑、可能不连贯和背景适应性区域得以维持。我们从理论上鼓励日本航空航天研究所的效率,并在多重回归和预测任务中进行实证,表明良好的校准和较现有基线更为严格的预测区。我们还提供数个基于流基模型的预测区,以增加我们提议的框架的灵活性。
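
The thresholding step described above can be sketched with standard split conformal calibration, using the negative estimated density as the nonconformity score. The density values passed in stand in for a normalizing flow's estimates. The function names are illustrative, and the sketch does not reproduce the paper's extensions.

```python
import numpy as np


def density_threshold(calib_densities, alpha):
    """Split-conformal density threshold targeting coverage of at least 1 - alpha."""
    scores = np.sort(-np.asarray(calib_densities, dtype=float))  # low density = high score
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the conformal quantile
    if k > n:
        return -np.inf  # too few calibration points: keep every candidate
    return -scores[k - 1]


def prediction_region(candidate_densities, threshold):
    """Boolean mask of the candidates whose estimated density clears the threshold."""
    return np.asarray(candidate_densities) >= threshold
```

Because membership depends only on the density value, the resulting region can consist of several disconnected high-density pieces, which a residual-based interval around the mean cannot represent.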

Article 227

Title@2025-05-29 (4): Less is More: Unlocking Specialization of Time Series Foundation Models via Structured Pruning

Title: Less is More: Unlocking Specialization of Time Series Foundation Models via Structured Pruning

Weniger ist mehr: Unlocking Spezialisierung von Time Series Foundation Models über strukturiertes Pruning

较少是更多:通过结构式普鲁宁解锁时间序列基础模型的专业化 2505.23195v1

Authors: Lifan Zhao, Yanyan Shen, Zhaoyang Liu, Xue Wang, Jiaji Deng

Scaling laws motivate the development of Time Series Foundation Models (TSFMs) that pre-train vast parameters and achieve remarkable zero-shot forecasting performance. Surprisingly, even after fine-tuning, TSFMs cannot consistently outperform smaller, specialized models trained on full-shot downstream data. A key question is how to realize effective adaptation of TSFMs for a target forecasting task. Empirical studies on various TSFMs show that the pre-trained models often exhibit inherent sparsity and redundancy in computation, suggesting that TSFMs have learned to activate task-relevant network substructures to accommodate diverse forecasting tasks. To preserve this valuable prior knowledge, we propose a structured pruning method to regularize the subsequent fine-tuning process by focusing it on a more relevant and compact parameter space. Extensive experiments on seven TSFMs and six benchmarks demonstrate that fine-tuning a smaller, pruned TSFM significantly improves forecasting performance compared to fine-tuning original models. This “prune-then-finetune” paradigm often enables TSFMs to achieve state-of-the-art performance and surpass strong specialized baselines.

令人惊讶的是,即使是在微调后,高科技模型也不能始终优于全速下游数据专门模型。一个关键问题是如何使高科技模型有效地适应目标预测任务。通过对各种高科技模型的实证研究,预先培训的模型往往在计算中表现出固有的空间和冗余,表明高科技模型已经学会了启动与任务相关的网络子结构以适应不同的预测任务。为了保存这一宝贵的先前知识,我们建议了一种结构化的调整方法,以规范随后的微调过程,将重点置于一个更相关和紧凑的参数空间上。对7个高科技模型和6个基准的广泛实验表明,微调一个较小的、精细的TSFM模型与微调原型模型相比,大大改进了预测业绩。这种“春-正时-菲涅纳”模型常常使高科技模型能够实现最先进的业绩和超强的专业化基线。

Article 228

Title@2025-05-29 (4): Multimodal Inverse Attention Network with Intrinsic Discriminant Feature Exploitation for Fake News Detection

Title: Multimodal Inverse Attention Network with Intrinsic Discriminant Feature Exploitation for Fake News Detection

Multimodale Inverse Aufmerksamkeit Netzwerk mit Intrinsic Discriminant Feature Exploitation für gefälschte Nachrichten Erkennung

多式反向关注网络,利用内在差异性地貌特征利用假新闻探测 2502.01699v2

Authors: Tianlin Zhang, En Yu, Yi Shao, Jiande Sun

Multimodal fake news detection has garnered significant attention due to its profound implications for social security. While existing approaches have contributed to understanding cross-modal consistency, they often fail to leverage modal-specific representations and explicit discrepant features. To address these limitations, we propose a Multimodal Inverse Attention Network (MIAN), a novel framework that explores intrinsic discriminative features based on news content to advance fake news detection. Specifically, MIAN introduces a hierarchical learning module that captures diverse intra-modal relationships through local-to-global and local-to-local interactions, thereby generating enhanced unimodal representations to improve the identification of fake news at the intra-modal level. Additionally, a cross-modal interaction module employs a co-attention mechanism to establish and model dependencies between the refined unimodal representations, facilitating seamless semantic integration across modalities. To explicitly extract inconsistency features, we propose an inverse attention mechanism that effectively highlights the conflicting patterns and semantic deviations introduced by fake news in both intra- and inter-modality. Extensive experiments on benchmark datasets demonstrate that MIAN significantly outperforms state-of-the-art methods, underscoring its pivotal contribution to advancing social security through enhanced multimodal fake news detection.

由于对社会保障的深刻影响,多式假新闻探测已经引起人们的极大关注。虽然现有办法有助于理解跨式一致性,但往往未能利用模式特定的表现方式和明显的差异性。为了解决这些限制,我们提议建立一个多式反向关注网络(MIAN),这是一个新颖的框架,根据新闻内容探索内在的歧视性特征,以推动假新闻探测。具体地说,MIAN引入了一个等级学习模块,通过地方对全球和地方对地方的互动,通过地方对地方对地方对地方对地方对地方对地方对地方的互动,从而产生强化的单一形式表现方式,改进对假新闻的识别。此外,跨式互动模块采用共同注意机制,在完善的单式表达方式之间建立和模式依赖性,促进各模式之间无缝的相互融合。为了明确提取不一致特征,我们提议一个反向关注机制,有效地突出假消息在内部和现代新闻中出现的相互冲突的模式和语义偏差。关于基准数据集的广泛实验表明,MIAN大大超越了其通过改进的多式联运方式对改进其关键信息探测方式的贡献。

Article 229

Title@2025-05-29 (4): Beyond Zero Initialization: Investigating the Impact of Non-Zero Initialization on LoRA Fine-Tuning Dynamics

Title: Beyond Zero Initialization: Investigating the Impact of Non-Zero Initialization on LoRA Fine-Tuning Dynamics

Beyond Zero Initialization: Untersuchung der Auswirkungen von Non-Zero Initialization auf LoRA Fine-Tuning Dynamics

零启动后零启动后:调查非零初始化对LORA微调动力学的影响 2505.23194v1

Authors: Shiwei Li, Xiandi Luo, Xing Tang, Haozhao Wang, Hao Chen, Weihong Luo, Yuhua Li, Xiuqiang He, Ruixuan Li

Low-rank adaptation (LoRA) is a widely used parameter-efficient fine-tuning method. In standard LoRA layers, one of the matrices, $A$ or $B$, is initialized to zero, ensuring that fine-tuning starts from the pretrained model. However, there is no theoretical support for this practice. In this paper, we investigate the impact of non-zero initialization on LoRA’s fine-tuning dynamics from an infinite-width perspective. Our analysis reveals that, compared to zero initialization, simultaneously initializing $A$ and $B$ to non-zero values improves LoRA’s robustness to suboptimal learning rates, particularly smaller ones. Further analysis indicates that although the non-zero initialization of $AB$ introduces random noise into the pretrained weight, it generally does not affect fine-tuning performance. In other words, fine-tuning does not need to strictly start from the pretrained model. The validity of our findings is confirmed through extensive experiments across various models and datasets. The code is available at https://github.com/Leopold1423/non_zero_lora-icml25.

低级别适应(LORA)是一种广泛使用的参数效率微调方法。在标准的LORA层中,一个矩阵,即$A美元或$B$,被初始化为零,确保微调从预培训模式开始。然而,这种做法没有理论上的支持。在本文中,我们从无限宽的角度调查非零初始化对LORA微调动态的影响。我们的分析表明,与零初始化相比,同时初始化美元和美元B$至非零值提高了LORA对亚最佳学习率的稳健性,特别是较小的。进一步的分析表明,尽管美元的非零初始化将随机噪音引入预培训的重量,但通常不会影响微调性能。换句话说,微调不需要严格地从预培训模式开始。我们的调查结果的有效性通过各种模型和数据集的广泛实验得到确认。代码见https://github.com/Leopoldnon_ze_zerola-rica-icm25。
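
The difference between the two initialization schemes can be seen in a small NumPy sketch of a LoRA layer with update W0 + scaling · BA. The shapes, the `std` value, and the function names are illustrative assumptions. The initialization scales analysed in the paper are not reproduced here.

```python
import numpy as np


def init_lora(d_out, d_in, rank, rng, zero_init=True, std=0.01):
    """Return (A, B); with zero_init, B = 0 so the product B @ A starts at zero."""
    A = rng.normal(0.0, std, size=(rank, d_in))
    if zero_init:
        B = np.zeros((d_out, rank))
    else:
        B = rng.normal(0.0, std, size=(d_out, rank))  # both factors non-zero
    return A, B


def lora_forward(x, W0, A, B, scaling=1.0):
    """Frozen pretrained weight W0 plus the low-rank correction scaling * B @ A."""
    return x @ W0.T + scaling * (x @ A.T) @ B.T
```

With `zero_init=True` the layer reproduces the pretrained output exactly at step zero. With `zero_init=False` it starts from the pretrained output plus a small random perturbation, which is the setting the abstract reports as generally harmless to fine-tuning performance.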

Article 230

Title@2025-05-29 (4): DeepRTE: Pre-trained Attention-based Neural Network for Radiative Transfer

Title: DeepRTE: Pre-trained Attention-based Neural Network for Radiative Transfer

DeepRTE: Pre-trained Aufmerksamkeit-basiertes Neural-Netzwerk für Radiative Transfer

DeepRTE: 用于辐射传输的预训练注意力神经网络 2505.23190v1

Authors: Yekun Zhu, Min Tang, Zheng Ma

In this study, we propose a novel neural network approach, termed DeepRTE, to address the steady-state Radiative Transfer Equation (RTE). The RTE is a differential-integral equation that governs the propagation of radiation through a participating medium, with applications spanning diverse domains such as neutron transport, atmospheric radiative transfer, heat transfer, and optical imaging. Our proposed DeepRTE framework leverages pre-trained attention-based neural networks to solve the RTE with high accuracy and computational efficiency. The efficacy of the proposed approach is substantiated through comprehensive numerical experiments.

在这次研究中,我们提出了一个新的神经网络方法,称为“深线网络”,以解决稳定状态的辐射转移赤道(RTE)问题。 RETE是一个差异整体方程式,它通过一个参与媒介管理辐射的传播,其应用范围涵盖多个领域,如中子传输、大气辐射转移、热传输和光学成像。我们提议的DeepRTE框架利用预先训练的以注意力为基础的神经网络,以高精确度和计算效率解决RETE。拟议的方法的效力通过全面的数值实验得到证实。

Article 231

Title@2025-05-29 (4): Plug In and Learn: Federated Intelligence over a Smart Grid of Models

Title: Plug In and Learn: Federated Intelligence over a Smart Grid of Models

Plug In and Learn: Federated Intelligence über ein Smart Grid aus Modellen

插插插和学习:对智能模型网的联邦情报 2302.04363v4

Authors: S. Abdurakhmanova, Y. SarcheshmehPour, A. Jung

We present a model-agnostic federated learning method that mirrors the operation of a smart power grid: diverse local models, like energy prosumers, train independently on their own data while exchanging lightweight signals to coordinate with statistically similar peers. This coordination is governed by a graph-based regularizer that encourages connected models to produce similar predictions on a shared, public unlabeled dataset. The resulting method is a flexible instance of regularized empirical risk minimization and supports a wide variety of local models - both parametric and non-parametric - provided they can be trained via regularized loss minimization. Such training is readily supported by standard ML libraries including scikit-learn, Keras, and PyTorch.

我们提出了一个示范的、不可知的联邦学习方法,它反映了智能电网的运作:各种当地模型,如能源制造者,独立地以自己的数据进行训练,同时交换轻量的信号,以便与统计上类似的同行进行协调。这种协调由一个基于图表的正规化器管理,它鼓励连接模型对共用的、公开的、没有标签的数据集作出类似的预测。由此产生的方法是一个将经验风险降到最低的正规化的灵活实例,它支持各种各样的当地模型,包括参数和非参数模型,只要它们能够通过将损失降到最低的方式加以培训。这种培训很容易得到标准ML图书馆的支持,包括Scikit-learn、Keras和PyTorch。
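
One way to write down the graph-regularized objective described above is as the sum of local losses plus a penalty on prediction disagreement across connected models on the public dataset. The squared-difference penalty, the edge-weight dictionary, and the names below are assumptions made for illustration. The abstract does not specify the exact discrepancy measure.

```python
import numpy as np


def graph_regularized_objective(local_losses, public_preds, edge_weights, lam):
    """Sum of local losses plus lam * weighted prediction disagreement on shared data.

    local_losses: shape (n_models,), each model's loss on its own private data
    public_preds: shape (n_models, n_public), predictions on the public unlabeled set
    edge_weights: dict mapping an edge (i, j) to a non-negative similarity weight
    """
    public_preds = np.asarray(public_preds, dtype=float)
    penalty = 0.0
    for (i, j), weight in edge_weights.items():
        # Mean squared disagreement between the two connected models.
        penalty += weight * np.mean((public_preds[i] - public_preds[j]) ** 2)
    return float(np.sum(local_losses) + lam * penalty)
```

Only predictions on the public dataset enter the coupling term, so the models can be of different types (parametric or not) as long as each can be trained with an added loss term.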

Article 232

Title@2025-05-29 (4): Dequantified Diffusion-Schrödinger Bridge for Density Ratio Estimation

Title: Dequantified Diffusion-Schrödinger Bridge for Density Ratio Estimation

Dequantifizierte Diffusion-Schrödinger-Brücke für Dichte-Verhältnis-Schätzung

密度比率估计的量化扩散 - Schrödinger桥 2505.05034v3

Authors: Wei Chen, Shigui Li, Jiacheng Li, Junmei Yang, John Paisley, Delu Zeng

Density ratio estimation is fundamental to tasks involving f-divergences, yet existing methods often fail under significantly different distributions or inadequately overlapping supports, known as the density-chasm and support-chasm problems. Additionally, prior approaches yield divergent time scores near boundaries, leading to instability. We design $\textbf{D}^3\textbf{RE}$, a unified framework for robust, stable and efficient density ratio estimation. We propose the dequantified diffusion bridge interpolant (DDBI), which expands support coverage and stabilizes time scores via diffusion bridges and Gaussian dequantization. Building on DDBI, the proposed dequantified Schrödinger bridge interpolant (DSBI) incorporates optimal transport to solve the Schrödinger bridge problem, enhancing accuracy and efficiency. Our method offers uniform approximation and bounded time scores in theory, and outperforms baselines empirically in mutual information and density estimation tasks.

密度比估计是涉及f-散度任务的基础,然而现有方法在分布差异显著或支撑集重叠不足时往往失效,即密度鸿沟(density-chasm)和支撑鸿沟(support-chasm)问题。此外,先前的方法在边界附近产生发散的时间得分,导致不稳定。我们设计了$\textbf{D}^3\textbf{RE}$,一个稳健、稳定且高效的密度比估计统一框架。我们提出了去量化扩散桥插值(DDBI),它通过扩散桥和高斯去量化扩大支撑覆盖并稳定时间得分。在DDBI的基础上,所提出的去量化Schrödinger桥插值(DSBI)引入最优传输来求解Schrödinger桥问题,从而提高准确性和效率。我们的方法在理论上提供一致逼近和有界的时间得分,并在互信息估计和密度估计任务上实证优于基线。

Article 233

Title@2025-05-29 (4): Unsupervisedly Learned Representations: Should the Quest be Over?

Title: Unsupervisedly Learned Representations: Should the Quest be Over?

Unüberwacht gelernte Repräsentationen: Sollte die Suche vorbei sein?

无人监督的派任代表:调查是否应该结束? 2001.07495v6

Authors: Daniel N. Nissani

After four decades of research there still exists a Classification accuracy gap of about 20% between our best Unsupervisedly Learned Representations methods and the accuracy rates achieved by intelligent animals. It thus may well be that we are looking in the wrong direction. A possible solution to this puzzle is presented. We demonstrate that Reinforcement Learning can learn representations which achieve the same accuracy as that of animals. Our main modest contribution lies in the observations that: a. when applied to a real world environment, Reinforcement Learning does not require labels, and thus may be legitimately considered as Unsupervised Learning, and b. in contrast, when Reinforcement Learning is applied in a simulated environment, it does inherently require labels and should thus generally be considered as Supervised Learning. The corollary of these observations is that further search for Unsupervised Learning competitive paradigms which may be trained in simulated environments may be futile.

经过40年的研究,我们的最佳未经监督的教学方法与智能动物所达到的精确率之间仍然存在着约20%的分类准确性差距。因此,我们很可能在寻找错误的方向。我们展示了这个谜题的可能解决办法。我们证明,加强学习可以学习与动物一样精确的表达方式。我们的主要微小贡献在于以下观察:a.在应用到真实的世界环境中,加强学习并不需要标签,因此可以被合法地视为不受监督的学习,b. 相比之下,当强化学习在模拟环境中应用时,它的确需要标签,因此一般应被视为监督学习。这些观察的必然结果是,进一步寻找可能在模拟环境中培训的不受监督的学习竞争性模式可能徒劳无益。

Article 234

Title@2025-05-29 (4): Rethinking Positive Pairs in Contrastive Learning

Title: Rethinking Positive Pairs in Contrastive Learning

Positive Paare im kontrastistischen Lernen neu denken

在反竞争学习中重新思考正对对 2410.18200v2

Authors: Jiantao Wu, Sara Atito, Zhenhua Feng, Shentong Mo, Josef Kitler, Muhammad Awais

The training methods in AI do involve semantically distinct pairs of samples. However, their role typically is to enhance the between-class separability. The actual notion of similarity is normally learned from semantically identical pairs. This paper presents SimLAP: a simple framework for learning visual representation from arbitrary pairs. SimLAP explores the possibility of learning similarity from semantically distinct sample pairs. The approach is motivated by the observation that for any pair of classes there exists a subspace in which semantically distinct samples exhibit similarity. This phenomenon can be exploited for a novel method of learning, which optimises the similarity of an arbitrary pair of samples, while simultaneously learning the enabling subspace. The feasibility of the approach will be demonstrated experimentally and its merits discussed.

AI中的培训方法确实涉及分解的样本,但是,它们的作用通常是加强分级的分离性。通常,相似性的实际概念是从同义的对等中学习的。本文介绍了SimLAP:从任意的对等中学习视觉表现的简单框架。SimLAP探讨了从分立的样本对对等中学习相似性的可能性。这种方法的动因是观测到,对于任何一对类别中存在一个分空间,在分层中,分立的样本表现出相似性。这个现象可以被用于一种新型的学习方法,这种方法在选择任意的对等样本的相似性的同时学习赋能的子空间。该方法的可行性将在实验中加以展示,并讨论其优点。

Article 235

Title@2025-05-29 (4): Improving the Effective Receptive Field of Message-Passing Neural Networks

Title: Improving the Effective Receptive Field of Message-Passing Neural Networks

Verbesserung des effektiven Empfangsfeldes von message-passing Neural Networks

改进信息传送神经网络的有效接收领域 2505.23185v1

Authors: Shahaf E. Finder, Ron Shapira Weber, Moshe Eliasof, Oren Freifeld, Eran Treister

Message-Passing Neural Networks (MPNNs) have become a cornerstone for processing and analyzing graph-structured data. However, their effectiveness is often hindered by phenomena such as over-squashing, where long-range dependencies or interactions are inadequately captured and expressed in the MPNN output. This limitation mirrors the challenges of the Effective Receptive Field (ERF) in Convolutional Neural Networks (CNNs), where the theoretical receptive field is underutilized in practice. In this work, we show and theoretically explain the limited ERF problem in MPNNs. Furthermore, inspired by recent advances in ERF augmentation for CNNs, we propose an Interleaved Multiscale Message-Passing Neural Networks (IM-MPNN) architecture to address these problems in MPNNs. Our method incorporates a hierarchical coarsening of the graph, enabling message-passing across multiscale representations and facilitating long-range interactions without excessive depth or parameterization. Through extensive evaluations on benchmarks such as the Long-Range Graph Benchmark (LRGB), we demonstrate substantial improvements over baseline MPNNs in capturing long-range dependencies while maintaining computational efficiency.

信息传递神经网络已成为处理和分析图表结构数据的基石,但其效力往往受到过度夸大等现象的阻碍,因为长期依赖性或相互作用在MPNN输出中没有得到充分的反映和表达。这一限制反映了动态神经网络中有效接收域(ERF)的挑战,理论可接受域在实践中没有得到充分利用。在这项工作中,我们展示和理论上解释MPNN的有限ERF问题。此外,在CNN ER扩增最近的进展的启发下,我们提议在MPNNN建立跨离式多级信息传递神经网络(IM-MPNNN)结构,以解决这些问题。我们的方法包括图的等级分解,使信息能够跨越多尺度的表达方式,便利远程互动,而没有过度深度或参数化。我们通过对远程图像基准(LRGBN)等基准的广泛评价,显示在计算效率的同时,在捕获远程依赖性基准(IM-MPN)方面大大改进了基线。

Article 236

Title@2025-05-29 (4): Two Is Better Than One: Rotations Scale LoRAs

Title: Two Is Better Than One: Rotations Scale LoRAs

Zwei ist besser als eins: Rotationsskala LoRAs

二比一好:轮作规模LORAs 2505.23184v1

Authors: Hongcan Guo, Guoshun Nan, Yuan Yang, Diyang Zhang, Haotian Li, Zhican Chen, Qinchuan Zhou, Yuhan Ran, Xinye Cao, Sicong Leng, Xiaofeng Tao, Xudong Jiang

Scaling Low-Rank Adaptation (LoRA)-based Mixture-of-Experts (MoE) enables large language models (LLMs) to adapt efficiently to diverse tasks. However, traditional gating mechanisms that route inputs to the best experts may fundamentally hinder LLMs’ scalability, leading to poor generalization and underfitting issues. We identify that the root cause lies in the restricted expressiveness of existing weighted-sum mechanisms, both within and outside the convex cone of LoRA representations. This motivates us to propose RadarGate, a novel geometrically inspired gating method that introduces rotational operations on LoRA representations to boost expressiveness and facilitate richer feature interactions among multiple LoRAs for scalable LLMs. Specifically, we first fuse each LoRA representation with other LoRAs using a learnable component and then feed the output to a rotation matrix. This matrix involves learnable parameters that define the relative angular relationship between LoRA representations. Such a simple yet effective mechanism provides an extra degree of freedom, facilitating the learning of cross-LoRA synergies and properly tracking the challenging poor generalization and underfitting issues as the number of LoRAs grows. Extensive experiments on 6 public benchmarks across 21 tasks show the effectiveness of our RadarGate for scaling LoRAs. We also provide valuable insights, revealing that the rotations applied to each pair of representations are contrastive, encouraging closer alignment of semantically similar representations during geometrical transformation while pushing distant ones further apart. We will release our code to the community.

低朗适应(LORA)基于低朗适应(LORA)的低朗适应(LOE)的混合物(MOE)有助于大型语言模型(LLMS)有效适应各种任务;然而,将投入投入输送给最佳专家的传统机制可能会从根本上阻碍LLMS的伸缩性,导致LLMS的简化和不适当问题;我们发现,根源在于现有加权和加权机制在LORA的表层内和外的表达方式的清晰度有限;这促使我们提出雷达Gate(RadarGate),这是一种具有地貌灵感的新型定位方法,引入LORA代表方式的旋转性操作,以提升其清晰度,便利多个LORA的伸缩性,促进多个LLOMS之间的更丰富性特征互动。具体地说,我们首先将每个LORA代表方式与其他LAM的伸缩性整合起来,然后将输出到轮值矩阵中。这种简单但有效的机制提供了额外的自由度,有助于学习LARA的交叉互动协作,并正确跟踪具有挑战性的缩缩缩略缩缩缩缩的缩缩缩缩缩缩图表。

Article 237

Title@2025-05-29 (4): MADCluster: Model-agnostic Anomaly Detection with Self-supervised Clustering Network

Title: MADCluster: Model-agnostic Anomaly Detection with Self-supervised Clustering Network

MADCluster: Modell-agnostische Anomalieerkennung mit selbstüberwachtem Clustering-Netzwerk

MADCluster:使用自监管的集群网进行模型-不可知异常探测 2505.16223v2

Authors: Sangyong Lee, Subo Hwang, Dohoon Kim

In this paper, we propose MADCluster, a novel model-agnostic anomaly detection framework utilizing self-supervised clustering. MADCluster is applicable to various deep learning architectures and addresses the ‘hypersphere collapse’ problem inherent in existing deep learning-based anomaly detection methods. The core idea is to cluster normal pattern data into a ‘single cluster’ while simultaneously learning the cluster center and mapping data close to this center. Also, to improve expressiveness and enable effective single clustering, we propose a new ‘One-directed Adaptive loss’. The optimization of this loss is mathematically proven. MADCluster consists of three main components: Base Embedder capturing high-dimensional temporal dynamics, Cluster Distance Mapping, and Sequence-wise Clustering for continuous center updates. Its model-agnostic characteristics are achieved by applying various architectures to the Base Embedder. Experiments on four time series benchmark datasets demonstrate that applying MADCluster improves the overall performance of comparative models. In conclusion, the compatibility of MADCluster shows potential for enhancing model performance across various architectures.

在本文中,我们提出MADCluster, 这是一种利用自我监督的集群, 新的模型- 不可知异常检测框架。 MADCluster 适用于各种深层学习结构, 并解决现有深层学习异常检测方法中固有的“ 整体崩溃” 问题。核心思想是将正常模式数据分组成“ 单组群集” , 同时学习集束中心, 绘制与此中心相近的数据。另外, 为了提高表达性, 并能够进行有效的单一集束, 我们提出了一个新的“ 单向适应损失” 。对这种损失的优化得到了数学上的证明。 MADCluster 由三个主要组成部分组成: 底嵌入器捕捉高度时间动态, 集群距离绘图, 以及连续中心更新的顺序组合。其模型- 数学特征是通过对基底嵌入器应用各种结构实现的。对四个时间序列基准数据集的实验表明, 应用MADCluster 将改善比较模型的总体性。总之, MADCluster 的兼容性展示了各种结构中增强模型性的潜力。

Article 238

Title@2025-05-29 (4): FSL-SAGE: Accelerating Federated Split Learning via Smashed Activation Gradient Estimation

Title: FSL-SAGE: Accelerating Federated Split Learning via Smashed Activation Gradient Estimation

FSL-SAGE: Beschleunigung des Federated Split Learning durch Smashed Activation Gradient Abschätzung

FSL-SAGE:通过分散的激励加速渐进式估算,加速联邦分化学习 2505.23182v1

Authors: Srijith Nair, Michael Lin, Amirreza Talebi, Peizhong Ju, Elizabeth Bentley, Jia Liu

Collaborative training methods like Federated Learning (FL) and Split Learning (SL) enable distributed machine learning without sharing raw data. However, FL assumes clients can train entire models, which is infeasible for large-scale models. In contrast, while SL alleviates the client memory constraint in FL by offloading most training to the server, it increases network latency due to its sequential nature. Other methods address the conundrum by using local loss functions for parallel client-side training to improve efficiency, but they lack server feedback and potentially suffer poor accuracy. We propose FSL-SAGE (Federated Split Learning via Smashed Activation Gradient Estimation), a new federated split learning algorithm that estimates server-side gradient feedback via auxiliary models. These auxiliary models periodically adapt to emulate server behavior on local datasets. We show that FSL-SAGE achieves a convergence rate of $\mathcal{O}(1/\sqrt{T})$, where $T$ is the number of communication rounds. This result matches FedAvg, while significantly reducing communication costs and client memory requirements. Our empirical results also verify that it outperforms existing state-of-the-art FSL methods, offering both communication efficiency and accuracy.

合作培训方法,如Federal Learning(FL)和Splet Learning(SL)等合作培训方法,使得分散的机器学习无需共享原始数据。然而,FL假设客户可以培训整个模型,这对于大型模型来说是行不通的。相比之下,虽然SL通过将大多数培训卸载到服务器,减轻了FL客户的记忆限制,但由于其相继性质,它增加了网络的延迟性。其他方法通过利用当地损失功能进行平行客户方培训来解决难题,提高效率,但是它们缺乏服务器反馈,并可能受到错误的准确性。我们提议FSL-SAG(通过超速动作快速动画快速动画快速动)可以培训整个模型(FSL-SAGAGE ) , 一种新的联合分离学习算法,通过辅助模型来估计服务器-侧梯度反馈。这些辅助模型定期适应当地数据集的服务器行为。我们显示FSAL-SAGEGA达到$math cal {O}(1/ sqrt{T}) $, 其中的通信回合数为$T}。这个结果与FAVAvg,同时大幅降低通信成本和客户记忆要求。我们的经验结果也提供了。

Article 239

Title@2025-05-29 (4): FreRA: A Frequency-Refined Augmentation for Contrastive Learning on Time Series Classification

Title: FreRA: A Frequency-Refined Augmentation for Contrastive Learning on Time Series Classification

FreRA: Eine frequenzrefinierte Augmentation für kontrastives Lernen in der Zeitreihenklassifikation

FreRA:关于时间序列分类的校对性学习频率改进 2505.23181v1

Authors: Tian Tian, Chunyan Miao, Hangwei Qian

Contrastive learning has emerged as a competent approach for unsupervised representation learning. However, the design of an optimal augmentation strategy, although crucial for contrastive learning, is less explored for time series classification tasks. Existing predefined time-domain augmentation methods are primarily adopted from vision and are not specific to time series data. Consequently, this cross-modality incompatibility may distort the semantically relevant information of time series by introducing mismatched patterns into the data. To address this limitation, we present a novel perspective from the frequency domain and identify three advantages for downstream classification: global, independent, and compact. To fully utilize the three properties, we propose the lightweight yet effective Frequency Refined Augmentation (FreRA) tailored for time series contrastive learning on classification tasks, which can be seamlessly integrated with contrastive learning frameworks in a plug-and-play manner. Specifically, FreRA automatically separates critical and unimportant frequency components. Accordingly, we propose semantic-aware Identity Modification and semantic-agnostic Self-adaptive Modification to protect semantically relevant information in the critical frequency components and infuse variance into the unimportant ones respectively. Theoretically, we prove that FreRA generates semantic-preserving views. Empirically, we conduct extensive experiments on two benchmark datasets, including UCR and UEA archives, as well as five large-scale datasets on diverse applications. FreRA consistently outperforms ten leading baselines on time series classification, anomaly detection, and transfer learning tasks, demonstrating superior capabilities in contrastive representation learning and generalization in transfer learning scenarios across diverse datasets.

相互抵触的学习已成为不受监督的代表制学习的一种合格方法。然而,设计最佳增强战略虽然对对比学习至关重要,但对于时间序列分类任务而言,探索得较少。现有的预先定义的时间域增强方法主要从视觉角度采用,而并非针对时间序列数据。因此,这种交叉现代不兼容性可能会通过在数据中引入不匹配的模式,扭曲时间序列的语义信息。为了应对这一限制,我们从频率域提出一个新的视角,并找出下游分类的三个优势:全球、独立和紧凑。要充分利用这三个属性,我们建议为时间序列任务专门设计为时间序列对比学习而设计的轻重但有效的频率更新的增强(FreRA)方法,这些方法可以以插接和播放的方式与对比性学习框架密切结合。具体地说,FreRA自动将关键和不重要的频率组成部分分开。因此,我们提议从频率域域角度认识身份的修改和语义识别自定义自适应性自我调整,以保护关键频率组件中的语系相关信息,并且不精确地将精确的递增缩的变校程,在常规的档案中分别生成数据。
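The FreRA abstract above describes keeping critical frequency components intact while injecting variance into unimportant ones. The following is a minimal sketch of that idea only, not the authors' method: FreRA is said to separate the components automatically, whereas this sketch assumes a simple magnitude ranking as a stand-in, and the function name `frequency_augment` and the parameters `keep_ratio` and `noise_scale` are hypothetical.

```python
import numpy as np

def frequency_augment(x, keep_ratio=0.2, noise_scale=0.5, rng=None):
    """Return an augmented view of a 1-D series plus the mask of preserved bins.

    Assumption: "critical" bins are the largest-magnitude ones. This is a
    placeholder for the learned separation described in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    spectrum = np.fft.rfft(x)
    magnitude = np.abs(spectrum)

    # Mark the top-`keep_ratio` fraction of bins as critical.
    n_keep = max(1, int(round(keep_ratio * spectrum.size)))
    critical = np.zeros(spectrum.size, dtype=bool)
    critical[np.argsort(magnitude)[-n_keep:]] = True

    # Identity on critical bins, random real rescaling on the remaining bins.
    scale = rng.uniform(1.0 - noise_scale, 1.0 + noise_scale, spectrum.size)
    scale[critical] = 1.0

    augmented = np.fft.irfft(spectrum * scale, n=x.size)
    return augmented, critical

# Toy usage: a noisy sinusoid whose dominant bin (index 3) should be preserved.
rng = np.random.default_rng(0)
t = np.arange(64)
series = np.sin(2 * np.pi * 3 * t / 64) + 0.1 * rng.standard_normal(64)
view, critical_mask = frequency_augment(series, rng=rng)
```

Two such views of the same series could serve as a positive pair in a contrastive objective; how the real method chooses and modifies components is specified in the paper, not here.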

Article 240

Title@2025-05-29 (4): The Panaceas for Improving Low-Rank Decomposition in Communication-Efficient Federated Learning

Title: The Panaceas for Improving Low-Rank Decomposition in Communication-Efficient Federated Learning

Die Panaceas zur Verbesserung der Zersetzung mit geringem Rank im kommunikativ-effizienten Federated Learning

改善通信-高效联邦学习中低-兰克分解的全景 2505.23176v1

Authors: Shiwei Li, Xiandi Luo, Haozhao Wang, Xing Tang, Shijie Xu, Weihong Luo, Yuhua Li, Xiuqiang He, Ruixuan Li

To improve the training efficiency of federated learning (FL), previous research has employed low-rank decomposition techniques to reduce communication overhead. In this paper, we seek to enhance the performance of these low-rank decomposition methods. Specifically, we focus on three key issues related to decomposition in FL: what to decompose, how to decompose, and how to aggregate. Subsequently, we introduce three novel techniques: Model Update Decomposition (MUD), Block-wise Kronecker Decomposition (BKD), and Aggregation-Aware Decomposition (AAD), each targeting a specific issue. These techniques are complementary and can be applied simultaneously to achieve optimal performance. Additionally, we provide a rigorous theoretical analysis to ensure the convergence of the proposed MUD. Extensive experimental results show that our approach achieves faster convergence and superior accuracy compared to relevant baseline methods. The code is available at https://github.com/Leopold1423/fedmud-icml25.

为了提高联邦学习的培训效率,先前的研究采用了低级分解技术,以减少通信管理费用。在本文中,我们力求提高这些低级分解方法的绩效。具体地说,我们侧重于与FL分解有关的三个关键问题:分解什么,如何分解,如何分解,以及如何综合。随后,我们引入了三种新颖技术:模范更新分解技术(MUD),布洛克-中克罗内克分解技术(BKD),以及聚合-Aware分解技术(AAAD),这些技术都是针对一个具体问题的。这些技术是相辅相成的,可以同时应用,以实现最佳绩效。此外,我们提供了严格的理论分析,以确保拟议的MUD的趋同。广泛的实验结果表明,我们的方法与相关的基线方法相比,更快地趋同和更加精确。该代码可在https://github.com/Leopold1423Fedmud-icml25上查阅。
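To make the communication saving behind low-rank decomposition concrete, here is a minimal sketch in which a client sends two thin factors of its model update instead of the full matrix. It assumes a truncated SVD as the factorization and treats the update (local minus global weights) as the thing being decomposed; both are illustrative assumptions, and the helper names `compress_update` and `payload_size` are hypothetical rather than taken from the paper's code.

```python
import numpy as np

def compress_update(delta, rank):
    """Factor an (m, n) update into A (m, rank) and B (rank, n) via truncated SVD."""
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # fold singular values into the left factor
    b = vt[:rank, :]
    return a, b

def payload_size(shape, rank):
    """Number of scalars transmitted when sending the two factors."""
    m, n = shape
    return rank * (m + n)

# Toy round: the client update is exactly rank 4, so rank-4 factors lose nothing.
rng = np.random.default_rng(0)
w_global = rng.standard_normal((256, 128))
w_local = w_global + rng.standard_normal((256, 4)) @ rng.standard_normal((4, 128))

a, b = compress_update(w_local - w_global, rank=4)
w_server = w_global + a @ b          # server-side reconstruction
dense_cost = w_global.size           # 256 * 128 scalars for the full update
low_rank_cost = payload_size(w_global.shape, rank=4)
```

In this toy setting the factors carry 1,536 scalars instead of 32,768. Real updates are only approximately low rank, so the rank trades accuracy against bandwidth.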

Article 241

Title@2025-05-29 (4): Contrastive Learning and Abstract Concepts: The Case of Natural Numbers

Title: Contrastive Learning and Abstract Concepts: The Case of Natural Numbers

Kontrastives Lernen und abstrakte Konzepte: Der Fall natürlicher Zahlen

差异学习和抽象概念:自然数字案例 2408.02247v6

Authors: Daniel N. Nissani

Contrastive Learning (CL) has been successfully applied to classification and other downstream tasks related to concrete concepts, such as objects contained in the ImageNet dataset. No attempts seem to have been made so far to apply this promising scheme to more abstract entities. A prominent example of these could be the concept of (discrete) Quantity. CL can frequently be interpreted as a self-supervised scheme guided by some profound and ubiquitous conservation principle (e.g. conservation of identity in object classification tasks). In this introductory work we apply a suitable conservation principle to the semi-abstract concept of natural numbers, by which discrete quantities can be estimated or predicted. We experimentally show, by means of a toy problem, that contrastive learning can be trained to count at a glance with high accuracy at both human and super-human ranges. We compare this with the results of a supervised learning (SL) neural network of similar architecture that is trained to count at a glance. We show that both schemes exhibit similarly good performance on baseline experiments, where the distributions of the training and testing stages are equal. Importantly, we demonstrate that in some generalization scenarios, where training and testing distributions differ, CL boasts more robust and much better error performance.

在与具体概念有关的分类和其他下游任务方面,例如图像网络数据集中包含的物体,已经成功地应用了对比学习(CL),在分类和其他与具体概念相关的下游任务方面,例如图像网络数据集中包含的物体。在将这一有希望的计划应用于更抽象的实体方面,迄今似乎没有尝试过任何尝试。其中的一个突出的例子可能是(分辨)数量的概念。CL可以经常被解释为一种由某些深刻和无处不在的保存原则(例如,在物体分类任务中保存身份)指导的自我监督计划。在这项介绍性工作中,我们对自然数字的半抽取概念适用了适当的保护原则,通过这种概念可以估计或预测离散的数量。我们实验性地表明,通过一个微小的问题,对比学习可以被训练在人类和超人范围内以高精度的眼光进行计算。我们把这个结果与经过培训后计数的类似结构的视觉学习(SL)神经网络计划的结果进行比较。我们表明,这两个计划在基线实验中都表现出类似的良好业绩表现,在那里,培训和测试阶段的分布是相同的,我们在一般的判断中显示比较精确的成绩。

Article 242

Title@2025-05-29 (4): Pseudo Multi-Source Domain Generalization: Bridging the Gap Between Single and Multi-Source Domain Generalization

Title: Pseudo Multi-Source Domain Generalization: Bridging the Gap Between Single and Multi-Source Domain Generalization

Pseudo-Multi-Source-Domain-Verallgemeinerung: Die Lücke zwischen Single- und Multi-Source-Domain-Verallgemeinerung überbrücken

Pseudo多源多源通用化:缩小单一源和多源通用化之间的差距 2505.23173v1

Authors: Shohei Enomoto

Deep learning models often struggle to maintain performance when deployed on data distributions different from their training data, particularly in real-world applications where environmental conditions frequently change. While Multi-source Domain Generalization (MDG) has shown promise in addressing this challenge by leveraging multiple source domains during training, its practical application is limited by the significant costs and difficulties associated with creating multi-domain datasets. To address this limitation, we propose Pseudo Multi-source Domain Generalization (PMDG), a novel framework that enables the application of sophisticated MDG algorithms in more practical Single-source Domain Generalization (SDG) settings. PMDG generates multiple pseudo-domains from a single source domain through style transfer and data augmentation techniques, creating a synthetic multi-domain dataset that can be used with existing MDG algorithms. Through extensive experiments with PseudoDomainBed, our modified version of the DomainBed benchmark, we analyze the effectiveness of PMDG across multiple datasets and architectures. Our analysis reveals several key findings, including a positive correlation between MDG and PMDG performance and the potential of pseudo-domains to match or exceed actual multi-domain performance with sufficient data. These comprehensive empirical results provide valuable insights for future research in domain generalization. Our code is available at https://github.com/s-enmt/PseudoDomainBed.

深度学习模式往往难以在与培训数据不同的数据发布上保持绩效,特别是在环境条件经常变化的现实世界应用中。多源通用化(MDG)在通过培训中利用多个源域显示应对这一挑战的前景,但其实际应用却因创建多域数据集的巨大成本和困难而受到限制。为了应对这一限制,我们提议Pseudo多多源多源通用化(PMDG),这是一个新颖的框架,使得能够在更实用的单一源通用化(SDG)设置中应用复杂的千年发展目标算法。PMD在单一源域中产生多种假数据,通过样式转让和数据增强技术,从单一源域产生多种假数据,创建合成多域数据集,与现有的千年发展目标算法一起使用。通过与我们修改版的DoneBed基准(Pseudomamaineed)的广泛实验,我们分析了多数据集和结构中PMDMDG的有效性。我们的分析揭示了几项关键结论,包括千年发展目标和PMDGPGS的性能和伪Dodomain-mains潜力,以匹配或超过我们现有的多域/Gealmamamamainalalalalalal exalalalal exalal exmental disals提供足够或超过或超过或超过我们现有的多域内的现有数据。

Article 243

Title@2025-05-29 (4): Global Tensor Motion Planning

Title: Global Tensor Motion Planning

Globale Tensor-Bewegungsplanung

全球时势规划 2411.19393v3

Authors: An T. Le, Kay Hansel, João Carvalho, Joe Watson, Julen Urain, Armin Biess, Georgia Chalvatzaki, Jan Peters

Batch planning is increasingly necessary to quickly produce diverse, high-quality motion plans for downstream learning applications, such as distillation and imitation learning. This paper presents Global Tensor Motion Planning (GTMP) – a sampling-based motion planning algorithm comprising only tensor operations. We introduce a novel discretization structure represented as a random multipartite graph, enabling efficient vectorized sampling, collision checking, and search. We provide a theoretical investigation showing that GTMP exhibits probabilistic completeness while supporting modern GPUs/TPUs. Additionally, by incorporating smooth structures into the multipartite graph, GTMP directly plans smooth splines without requiring gradient-based optimization. Experiments on lidar-scanned occupancy maps and the MotionBenchMarker dataset demonstrate GTMP’s computational efficiency in batch planning compared to baselines, underscoring GTMP’s potential as a robust, scalable planner for diverse applications and large-scale robot learning tasks.

批量规划对于迅速为下游学习应用(如蒸馏和模仿学习)制定多样化和高质量的运动计划越来越有必要。本文介绍了全球电锯运动规划(GTMP) – – 一种基于取样的运动规划算法,仅包含高温操作。我们引入了一种新型的离散结构,作为随机的多部分图,使高效的矢量取样、碰撞检查和搜索成为可能。我们提供了一项理论调查,表明GTMP在支持现代GPU/TPU的同时,表现出了概率性的完整性。此外,通过将平滑的结构纳入多面图,GTMP直接计划平滑的样条,而不需要基于梯度的优化。关于Lidar扫描占用图和Mtion BenchMarker数据集的实验显示了GTMP在批量规划中与基线相比的计算效率,强调GTMP作为各种应用和大规模机器人学习任务的强大、可扩展的规划员的潜力。
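The random multipartite graph with vectorized sampling, collision checking, and search can be illustrated with a small NumPy sketch. This is an assumption-laden toy, not the GTMP implementation: it plans in the unit square, uses circular obstacles, checks straight-line edges at a fixed number of interpolation points, and finds the shortest layered path with a backward min-plus sweep. The function name `plan` and all parameters are hypothetical.

```python
import numpy as np

def plan(start, goal, centers, radii, n_layers=4, n_points=32, n_checks=10, seed=0):
    """Shortest collision-free path through randomly sampled layers, or (None, inf)."""
    rng = np.random.default_rng(seed)
    dim = start.size
    # Layers of the multipartite graph: start, random waypoint layers, goal.
    layers = [start[None, :]]
    layers += [rng.uniform(0.0, 1.0, (n_points, dim)) for _ in range(n_layers)]
    layers.append(goal[None, :])

    t = np.linspace(0.0, 1.0, n_checks)[None, None, :, None]
    costs = []
    for a, b in zip(layers[:-1], layers[1:]):
        # Interpolate every edge between consecutive layers: (Na, Nb, K, dim).
        seg = a[:, None, None, :] * (1.0 - t) + b[None, :, None, :] * t
        # Distance of each interpolated point to each obstacle centre.
        dist = np.linalg.norm(seg[..., None, :] - centers, axis=-1)
        blocked = (dist < radii).any(axis=(-1, -2))
        length = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        costs.append(np.where(blocked, np.inf, length))

    # Backward sweep: value[j] is the cheapest cost-to-goal from node j.
    value, choice = np.zeros(1), []
    for c in reversed(costs):
        total = c + value[None, :]
        choice.append(total.argmin(axis=1))
        value = total.min(axis=1)
    choice.reverse()
    if not np.isfinite(value[0]):
        return None, np.inf

    # Follow the stored argmins forward from the start node.
    path, idx = [start], 0
    for layer, best_next in zip(layers[1:], choice):
        idx = best_next[idx]
        path.append(layer[idx])
    return np.stack(path), float(value[0])

path, cost = plan(np.array([0.05, 0.5]), np.array([0.95, 0.5]),
                  centers=np.array([[0.5, 0.5]]), radii=np.array([0.2]))
```

The loop over layers is kept for readability. Because every intermediate layer has the same size, the edge tensors could be stacked and processed in a single batched operation, which is closer in spirit to the "only tensor operations" claim in the abstract.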

Article 244

Title@2025-05-29 (4): Pre-training for Recommendation Unlearning

Title: Pre-training for Recommendation Unlearning

Vorschulung für Empfehlung Unlearning

建议培训前培训 2505.22649v2

Authors: Guoxuan Chen, Lianghao Xia, Chao Huang

Modern recommender systems powered by Graph Neural Networks (GNNs) excel at modeling complex user-item interactions, yet increasingly face scenarios requiring selective forgetting of training data. Beyond user requests to remove specific interactions due to privacy concerns or preference changes, regulatory frameworks mandate that recommender systems be able to eliminate the influence of certain user data from models. This recommendation unlearning challenge presents unique difficulties, as removing connections within interaction graphs creates ripple effects throughout the model, potentially impacting recommendations for numerous users. Traditional approaches suffer from significant drawbacks: fragmentation methods damage graph structure and diminish performance, while influence function techniques make assumptions that may not hold in complex GNNs, particularly with self-supervised or random architectures. To address these limitations, we propose a novel model-agnostic pre-training paradigm, UnlearnRec, that prepares systems for efficient unlearning operations. Our Influence Encoder takes unlearning requests together with existing model parameters and directly produces updated parameters of the unlearned model with little fine-tuning, avoiding complete retraining while preserving model performance characteristics. Extensive evaluation on public benchmarks demonstrates that our method delivers exceptional unlearning effectiveness while providing more than 10x speedup compared to retraining approaches. We release our method implementation at: https://github.com/HKUDS/UnlearnRec.

由图形神经网络(GNNS)推动的现代推荐系统在模拟复杂的用户-项目互动方面十分出色,但日益面临需要选择性地忘记培训数据的各种情景。除了用户要求消除因隐私问题或偏好变化而产生的特定互动外,监管框架授权用户要求删除特定用户数据的影响。这个建议不学习的挑战带来了独特的困难,因为删除互动图中的连接在整个模型中产生连结效应,可能对许多用户产生潜在影响。传统方法存在重大缺陷:碎裂方法损坏图表结构并降低性能,而影响功能技术则在复杂的GNNS中,特别是自我监督或随机结构中做出可能无法维持的假设。为了解决这些限制,我们提议建立一个新型的模型-认知前模式UnlearnRec,为高效的不学习操作准备系统。我们的影响编码器与现有的模型参数一起,直接生成了不学习模型的最新参数,很少进行微调,避免完全再培训,同时保持模型性能特性。对公共基准进行广泛的评估表明,我们的方法在提供超乎寻常的不学习有效性,同时提供10x/REUDRUDS对比再培训方法。

Article 245

Title@2025-05-29 (4): Best Arm Identification with Possibly Biased Offline Data

Title: Best Arm Identification with Possibly Biased Offline Data

Best Arm Identification mit möglicherweise Biased Offline Daten

最佳武器标识(可能附带的离线数据) 2505.23165v1

Authors: Le Yang, Vincent Y. F. Tan, Wang Chi Cheung

We study the best arm identification (BAI) problem with potentially biased offline data in the fixed confidence setting, which commonly arises in real-world scenarios such as clinical trials. We prove an impossibility result for adaptive algorithms without prior knowledge of the bias bound between online and offline distributions. To address this, we propose the LUCB-H algorithm, which introduces adaptive confidence bounds by incorporating an auxiliary bias correction to balance offline and online data within the LUCB framework. Theoretical analysis shows that LUCB-H matches the sample complexity of standard LUCB when offline data is misleading and significantly outperforms it when offline data is helpful. We also derive an instance-dependent lower bound that matches the upper bound of LUCB-H in certain scenarios. Numerical experiments further demonstrate the robustness and adaptability of LUCB-H in effectively incorporating offline data.

我们研究固定信心环境下可能偏差的离线数据的最佳手臂识别问题,这通常出现在临床试验等现实世界情景中。我们证明,在没有事先了解在线和离线分布之间的偏差的情况下,适应性算法不可能产生结果。为了解决这个问题,我们提议采用LUCB-H算法,通过在 LUCB框架内纳入辅助性偏差校正以平衡离线和在线数据,引入适应性信任界限。理论分析表明,LUCB-H与标准的LUCB的样本复杂性相匹配,因为离线数据具有误导性,在离线数据有帮助时大大超过它。我们还得出了一个与LUCB-H的上界相匹配的低实例约束。数字实验进一步证明了LUCB-H在有效纳入离线数据方面的稳健性和适应性。

Article 246

Title@2025-05-29 (4): Temporal Relation Extraction in Clinical Texts: A Span-based Graph Transformer Approach

Title: Temporal Relation Extraction in Clinical Texts: A Span-based Graph Transformer Approach

Temporale Beziehungsextraktion in klinischen Texten: Ein Span-basierter Graph Transformer-Ansatz

临床文本中的时间关系抽取时间关系:基于泛泛面的图形变形器方法 2503.18085v2

Authors: Rochana Chaturvedi, Peyman Baghershahi, Sourav Medya, Barbara Di Eugenio

Temporal information extraction from unstructured text is essential for contextualizing events and deriving actionable insights, particularly in the medical domain. We address the task of extracting clinical events and their temporal relations using the well-studied I2B2 2012 Temporal Relations Challenge corpus. This task is inherently challenging due to complex clinical language, long documents, and sparse annotations. We introduce GRAPHTREX, a novel method integrating span-based entity-relation extraction, clinical large pre-trained language models (LPLMs), and Heterogeneous Graph Transformers (HGT) to capture local and global dependencies. Our HGT component facilitates information propagation across the document through innovative global landmarks that bridge distant entities. Our method advances the state of the art with a 5.5% improvement in the tempeval $F_1$ score over the previous best and up to 8.9% improvement on long-range relations, which present a formidable challenge. We further demonstrate generalizability by establishing a strong baseline on the E3C corpus. This work not only advances temporal information extraction but also lays the groundwork for improved diagnostic and prognostic models through enhanced temporal reasoning.

从非结构化文本中抽取时空信息对于使事件背景化和产生可操作的洞察力至关重要,特别是在医疗领域。我们利用研究周密的2012年I2B2《时际关系挑战》来应对提取临床事件及其时间关系的任务。由于复杂的临床语言、长的文件和稀疏的注释,这项任务具有内在挑战性。我们引入了GRAPHTREX,这是将基于跨实体关系的提取、临床预先培训的大型语言模型(LPLMS)和异质图形变异器(HGT)整合在一起的新方法,以捕捉本地和全球依赖性。我们HGT部分不仅通过连接遥远实体的创新的全球里程碑促进在文件中的信息传播,而且通过强化的时空推理推理为改进诊断和预测模型打下基础。

Article 247

Title@2025-05-29 (4): Implicit Inversion turns CLIP into a Decoder

Title: Implicit Inversion turns CLIP into a Decoder

Implizite Inversion macht CLIP zu einem Decoder

隐隐性 Indicide Inversion 将 CLIP 转换为解码器 2505.23161v1

Authors: Antonio D’Orazio, Maria Rosaria Briglia, Donato Crisostomi, Dario Loi, Emanuele Rodolà, Iacopo Masi

CLIP is a discriminative model trained to align images and text in a shared embedding space. Due to its multimodal structure, it serves as the backbone of many generative pipelines, where a decoder is trained to map from the shared space back to images. In this work, we show that image synthesis is nevertheless possible using CLIP alone – without any decoder, training, or fine-tuning. Our approach optimizes a frequency-aware implicit neural representation that encourages coarse-to-fine generation by stratifying frequencies across network layers. To stabilize this inverse mapping, we introduce adversarially robust initialization, a lightweight Orthogonal Procrustes projection to align local text and image embeddings, and a blending loss that anchors outputs to natural image statistics. Without altering CLIP’s weights, this framework unlocks capabilities such as text-to-image generation, style transfer, and image reconstruction. These findings suggest that discriminative models may hold untapped generative potential, hidden in plain sight.

CLIP是一种歧视性模型,在共同嵌入空间对图像和文字进行匹配。由于其多式结构, 它是许多基因管道的主干, 在那里, 解码器经过培训, 从共享空间映射回到图像。在这项工作中, 我们显示图像合成仍然有可能单独使用 CLIP – – 无需任何解码、培训或微调。我们的方法优化了频觉隐性神经代表, 从而通过对网络各层的频率进行分解来鼓励粗化到纤维的生成。为了稳定这一反向映射, 我们引入了对抗性强的初始化, 轻量的 Orthogonal Procrustes 投影, 以对本地文本和图像嵌入进行匹配, 以及将输出锁定到自然图像统计数据的混合损失。在不改变 CLIP 的重量的情况下, 这个框架释放了文本到图像生成、风格传输和图像重建等能力。这些发现显示, 歧视模式可能具有未开发的基因化潜力, 隐藏在普通的视野中。
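The abstract mentions an Orthogonal Procrustes projection for aligning text and image embeddings. As a reminder of what that operation computes, here is a minimal sketch of the classical closed-form solution: the orthogonal matrix R minimizing the Frobenius norm of `source @ R - target` is obtained from the SVD of `source.T @ target`. How the paper selects the embedding pairs and applies the projection locally is not shown here; the variable names are hypothetical.

```python
import numpy as np

def orthogonal_procrustes(source, target):
    """Return the orthogonal matrix R that minimises ||source @ R - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

# Toy check: recover a known rotation/reflection applied to random "embeddings".
rng = np.random.default_rng(0)
text_emb = rng.standard_normal((50, 8))                   # stand-in for text embeddings
true_map, _ = np.linalg.qr(rng.standard_normal((8, 8)))   # random orthogonal matrix
image_emb = text_emb @ true_map                           # stand-in for image embeddings

recovered = orthogonal_procrustes(text_emb, image_emb)
alignment_error = np.linalg.norm(text_emb @ recovered - image_emb)
```

Because R is constrained to be orthogonal, the mapping preserves norms and angles, which keeps the operation lightweight and avoids distorting the embedding geometry.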

Article 248

Title@2025-05-29 (4): Topological Adaptive Least Mean Squares Algorithms over Simplicial Complexes

Title: Topological Adaptive Least Mean Squares Algorithms over Simplicial Complexes

Topologische Adaptive Least Mean Squares Algorithmen über Simplicial Complexes

简单综合体的地形适应性最低中度平方平方平方平方平方平方平方平 2505.23160v1

Authors: Lorenzo Marinucci, Claudio Battiloro, Paolo Di Lorenzo

This paper introduces a novel adaptive framework for processing dynamic flow signals over simplicial complexes, extending classical least-mean-squares (LMS) methods to high-order topological domains. Building on discrete Hodge theory, we present a topological LMS algorithm that efficiently processes streaming signals observed over time-varying edge subsets. We provide a detailed stochastic analysis of the algorithm, deriving its stability conditions, steady-state mean-square-error, and convergence speed, while exploring the impact of edge sampling on performance. We also propose strategies to design optimal edge sampling probabilities, minimizing rate while ensuring desired estimation accuracy. Assuming partial knowledge of the complex structure (e.g., the underlying graph), we introduce an adaptive topology inference method that integrates with the proposed LMS framework. Additionally, we propose a distributed version of the algorithm and analyze its stability and mean-square-error properties. Empirical results on synthetic and real-world traffic data demonstrate that our approach, in both centralized and distributed settings, outperforms graph-based LMS methods by leveraging higher-order topological features.

本文介绍一个新的适应框架,用于处理简单复合物的动态流信号,将典型的最小比例(LMS)方法扩大到高阶地形领域。根据离散的Hodge理论,我们提出一种地貌式LMS算法,高效处理在时间变化边缘子集中观测到的流信号。我们对算法进行详细的随机分析,得出其稳定性条件、稳定状态平均比例值和趋同速度,同时探索边缘取样对性能的影响。我们还提出了设计最佳边缘取样概率的战略,最大限度地降低比率,同时确保预期的估计准确性。假设对复杂结构(例如基本图)有部分了解,我们采用适应性地貌推论方法,与拟议的LMS框架相结合。此外,我们提出一个分布式的算法,分析其稳定性和平均比例值特性。关于合成和真实世界交通数据的实证结果表明,我们在中央和分布式环境中采用的方法,超越了基于图表的精确度。
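A rough sketch of what an LMS recursion over edge flows might look like is given below. It assumes, as in earlier graph-signal LMS work, that the true flow lies in the span of the lowest-frequency eigenvectors of the Hodge 1-Laplacian and that each edge is observed independently with a fixed probability at every step. These modelling choices, the update rule, and all names (`hodge_laplacian`, `topological_lms`, `p_sample`, `mu`) are assumptions for illustration and may differ from the paper's algorithm.

```python
import numpy as np

def hodge_laplacian(b1, b2):
    """Hodge 1-Laplacian from node-edge (b1) and edge-triangle (b2) incidence matrices."""
    return b1.T @ b1 + b2 @ b2.T

def topological_lms(b1, b2, flow, n_modes, p_sample, mu, n_steps, noise_std, seed=0):
    """Track an edge flow from noisy, randomly sampled edge observations."""
    rng = np.random.default_rng(seed)
    _, vecs = np.linalg.eigh(hodge_laplacian(b1, b2))
    u = vecs[:, :n_modes]            # assumed low-frequency signal subspace
    proj = u @ u.T
    est = np.zeros_like(flow)
    for _ in range(n_steps):
        mask = rng.random(flow.size) < p_sample          # edges observed this step
        obs = flow + noise_std * rng.standard_normal(flow.size)
        # LMS step: project the sampled innovation onto the signal subspace.
        est = est + mu * proj @ (mask * (obs - est))
    return est

# Complex with 4 nodes, 5 oriented edges, and one filled triangle (0, 1, 2).
# Edge order: (0,1), (0,2), (1,2), (1,3), (2,3).
B1 = np.array([[-1, -1,  0,  0,  0],
               [ 1,  0, -1, -1,  0],
               [ 0,  1,  1,  0, -1],
               [ 0,  0,  0,  1,  1]], dtype=float)
B2 = np.array([[1], [-1], [1], [0], [0]], dtype=float)

_, modes = np.linalg.eigh(hodge_laplacian(B1, B2))
true_flow = modes[:, :2] @ np.array([1.0, -0.5])
estimate = topological_lms(B1, B2, true_flow, n_modes=2, p_sample=0.6,
                           mu=0.5, n_steps=400, noise_std=0.05)
```

The step size `mu` and the sampling probability `p_sample` play the roles that the abstract's stability conditions and sampling-design strategies would govern. The values used here are arbitrary.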

Article 249

Title@2025-05-29 (4): Privacy-Aware Joint DNN Model Deployment and Partitioning Optimization for Collaborative Edge Inference Services

Title: Privacy-Aware Joint DNN Model Deployment and Partitioning Optimization for Collaborative Edge Inference Services

Privacy-Aware Joint DNN Model Bereitstellung und Partitionierung Optimierung für kollaborative Edge Inferenz Services

DNN 联合DNN 合作边缘推断服务示范部署和分离优化优化模式 2502.16091v3

Authors: Zhipeng Cheng, Xiaoyu Xia, Hong Wang, Minghui Liwang, Ning Chen, Xuwei Fan, Xianbin Wang

Edge inference (EI) has emerged as a promising paradigm to address the growing limitations of cloud-based Deep Neural Network (DNN) inference services, such as high response latency, limited scalability, and severe data privacy exposure. However, deploying DNN models on resource-constrained edge devices introduces additional challenges, including limited computation/storage resources, dynamic service demands, and heightened privacy risks. To tackle these issues, this paper presents a novel privacy-aware optimization framework that jointly addresses DNN model deployment, user-server association, and model partitioning, with the goal of minimizing long-term average inference delay under resource and privacy constraints. The problem is formulated as a complex, NP-hard stochastic optimization. To efficiently handle system dynamics and computational complexity, we employ a Lyapunov-based approach to transform the long-term objective into tractable per-slot decisions. Furthermore, we introduce a coalition formation game to enable adaptive user-server association and design a greedy algorithm for model deployment within each coalition. Extensive simulations demonstrate that the proposed algorithm significantly reduces inference delay and consistently satisfies privacy constraints, outperforming state-of-the-art baselines across diverse scenarios.

为解决这些问题,本文件提出了一个新的隐私优化框架,共同解决基于云的深神经网络(DNN)的测算服务(DNN)的日益局限性,如高反应延迟、缩放有限和数据隐私暴露严重等。然而,在资源紧缺的边缘装置中部署DNN模型带来了额外的挑战,包括有限的计算/存储资源、动态服务需求和增加隐私风险。为解决这些问题,本文件提出了一个新的隐私意识优化框架,共同解决DNN模型部署、用户-服务器关联和模型分割,目标是在资源和隐私限制下尽量减少长期平均推论延迟。问题被表述为复杂、硬的系统优化。为高效处理系统动态和计算复杂性,我们采用了基于Lyapunov的方法将长期目标转化为可移动的人均决定。此外,我们引入了一个联盟形成游戏,以便能够适应用户-服务器的组合,并为每个联盟内的模型部署设计一种贪婪的算法。广泛的模拟表明,拟议的算法大大降低了各种隐私基线限制之间的误差,并持续履行各种要求。

Article 250

Title@2025-05-29 (4): Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners

Title: Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners

Größer, regularisiert, kategorisch: High-Kapacity-Wert-Funktionen sind effiziente Multi-Task-Lerner

大型、正规、分类:高能力价值功能是高效多任务学习者 2505.23150v1

Authors: Michal Nauman, Marek Cygan, Carmelo Sferrazza, Aviral Kumar, Pieter Abbeel

Recent advances in language modeling and vision stem from training large models on diverse, multi-task data. This paradigm has had limited impact in value-based reinforcement learning (RL), where improvements are often driven by small models trained in a single-task context. This is because, in multi-task RL, sparse rewards and gradient conflicts make temporal-difference optimization brittle. Practical workflows for generalist policies therefore avoid online training, instead cloning expert trajectories or distilling collections of single-task policies into one agent. In this work, we show that the use of high-capacity value models trained via cross-entropy and conditioned on learnable task embeddings addresses the problem of task interference in online RL, allowing for robust and scalable multi-task training. We test our approach on 7 multi-task benchmarks with over 280 unique tasks, spanning high degree-of-freedom humanoid control and discrete vision-based RL. We find that, despite its simplicity, the proposed approach leads to state-of-the-art single-task and multi-task performance, as well as sample-efficient transfer to new tasks.

语言建模和愿景方面的近期进展来自对多种多任务数据大型模型的培训,这种模式在基于价值的强化学习(RL)方面影响有限,因为改进往往由在单一任务背景下培训的小型模型驱动。这是因为在多任务RL稀薄的奖励和梯度冲突中,时间差的优化使时间差变得微不足道。一般政策的实际工作流程因此避免了在线培训,而避免了克隆专家轨迹或将单任务政策集成成一个代理物。在这项工作中,我们发现,使用通过交叉渗透和以可学习任务嵌入为条件而培训的高能力值模型,解决了在线RL的任务干扰问题,从而能够进行稳健和可扩展的多任务培训。我们测试了7项多任务基准的方法,其任务超过280项,涵盖高程度自由类人类控制和离散的视野RL。我们发现,尽管拟议方法简单,但最终导致采用最先进的单一和多任务状态,并且具有抽样效率地转移到新任务。
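Training a value function "via cross-entropy" means the critic predicts a distribution over fixed value bins rather than a single scalar. One common way to build the target distribution is a two-hot encoding, sketched below in plain Python. The abstract does not say which encoding the authors use, so the two-hot choice, the bin range, and the function names are assumptions; conditioning on task embeddings is omitted.

```python
import math

def two_hot(target, v_min, v_max, num_bins):
    """Spread a scalar over its two neighbouring bins so the mean equals the clipped target."""
    target = min(max(target, v_min), v_max)
    step = (v_max - v_min) / (num_bins - 1)
    pos = (target - v_min) / step
    lower = min(int(math.floor(pos)), num_bins - 2)
    upper_weight = pos - lower
    probs = [0.0] * num_bins
    probs[lower] = 1.0 - upper_weight
    probs[lower + 1] = upper_weight
    return probs

def cross_entropy(logits, target_probs):
    """Cross-entropy between softmax(logits) and a target distribution."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -sum(p * (l - log_z) for l, p in zip(logits, target_probs))

def expected_value(probs, v_min, v_max):
    """Decode a categorical prediction back to a scalar value."""
    step = (v_max - v_min) / (len(probs) - 1)
    return sum(p * (v_min + i * step) for i, p in enumerate(probs))

# Toy usage: a TD target of 0.3 on five bins spanning [0, 1].
target_dist = two_hot(0.3, 0.0, 1.0, 5)
loss = cross_entropy([0.0, 0.0, 0.0, 0.0, 0.0], target_dist)
decoded = expected_value(target_dist, 0.0, 1.0)
```

A classification-style loss of this kind yields gradients whose scale does not depend on the magnitude of the returns, which is one commonly cited reason it can be easier to optimize than a squared-error regression when reward scales differ across tasks.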

Article 251

Title@2025-05-29 (4): FlowAlign: Trajectory-Regularized, Inversion-Free Flow-based Image Editing

Title: FlowAlign: Trajectory-Regularized, Inversion-Free Flow-based Image Editing

FlowAlign: Trajektorie-regularisierte, inversionsfreie Fluss-basierte Bildbearbeitung

流动对等: 轨迹- 重新分类、转换- 无流动图像编辑 2505.23145v1

Authors: Jeongsol Kim, Yeobin Hong, Jong Chul Ye

Recent inversion-free, flow-based image editing methods such as FlowEdit leverage a pre-trained noise-to-image flow model such as Stable Diffusion 3, enabling text-driven manipulation by solving an ordinary differential equation (ODE). While the lack of exact latent inversion is a core advantage of these methods, it often results in unstable editing trajectories and poor source consistency. To address this limitation, we propose FlowAlign, a novel inversion-free flow-based framework for consistent image editing with principled trajectory control. FlowAlign introduces a flow-matching loss as a regularization mechanism to promote smoother and more stable trajectories during the editing process. Notably, the flow-matching loss is shown to explicitly balance semantic alignment with the edit prompt and structural consistency with the source image along the trajectory. Furthermore, FlowAlign naturally supports reverse editing by simply reversing the ODE trajectory, highlighting the reversible and consistent nature of the transformation. Extensive experiments demonstrate that FlowAlign outperforms existing methods in both source preservation and editing controllability.

最近的无反向、无流动的图像编辑方法,如FlowEdit 等,利用事先训练的噪音到图像流模式,如Snable Difil 3,通过解决普通的差别方程式(ODE)进行文本驱动的操纵。虽然缺乏确切的潜在反向是这些方法的核心优势,但往往导致编辑轨迹不稳定和源源一致性差。为解决这一限制,我们提议了FlowAlign,这是一个新的无逆向流动框架,用于与有原则的轨迹控制进行一致的图像编辑。FlowAlign 引入了流动匹配损失,作为在编辑过程中促进更顺畅和更稳定的轨迹的正规化机制。值得注意的是,流程匹配损失明显平衡了与按轨迹与源图像编辑的快速和结构一致性之间的平衡。此外,FlowAlign 自然支持反向编辑,只是颠倒了ODE的轨迹,强调变换的可逆转性和一致性。广泛的实验表明,FlowAltal超越了在源保护和编辑可控性两方面的现有方法。

Article 252

Title@2025-05-29 (4): OmniArch: Building Foundation Model For Scientific Computing

Title: OmniArch: Building Foundation Model For Scientific Computing

OmniArch: Building Foundation Model for Scientific Computing

OmniArch:建筑基金会科学计算模型 2402.16014v3

Authors: Tianyu Chen, Haoyi Zhou, Ying Li, Hao Wang, Chonghan Gao, Rongye Shi, Shanghang Zhang, Jianxin Li

Foundation models have revolutionized language modeling, but whether this success can be replicated in scientific computing remains unexplored. We present OmniArch, the first prototype aimed at solving multi-scale and multi-physics scientific computing problems with physical alignment. We address all three challenges with one unified architecture. Its pre-training stage contains a Fourier Encoder-decoder that fades out the disharmony across separated dimensions and a Transformer backbone that integrates quantities through temporal dynamics, and the novel PDE-Aligner performs physics-informed fine-tuning under flexible conditions. To the best of our knowledge, we are the first to conduct unified 1D-2D-3D pre-training on PDEBench; the model not only sets new performance benchmarks for 1D, 2D, and 3D PDEs but also demonstrates exceptional adaptability to new physics via in-context and zero-shot learning approaches, which supports realistic engineering applications and foresight physics discovery.

基金会模型已经使语言建模发生了革命性的变化,而科学计算中是否复制了这一成功,至今仍未探索。我们展示了旨在解决多规模和多物理科学计算问题的第一个原型OmniArch,这是在物理一致性方面解决多规模和多物理科学计算问题的首个原型。我们用一个统一的架构应对了所有这三项挑战。其培训前阶段包含一个Fourier Eccder-decoder ,它通过不同维度的不和谐和通过时间动态将数量整合在一起的变异主干柱,而新颖的PDE-Aligner则在灵活条件下进行物理知情的微调。据我们所知,我们首先在PDEBench上进行了1D-2D-3D联合培训,它不仅为1D、2D和3D PDEs设定了新的性能基准,而且还展示了通过文字和零光学方法对新物理学的特殊适应性,支持现实的工程应用和展望物理发现。

Article 253

Title@2025-05-29 (4): Policy Filtration for RLHF to Mitigate Noise in Reward Models

Title: Policy Filtration for RLHF to Mitigate Noise in Reward Models

Politische Filtration für RLHF zur Mitigation von Lärm in Prämienmodellen

将RLHF政策归类为奖励模型中最小噪音的政策 2409.06957v4

Authors: Chuheng Zhang, Wei Shen, Li Zhao, Xuyun Zhang, Xiaolong Xu, Wanchun Dou, Jiang Biang

While direct policy optimization methods exist, pioneering LLMs are fine-tuned with reinforcement learning from human feedback (RLHF) to generate better responses under the supervision of a reward model learned from preference data. One major challenge of RLHF is the inaccuracy of the intermediate reward model, especially in tasks that require complex reasoning for the reward model to score a response. We find that the reliability of the reward model varies across responses assigned different rewards. This motivates us to filter out the samples whose rewards may be unreliable in order to improve the signal-to-noise ratio during policy learning, resulting in Policy Filtration for Proximal Policy Optimization (PF-PPO). To choose a proper policy filtering strategy, we use the coefficient of determination ($R^2$) between the rewards and actual scores on filtered samples as the metric for finding promising strategies, since it measures how well the rewards filtered by PF-PPO indicate real performance. We provide extensive experiments to validate the effectiveness of PF-PPO in code generation and math reasoning tasks. In code generation, PF-PPO achieves state-of-the-art performance among 7-billion-parameter models on HumanEval (+7.9%), MBPP (+0.7%), and LeetCode Contest (+10.0%), a more challenging benchmark that we created. In math reasoning, PF-PPO yields performance gains across different reward models and benchmarks (Ape210K and CMATH). Code is available at https://github.com/DtYXs/verl/tree/pf-ppo.

虽然存在直接策略优化方法,但领先的LLM仍通过基于人类反馈的强化学习(RLHF)进行微调,在由偏好数据学得的奖励模型监督下生成更好的回答。RLHF的主要挑战之一是中间奖励模型不够准确,尤其是在奖励模型需要复杂推理才能为回答打分的任务中。我们发现,奖励模型的可靠性随回答所获奖励的不同而变化。这促使我们过滤掉奖励可能不可靠的样本,以提高策略学习过程中的信噪比,由此得到用于近端策略优化的策略过滤方法(PF-PPO)。为选择合适的策略过滤策略,我们以过滤后样本上奖励与实际得分之间的决定系数($R^2$)作为指标来寻找有前景的策略,因为它衡量了经PF-PPO过滤的奖励在多大程度上反映真实性能。我们通过大量实验验证了PF-PPO在代码生成和数学推理任务中的有效性。在代码生成方面,PF-PPO在HumanEval(+7.9%)、MBPP(+0.7%)以及我们创建的更具挑战性的基准LeetCode Contest(+10.0%)上取得了70亿参数模型中的最先进性能。在数学推理方面,PF-PPO在不同奖励模型和基准(Ape210K和CMATH)上均带来性能提升。代码见 https://github.com/DtYXs/verl/tree/pf-ppo。
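
Illustrative sketch (not taken from the paper or its repository): the abstract above selects a filtering strategy by the coefficient of determination between rewards and actual scores on the filtered samples. The snippet below shows one way such a comparison could look. The two filter functions, the `(reward, score)` tuple layout, and the toy data are hypothetical, and $R^2$ is computed here as the squared Pearson correlation, which equals the $R^2$ of a one-variable least-squares fit; the paper's exact procedure may differ.

```python
from statistics import mean


def r_squared(rewards, scores):
    """R^2 of a one-variable least-squares fit of scores on rewards."""
    mr, ms = mean(rewards), mean(scores)
    sxx = sum((r - mr) ** 2 for r in rewards)
    syy = sum((s - ms) ** 2 for s in scores)
    sxy = sum((r - mr) * (s - ms) for r, s in zip(rewards, scores))
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate case: no variance to explain
    return (sxy * sxy) / (sxx * syy)


def keep_top(samples, k):
    """Hypothetical filter: keep the k samples with the highest reward."""
    return sorted(samples, key=lambda s: s[0], reverse=True)[:k]


def keep_extremes(samples, k):
    """Hypothetical filter: keep the k//2 lowest- and k//2 highest-reward samples.

    Assumes k <= len(samples).
    """
    ordered = sorted(samples, key=lambda s: s[0])
    half = k // 2
    return ordered[:half] + ordered[len(ordered) - half:]


if __name__ == "__main__":
    # Made-up (reward, actual_score) pairs, for illustration only.
    samples = [(0.1, 0.0), (0.3, 1.0), (0.4, 0.0), (0.6, 0.0), (0.8, 1.0), (0.9, 1.0)]
    for name, strategy in [("top", keep_top), ("extremes", keep_extremes)]:
        kept = strategy(samples, 4)
        rewards = [r for r, _ in kept]
        scores = [s for _, s in kept]
        print(name, round(r_squared(rewards, scores), 3))
```

Under this reading, a strategy with a higher $R^2$ keeps samples whose rewards track actual performance more closely.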

Article 254

Title@2025-05-29 (4): Learning to Reason under Off-Policy Guidance

Title: Learning to Reason under Off-Policy Guidance

Unter außerpolitischer Anleitung zur Vernunft lernen

根据非政策指导学习理由 2504.14945v4

Authors: Jianhao Yan, Yafu Li, Zican Hu, Zhi Wang, Ganqu Cui, Xiaoye Qu, Yu Cheng, Yue Zhang

Recent advances in large reasoning models (LRMs) demonstrate that sophisticated behaviors such as multi-step reasoning and self-reflection can emerge via reinforcement learning with verifiable rewards (RLVR). However, existing RLVR approaches are inherently “on-policy”, limiting learning to a model’s own outputs and failing to acquire reasoning abilities beyond its initial capabilities. To address this issue, we introduce LUFFY (Learning to reason Under oFF-policY guidance), a framework that augments RLVR with off-policy reasoning traces. LUFFY dynamically balances imitation and exploration by combining off-policy demonstrations with on-policy rollouts during training. Specifically, LUFFY combines the Mixed-Policy GRPO framework, which has a theoretically guaranteed convergence rate, with policy shaping via regularized importance sampling to avoid superficial and rigid imitation during mixed-policy training. Compared with previous RLVR methods, LUFFY achieves an average gain of over +6.4 across six math benchmarks and an advantage of over +6.2 points in out-of-distribution tasks. Most significantly, we show that LUFFY successfully trains weak models in scenarios where on-policy RLVR completely fails. These results provide compelling evidence that LUFFY transcends the fundamental limitations of on-policy RLVR and demonstrates the great potential of utilizing off-policy guidance in RLVR.

大型推理模型(LRMs)的近期进展表明,多步推理和自我反思等复杂行为可以通过带可验证奖励的强化学习(RLVR)涌现。然而,现有的RLVR方法本质上是“同策略”(on-policy)的,学习仅限于模型自身的输出,无法获得超出其初始能力的推理能力。为解决这一问题,我们提出LUFFY(Learning to reason Under oFF-policY guidance),这是一个用离策略推理轨迹增强RLVR的框架。LUFFY在训练中将离策略示范与同策略采样(rollout)相结合,动态平衡模仿与探索。具体而言,LUFFY将具有理论收敛速率保证的Mixed-Policy GRPO框架与基于正则化重要性采样的策略塑形相结合,以避免混合策略训练中出现表面而僵化的模仿。与以往的RLVR方法相比,LUFFY在六个数学基准上平均提升超过+6.4,在分布外任务上领先超过+6.2分。最重要的是,我们表明,在同策略RLVR完全失败的场景中,LUFFY仍能成功训练弱模型。这些结果有力地证明,LUFFY突破了同策略RLVR的根本局限,并展示了在RLVR中利用离策略指导的巨大潜力。
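
Illustrative sketch (an assumption, not the authors' implementation): the abstract above combines off-policy demonstrations with on-policy rollouts in a GRPO-style framework. One plausible reading is that rewards from both sources are normalized within a single group. The function name `group_advantages` and the `eps` stabilizer are hypothetical, and the policy-shaping and importance-sampling terms mentioned in the abstract are deliberately left out.

```python
from statistics import mean, pstdev


def group_advantages(on_policy_rewards, off_policy_rewards, eps=1e-8):
    """Group-normalized advantages over a mixed set of rollouts.

    Each advantage is (reward - group mean) / (group std + eps), with the
    group containing both on-policy and off-policy samples.
    """
    rewards = list(on_policy_rewards) + list(off_policy_rewards)
    mu = mean(rewards)
    sigma = pstdev(rewards)
    advantages = [(r - mu) / (sigma + eps) for r in rewards]
    n_on = len(on_policy_rewards)
    return advantages[:n_on], advantages[n_on:]


if __name__ == "__main__":
    # A weak model whose own rollouts all fail (reward 0): the on-policy-only
    # group has zero variance, so every advantage is 0 and nothing is learned.
    print(group_advantages([0.0, 0.0, 0.0], []))
    # Adding one successful off-policy trace restores a nonzero signal.
    print(group_advantages([0.0, 0.0, 0.0], [1.0]))
```

The second call gives a rough picture of why off-policy traces can help when on-policy RLVR alone yields no learning signal.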

Article 255

Title@2025-05-29 (4): VERINA: Benchmarking Verifiable Code Generation

Title: VERINA: Benchmarking Verifiable Code Generation

VERINA: Benchmarking der überprüfbaren Code-Generierung

VERINA:可核实代码生成基准 2505.23135v1

Authors: Zhe Ye, Zhengxu Yan, Jingxuan He, Timothe Kasriel, Kaiyu Yang, Dawn Song

Large language models (LLMs) are increasingly integrated in software development, but ensuring correctness in LLM-generated code remains challenging and often requires costly manual review. Verifiable code generation – jointly generating code, specifications, and proofs of code-specification alignment – offers a promising path to address this limitation and further unleash LLMs’ benefits in coding. Yet, there exists a significant gap in evaluation: current benchmarks often lack support for end-to-end verifiable code generation. In this paper, we introduce Verina (Verifiable Code Generation Arena), a high-quality benchmark enabling a comprehensive and modular evaluation of code, specification, and proof generation as well as their compositions. Verina consists of 189 manually curated coding tasks in Lean, with detailed problem descriptions, reference implementations, formal specifications, and extensive test suites. Our extensive evaluation of state-of-the-art LLMs reveals significant challenges in verifiable code generation, especially in proof generation, underscoring the need for improving LLM-based theorem provers in verification domains. The best model, OpenAI o4-mini, generates only 61.4% correct code, 51.0% sound and complete specifications, and 3.6% successful proofs, with one trial per task. We hope Verina will catalyze progress in verifiable code generation by providing a rigorous and comprehensive benchmark. We release our dataset on https://huggingface.co/datasets/sunblaze-ucb/verina and our evaluation code on https://github.com/sunblaze-ucb/verina.

大型语言模型(LLMs)日益融入软件开发,但确保LLM生成代码的正确性仍具挑战性,且往往需要代价高昂的人工审查。可验证代码生成 – 即联合生成代码、规约以及代码与规约一致性的证明 – 为解决这一局限、进一步释放LLM在编码方面的价值提供了一条有前景的途径。然而,评测方面存在显著空白:现有基准往往不支持端到端的可验证代码生成。本文提出Verina(Verifiable Code Generation Arena),这是一个高质量基准,可对代码、规约和证明的生成及其组合进行全面且模块化的评测。Verina包含189个人工整理的Lean编码任务,配有详细的问题描述、参考实现、形式化规约和大量测试套件。我们对最先进LLM的广泛评测揭示了可验证代码生成、尤其是证明生成方面的重大挑战,凸显了在验证领域改进基于LLM的定理证明器的必要性。表现最好的模型OpenAI o4-mini在每个任务仅尝试一次的情况下,只生成了61.4%的正确代码、51.0%的可靠且完备的规约以及3.6%的成功证明。我们希望Verina能通过提供严格而全面的基准推动可验证代码生成的进展。我们的数据集发布于 https://huggingface.co/datasets/sunblaze-ucb/verina ,评测代码发布于 https://github.com/sunblaze-ucb/verina 。

Article 256

Title@2025-05-29 (4): DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs

Title: DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs

DOPPLER: Dual-Policy-Lernen für die Gerätezuordnung in asynchronen Datenflussgraphen

DOPPLER: 异步数据流图中设备分配的双策略学习 2505.23131v1

Authors: Xinyu Yao, Daniel Bourgeois, Abhinav Jain, Yuxin Tang, Jiawen Yao, Zhimin Ding, Arlei Silva, Chris Jermaine

We study the problem of assigning operations in a dataflow graph to devices to minimize execution time in a work-conserving system, with emphasis on complex machine learning workloads. Prior learning-based methods often struggle due to three key limitations: (1) reliance on bulk-synchronous systems like TensorFlow, which under-utilize devices due to barrier synchronization; (2) lack of awareness of the scheduling mechanism of underlying systems when designing learning-based methods; and (3) exclusive dependence on reinforcement learning, ignoring the structure of effective heuristics designed by experts. In this paper, we propose Doppler, a three-stage framework for training dual-policy networks consisting of (1) a $\mathsf{SEL}$ policy for selecting operations and (2) a $\mathsf{PLC}$ policy for placing chosen operations on devices. Our experiments show that Doppler outperforms all baseline methods across tasks by reducing system execution time and additionally demonstrates sampling efficiency by reducing per-episode training time.

我们研究在工作保持(work-conserving)系统中将数据流图中的操作分配到设备上以最小化执行时间的问题,重点是复杂的机器学习工作负载。先前基于学习的方法往往受制于三个关键局限:(1) 依赖TensorFlow这类整体同步(bulk-synchronous)系统,这些系统由于屏障同步而未能充分利用设备;(2) 在设计基于学习的方法时对底层系统的调度机制缺乏认识;(3) 完全依赖强化学习,忽视专家设计的有效启发式方法的结构。本文提出Doppler,一个用于训练双策略网络的三阶段框架,包括:(1) 用于选择操作的 $\mathsf{SEL}$ 策略;(2) 用于将选定操作放置到设备上的 $\mathsf{PLC}$ 策略。我们的实验表明,Doppler通过减少系统执行时间在各项任务上优于所有基线方法,并通过减少每个回合的训练时间进一步展示了采样效率。

Article 257

Title@2025-05-29 (4): Developing Cryptocurrency Trading Strategy Based on Autoencoder-CNN-GANs Algorithms

Title: Developing Cryptocurrency Trading Strategy Based on Autoencoder-CNN-GANs Algorithms

Entwicklung einer Cryptowährungs-Handelsstrategie auf der Grundlage von Autoencoder-CNN-GAN-Algorithmen

制定基于自动编码器-CNN-GANs算法的加密货币交易战略 2412.18202v5

Authors: Zhuohuan Hu, Richard Yu, Zizhou Zhang, Haoran Zheng, Qianying Liu, Yining Zhou

This paper leverages machine learning algorithms to forecast and analyze financial time series. The process begins with a denoising autoencoder that filters random noise fluctuations out of the main contract price data. Then, one-dimensional convolution reduces the dimensionality of the filtered data and extracts key information. The filtered and dimensionality-reduced price data is fed into a GAN, and its output serves as the input to a fully connected network. Through cross-validation, a model is trained to capture features that precede large price fluctuations. The model predicts the likelihood and direction of significant price changes in real-time price sequences, placing trades at moments of high prediction accuracy. Empirical results demonstrate that using autoencoders and convolution to filter and denoise financial data, combined with GANs, achieves a certain level of predictive performance, validating the capabilities of machine learning algorithms to discover underlying patterns in financial sequences. Keywords: CNN; GANs; Cryptocurrency; Prediction.

本文利用机器学习算法来预测和分析金融时间序列。这一过程始于用去噪自动编码器从主力合约价格数据中过滤随机噪声波动。然后,一维卷积降低过滤后数据的维度并提取关键信息。经过滤和降维的价格数据被输入GANs网络,其输出作为全连接网络的输入。通过交叉验证,训练一个模型来捕捉价格大幅波动之前的特征。该模型预测实时价格序列中重大价格变化的可能性和方向,并在预测准确度高的时刻下单交易。实证结果显示,使用自动编码器和卷积对金融数据进行过滤和去噪,并与GANs相结合,可达到一定水平的预测性能,验证了机器学习算法在金融序列中发现潜在模式的能力。关键词:CNN;GANs;加密货币;预测。

Article 258

Title@2025-05-29 (4): Surrogate-Assisted Evolutionary Reinforcement Learning Based on Autoencoder and Hyperbolic Neural Network

Title: Surrogate-Assisted Evolutionary Reinforcement Learning Based on Autoencoder and Hyperbolic Neural Network

Surrogate-Assisted Evolutionary Verstärkung Lernen auf der Grundlage von Autoencoder und Hyperbolic Neural Network

基于自动编码器和双曲神经网络的代用辅助辅助进化辅助进化强化学习 2505.19423v2

Authors: Bingdong Li, Mei Jiang, Hong Qian, Ke Tang, Aimin Zhou, Peng Yang

Evolutionary Reinforcement Learning (ERL), which trains Reinforcement Learning (RL) policies with Evolutionary Algorithms (EAs), has demonstrated enhanced exploration capabilities and greater robustness than traditional policy gradient methods. However, ERL suffers from high computational costs and low search efficiency, as EAs require evaluating numerous candidate policies with expensive simulations, many of which are ineffective and do not contribute meaningfully to the training. One intuitive way to reduce the ineffective evaluations is to adopt surrogates. Unfortunately, existing ERL policies are often modeled as deep neural networks (DNNs) and thus naturally represented as high-dimensional vectors containing millions of weights, which makes building effective surrogates for ERL policies extremely challenging. This paper proposes a novel surrogate-assisted ERL that integrates Autoencoders (AE) and Hyperbolic Neural Networks (HNN). Specifically, AE compresses high-dimensional policies into low-dimensional representations while extracting key features as the inputs for the surrogate. HNN, functioning as a classification-based surrogate model, can learn complex nonlinear relationships from sampled data and enable more accurate pre-selection of the sampled policies without real evaluations. The experiments on 10 Atari and 4 MuJoCo games have verified that the proposed method significantly outperforms previous approaches. The search trajectories guided by AE and HNN are also visually demonstrated to be more effective, in terms of both exploration and convergence. This paper not only presents the first learnable policy embedding and surrogate-modeling modules for high-dimensional ERL policies, but also empirically reveals when and why they can be successful.

进化强化学习(ERL)用进化算法(EAs)训练强化学习(RL)策略,与传统策略梯度方法相比,展现出更强的探索能力和更高的鲁棒性。然而,ERL计算成本高、搜索效率低,因为EAs需要通过昂贵的仿真评估大量候选策略,其中许多是无效的,对训练没有实质贡献。减少无效评估的一种直观方法是采用代理模型。遗憾的是,现有的ERL策略通常建模为深度神经网络(DNNs),因而自然表示为包含数百万权重的高维向量,这使得为ERL策略构建有效的代理模型极具挑战性。本文提出一种新的代理辅助ERL方法,它结合了自编码器(AE)和双曲神经网络(HNN)。具体而言,AE将高维策略压缩为低维表示,同时提取关键特征作为代理模型的输入。HNN作为基于分类的代理模型,能够从采样数据中学习复杂的非线性关系,并在无需真实评估的情况下更准确地预选采样策略。在10个Atari游戏和4个MuJoCo游戏上的实验验证了所提方法显著优于以往方法。可视化结果还表明,由AE和HNN引导的搜索轨迹在探索和收敛两方面都更有效。本文不仅提出了首个面向高维ERL策略的可学习策略嵌入与代理建模模块,还通过实验揭示了它们何时以及为何能够成功。

Article 259

Title@2025-05-29 (4): Learning to Incentivize in Repeated Principal-Agent Problems with Adversarial Agent Arrivals

Title: Learning to Incentivize in Repeated Principal-Agent Problems with Adversarial Agent Arrivals

Lernen, in wiederholten Hauptagenten-Problemen mit Adversarial Agent Ankunft zu fördern

学习鼓励与抵达时的对冲代理人员重复发生主要问题 2505.23124v1

Authors: Junyan Liu, Arnab Maiti, Artin Tajdini, Kevin Jamieson, Lillian J. Ratliff

We initiate the study of a repeated principal-agent problem over a finite horizon $T$, where a principal sequentially interacts with $K\geq 2$ types of agents arriving in an adversarial order. At each round, the principal strategically chooses one of the $N$ arms to incentivize for an arriving agent of unknown type. The agent then chooses an arm based on its own utility and the provided incentive, and the principal receives a corresponding reward. The objective is to minimize regret against the best incentive in hindsight. Without prior knowledge of agent behavior, we show that the problem becomes intractable, leading to linear regret. We analyze two key settings where sublinear regret is achievable. In the first setting, the principal knows the arm each agent type would select greedily for any given incentive. Under this setting, we propose an algorithm that achieves a regret bound of $O(\min\{\sqrt{KT\log N},K\sqrt{T}\})$ and provide a matching lower bound up to a $\log K$ factor. In the second setting, an agent’s response varies smoothly with the incentive and is governed by a Lipschitz constant $L\geq 1$. Under this setting, we show that there is an algorithm with a regret bound of $\tilde{O}((LN)^{1/3}T^{2/3})$ and establish a matching lower bound up to logarithmic factors. Finally, we extend our algorithmic results for both settings by allowing the principal to incentivize multiple arms simultaneously in each round.

我们首次研究有限时间范围 $T$ 上的重复委托-代理问题,其中委托人依次与按对抗性顺序到达的 $K\geq 2$ 种类型的代理人交互。在每一轮中,委托人策略性地从 $N$ 个臂中选择一个,对到达的未知类型代理人给予激励。随后代理人根据自身效用和所提供的激励选择一个臂,委托人获得相应的奖励。目标是最小化相对于事后最优激励的遗憾。我们证明,在对代理人行为没有先验知识的情况下,该问题变得难以处理,会导致线性遗憾。我们分析了两种可以实现次线性遗憾的关键设定。在第一种设定中,委托人知道每种代理人类型在任意给定激励下会贪心选择的臂。在该设定下,我们提出一种遗憾界为 $O(\min\{\sqrt{KT\log N},K\sqrt{T}\})$ 的算法,并给出相差至多 $\log K$ 因子的匹配下界。在第二种设定中,代理人的响应随激励平滑变化,并由Lipschitz常数 $L\geq 1$ 控制。在该设定下,我们证明存在遗憾界为 $\tilde{O}((LN)^{1/3}T^{2/3})$ 的算法,并建立相差至多对数因子的匹配下界。最后,我们将两种设定下的算法结果推广到委托人每轮可同时激励多个臂的情形。

Article 260

Title@2025-05-29 (4): BroadGen: A Framework for Generating Effective and Efficient Advertiser Broad Match Keyphrase Recommendations

Title: BroadGen: A Framework for Generating Effective and Efficient Advertiser Broad Match Keyphrase Recommendations

BroadGen: Ein Framework zur Generierung effektiver und effizienter Advertiser Broad Match Keyphrase-Empfehlungen

BroadGen:一个生成有效且高效的广告主广泛匹配关键词组推荐的框架 2505.19164v2

Authors: Ashirbad Mishra, Jinyu Zhao, Soumik Dey, Hansi Wu, Binbin Li, Kamesh Madduri

In the domain of sponsored search advertising, the focus of keyphrase recommendation has largely been on exact match types, which pose issues such as high management expenses, limited targeting scope, and evolving search query patterns. Alternatives like broad match types can alleviate certain drawbacks of exact matches but present challenges like poor targeting accuracy and minimal supervisory signals owing to limited advertiser usage. This research defines the criteria for an ideal broad match, emphasizing both efficiency and effectiveness, ensuring that a significant portion of matched queries is relevant. We propose BroadGen, an innovative framework that recommends efficient and effective broad match keyphrases by utilizing historical search query data. Additionally, we demonstrate that BroadGen, through token correspondence modeling, maintains better query stability over time. BroadGen’s capabilities allow it to serve millions of eBay sellers with over 2.3 billion items every day.

在赞助搜索广告领域,关键词组推荐的重点主要放在精确匹配类型上,这带来了管理费用高、定向范围有限和搜索查询模式不断变化等问题。广泛匹配(Broad match)类型这样的替代办法可以缓解精确匹配的某些缺点,但也带来了定向准确性差、因广告主使用有限而监督信号稀少等挑战。这项研究界定了理想广泛匹配的标准,同时强调效率和有效性,确保匹配到的查询中有相当大的比例是相关的。我们提出BroadGen,这是一个创新框架,通过利用历史搜索查询数据来推荐高效且有效的广泛匹配关键词组。此外,我们证明BroadGen通过词元对应建模,能够随时间保持更好的查询稳定性。BroadGen的能力使其每天可以为eBay上拥有超过23亿件商品的数百万卖家提供服务。

Article 261

Title@2025-05-29 (4): CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark

Title: CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark

CASS: Nvidia zu AMD Transpilation mit Daten, Modellen und Benchmark

CASS: Nvidia 到AMD 传输数据、模型和基准 2505.16968v3

Authors: Ahmed Heakl, Sarim Hashmi, Gustavo Bertolo Stahl, Seung Hun Eddie Han, Salman Khan, Abdulrahman Mahmoud

We introduce CASS, the first large-scale dataset and model suite for cross-architecture GPU code transpilation, targeting both source-level (CUDA <-> HIP) and assembly-level (Nvidia SASS <-> AMD RDNA3) translation. The dataset comprises 70k verified code pairs across host and device, addressing a critical gap in low-level GPU code portability. Leveraging this resource, we train the CASS family of domain-specific language models, achieving 95% source translation accuracy and 37.5% assembly translation accuracy, substantially outperforming commercial baselines such as GPT-4o, Claude, and Hipify. Our generated code matches native performance in over 85% of test cases, preserving runtime and memory behavior. To support rigorous evaluation, we introduce CASS-Bench, a curated benchmark spanning 16 GPU domains with ground-truth execution. All data, models, and evaluation tools are released as open source to foster progress in GPU compiler tooling, binary compatibility, and LLM-guided hardware translation.

我们引入了CASS,这是首个用于跨架构GPU代码转译的大规模数据集和模型套件,同时面向源码级(CUDA <-> HIP)和汇编级(Nvidia SASS <-> AMD RDNA3)翻译。该数据集包含7万个经过验证的代码对,覆盖主机端和设备端,填补了低层GPU代码可移植性方面的关键空白。利用这一资源,我们训练了CASS系列领域专用语言模型,实现了95%的源码翻译准确率和37.5%的汇编翻译准确率,大幅超过GPT-4o、Claude和Hipify等商业基线。我们生成的代码在85%以上的测试用例中达到原生性能,并保持了运行时间和内存行为。为支持严格评估,我们引入了CASS-Bench,这是一个覆盖16个GPU领域、带有真实执行结果的精选基准。所有数据、模型和评估工具均以开源形式发布,以促进GPU编译器工具、二进制兼容性和LLM引导的硬件翻译方面的进展。

Article 262

Title@2025-05-29 (4): To Judge or not to Judge: Using LLM Judgements for Advertiser Keyphrase Relevance at eBay

Title: To Judge or not to Judge: Using LLM Judgements for Advertiser Keyphrase Relevance at eBay

Zu richten oder nicht zu richten: LLM-Richtungen für Werbetreibende Keyphrase Relevanz bei eBay verwenden

法官或非法官:在eBay使用LLM判决来作广告 2505.04209v2

Authors: Soumik Dey, Hansi Wu, Binbin Li

E-commerce sellers are recommended keyphrases based on their inventory, on which they advertise to increase buyer engagement (clicks/sales). The relevance of advertiser keyphrases plays an important role in preventing the inundation of search systems with numerous irrelevant items that compete for attention in auctions, in addition to maintaining a healthy seller perception. In this work, we describe the shortcomings of training advertiser keyphrase relevance filter models on click/sales/search relevance signals and the importance of aligning with human judgment, as sellers have the power to adopt or reject said keyphrase recommendations. In this study, we frame advertiser keyphrase relevance as a complex interaction between three dynamical systems – seller judgment, which influences seller adoption of our product; Advertising, which provides the keyphrases to bid on; and Search, which holds the auctions for the same keyphrases. This study discusses the practicalities of using human judgment via a case study at eBay Advertising and demonstrates that using LLM-as-a-judge en masse as a scalable proxy for seller judgment to train our relevance models achieves better harmony across the three systems – provided that they are bound by a meticulous evaluation framework grounded in business metrics.

电子商务卖家会根据其库存获得关键词组推荐,并据此投放广告以提高买家参与度(点击/销售)。广告主关键词组的相关性十分重要:它既能防止搜索系统被大量在竞价中争夺注意力的无关商品淹没,又有助于维持卖家的良好观感。在这项工作中,我们描述了用点击/销售/搜索相关性信号训练广告主关键词组相关性过滤模型的缺点,以及与人类判断保持一致的重要性,因为卖家有权采纳或拒绝这些关键词组推荐。在本研究中,我们将广告主关键词组相关性视为三个动态系统之间的复杂交互 – – 卖家判断,它影响卖家对我们产品的采纳;广告(Advertising),它提供可供出价的关键词组;以及搜索(Search),它为同样的关键词组举行竞价。本研究通过eBay Advertising的案例研究讨论了使用人类判断的实际问题,并表明大规模使用LLM-as-a-judge作为卖家判断的可扩展代理来训练我们的相关性模型,能够在三个系统之间实现更好的协调 – – 前提是它们受到以业务指标为基础的严格评估框架的约束。

Article 263

Title@2025-05-29 (4): Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking

Title: Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking

Decom-Renorm-Merge: Modellzusammenführung im richtigen Raum verbessert Multitasking

Decom-Renorm-Merge:在正确空间中进行模型合并可改进多任务能力 2505.23117v1

Authors: Yuatyong Chaichana, Thanapat Trachu, Peerat Limkonchotiwat, Konpat Preechakul, Tirasan Khandhawit, Ekapol Chuangsuwanich

In the era of large-scale training, model merging has evolved into a tool for creating multitasking models efficiently. It enables the knowledge of models to be fused, without the need for heavy computation as required in traditional multitask learning. Existing merging methods often assume that entries at identical positions in weight matrices serve the same function, enabling straightforward entry-wise comparison and merging. However, this assumption overlooks the complexity of finetuned neural networks, where neurons may develop distinct feature compositions, making direct entry-wise merging problematic. We present Decom-Renorm-Merge (DRM), a simple yet effective approach that leverages Singular Value Decomposition to decompose and coordinate weight matrices into an aligned joint space, where entry-wise merging becomes possible. We showcase the effectiveness of DRM across various settings, ranging from smaller encoder-based models such as ViT and DeBERTa, to encoder-decoder-based models such as T5, to larger decoder-based models such as Llama3.1-8B. Our experimental results show that DRM outperforms several state-of-the-art merging techniques across full finetuning and low-rank adaptation settings. Moreover, our analysis reveals renormalization as the crucial component for creating a robust and even joint space for merging, significantly contributing to the method’s performance.

在大规模培训的时代,模型合并已经演变成一个高效创建多任务模型的工具,使模型知识得以融合,而无需像传统多任务学习所要求的那样进行大量计算。现有的合并方法往往假设重力矩阵中相同位置的条目功能相同,能够直接进行入门比较和合并。然而,这一假设忽略了微调神经网络的复杂性,其中神经元可能形成独特的特征构成,使直接进入的合并成为问题。我们提出了脱子-再调节-Merge(DRM),这是一种简单而有效的方法,利用Singulal值分解法将重量矩阵转换和协调成一个统一的联合空间,从而有可能进行入轨合并。我们展示了DRM在各种环境中的效能,这些环境包括基于小编码器的ViT和DeBERTA,基于T5的编码的编码-decoder-decoder网络,以及Llama3.1-8B等较大的分解器。我们的实验结果表明,DRM优于若干个状态的合并技术,在全面调整和低级的空间调整后,将展示了我们的关键的调整和低级的调整方法。

Article 264

Title@2025-05-29 (4): Learning to Reason from Feedback at Test-Time

Title: Learning to Reason from Feedback at Test-Time

Von Feedback bei Test-Time zur Vernunft lernen

从测试时的反馈中学习到理由 2502.15771v2

Authors: Yanyang Li, Michael Lyu, Liwei Wang

Solving complex tasks in a single attempt is challenging for large language models (LLMs). Iterative interaction with the environment and feedback is often required to achieve success, making effective feedback utilization a critical topic. Existing approaches either struggle with length generalization or rely on naive retries without leveraging prior information. In this paper, we introduce FTTT, a novel paradigm that formulates feedback utilization as an optimization problem at test time. Additionally, we propose a learnable test-time optimizer, OpTune, to effectively exploit feedback. Experiments on two LLMs across four reasoning datasets demonstrate that FTTT and OpTune achieve superior scalability and performance.

在一次尝试中解决复杂任务对大型语言模型(LLMs)来说具有挑战性,要取得成功,往往需要与环境的迭代互动和反馈,使有效的反馈利用成为一个关键议题。现有办法要么与时间的概括斗争,要么依靠天真重整而不利用先前的信息。在本文中,我们引入了FTTT,这是一个创新的范例,将反馈利用作为测试时的一个优化问题。此外,我们提议了一个可学习的测试-时间优化器OpTune,以有效利用反馈。在四个推理数据集中对两个LMs的实验表明,FTTT和OpTune实现了更高的可扩展性和性。

Article 265

Title@2025-05-29 (4): CrossLinear: Plug-and-Play Cross-Correlation Embedding for Time Series Forecasting with Exogenous Variables

Title: CrossLinear: Plug-and-Play Cross-Correlation Embedding for Time Series Forecasting with Exogenous Variables

CrossLinear: Plug-and-Play-Cross-Korrelation für Zeitreihenvorhersage mit exogenen Variablen einbetten

CrossLinear: 用于带外生变量时间序列预测的即插即用互相关嵌入 2505.23116v1

Authors: Pengfei Zhou, Yunlong Liu, Junli Liang, Qi Song, Xiangyang Li

Time series forecasting with exogenous variables is a critical emerging paradigm that presents unique challenges in modeling dependencies between variables. Traditional models often struggle to differentiate between endogenous and exogenous variables, leading to inefficiencies and overfitting. In this paper, we introduce CrossLinear, a novel Linear-based forecasting model that addresses these challenges by incorporating a plug-and-play cross-correlation embedding module. This lightweight module captures the dependencies between variables with minimal computational cost and seamlessly integrates into existing neural networks. Specifically, it captures time-invariant and direct variable dependencies while disregarding time-varying or indirect dependencies, thereby mitigating the risk of overfitting in dependency modeling and contributing to consistent performance improvements. Furthermore, CrossLinear employs patch-wise processing and a global linear head to effectively capture both short-term and long-term temporal dependencies, further improving its forecasting precision. Extensive experiments on 12 real-world datasets demonstrate that CrossLinear achieves superior performance in both short-term and long-term forecasting tasks. The ablation study underscores the effectiveness of the cross-correlation embedding module. Additionally, the generalizability of this module makes it a valuable plug-in for various forecasting tasks across different domains. Codes are available at https://github.com/mumiao2000/CrossLinear.

带外生变量的时间序列预测是一个重要的新兴范式,它给变量间依赖关系的建模带来了独特挑战。传统模型往往难以区分内生变量和外生变量,导致效率低下和过拟合。本文提出CrossLinear,一种新的基于线性的预测模型,通过引入即插即用的互相关嵌入模块来应对这些挑战。这一轻量级模块以极低的计算成本捕捉变量间的依赖关系,并可无缝集成到现有神经网络中。具体而言,它捕捉时不变的、直接的变量依赖,而忽略时变的或间接的依赖,从而降低依赖建模中的过拟合风险,并带来稳定的性能提升。此外,CrossLinear采用分块(patch-wise)处理和全局线性头,有效捕捉短期和长期时间依赖,进一步提高预测精度。在12个真实世界数据集上的大量实验表明,CrossLinear在短期和长期预测任务中均取得了更优的性能。消融实验凸显了互相关嵌入模块的有效性。此外,该模块的通用性使其成为可用于不同领域各类预测任务的有价值插件。代码见 https://github.com/mumiao2000/CrossLinear。

Article 266

Title@2025-05-29 (4): Instance-dependent Convergence Theory for Diffusion Models

Title: Instance-dependent Convergence Theory for Diffusion Models

Instanz-abhängige Konvergenztheorie für Diffusionsmodelle

扩散模型集成模型理论 2410.13738v2

Authors: Yuchen Jiao, Gen Li

Score-based diffusion models have demonstrated outstanding empirical performance in machine learning and artificial intelligence, particularly in generating high-quality new samples from complex probability distributions. Improving the theoretical understanding of diffusion models, with a particular focus on the convergence analysis, has attracted significant attention. In this work, we develop a convergence rate that is adaptive to the smoothness of different target distributions, referred to as an instance-dependent bound. Specifically, we establish an iteration complexity of $\min\{d,d^{2/3}L^{1/3},d^{1/3}L\}\varepsilon^{-2/3}$ (up to logarithmic factors), where $d$ denotes the data dimension, and $\varepsilon$ quantifies the output accuracy in terms of total variation (TV) distance. In addition, $L$ represents a relaxed Lipschitz constant, which, in the case of Gaussian mixture models, scales only logarithmically with the number of components, the dimension and iteration number, demonstrating broad applicability.

基于分数的扩散模型在机器学习和人工智能方面,特别是在从复杂概率分布中生成高质量的新样本方面,表现出杰出的经验性能。改进对扩散模型的理论理解,特别是收敛性分析,已经引起极大关注。在这项工作中,我们给出了一个能够适应不同目标分布光滑性的收敛速率,称为依赖实例的界。具体地说,我们确立了 $\min\{d,d^{2/3}L^{1/3},d^{1/3}L\}\varepsilon^{-2/3}$(至多相差对数因子)的迭代复杂度,其中 $d$ 表示数据维度,而 $\varepsilon$ 以总变差(TV)距离量化输出精度。此外,$L$ 代表一个松弛的Lipschitz常数,在高斯混合模型的情形下,它仅随分量个数、维度和迭代次数呈对数增长,显示出广泛的适用性。
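
Illustrative sketch (not from the paper): the iteration complexity in the abstract above is a minimum of three terms, so which term is active depends on how $L$ compares with $d$. Comparing the terms pairwise gives $d^{1/3}L$ for $L \le \sqrt{d}$, $d^{2/3}L^{1/3}$ for $\sqrt{d} \le L \le d$, and $d$ for $L \ge d$. The helper below evaluates the bound while ignoring constants and logarithmic factors; the function name and return layout are hypothetical.

```python
def iteration_complexity(d, L, eps):
    """Evaluate min{d, d^(2/3) L^(1/3), d^(1/3) L} * eps^(-2/3).

    Constants and logarithmic factors are ignored. Returns the label of the
    active term and the resulting value.
    """
    candidates = {
        "d": d,
        "d^(2/3) L^(1/3)": d ** (2 / 3) * L ** (1 / 3),
        "d^(1/3) L": d ** (1 / 3) * L,
    }
    regime = min(candidates, key=candidates.get)
    return regime, candidates[regime] * eps ** (-2 / 3)


if __name__ == "__main__":
    # With d = 64: sqrt(d) = 8, so L = 1, 27, 1000 fall in the three regimes.
    for L in (1, 27, 1000):
        print(L, iteration_complexity(64, L, 0.1))
```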

Article 267

Title@2025-05-29 (4): FutureGen: LLM-RAG Approach to Generate the Future Work of Scientific Article

Title: FutureGen: LLM-RAG Approach to Generate the Future Work of Scientific Article

FutureGen: LLM-RAG Ansatz zur Generierung der zukünftigen Arbeit des wissenschaftlichen Artikels

FutureGen:LLM-RAG 产生科学条款未来工作的方法 2503.16561v2

Authors: Ibrahim Al Azher, Miftahul Jannat Mokarrama, Zhishuai Guo, Sagnik Ray Choudhury, Hamed Alhoori

The future work section of a scientific article outlines potential research directions by identifying gaps and limitations of a current study. This section serves as a valuable resource for early-career researchers seeking unexplored areas and experienced researchers looking for new projects or collaborations. In this study, we generate future work suggestions from key sections of a scientific article alongside related papers and analyze how the trends have evolved. We experimented with various Large Language Models (LLMs) and integrated Retrieval-Augmented Generation (RAG) to enhance the generation process. We incorporate an LLM feedback mechanism to improve the quality of the generated content and propose an LLM-as-a-judge approach for evaluation. Our results demonstrate that the RAG-based approach with LLM feedback outperforms other methods evaluated through qualitative and quantitative metrics. Moreover, we conduct a human evaluation to assess the LLM as an extractor and judge. The code and dataset for this project are here, code: HuggingFace

科学文章的未来工作章节通过查明当前研究的差距和局限性,概述了潜在的研究方向,概述了未来研究方向。本节是早期职业研究人员寻找未探索领域和有经验的研究人员寻找新项目或协作的宝贵资源。在本研究报告中,我们从科学文章的关键部分提出未来工作建议,并结合相关论文分析趋势如何演变。我们试验了各种大语言模型和综合检索-启动一代(RAG),以加强生成过程。我们采用了LLLM反馈机制,以提高生成内容的质量,并提出LLM-as-a-judge-评价方法。我们的成果表明,以LLM反馈为基础的RAG方法超越了通过定性和定量指标评估的其他方法。此外,我们进行了人类评估,以评估LLM作为提取器和评判器。这个项目的代码和数据集在这里,代码是:HuggingFace:HuggingFace。

Article 268

Title@2025-05-29 (4): Neural Interpretable PDEs: Harmonizing Fourier Insights with Attention for Scalable and Interpretable Physics Discovery

Title: Neural Interpretable PDEs: Harmonizing Fourier Insights with Attention for Scalable and Interpretable Physics Discovery

Neural Interpretable PDEs: Harmonisierung Fourier Insights mit Aufmerksamkeit für skalierbare und Interpretierbare Physik Discovery

神经可解释的PDEs:协调Fourier Insights,注意可缩放和可解释的物理发现 2505.23106v1

Authors: Ning Liu, Yue Yu

Attention mechanisms have emerged as transformative tools in core AI domains such as natural language processing and computer vision. Yet, their largely untapped potential for modeling intricate physical systems presents a compelling frontier. Learning such systems often entails discovering operators that map between functional spaces using limited instances of function pairs – a task commonly framed as a severely ill-posed inverse PDE problem. In this work, we introduce Neural Interpretable PDEs (NIPS), a novel neural operator architecture that builds upon and enhances Nonlocal Attention Operators (NAO) in both predictive accuracy and computational efficiency. NIPS employs a linear attention mechanism to enable scalable learning and integrates a learnable kernel network that acts as a channel-independent convolution in Fourier space. As a consequence, NIPS eliminates the need to explicitly compute and store large pairwise interactions, effectively amortizing the cost of handling spatial interactions into the Fourier transform. Empirical evaluations demonstrate that NIPS consistently surpasses NAO and other baselines across diverse benchmarks, heralding a substantial leap in scalable, interpretable, and efficient physics learning. Our code and data accompanying this paper are available at https://github.com/fishmoon1234/Nonlocal-Attention-Operator.

在自然语言处理和计算机视觉等核心AI领域,注意力机制已成为变革性工具。然而,它们在复杂物理系统建模方面基本尚未开发的潜力是一个引人注目的前沿。学习这类系统往往需要利用有限的函数对样本来发现函数空间之间的映射算子 – – 这项任务通常被表述为严重不适定的PDE反问题。在这项工作中,我们提出神经可解释PDE(NIPS),这是一种新的神经算子架构,它建立在非局部注意力算子(NAO)之上,并在预测准确性和计算效率两方面加以改进。NIPS采用线性注意力机制以实现可扩展的学习,并集成了一个可学习的核网络,其作用相当于Fourier空间中与通道无关的卷积。因此,NIPS无需显式计算和存储大规模的成对相互作用,而是有效地将处理空间相互作用的成本摊入Fourier变换之中。实证评估表明,NIPS在多种基准上始终优于NAO和其他基线,标志着可扩展、可解释且高效的物理学习向前迈出了一大步。本文的代码和数据见 https://github.com/fishmoon1234/Nonlocal-Attention-Operator。
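
Illustrative sketch (generic linear attention, not the NIPS architecture itself): the abstract above says a linear attention mechanism avoids explicitly computing and storing large pairwise interactions. For softmax-free attention this follows from associativity, $(QK^\top)V = Q(K^\top V)$: the right-hand side only forms a $d \times d$ matrix instead of an $n \times n$ one. The helper names are hypothetical, and the Fourier-space kernel network from the paper is not modeled.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def attention_quadratic(q, k, v):
    """(Q K^T) V: materializes the n x n matrix of pairwise interactions."""
    scores = matmul(q, transpose(k))  # n x n
    return matmul(scores, v)


def attention_linear(q, k, v):
    """Q (K^T V): only a d x d intermediate, so cost grows linearly in n."""
    kv = matmul(transpose(k), v)  # d x d
    return matmul(q, kv)


if __name__ == "__main__":
    # n = 3 tokens, d = 2 features; integer entries keep the comparison exact.
    q = [[1, 2], [3, 4], [5, 6]]
    k = [[1, 0], [0, 1], [1, 1]]
    v = [[1, 2], [3, 4], [5, 6]]
    print(attention_quadratic(q, k, v) == attention_linear(q, k, v))
```

Both orderings give the same result; only the size of the intermediate matrix changes, which is where the savings come from when $n$ is much larger than $d$.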

Article 269

Title@2025-05-29 (4): LUMION: Fast Fault Recovery for ML Jobs Using Programmable Optical Fabrics

Title: LUMION: Fast Fault Recovery for ML Jobs Using Programmable Optical Fabrics

LUMION: Schnelle Fehlerwiederherstellung für ML-Jobs mit programmierbaren optischen Stoffen

LUMION: 使用可编程光学制造器快速回收 ML 工作 2505.23105v1

Authors: Abhishek Vijaya Kumar, Eric Ding, Arjun Devraj, Darius Bunandar, Rachee Singh

When accelerators fail in modern ML datacenters, operators migrate the affected ML training or inference jobs to entirely new racks. This approach, while preserving network performance, is highly inefficient, requiring datacenters to reserve full racks of idle accelerators for fault tolerance. In this paper, we address this resource inefficiency by introducing LUMION, a novel reconfigurable optical fabric for connecting accelerators within a datacenter rack. Instead of migrating entire ML jobs, LUMION dynamically integrates spare accelerators into ongoing workloads as failures occur, thereby maintaining consistent performance without costly migrations. We show the benefits of LUMION by building an end-to-end hardware prototype. Our experiments fine-tune Llama 3.2 and show that LUMION swaps a failed GPU with a healthy one and restarts the ML job within about 1 second of the failure. LUMION achieves higher inter-GPU bandwidth compared to traditional electrical racks after replacing failed accelerators with spare ones, leading to nearly 2x improvement in fine-tuning throughput.

当现代 ML 数据中心的加速器发生故障时,运维人员会将受影响的 ML 训练或推理作业迁移到全新的机架上。这种方法虽然保持了网络性能,但效率极低,要求数据中心为容错保留整机架的闲置加速器。在本文中,我们通过引入 LUMION 来解决这一资源低效问题。LUMION 是一种新型的可重构光互连结构,用于连接数据中心机架内的加速器。LUMION 不迁移整个 ML 作业,而是在故障发生时将备用加速器动态接入正在运行的工作负载,从而在避免高昂迁移成本的同时保持一致的性能。我们通过构建端到端硬件原型展示了 LUMION 的优势。我们的实验对 Llama 3.2 进行微调,结果表明 LUMION 能用健康的 GPU 替换故障 GPU,并在故障发生后约 1 秒内重启 ML 作业。在用备用加速器替换故障加速器之后,LUMION 的 GPU 间带宽高于传统电互连机架,使微调吞吐量提升近 2 倍。

Article 270

Title@2025-05-29 (4): Approximate Thompson Sampling for Learning Linear Quadratic Regulators with $O(\sqrt{T})$ Regret

Title: Approximate Thompson Sampling for Learning Linear Quadratic Regulators with $O(\sqrt{T})$ Regret

Ungefähre Thompson-Probenahme für das Lernen linearer quadratischer Regulatoren mit $O(\sqrt{T})$ Bedauern

用于学习线性二次调节器的近似 Thompson 采样,遗憾为 $O(\sqrt{T})$ 2405.19380v2

Authors: Yeoneung Kim, Gihun Kim, Jiwhan Park, Insoon Yang

We propose a novel Thompson sampling algorithm that learns linear quadratic regulators (LQR) with a Bayesian regret bound of $O(\sqrt{T})$. Our method leverages Langevin dynamics with a carefully designed preconditioner and incorporates a simple excitation mechanism. We show that the excitation signal drives the minimum eigenvalue of the preconditioner to grow over time, thereby accelerating the approximate posterior sampling process. Furthermore, we establish nontrivial concentration properties of the approximate posteriors generated by our algorithm. These properties enable us to bound the moments of the system state and attain an $O(\sqrt{T})$ regret bound without relying on the restrictive assumptions that are often used in the literature.

我们提出一种新的 Thompson 采样算法,用于学习线性二次调节器(LQR),其贝叶斯遗憾界为 $O(\sqrt{T})$。我们的方法利用带有精心设计的预条件子的 Langevin 动力学,并引入一个简单的激励机制。我们证明,激励信号使预条件子的最小特征值随时间增长,从而加速近似后验采样过程。此外,我们建立了算法所生成的近似后验的非平凡集中性质。这些性质使我们能够界定系统状态的各阶矩,并在不依赖文献中常用的限制性假设的情况下获得 $O(\sqrt{T})$ 的遗憾界。

Article 271

Title@2025-05-29 (4): Weight Spectra Induced Efficient Model Adaptation

Title: Weight Spectra Induced Efficient Model Adaptation

Gewicht Spectra Induzierte effiziente Modellanpassung

权重谱引导的高效模型适配 2505.23099v1

Authors: Chongjie Si, Xuankun Yang, Muqing Liu, Yadao Wang, Xiaokang Yang, Wenbo Su, Bo Zheng, Wei Shen

Large-scale foundation models have demonstrated remarkable versatility across a wide range of downstream tasks. However, fully fine-tuning these models incurs prohibitive computational costs, motivating the development of Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA, which introduces low-rank updates to pre-trained weights. Despite their empirical success, the underlying mechanisms by which PEFT modifies model parameters remain underexplored. In this work, we present a systematic investigation into the structural changes of weight matrices during fully fine-tuning. Through singular value decomposition (SVD), we reveal that fine-tuning predominantly amplifies the top singular values while leaving the remainder largely intact, suggesting that task-specific knowledge is injected into a low-dimensional subspace. Furthermore, we find that the dominant singular vectors are reoriented in task-specific directions, whereas the non-dominant subspace remains stable. Building on these insights, we propose a novel method that leverages learnable rescaling of top singular directions, enabling precise modulation of the most influential components without disrupting the global structure. Our approach achieves consistent improvements over strong baselines across multiple tasks, highlighting the efficacy of structurally informed fine-tuning.

大规模基础模型在广泛的下游任务中展现出显著的通用性。然而,对这些模型进行全量微调的计算成本过高,这推动了参数高效微调(PEFT)方法的发展,例如 LoRA,它为预训练权重引入低秩更新。尽管这些方法在经验上取得了成功,PEFT 修改模型参数的内在机制仍未得到充分探讨。在这项工作中,我们系统研究了全量微调过程中权重矩阵的结构变化。通过奇异值分解(SVD),我们发现微调主要放大最大的若干奇异值,而其余奇异值基本保持不变,这表明任务特定知识被注入到一个低维子空间。此外,我们发现主导奇异向量会朝任务特定方向重新定向,而非主导子空间保持稳定。基于这些发现,我们提出一种新方法,对顶部奇异方向进行可学习的重新缩放,从而在不破坏全局结构的情况下精确调节最有影响力的分量。我们的方法在多个任务上相对强基线取得了一致的改进,凸显了结构感知微调的有效性。
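
The idea of rescaling only the top singular directions can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' method: the function name `rescale_top_singular` is hypothetical, and a fixed `scale` vector stands in for the parameters that would be learned during fine-tuning.

```python
import numpy as np

def rescale_top_singular(W, scale):
    """Multiply the top-k singular values of W by `scale` (length k); leave the rest untouched."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    k = len(scale)
    s_new = s.copy()
    s_new[:k] = s[:k] * scale
    # (U * s_new) scales column j of U by s_new[j], i.e. U @ diag(s_new).
    return (U * s_new) @ Vt

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))     # stand-in for a pre-trained weight matrix

# A scale of all ones reproduces the original matrix.
W_same = rescale_top_singular(W, np.ones(2))

# Amplify only the two leading singular directions.
W_adapted = rescale_top_singular(W, np.array([1.5, 1.2]))
```

Only `k` scalars change per matrix here, which is why this style of adaptation is cheap compared with updating every weight.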

Article 272

Title@2025-05-29 (4): Learning to Search for Vehicle Routing with Multiple Time Windows

Title: Learning to Search for Vehicle Routing with Multiple Time Windows

Lernen, nach Fahrzeug Routing mit mehreren Zeitfenstern zu suchen

学习搜索多时间窗口运行的车辆 2505.23098v1

Authors: Kuan Xu, Zhiguang Cao, Chenlong Zheng, Linong Liu

In this study, we propose a reinforcement learning-based adaptive variable neighborhood search (RL-AVNS) method designed for effectively solving the Vehicle Routing Problem with Multiple Time Windows (VRPMTW). Unlike traditional adaptive approaches that rely solely on historical operator performance, our method integrates a reinforcement learning framework to dynamically select neighborhood operators based on real-time solution states and learned experience. We introduce a fitness metric that quantifies customers’ temporal flexibility to improve the shaking phase, and employ a transformer-based neural policy network to intelligently guide operator selection during the local search. Extensive computational experiments are conducted on realistic scenarios derived from the replenishment of unmanned vending machines, characterized by multiple clustered replenishment windows. Results demonstrate that RL-AVNS significantly outperforms traditional variable neighborhood search (VNS), adaptive VNS (AVNS), and state-of-the-art learning-based heuristics, achieving substantial improvements in solution quality and computational efficiency across various instance scales and time window complexities. Particularly notable is the algorithm’s capability to generalize effectively to problem instances not encountered during training, underscoring its practical utility for complex logistics scenarios.

在这项研究中,我们提出了一种基于学习的强化适应性可变邻里搜索(RL-AVNS)方法,该方法旨在有效地解决多时视窗车辆流动问题。与仅依赖历史运营者性能的传统适应性方法不同,我们的方法将强化学习框架整合到动态地选择基于实时解决方案状态和所学经验的邻里运营者中。我们引入了一种健康度量标准,对客户的时间灵活性进行量化,以改进摇动阶段,并使用基于变压器的神经政策网络,以明智地指导当地搜索过程中的操作者选择。进行了广泛的计算实验,其依据是补充无人驾驶自动售货机(以多个集群补充窗口为特征)所产生的现实情景。结果显示,RL-AVNS大大超越了传统的可变邻里搜索(VNS)、适应性VNS(ANS)和以学习为主的状态的超常态功能,在解决方案质量和计算效率方面在各个实例和时间窗口复杂性方面取得重大改进。特别值得注意的是,算法能力将培训中未遇到的问题有效地归纳到培训中,强调其对复杂物流假设的实用用途。

Article 273

Title@2025-05-29 (4): Stochastic Diffusion: A Diffusion Based Model for Stochastic Time Series Forecasting

Title: Stochastic Diffusion: A Diffusion Based Model for Stochastic Time Series Forecasting

Stochastische Diffusion: Ein diffusionsbasiertes Modell für stochastische Zeitreihen

斯托卡扩散:以传播为基础的斯托卡时间序列预测模型 2406.02827v2

Authors: Yuansan Liu, Sudanthi Wijewickrema, Dongting Hu, Christofer Bester, Stephen O’Leary, James Bailey

Recent innovations in diffusion probabilistic models have paved the way for significant progress in image, text and audio generation, leading to their applications in generative time series forecasting. However, leveraging such abilities to model highly stochastic time series data remains a challenge. In this paper, we propose a novel Stochastic Diffusion (StochDiff) model which learns data-driven prior knowledge at each time step by utilizing the representational power of the stochastic latent spaces to model the variability of the multivariate time series data. The learnt prior knowledge helps the model to capture complex temporal dynamics and the inherent uncertainty of the data. This improves its ability to model highly stochastic time series data. Through extensive experiments on real-world datasets, we demonstrate the effectiveness of our proposed model on stochastic time series forecasting. Additionally, we showcase an application of our model for real-world surgical guidance, highlighting its potential to benefit the medical community.

最近在传播概率模型方面的创新为图像、文本和音频生成方面的重大进步铺平了道路,从而导致其在基因时间序列预测中的应用。然而,利用这种能力模拟高度随机时间序列数据仍然是一个挑战。在本文中,我们提出一个新型的Stochacistic扩散(StochDiff)模型,通过利用随机潜伏空间的代表性能力,在每一个阶段学习数据驱动的先前知识,以模拟多变时间序列数据的变异性。所学的先前知识有助于模型捕捉复杂的时间动态和数据固有的不确定性。这提高了模型模拟高度随机时间序列数据的能力。通过在现实世界数据集上的广泛实验,我们展示了我们提议的模型在随机时间序列预测方面的有效性。此外,我们展示了我们模型在现实世界外科指导方面的应用,突出了它有利于医学界的潜力。

Article 274

Title@2025-05-29 (4): Constraints and Variables Reduction for Optimal Power Flow Using Hierarchical Graph Neural Networks with Virtual Node-Splitting

Title: Constraints and Variables Reduction for Optimal Power Flow Using Hierarchical Graph Neural Networks with Virtual Node-Splitting

Einschränkungen und Variablen-Reduktion für optimalen Stromfluss mittels Hierarchischer Graphen-Neural-Netzwerke mit virtuellem Knoten-Splitting

利用具有虚拟节点切除功能的等级形图形神经网络减少最佳电力流动的制约因素和变数 2411.06268v2

Authors: Thuan Pham, Xingpeng Li

Power system networks are often modeled as homogeneous graphs, which limits the ability of graph neural network (GNN) to capture individual generator features at the same nodes. By introducing the proposed virtual node-splitting strategy, generator-level attributes like costs, limits, and ramp rates can be fully captured by GNN models, improving GNN’s learning capacity and prediction accuracy. Optimal power flow (OPF) problem is used for real-time grid operations. Limited timeframe motivates studies to create size-reduced OPF (ROPF) models to relieve the computational complexity. In this paper, with virtual node-splitting, a novel two-stage adaptive hierarchical GNN is developed to (i) predict critical lines that would be congested, and then (ii) predict base generators that would operate at the maximum capacity. This will substantially reduce the constraints and variables needed for OPF, creating the proposed ROPFLG model with reduced monitor lines and reduced generator-specific variables and constraints. Two ROPF models, ROPFL and ROPFG, with just reduced lines or generators respectively, are also implemented as additional benchmark models. Case studies show that the proposed ROPFLG consistently outperforms the benchmark full OPF (FOPF) and the other two ROPF methods, achieving significant computational time savings while reliably finding optimal solutions.

电源系统网络往往以同质图制成,限制了图形神经网络(GNN)在同一节点上捕捉单个发电机功能的能力。通过采用拟议的虚拟节点分割战略,GNN模型可以充分捕捉到发电机一级的特性,如成本、限值和坡度等,提高GNN的学习能力和预测准确性。最佳电流(OPF)问题用于实时电网操作。有限的时间框架促使研究创建缩小电流(ROPF)模型以减轻计算复杂性。在本文件中,通过虚拟节点分割,开发了一个新型的两阶段适应性级GNNNN,以(一) 预测将凝固的关键线,然后(二) 预测最大容量运行的基发电机。这将大大减少对OPF的制约和变数,创建监测线减少和发电机特定变数和制约因素的拟议ROPFLG模型。两种模型,即ROPFL和ROPFG,分别缩小线或发电机的两种模型,也作为补充基准模型。案例研究表明,将持续地实现ROPFFS的其它重要计算方法。

Article 275

Title@2025-05-29 (4): MAP: Revisiting Weight Decomposition for Low-Rank Adaptation

Title: MAP: Revisiting Weight Decomposition for Low-Rank Adaptation

MAP: Neubetrachtung der Gewichtszerlegung für Low-Rank-Anpassung

MAP: 重新审视低浓度适应的重量分解 2505.23094v1

Authors: Chongjie Si, Zhiyi Shi, Yadao Wang, Xiaokang Yang, Susanto Rahardja, Wei Shen

The rapid development of large language models has revolutionized natural language processing, but their fine-tuning remains computationally expensive, hindering broad deployment. Parameter-efficient fine-tuning (PEFT) methods, such as LoRA, have emerged as solutions. Recent work like DoRA attempts to further decompose weight adaptation into direction and magnitude components. However, existing formulations often define direction heuristically at the column level, lacking a principled geometric foundation. In this paper, we propose MAP, a novel framework that reformulates weight matrices as high-dimensional vectors and decouples their adaptation into direction and magnitude in a rigorous manner. MAP normalizes the pre-trained weights, learns a directional update, and introduces two scalar coefficients to independently scale the magnitude of the base and update vectors. This design enables more interpretable and flexible adaptation, and can be seamlessly integrated into existing PEFT methods. Extensive experiments show that MAP significantly improves performance when coupling with existing methods, offering a simple yet powerful enhancement to existing PEFT methods. Given the universality and simplicity of MAP, we hope it can serve as a default setting for designing future PEFT methods.

大型语言模型的迅速发展使自然语言处理发生了革命性的变化,但是它们的微调仍然在计算上昂贵,阻碍了广泛的部署。参数效率微调方法,如LORA,已经成为一种解决办法。最近的工作,例如DoRA试图将重量调整进一步分解成方向和量级组成部分。然而,现有的配方往往在柱级上以超自然方式界定方向,缺乏一个原则的几何基础。在本文件中,我们建议MAP这个新框架将重量矩阵重新作为高维矢量矢量,并严格地将其调整为方向和规模。MAP使预先训练的重量正常化,学习方向性更新,并引入两个标量系数,独立地测量基的大小,并更新矢量。这种设计可以使更可解释和灵活地适应,并且可以顺利地融入现有的PEFT方法。广泛的实验表明,MAP在与现有方法结合时大大改进了业绩,为现有的PEFT方法提供了简单而有力的改进。鉴于MAP的普及性和简洁性,我们希望它能够作为未来的默认设置PEFT方法。
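
One way to read the description above (normalize the pre-trained weights, learn a directional update, scale both with separate scalars) is sketched below. The exact parameterization is an assumption: the function name `map_style_weight`, the LoRA-style low-rank factors `B` and `A`, the normalization of the update, and the `eps` guard are all illustrative choices, not details confirmed by the abstract.

```python
import numpy as np

def map_style_weight(W0, B, A, alpha, beta, eps=1e-12):
    """Combine a unit-norm base direction and a unit-norm update direction.

    W0    : pre-trained weight matrix, treated as one high-dimensional vector
    B, A  : low-rank factors of the directional update (assumed, LoRA-style)
    alpha : scalar magnitude of the base direction
    beta  : scalar magnitude of the update direction
    """
    base_dir = W0 / (np.linalg.norm(W0) + eps)          # Frobenius norm = vector 2-norm
    delta = B @ A
    update_dir = delta / (np.linalg.norm(delta) + eps)  # stays zero when B is zero
    return alpha * base_dir + beta * update_dir

rng = np.random.default_rng(0)
W0 = rng.standard_normal((6, 4))
A = rng.standard_normal((2, 4))

# With a zero update and alpha set to the original norm, the weights are unchanged.
W_init = map_style_weight(W0, np.zeros((6, 2)), A, alpha=np.linalg.norm(W0), beta=0.5)

# With a non-zero update, beta alone controls how far the weights move along it.
B = rng.standard_normal((6, 2))
W_new = map_style_weight(W0, B, A, alpha=np.linalg.norm(W0), beta=0.5)
step_size = np.linalg.norm(W_new - W_init)
```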

Article 276

Title@2025-05-29 (4): Equivariant Spherical Transformer for Efficient Molecular Modeling

Title: Equivariant Spherical Transformer for Efficient Molecular Modeling

Equivarianter Spherical Transformer für effiziente molekulare Modellierung

高效分子建模的等同球质变变变器 2505.23086v1

Authors: Junyi An, Xinyu Lu, Chao Qu, Yunfei Shi, Peijia Lin, Qianwei Tang, Licheng Xu, Fenglei Cao, Yuan Qi

SE(3)-equivariant Graph Neural Networks (GNNs) have significantly advanced molecular system modeling by employing group representations. However, their message passing processes, which rely on tensor product-based convolutions, are limited by insufficient non-linearity and incomplete group representations, thereby restricting expressiveness. To overcome these limitations, we introduce the Equivariant Spherical Transformer (EST), a novel framework that leverages a Transformer structure within the spatial domain of group representations after Fourier transform. We theoretically and empirically demonstrate that EST can encompass the function space of tensor products while achieving superior expressiveness. Furthermore, EST’s equivariant inductive bias is guaranteed through a uniform sampling strategy for the Fourier transform. Our experiments demonstrate state-of-the-art performance by EST on various molecular benchmarks, including OC20 and QM9.

SE(3) 等变图神经网络(GNN)借助群表示显著推进了分子系统建模。然而,其消息传递过程依赖基于张量积的卷积,受限于非线性不足和群表示不完整,从而限制了表达能力。为克服这些局限,我们提出等变球面 Transformer(Equivariant Spherical Transformer,EST),这一新框架在对群表示做傅里叶变换后的空间域中使用 Transformer 结构。我们从理论和实验上证明,EST 可以涵盖张量积的函数空间,同时获得更强的表达能力。此外,EST 的等变归纳偏置通过傅里叶变换的均匀采样策略得到保证。我们的实验表明,EST 在包括 OC20 和 QM9 在内的多个分子基准上取得了最先进的性能。

Article 277

Title@2025-05-29 (4): Gradient Boosting Decision Tree with LSTM for Investment Prediction

Title: Gradient Boosting Decision Tree with LSTM for Investment Prediction

Gradienten Auftrieb Entscheidungsbaum mit LSTM für Investitionsvorhersage

与 LSTM 一起逐步促进投资预测决策树 2505.23084v1

Authors: Chang Yu, Fang Liu, Jie Zhu, Shaobo Guo, Yifan Gao, Zhongheng Yang, Meiwei Liu, Qianwen Xing

This paper proposes a hybrid framework combining LSTM (Long Short-Term Memory) networks with LightGBM and CatBoost for stock price prediction. The framework processes time-series financial data and evaluates performance using seven models: Artificial Neural Networks (ANNs), Convolutional Neural Networks (CNNs), Bidirectional LSTM (BiLSTM), vanilla LSTM, XGBoost, LightGBM, and standard Neural Networks (NNs). Key metrics, including MAE, R-squared, MSE, and RMSE, are used to establish benchmarks across different time scales. Building on these benchmarks, we develop an ensemble model that combines the strengths of sequential and tree-based approaches. Experimental results show that the proposed framework improves accuracy by 10 to 15 percent compared to individual models and reduces error during market changes. This study highlights the potential of ensemble methods for financial forecasting and provides a flexible design for integrating new machine learning techniques.

本文件提出一个混合框架,将LSTM(长期短期内存)网络与LightGBM和CatBoost(用于股票价格预测)结合起来,该框架处理时间序列财务数据,并使用七个模型评估业绩:人工神经网络、进化神经网络、双向LSTM(BILSTM)、Vanilla LSTM、XGBost、LightGBM和标准神经网络。主要指标,包括MAE、R-qured、MSE和RMSE,用于制定不同时间尺度的基准。我们以这些基准为基础,开发了将连续和基于树木的方法的优势结合起来的全套模型。实验结果表明,拟议的框架比单个模型的准确性提高了10%至15%,并减少了市场变化期间的错误。本研究报告强调了组合方法在财务预测方面的潜力,为整合新的机器学习技术提供了灵活的设计。
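
For reference, the four metrics named in the abstract (MAE, MSE, RMSE, R-squared) can be computed as below. The price and forecast values are made-up numbers for illustration only, and `regression_metrics` is a hypothetical helper name.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    ss_res = np.sum(err ** 2)                          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

prices = [100.0, 102.0, 101.0, 105.0]      # made-up observed values
forecast = [101.0, 101.0, 102.0, 104.0]    # made-up predictions
metrics = regression_metrics(prices, forecast)
```

MAE and RMSE are in the same units as the price, while R-squared is unitless, so the three error metrics and R-squared are usually reported side by side rather than compared with one another.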

Article 278

Title@2025-05-29 (4): Gradient Methods with Online Scaling Part I. Theoretical Foundations

Title: Gradient Methods with Online Scaling Part I. Theoretical Foundations

Gradient Methoden mit Online-Skalierung Teil I. Theoretische Grundlagen

在线扩展第一部分的渐进方法理论基础 2505.23081v1

Authors: Wenzhi Gao, Ya-Chi Chu, Yinyu Ye, Madeleine Udell

This paper establishes the theoretical foundations of the online scaled gradient methods (OSGM), a framework that utilizes online learning to adapt stepsizes and provably accelerate first-order methods. OSGM quantifies the effectiveness of a stepsize by a feedback function motivated from a convergence measure and uses the feedback to adjust the stepsize through an online learning algorithm. Consequently, instantiations of OSGM achieve convergence rates that are asymptotically no worse than the optimal stepsize. OSGM yields desirable convergence guarantees on smooth convex problems, including 1) trajectory-dependent global convergence on smooth convex objectives; 2) an improved complexity result on smooth strongly convex problems, and 3) local superlinear convergence. Notably, OSGM constitutes a new family of first-order methods with non-asymptotic superlinear convergence, joining the celebrated quasi-Newton methods. Finally, OSGM explains the empirical success of the popular hypergradient-descent heuristic in optimization for machine learning.

本文确立了在线缩放梯度方法(OSGM)的理论基础,这一框架利用在线学习来调整阶梯化和可以想象地加速一级方法。OSGM量化了由趋同措施驱动的反馈功能所推动的阶梯化步骤的有效性,并使用反馈来通过在线学习算法调整阶梯化步骤。因此,OSGM的即时趋同率并不比最佳步骤更差。OSGM在顺流的锥形问题上提供了理想的趋同保证,包括:(1) 顺流的锥形目标方面取决于轨迹的全球趋同;(2) 顺流的强烈螺旋问题提高了复杂性,(3) 地方超线性趋同。值得注意的是,OSGM构成一种由非自动超线性趋同法组成的新一流方法组合,加入了庆祝的准纽顿方法。最后,OSGM解释了在优化机器学习方面流行的超高级日光速超光谱法的成功经验。
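
The hypergradient-descent heuristic mentioned at the end of the abstract adapts a scalar stepsize online from the inner product of consecutive gradients. The sketch below shows that plain heuristic on a toy quadratic; it is not an implementation of OSGM, and the function name, the hyperparameters `alpha0` and `beta`, and the test problem are illustrative assumptions.

```python
import numpy as np

def hypergradient_descent(grad, x0, alpha0=0.01, beta=1e-4, steps=200):
    """Gradient descent whose stepsize alpha is itself updated by a gradient step.

    Since x_t = x_{t-1} - alpha * g_{t-1}, the derivative of f(x_t) with respect
    to alpha is -g_t . g_{t-1}, so descending on alpha adds beta * g_t . g_{t-1}.
    """
    x = np.asarray(x0, dtype=float)
    alpha = alpha0
    g_prev = grad(x)
    x = x - alpha * g_prev
    for _ in range(steps - 1):
        g = grad(x)
        alpha = alpha + beta * float(g @ g_prev)   # online stepsize update
        x = x - alpha * g
        g_prev = g
    return x, alpha

# Toy smooth strongly convex objective: f(x) = 0.5 * sum(curvature * x^2).
curvature = np.array([1.0, 10.0])

def loss(x):
    return 0.5 * float(np.sum(curvature * np.asarray(x, dtype=float) ** 2))

def grad(x):
    return curvature * x

x_final, alpha_final = hypergradient_descent(grad, x0=[1.0, 1.0])
```

On this problem the stepsize grows while consecutive gradients point the same way and would shrink if they started to oppose each other, which is the feedback signal the heuristic relies on.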

Article 279

Title@2025-05-29 (4): Second Opinion Matters: Towards Adaptive Clinical AI via the Consensus of Expert Model Ensemble

Title: Second Opinion Matters: Towards Adaptive Clinical AI via the Consensus of Expert Model Ensemble

Zweite Meinungsfrage: Auf dem Weg zu adaptiver klinischer KI über den Konsens des Expert Model Ensembles

第二意见事项:通过专家示范组共识实现适应性临床AI 2505.23075v1

Authors: Amit Kumthekar, Zion Tilley, Henry Duong, Bhargav Patel, Michael Magnoli, Ahmed Omar, Ahmed Nasser, Chaitanya Gharpure, Yevgen Reztzov

Despite the growing clinical adoption of large language models (LLMs), current approaches heavily rely on single model architectures. To overcome risks of obsolescence and rigid dependence on single model systems, we present a novel framework, termed the Consensus Mechanism. Mimicking clinical triage and multidisciplinary clinical decision-making, the Consensus Mechanism implements an ensemble of specialized medical expert agents enabling improved clinical decision making while maintaining robust adaptability. This architecture enables the Consensus Mechanism to be optimized for cost, latency, or performance, purely based on its interior model configuration. To rigorously evaluate the Consensus Mechanism, we employed three medical evaluation benchmarks: MedMCQA, MedQA, and MedXpertQA Text, and the differential diagnosis dataset, DDX+. On MedXpertQA, the Consensus Mechanism achieved an accuracy of 61.0% compared to 53.5% and 45.9% for OpenAI’s O3 and Google’s Gemini 2.5 Pro. Improvement was consistent across benchmarks with an increase in accuracy on MedQA ($\Delta\mathrm{Accuracy}_{\mathrm{consensus\text{-}O3}} = 3.4\%$) and MedMCQA ($\Delta\mathrm{Accuracy}_{\mathrm{consensus\text{-}O3}} = 9.1\%$). These accuracy gains extended to differential diagnosis generation, where our system demonstrated improved recall and precision ($\mathrm{F1}_{\mathrm{consensus}} = 0.326$ vs. $\mathrm{F1}_{\mathrm{O3\text{-}high}} = 0.2886$) and a higher top-1 accuracy for DDX ($\mathrm{Top1}_{\mathrm{consensus}} = 52.0\%$ vs. $\mathrm{Top1}_{\mathrm{O3\text{-}high}} = 45.2\%$).

尽管大型语言模型(LLM)在临床中的应用日益增多,但目前的方法严重依赖单一模型架构。为克服过时风险以及对单一模型系统的僵化依赖,我们提出了一个新框架,称为共识机制(Consensus Mechanism)。共识机制模仿临床分诊和多学科临床决策,由一组专业医学专家代理组成,在保持稳健适应性的同时改进临床决策。这一架构使共识机制可以仅凭其内部模型配置,针对成本、延迟或性能进行优化。为严格评估共识机制,我们采用了三个医学评估基准(MedMCQA、MedQA 和 MedXpertQA Text)以及鉴别诊断数据集 DDX+。在 MedXpertQA 上,共识机制的准确率达到 61.0%,而 OpenAI 的 O3 和 Google 的 Gemini 2.5 Pro 分别为 53.5% 和 45.9%。各基准上的改进一致:相对于 O3,MedQA 的准确率提高了 3.4%,MedMCQA 提高了 9.1%。这些准确率提升也延伸到鉴别诊断生成,我们的系统表现出更高的召回率和精确率(F1 为 0.326,O3-high 为 0.2886),DDX 的 top-1 准确率也更高(52.0% 对 45.2%)。

Article 280

Title@2025-05-29 (4): Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts

Title: Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts

Shortcut-verbundene Experten-Parallelität für die Beschleunigung von Mixture-of-Experts

用于加速混合专家模型的捷径连接专家并行 2404.05019v3

Authors: Weilin Cai, Juyong Jiang, Le Qin, Junwei Cui, Sunghun Kim, Jiayi Huang

Expert parallelism has emerged as a key strategy for distributing the computational workload of sparsely-gated mixture-of-experts (MoE) models across multiple devices, enabling the processing of increasingly large-scale models. However, the All-to-All communication inherent to expert parallelism poses a significant bottleneck, limiting the efficiency of MoE models. Although existing optimization methods partially mitigate this issue, they remain constrained by the sequential dependency between communication and computation operations. To address this challenge, we propose ScMoE, a novel shortcut-connected MoE architecture integrated with an overlapping parallelization strategy. ScMoE decouples communication from its conventional sequential ordering, enabling up to 100% overlap with computation. Compared to the prevalent top-2 MoE baseline, ScMoE achieves speedups of 1.49 times in training and 1.82 times in inference. Moreover, our experiments and analyses indicate that ScMoE not only achieves comparable but in some instances surpasses the model quality of existing approaches.

专家并行已成为在多个设备之间分配稀疏门控混合专家(MoE)模型计算负载的关键策略,使处理规模日益增大的模型成为可能。然而,专家并行所固有的 All-to-All 通信构成了显著瓶颈,限制了 MoE 模型的效率。现有优化方法虽然部分缓解了这一问题,但仍受限于通信与计算操作之间的顺序依赖。为应对这一挑战,我们提出 ScMoE,一种新的捷径连接 MoE 架构,并结合了重叠并行策略。ScMoE 将通信从其传统的顺序排列中解耦,使其与计算的重叠最高可达 100%。与常用的 top-2 MoE 基线相比,ScMoE 在训练中实现 1.49 倍加速,在推理中实现 1.82 倍加速。此外,我们的实验和分析表明,ScMoE 的模型质量不仅与现有方法相当,在某些情况下还更优。

Article 281

Title: Multi-Modal Learning with Bayesian-Oriented Gradient Calibration

Multi-Modal-Lernen mit Bayesian-Oriented Gradient Calibration

多模式学习,以巴耶斯为主的梯度校准 2505.23071v1

Authors: Peizheng Guo, Jingyao Wang, Huijie Guo, Jiangmeng Li, Chuxiong Sun, Changwen Zheng, Wenwen Qiang

Multi-Modal Learning (MML) integrates information from diverse modalities to improve predictive accuracy. However, existing methods mainly aggregate gradients with fixed weights and treat all dimensions equally, overlooking the intrinsic gradient uncertainty of each modality. This may lead to (i) excessive updates in sensitive dimensions, degrading performance, and (ii) insufficient updates in less sensitive dimensions, hindering learning. To address this issue, we propose BOGC-MML, a Bayesian-Oriented Gradient Calibration method for MML to explicitly model the gradient uncertainty and guide the model optimization towards the optimal direction. Specifically, we first model each modality’s gradient as a random variable and derive its probability distribution, capturing the full uncertainty in the gradient space. Then, we propose an effective method that converts the precision (inverse variance) of each gradient distribution into a scalar evidence. This evidence quantifies the confidence of each modality in every gradient dimension. Using these evidences, we explicitly quantify per-dimension uncertainties and fuse them via a reduced Dempster-Shafer rule. The resulting uncertainty-weighted aggregation produces a calibrated update direction that balances sensitivity and conservatism across dimensions. Extensive experiments on multiple benchmark datasets demonstrate the effectiveness and advantages of the proposed method.

多模态学习(MML)整合来自不同模态的信息以提高预测准确性。然而,现有方法主要以固定权重聚合梯度,并对所有维度一视同仁,忽略了每种模态固有的梯度不确定性。这可能导致 (i) 敏感维度更新过度,性能下降;(ii) 不太敏感的维度更新不足,阻碍学习。为解决这一问题,我们提出 BOGC-MML,一种面向贝叶斯的 MML 梯度校准方法,显式建模梯度不确定性,并引导模型朝最优方向优化。具体而言,我们首先将每种模态的梯度建模为随机变量并推导其概率分布,从而刻画梯度空间中的全部不确定性。然后,我们提出一种有效方法,将每个梯度分布的精度(方差的倒数)转换为标量证据。该证据量化了每种模态在每个梯度维度上的置信度。利用这些证据,我们显式量化逐维度的不确定性,并通过简化的 Dempster-Shafer 规则将其融合。由此得到的不确定性加权聚合给出经过校准的更新方向,在各维度上平衡敏感性与保守性。在多个基准数据集上的大量实验证明了所提方法的有效性和优势。
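
The role of precision (inverse variance) as a per-dimension confidence can be illustrated with plain inverse-variance weighting. This is a deliberately simpler stand-in for the reduced Dempster-Shafer fusion described in the abstract, not the BOGC-MML rule itself; the function name and the example numbers are assumptions.

```python
import numpy as np

def precision_weighted_gradient(means, variances, eps=1e-12):
    """Fuse per-modality gradients, weighting each dimension by its precision.

    means, variances: arrays of shape (num_modalities, num_dimensions).
    Returns the fused gradient and the per-dimension weights.
    """
    means = np.asarray(means, dtype=float)
    precision = 1.0 / (np.asarray(variances, dtype=float) + eps)
    weights = precision / precision.sum(axis=0, keepdims=True)
    return (weights * means).sum(axis=0), weights

# Two modalities, two gradient dimensions (made-up numbers).
grad_means = np.array([[1.0, 2.0],
                       [3.0, -2.0]])
grad_vars = np.array([[1.0, 4.0],     # modality 0 is uncertain in dimension 1
                      [1.0, 1.0]])
fused, weights = precision_weighted_gradient(grad_means, grad_vars)
```

In dimension 0 both modalities are equally certain and are averaged; in dimension 1 the lower-variance modality dominates, which is the per-dimension behavior that fixed-weight aggregation cannot express.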

Article 282

Title@2025-05-29 (4): Sparse Linear Bandits with Blocking Constraints

Title: Sparse Linear Bandits with Blocking Constraints

Sparse Linear Bandits mit Blockierung Einschränkungen

带阻塞约束的稀疏线性赌博机 2410.20041v2

Authors: Adit Jain, Soumyabrata Pal, Sunav Choudhary, Ramasuri Narayanam, Harshita Chopra, Vikram Krishnamurthy

We investigate the high-dimensional sparse linear bandits problem in a data-poor regime where the time horizon is much smaller than the ambient dimension and number of arms. We study the setting under the additional blocking constraint where each unique arm can be pulled only once. The blocking constraint is motivated by practical applications in personalized content recommendation and identification of data points to improve annotation efficiency for complex learning tasks. With mild assumptions on the arms, our proposed online algorithm (BSLB) achieves a regret guarantee of $\widetilde{\mathsf{O}}((1+\beta_k)^2k^{\frac{2}{3}} \mathsf{T}^{\frac{2}{3}})$ where the parameter vector has an (unknown) relative tail $\beta_k$ – the ratio of $\ell_1$ norm of the top-$k$ and remaining entries of the parameter vector. To this end, we show novel offline statistical guarantees of the lasso estimator for the linear model that is robust to the sparsity modeling assumption. Finally, we propose a meta-algorithm (C-BSLB) based on corralling that does not need knowledge of optimal sparsity parameter $k$ at minimal cost to regret. Our experiments on multiple real-world datasets demonstrate the validity of our algorithms and theoretical framework.

我们研究数据匮乏情形下的高维稀疏线性赌博机(bandit)问题,其中时间范围远小于环境维度和臂的数量。我们在额外的阻塞约束下研究这一设定,即每个臂只能被拉动一次。阻塞约束源于个性化内容推荐以及为提高复杂学习任务的标注效率而挑选数据点等实际应用。在对臂的温和假设下,我们提出的在线算法(BSLB)达到 $\widetilde{\mathsf{O}}((1+\beta_k)^2k^{\frac{2}{3}} \mathsf{T}^{\frac{2}{3}})$ 的遗憾保证,其中参数向量具有(未知的)相对尾部 $\beta_k$,即参数向量前 $k$ 个条目与其余条目的 $\ell_1$ 范数之比。为此,我们给出了线性模型下 lasso 估计量新的离线统计保证,该保证对稀疏性建模假设具有鲁棒性。最后,我们提出一种基于 corralling 的元算法(C-BSLB),它不需要知道最优稀疏度参数 $k$,且遗憾代价极小。我们在多个真实世界数据集上的实验验证了算法和理论框架的有效性。

Article 283

Title@2025-05-29 (4): GrokFormer: Graph Fourier Kolmogorov-Arnold Transformers

Title: GrokFormer: Graph Fourier Kolmogorov-Arnold Transformers

GrokFormer: Graph Fourier Kolmogorov-Arnold Transformer

GrokFormer:图示 Fourier Kolmogorov-Arnold变形器 2411.17296v3

Authors: Guoguo Ai, Guansong Pang, Hezhe Qiao, Yuan Gao, Hui Yan

Graph Transformers (GTs) have demonstrated remarkable performance in graph representation learning over popular graph neural networks (GNNs). However, self-attention, the core module of GTs, preserves only low-frequency signals in graph features, leading to ineffectiveness in capturing other important signals like high-frequency ones. Some recent GT models help alleviate this issue, but their flexibility and expressiveness are still limited since the filters they learn are fixed on predefined graph spectrum or spectral order. To tackle this challenge, we propose a Graph Fourier Kolmogorov-Arnold Transformer (GrokFormer), a novel GT model that learns highly expressive spectral filters with adaptive graph spectrum and spectral order through a Fourier series modeling over learnable activation functions. We demonstrate theoretically and empirically that the proposed GrokFormer filter offers better expressiveness than other spectral methods. Comprehensive experiments on 10 real-world node classification datasets across various domains, scales, and graph properties, as well as 5 graph classification datasets, show that GrokFormer outperforms state-of-the-art GTs and GNNs. Our code is available at https://github.com/GGA23/GrokFormer

图 Transformer(GT)在图表示学习中表现出优于常用图神经网络(GNN)的性能。然而,GT 的核心模块自注意力只保留图特征中的低频信号,难以捕捉高频信号等其他重要信号。近期的一些 GT 模型有助于缓解这一问题,但由于它们学习的滤波器固定在预定义的图谱或谱阶上,其灵活性和表达能力仍然有限。为应对这一挑战,我们提出图傅里叶 Kolmogorov-Arnold Transformer(GrokFormer),这一新的 GT 模型通过在可学习激活函数上进行傅里叶级数建模,学习具有自适应图谱和谱阶的高表达力谱滤波器。我们从理论和实验上证明,所提出的 GrokFormer 滤波器比其他谱方法具有更好的表达能力。在涵盖不同领域、规模和图性质的 10 个真实世界节点分类数据集以及 5 个图分类数据集上的全面实验表明,GrokFormer 优于最先进的 GT 和 GNN。我们的代码见 https://github.com/GGA23/GrokFormer

Article 284

Title@2025-05-29 (4): Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling

Title: Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling

Skalierung von Flüssig-Resistenz-Netzwerken für eine effiziente Sequenzmodellierung

增强增强流动性恢复力的流动性能力网络,以建立高效序列建模 2505.21717v2

Authors: Mónika Farsang, Ramin Hasani, Radu Grosu

We present LrcSSM, a $\textit{nonlinear}$ recurrent model that processes long sequences as fast as today’s linear state-space layers. By forcing the state-transition matrix to be diagonal and learned at every step, the full sequence can be solved in parallel with a single prefix-scan, giving $\mathcal{O}(TD)$ time and memory and only $\mathcal{O}(\log T)$ sequential depth, for input-sequence length $T$ and a state dimension $D$. Moreover, LrcSSM offers a formal gradient-stability guarantee that other input-varying systems such as Liquid-S4 and Mamba do not provide. Lastly, for network depth $L$, as the forward and backward passes cost $\Theta(T\,D\,L)$ FLOPs, with its low sequential depth and parameter count $\Theta(D\,L)$, the model follows the compute-optimal scaling law regime ($\beta \approx 0.42$) recently observed for Mamba, outperforming quadratic-attention Transformers at equal compute while avoiding the memory overhead of FFT-based long convolutions. We show that on a series of long-range forecasting tasks, LrcSSM outperforms LRU, S5 and Mamba.

我们提出 LrcSSM,一种非线性(nonlinear)循环模型,处理长序列的速度与当今的线性状态空间层相当。通过强制状态转移矩阵为对角阵并在每一步进行学习,整个序列可以用一次前缀扫描(prefix-scan)并行求解,对于输入序列长度 $T$ 和状态维度 $D$,时间和内存开销为 $\mathcal{O}(TD)$,顺序深度仅为 $\mathcal{O}(\log T)$。此外,LrcSSM 提供了形式化的梯度稳定性保证,而 Liquid-S4 和 Mamba 等其他随输入变化的系统并不具备。最后,对于网络深度 $L$,前向和反向传播的开销为 $\Theta(T\,D\,L)$ FLOPs,加之较低的顺序深度和 $\Theta(D\,L)$ 的参数量,该模型遵循最近在 Mamba 上观察到的计算最优缩放律($\beta \approx 0.42$),在同等计算量下优于二次注意力 Transformer,同时避免了基于 FFT 的长卷积的内存开销。我们在一系列长程预测任务上表明,LrcSSM 优于 LRU、S5 和 Mamba。
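
The claim that a diagonal state transition lets the whole sequence be solved with a single prefix scan rests on the fact that steps of the form h_t = a_t * h_{t-1} + b_t compose associatively. The sketch below checks this for an elementwise linear recurrence with a doubling scan in NumPy; it illustrates only that recurrence, not the LrcSSM model or how its nonlinearity is handled, and all names and sizes are assumptions.

```python
import numpy as np

def sequential_scan(a, b):
    """Reference: h_t = a_t * h_{t-1} + b_t with h_0 = 0, computed step by step."""
    h = np.zeros_like(b[0])
    out = np.empty_like(b)
    for t in range(len(a)):
        h = a[t] * h + b[t]
        out[t] = h
    return out

def doubling_scan(a, b):
    """Same recurrence via an inclusive scan with about log2(T) vectorized passes.

    Each position holds an affine map h -> A*h + B covering a window of steps;
    composing an earlier map (A1, B1) with a later one (A2, B2) gives
    (A2*A1, A2*B1 + B2), which is associative.
    """
    A, B = a.copy(), b.copy()
    T = len(a)
    shift = 1
    while shift < T:
        # Right-hand sides are evaluated before assignment, so old values are used.
        B[shift:] = A[shift:] * B[:-shift] + B[shift:]
        A[shift:] = A[shift:] * A[:-shift]
        shift *= 2
    return B

rng = np.random.default_rng(0)
T, D = 37, 4                                  # sequence length and state dimension
a = rng.uniform(0.5, 1.0, size=(T, D))        # diagonal transition entries per step
b = rng.standard_normal((T, D))               # per-step inputs
match = np.allclose(sequential_scan(a, b), doubling_scan(a, b))
```

This doubling variant does more total work than the sequential loop; work-efficient scans reach the O(TD) cost quoted in the abstract while keeping the logarithmic depth.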

Article 285

Title@2025-05-29 (4): SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models

Title: SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models

SORSA: Singuläre Werte und Orthonormale Regularisierte Singuläre Vektoren Anpassung großer Sprachmodelle

SORSA: 单项价值和正正正的正规化的单项矢量,以适应大语言模式 2409.00055v6

Authors: Yang Cao, Zhao Song

In this paper, we propose Singular Values and Orthonormal Regularized Singular Vectors Adaptation, or SORSA, a novel parameter-efficient fine-tuning (PEFT) method. Each SORSA adapter consists of two main parts: trainable principal singular weights $W_p = U_p \text{diag}(S_p) V^\top_p$, and frozen residual weights $W_r = U_r \text{diag}(S_r) V^\top_r$. These parts are initialized by performing singular value decomposition (SVD) on pre-trained weights. Moreover, we implement and analyze an orthonormal regularizer, which we prove could decrease the condition number of $W_p$ and make the optimization more efficient. SORSA adapters could be merged during inference, thus eliminating any inference latency. We also introduce a method to analyze the variation of the parameters by performing SVD and discuss and analyze SORSA’s superiority in minimizing the alteration in the SVD aspect. Overall, SORSA shows faster convergence than LoRA and PiSSA in our experiments. On the GSM-8K benchmark, Llama 2 7B adapted using SORSA achieved 56.03% accuracy, surpassing LoRA (42.30%) and Full FT (49.05%). We conclude that SORSA offers a new perspective on parameter-efficient fine-tuning, demonstrating remarkable performance.

在本文中,我们提出奇异值与正交正则化奇异向量适配(Singular Values and Orthonormal Regularized Singular Vectors Adaptation,SORSA),一种新的参数高效微调(PEFT)方法。每个 SORSA 适配器由两个主要部分组成:可训练的主奇异权重 $W_p = U_p \text{diag}(S_p) V^\top_p$,以及冻结的残差权重 $W_r = U_r \text{diag}(S_r) V^\top_r$。这两部分通过对预训练权重进行奇异值分解(SVD)来初始化。此外,我们实现并分析了一个正交正则项,并证明它可以降低 $W_p$ 的条件数,使优化更高效。SORSA 适配器可以在推理时合并,从而消除额外的推理延迟。我们还提出了一种通过 SVD 分析参数变化的方法,并讨论和分析了 SORSA 在尽量减少 SVD 层面改变方面的优势。在我们的实验中,SORSA 的收敛速度快于 LoRA 和 PiSSA。在 GSM-8K 基准上,使用 SORSA 适配的 Llama 2 7B 达到 56.03% 的准确率,超过 LoRA(42.30%)和全量微调(49.05%)。我们的结论是,SORSA 为参数高效微调提供了新的视角,并展现出出色的性能。
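
The initialization described above, splitting a pre-trained matrix into principal weights $W_p$ and residual weights $W_r$ by SVD, can be sketched as follows. The split follows the formulas in the abstract; the exact form of the orthonormal regularizer is not given there, so `orthonormal_penalty` is a hypothetical choice (squared Frobenius distance from the identity), and the function names and rank are assumptions.

```python
import numpy as np

def split_principal_residual(W, r):
    """Return W_p (top-r singular triplets), W_r (the rest), and the factors of W_p."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    W_p = (U[:, :r] * s[:r]) @ Vt[:r]     # U_p diag(S_p) V_p^T, trainable part
    W_r = (U[:, r:] * s[r:]) @ Vt[r:]     # U_r diag(S_r) V_r^T, frozen part
    return W_p, W_r, (U[:, :r], s[:r], Vt[:r])

def orthonormal_penalty(U_p, Vt_p):
    """Hypothetical regularizer: how far U_p and V_p are from having orthonormal columns."""
    eye = np.eye(U_p.shape[1])
    return (np.linalg.norm(U_p.T @ U_p - eye) ** 2
            + np.linalg.norm(Vt_p @ Vt_p.T - eye) ** 2)

rng = np.random.default_rng(0)
W = rng.standard_normal((10, 8))          # stand-in for a pre-trained weight matrix
W_p, W_r, (U_p, S_p, Vt_p) = split_principal_residual(W, r=3)

reconstructs = np.allclose(W_p + W_r, W)          # the two parts sum back to W
penalty_at_init = orthonormal_penalty(U_p, Vt_p)  # close to zero right after the SVD
```

Because the two parts sum back to the original matrix, merging the adapter for inference is a single addition, consistent with the abstract's note that no extra inference latency remains.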

Article 286

Title@2025-05-29 (4): M3Bench: Benchmarking Whole-body Motion Generation for Mobile Manipulation in 3D Scenes

Title: M3Bench: Benchmarking Whole-body Motion Generation for Mobile Manipulation in 3D Scenes

M3Bench: Benchmarking Ganzkörper-Bewegungs-Generation für mobile Manipulation in 3D-Szenen

M3Bench:3D场景移动操纵基准全体运动生成 2410.06678v3

Authors: Zeyu Zhang, Sixu Yan, Muzhi Han, Zaijin Wang, Xinggang Wang, Song-Chun Zhu, Hangxin Liu

We propose M3Bench, a new benchmark for whole-body motion generation in mobile manipulation tasks. Given a 3D scene context, M3Bench requires an embodied agent to reason about its configuration, environmental constraints, and task objectives to generate coordinated whole-body motion trajectories for object rearrangement. M3Bench features 30,000 object rearrangement tasks across 119 diverse scenes, providing expert demonstrations generated by our newly developed M3BenchMaker, an automatic data generation tool that produces whole-body motion trajectories from high-level task instructions using only basic scene and robot information. Our benchmark includes various task splits to evaluate generalization across different dimensions and leverages realistic physics simulation for trajectory assessment. Extensive evaluation analysis reveals that state-of-the-art models struggle with coordinating base-arm motion while adhering to environmental and task-specific constraints, underscoring the need for new models to bridge this gap. By releasing M3Bench and M3BenchMaker we aim to advance robotics research toward more adaptive and capable mobile manipulation in diverse, real-world environments.

我们提出M3Bench,这是移动操纵任务中全体运动生成的新基准。在3D场景背景下,M3Bench要求一个内含的代理体对其配置、环境限制和任务目标进行解释,以产生协调的全体运动轨迹,用于物体重新排列。 M3Bench具有119个不同场景的30,000个物体重新排列任务的特点,提供我们新开发的M3Bench-Maker产生的专家演示,这是一个自动数据生成工具,仅使用基本场景和机器人信息,从高级任务指令中产生全体运动轨迹。我们的基准包括各种任务分割,以评价不同层面的通用,并利用现实物理模拟进行轨迹评估。广泛的评估分析表明,最先进的模型在坚持环境和特定任务的限制的同时,在协调基本运动方面挣扎,强调需要新的模型来弥合这一差距。我们通过释放M3Bench和M3Bench-M3Benker,目的是推动机器人研究,在多样化的现实世界环境中进行更适应和更有能力的移动操纵。

Article 287

Title@2025-05-29 (4): Topological Structure Learning Should Be A Research Priority for LLM-Based Multi-Agent Systems

Title: Topological Structure Learning Should Be A Research Priority for LLM-Based Multi-Agent Systems

Topologisches Strukturlernen sollte eine Forschungspriorität für LLM-basierte Multi-Agent-Systeme sein

地形结构学习应成为以LLM为基础的多种机构系统的研究重点 2505.22467v2

Authors: Jiaxi Yang, Mengqi Zhang, Yiqiao Jin, Hao Chen, Qingsong Wen, Lu Lin, Yi He, Weijie Xu, James Evans, Jindong Wang

Large Language Model-based Multi-Agent Systems (MASs) have emerged as a powerful paradigm for tackling complex tasks through collaborative intelligence. Nevertheless, the question of how agents should be structurally organized for optimal cooperation remains largely unexplored. In this position paper, we aim to gently redirect the focus of the MAS research community toward this critical dimension: develop topology-aware MASs for specific tasks. Specifically, the system consists of three core components - agents, communication links, and communication patterns - that collectively shape its coordination performance and efficiency. To this end, we introduce a systematic, three-stage framework: agent selection, structure profiling, and topology synthesis. Each stage would trigger new research opportunities in areas such as language models, reinforcement learning, graph learning, and generative modeling; together, they could unleash the full potential of MASs in complicated real-world applications. Then, we discuss the potential challenges and opportunities in the evaluation of multiple systems. We hope our perspective and framework can offer critical new insights in the era of agentic AI.

大型语言模型多行为者系统(MAS)已成为通过协作情报处理复杂任务的有力范例,然而,关于应如何从结构上组织代理人以实现最佳合作的问题基本上尚未探讨。在本立场文件中,我们的目标是将MAS研究界的重点轻轻地转向这一关键方面:为具体任务开发具有地貌意识的MAS。具体地说,该系统由三个核心组成部分组成:代理、通信联系和通信模式,共同决定其协调性能和效率。为此,我们引入了一个系统化的、三阶段的框架:代理选择、结构特征分析和地形综合。每个阶段都将在语言模型、强化学习、图表学习和基因模型等领域触发新的研究机会;它们一起可以充分发挥MAS在复杂的现实世界应用中的潜力。然后,我们讨论在评估多种系统方面的潜在挑战和机遇。我们希望我们的观点和框架能够在代理性AI时代提供重要的新见解。

Article 288

Title@2025-05-29 (4): Efficient Quantum Approximate $k$NN Algorithm via Granular-Ball Computing

Title: Efficient Quantum Approximate $k$NN Algorithm via Granular-Ball Computing

Effiziente Quanten Ungefähre $k$NN-Algorithmus über Granular-Ball Computing

通过颗粒球式计算机计算, 近于 $k$NN 的高效量量量 2505.23066v1

Authors: Shuyin Xia, Xiaojiang Tian, Suzhen Yuan, Jeremiah D. Deng

High time complexity is one of the biggest challenges faced by $k$-Nearest Neighbors ($k$NN). Although current classical and quantum $k$NN algorithms have made some improvements, they still hit a speed bottleneck on large amounts of data. To address this issue, we propose an innovative algorithm called Granular-Ball based Quantum $k$NN (GB-Q$k$NN). This approach achieves higher efficiency by first employing granular-balls, which reduces the amount of data that needs to be processed. The search process is then accelerated by adopting a Hierarchical Navigable Small World (HNSW) method. Moreover, we optimize the time-consuming steps of HNSW, such as distance calculation, via quantization, further reducing the time complexity of the construction and search process. By combining granular-balls with quantization of the HNSW method, our approach takes advantage of both treatments and significantly reduces the time complexity of $k$NN-like algorithms, as revealed by a comprehensive complexity analysis.

高时复杂度是近距离邻里面临的最大挑战之一。尽管目前的古典和量子小世界运算法已经取得了一些改进,但当面临大量数据时,它们仍然有一个速度瓶颈。为了解决这个问题,我们建议采用一种创新算法,称为以Granulal-Ball为基础的Qaunum $k$NN(GB-Q$k$NNN) 。这种方法首先使用颗粒球,从而降低处理所需的数据大小,从而提高效率。然后,通过采用高层次可导航小世界(HNSW)方法加快搜索过程。此外,我们优化了HNSW的花费时间步骤,例如通过四分化计算距离,进一步降低构建和搜索过程的时间复杂性。通过使用颗粒球和HNSW方法的四分化,我们的方法得以利用这些处理方法,大大降低GKNN值类似算法的时间复杂性,全面的复杂性分析揭示了这一点。

Article 289

Title@2025-05-29 (4): Machine Learning Framework for Characterizing Processing-Structure Relationship in Block Copolymer Thin Films

Title: Machine Learning Framework for Characterizing Processing-Structure Relationship in Block Copolymer Thin Films

Machine Learning Framework zur Charakterisierung von Verarbeitungs-Struktur-Beziehungen in Block Copolymer Thin Films

确定胶合聚合薄薄膜加工-结构关系特征的机械学习框架 2505.23064v1

Authors: Bradley Lamb, Saroj Upreti, Yunfei Wang, Daniel Struble, Chenhui Zhu, Guillaume Freychet, Xiaodan Gu, Boran Ma

The morphology of block copolymers (BCPs) critically influences material properties and applications. This work introduces a machine learning (ML)-enabled, high-throughput framework for analyzing grazing incidence small-angle X-ray scattering (GISAXS) data and atomic force microscopy (AFM) images to characterize BCP thin film morphology. A convolutional neural network was trained to classify AFM images by morphology type, achieving 97% testing accuracy. Classified images were then analyzed to extract 2D grain size measurements from the samples in a high-throughput manner. ML models were developed to predict morphological features based on processing parameters such as solvent ratio, additive type, and additive ratio. GISAXS-based properties were predicted with strong performance ($R^2$ > 0.75), while AFM-based property predictions were less accurate ($R^2$ < 0.60), likely due to the localized nature of AFM measurements compared to the bulk information captured by GISAXS. Beyond model performance, interpretability was addressed using Shapley Additive exPlanations (SHAP). SHAP analysis revealed that the additive ratio had the largest impact on morphological predictions, as the additive provides the BCP chains with increased volume to rearrange into thermodynamically favorable morphologies. This interpretability helps validate model predictions and offers insight into parameter importance. Altogether, the presented framework combining high-throughput characterization and interpretable ML offers an approach to exploring and optimizing BCP thin film morphology across a broad processing landscape.

嵌段共聚物(BCP)的形貌对材料性能和应用有关键影响。本工作引入了一个由机器学习(ML)驱动的高通量框架,用于分析掠入射小角X射线散射(GISAXS)数据和原子力显微镜(AFM)图像,以表征BCP薄膜形貌。我们训练了一个卷积神经网络按形貌类型对AFM图像进行分类,测试准确率达到97%。随后对已分类的图像进行分析,以高通量方式从样品中提取二维晶粒尺寸。我们还开发了ML模型,根据溶剂比例、添加剂类型和添加剂比例等加工参数预测形貌特征。基于GISAXS的性质预测表现较好($R^2$ > 0.75),而基于AFM的性质预测准确度较低($R^2$ < 0.60),这可能是因为AFM测量具有局部性,而GISAXS获取的是整体信息。除模型性能外,我们还使用Shapley加性解释(SHAP)来处理可解释性问题。SHAP分析表明,添加剂比例对形貌预测的影响最大,因为添加剂为BCP链提供了更大的体积,使其能够重排为热力学上有利的形貌。这种可解释性有助于验证模型预测,并揭示各参数的重要性。总体而言,该框架将高通量表征与可解释的ML相结合,为在广阔的加工空间中探索和优化BCP薄膜形貌提供了一种途径。

Article 290

Title: Loss-Guided Model Sharing and Local Learning Correction in Decentralized Federated Learning for Crop Disease Classification

Loss-Guided Model Sharing und lokale Lernkorrektur bei dezentralisiertem Föderated Learning für die Klassifizierung von Crop Diseases

关于作物疾病分类的分散化联邦学习中损失指导模式共享和地方学习校正 2505.23063v1

Authors: Denis Mamba Kabala, Adel Hafiane, Laurent Bobelin, Raphael Canals

Crop disease detection and classification is a critical challenge in agriculture, with major implications for productivity, food security, and environmental sustainability. While deep learning models such as CNN and ViT have shown excellent performance in classifying plant diseases from images, their large-scale deployment is often limited by data privacy concerns. Federated Learning (FL) addresses this issue, but centralized FL remains vulnerable to single-point failures and scalability limits. In this paper, we introduce a novel Decentralized Federated Learning (DFL) framework that uses validation loss (Loss_val) both to guide model sharing between peers and to correct local training via an adaptive loss function controlled by a weighting parameter. We conduct extensive experiments using PlantVillage datasets with three deep learning architectures (ResNet50, VGG16, and ViT_B16), analyzing the impact of the weighting parameter, the number of shared models, the number of clients, and the use of Loss_val versus Loss_train of other clients. Results demonstrate that our DFL approach not only improves accuracy and convergence speed, but also ensures better generalization and robustness across heterogeneous data environments, making it particularly well-suited for privacy-preserving agricultural applications.

作物疾病检测和分类是农业面临的一项重大挑战,对生产力、粮食安全和环境可持续性具有重大影响。CNN和VIT等深层次学习模式在将植物疾病从图像中分类方面表现良好,但其大规模部署往往受到数据隐私问题的限制。联邦学习(FL)处理这一问题,但中央化FL仍然易受单点失灵和可缩放限制的影响。在本文中,我们引入了一个新型的分散化联邦学习(DFL)框架,该框架使用验证损失(Loss_val)来指导同龄人之间分享模型,并通过由加权参数控制的适应性损失功能来纠正当地培训。我们用三种深层学习结构(ResNet50、VGG16和VIT_B16)进行广泛的实验,分析加权参数的影响、共享模型的数量、客户数量、以及使用Lost_val相对于其他客户的损失/损失。结果显示,我们的DFL方法不仅提高精确度和趋近速度,而且还确保更加普及和稳健地贯穿不同数据环境,使其特别适合于保护隐私的应用。
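
The two roles of validation loss in the abstract above (choosing which peer models to use, and correcting the local objective) can be sketched as follows. The abstract does not state the adaptive loss, so the convex combination weighted by `lam` is an assumption, as are the function names and the toy numbers.

```python
def select_peers(val_losses, k):
    """Return the ids of the k peer models with the lowest validation loss."""
    return sorted(val_losses, key=val_losses.get)[:k]

def average_models(models):
    """Element-wise mean of equally sized parameter vectors."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]

def corrected_loss(train_loss, val_loss, lam):
    """Assumed form: convex combination controlled by lam in [0, 1]."""
    return (1.0 - lam) * train_loss + lam * val_loss

# Toy peers: parameter vectors and the loss each one gets on the
# local validation set (illustrative values).
peer_models = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [9.0, 9.0]}
peer_val_loss = {"a": 0.30, "b": 0.25, "c": 0.90}

chosen = select_peers(peer_val_loss, k=2)
aggregated = average_models([peer_models[p] for p in chosen])
```

Peer `c` is left out because its validation loss is the highest; the number of shared models `k` and the weighting parameter `lam` correspond to two of the factors the abstract says were studied.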

Article 291

Title@2025-05-29 (4): Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data

Title: Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data

Composite Flow passend zum Verstärkungslernen mit Shifted-Dynamics-Daten

与上下动动量数据匹配的强化学习综合流程 2505.23062v1

Authors: Lingkai Kong, Haichuan Wang, Tonghan Wang, Guojun Xiong, Milind Tambe

Incorporating pre-collected offline data from a source environment can significantly improve the sample efficiency of reinforcement learning (RL), but this benefit is often challenged by discrepancies between the transition dynamics of the source and target environments. Existing methods typically address this issue by penalizing or filtering out source transitions in high dynamics-gap regions. However, their estimation of the dynamics gap often relies on KL divergence or mutual information, which can be ill-defined when the source and target dynamics have disjoint support. To overcome these limitations, we propose CompFlow, a method grounded in the theoretical connection between flow matching and optimal transport. Specifically, we model the target dynamics as a conditional flow built upon the output distribution of the source-domain flow, rather than learning it directly from a Gaussian prior. This composite structure offers two key advantages: (1) improved generalization for learning target dynamics, and (2) a principled estimation of the dynamics gap via the Wasserstein distance between source and target transitions. Leveraging our principled estimation of the dynamics gap, we further introduce an optimistic active data collection strategy that prioritizes exploration in regions of high dynamics gap, and theoretically prove that it reduces the performance disparity with the optimal policy. Empirically, CompFlow outperforms strong baselines across several RL benchmarks with shifted dynamics.

从源环境预先收集的离线数据可以大大提高强化学习(RL)的抽样效率,但这一效益往往受到源与目标环境过渡动态之间的差异的挑战。现有方法通常通过惩罚或过滤高动态差距区域源的过渡来解决这一问题。但是,它们对于动态差距的估计往往依赖KL差异或相互信息,而当源和目标动态得到不连贯的支持时,这种差异或相互信息可能定义不当。为了克服这些限制,我们提议CompFlow,这是基于流动匹配与最佳运输之间理论联系的一种方法。具体地说,我们将目标动态作为基于源-地流动产出分布的有条件流动模型,而不是直接从Gausian之前的区域学习。这一综合结构提供了两个主要优势:(1) 改进学习目标动态的通用化,(2) 通过源与目标转型之间的瓦瑟斯坦距离对动态差距进行有原则性的估计。我们利用我们对动态差距的原则性估计,我们进一步引入了一种乐观的积极数据收集战略,优先在高动态差距区域进行勘探,从理论上证明它能够减少最佳政策基线之间的强性差距。
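
A small illustration of why the abstract prefers the Wasserstein distance when source and target dynamics have disjoint support: KL divergence is infinite there, while the Wasserstein distance still reports how far apart the transitions are. This sketch uses 1-D empirical samples and the sorted-matching formula for $W_1$; it is not the paper's flow-based estimator, and the sample values are made up.

```python
import math

def wasserstein_1d(xs, ys):
    """W1 between two equal-size 1-D empirical samples via sorted matching."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as dicts."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf  # q puts no mass where p does
        total += px * math.log(px / qx)
    return total

source_next = [0.0, 1.0, 2.0]     # next states under source dynamics (toy)
target_next = [10.0, 11.0, 12.0]  # next states under target dynamics (toy)

gap = wasserstein_1d(source_next, target_next)
p = {x: 1 / 3 for x in source_next}
q = {x: 1 / 3 for x in target_next}
kl = kl_divergence(p, q)
```

Here `kl` is infinite regardless of how far the supports are from each other, whereas `gap` scales with the actual shift, which is the property a dynamics-gap estimate needs.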

Article 292

Title@2025-05-29 (4): Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design

Title: Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design

Spekulative Dekodierung trifft auf Quantisierung: Kompatibilitätsbewertung und Hierarchisches Framework Design

投机性下限符合量化:兼容性评价和等级框架设计 2505.22179v2

Authors: Yudi Zhang, Weilin Zhao, Xu Han, Tiejun Zhao, Wang Xu, Hailong Cao, Conghui Zhu

Speculative decoding and quantization effectively accelerate memory-bound inference of large language models. Speculative decoding mitigates the memory bandwidth bottleneck by verifying multiple tokens within a single forward pass, which increases computational effort. Quantization achieves this optimization by compressing weights and activations into lower bit-widths and also reduces computations via low-bit matrix multiplications. To further leverage their strengths, we investigate the integration of these two techniques. Surprisingly, experiments applying the advanced speculative decoding method EAGLE-2 to various quantized models reveal that the memory benefits from 4-bit weight quantization are diminished by the computational load from speculative decoding. Specifically, verifying a tree-style draft incurs significantly more time overhead than a single-token forward pass on 4-bit weight quantized models. This finding led to our new speculative decoding design: a hierarchical framework that employs a small model as an intermediate stage to turn tree-style drafts into sequence drafts, leveraging the memory access benefits of the target quantized model. Experimental results show that our hierarchical approach achieves a 2.78$\times$ speedup across various tasks for the 4-bit weight Llama-3-70B model on an A100 GPU, outperforming EAGLE-2 by 1.31$\times$. Code available at https://github.com/AI9Stars/SpecMQuant.

投机解码和量化都能有效加速大语言模型的内存受限推理。投机解码通过在一次前向传播中验证多个token来缓解内存带宽瓶颈,但这会增加计算量。量化则通过将权重和激活压缩到更低的位宽来实现这一优化,并通过低比特矩阵乘法减少计算。为了进一步发挥两者的优势,我们研究了这两种技术的结合。令人惊讶的是,将先进的投机解码方法EAGLE-2应用于各种量化模型的实验表明,4比特权重量化带来的内存收益会被投机解码的计算负载所削弱。具体而言,在4比特权重量化模型上,验证树状草稿的时间开销明显高于单token前向传播。这一发现促使我们提出新的投机解码设计:一个分层框架,使用一个小模型作为中间阶段,将树状草稿转换为序列草稿,从而利用目标量化模型在内存访问上的优势。实验结果表明,在A100 GPU上,对于4比特权重的Llama-3-70B模型,我们的分层方法在各种任务上实现了2.78$\times$的加速,比EAGLE-2高出1.31$\times$。代码见 https://github.com/AI9Stars/SpecMQuant。
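
To make the "sequence draft" idea concrete, here is a minimal sketch of greedy verification of a linear draft, the generic speculative decoding step. It is not the paper's hierarchical framework; `verify_draft` and `toy_target` are hypothetical names, and the per-position loop stands in for what a real system does in one batched forward pass.

```python
def verify_draft(prefix, draft, target_next):
    """Accept the longest draft prefix the target agrees with.

    target_next(tokens) returns the target model's greedy next token.
    On the first disagreement the target's own token is emitted instead;
    if the whole draft is accepted, one extra target token is appended.
    """
    accepted = []
    for tok in draft:
        expected = target_next(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # correction token from the target
            return accepted
        accepted.append(tok)
    accepted.append(target_next(prefix + accepted))  # bonus token
    return accepted

def toy_target(tokens):
    """Toy target model: always continues the sequence by adding 1."""
    return tokens[-1] + 1

# The draft is right for two tokens, then diverges.
out = verify_draft([1, 2], [3, 4, 9, 10], toy_target)
```

Every call yields at least one target-approved token, so output quality matches plain greedy decoding; the speedup depends on how many draft tokens are accepted per verification.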

Article 293

Title@2025-05-29 (4): DINGO: Constrained Inference for Diffusion LLMs

Title: DINGO: Constrained Inference for Diffusion LLMs

DINGO: Beschränkte Schlussfolgerung für Diffusion LLMs

DINGO: 扩散长效LMM的连续推论 2505.23061v1

Authors: Tarun Suresh, Debangshu Banerjee, Shubham Ugare, Sasa Misailovic, Gagandeep Singh

Diffusion LLMs have emerged as a promising alternative to conventional autoregressive LLMs, offering significant potential for improved runtime efficiency. However, existing diffusion models lack the ability to provably enforce user-specified formal constraints, such as regular expressions, which makes them unreliable for tasks that require structured outputs, such as fixed-schema JSON generation. Unlike autoregressive models that generate tokens sequentially, diffusion LLMs predict a block of tokens in parallel. This parallelism makes traditional constrained decoding algorithms, which are designed for sequential token prediction, ineffective at preserving the true output distribution. To address this limitation, we propose DINGO, a dynamic programming-based constrained decoding strategy that is both efficient and provably distribution-preserving. DINGO enables sampling of output strings with the highest probability under the model’s predicted distribution, while strictly satisfying any user-specified regular expression. On standard symbolic math and JSON generation benchmarks, DINGO achieves up to a 68 percentage point improvement over unconstrained inference.

与传统自动递减的LMS相比,LMS已成为一种大有希望的替代传统自动递减的LMS,它为提高运行时间效率提供了巨大的潜力;然而,现有的推广模式缺乏能力,无法对用户指定的正式限制,例如常规表达方式,使其不适于执行需要结构化产出的任务,例如固定的JSON 生成。与自动递减模式不同,扩散LMS同时预测一系列象征性。这种平行使得传统的受限制解码算法(这些算法是为按顺序进行象征性预测而设计的,在保存真正的产出分布方面无效)。为了应对这一限制,我们建议DINGO,这是一个动态的、基于程序化的受限解码战略,既高效又可可移动的分布保存。DINGO能够根据模型预测的分布,在严格满足用户指定的任何常规表达方式时,以最高概率取样产出字符。关于标准的象征性数学和JSONS生成基准,DINGO在未受限制的推算之外,实现了68个百分点的改进。
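
A generic Viterbi-style sketch of the problem the abstract describes: given independent per-position token distributions for a block and a DFA for the regular expression, find the most probable string the DFA accepts. This is an illustration under those assumptions, not DINGO's published algorithm; the function name, the DFA encoding, and the probabilities are hypothetical.

```python
def best_constrained_string(position_probs, transitions, start, accepting):
    """Most probable token string accepted by a DFA.

    position_probs: list of dicts token -> probability, one per position
    transitions: dict (state, token) -> next state (missing = rejected)
    """
    # best[state] = (probability, tokens) of the best path reaching state
    best = {start: (1.0, [])}
    for probs in position_probs:
        nxt = {}
        for state, (p, toks) in best.items():
            for tok, pt in probs.items():
                dest = transitions.get((state, tok))
                if dest is None:
                    continue
                cand = (p * pt, toks + [tok])
                if dest not in nxt or cand[0] > nxt[dest][0]:
                    nxt[dest] = cand
        best = nxt
    finals = [v for s, v in best.items() if s in accepting]
    return max(finals, key=lambda v: v[0]) if finals else None

# DFA for the regular expression a b* over tokens {"a", "b"}.
transitions = {(0, "a"): 1, (1, "b"): 1}
block = [
    {"a": 0.4, "b": 0.6},
    {"a": 0.7, "b": 0.3},
]

greedy = [max(p, key=p.get) for p in block]  # unconstrained argmax
prob, tokens = best_constrained_string(block, transitions, start=0, accepting={1})
```

The unconstrained argmax picks `["b", "a"]`, which the expression rejects, while the dynamic program returns the best valid string. The table has one entry per DFA state, so the cost grows with block length times states times vocabulary.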

Article 294

Title@2025-05-29 (4): Improved Last-Iterate Convergence of Shuffling Gradient Methods for Nonsmooth Convex Optimization

Title: Improved Last-Iterate Convergence of Shuffling Gradient Methods for Nonsmooth Convex Optimization

Verbesserte letzte Konvergenz der schrumpfenden Gradienten-Methoden für rauchfreie Convex-Optimierung

优化非移动convex最佳化的渐进式打碎方法的改进后最后 2505.23056v1

Authors: Zijian Liu, Zhengyuan Zhou

We study the convergence of the shuffling gradient method, a popular algorithm for minimizing a regularized finite-sum function, in which the component functions are passed to (Proximal) Gradient Descent (GD) one by one, in an order determined by a permutation of the function indices. Despite its easy implementation and effective performance in practice, its theoretical understanding remains limited. A recent advance by (Liu & Zhou, 2024b) establishes the first last-iterate convergence results under various settings, especially proving the optimal rates for smooth (strongly) convex optimization. However, their bounds for nonsmooth (strongly) convex functions are only as fast as Proximal GD. In this work, we provide the first improved last-iterate analysis for the nonsmooth case, demonstrating that the widely used Random Reshuffle ($\textsf{RR}$) and Single Shuffle ($\textsf{SS}$) strategies are both provably faster than Proximal GD, reflecting the benefit of randomness. As an important implication, we give the first (nearly) optimal convergence result for the suffix average under the $\textsf{RR}$ sampling scheme in the general convex case, matching the lower bound shown by (Koren et al., 2022).

我们研究洗牌梯度方法的收敛性。这是一种用于最小化带正则项的有限和函数的常用算法:各分量函数按照由下标排列决定的顺序,逐个交给(近端)梯度下降(GD)处理。尽管该方法易于实现且在实践中效果良好,对它的理论理解仍然有限。(Liu & Zhou, 2024b) 的最新进展在多种设定下建立了首批末次迭代收敛结果,尤其证明了光滑(强)凸优化的最优速率。然而,他们对非光滑(强)凸函数给出的界仅与近端GD一样快。在本工作中,我们首次给出非光滑情形下改进的末次迭代分析,证明广泛使用的随机重排($\textsf{RR}$)和单次洗牌($\textsf{SS}$)策略都可被证明快于近端GD,体现了随机性的好处。作为一个重要推论,我们在一般凸情形下给出了 $\textsf{RR}$ 采样方案下后缀平均的首个(近似)最优收敛结果,与 (Koren et al., 2022) 给出的下界相匹配。
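
The difference between the two shuffling strategies named in the abstract is only where the permutation is drawn. A minimal sketch on a nonsmooth convex toy problem, $\sum_i |x - a_i|$, whose minimizer is the median of the $a_i$; the diminishing step size and the problem itself are assumptions for illustration, not the paper's setting or rates.

```python
import random

def subgrad_abs(x, a):
    """A subgradient of |x - a| at x (0 at the kink)."""
    return (x > a) - (x < a)

def shuffling_subgradient(anchors, epochs, step, strategy="RR", seed=0):
    """Minimize sum_i |x - a_i| with Random Reshuffle or Single Shuffle."""
    rng = random.Random(seed)
    order = list(range(len(anchors)))
    rng.shuffle(order)            # SS keeps this one permutation forever
    x = 0.0
    for epoch in range(epochs):
        if strategy == "RR":
            rng.shuffle(order)    # RR draws a fresh permutation each epoch
        eta = step / (epoch + 1)  # assumed diminishing step size
        for i in order:           # one pass touches every function once
            x -= eta * subgrad_abs(x, anchors[i])
    return x

anchors = [1.0, 2.0, 3.0, 4.0, 100.0]  # the minimizer is the median, 3.0
x_rr = shuffling_subgradient(anchors, epochs=200, step=1.0, strategy="RR")
x_ss = shuffling_subgradient(anchors, epochs=200, step=1.0, strategy="SS")
```

Both runs return the last iterate rather than an average, which is the quantity the abstract's analysis is about.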

Article 295

Title@2025-05-29 (4): CDR-Agent: Intelligent Selection and Execution of Clinical Decision Rules Using Large Language Model Agents

Title: CDR-Agent: Intelligent Selection and Execution of Clinical Decision Rules Using Large Language Model Agents

CDR-Agent: Intelligente Auswahl und Durchführung klinischer Entscheidungsregeln unter Verwendung von Large Language Model Agents

CDR-代理:明智选择和执行使用大语言示范物剂的临床决定规则 2505.23055v1

Authors: Zhen Xiang, Aliyah R. Hsu, Austin V. Zane, Aaron E. Kornblith, Margaret J. Lin-Martore, Jasmanpreet C. Kaur, Vasuda M. Dokiparthi, Bo Li, Bin Yu

Clinical decision-making is inherently complex and fast-paced, particularly in emergency departments (EDs) where critical, rapid and high-stakes decisions are made. Clinical Decision Rules (CDRs) are standardized evidence-based tools that combine signs, symptoms, and clinical variables into decision trees to make consistent and accurate diagnoses. CDR usage is often hindered by the clinician’s cognitive load, limiting their ability to quickly recall and apply the appropriate rules. We introduce CDR-Agent, a novel LLM-based system designed to enhance ED decision-making by autonomously identifying and applying the most appropriate CDRs based on unstructured clinical notes. To validate CDR-Agent, we curated two novel ED datasets: synthetic and CDR-Bench, although CDR-Agent is applicable to non-ED clinics. CDR-Agent achieves a 56.3\% (synthetic) and 8.7\% (CDR-Bench) accuracy gain relative to the standalone LLM baseline in CDR selection. Moreover, CDR-Agent significantly reduces computational overhead. Using these datasets, we demonstrated that CDR-Agent not only selects relevant CDRs efficiently, but also makes cautious yet effective imaging decisions by minimizing unnecessary interventions while successfully identifying most positively diagnosed cases, outperforming traditional LLM prompting approaches. Code for our work can be found at: https://github.com/zhenxianglance/medagent-cdr-agent

临床决策本质上复杂且节奏很快,在需要做出关键、快速和高风险决定的急诊科(ED)尤其如此。临床决策规则(CDR)是标准化的循证工具,它将体征、症状和临床变量组合成决策树,以便做出一致而准确的诊断。CDR的使用常常受到临床医生认知负荷的阻碍,限制了他们快速回忆并应用适当规则的能力。我们提出CDR-Agent,这是一个基于LLM的新系统,旨在根据非结构化临床记录自主识别并应用最合适的CDR,从而改进急诊决策。为了验证CDR-Agent,我们整理了两个新的急诊数据集:synthetic和CDR-Bench,不过CDR-Agent同样适用于非急诊诊所。在CDR选择上,相对于独立的LLM基线,CDR-Agent的准确率分别提高了56.3\%(synthetic)和8.7\%(CDR-Bench)。此外,CDR-Agent显著降低了计算开销。利用这些数据集,我们证明CDR-Agent不仅能高效地选出相关的CDR,还能做出谨慎而有效的影像检查决策:在尽量减少不必要干预的同时成功识别出大多数确诊阳性病例,优于传统的LLM提示方法。代码见:https://github.com/zhenxianglance/medagent-cdr-agent

Article 296

Title@2025-05-29 (4): Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network

Title: Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network

Lernen von suboptimalen Daten in der kontinuierlichen Kontrolle über Auto-Regressive Soft Q-Network

通过自动递减软软QNetwork, 从连续控制中的亚最佳数据中学习 2502.00288v2

Authors: Jijia Liu, Feng Gao, Qingmin Liao, Chao Yu, Yu Wang

Reinforcement learning (RL) for continuous control often requires large amounts of online interaction data. Value-based RL methods can mitigate this burden by offering relatively high sample efficiency. Some studies further enhance sample efficiency by incorporating offline demonstration data to “kick-start” training, achieving promising results in continuous control. However, they typically compute the Q-function independently for each action dimension, neglecting interdependencies and making it harder to identify optimal actions when learning from suboptimal data, such as non-expert demonstrations and online-collected data during the training process. To address these issues, we propose Auto-Regressive Soft Q-learning (ARSQ), a value-based RL algorithm that models Q-values in a coarse-to-fine, auto-regressive manner. First, ARSQ decomposes the continuous action space into discrete spaces in a coarse-to-fine hierarchy, enhancing sample efficiency for fine-grained continuous control tasks. Next, it auto-regressively predicts dimensional action advantages within each decision step, enabling more effective decision-making in continuous control tasks. We evaluate ARSQ on two continuous control benchmarks, RLBench and D4RL, integrating demonstration data into online training. On D4RL, which includes non-expert demonstrations, ARSQ achieves an average $1.62\times$ performance improvement over the SOTA value-based baseline. On RLBench, which incorporates expert demonstrations, ARSQ surpasses various baselines, demonstrating its effectiveness in learning from suboptimal online-collected data. Project page is at https://sites.google.com/view/ar-soft-q

用于连续控制的强化学习(RL)通常需要大量在线交互数据。基于价值的RL方法样本效率相对较高,可以减轻这一负担。一些研究进一步引入离线演示数据来“启动”训练,从而提高样本效率,并在连续控制中取得了有希望的结果。然而,这些方法通常对每个动作维度独立计算Q函数,忽略了维度之间的相互依赖,因此在从次优数据(例如非专家演示和训练过程中在线收集的数据)中学习时更难找到最优动作。为了解决这些问题,我们提出自回归软Q学习(ARSQ),这是一种基于价值的RL算法,以由粗到细、自回归的方式对Q值建模。首先,ARSQ按由粗到细的层次将连续动作空间分解为离散空间,提高了细粒度连续控制任务的样本效率。其次,它在每个决策步骤内自回归地预测各维度的动作优势,使连续控制任务中的决策更有效。我们在RLBench和D4RL两个连续控制基准上评估ARSQ,并将演示数据整合到在线训练中。在包含非专家演示的D4RL上,ARSQ相对于最先进的基于价值的基线平均取得了$1.62\times$的性能提升。在包含专家演示的RLBench上,ARSQ超过了多种基线,证明了它从次优的在线收集数据中学习的有效性。项目页面:https://sites.google.com/view/ar-soft-q
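
The two structural ideas in the ARSQ abstract, coarse-to-fine discretization of each action dimension and choosing dimensions one after another conditioned on earlier choices, can be sketched with a stand-in scoring function. ARSQ learns its Q-values and advantages; here `toy_q` is a hypothetical hand-written function, and the bin and level counts are arbitrary.

```python
def coarse_to_fine_argmax(q_value, low, high, bins=3, levels=4):
    """Pick a continuous action by repeatedly zooming into the best bin."""
    for _ in range(levels):
        width = (high - low) / bins
        centers = [low + (i + 0.5) * width for i in range(bins)]
        best = max(range(bins), key=lambda i: q_value(centers[i]))
        # Narrow the interval to the winning bin for the next level.
        low, high = low + best * width, low + (best + 1) * width
    return (low + high) / 2.0

def multi_dim_action(q_value, dims, low=-1.0, high=1.0):
    """Choose dimensions one at a time, conditioning on earlier choices."""
    action = []
    for _ in range(dims):
        chosen = tuple(action)
        action.append(
            coarse_to_fine_argmax(lambda a: q_value(chosen + (a,)), low, high)
        )
    return action

# Toy Q with its peak at (0.5, -0.25); it scores a partial action by
# the dimensions chosen so far.
target = (0.5, -0.25)

def toy_q(partial):
    return -sum((a - t) ** 2 for a, t in zip(partial, target))

action = multi_dim_action(toy_q, dims=2)
```

With 3 bins and 4 levels each dimension needs 12 evaluations instead of the 81 a flat grid of the same resolution would take, which is the kind of saving a coarse-to-fine hierarchy offers.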

Article 297

Title@2025-05-29 (4): DenoiseRotator: Enhance Pruning Robustness for LLMs via Importance Concentration

Title: DenoiseRotator: Enhance Pruning Robustness for LLMs via Importance Concentration

DenoiseRotator: Verbesserung der Beschneidungsfestigkeit für LLMs durch Bedeutungskonzentration

DenoisRotator:通过重视浓度提高LLMs的稳健力 2505.23049v1

Authors: Tianteng Gu, Bei Liu, Bo Xiao, Ke Zeng, Jiacheng Liu, Yanmin Qian

Pruning is a widely used technique to compress large language models (LLMs) by removing unimportant weights, but it often suffers from significant performance degradation, especially under semi-structured sparsity constraints. Existing pruning methods primarily focus on estimating the importance of individual weights, which limits their ability to preserve critical capabilities of the model. In this work, we propose a new perspective: rather than merely selecting which weights to prune, we first redistribute parameter importance to make the model inherently more amenable to pruning. By minimizing the information entropy of normalized importance scores, our approach concentrates importance onto a smaller subset of weights, thereby enhancing pruning robustness. We instantiate this idea through DenoiseRotator, which applies learnable orthogonal transformations to the model’s weight matrices. Our method is model-agnostic and can be seamlessly integrated with existing pruning techniques such as Magnitude, SparseGPT, and Wanda. Evaluated on LLaMA3, Qwen2.5, and Mistral models under 50% unstructured and 2:4 semi-structured sparsity, DenoiseRotator consistently improves perplexity and zero-shot accuracy. For instance, on LLaMA3-70B pruned with SparseGPT at 2:4 semi-structured sparsity, DenoiseRotator reduces the perplexity gap to the dense model by 58%, narrowing the degradation from 8.1 to 3.4 points. Code is available at https://github.com/Axel-gu/DenoiseRotator.

剪枝是一种广泛使用的技术,通过去除不重要的权重来压缩大语言模型(LLM),但它常常导致明显的性能下降,在半结构化稀疏约束下尤其如此。现有的剪枝方法主要侧重于估计单个权重的重要性,这限制了它们保留模型关键能力的能力。在本工作中,我们提出一个新的视角:不是仅仅选择要剪掉哪些权重,而是先重新分配参数的重要性,使模型本身更适合剪枝。通过最小化归一化重要性分数的信息熵,我们的方法将重要性集中到更小的权重子集上,从而增强剪枝的鲁棒性。我们通过DenoiseRotator实现这一想法,它对模型的权重矩阵施加可学习的正交变换。我们的方法与模型无关,可以与Magnitude、SparseGPT和Wanda等现有剪枝技术无缝结合。在50%非结构化稀疏和2:4半结构化稀疏下,对LLaMA3、Qwen2.5和Mistral模型的评估表明,DenoiseRotator能持续改善困惑度和零样本准确率。例如,对于用SparseGPT以2:4半结构化稀疏剪枝的LLaMA3-70B,DenoiseRotator将与稠密模型的困惑度差距缩小了58%,使退化从8.1点降至3.4点。代码见 https://github.com/Axel-gu/DenoiseRotator。
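
Two quantities from the abstract above are easy to compute directly: the entropy of normalized importance scores, and a 2:4 semi-structured mask. The sketch below uses weight magnitude as the importance score (the simplest of the criteria the abstract lists) and does not implement the learnable orthogonal transformations; names and values are illustrative.

```python
import math

def importance_entropy(scores):
    """Shannon entropy of importance scores normalized to sum to 1."""
    total = sum(scores)
    probs = [s / total for s in scores if s > 0]
    return -sum(p * math.log(p) for p in probs)

def prune_2_of_4(weights):
    """2:4 magnitude pruning: zero the 2 smallest of every 4 weights."""
    out = list(weights)
    for start in range(0, len(out) - len(out) % 4, 4):
        group = range(start, start + 4)
        for i in sorted(group, key=lambda j: abs(out[j]))[:2]:
            out[i] = 0.0
    return out

spread = [1.0, 1.0, 1.0, 1.0]           # importance spread evenly
concentrated = [3.9, 0.05, 0.03, 0.02]  # importance on one weight

pruned = prune_2_of_4([0.9, -0.1, 0.2, -0.8, 0.05, 0.5, -0.6, 0.01])
```

Lower entropy means the pruned half of each group carries less of the total importance, which is the intuition behind concentrating importance before pruning. The reported figures are also self-consistent: (8.1 - 3.4) / 8.1 is approximately 0.58.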

Article 298

Title@2025-05-29 (4): ProDiff: Prototype-Guided Diffusion for Minimal Information Trajectory Imputation

Title: ProDiff: Prototype-Guided Diffusion for Minimal Information Trajectory Imputation

ProDiff: Prototypen-geführte Diffusion für minimale Information Trajektorie Imputation

ProDiff: 用于最小信息轨迹截肢的原型类型辅助扩散 2505.23048v1

Authors: Tianci Bu, Le Zhou, Wenchuan Yang, Jianhong Mou, Kang Yang, Suoyi Tan, Feng Yao, Jingyuan Wang, Xin Lu

Trajectory data is crucial for various applications but often suffers from incompleteness due to device limitations and diverse collection scenarios. Existing imputation methods rely on sparse trajectory or travel information, such as velocity, to infer missing points. However, these approaches assume that sparse trajectories retain essential behavioral patterns, which places significant demands on data acquisition and overlooks the potential of large-scale human trajectory embeddings. To address this, we propose ProDiff, a trajectory imputation framework that uses only two endpoints as minimal information. It integrates prototype learning to embed human movement patterns and a denoising diffusion probabilistic model for robust spatiotemporal reconstruction. Joint training with a tailored loss function ensures effective imputation. ProDiff outperforms state-of-the-art methods, improving accuracy by 6.28\% on FourSquare and 2.52\% on WuXi. Further analysis shows a 0.927 correlation between generated and real trajectories, demonstrating the effectiveness of our approach.

对各种应用来说,轨迹数据至关重要,但由于装置限制和收集情况多种多样,数据往往不完全。现有的估算方法依靠稀少的轨迹或旅行信息,例如速度,来推断缺失点。然而,这些方法假定,稀疏的轨迹保留了基本的行为模式,对数据获取提出了重大要求,忽视了大规模人类轨迹嵌入的潜力。为解决这一问题,我们提议ProDiff,这是一个轨迹估算框架,仅使用两个端点作为最低限度的信息。它将原型学习纳入人类运动模式,并采用一个去消化的传播概率模型,以进行强大的波段重建。与特定损失函数的联合培训可以确保有效的估算。ProDiff超越了最新方法,通过6.28(FourSquare)和2.52(WuXi)提高了准确度。进一步分析显示,生成的轨迹与实际轨迹之间有0.927的关联,显示了我们的方法的有效性。

Article 299

Title@2025-05-29 (4): Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping

Title: Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping

Nicht konvexe stochastische Optimierung unter schwerfälligen Geräuschen: Optimale Konvergenz ohne gradientes Clipping

在重困噪音下非convex 斯托卡优化: 没有梯度缩放的最佳趋同 2412.19529v4

Authors: Zijian Liu, Zhengyuan Zhou

Recently, heavy-tailed noise in first-order nonconvex stochastic optimization has received considerable attention, since many empirical observations suggest it is a more realistic condition. Specifically, the stochastic noise (the difference between the stochastic and true gradient) is considered to have only a finite $\mathfrak{p}$-th moment where $\mathfrak{p}\in\left(1,2\right]$ instead of assuming it always satisfies the classical finite variance assumption. To deal with this more challenging setting, people have proposed different algorithms and proved them to converge at an optimal $\mathcal{O}(T^{\frac{1-\mathfrak{p}}{3\mathfrak{p}-2}})$ rate for smooth objectives after $T$ iterations. Notably, all these newly designed algorithms are based on the same technique: gradient clipping. Naturally, one may want to know whether the clipping method is a necessary ingredient and the only way to guarantee convergence under heavy-tailed noises. In this work, by revisiting the existing Batched Normalized Stochastic Gradient Descent with Momentum (Batched NSGDM) algorithm, we provide the first convergence result under heavy-tailed noises but without gradient clipping. Concretely, we prove that Batched NSGDM can achieve the optimal $\mathcal{O}(T^{\frac{1-\mathfrak{p}}{3\mathfrak{p}-2}})$ rate even under the relaxed smooth condition. More interestingly, we also establish the first $\mathcal{O}(T^{\frac{1-\mathfrak{p}}{2\mathfrak{p}}})$ convergence rate in the case where the tail index $\mathfrak{p}$ is unknown in advance, which is arguably the common scenario in practice.

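近来,一阶非凸随机优化中的重尾噪声受到了大量关注,因为许多经验观察表明这是一个更符合实际的条件。具体而言,随机噪声(随机梯度与真实梯度之差)只被假设具有有限的 $\mathfrak{p}$ 阶矩,其中 $\mathfrak{p}\in\left(1,2\right]$,而不再假设它总是满足经典的有限方差条件。针对这一更具挑战性的设定,人们提出了不同的算法,并证明它们对光滑目标在 $T$ 次迭代后以最优的 $\mathcal{O}(T^{\frac{1-\mathfrak{p}}{3\mathfrak{p}-2}})$ 速率收敛。值得注意的是,所有这些新设计的算法都基于同一种技术:梯度裁剪。人们自然想知道,裁剪是否是必要的组成部分,是否是在重尾噪声下保证收敛的唯一途径。在本工作中,我们重新审视已有的带动量的批量归一化随机梯度下降(Batched NSGDM)算法,给出了重尾噪声下不使用梯度裁剪的首个收敛结果。具体而言,我们证明即使在松弛光滑条件下,Batched NSGDM也能达到最优的 $\mathcal{O}(T^{\frac{1-\mathfrak{p}}{3\mathfrak{p}-2}})$ 速率。更有趣的是,在尾指数 $\mathfrak{p}$ 事先未知的情形下(这可以说是实践中的常见情形),我们还建立了首个 $\mathcal{O}(T^{\frac{1-\mathfrak{p}}{2\mathfrak{p}}})$ 收敛速率。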
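
A minimal sketch of a batched normalized SGD with momentum update, assuming the common form $m \leftarrow \beta m + (1-\beta) g$, $x \leftarrow x - \eta\, m / \|m\|$ with $g$ a mini-batch average; the paper's exact parameter schedule is not given in the abstract. The toy objective, the Pareto noise model, and all names are illustrative. The two helper functions evaluate the rate exponents quoted in the abstract.

```python
import math
import random

def batched_nsgdm(grad_sample, x0, steps, lr, beta, batch_size, seed=0):
    """Normalized SGD with momentum on a mini-batch gradient, no clipping."""
    rng = random.Random(seed)
    x = list(x0)
    m = [0.0] * len(x)
    for _ in range(steps):
        batch = [grad_sample(x, rng) for _ in range(batch_size)]
        g = [sum(col) / batch_size for col in zip(*batch)]
        m = [beta * mi + (1.0 - beta) * gi for mi, gi in zip(m, g)]
        norm = math.sqrt(sum(mi * mi for mi in m))
        if norm > 0.0:
            # Normalizing fixes the step length at lr, however large g is.
            x = [xi - lr * mi / norm for xi, mi in zip(x, m)]
    return x

def noisy_quadratic_grad(x, rng):
    """Gradient of 0.5 * ||x||^2 plus symmetric Pareto noise (shape 1.5)."""
    return [xi + rng.choice((-1.0, 1.0)) * rng.paretovariate(1.5) for xi in x]

def exponent_known_p(p):
    """Exponent of T in the rate when the tail index p is known."""
    return (1.0 - p) / (3.0 * p - 2.0)

def exponent_unknown_p(p):
    """Exponent of T in the rate when p is unknown in advance."""
    return (1.0 - p) / (2.0 * p)

x_final = batched_nsgdm(noisy_quadratic_grad, [5.0, -5.0], steps=2000,
                        lr=0.01, beta=0.9, batch_size=8)
```

Pareto noise with shape 1.5 has infinite variance, so it falls in the heavy-tailed regime the abstract targets. At $\mathfrak{p} = 2$ both exponents equal $-1/4$; for smaller $\mathfrak{p}$ the known-index rate is the faster of the two.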

Article 300

Title@2025-05-29 (4): From Theory to Application: Fine-Tuning Large EEG Model with Real-World Stress Data

Title: From Theory to Application: Fine-Tuning Large EEG Model with Real-World Stress Data

Von der Theorie zur Anwendung: Feintuning-Großes EEG-Modell mit realen Stressdaten

从理论到应用:使用现实世界应激数据精美应用大型电子EEG模型 2505.23042v1

Authors: Siwen Wang, Shitou Zhang, Wan-Lin Chen, Dung Truong, Tzyy-Ping Jung

Recent advancements in Large Language Models have inspired the development of foundation models across various domains. In this study, we evaluate the efficacy of Large EEG Models (LEMs) by fine-tuning LaBraM, a state-of-the-art foundation EEG model, on a real-world stress classification dataset collected in a graduate classroom. Unlike previous studies that primarily evaluate LEMs using data from controlled clinical settings, our work assesses their applicability to real-world environments. We train a binary classifier that distinguishes between normal and elevated stress states using resting-state EEG data recorded from 18 graduate students during a class session. The best-performing fine-tuned model achieves a balanced accuracy of 90.47% with a 5-second window, significantly outperforming traditional stress classifiers in both accuracy and inference efficiency. We further evaluate the robustness of the fine-tuned LEM under random data shuffling and reduced channel counts. These results demonstrate the capability of LEMs to effectively process real-world EEG data and highlight their potential to revolutionize brain-computer interface applications by shifting the focus from model-centric to data-centric design.
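
The abstract above reports balanced accuracy rather than plain accuracy. A short sketch of the difference, using made-up labels: balanced accuracy is the mean of per-class recall, so it is not inflated when one class dominates.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels: 0 = normal, 1 = elevated stress (illustrative only).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
```

In this toy case plain accuracy is 0.9 while balanced accuracy is 0.75, because half of the minority class is missed.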

Article 301

Title@2025-05-29 (4): TINED: GNNs-to-MLPs by Teacher Injection and Dirichlet Energy Distillation

Title: TINED: GNNs-to-MLPs by Teacher Injection and Dirichlet Energy Distillation

TINED: GNNs-to-MLPs von Lehrerinjektion und Dirichlet Energy Destillation

TINED:通过教师注射和稀释能源蒸馏,将GNNs改为MLP 2412.11180v3

Authors: Ziang Zhou, Zhihao Ding, Jieming Shi, Qing Li, Shiqi Shen

Graph Neural Networks (GNNs) are pivotal in graph-based learning, particularly excelling in node classification. However, their scalability is hindered by the need for multi-hop data during inference, limiting their application in latency-sensitive scenarios. Recent efforts to distill GNNs into multi-layer perceptrons (MLPs) for faster inference often underutilize the layer-level insights of GNNs. In this paper, we present TINED, a novel approach that distills GNNs to MLPs on a layer-by-layer basis using Teacher Injection and Dirichlet Energy Distillation techniques. We focus on two key operations in GNN layers: feature transformation (FT) and graph propagation (GP). We recognize that FT is computationally equivalent to a fully-connected (FC) layer in MLPs. Thus, we propose directly transferring teacher parameters from an FT in a GNN to an FC layer in the student MLP, enhanced by fine-tuning. In TINED, the FC layers in an MLP replicate the sequence of FTs and GPs in the GNN. We also establish a theoretical bound for GP approximation. Furthermore, we note that FT and GP operations in GNN layers often exhibit opposing smoothing effects: GP is aggressive, while FT is conservative. Using Dirichlet energy, we develop a DE ratio to measure these effects and propose Dirichlet Energy Distillation to convey these characteristics from GNN layers to MLP layers. Extensive experiments show that TINED outperforms GNNs and leading distillation methods across various settings and seven datasets. Source code is available at https://github.com/scottjiao/TINED_ICML25/.
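
To make the smoothing contrast concrete, here is a minimal sketch of Dirichlet energy on a toy graph. It assumes the unnormalized form E(X) = sum over edges (i, j) of ||x_i - x_j||^2 and a ratio E(output) / E(input). The paper's exact normalization and DE-ratio definition may differ, and `mean_propagate` is a hypothetical stand-in for a GP operation.

```python
def dirichlet_energy(features, edges):
    """Sum of squared feature differences over undirected edges."""
    total = 0.0
    for i, j in edges:
        total += sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
    return total


def mean_propagate(features, edges):
    """Hypothetical GP step: average each node with its neighbours (self included)."""
    n = len(features)
    neighbours = {i: [i] for i in range(n)}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    dim = len(features[0])
    return [
        [sum(features[k][d] for k in neighbours[i]) / len(neighbours[i]) for d in range(dim)]
        for i in range(n)
    ]


def de_ratio(before, after, edges):
    """Energy after an operation divided by energy before it (< 1 means smoothing)."""
    return dirichlet_energy(after, edges) / dirichlet_energy(before, edges)


# Path graph 0 - 1 - 2 with one scalar feature per node.
edges = [(0, 1), (1, 2)]
x = [[0.0], [1.0], [4.0]]
smoothed = mean_propagate(x, edges)
ratio = de_ratio(x, smoothed, edges)
```

In this toy case the propagation step gives a ratio well below 1, matching the abstract's description of GP as an aggressive smoother. A feature transformation would be measured the same way by passing its input and output to `de_ratio`.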

Article 302

Title@2025-05-29 (4): One Model for One Graph: A New Perspective for Pretraining with Cross-domain Graphs

Title: One Model for One Graph: A New Perspective for Pretraining with Cross-domain Graphs

Ein Modell für einen Graphen: Eine neue Perspektive für das Pretraining mit domänenübergreifenden Graphen

一图一模型:带有跨领域图的训练前新视角 2412.00315v2

Authors: Jingzhe Liu, Haitao Mao, Zhikai Chen, Bingheng Li, Wenqi Fan, Mingxuan Ju, Tong Zhao, Neil Shah, Jiliang Tang

Graph Neural Networks (GNNs) have emerged as a powerful tool to capture intricate network patterns, achieving success across different domains. However, existing GNNs require careful domain-specific architecture designs and training from scratch on each dataset, leading to an expertise-intensive process with difficulty in generalizing across graphs from different domains. Therefore, it can be hard for practitioners to infer which GNN model can generalize well to graphs from their domains. To address this challenge, we propose a novel cross-domain pretraining framework, “one model for one graph,” which overcomes the limitations of previous approaches that failed to use a single GNN to capture diverse graph patterns across domains with significant gaps. Specifically, we pretrain a bank of expert models, with each one corresponding to a specific dataset. When inferring to a new graph, gating functions choose a subset of experts to effectively integrate prior model knowledge while avoiding negative transfer. Extensive experiments consistently demonstrate the superiority of our proposed method on both link prediction and node classification tasks.

Article 303

Title: Cross-modal RAG: Sub-dimensional Retrieval-Augmented Text-to-Image Generation

Cross-modal RAG: Sub-dimensionale Retrieval-Augmented Text-to-Image Generation

跨模式RAG:次二维检索增强的文本到图像生成 2505.21956v2

Authors: Mengdan Zhu, Senhao Cheng, Guangji Bai, Yifei Zhang, Liang Zhao

Text-to-image generation increasingly demands access to domain-specific, fine-grained, and rapidly evolving knowledge that pretrained models cannot fully capture. Existing Retrieval-Augmented Generation (RAG) methods attempt to address this by retrieving globally relevant images, but they fail when no single image contains all desired elements from a complex user query. We propose Cross-modal RAG, a novel framework that decomposes both queries and images into sub-dimensional components, enabling subquery-aware retrieval and generation. Our method introduces a hybrid retrieval strategy - combining a sub-dimensional sparse retriever with a dense retriever - to identify a Pareto-optimal set of images, each contributing complementary aspects of the query. During generation, a multimodal large language model is guided to selectively condition on relevant visual features aligned to specific subqueries, ensuring subquery-aware image synthesis. Extensive experiments on MS-COCO, Flickr30K, WikiArt, CUB, and ImageNet-LT demonstrate that Cross-modal RAG significantly outperforms existing baselines in both retrieval and generation quality, while maintaining high efficiency.
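
The idea of selecting a Pareto-optimal set of images, each covering different parts of the query, can be sketched as below. This is an illustration only: the per-subquery scores are made-up numbers, the image names are hypothetical, and the paper's hybrid sparse and dense retriever is replaced by a plain non-dominance check.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return names of candidates whose score vectors no other candidate dominates."""
    front = []
    for name, scores in candidates.items():
        dominated = any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical match scores against three subqueries of one user query:
# "a red vintage car" / "parked on a beach" / "at sunset".
images = {
    "img_car": [0.9, 0.1, 0.2],
    "img_beach": [0.1, 0.8, 0.7],
    "img_sunset": [0.0, 0.3, 0.9],
    "img_blurry": [0.1, 0.3, 0.5],
}
selected = pareto_front(images)
```

Here `img_blurry` is dropped because `img_beach` matches every subquery at least as well, while the other three images are kept because each is best on some subquery.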

Article 304

Title@2025-05-29 (4): Case-Based Reasoning Enhances the Predictive Power of LLMs in Drug-Drug Interaction

Title: Case-Based Reasoning Enhances the Predictive Power of LLMs in Drug-Drug Interaction

Case-Based Reasoning verbessert die vorausschauende Kraft von LLMs in der Arzneimittel-Drogen-Interaktion

以个案为依据的理由加强药物-药物相互作用LLMs的预测能力 2505.23034v1

Authors: Guangyi Liu, Yongqi Zhang, Xunyuan Liu, Quanming Yao

Drug-drug interaction (DDI) prediction is critical for treatment safety. While large language models (LLMs) show promise in pharmaceutical tasks, their effectiveness in DDI prediction remains challenging. Inspired by the well-established clinical practice where physicians routinely reference similar historical cases to guide their decisions through case-based reasoning (CBR), we propose CBR-DDI, a novel framework that distills pharmacological principles from historical cases to improve LLM reasoning for DDI tasks. CBR-DDI constructs a knowledge repository by leveraging LLMs to extract pharmacological insights and graph neural networks (GNNs) to model drug associations. A hybrid retrieval mechanism and dual-layer knowledge-enhanced prompting allow LLMs to effectively retrieve and reuse relevant cases. We further introduce a representative sampling strategy for dynamic case refinement. Extensive experiments demonstrate that CBR-DDI achieves state-of-the-art performance, with a significant 28.7% accuracy improvement over both popular LLMs and the CBR baseline, while maintaining high interpretability and flexibility.

Article 305

Title@2025-05-29 (4): Exploring the Limitations of Mamba in COPY and CoT Reasoning

Title: Exploring the Limitations of Mamba in COPY and CoT Reasoning

Erforschung der Grenzen von Mamba in COPY und CoT Reasoning

探索COPY和COT理由解释中Mamba的局限性 2410.03810v3

Authors: Ruifeng Ren, Zhicong Li, Yong Liu

Transformers have become the backbone of modern Large Language Models (LLMs); however, their inference overhead grows linearly with the sequence length, posing challenges for modeling long sequences. In light of this, Mamba has attracted attention for maintaining a constant inference size, with empirical evidence demonstrating that it can match Transformer performance in sequence modeling while significantly reducing computational costs. However, an open question remains: can Mamba always bring savings while achieving performance comparable to Transformers? In this paper, we focus on analyzing the expressive ability of Mamba to perform our defined COPY operation and Chain of Thought (CoT) reasoning. First, inspired by the connection between Mamba and linear attention, we show that constant-sized Mamba may struggle to perform COPY operations while Transformers can handle them more easily. However, when the size of Mamba grows linearly with the input sequence length, it can accurately perform COPY, but in this case, Mamba no longer provides overhead savings. Based on this observation, we further analyze Mamba’s ability to tackle CoT tasks, which can be formulated as Dynamic Programming (DP) problems. Our findings suggest that to solve arbitrary DP problems, the total cost of Mamba is still comparable to standard Transformers. However, similar to efficient Transformers, when facing DP problems with favorable properties such as locality, Mamba can provide savings in overhead. Our experiments on the copy and CoT tasks further demonstrate Mamba’s limitations compared to Transformers in learning these tasks.

Article 306

Title@2025-05-29 (4): AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge

Title: AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge

AntiLeakBench: Datenkontamination durch automatisches Konstruieren von Benchmarks mit aktualisiertem Real-World-Wissen verhindern

防止泄漏:利用最新现实世界知识自动建立基准,防止数据污染 2412.13670v2

Authors: Xiaobao Wu, Liangming Pan, Yuxi Xie, Ruiwen Zhou, Shuai Zhao, Yubo Ma, Mingzhe Du, Rui Mao, Anh Tuan Luu, William Yang Wang

Data contamination hinders fair LLM evaluation by introducing test data into newer models’ training sets. Existing studies solve this challenge by updating benchmarks with newly collected data. However, they fail to guarantee contamination-free evaluation as the newly collected data may contain pre-existing knowledge, and their benchmark updates rely on intensive human labor. To address these issues, in this paper we propose AntiLeak-Bench, an automated anti-leakage benchmarking framework. Instead of simply using newly collected data, we construct samples with explicitly new knowledge absent from LLMs’ training sets, which ensures strictly contamination-free evaluation. We further design a fully automated workflow to build and update our benchmark without human labor. This significantly reduces the cost of benchmark maintenance to accommodate emerging LLMs. Through extensive experiments, we highlight that data contamination likely exists before LLMs’ cutoff time and demonstrate that AntiLeak-Bench effectively overcomes this challenge.

Article 307

Title@2025-05-29 (4): Bayesian Neural Scaling Laws Extrapolation with Prior-Fitted Networks

Title: Bayesian Neural Scaling Laws Extrapolation with Prior-Fitted Networks

Bayesische Neural Scaling-Gesetze Extrapolation mit vormontierten Netzwerken

Bayesian神经扩增法与事先确定网络的外推法 2505.23032v1

Authors: Dongwoo Lee, Dong Bok Lee, Steven Adriaensen, Juho Lee, Sung Ju Hwang, Frank Hutter, Seon Joo Kim, Hae Beom Lee

Scaling has been a major driver of recent advancements in deep learning. Numerous empirical studies have found that scaling laws often follow a power law and have proposed several variants of power-law functions to predict the scaling behavior at larger scales. However, existing methods mostly rely on point estimation and do not quantify uncertainty, which is crucial for real-world applications involving decision-making problems such as determining the expected performance improvements achievable by investing additional computational resources. In this work, we explore a Bayesian framework based on Prior-data Fitted Networks (PFNs) for neural scaling law extrapolation. Specifically, we design a prior distribution that enables the sampling of infinitely many synthetic functions resembling real-world neural scaling laws, allowing our PFN to meta-learn the extrapolation. We validate the effectiveness of our approach on real-world neural scaling laws, comparing it against both the existing point estimation methods and Bayesian approaches. Our method demonstrates superior performance, particularly in data-limited scenarios such as Bayesian active learning, underscoring its potential for reliable, uncertainty-aware extrapolation in practical applications.

Article 308

Title@2025-05-29 (4): Diverse Prototypical Ensembles Improve Robustness to Subpopulation Shift

Title: Diverse Prototypical Ensembles Improve Robustness to Subpopulation Shift

Unterschiedliche prototypische Ensembles verbessern die Robustheit der Subpopulationsverschiebung

提高亚人口变换能力 2505.23027v1

Authors: Minh Nguyen Nhat To, Paul F. R. Wilson, Viet Nguyen, Mohamed Harmanani, Michael Cooper, Fahimeh Fooladgar, Purang Abolmaesumi, Parvin Mousavi, Rahul G. Krishnan

Subpopulation shift, characterized by a disparity in subpopulation distribution between the training and target datasets, can significantly degrade the performance of machine learning models. Current solutions to subpopulation shift involve modifying empirical risk minimization with re-weighting strategies to improve generalization. This strategy relies on assumptions about the number and nature of subpopulations and annotations on group membership, which are unavailable for many real-world datasets. Instead, we propose using an ensemble of diverse classifiers to adaptively capture risk associated with subpopulations. Given a feature extractor network, we replace its standard linear classification layer with a mixture of prototypical classifiers, where each member is trained to classify the data while focusing on different features and samples from other members. In empirical evaluation on nine real-world datasets, covering diverse domains and kinds of subpopulation shift, our method of Diverse Prototypical Ensembles (DPEs) often outperforms the prior state-of-the-art in worst-group accuracy. The code is available at https://github.com/minhto2802/dpe4subpop

Article 309

Title@2025-05-29 (4): Graph Wave Networks

Title: Graph Wave Networks

Graphische Wellennetze

图图波网络 2505.20034v2

Authors: Juwei Yue, Haikuo Li, Jiawei Sheng, Yihan Guo, Xinghua Zhang, Chuan Zhou, Tingwen Liu, Li Guo

Dynamics modeling has been introduced as a novel paradigm in message passing (MP) of graph neural networks (GNNs). Existing methods consider MP between nodes as a heat diffusion process, and leverage the heat equation to model the temporal evolution of nodes in the embedding space. However, the heat equation can hardly depict the wave nature of graph signals in graph signal processing. Besides, the heat equation is essentially a partial differential equation (PDE) involving a first partial derivative with respect to time, whose numerical solution usually has low stability and leads to inefficient model training. In this paper, we would like to depict more wave details in MP, since graph signals are essentially wave signals that can be seen as a superposition of a series of waves in the form of eigenvectors. This motivates us to consider MP as a wave propagation process to capture the temporal evolution of wave signals in the space. Based on the wave equation in physics, we develop a graph wave equation to leverage wave propagation on graphs. In detail, we demonstrate that the graph wave equation can be connected to traditional spectral GNNs, facilitating the design of graph wave networks (GWNs) based on various Laplacians and enhancing the performance of spectral GNNs. Besides, the graph wave equation is a PDE involving a second partial derivative with respect to time, which has stronger stability on graphs than the heat equation, which involves only a first partial derivative with respect to time. Additionally, we theoretically prove that the numerical solution derived from the graph wave equation is constantly stable, which significantly enhances model efficiency while preserving performance. Extensive experiments show that GWNs achieve SOTA and efficient performance on benchmark datasets, and exhibit outstanding performance in addressing challenging graph problems, such as over-smoothing and heterophily.

Article 310

Title@2025-05-29 (4): Offline Learning for Combinatorial Multi-armed Bandits

Title: Offline Learning for Combinatorial Multi-armed Bandits

Offline-Lernen für kombinatorische Multi-Armed Bandits

多武装混合强盗离线学习 2501.19300v2

Authors: Xutong Liu, Xiangxiang Dai, Jinhang Zuo, Siwei Wang, Carlee Joe-Wong, John C. S. Lui, Wei Chen

The combinatorial multi-armed bandit (CMAB) is a fundamental sequential decision-making framework, extensively studied over the past decade. However, existing work primarily focuses on the online setting, overlooking the substantial costs of online interactions and the readily available offline datasets. To overcome these limitations, we introduce Off-CMAB, the first offline learning framework for CMAB. Central to our framework is the combinatorial lower confidence bound (CLCB) algorithm, which combines pessimistic reward estimations with combinatorial solvers. To characterize the quality of offline datasets, we propose two novel data coverage conditions and prove that, under these conditions, CLCB achieves a near-optimal suboptimality gap, matching the theoretical lower bound up to a logarithmic factor. We validate Off-CMAB through practical applications, including learning to rank, large language model (LLM) caching, and social influence maximization, showing its ability to handle nonlinear reward functions, general feedback models, and out-of-distribution action samples that exclude optimal or even feasible actions. Extensive experiments on synthetic and real-world datasets further highlight the superior performance of CLCB.
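
The combination of pessimistic reward estimates with a combinatorial solver can be sketched as follows. This is a minimal illustration under stated assumptions: rewards lie in [0, 1], the confidence radius is a Hoeffding-style term, and the solver is a toy top-k oracle. The paper's CLCB may use a different bound and solver, and all names here are hypothetical.

```python
import math


def lower_confidence_bounds(samples_per_arm, delta=0.05):
    """Pessimistic (lower-confidence-bound) mean estimate for every base arm.

    samples_per_arm maps arm id -> list of observed rewards in [0, 1].
    Arms with no offline data get the most pessimistic value, 0.0.
    """
    num_arms = len(samples_per_arm)
    lcb = {}
    for arm, rewards in samples_per_arm.items():
        n = len(rewards)
        if n == 0:
            lcb[arm] = 0.0
            continue
        mean = sum(rewards) / n
        # Assumed Hoeffding-style radius with a union bound over arms.
        radius = math.sqrt(math.log(2 * num_arms / delta) / (2 * n))
        lcb[arm] = max(0.0, mean - radius)
    return lcb


def top_k_oracle(scores, k):
    """Toy combinatorial solver: pick the k arms with the largest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


offline_data = {
    "a": [1.0] * 200,   # well covered, high reward
    "b": [1.0, 1.0],    # high empirical mean but barely covered
    "c": [0.6] * 200,   # well covered, medium reward
}
lcb = lower_confidence_bounds(offline_data)
chosen = top_k_oracle(lcb, k=2)
```

Arm `b` has the same empirical mean as `a` but only two samples, so its lower bound collapses and the oracle prefers the well-covered arm `c`. This is the sense in which pessimism guards against poorly covered actions in the offline data.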

Article 311

Title@2025-05-29 (4): An Empirical Study of Federated Prompt Learning for Vision Language Model

Title: An Empirical Study of Federated Prompt Learning for Vision Language Model

Eine empirische Studie über Federated Prompt Learning for Vision Language Model

联邦快速学习促进愿景语言模式经验研究 2505.23024v1

Authors: Zhihao Wang, Wenke Huang, Tian Chen, Zekun Shi, Guancheng Wan, Yu Qiao, Bin Yang, Jian Wang, Bing Li, Mang Ye

The Vision Language Model (VLM) excels in aligning vision and language representations, and prompt learning has emerged as a key technique for adapting such models to downstream tasks. However, the application of prompt learning with VLM in federated learning (FL) scenarios remains underexplored. This paper systematically investigates the behavioral differences between language prompt learning (LPT) and vision prompt learning (VPT) under data heterogeneity challenges, including label skew and domain shift. We conduct extensive experiments to evaluate the impact of various FL and prompt configurations, such as client scale, aggregation strategies, and prompt length, to assess the robustness of Federated Prompt Learning (FPL). Furthermore, we explore strategies for enhancing prompt learning in complex scenarios where label skew and domain shift coexist, including leveraging both prompt types when computational resources allow. Our findings offer practical insights into optimizing prompt learning in federated settings, contributing to the broader deployment of VLMs in privacy-preserving environments.

Article 312

Title@2025-05-29 (4): GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning

Title: GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning

GuardAgent: LLM-Agenten durch einen Guard Agent durch wissensgestützte Vernunft schützen

警卫人员:由警卫人员通过 “ 知识化理由 “ 保护有限责任公司代理 2406.09187v3

Authors: Zhen Xiang, Linzhi Zheng, Yanjie Li, Junyuan Hong, Qinbin Li, Han Xie, Jiawei Zhang, Zidi Xiong, Chulin Xie, Carl Yang, Dawn Song, Bo Li

The rapid advancement of large language model (LLM) agents has raised new concerns regarding their safety and security. In this paper, we propose GuardAgent, the first guardrail agent to protect target agents by dynamically checking whether their actions satisfy given safety guard requests. Specifically, GuardAgent first analyzes the safety guard requests to generate a task plan, and then maps this plan into guardrail code for execution. By performing the code execution, GuardAgent can deterministically follow the safety guard request and safeguard target agents. In both steps, an LLM is utilized as the reasoning component, supplemented by in-context demonstrations retrieved from a memory module storing experiences from previous tasks. In addition, we propose two novel benchmarks: EICU-AC benchmark to assess the access control for healthcare agents and Mind2Web-SC benchmark to evaluate the safety policies for web agents. We show that GuardAgent effectively moderates the violation actions for different types of agents on these two benchmarks with over 98% and 83% guardrail accuracies, respectively. Project page: https://guardagent.github.io/

Article 313

Title@2025-05-29 (4): SCORPIO: Serving the Right Requests at the Right Time for Heterogeneous SLOs in LLM Inference

Title: SCORPIO: Serving the Right Requests at the Right Time for Heterogeneous SLOs in LLM Inference

SCORPIO: Den richtigen Anfragen zur richtigen Zeit für heterogene SLOs in LLM-Schlussfolgerung dienen

在LLM推理中异基因性溶液的适当时间满足正确的要求 2505.23022v1

Authors: Yinghao Tang, Tingfeng Lan, Xiuqi Huang, Hui Lu, Wei Chen

Existing Large Language Model (LLM) serving systems prioritize maximum throughput. They often neglect Service Level Objectives (SLOs) such as Time to First Token (TTFT) and Time Per Output Token (TPOT), which leads to suboptimal SLO attainment. This paper introduces SCORPIO, an SLO-oriented LLM serving system designed to maximize system goodput and SLO attainment for workloads with heterogeneous SLOs. Our core insight is to exploit SLO heterogeneity for adaptive scheduling across admission control, queue management, and batch selection. SCORPIO features a TTFT Guard, which employs least-deadline-first reordering and rejects unattainable requests, and a TPOT Guard, which utilizes a VBS-based admission control and a novel credit-based batching mechanism. Both guards are supported by a predictive module. Evaluations demonstrate that SCORPIO improves system goodput by up to 14.4X and SLO adherence by up to 46.5% compared to state-of-the-art baselines.
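
The TTFT Guard's two behaviours, least-deadline-first reordering and rejection of unattainable requests, can be sketched as below. This is a simplified illustration: requests are served one after another, the predicted prefill time is assumed to be given (the paper uses a predictive module), and the class and function names are hypothetical rather than SCORPIO's API.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    arrival: float      # arrival time, seconds
    ttft_slo: float     # allowed time to first token, seconds
    est_prefill: float  # predicted prefill time, seconds (assumed given)

    @property
    def deadline(self) -> float:
        return self.arrival + self.ttft_slo


def ttft_guard(queue, now):
    """Order waiting requests by earliest deadline and drop unattainable ones.

    Returns (admitted, rejected). A request is unattainable if, served after
    the requests admitted ahead of it, its first token would miss the deadline.
    """
    admitted, rejected = [], []
    clock = now
    for req in sorted(queue, key=lambda r: r.deadline):
        finish = clock + req.est_prefill
        if finish > req.deadline:
            rejected.append(req)
        else:
            admitted.append(req)
            clock = finish
    return admitted, rejected


queue = [
    Request("chat", arrival=0.0, ttft_slo=0.5, est_prefill=0.2),
    Request("batch", arrival=0.0, ttft_slo=5.0, est_prefill=1.0),
    Request("late", arrival=0.0, ttft_slo=0.3, est_prefill=0.4),
]
admitted, rejected = ttft_guard(queue, now=0.1)
```

Rejecting `late` early frees capacity for `chat` and `batch`, both of which can still meet their deadlines. That is the goodput-oriented trade-off the abstract describes.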

Article 314

Title@2025-05-29 (4): SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models

Title: SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models

SciHorizon: Benchmarking von KI-für-Science Readiness von wissenschaftlichen Daten zu großen Sprachmodellen

SciHorizon:将AI-SciHorizon科学准备程度从科学数据基准确定为大语言模式 2503.13503v3

Authors: Chuan Qin, Xin Chen, Chengrui Wang, Pengmin Wu, Xi Chen, Yihang Cheng, Jingyi Zhao, Meng Xiao, Xiangchao Dong, Qingqing Long, Boya Pan, Han Wu, Chengzan Li, Yuanchun Zhou, Hui Xiong, Hengshu Zhu

In recent years, the rapid advancement of Artificial Intelligence (AI) technologies, particularly Large Language Models (LLMs), has revolutionized the paradigm of scientific discovery, establishing AI-for-Science (AI4Science) as a dynamic and evolving field. However, there is still a lack of an effective framework for the overall assessment of AI4Science, particularly from a holistic perspective on data quality and model capability. Therefore, in this study, we propose SciHorizon, a comprehensive assessment framework designed to benchmark the readiness of AI4Science from both scientific data and LLM perspectives. First, we introduce a generalizable framework for assessing AI-ready scientific data, encompassing four key dimensions: Quality, FAIRness, Explainability, and Compliance, which are subdivided into 15 sub-dimensions. Drawing on data resource papers published between 2018 and 2023 in peer-reviewed journals, we present recommendation lists of AI-ready datasets for Earth, Life, and Materials Sciences, making a novel and original contribution to the field. Concurrently, to assess the capabilities of LLMs across multiple scientific disciplines, we establish 16 assessment dimensions based on five core indicators (Knowledge, Understanding, Reasoning, Multimodality, and Values) spanning Mathematics, Physics, Chemistry, Life Sciences, and Earth and Space Sciences. Using the developed benchmark datasets, we have conducted a comprehensive evaluation of over 50 representative open-source and closed-source LLMs. All the results are publicly available and can be accessed online at www.scihorizon.cn/en.

Article 315

Title@2025-05-29 (4): BECAME: BayEsian Continual Learning with Adaptive Model MErging

Title: BECAME: BayEsian Continual Learning with Adaptive Model MErging

BECAME: BayEsian Continual Learning mit adaptivem Modell-Merging

BECAME: 采用适应性示范招生模型的巴伊连续学习 2504.02666v2

Authors: Mei Li, Yuxiang Lu, Qinyan Dai, Suizhi Huang, Yue Ding, Hongtao Lu

Continual Learning (CL) strives to learn incrementally across tasks while mitigating catastrophic forgetting. A key challenge in CL is balancing stability (retaining prior knowledge) and plasticity (learning new tasks). While representative gradient projection methods ensure stability, they often limit plasticity. Model merging techniques offer promising solutions, but prior methods typically rely on empirical assumptions and carefully selected hyperparameters. In this paper, we explore the potential of model merging to enhance the stability-plasticity trade-off, providing theoretical insights that underscore its benefits. Specifically, we reformulate the merging mechanism using Bayesian continual learning principles and derive a closed-form solution for the optimal merging coefficient that adapts to the diverse characteristics of tasks. To validate our approach, we introduce a two-stage framework named BECAME, which synergizes the expertise of gradient projection and adaptive merging. Extensive experiments show that our approach outperforms state-of-the-art CL methods and existing merging strategies.

Article 316

Title@2025-05-29 (4): $K^2$VAE: A Koopman-Kalman Enhanced Variational AutoEncoder for Probabilistic Time Series Forecasting

Title: $K^2$VAE: A Koopman-Kalman Enhanced Variational AutoEncoder for Probabilistic Time Series Forecasting

$K^2$VAE: Ein Koopman-Kalman-Verbesserter Variations-AutoEncoder für probabilistische Zeitreihenprognosen

2美元VAE: 概率时间序列预测的Koopman-Kalman增强变异自动编码器 2505.23017v1

Authors: Xingjian Wu, Xiangfei Qiu, Hongfan Gao, Jilin Hu, Bin Yang, Chenjuan Guo

Probabilistic Time Series Forecasting (PTSF) plays a crucial role in decision-making across various fields, including economics, energy, and transportation. Most existing methods excel at short-term forecasting, while overlooking the hurdles of Long-term Probabilistic Time Series Forecasting (LPTSF). As the forecast horizon extends, the inherent nonlinear dynamics have a significant adverse effect on prediction accuracy, and make generative models inefficient by increasing the cost of each iteration. To overcome these limitations, we introduce $K^2$VAE, an efficient VAE-based generative model that leverages a KoopmanNet to transform nonlinear time series into a linear dynamical system, and devises a KalmanNet to refine predictions and model uncertainty in such a linear system, which reduces error accumulation in long-term forecasting. Extensive experiments demonstrate that $K^2$VAE outperforms state-of-the-art methods in both short- and long-term PTSF, providing a more efficient and accurate solution.

Article 317

Title@2025-05-29 (4): Hyperbolic-PDE GNN: Spectral Graph Neural Networks in the Perspective of A System of Hyperbolic Partial Differential Equations

Title: Hyperbolic-PDE GNN: Spectral Graph Neural Networks in the Perspective of A System of Hyperbolic Partial Differential Equations

Hyperbolic-PDE GNN: Spektral Graph Neural Networks in the Perspective of A System of Hyperbolic Partial Differential Equations

GNN: 从超曲偏偏部分异差系统的角度看待光谱图形神经网络 2505.23014v1

Authors: Juwei Yue, Haikuo Li, Jiawei Sheng, Xiaodong Li, Taoyu Su, Tingwen Liu, Li Guo

Graph neural networks (GNNs) leverage message passing mechanisms to learn the topological features of graph data. Traditional GNNs learn node features in a spatial domain unrelated to the topology, which can hardly ensure topological features. In this paper, we formulate message passing as a system of hyperbolic partial differential equations (hyperbolic PDEs), constituting a dynamical system that explicitly maps node representations into a particular solution space. This solution space is spanned by a set of eigenvectors describing the topological structure of graphs. Within this system, for any moment in time, a node's features can be decomposed into a superposition of the basis of eigenvectors. This not only enhances the interpretability of message passing but also enables the explicit extraction of fundamental characteristics about the topological structure. Furthermore, by solving this system of hyperbolic partial differential equations, we establish a connection with spectral graph neural networks (spectral GNNs), serving as a message passing enhancement paradigm for spectral GNNs. We further introduce polynomials to approximate arbitrary filter functions. Extensive experiments demonstrate that the paradigm of hyperbolic PDEs not only exhibits strong flexibility but also significantly enhances the performance of various spectral GNNs across diverse graph tasks.

Article 318

Title@2025-05-29 (4): SplitLoRA: Balancing Stability and Plasticity in Continual Learning Through Gradient Space Splitting

Title: SplitLoRA: Balancing Stability and Plasticity in Continual Learning Through Gradient Space Splitting

SplitLoRA: Balance Stabilität und Plastizität im kontinuierlichen Lernen durch gradienten Raum Splitting

Split LoRA:通过逐步空间分割在持续学习中平衡稳定和可塑性 2505.22370v2

Authors: Haomiao Qiu, Miao Zhang, Ziyue Qiao, Weili Guan, Min Zhang, Liqiang Nie

Continual Learning (CL) requires a model to learn multiple tasks in sequence while maintaining both stability (preserving knowledge from previously learned tasks) and plasticity (effectively learning new tasks). Gradient projection has emerged as an effective and popular paradigm in CL, where it partitions the gradient space of previously learned tasks into two orthogonal subspaces: a primary subspace and a minor subspace. New tasks are learned effectively within the minor subspace, thereby reducing interference with previously acquired knowledge. However, existing gradient projection methods struggle to achieve an optimal balance between plasticity and stability, as it is hard to appropriately partition the gradient space. In this work, we consider a continual learning paradigm based on Low-Rank Adaptation, which has gained considerable attention due to its efficiency and wide applicability, and propose a novel approach for continual learning, called SplitLoRA. We first provide a theoretical analysis of how subspace partitioning affects model stability and plasticity. Informed by this analysis, we then introduce an effective method that derives the optimal partition of the gradient space for previously learned tasks. This approach effectively balances stability and plasticity in continual learning. Experimental results on multiple datasets demonstrate that the proposed method achieves state-of-the-art performance.
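
The gradient projection step described above, restricting a new task's update to the minor subspace, can be sketched as follows. This is a generic illustration and not SplitLoRA itself: it assumes an orthonormal basis for the primary subspace is already available, and it does not show how the paper chooses the partition. The function name and toy vectors are hypothetical.

```python
def project_out(grad, primary_basis):
    """Remove from grad its components along an orthonormal primary-subspace basis.

    What remains lies in the orthogonal complement (the minor subspace).
    """
    residual = list(grad)
    for u in primary_basis:
        coeff = sum(g * ui for g, ui in zip(grad, u))
        residual = [r - coeff * ui for r, ui in zip(residual, u)]
    return residual


# Toy 3-D gradient space: the first axis is "primary" (important for old tasks).
primary = [[1.0, 0.0, 0.0]]
new_task_grad = [2.0, -1.0, 0.5]
update = project_out(new_task_grad, primary)
```

Growing the primary basis protects more directions used by earlier tasks (stability) but leaves fewer directions for the new task (plasticity), which is the partitioning trade-off the abstract analyzes.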

Article 319

Title@2025-05-29 (4): Scalable Complexity Control Facilitates Reasoning Ability of LLMs

Title: Scalable Complexity Control Facilitates Reasoning Ability of LLMs

Skalierbare Komplexitätskontrolle erleichtert die Fähigkeit von LLMs, sich zu verankern

Authors: Liangkai Hang, Junjie Yao, Zhiwei Bai, Tianyi Chen, Yang Chen, Rongjie Diao, Hezhou Li, Pengxiao Lin, Zhiwei Wang, Cheng Xu, Zhongwang Zhang, Zhangchen Zhou, Zhiyu Li, Zehao Lin, Kai Chen, Feiyu Xiong, Yaoyu Zhang, Weinan E, Hongkang Yang, Zhi-Qin John Xu

The reasoning ability of large language models (LLMs) has been rapidly advancing in recent years, attracting interest in more fundamental approaches that can reliably enhance their generalizability. This work demonstrates that model complexity control, conveniently implementable by adjusting the initialization rate and weight decay coefficient, improves the scaling law of LLMs consistently over varying model sizes and data sizes. This gain is further illustrated by comparing the benchmark performance of 2.4B models pretrained on 1T tokens with different complexity hyperparameters. Instead of fixing the initialization std, we found that a constant initialization rate (the exponent of std) enables the scaling law to descend faster in both model and data sizes. These results indicate that complexity control is a promising direction for the continual advancement of LLMs.
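
The difference between fixing the initialization std and fixing the initialization rate can be illustrated with a short sketch. It assumes the std is parameterized as fan_in ** (-gamma), with gamma playing the role of the initialization rate. The abstract only says the rate is "the exponent of std", so this exact form, the function names, and the example widths are assumptions.

```python
import random


def init_std(fan_in, gamma):
    """Std of the initial weights under the assumed rule std = fan_in ** (-gamma)."""
    return fan_in ** (-gamma)


def init_weights(fan_in, fan_out, gamma, seed=0):
    """Sample a fan_out x fan_in weight matrix from N(0, std^2)."""
    rng = random.Random(seed)
    std = init_std(fan_in, gamma)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


# With a fixed rate gamma, std shrinks automatically as the layer gets wider;
# with a fixed std it would stay the same at every width.
for width in (256, 1024, 4096):
    print(width, init_std(width, gamma=0.5), init_std(width, gamma=1.0))
```

Under this assumed rule, gamma = 0.5 gives the familiar 1/sqrt(fan_in) scaling, and larger gamma gives smaller initial weights. Weight decay, the other knob named in the abstract, is not shown here.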

Article 320

Title@2025-05-29 (4): BA-LoRA: Bias-Alleviating Low-Rank Adaptation to Mitigate Catastrophic Inheritance in Large Language Models

Title: BA-LoRA: Bias-Alleviating Low-Rank Adaptation to Mitigate Catastrophic Inheritance in Large Language Models

BA-LoRA: Bias-Alleviating Low-Rank Anpassung an Mitigate Katastrophische Vererbung in großen Sprachmodellen

BA-LORA:在大语言模型中,对减轻灾害传承的低率适应 2408.04556v5

Authors: Yupeng Chang, Yi Chang, Yuan Wu

Large language models (LLMs) have demonstrated remarkable proficiency across various natural language processing (NLP) tasks. However, adapting LLMs to downstream applications requires computationally intensive and memory-demanding fine-tuning procedures. To alleviate these burdens, parameter-efficient fine-tuning (PEFT) techniques have emerged as a promising approach to tailor LLMs with minimal computational overhead. While PEFT methods offer substantial advantages, they do not fully address the pervasive issue of bias propagation from pre-training data. This work introduces Bias-Alleviating Low-Rank Adaptation (BA-LoRA), a novel PEFT method designed to counteract bias inheritance. BA-LoRA incorporates three distinct regularization terms: (1) a consistency regularizer, (2) a diversity regularizer, and (3) a singular value decomposition regularizer. These regularizers aim to enhance the models’ consistency, diversity, and generalization capabilities during fine-tuning. We conduct extensive experiments on natural language understanding (NLU) and natural language generation (NLG) tasks using prominent LLMs such as LLaMA, Mistral, and Gemma. The results demonstrate that BA-LoRA outperforms LoRA and its state-of-the-art variants. Moreover, the extended experiments demonstrate that our method effectively mitigates the adverse effects of pre-training bias, leading to more reliable and robust model outputs. The code is available at https://github.com/cyp-jlu-ai/BA-LoRA.

Article 321

Title@2025-05-29 (4): EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge

Title: EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge

EmergentTTS-Eval: Bewertung von TTS-Modellen auf komplexe Prosodic, Expressivität und sprachliche Herausforderungen mit Model-as-a-Judge

新兴TTS-Eval:利用 “ 模拟即审法官 “ 评估关于复杂立案、表达性和语言挑战的TTS模型 2505.23009v1

Authors: Ruskin Raj Manku, Yuzhi Tang, Xingjian Shi, Mu Li, Alex Smola

Text-to-Speech (TTS) benchmarks often fail to capture how well models handle nuanced and semantically complex text. Building on EmergentTTS, we introduce EmergentTTS-Eval, a comprehensive benchmark covering six challenging TTS scenarios: emotions, paralinguistics, foreign words, syntactic complexity, complex pronunciation (e.g. URLs, formulas), and questions. Crucially, our framework automates both test-case generation and evaluation, making the benchmark easily extensible. Starting from a small set of human-written seed prompts, we iteratively extend them using LLMs to target specific structural, phonetic and prosodic challenges, resulting in 1,645 diverse test cases. Moreover, we employ a model-as-a-judge approach, using a Large Audio Language Model (LALM) to assess the speech across multiple dimensions such as expressed emotion, prosodic, intonational, and pronunciation accuracy. We evaluate state-of-the-art open-source and proprietary TTS systems, such as 11Labs, Deepgram, and OpenAI’s 4o-mini-TTS, on EmergentTTS-Eval, demonstrating its ability to reveal fine-grained performance differences. Results show that the model-as-a-judge approach offers robust TTS assessment and a high correlation with human preferences. We open source the evaluation code (https://github.com/boson-ai/EmergentTTS-Eval-public) and the dataset (https://huggingface.co/datasets/bosonai/EmergentTTS-Eval).

Article 322

Title@2025-05-29 (4): QLIP: A Dynamic Quadtree Vision Prior Enhances MLLM Performance Without Retraining

Title: QLIP: A Dynamic Quadtree Vision Prior Enhances MLLM Performance Without Retraining

QLIP: Eine dynamische Quadtree Vision verbessert die MLLM-Performance ohne Umschulung

QLIP: 动态的四方愿景,事先提高MLLM业绩,不再培训 2505.23004v1

Authors: Kyle R. Chickering, Bangzheng Li, Muhao Chen

Multimodal Large Language Models (MLLMs) encode images into visual tokens, aligning visual and textual signals within a shared latent space to facilitate crossmodal representation learning. The CLIP model is a widely adopted foundational vision language model whose vision encoder has played a critical role in the development of MLLMs such as LLaVA. However, the CLIP vision encoder suffers from notable limitations including being constrained to only handling fixed input resolutions and a failure to produce separated embeddings for dissimilar images. Replacing the vision encoder of an existing model typically incurs substantial computational costs because such a change often necessitates retraining the entire model pipeline. In this work, we identify two factors which underlie the limitations of the CLIP vision encoder: mesoscopic bias and interpolation bias. To address these issues, we propose QLIP, a drop-in replacement for CLIP that can be seamlessly integrated with existing MLLMs with only a few lines of code and can enhance both coarse-grained and fine-grained visual understanding, without re-training. QLIP is designed around an image quadtree which replaces the standard uniform grid patches with a novel content-aware patchification. Our experimental results demonstrate that QLIP improves the general visual question answering accuracy of the LLaVA v1.5 model series across various model sizes, without requiring retraining or fine-tuning of the full MLLM. Notably, QLIP boosts detailed understanding performance on the challenging $V^{\ast}$ benchmark by up to 13.6 percent.
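
The following sketch illustrates what a content-aware quadtree patchification can look like in general. It is not QLIP's procedure: the pixel-variance split rule, the `threshold` and `min_size` parameters, and the assumption of a square image with a power-of-two side are all illustrative choices that the abstract does not specify.

```python
import numpy as np

def quadtree_patches(image, min_size, threshold):
    """Return (row, col, size) leaves that tile a square image.

    A block is split into four quadrants while its pixel variance exceeds
    `threshold` and its side is larger than `min_size` (hypothetical rule).
    """
    leaves = []

    def recurse(r, c, size):
        block = image[r:r + size, c:c + size]
        if size > min_size and block.var() > threshold:
            half = size // 2
            for dr in (0, half):
                for dc in (0, half):
                    recurse(r + dr, c + dc, half)
        else:
            leaves.append((r, c, size))

    recurse(0, 0, image.shape[0])
    return leaves

# Toy image: flat everywhere except a checkerboard in the top-left quadrant.
image = np.zeros((8, 8))
image[:4, :4] = np.indices((4, 4)).sum(axis=0) % 2
leaves = quadtree_patches(image, min_size=2, threshold=0.01)
```

Detailed regions end up covered by many small patches and flat regions by a few large ones, in contrast to a uniform grid that spends the same number of patches everywhere.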

Article 323

Title@2025-05-29 (4): Universal Sequence Preconditioning

Title: Universal Sequence Preconditioning

Universelle Sequenz Vorkonditionierung

通用序列序序预设 2502.06545v2

Authors: Annie Marsden, Elad Hazan

We study the problem of preconditioning in the setting of sequential prediction. From the theoretical lens of linear dynamical systems, we show that applying a convolution to the input sequence translates to applying a polynomial to the unknown transition matrix in the hidden space. With this insight, we develop a novel preconditioning method that convolves the input sequence with the coefficients of the Chebyshev or Legendre polynomials. We formally prove that this improves the regret of two distinct prediction methods. Moreover, using this preconditioning technique on either method gives the first sublinear regret bounds that are also hidden dimension independent (up to logarithmic factors) even when the hidden transition matrix is asymmetric. From rigorous experiments on synthetic data we show that our simple preconditioning method generalizes to both 1) settings where the data is not from a linear dynamical system and 2) a broad range of learning algorithms, including recurrent neural networks.
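
The link between convolving the sequence and applying a polynomial to the transition matrix can be checked on a scalar toy system. The sketch below assumes a noiseless system $x_t = a^t$ and uses the monomial coefficients of the Chebyshev polynomial $T_n$ as the filter; the paper's exact normalization and coefficient ordering are not given in the abstract, so these choices and the function names are assumptions.

```python
import numpy as np
from numpy.polynomial import chebyshev

def chebyshev_filter(n):
    """Monomial coefficients of T_n, highest degree first, so coeffs[0] multiplies x_t."""
    low_first = chebyshev.cheb2poly([0] * n + [1])
    return low_first[::-1]

def precondition(x, coeffs):
    """y_t = sum_i coeffs[i] * x_{t-i}, returned for t >= len(coeffs) - 1."""
    return np.convolve(x, coeffs, mode="valid")

a = 0.5                       # scalar "transition matrix" of the toy system
x = a ** np.arange(10)        # x_t = a^t
coeffs = chebyshev_filter(3)  # T_3(z) = 4 z^3 - 3 z
y = precondition(x, coeffs)   # y_t = a^(t-3) * T_3(a)
```

For this toy system each output equals $a^{t-n}\,p(a)$ with $p = T_n$, so the filter acts on the sequence exactly as the polynomial acts on $a$.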

Article 324

Title@2025-05-29 (4): Hybrid Cross-domain Robust Reinforcement Learning

Title: Hybrid Cross-domain Robust Reinforcement Learning

Hybrides Cross-Domain Robustes Verstärkungslernen

跨部门加强强化学习 2505.23003v1

Authors: Linh Le Pham Van, Minh Hoang Nguyen, Hung Le, Hung The Tran, Sunil Gupta

Robust reinforcement learning (RL) aims to learn policies that remain effective despite uncertainties in their environment, which frequently arise in real-world applications due to variations in environment dynamics. Robust RL methods learn a robust policy by maximizing value under the worst-case models within a predefined uncertainty set. Offline robust RL algorithms are particularly promising in scenarios where only a fixed dataset is available and new data cannot be collected. However, these approaches often require extensive offline data, and gathering such datasets for specific tasks in specific environments can be both costly and time-consuming. Using an imperfect simulator offers a faster, cheaper, and safer way to collect data for training, but it can suffer from dynamics mismatch. In this paper, we introduce HYDRO, the first Hybrid Cross-Domain Robust RL framework designed to address these challenges. HYDRO utilizes an online simulator to complement the limited amount of offline data in the non-trivial context of robust RL. By measuring and minimizing performance gaps between the simulator and the worst-case models in the uncertainty set, HYDRO employs novel uncertainty filtering and prioritized sampling to select the most relevant and reliable simulator samples. Our extensive experiments demonstrate HYDRO’s superior performance over existing methods across various tasks, underscoring its potential to improve sample efficiency in offline robust RL.

Article 325

Title@2025-05-29 (4): Improved and Oracle-Efficient Online $\ell_1$-Multicalibration

Title: Improved and Oracle-Efficient Online $\ell_1$-Multicalibration

Verbesserte und Oracle-Effizient Online $\ell_1$-Multikalibrierung

改进和 Oracle-Efficient 在线 $\ell_1$-多边校准 2505.17365v2

Authors: Rohan Ghuge, Vidya Muthukumar, Sahil Singla

We study online multicalibration, a framework for ensuring calibrated predictions across multiple groups in adversarial settings, across $T$ rounds. Although online calibration is typically studied in the $\ell_1$ norm, prior approaches to online multicalibration have taken the indirect approach of obtaining rates in other norms (such as $\ell_2$ and $\ell_{\infty}$) and then transferring these guarantees to $\ell_1$ at additional loss. In contrast, we propose a direct method that achieves improved and oracle-efficient rates of $\widetilde{\mathcal{O}}(T^{-1/3})$ and $\widetilde{\mathcal{O}}(T^{-1/4})$ respectively, for online $\ell_1$-multicalibration. Our key insight is a novel reduction of online $\ell_1$-multicalibration to an online learning problem with product-based rewards, which we refer to as online linear-product optimization ($\mathtt{OLPO}$). To obtain the improved rate of $\widetilde{\mathcal{O}}(T^{-1/3})$, we introduce a linearization of $\mathtt{OLPO}$ and design a no-regret algorithm for this linearized problem. Although this method guarantees the desired sublinear rate (nearly matching the best rate for online calibration), it is computationally expensive when the group family $\mathcal{H}$ is large or infinite, since it enumerates all possible groups. To address scalability, we propose a second approach to $\mathtt{OLPO}$ that makes only a polynomial number of calls to an offline optimization (multicalibration evaluation) oracle, resulting in oracle-efficient online $\ell_1$-multicalibration with a rate of $\widetilde{\mathcal{O}}(T^{-1/4})$. Our framework also extends to certain infinite families of groups (e.g., all linear functions on the context space) by exploiting a $1$-Lipschitz property of the $\ell_1$-multicalibration error with respect to $\mathcal{H}$.

Article 326

Title@2025-05-29 (4): Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning

Title: Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning

Dolphin: Ein programmierbares Framework für skalierbares neurosymbolisches Lernen

Dolphin: 可缩放的神经元学习程序框架 2410.03348v4

Authors: Aaditya Naik, Jason Liu, Claire Wang, Amish Sethi, Saikat Dutta, Mayur Naik, Eric Wong

Neurosymbolic learning enables the integration of symbolic reasoning with deep learning but faces significant challenges in scaling to complex symbolic programs, large datasets, or both. We introduce DOLPHIN, a framework that tackles these challenges by supporting neurosymbolic programs in Python, executing complex symbolic reasoning on the CPU while vectorizing probabilistic computations and gradient propagation on the GPU. Across 13 benchmarks spanning tasks over text, image, and video data, with symbolic reasoning features like recursion and black-box functions, DOLPHIN converges to state-of-the-art accuracies on the more complex benchmarks while existing frameworks such as Scallop, ISED, and IndeCateR+ fail to converge within the time limit. On simpler benchmarks, DOLPHIN matches their performance, while achieving these results 1.71x to 62x faster than the baselines. Overall, DOLPHIN advances the scalability of neurosymbolic frameworks, achieving state-of-the-art efficiency and convergence on difficult benchmarks where existing frameworks struggle. The code is published at https://github.com/Dolphin-NeSy/Dolphin.

Article 327

Title@2025-05-29 (4): A Bayesian Model Selection Criterion for Selecting Pretraining Checkpoints

Title: A Bayesian Model Selection Criterion for Selecting Pretraining Checkpoints

Ein Bayesian Modellauswahl-Kriterium für die Auswahl von Vortrainings-Checkpoints

选择培训前检查站的巴伊西亚示范甄选标准标准 2410.05612v2

Authors: Michael Munn, Susan Wei

Recent advances in artificial intelligence have been fueled by the development of foundation models such as BERT, GPT, T5, and Vision Transformers. These models are first pretrained on vast and diverse datasets and then adapted to specific downstream tasks, often with significantly less data. However, the mechanisms behind the success of this ubiquitous pretrain-then-adapt paradigm remain underexplored, particularly the characteristics of pretraining checkpoints that enhance downstream adaptation. We introduce a Bayesian model selection criterion, called the downstream free energy, which quantifies a checkpoint’s adaptability by measuring the concentration of nearby favorable parameters for the downstream task. We demonstrate that this Bayesian model selection criterion can be effectively implemented without access to the downstream data or prior knowledge of the downstream task. Furthermore, we provide empirical evidence that the criterion reliably correlates with improved finetuning performance, offering a principled approach to predicting model adaptability.

Article 328

Title@2025-05-29 (4): HydraNet: Momentum-Driven State Space Duality for Multi-Granularity Tennis Tournaments Analysis

Title: HydraNet: Momentum-Driven State Space Duality for Multi-Granularity Tennis Tournaments Analysis

HydraNet: Momentum-getriebene State Space-Dualität für Multi-Granularity-Tennisturniere Analyse

Authors: Ruijie Li, Xiang Zhao, Qiao Ning, Shikai Guo

In tennis tournaments, momentum, a critical yet elusive phenomenon, reflects the dynamic shifts in athletes' performance that can decisively influence match outcomes. Despite its significance, effective modeling and multi-granularity analysis of momentum across points, games, sets, and matches in tennis tournaments remain underexplored. In this study, we define a novel Momentum Score (MS) metric to quantify a player’s momentum level in multi-granularity tennis tournaments, and design HydraNet, a momentum-driven state-space duality-based framework, to model MS by integrating thirty-two heterogeneous dimensions of athletes' performance in serve, return, psychology and fatigue. HydraNet integrates a Hydra module, which builds upon a state-space duality (SSD) framework, capturing explicit momentum with a sliding-window mechanism and implicit momentum through cross-game state propagation. It also introduces a novel Versus Learning method to better capture the adversarial nature of momentum between the two athletes at a macro level, along with a Collaborative-Adversarial Attention Mechanism (CAAM) for capturing and integrating intra-player and inter-player dynamic momentum at a micro level. Additionally, we construct a million-level tennis cross-tournament dataset spanning Wimbledon 2012-2023 and the US Open 2013-2023, and validate the multi-granularity modeling capability of HydraNet for the MS metric on this dataset. Extensive experimental evaluations demonstrate that the MS metric constructed by the HydraNet framework provides actionable insights into how momentum impacts outcomes at different granularities, establishing a new foundation for momentum modeling and sports analysis. To the best of our knowledge, this is the first work to explore and effectively model momentum across multiple granularities in professional tennis tournaments.

Article 329

Title@2025-05-29 (4): Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment

Title: Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment

Jenseits der Belohnung Hacking: Kausale Belohnungen für großsprachige Modellausrichtung

优胜后加分:大语言模型对齐的因果奖励 2501.09620v2

Authors: Chaoqi Wang, Zhuokai Zhao, Yibo Jiang, Zhaorun Chen, Chen Zhu, Yuxin Chen, Jiayi Liu, Lizhu Zhang, Xiangjun Fan, Hao Ma, Sinong Wang

Recent advances in large language models (LLMs) have demonstrated significant progress in performing complex tasks. While Reinforcement Learning from Human Feedback (RLHF) has been effective in aligning LLMs with human preferences, it is susceptible to spurious correlations in reward modeling. Consequently, it often introduces biases (such as length bias, sycophancy, conceptual bias, and discrimination) that hinder the model’s ability to capture true causal relationships. To address this, we propose a novel causal reward modeling approach that integrates causality to mitigate these spurious correlations. Our method enforces counterfactual invariance, ensuring reward predictions remain consistent when irrelevant variables are altered. Through experiments on both synthetic and real-world datasets, we show that our approach mitigates various types of spurious correlations effectively, resulting in more reliable and fair alignment of LLMs with human preferences. As a drop-in enhancement to the existing RLHF workflow, our causal reward modeling provides a practical way to improve the trustworthiness and fairness of LLM finetuning.

Article 330

Title@2025-05-29 (4): ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

Title: ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

ReinFlow: Feinsteuerungs-Flow Matching-Politik mit Online-Verstärkungs-Lernen

ReinFlow: 与在线强化学习匹配流动政策的微调 2505.22094v2

Authors: Tonghe Zhang, Chao Yu, Sichang Su, Yu Wang

We propose ReinFlow, a simple yet effective online reinforcement learning (RL) framework that fine-tunes a family of flow matching policies for continuous robotic control. Derived from rigorous RL theory, ReinFlow injects learnable noise into a flow policy’s deterministic path, converting the flow into a discrete-time Markov process for exact and straightforward likelihood computation. This conversion facilitates exploration and ensures training stability, enabling ReinFlow to fine-tune diverse flow model variants, including Rectified Flow [35] and Shortcut Models [19], particularly at very few or even one denoising step. We benchmark ReinFlow in representative locomotion and manipulation tasks, including long-horizon planning with visual input and sparse reward. The episode reward of Rectified Flow policies achieved an average net growth of 135.36% after fine-tuning on challenging legged locomotion tasks, while using fewer denoising steps and saving 82.63% of wall time compared to the state-of-the-art diffusion RL fine-tuning method DPPO [43]. The success rate of the Shortcut Model policies in state and visual manipulation tasks achieved an average net increase of 40.34% after fine-tuning with ReinFlow at four or even one denoising step, with performance comparable to fine-tuned DDIM policies while saving an average of 23.20% of computation time. Project webpage: https://reinflow.github.io/
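
The noise-injection idea can be sketched as a noisy Euler rollout whose log-likelihood is a sum of Gaussian transition terms. This is an illustration under simplifying assumptions, not the ReinFlow implementation: the noise scale `sigma` is a fixed constant here (the abstract describes learnable noise), and `velocity` is a hypothetical stand-in for a learned flow field.

```python
import numpy as np

def noisy_flow_rollout(x0, velocity, sigma, num_steps, rng):
    """Euler-integrate a flow with Gaussian noise added at every step.

    Each step is a Gaussian transition N(x_k + v(x_k, t_k) * dt, sigma^2 I),
    so the log-likelihood of the sampled path is a sum of Gaussian log-densities.
    """
    dt = 1.0 / num_steps
    x = np.asarray(x0, dtype=float)
    log_prob = 0.0
    for k in range(num_steps):
        mean = x + velocity(x, k * dt) * dt
        x_next = mean + sigma * rng.standard_normal(x.shape)
        log_prob += (-0.5 * np.sum(((x_next - mean) / sigma) ** 2)
                     - x.size * np.log(sigma * np.sqrt(2.0 * np.pi)))
        x = x_next
    return x, log_prob

rng = np.random.default_rng(0)
constant_field = lambda x, t: np.ones_like(x)   # toy velocity field
x_final, path_log_prob = noisy_flow_rollout(np.zeros(2), constant_field,
                                            sigma=0.1, num_steps=4, rng=rng)
```

Because every transition has a closed-form density, the path log-probability can be used directly in a policy-gradient objective, and `num_steps` can be as small as one.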

Article 331

Title@2025-05-29 (4): Is Attention Required for Transformer Inference? Explore Function-preserving Attention Replacement

Title: Is Attention Required for Transformer Inference? Explore Function-preserving Attention Replacement

Ist Achtung für Transformer-Inferenz erforderlich? Erkunden Sie Funktionserhaltende Aufmerksamkeitsersatz

需要注意吗? 探索功能保持注意替换 2505.21535v2

Authors: Yuxin Ren, Maxwell D Collins, Miao Hu, Huanrui Yang

While transformers excel across vision and language pretraining tasks, their reliance on attention mechanisms poses challenges for inference efficiency, especially on edge and embedded accelerators with limited parallelism and memory bandwidth. Motivated by the observed redundancy of attention at inference time, we hypothesize that although the model learns complicated token dependencies through pretraining, the inference-time sequence-to-sequence mapping in each attention layer is actually "simple" enough to be represented with a much cheaper function. In this work, we explore FAR, a Function-preserving Attention Replacement framework that replaces all attention blocks in pretrained transformers with learnable sequence-to-sequence modules, exemplified by an LSTM. FAR optimizes a multi-head LSTM architecture with a block-wise distillation objective and a global structural pruning framework to obtain a family of efficient LSTM-based models from pretrained transformers. We validate FAR on the DeiT vision transformer family and demonstrate that it matches the accuracy of the original models on ImageNet and multiple downstream tasks with reduced parameters and latency. Further analysis shows that FAR preserves the semantic token relationships and the token-to-token correlation learned in the transformer’s attention module.

Article 332

Title@2025-05-29 (4): LLM Agents for Bargaining with Utility-based Feedback

Title: LLM Agents for Bargaining with Utility-based Feedback

LLM-Agenten für Schnäppchen mit Utility-basiertem Feedback

LLM 与基于利用的反馈进行交涉的代理代理 2505.22998v1

Authors: Jihwan Oh, Murad Aghazada, Se-Young Yun, Taehyeon Kim

Bargaining, a critical aspect of real-world interactions, presents challenges for large language models (LLMs) due to limitations in strategic depth and adaptation to complex human factors. Existing benchmarks often fail to capture this real-world complexity. To address this and enhance LLM capabilities in realistic bargaining, we introduce a comprehensive framework centered on utility-based feedback. Our contributions are threefold: (1) BargainArena, a novel benchmark dataset with six intricate scenarios (e.g., deceptive practices, monopolies) to facilitate diverse strategy modeling; (2) human-aligned, economically-grounded evaluation metrics inspired by utility theory, incorporating agent utility and negotiation power, which implicitly reflect and promote opponent-aware reasoning (OAR); and (3) a structured feedback mechanism enabling LLMs to iteratively refine their bargaining strategies. This mechanism can positively collaborate with in-context learning (ICL) prompts, including those explicitly designed to foster OAR. Experimental results show that LLMs often exhibit negotiation strategies misaligned with human preferences, and that our structured feedback mechanism significantly improves their performance, yielding deeper strategic and opponent-aware reasoning.

Article 333

Title@2025-05-29 (4): Theoretical Foundations of the Deep Copula Classifier: A Generative Approach to Modeling Dependent Features

Title: Theoretical Foundations of the Deep Copula Classifier: A Generative Approach to Modeling Dependent Features

Theoretische Grundlagen des Deep Copula Klassifikators: Ein generativer Ansatz zur Modellierung abhängiger Merkmale

深 Cocula 分类法理论基础:建模附属地貌的开创性方法 2505.22997v1

Authors: Agnideep Aich, Ashit Baran Aich, Bruce Wade

Traditional classifiers often assume feature independence or rely on overly simplistic relationships, leading to poor performance in settings where real-world dependencies matter. We introduce the Deep Copula Classifier (DCC), a generative model that separates the learning of each feature’s marginal distribution from the modeling of their joint dependence structure via neural network-parameterized copulas. For each class, lightweight neural networks are used to flexibly and adaptively capture feature interactions, making DCC particularly effective when classification is driven by complex dependencies. We establish that DCC converges to the Bayes-optimal classifier under standard conditions and provide explicit convergence rates of $O(n^{-r/(2r+d)})$ for $r$-smooth copula densities. Beyond theoretical guarantees, we outline several practical extensions, including high-dimensional scalability through vine and factor copula architectures, semi-supervised learning via entropy regularization, and online adaptation using streaming gradient methods. By unifying statistical rigor with the representational power of neural networks, DCC offers a mathematically grounded and interpretable framework for dependency-aware classification.
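
For orientation, the separation of marginals from dependence structure can be written out using Sklar's theorem. The notation below is an assumption introduced for illustration (it does not appear in the abstract): $\pi_y$ is the class prior, $F_{y,j}$ and $f_{y,j}$ are the marginal CDF and density of feature $j$ in class $y$, and $c_y$ is the class-specific copula density.

```latex
\begin{align}
f_y(x_1,\dots,x_d) &= c_y\bigl(F_{y,1}(x_1),\dots,F_{y,d}(x_d)\bigr)\,\prod_{j=1}^{d} f_{y,j}(x_j), \\
\hat{y}(x) &= \arg\max_{y}\; \pi_y\, f_y(x_1,\dots,x_d).
\end{align}
```

As a worked instance of the stated rate, taking $r = 2$ and $d = 2$ gives $n^{-r/(2r+d)} = n^{-2/6} = n^{-1/3}$; the exponent shrinks toward zero as the dimension $d$ grows with $r$ held fixed.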

Article 334

Title@2025-05-29 (4): Walking the Weight Manifold: a Topological Approach to Conditioning Inspired by Neuromodulation

Title: Walking the Weight Manifold: a Topological Approach to Conditioning Inspired by Neuromodulation

Wiege manifold gehen: ein topologischer Ansatz zur Konditionierung Inspiriert durch Neuromodulation

身穿轻重背重力:在神经调节的启发下,从地形学角度处理条件问题 2505.22994v1

Authors: Ari S. Benjamin, Kyle Daruwalla, Christian Pehle, Anthony M. Zador

One frequently wishes to learn a range of similar tasks as efficiently as possible, re-using knowledge across tasks. In artificial neural networks, this is typically accomplished by conditioning a network upon task context by injecting context as input. Brains have a different strategy: the parameters themselves are modulated as a function of various neuromodulators such as serotonin. Here, we take inspiration from neuromodulation and propose to learn weights which are smoothly parameterized functions of task context variables. Rather than optimize a weight vector, i.e. a single point in weight space, we optimize a smooth manifold in weight space with a predefined topology. To accomplish this, we derive a formal treatment of optimization of manifolds as the minimization of a loss functional subject to a constraint on volumetric movement, analogous to gradient descent. During inference, conditioning selects a single point on this manifold which serves as the effective weight matrix for a particular sub-task. This strategy for conditioning has two main advantages. First, the topology of the manifold (whether a line, circle, or torus) is a convenient lever for inductive biases about the relationship between tasks. Second, learning in one state smoothly affects the entire manifold, encouraging generalization across states. To verify this, we train manifolds with several topologies, including straight lines in weight space (for conditioning on e.g. noise level in input data) and ellipses (for rotated images). Despite their simplicity, these parameterizations outperform conditioning identical networks by input concatenation and better generalize to out-of-distribution samples. These results suggest that modulating weights over low-dimensional manifolds offers a principled and effective alternative to traditional conditioning.
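
The two manifold shapes named in the abstract, a straight line and an ellipse in weight space, can be parameterized in a few lines. This is only a sketch of the parameterization: the function names, the endpoint and axis parameters, and the random values are assumptions, and the paper's manifold optimization procedure is not reproduced.

```python
import numpy as np

def line_weights(w_start, w_end, context):
    """Point on a straight line in weight space, for a scalar context in [0, 1]."""
    return (1.0 - context) * w_start + context * w_end

def ellipse_weights(center, axis_a, axis_b, angle):
    """Point on an ellipse in weight space, for an angular context such as rotation."""
    return center + np.cos(angle) * axis_a + np.sin(angle) * axis_b

rng = np.random.default_rng(0)
center, axis_a, axis_b = (rng.standard_normal((3, 4)) for _ in range(3))

# Conditioning picks one effective weight matrix per context value.
w_at_90_degrees = ellipse_weights(center, axis_a, axis_b, np.pi / 2)
features = rng.standard_normal((5, 3))
outputs = features @ w_at_90_degrees
```

In such a setup the learnable parameters are the line endpoints or the ellipse center and axes, so a gradient step taken at one context value moves the weights used at every other context value.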

Article 335

Title@2025-05-29 (4): Number of Clusters in a Dataset: A Regularized K-means Approach

Title: Number of Clusters in a Dataset: A Regularized K-means Approach

Anzahl der Cluster in einem Datensatz: Ein regularisierter K-Mittelansatz

数据集中的组群数量:正规化的K手段方法 2505.22991v1

Authors: Behzad Kamgar-Parsi, Behrooz Kamgar-Parsi

Finding the number of meaningful clusters in an unlabeled dataset is important in many applications. The regularized k-means algorithm is an approach frequently used to find the correct number of distinct clusters in datasets. The most common formulation of the regularization function is the additive linear term $\lambda k$, where $k$ is the number of clusters and $\lambda$ a positive coefficient. Currently, there are no principled guidelines for setting a value for the critical hyperparameter $\lambda$. In this paper, we derive rigorous bounds for $\lambda$ assuming clusters are ideal. Ideal clusters (defined as $d$-dimensional spheres with identical radii) are close proxies for k-means clusters ($d$-dimensional spherically symmetric distributions with identical standard deviations). Experiments show that the k-means algorithm with an additive regularizer often yields multiple solutions. Thus, we also analyze the k-means algorithm with a multiplicative regularizer. The consensus among k-means solutions with additive and multiplicative regularizations reduces the ambiguity of multiple solutions in certain cases. We also present selected experiments that demonstrate the performance of the regularized k-means algorithms as clusters deviate from the ideal assumption.
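
The additive formulation can be made concrete with a small sketch that picks $k$ by minimizing the k-means cost plus $\lambda k$. The Lloyd iterations with farthest-point initialization, the toy data, and the value of `lam` are illustrative assumptions; the paper's bounds on $\lambda$ and its multiplicative variant are not reproduced here.

```python
import numpy as np

def kmeans_sse(X, k, iters=50):
    """Sum of squared distances after Lloyd iterations with farthest-point init."""
    centers = [X[0]]
    for _ in range(1, k):
        nearest = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(nearest)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.min(axis=1).sum()

def select_k(X, lam, k_max):
    """Pick k minimizing the additive objective SSE(k) + lam * k."""
    costs = {k: kmeans_sse(X, k) + lam * k for k in range(1, k_max + 1)}
    return min(costs, key=costs.get)

# Toy data: three tight, well-separated groups of four points each.
offsets = np.array([[0.1, 0.1], [0.1, -0.1], [-0.1, 0.1], [-0.1, -0.1]])
X = np.vstack([np.array(c) + offsets for c in ([0.0, 0.0], [10.0, 0.0], [0.0, 10.0])])
best_k = select_k(X, lam=1.0, k_max=5)
```

The selected $k$ depends strongly on `lam`: a very large value collapses everything into one cluster, while a very small one rewards splitting tight clusters further, which is why guidance on setting $\lambda$ matters.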

Article 336

Title@2025-05-29 (4): MenTeR: A fully-automated Multi-agenT workflow for end-to-end RF/Analog Circuits Netlist Design

Title: MenTeR: A fully-automated Multi-agenT workflow for end-to-end RF/Analog Circuits Netlist Design

MenTeR: Ein vollautomatisierter Multi-AgenT-Workflow für End-to-End-RF/Analog-Schaltungen Netlist Design

MenTeR: 终端至终端RF/Analog 电路网络列表设计全自动多元T工作流程 2505.22990v1

Authors: Pin-Han Chen, Yu-Sheng Lin, Wei-Cheng Lee, Tin-Yu Leu, Po-Hsiang Hsu, Anjana Dissanayake, Sungjin Oh, Chinq-Shiun Chiu

RF/Analog design is essential for bridging digital technologies with real-world signals, ensuring the functionality and reliability of a wide range of electronic systems. However, analog design procedures are often intricate, time-consuming and reliant on expert intuition, which hinders the time and cost efficiency of circuit development. To overcome the limitations of manual circuit design, we introduce MenTeR, a multi-agent workflow integrated into an end-to-end analog design framework. By employing multiple specialized AI agents that collaboratively address different aspects of the design process, such as specification understanding, circuit optimization, and test bench validation, MenTeR reduces the dependency on frequent trial-and-error-style intervention. MenTeR not only shortens the design cycle but also facilitates a broader exploration of the design space, demonstrating robust capabilities in handling real-world analog systems. We believe that MenTeR lays the groundwork for future “RF/Analog Copilots” that can collaborate seamlessly with human designers.

Article 337

Title@2025-05-29 (4): Effects of Dropout on Performance in Long-range Graph Learning Tasks

Title: Effects of Dropout on Performance in Long-range Graph Learning Tasks

Auswirkungen des Dropouts auf die Leistungsfähigkeit in großflächigen Graphen-Lernaufgaben

辍学对远程图表学习任务绩效的影响 2502.07364v2

Authors: Jasraj Singh, Keyue Jiang, Brooks Paige, Laura Toni

Message Passing Neural Networks (MPNNs) are a class of Graph Neural Networks (GNNs) that propagate information across the graph via local neighborhoods. The scheme gives rise to two key challenges: over-smoothing and over-squashing. While several Dropout-style algorithms, such as DropEdge and DropMessage, have successfully addressed over-smoothing, their impact on over-squashing remains largely unexplored. This represents a critical gap in the literature, as failure to mitigate over-squashing would make these methods unsuitable for long-range tasks – the intended use case of deep MPNNs. In this work, we study the aforementioned algorithms, and closely related edge-dropping algorithms – DropNode, DropAgg and DropGNN – in the context of over-squashing. We present theoretical results showing that DropEdge-variants reduce sensitivity between distant nodes, limiting their suitability for long-range tasks. To address this, we introduce DropSens, a sensitivity-aware variant of DropEdge that explicitly controls the proportion of information lost due to edge-dropping, thereby increasing sensitivity to distant nodes despite dropping the same number of edges. Our experiments on long-range synthetic and real-world datasets confirm the predicted limitations of existing edge-dropping and feature-dropping methods. Moreover, DropSens consistently outperforms graph rewiring techniques designed to mitigate over-squashing, suggesting that simple, targeted modifications can substantially improve a model’s ability to capture long-range interactions. Our conclusions highlight the need to re-evaluate and re-design existing methods for training deep GNNs, with a renewed focus on modelling long-range interactions.
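
For readers unfamiliar with the edge-dropping baselines discussed here, plain DropEdge amounts to removing each edge independently at random during training. The sketch below shows only that baseline, not DropSens; the `(2, num_edges)` edge-list layout and the function name are assumptions, and each directed edge is dropped independently.

```python
import numpy as np

def drop_edge(edge_index, drop_prob, rng):
    """Keep each edge independently with probability 1 - drop_prob.

    edge_index: integer array of shape (2, num_edges) holding source and target nodes.
    """
    keep = rng.random(edge_index.shape[1]) >= drop_prob
    return edge_index[:, keep]

# Path graph 0-1-2-3-4 stored as directed edges in both directions.
src = np.array([0, 1, 1, 2, 2, 3, 3, 4])
dst = np.array([1, 0, 2, 1, 3, 2, 4, 3])
edge_index = np.stack([src, dst])
sparser = drop_edge(edge_index, drop_prob=0.5, rng=np.random.default_rng(0))
```

On a path graph like this one, every dropped edge removes the only route between the two ends, which gives some intuition for why uniform edge dropping can reduce sensitivity between distant nodes.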

Article 338

Title@2025-05-29 (4): Model-Preserving Adaptive Rounding

Title: Model-Preserving Adaptive Rounding

Modellschonende adaptive Rundung

模型保护适应性四舍五入 2505.22988v1

Authors: Albert Tseng, Zhaofeng Sun, Christopher De Sa

The main goal of post-training quantization (PTQ) is to produce a compressed model whose output distribution is as close to the original model’s as possible. To do this tractably, almost all LLM PTQ algorithms quantize linear layers by independently minimizing the immediate activation error. However, this localized objective ignores the effect of subsequent layers, so reducing it does not necessarily give a closer model. In this work, we introduce Yet Another Quantization Algorithm (YAQA), an adaptive rounding algorithm that uses Kronecker-factored approximations of each linear layer’s Hessian with respect to the full-model KL divergence. YAQA consists of two components: Kronecker-factored sketches of the full layerwise Hessian that can be tractably computed for hundred-billion parameter LLMs, and a quantizer-independent rounding algorithm that uses these sketches and comes with theoretical guarantees. Across a wide range of models and quantizers, YAQA empirically reduces the KL divergence to the original model by approximately 30% while achieving state-of-the-art performance on downstream tasks.

Article 339

Title@2025-05-29 (4): Knowledge Distillation for Reservoir-based Classifier: Human Activity Recognition

Title: Knowledge Distillation for Reservoir-based Classifier: Human Activity Recognition

Wissensdestillation für Reservoir-basierte Klassifikator: Menschliche Aktivitätserkennung

以储量为基础的分类法知识蒸馏:人类活动认识 2505.22985v1

Authors: Masaharu Kagiyama, Tsuyoshi Okita

This paper aims to develop an energy-efficient classifier for time-series data by introducing PatchEchoClassifier, a novel model that leverages a reservoir-based mechanism known as the Echo State Network (ESN). The model is designed for human activity recognition (HAR) using one-dimensional sensor signals and incorporates a tokenizer to extract patch-level representations. To train the model efficiently, we propose a knowledge distillation framework that transfers knowledge from a high-capacity MLP-Mixer teacher to the lightweight reservoir-based student model. Experimental evaluations on multiple HAR datasets demonstrate that our model achieves over 80 percent accuracy while significantly reducing computational cost. Notably, PatchEchoClassifier requires only about one-sixth of the floating point operations (FLOPS) compared to DeepConvLSTM, a widely used convolutional baseline. These results suggest that PatchEchoClassifier is a promising solution for real-time and energy-efficient human activity recognition in edge computing environments.
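
The abstract does not spell out the distillation objective, so the sketch below shows one common choice as an assumption: a KL divergence between temperature-softened teacher and student class distributions. The function names and the temperature value are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl
```

In a setup like the one described, the teacher logits would come from the MLP-Mixer and the student logits from the reservoir-based model; the loss is zero only when the two softened distributions agree.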

Article 340

Title@2025-05-29 (4): A Computational Approach to Improving Fairness in K-means Clustering

Title: A Computational Approach to Improving Fairness in K-means Clustering

Ein Computational Approach zur Verbesserung der Fairness im K-Mittel-Clustering

改进K类手段分类组合的公平性计算方法 2505.22984v1

Authors: Guancheng Zhou, Haiping Xu, Hongkang Xu, Chenyu Li, Donghui Yan

The popular K-means clustering algorithm potentially suffers from a major weakness for further analysis or interpretation. Some clusters may have disproportionately more (or fewer) points from one of the subpopulations in terms of some sensitive variable, e.g., gender or race. Such a fairness issue may cause bias and unexpected social consequences. This work attempts to improve the fairness of K-means clustering with a two-stage optimization formulation: cluster first, then adjust the cluster membership of a small subset of selected data points. Two computationally efficient algorithms are proposed for identifying those data points that are expensive for fairness, with one focusing on the nearest data points outside of a cluster and the other on highly ‘mixed’ data points. Experiments on benchmark datasets show substantial improvement in fairness with minimal impact on clustering quality. The proposed algorithms can be easily extended to a broad class of clustering algorithms or fairness metrics.
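
To make the notion of disproportionate cluster composition concrete, here is a minimal sketch of one possible imbalance measure: the largest gap between a cluster's share of a group and that group's overall share. The metric and the name `group_share_gap` are assumptions for illustration; the abstract does not state which fairness metric the paper uses.

```python
from collections import Counter


def group_share_gap(cluster_ids, groups, group):
    """Largest absolute gap between a cluster's share of `group` and the overall share."""
    overall = sum(1 for g in groups if g == group) / len(groups)
    sizes = Counter(cluster_ids)
    in_group = Counter(c for c, g in zip(cluster_ids, groups) if g == group)
    return max(abs(in_group[c] / sizes[c] - overall) for c in sizes)


# Toy data: six points, two clusters, one binary sensitive attribute.
groups = ["a", "a", "a", "b", "b", "b"]
segregated = [0, 0, 0, 1, 1, 1]   # each cluster holds a single group
adjusted = [0, 0, 1, 1, 1, 0]     # two points swapped between clusters

gap_before = group_share_gap(segregated, groups, "a")
gap_after = group_share_gap(adjusted, groups, "a")
```

Swapping the membership of a small number of points lowers the gap in this toy case, which mirrors the second stage of the two-stage formulation at a very small scale.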

Article 341

Title@2025-05-29 (4): MedRAX: Medical Reasoning Agent for Chest X-ray

Title: MedRAX: Medical Reasoning Agent for Chest X-ray

MedRAX: Medizinischer Reasoning Agent für Bruströntgen

MedRAX: 胸前X光医疗理疗代理 2502.02673v2

Authors: Adibvafa Fallahpour, Jun Ma, Alif Munim, Hongwei Lyu, Bo Wang

Chest X-rays (CXRs) play an integral role in driving critical decisions in disease management and patient care. While recent innovations have led to specialized models for various CXR interpretation tasks, these solutions often operate in isolation, limiting their practical utility in clinical practice. We present MedRAX, the first versatile AI agent that seamlessly integrates state-of-the-art CXR analysis tools and multimodal large language models into a unified framework. MedRAX dynamically leverages these models to address complex medical queries without requiring additional training. To rigorously evaluate its capabilities, we introduce ChestAgentBench, a comprehensive benchmark containing 2,500 complex medical queries across 7 diverse categories. Our experiments demonstrate that MedRAX achieves state-of-the-art performance compared to both open-source and proprietary models, representing a significant step toward the practical deployment of automated CXR interpretation systems. Data and code are publicly available at https://github.com/bowang-lab/MedRAX

Article 342

Title@2025-05-29 (4): Theoretical guarantees on the best-of-n alignment policy

Title: Theoretical guarantees on the best-of-n alignment policy

Theoretische Garantien für die optimale Ausrichtungspolitik

关于最佳协调政策理论保障 2401.01879v3

Authors: Ahmad Beirami, Alekh Agarwal, Jonathan Berant, Alexander D’Amour, Jacob Eisenstein, Chirag Nagpal, Ananda Theertha Suresh

A simple and effective method for inference-time alignment and for scaling the test-time compute of generative models is best-of-$n$ sampling, where $n$ samples are drawn from a reference policy, ranked based on a reward function, and the highest ranking one is selected. A commonly used analytical expression in the literature claims that the KL divergence between the best-of-$n$ policy and the reference policy is equal to $\log(n) - (n-1)/n$. We disprove the validity of this claim, and show that it is an upper bound on the actual KL divergence. We also explore the tightness of this upper bound in different regimes, propose a new estimator for the KL divergence, and empirically show that it provides a tight approximation. We also show that the win rate of the best-of-$n$ policy against the reference policy is upper bounded by $n/(n+1)$ and derive bounds on the tightness of this characterization. We conclude by analyzing the tradeoffs between win rate and KL divergence of the best-of-$n$ alignment policy, which demonstrate that very good tradeoffs are achievable with $n < 1000$.
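
The two closed-form expressions in the abstract are easy to evaluate, and for a finite outcome set the exact KL divergence can be computed directly and compared with the bound. The sketch below assumes outcomes with strictly distinct rewards, listed in increasing reward order; the function names are hypothetical, and this is not the estimator proposed in the paper.

```python
import math


def kl_upper_bound(n):
    """log(n) - (n - 1)/n, which the abstract shows is an upper bound on the KL."""
    return math.log(n) - (n - 1) / n


def win_rate_upper_bound(n):
    """n / (n + 1), the stated upper bound on the best-of-n win rate."""
    return n / (n + 1)


def best_of_n_kl(ref_probs, n):
    """Exact KL(best-of-n || reference) over a finite outcome set.

    Assumes ref_probs is ordered by strictly increasing reward (no ties).
    """
    kl, cdf_prev = 0.0, 0.0
    for p in ref_probs:
        cdf = cdf_prev + p
        # Probability that the best of n independent draws is this outcome.
        q = cdf ** n - cdf_prev ** n
        if q > 0.0:
            kl += q * math.log(q / p)
        cdf_prev = cdf
    return kl


two_outcome_kl = best_of_n_kl([0.5, 0.5], n=2)
two_outcome_bound = kl_upper_bound(2)
```

With two equally likely outcomes and $n = 2$, the exact value falls strictly below $\log(2) - 1/2$, consistent with the expression being a bound rather than an equality.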

Article 343

Title@2025-05-29 (4): Learning coordinated badminton skills for legged manipulators

Title: Learning coordinated badminton skills for legged manipulators

Koordinierte Badminton-Fähigkeiten für Legged Manipulatoren lernen

为腿脚操纵者学习协调的羽毛球技能 2505.22974v1

Authors: Yuntao Ma, Andrei Cramariuc, Farbod Farshidian, Marco Hutter

Coordinating the motion between lower and upper limbs and aligning limb control with perception are substantial challenges in robotics, particularly in dynamic environments. To this end, we introduce an approach for enabling legged mobile manipulators to play badminton, a task that requires precise coordination of perception, locomotion, and arm swinging. We propose a unified reinforcement learning-based control policy for whole-body visuomotor skills involving all degrees of freedom to achieve effective shuttlecock tracking and striking. This policy is informed by a perception noise model that utilizes real-world camera data, allowing for consistent perception error levels between simulation and deployment and encouraging learned active perception behaviors. Our method includes a shuttlecock prediction model, constrained reinforcement learning for robust motion control, and integrated system identification techniques to enhance deployment readiness. Extensive experimental results in a variety of environments validate the robot’s capability to predict shuttlecock trajectories, navigate the service area effectively, and execute precise strikes against human players, demonstrating the feasibility of using legged mobile manipulators in complex and dynamic sports scenarios.

Article 344

Title@2025-05-29 (4): EquiReg: Equivariance Regularized Diffusion for Inverse Problems

Title: EquiReg: Equivariance Regularized Diffusion for Inverse Problems

EquiReg: Äquivarianz Regularisierte Diffusion für Inverse Probleme

EquiReg: 用于反向问题的公平、正规化传播 2505.22973v1

Authors: Bahareh Tolooshams, Aditi Chandrashekar, Rayhan Zirvi, Abbas Mammadov, Jiachen Yao, Chuwei Wang, Anima Anandkumar

Diffusion models represent the state-of-the-art for solving inverse problems such as image restoration tasks. In the Bayesian framework, diffusion-based inverse solvers incorporate a likelihood term to guide the prior sampling process, generating data consistent with the posterior distribution. However, due to the intractability of the likelihood term, many current methods rely on isotropic Gaussian approximations, which lead to deviations from the data manifold and result in inconsistent, unstable reconstructions. We propose Equivariance Regularized (EquiReg) diffusion, a general framework for regularizing posterior sampling in diffusion-based inverse problem solvers. EquiReg enhances reconstructions by reweighting diffusion trajectories and penalizing those that deviate from the data manifold. We define a new distribution-dependent equivariance error, empirically identify functions that exhibit low error for on-manifold samples and higher error for off-manifold samples, and leverage these functions to regularize the diffusion sampling process. When applied to a variety of solvers, EquiReg outperforms state-of-the-art diffusion models in both linear and nonlinear image restoration tasks, as well as in reconstructing partial differential equations.

Article 345

Title@2025-05-29 (4): Minimal Sufficient Views: A DNN model making predictions with more evidence has higher accuracy

Title: Minimal Sufficient Views: A DNN model making predictions with more evidence has higher accuracy

Minimal Ausreichende Ansichten: Ein DNN-Modell, das Vorhersagen mit mehr Beweisen macht, hat höhere Genauigkeit

最低限度的充分意见:一个DNN模型,用更多证据作出预测,其准确性更高 2402.01095v2

Authors: Keisuke Kawano, Takuro Kutsuna, Keisuke Sano

Deep neural networks (DNNs) exhibit high performance in image recognition; however, the reasons for their strong generalization abilities remain unclear. A plausible hypothesis is that DNNs achieve robust and accurate predictions by identifying multiple pieces of evidence from images. Thus, to test this hypothesis, this study proposed minimal sufficient views (MSVs). MSVs are defined as a set of minimal regions within an input image that are sufficient to preserve the prediction of DNNs, thus representing the evidence discovered by the DNN. We empirically demonstrated a strong correlation between the number of MSVs (i.e., the number of pieces of evidence) and the generalization performance of the DNN models. Remarkably, this correlation was found to hold within a single DNN as well as between different DNNs, including convolutional and transformer models. This suggested that a DNN model that makes its prediction based on more evidence has a higher generalization performance. We proposed a metric based on MSVs for DNN model selection that did not require label information. Consequently, we empirically showed that the proposed metric was less dependent on the degree of overfitting, rendering it a more reliable indicator of model performance than existing metrics, such as average confidence.

Article 346

Title@2025-05-29 (4): MermaidFlow: Redefining Agentic Workflow Generation via Safety-Constrained Evolutionary Programming

Title: MermaidFlow: Redefining Agentic Workflow Generation via Safety-Constrained Evolutionary Programming

MermaidFlow: Neudefinition der agentischen Workflow-Generierung durch sicherheitsbeschränkte evolutionäre Programmierung

美人鱼:通过受安全限制的进化方案拟订,重新确定干燥性工作流的产生 2505.22967v1

Authors: Chengqi Zheng, Jianda Chen, Yueming Lyu, Wen Zheng Terence Ng, Haopeng Zhang, Yew-Soon Ong, Ivor Tsang, Haiyan Yin

Despite the promise of autonomous agentic reasoning, existing workflow generation methods frequently produce fragile, unexecutable plans due to unconstrained LLM-driven construction. We introduce MermaidFlow, a framework that redefines the agentic search space through safety-constrained graph evolution. At its core, MermaidFlow represents workflows as a verifiable intermediate representation using Mermaid, a structured and human-interpretable graph language. We formulate domain-aware evolutionary operators, i.e., crossover, mutation, insertion, and deletion, to preserve semantic correctness while promoting structural diversity, enabling efficient exploration of a high-quality, statically verifiable workflow space. Without modifying task settings or evaluation protocols, MermaidFlow achieves consistent improvements in success rates and faster convergence to executable plans on the agent reasoning benchmark. The experimental results demonstrate that safety-constrained graph evolution offers a scalable, modular foundation for robust and interpretable agentic reasoning systems.

Article 347

Title@2025-05-29 (4): Exploring Scaling Laws for EHR Foundation Models

Title: Exploring Scaling Laws for EHR Foundation Models

Erforschung von Skalierungsgesetzen für EHR-Stiftungsmodelle

探索EHR基金会模式的扩展法律 2505.22964v1

Authors: Sheng Zhang, Qin Liu, Naoto Usuyama, Cliff Wong, Tristan Naumann, Hoifung Poon

The emergence of scaling laws has profoundly shaped the development of large language models (LLMs), enabling predictable performance gains through systematic increases in model size, dataset volume, and compute. Yet, these principles remain largely unexplored in the context of electronic health records (EHRs) – a rich, sequential, and globally abundant data source that differs structurally from natural language. In this work, we present the first empirical investigation of scaling laws for EHR foundation models. By training transformer architectures on patient timeline data from the MIMIC-IV database across varying model sizes and compute budgets, we identify consistent scaling patterns, including parabolic IsoFLOPs curves and power-law relationships between compute, model parameters, data size, and clinical utility. These findings demonstrate that EHR models exhibit scaling behavior analogous to LLMs, offering predictive insights into resource-efficient training strategies. Our results lay the groundwork for developing powerful EHR foundation models capable of transforming clinical prediction tasks and advancing personalized healthcare.
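
A parabolic IsoFLOPs curve is typically read by locating the vertex of a parabola in log model size, which marks the loss-minimizing size at a fixed compute budget. The sketch below finds that vertex from three points; the curve it uses is synthetic and generated from a known parabola purely to check the arithmetic, not data from the paper, and the function name is hypothetical.

```python
def parabola_vertex(points):
    """x-coordinate of the vertex of the parabola through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = points
    num = (x2 - x1) ** 2 * (y2 - y3) - (x2 - x3) ** 2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    if den == 0:
        raise ValueError("points are collinear; no unique vertex")
    return x2 - 0.5 * num / den


# Synthetic IsoFLOP curve: loss = 0.1 * (log10(N) - 8)^2 + 2, minimum at N = 1e8.
log_sizes = [7.0, 7.5, 9.0]
points = [(x, 0.1 * (x - 8.0) ** 2 + 2.0) for x in log_sizes]
best_log_size = parabola_vertex(points)
```

Repeating this for several compute budgets and regressing the optimal sizes against compute on log-log axes is the usual route to a power-law relationship; real curves are noisy, so a least-squares quadratic fit over more points would be the more robust choice.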

Article 348

Title@2025-05-29 (4): INRFlow: Flow Matching for INRs in Ambient Space

Title: INRFlow: Flow Matching for INRs in Ambient Space

INRFlow: Flow Passend für INRs im Umgebungsraum

INRFlow: 环境空间INR的流量匹配 2412.03791v2

Authors: Yuyang Wang, Anurag Ranjan, Josh Susskind, Miguel Angel Bautista

Flow matching models have emerged as a powerful method for generative modeling on domains like images or videos, and even on irregular or unstructured data like 3D point clouds or protein structures. These models are commonly trained in two stages: first, a data compressor is trained, and in a subsequent training stage a flow matching generative model is trained in the latent space of the data compressor. This two-stage paradigm sets obstacles for unifying models across data domains, as hand-crafted compressor architectures are used for different data modalities. To this end, we introduce INRFlow, a domain-agnostic approach to learn flow matching transformers directly in ambient space. Drawing inspiration from INRs, we introduce a conditionally independent point-wise training objective that enables INRFlow to make predictions continuously in coordinate space. Our empirical results demonstrate that INRFlow effectively handles different data modalities such as images, 3D point clouds and protein structure data, achieving strong performance in different domains and outperforming comparable approaches. INRFlow is a promising step towards domain-agnostic flow matching generative models that can be trivially adopted in different data domains.

Article 349

Title@2025-05-29 (4): ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind

Title: ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind

ToMAP: Training Gegner-Bewusst LLM überzeugt mit Theorie des Geistes

ToMAP:培训有思想理论的对抗者软件软件LLM 2505.22961v1

Authors: Peixuan Han, Zijia Liu, Jiaxuan You

Large language models (LLMs) have shown promising potential in persuasion, but existing works on training LLM persuaders are still preliminary. Notably, while humans are skilled in modeling their opponent’s thoughts and opinions proactively and dynamically, current LLMs struggle with such Theory of Mind (ToM) reasoning, resulting in limited diversity and opponent awareness. To address this limitation, we introduce Theory of Mind Augmented Persuader (ToMAP), a novel approach for building more flexible persuader agents by incorporating two theory of mind modules that enhance the persuader’s awareness and analysis of the opponent’s mental state. Specifically, we begin by prompting the persuader to consider possible objections to the target central claim, and then use a text encoder paired with a trained MLP classifier to predict the opponent’s current stance on these counterclaims. Our carefully designed reinforcement learning schema enables the persuader to learn how to analyze opponent-related information and utilize it to generate more effective arguments. Experiments show that the ToMAP persuader, while containing only 3B parameters, outperforms much larger baselines, like GPT-4o, with a relative gain of 39.4% across multiple persuadee models and diverse corpora. Notably, ToMAP exhibits complex reasoning chains and reduced repetition during training, which leads to more diverse and effective arguments. The opponent-aware feature of ToMAP also makes it suitable for long conversations and enables it to employ more logical and opponent-aware strategies. These results underscore our method’s effectiveness and highlight its potential for developing more persuasive language agents. Code is available at: https://github.com/ulab-uiuc/ToMAP.

Article 350

Title@2025-05-29 (4): Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness

Title: Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness

Multi-Agenten-Debatte als Test-Time Scaling: Eine systematische Studie der bedingten Wirksamkeit

重新审议作为试验时间尺度的多机构辩论:对有条件有效性的系统研究 2505.22960v1

Authors: Yongjin Yang, Euiin Yi, Jongwoo Ko, Kimin Lee, Zhijing Jin, Se-Young Yun

The remarkable growth in large language model (LLM) capabilities has spurred exploration into multi-agent systems, with debate frameworks emerging as a promising avenue for enhanced problem-solving. These multi-agent debate (MAD) approaches, where agents collaboratively present, critique, and refine arguments, potentially offer improved reasoning, robustness, and diverse perspectives over monolithic models. Despite prior studies leveraging MAD, a systematic understanding of its effectiveness compared to self-agent methods, particularly under varying conditions, remains elusive. This paper seeks to fill this gap by conceptualizing MAD as a test-time computational scaling technique, distinguished by collaborative refinement and diverse exploration capabilities. We conduct a comprehensive empirical investigation comparing MAD with strong self-agent test-time scaling baselines on mathematical reasoning and safety-related tasks. Our study systematically examines the influence of task difficulty, model scale, and agent diversity on MAD’s performance. Key findings reveal that, for mathematical reasoning, MAD offers limited advantages over self-agent scaling but becomes more effective with increased problem difficulty and decreased model capability, while agent diversity shows little benefit. Conversely, for safety tasks, MAD’s collaborative refinement can increase vulnerability, but incorporating diverse agent configurations facilitates a gradual reduction in attack success through the collaborative refinement process. We believe our findings provide critical guidance for the future development of more effective and strategically deployed MAD systems.

Article 351

Title@2025-05-29 (4): Unveiling Environmental Impacts of Large Language Model Serving: A Functional Unit View

Title: Unveiling Environmental Impacts of Large Language Model Serving: A Functional Unit View

Enthüllen von Umweltauswirkungen von großsprachigen Modellen: Eine funktionale Einheitsansicht

大型语文服务模式的不懈环境影响:职能单位观点 2502.11256v2

Authors: Yanran Wu, Inez Hua, Yi Ding

Large language models (LLMs) offer powerful capabilities but come with significant environmental impact, particularly in carbon emissions. Existing studies benchmark carbon emissions but lack a standardized basis for comparison across different model configurations. To address this, we introduce the concept of functional unit (FU) as a standardized basis and develop FUEL, the first FU-based framework for evaluating LLM serving’s environmental impact. Through three case studies, we uncover key insights and trade-offs in reducing carbon emissions by optimizing model size, quantization strategy, and hardware choice, paving the way for more sustainable LLM serving. The code is available at https://github.com/jojacola/FUEL.

Article 352

Title@2025-05-29 (4): CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance

Title: CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance

CodeSteer: Symbolisch-Augmentierte Sprachmodelle über Code/Text Anleitung

代码器:通过编码/文本指导的代码/文本指导的代码器:代号辅助语言模式 2502.04350v2

Authors: Yongchao Chen, Yilun Hao, Yueying Liu, Yang Zhang, Chuchu Fan

Existing methods fail to effectively steer Large Language Models (LLMs) between textual reasoning and code generation, leaving symbolic computing capabilities underutilized. We introduce CodeSteer, an effective method for guiding LLM code/text generation. We construct a comprehensive benchmark, SymBench, comprising 37 symbolic tasks with adjustable complexity, and also synthesize datasets of 12k multi-turn guidance/generation trajectories and 5.5k guidance comparison pairs. We fine-tune the Llama-3-8B model with a newly designed multi-turn supervised fine-tuning (SFT) and direct preference optimization (DPO). The resulting model, CodeSteerLLM, augmented with the proposed symbolic and self-answer checkers, effectively guides the code/text generation of larger models. Augmenting GPT-4o with CodeSteer raises its average performance score from 53.3 to 86.4, even outperforming the best existing LLMs, OpenAI o1 (82.7), o1-preview (74.8), and DeepSeek R1 (76.8), across all 37 tasks (28 seen, 9 unseen). Trained for GPT-4o, CodeSteer demonstrates superior generalizability, providing an average 41.8 performance boost on Claude, Mistral, and GPT-3.5. CodeSteer-guided LLMs fully harness symbolic computing to maintain strong performance on highly complex tasks. Models, datasets, and code are available at https://github.com/yongchao98/CodeSteer-v1.0 and https://huggingface.co/yongchao98.

Article 353

Title@2025-05-29 (4): Understanding Bias Reinforcement in LLM Agents Debate

Title: Understanding Bias Reinforcement in LLM Agents Debate

Verständnis der Bias-Verstärkung in LLM-Agenten-Debatte

了解LLLM代理商的强化申请 2503.16814v2

Authors: Jihwan Oh, Minchan Jeong, Jongwoo Ko, Se-Young Yun

Large Language Models (LLMs) solve complex problems using training-free methods like prompt engineering and in-context learning, yet ensuring reasoning correctness remains challenging. While self-correction methods such as self-consistency and self-refinement aim to improve reliability, they often reinforce biases due to the lack of effective feedback mechanisms. Multi-Agent Debate (MAD) has emerged as an alternative, but we identify two key limitations: bias reinforcement, where debate amplifies model biases instead of correcting them, and lack of perspective diversity, as all agents share the same model and reasoning patterns, limiting true debate effectiveness. To systematically evaluate these issues, we introduce MetaNIM Arena, a benchmark designed to assess LLMs in adversarial strategic decision-making, where dynamic interactions influence optimal decisions. To overcome MAD’s limitations, we propose DReaMAD (Diverse Reasoning via Multi-Agent Debate with Refined Prompt), a novel framework that (1) refines LLM’s strategic prior knowledge to improve reasoning quality and (2) promotes diverse viewpoints within a single model by systematically modifying prompts, reducing bias. Empirical results show that DReaMAD significantly improves decision accuracy, reasoning diversity, and bias mitigation across multiple strategic tasks, establishing it as a more effective approach for LLM-based decision-making.

Article 354

Title@2025-05-29 (4): Performance Guaranteed Poisoning Attacks in Federated Learning: A Sliding Mode Approach

Title: Performance Guaranteed Poisoning Attacks in Federated Learning: A Sliding Mode Approach

Leistungsgarantie Vergiftung Angriffe im Föderierten Lernen: Ein Schiebemodus Ansatz

联邦学习中保证中毒袭击的绩效:一种脱落模式方法 2505.16403v2

Authors: Huazi Pan, Yanjun Zhang, Leo Yu Zhang, Scott Adams, Abbas Kouzani, Suiyang Khoo

Manipulation of local training data and local updates, i.e., the poisoning attack, is the main threat arising from the collaborative nature of the federated learning (FL) paradigm. Most existing poisoning attacks aim to manipulate local data/models in a way that causes denial-of-service (DoS) issues. In this paper, we introduce a novel attack method, the Federated Learning Sliding Attack (FedSA) scheme, which aims to precisely control the extent of poisoning in a subtle manner. It operates with a predefined objective, such as reducing the global model’s prediction accuracy by 10%. FedSA integrates robust nonlinear control theory, specifically Sliding Mode Control (SMC), with model poisoning attacks. It can manipulate the updates from malicious clients to drive the global model towards a compromised state, achieving this at a controlled and inconspicuous rate. Additionally, leveraging the robust control properties of FedSA allows precise control over the convergence bounds, enabling the attacker to set the global accuracy of the poisoned model to any desired level. Experimental results demonstrate that FedSA can accurately achieve a predefined global accuracy with fewer malicious clients while maintaining a high level of stealth and adjustable learning rates.

Article 355

Title@2025-05-29 (4): CellFlux: Simulating Cellular Morphology Changes via Flow Matching

Title: CellFlux: Simulating Cellular Morphology Changes via Flow Matching

CellFlux: simulierende zelluläre Morphologie-Änderungen durch Flow Matching

细胞通量:通过流动匹配模拟细胞生理变化 2502.09775v2

Authors: Yuhui Zhang, Yuchang Su, Chenyu Wang, Tianhong Li, Zoe Wefers, Jeffrey Nirschl, James Burgess, Daisy Ding, Alejandro Lozano, Emma Lundberg, Serena Yeung-Levy

Building a virtual cell capable of accurately simulating cellular behaviors in silico has long been a dream in computational biology. We introduce CellFlux, an image-generative model that simulates cellular morphology changes induced by chemical and genetic perturbations using flow matching. Unlike prior methods, CellFlux models distribution-wise transformations from unperturbed to perturbed cell states, effectively distinguishing actual perturbation effects from experimental artifacts such as batch effects – a major challenge in biological data. Evaluated on chemical (BBBC021), genetic (RxRx1), and combined perturbation (JUMP) datasets, CellFlux generates biologically meaningful cell images that faithfully capture perturbation-specific morphological changes, achieving a 35% improvement in FID scores and a 12% increase in mode-of-action prediction accuracy over existing methods. Additionally, CellFlux enables continuous interpolation between cellular states, providing a potential tool for studying perturbation dynamics. These capabilities mark a significant step toward realizing virtual cell modeling for biomedical research. Project page: https://yuhui-zh15.github.io/CellFlux/.

Article 356

Title@2025-05-29 (4): Directed Graph Grammars for Sequence-based Learning

Title: Directed Graph Grammars for Sequence-based Learning

Gezielte Graphen-Grammatik für sequenzbasiertes Lernen

以序列为基础的学习方向图表语法 2505.22949v1

Authors: Michael Sun, Orion Foo, Gang Liu, Wojciech Matusik, Jie Chen

Directed acyclic graphs (DAGs) are a class of graphs commonly used in practice, with examples that include electronic circuits, Bayesian networks, and neural architectures. While many effective encoders exist for DAGs, it remains challenging to decode them in a principled manner, because the nodes of a DAG can have many different topological orders. In this work, we propose a grammar-based approach to constructing a principled, compact and equivalent sequential representation of a DAG. Specifically, we view a graph as derivations over an unambiguous grammar, where the DAG corresponds to a unique sequence of production rules. Equivalently, the procedure to construct such a description can be viewed as a lossless compression of the data. Such a representation has many uses, including building a generative model for graph generation, learning a latent space for property prediction, and leveraging the sequence representational continuity for Bayesian Optimization over structured data. Code is available at https://github.com/shiningsunnyday/induction.
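
The decoding difficulty mentioned above stems from the fact that a single DAG usually admits several valid node orderings. The brute-force counter below makes that concrete on tiny graphs; it is an illustrative sketch with a hypothetical function name, unrelated to the grammar-based method itself, and its running time grows factorially with the number of nodes.

```python
def count_topological_orders(nodes, edges):
    """Count the orderings of `nodes` in which every edge (u, v) places u before v."""
    preds = {v: set() for v in nodes}
    for u, v in edges:
        preds[v].add(u)

    def extend(placed):
        if len(placed) == len(nodes):
            return 1
        total = 0
        for v in nodes:
            # A node may come next once all of its predecessors are placed.
            if v not in placed and preds[v] <= placed:
                total += extend(placed | {v})
        return total

    return extend(frozenset())


# Diamond-shaped DAG: a -> b, a -> c, b -> d, c -> d.
diamond_nodes = ["a", "b", "c", "d"]
diamond_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
diamond_orders = count_topological_orders(diamond_nodes, diamond_edges)
```

Even the four-node diamond has two valid orders (`a, b, c, d` and `a, c, b, d`), so a sequence decoder trained on node orderings has no single canonical target unless the representation removes this ambiguity.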

Article 357

Title@2025-05-28 (3): NegVQA: Can Vision Language Models Understand Negation?

Title: NegVQA: Can Vision Language Models Understand Negation?

NegVQA: Können Visions-Sprachmodelle Negation verstehen?

NegVQA:视觉语言模式能理解差吗? 2505.22946v1

Authors: Yuhui Zhang, Yuchang Su, Yiming Liu, Serena Yeung-Levy

Negation is a fundamental linguistic phenomenon that can entirely reverse the meaning of a sentence. As vision language models (VLMs) continue to advance and are deployed in high-stakes applications, assessing their ability to comprehend negation becomes essential. To address this, we introduce NegVQA, a visual question answering (VQA) benchmark consisting of 7,379 two-choice questions covering diverse negation scenarios and image-question distributions. We construct NegVQA by leveraging large language models to generate negated versions of questions from existing VQA datasets. Evaluating 20 state-of-the-art VLMs across seven model families, we find that these models struggle significantly with negation, exhibiting a substantial performance drop compared to their responses to the original questions. Furthermore, we uncover a U-shaped scaling trend, where increasing model size initially degrades performance on NegVQA before leading to improvements. Our benchmark reveals critical gaps in VLMs’ negation understanding and offers insights into future VLM development. Project page available at https://yuhui-zh15.github.io/NegVQA/.

Article 358

Title@2025-05-28 (3): Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates

Title: Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates

Kann LLMs CLIP deciive? Benchmarking Adversarial Compositionalität der vortrainierten multimodalen Darstellung über Textaktualisierungen

LLMs CLIP能否通过文本更新确定培训前多模式代表的反向构成基准? 2505.22943v1

Authors: Jaewoo Ahn, Heeseung Yun, Dayoon Ko, Gunhee Kim

While pre-trained multimodal representations (e.g., CLIP) have shown impressive capabilities, they exhibit significant compositional vulnerabilities leading to counterintuitive judgments. We introduce Multimodal Adversarial Compositionality (MAC), a benchmark that leverages large language models (LLMs) to generate deceptive text samples to exploit these vulnerabilities across different modalities and evaluates them through both sample-wise attack success rate and group-wise entropy-based diversity. To improve zero-shot methods, we propose a self-training approach that leverages rejection-sampling fine-tuning with diversity-promoting filtering, which enhances both attack success rate and sample diversity. Using smaller language models like Llama-3.1-8B, our approach demonstrates superior performance in revealing compositional vulnerabilities across various multimodal representations, including images, videos, and audios.

Article 359

Title@2025-05-28 (3): Are Domain Generalization Benchmarks with Accuracy on the Line Misspecified?

Title: Are Domain Generalization Benchmarks with Accuracy on the Line Misspecified?

Sind Domain Generalization Benchmarks mit Genauigkeit auf der Zeile falsch angegeben?

域通用基准与误标线的准确性是否一致? 2504.00186v2

Authors: Olawale Salaudeen, Nicole Chiou, Shiny Weng, Sanmi Koyejo

Spurious correlations are unstable statistical associations that hinder robust decision-making. Conventional wisdom suggests that models relying on such correlations will fail to generalize out-of-distribution (OOD), especially under strong distribution shifts. However, empirical evidence challenges this view as naive in-distribution empirical risk minimizers often achieve the best OOD accuracy across popular OOD generalization benchmarks. In light of these results, we propose a different perspective: many widely used benchmarks for evaluating robustness to spurious correlations are misspecified. Specifically, they fail to include shifts in spurious correlations that meaningfully impact OOD generalization, making them unsuitable for evaluating the benefit of removing such correlations. We establish conditions under which a distribution shift can reliably assess a model’s reliance on spurious correlations. Crucially, under these conditions, we should not observe a strong positive correlation between in-distribution and OOD accuracy, often called “accuracy on the line.” Yet, most state-of-the-art benchmarks exhibit this pattern, suggesting they do not effectively assess robustness. Our findings expose a key limitation in current benchmarks used to evaluate domain generalization algorithms, that is, models designed to avoid spurious correlations. We highlight the need to rethink how robustness to spurious correlations is assessed, identify well-specified benchmarks the field should prioritize, and enumerate strategies for designing future benchmarks that meaningfully reflect robustness under distribution shift.
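
The "accuracy on the line" pattern mentioned above can be checked numerically as a correlation between in-distribution and OOD accuracy across a set of models. The sketch below is only an illustration: the accuracy values are made-up toy numbers, not results from the paper, and `pearson` is a hypothetical helper name.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy, made-up accuracies for three hypothetical models.
id_acc = [0.70, 0.80, 0.90]
ood_on_line = [0.50, 0.60, 0.70]   # OOD accuracy rises with ID accuracy
ood_off_line = [0.70, 0.60, 0.50]  # OOD accuracy falls as ID accuracy rises

r_on = pearson(id_acc, ood_on_line)
r_off = pearson(id_acc, ood_off_line)
print(f"on the line: r = {r_on:.2f}, off the line: r = {r_off:.2f}")
```

Under the abstract's argument, a benchmark whose models look like the first case (strong positive correlation) would be a weak test of reliance on spurious correlations.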

Article 360

Title: Generative Social Choice: The Next Generation

Generative soziale Wahl: Die nächste Generation

产生社会选择:下一代 2505.22939v1

Authors: Niclas Boehmer, Sara Fish, Ariel D. Procaccia

A key task in certain democratic processes is to produce a concise slate of statements that proportionally represents the full spectrum of user opinions. This task is similar to committee elections, but unlike traditional settings, the candidate set comprises all possible statements of varying lengths, and so it can only be accessed through specific queries. Combining social choice and large language models, prior work has approached this challenge through a framework of generative social choice. We extend the framework in two fundamental ways, providing theoretical guarantees even in the face of approximately optimal queries and a budget limit on the overall length of the slate. Using GPT-4o to implement queries, we showcase our approach on datasets related to city improvement measures and drug reviews, demonstrating its effectiveness in generating representative slates from unstructured user opinions.

Article 361

Title@2025-05-28 (3): Is Noise Conditioning Necessary? A Unified Theory of Unconditional Graph Diffusion Models

Title: Is Noise Conditioning Necessary? A Unified Theory of Unconditional Graph Diffusion Models

Ist die Lärmkonditionierung notwendig? Eine einheitliche Theorie der Bedingungslosen Graphen-Diffusionsmodelle

是否有必要设定噪音条件? 无条件图形扩散模型的统一理论 2505.22935v1

Authors: Jipeng Li, Yanning Shen

Explicit noise-level conditioning is widely regarded as essential for the effective operation of Graph Diffusion Models (GDMs). In this work, we challenge this assumption by investigating whether denoisers can implicitly infer noise levels directly from corrupted graph structures, potentially eliminating the need for explicit noise conditioning. To this end, we develop a theoretical framework centered on Bernoulli edge-flip corruptions and extend it to encompass more complex scenarios involving coupled structure-attribute noise. Extensive empirical evaluations on both synthetic and real-world graph datasets, using models such as GDSS and DiGress, provide strong support for our theoretical findings. Notably, unconditional GDMs achieve performance comparable or superior to their conditioned counterparts, while also offering reductions in parameters (4-6%) and computation time (8-10%). Our results suggest that the high-dimensional nature of graph data itself often encodes sufficient information for the denoising process, opening avenues for simpler, more efficient GDM architectures.
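
To make the idea of inferring the noise level from the corrupted structure concrete, here is a minimal sketch of a Bernoulli edge-flip corruption and a density-based estimate of the flip probability. It is not the paper's framework: it assumes the clean edge density d is known and different from 0.5, and uses the identity E[d'] = d + p(1 - 2d) for the corrupted density d'. All function names are hypothetical.

```python
import random
from itertools import combinations

def flip_edges(edges, n_nodes, p, rng):
    """Flip every node pair (edge or non-edge) independently with probability p."""
    noisy = set()
    for pair in combinations(range(n_nodes), 2):
        present = pair in edges
        if rng.random() < p:
            present = not present
        if present:
            noisy.add(pair)
    return noisy

def density(edges, n_nodes):
    """Fraction of node pairs that are connected."""
    return len(edges) / (n_nodes * (n_nodes - 1) / 2)

def estimate_flip_prob(noisy_density, clean_density):
    """Invert E[d'] = d + p * (1 - 2d); assumes clean_density != 0.5."""
    return (noisy_density - clean_density) / (1 - 2 * clean_density)

n = 200
clean = {(i, i + 1) for i in range(n - 1)}  # a sparse path graph
rng = random.Random(0)
noisy = flip_edges(clean, n, 0.1, rng)
p_hat = estimate_flip_prob(density(noisy, n), density(clean, n))
print(f"estimated flip probability: {p_hat:.3f}")
```

With roughly 20,000 node pairs the estimate concentrates tightly around the true value, which is the intuition behind high-dimensional graph data carrying enough information about its own noise level.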

Article 362

Title@2025-05-28 (3): Unraveling LoRA Interference: Orthogonal Subspaces for Robust Model Merging

Title: Unraveling LoRA Interference: Orthogonal Subspaces for Robust Model Merging

Unraveling LoRA Interferenz: Orthogonale Subräume für robuste Modellzusammenführung

开放 LoRA 干涉度: 用于强力模型合并的正弦形子空间 2505.22934v1

Authors: Haobo Zhang, Jiayu Zhou

Fine-tuning large language models (LMs) for individual tasks yields strong performance but is expensive for deployment and storage. Recent works explore model merging to combine multiple task-specific models into a single multi-task model without additional training. However, existing merging methods often fail for models fine-tuned with low-rank adaptation (LoRA), due to significant performance degradation. In this paper, we show that this issue arises from a previously overlooked interplay between model parameters and data distributions. We propose Orthogonal Subspaces for Robust model Merging (OSRM) to constrain the LoRA subspace prior to fine-tuning, ensuring that updates relevant to one task do not adversely shift outputs for others. Our approach can seamlessly integrate with most existing merging algorithms, reducing the unintended interference among tasks. Extensive experiments on eight datasets, tested with three widely used LMs and two large LMs, demonstrate that our method not only boosts merging performance but also preserves single-task accuracy. Furthermore, our approach exhibits greater robustness to the hyperparameters of merging. These results highlight the importance of data-parameter interaction in model merging and offer a plug-and-play solution for merging LoRA models.

Article 363

Title@2025-05-28 (3): K-Paths: Reasoning over Graph Paths for Drug Repurposing and Drug Interaction Prediction

Title: K-Paths: Reasoning over Graph Paths for Drug Repurposing and Drug Interaction Prediction

K-Paths: Begründung über Graphenpfade für Drogenrepurposing und Drogeninteraktionsvorhersage

K-Paths: 以图解路径为依据进行药物再定位和药物相互作用预测 2502.13344v3

Authors: Tassallah Abdullahi, Ioanna Gemou, Nihal V. Nayak, Ghulam Murtaza, Stephen H. Bach, Carsten Eickhoff, Ritambhara Singh

Biomedical knowledge graphs (KGs) encode rich, structured information critical for drug discovery tasks, but extracting meaningful insights from large-scale KGs remains challenging due to their complex structure. Existing biomedical subgraph retrieval methods are tailored for graph neural networks (GNNs), limiting compatibility with other paradigms, including large language models (LLMs). We introduce K-Paths, a model-agnostic retrieval framework that extracts structured, diverse, and biologically meaningful multi-hop paths from dense biomedical KGs. These paths enable the prediction of unobserved drug-drug and drug-disease interactions, including those involving entities not seen during training, thus supporting inductive reasoning. K-Paths is training-free and employs a diversity-aware adaptation of Yen’s algorithm to extract the K shortest loopless paths between entities in a query, prioritizing biologically relevant and relationally diverse connections. These paths serve as concise, interpretable reasoning chains that can be directly integrated with LLMs or GNNs to improve generalization and accuracy and to enable explainable inference. Experiments on benchmark datasets show that K-Paths improves zero-shot reasoning across state-of-the-art LLMs. For instance, Tx-Gemma 27B improves by 19.8 and 4.0 F1 points on interaction severity prediction and drug repurposing tasks, respectively. Llama 70B achieves gains of 8.5 and 6.2 points on the same tasks. K-Paths also boosts the training efficiency of EmerGNN, a state-of-the-art GNN, by reducing the KG size by 90% while maintaining predictive performance. Beyond efficiency, K-Paths bridges the gap between KGs and LLMs, enabling scalable and explainable LLM-augmented scientific discovery. We release our code and the retrieved paths as a benchmark for inductive reasoning.
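
As a rough illustration of retrieving the K shortest loopless paths between two entities, the sketch below uses a plain best-first search over simple paths. This is deliberately not Yen's algorithm and not the diversity-aware variant described in the abstract; it is a simpler stand-in that returns paths in nondecreasing cost order when weights are nonnegative. The toy graph, its entity names, and the function name are all hypothetical.

```python
import heapq

def k_shortest_simple_paths(graph, source, target, k):
    """Return up to k loopless paths as (cost, path), cheapest first.

    graph: dict mapping node -> dict of neighbor -> nonnegative weight.
    Best-first search over partial simple paths (worst case exponential).
    """
    heap = [(0, [source])]
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in path:  # keep the path loopless
                heapq.heappush(heap, (cost + weight, path + [neighbor]))
    return found

# Hypothetical toy knowledge graph with unit edge weights.
toy_kg = {
    "drugA": {"gene1": 1, "gene2": 1},
    "gene1": {"diseaseX": 1},
    "gene2": {"pathway1": 1},
    "pathway1": {"diseaseX": 1},
}
paths = k_shortest_simple_paths(toy_kg, "drugA", "diseaseX", 2)
for cost, path in paths:
    print(cost, " -> ".join(path))
```

Each returned path can be read as a short reasoning chain between the query entities, which is the form in which such paths would be handed to an LLM or a GNN.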

Article 364

Title@2025-05-28 (3): How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias

Title: How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias

Wie Transformer lernen Regelmäßige Spracherkennung: Eine theoretische Studie über Trainingsdynamik und Implizite Bias

变换人如何学习常规语言识别:关于培训动态和隐含偏见的理论研究 2505.00926v3

Authors: Ruiquan Huang, Yingbin Liang, Jing Yang

Language recognition tasks are fundamental in natural language processing (NLP) and have been widely used to benchmark the performance of large language models (LLMs). These tasks also play a crucial role in explaining the working mechanisms of transformers. In this work, we focus on two representative tasks in the category of regular language recognition, known as “even pairs” and “parity check”, the aim of which is to determine whether the occurrences of certain subsequences in a given sequence are even. Our goal is to explore how a one-layer transformer, consisting of an attention layer followed by a linear layer, learns to solve these tasks by theoretically analyzing its training dynamics under gradient descent. While even pairs can be solved directly by a one-layer transformer, parity check needs to be solved by integrating Chain-of-Thought (CoT), either into the inference stage of a transformer well-trained for the even pairs task, or into the training of a one-layer transformer. For both problems, our analysis shows that the joint training of attention and linear layers exhibits two distinct phases. In the first phase, the attention layer grows rapidly, mapping data sequences into separable vectors. In the second phase, the attention layer becomes stable, while the linear layer grows logarithmically and approaches in direction to a max-margin hyperplane that correctly separates the attention layer outputs into positive and negative samples, and the loss decreases at a rate of $O(1/t)$. Our experiments validate those theoretical results.

Article 365

Title@2025-05-28 (3): Scalable Parameter and Memory Efficient Pretraining for LLM: Recent Algorithmic Advances and Benchmarking

Title: Scalable Parameter and Memory Efficient Pretraining for LLM: Recent Algorithmic Advances and Benchmarking

Skalierbare Parameter und Speicher Effizientes Vortraining für LLM: Algorithmische Fortschritte und Benchmarking

LLM的可缩放参数和记忆高效预修培训:最近的演算进展和基准 2505.22922v1

Authors: Athanasios Glentis, Jiaxiang Li, Qiulin Shang, Andi Han, Ioannis Tsaknakis, Quan Wei, Mingyi Hong

Fueled by their remarkable ability to tackle diverse tasks across multiple domains, large language models (LLMs) have grown at an unprecedented rate, with some recent models containing trillions of parameters. This growth is accompanied by substantial computational challenges, particularly regarding the memory and compute resources required for training and fine-tuning. Numerous approaches have been explored to address these issues, such as LoRA. While these methods are effective for fine-tuning, their application to pre-training is significantly more challenging due to the need to learn from vast datasets. Motivated by this issue, we aim to address the following questions: Can parameter- or memory-efficient methods enhance pre-training efficiency while achieving performance comparable to full-model training? How can the performance gap be narrowed? To this end, the contributions of this work are the following. (1) We begin by conducting a comprehensive survey that summarizes state-of-the-art methods for efficient pre-training. (2) We perform a benchmark evaluation of several representative memory-efficient pre-training approaches to comprehensively evaluate their performance across model sizes. We observe that with a proper choice of optimizer and hyperparameters, full-rank training delivers the best performance, as expected. We also notice that incorporating high-rank updates in low-rank approaches is the key to improving their performance. (3) Finally, we propose two practical techniques, namely weight refactorization and momentum reset, to enhance the performance of efficient pre-training methods. We observe that applying these techniques to the low-rank method (on a 1B model) can achieve a lower perplexity than popular memory-efficient algorithms such as GaLore and Fira, while simultaneously using about 25% less memory.

Article 366

Title@2025-05-28 (3): Unlocking Mental Health: Exploring College Students’ Well-being through Smartphone Behaviors

Title: Unlocking Mental Health: Exploring College Students’ Well-being through Smartphone Behaviors

Entsperren der psychischen Gesundheit: Erforschen des Wohlbefindens der Studenten durch Smartphone-Verhalten

解锁心理健康:通过智能手机行为探索大学生福祉 2502.08766v2

Authors: Wei Xuan, Meghna Roy Chowdhury, Yi Ding, Yixue Zhao

The global mental health crisis is a pressing concern, with college students particularly vulnerable to rising mental health disorders. The widespread use of smartphones among young adults, while offering numerous benefits, has also been linked to negative outcomes such as addiction and regret, significantly impacting well-being. Leveraging the longest longitudinal dataset collected over four college years through passive mobile sensing, this study is the first to examine the relationship between students’ smartphone unlocking behaviors and their mental health at scale in real-world settings. We provide the first evidence demonstrating the predictability of phone unlocking behaviors for mental health outcomes based on a large dataset, highlighting the potential of these novel features for future predictive models. Our findings reveal important variations in smartphone usage across genders and locations, offering a deeper understanding of the interplay between digital behaviors and mental health. We highlight future research directions aimed at mitigating adverse effects and promoting digital well-being in this population.

Article 367

Title@2025-05-28 (3): Enhancing Semi-supervised Learning with Zero-shot Pseudolabels

Title: Enhancing Semi-supervised Learning with Zero-shot Pseudolabels

Halbbeaufsichtigtes Lernen mit Null-Shot-Pseudo-Labels verbessern

用零弹Pseudo标签加强半监督的学习 2502.12584v2

Authors: Jichan Chung, Irene Y. Chen

The high cost of data labeling presents a major barrier to deploying machine learning systems at scale. Semi-supervised learning (SSL) mitigates this challenge by utilizing unlabeled data alongside limited labeled examples, while the emergence of foundation models (FMs) offers powerful zero-shot capabilities that can further reduce labeling cost. However, directly fine-tuning large FMs is often impractical in resource-constrained settings, and naively using their pseudo-labels for unlabeled data can degrade performance due to their unreliability or domain mismatch with the target task. In this work, we introduce ZeroMatch, a novel SSL framework that integrates knowledge distillation with consistency-based learning to jointly leverage labeled data, unlabeled data, and pseudo-labels from FMs. ZeroMatch enables training compact student models using only FM inference, making it suitable for low-resource environments such as personal devices with limited compute. Experiments on six vision and language classification benchmarks show that ZeroMatch consistently outperforms standard SSL and zero-shot augmented methods, demonstrating its effectiveness and robustness across a range of foundation model qualities.

Article 368

Title: cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning

cadrille: Multimodale CAD-Rekonstruktion mit Online-Verstärkung

与在线强化学习相结合的多模式 CAD重建 2505.22914v1

Authors: Maksim Kolodiazhnyi, Denis Tarasov, Dmitrii Zhemchuzhnikov, Alexander Nikulin, Ilya Zisman, Anna Vorontsova, Anton Konushin, Vladislav Kurenkov, Danila Rukhovich

Computer-Aided Design (CAD) plays a central role in engineering and manufacturing, making it possible to create precise and editable 3D models. Using a variety of sensor or user-provided data as inputs for CAD reconstruction can democratize access to design applications. However, existing methods typically focus on a single input modality, such as point clouds, images, or text, which limits their generalizability and robustness. Leveraging recent advances in vision-language models (VLM), we propose a multi-modal CAD reconstruction model that simultaneously processes all three input modalities. Inspired by large language model (LLM) training paradigms, we adopt a two-stage pipeline: supervised fine-tuning (SFT) on large-scale procedurally generated data, followed by reinforcement learning (RL) fine-tuning using online feedback, obtained programmatically. Furthermore, we are the first to explore RL fine-tuning of LLMs for CAD tasks, demonstrating that online RL algorithms such as Group Relative Policy Optimization (GRPO) outperform offline alternatives. In the DeepCAD benchmark, our SFT model outperforms existing single-modal approaches in all three input modalities simultaneously. More importantly, after RL fine-tuning, cadrille sets a new state of the art on three challenging datasets, including a real-world one.

Article 369

Title@2025-05-28 (3): Mustafar: Promoting Unstructured Sparsity for KV Cache Pruning in LLM Inference

Title: Mustafar: Promoting Unstructured Sparsity for KV Cache Pruning in LLM Inference

Mustafar: Förderung unstrukturierter Sparsamkeit für KV Cache Pruning in LLM Inferenz

Mustafar:在LLM推理中促进KV Cache Pruning的无结构平衡 2505.22913v1

Authors: Donghyeon Joo, Helya Hosseini, Ramyad Hadidi, Bahar Asgari

We demonstrate that unstructured sparsity significantly improves KV cache compression for LLMs, enabling sparsity levels up to 70% without compromising accuracy or requiring fine-tuning. We conduct a systematic exploration of pruning strategies and find per-token magnitude-based pruning to be highly effective for both Key and Value caches under unstructured sparsity, surpassing prior structured pruning schemes. The Key cache benefits from prominent outlier elements, while the Value cache surprisingly benefits from a simple magnitude-based pruning despite its uniform distribution. KV cache size is the major bottleneck in decode performance due to high memory overhead for large context lengths. To address this, we use a bitmap-based sparse format and a custom attention kernel capable of compressing and directly computing over compressed caches pruned to arbitrary sparsity patterns, significantly accelerating memory-bound operations in decode computations and thereby compensating for the overhead of runtime pruning and compression. Our custom attention kernel coupled with the bitmap-based format delivers substantial compression of the KV cache, up to 45% of dense inference, and thereby enables longer context length and increased tokens/sec throughput of up to 2.23x compared to dense inference. Our pruning mechanism and sparse attention kernel are available at https://github.com/dhjoo98/mustafar.
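
The two ingredients named above, per-token magnitude-based pruning and a bitmap-based sparse format, can be sketched in plain Python. This is an illustrative toy under stated assumptions, not the authors' kernel or storage layout: each token's vector is pruned independently by keeping its largest-magnitude entries, and the survivors are stored as an integer bitmap plus a packed list of values. All function names are hypothetical.

```python
def prune_token(vec, sparsity):
    """Zero out the smallest-magnitude entries of one token's vector."""
    keep = max(1, round(len(vec) * (1 - sparsity)))
    by_magnitude = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    kept = set(by_magnitude[:keep])
    return [v if i in kept else 0.0 for i, v in enumerate(vec)]

def to_bitmap(vec):
    """Encode a sparse vector as (bitmap, packed nonzero values)."""
    bitmap, values = 0, []
    for i, v in enumerate(vec):
        if v != 0.0:
            bitmap |= 1 << i  # bit i marks a stored value at position i
            values.append(v)
    return bitmap, values

def from_bitmap(bitmap, values, length):
    """Decode (bitmap, values) back into a dense vector."""
    packed = iter(values)
    return [next(packed) if (bitmap >> i) & 1 else 0.0 for i in range(length)]

token = [0.1, -2.0, 0.3, 1.5]
pruned = prune_token(token, sparsity=0.5)
bitmap, values = to_bitmap(pruned)
restored = from_bitmap(bitmap, values, len(token))
print(pruned, bin(bitmap), values)
```

Because the bitmap records an arbitrary set of positions, no structure is imposed on which entries survive, which is what distinguishes unstructured from structured pruning.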

Article 370

Title@2025-05-28 (3): GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation

Title: GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation

GraphEval: Ein leichter Graph-basierter LLM-Rahmen für die Idee-Evaluierung

图图Eval:基于轻量图图的理论评估LLM框架 2503.12600v2

Authors: Tao Feng, Yihang Sun, Jiaxuan You

The powerful capabilities of Large Language Models (LLMs) have led to their growing use in evaluating human-generated content, particularly in evaluating research ideas within academic settings. Existing solutions primarily rely on prompt-based LLM methods or fine-tuned lightweight language models for idea evaluation. However, these methods are often unstable and struggle to comprehend the complex semantic information embedded in the ideas, impeding their ability to perform high-quality evaluations. To address the above challenges, we propose GraphEval, a lightweight graph-based LLM framework for idea evaluation. Our insight is that a complex idea can be broken down into comprehensible viewpoint nodes using prompts from small LLMs. These viewpoint nodes can then be linked together through edges created from LLM-based relation extraction and/or BERT similarity scores. The created viewpoint-graph can be used to conveniently propagate scores across view-nodes to improve the robustness of the idea evaluations. In particular, we propose two lightweight graph-based methods for idea evaluation: (1) GraphEval-LP: a training-free label propagation algorithm that propagates evaluation scores from known view-nodes to unknown nodes; (2) GraphEval-GNN: a Graph Neural Network (GNN) that is trained to predict the evaluation scores given the observed graph with minimal computation resources. Moreover, to overcome LLMs’ limitation in objectively assessing the novelty of ideas, we further add a novelty detection model to GraphEval-GNN to enhance its capability in judging idea novelty. Experiments on two datasets show GraphEval improves F1 scores by at least 14% with low computation and API costs. Additionally, GraphEval can effectively detect plagiarized ideas.
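
The general idea of training-free label propagation over a viewpoint graph can be sketched as follows. This is a generic textbook-style propagation, not the GraphEval-LP algorithm itself: nodes with known scores stay clamped, and each unknown node repeatedly takes the weighted average of its neighbors' scores. The graph, weights, and function name are hypothetical.

```python
def propagate_labels(adjacency, known, iterations=50):
    """Propagate scores from known nodes to unknown ones.

    adjacency: dict node -> dict of neighbor -> positive edge weight.
    known: dict node -> fixed score (these nodes are clamped).
    """
    scores = {node: known.get(node, 0.0) for node in adjacency}
    for _ in range(iterations):
        updated = {}
        for node, neighbors in adjacency.items():
            if node in known:
                updated[node] = known[node]
            elif neighbors:
                total = sum(neighbors.values())
                updated[node] = sum(w * scores[m] for m, w in neighbors.items()) / total
            else:
                updated[node] = scores[node]  # isolated node keeps its score
        scores = updated
    return scores

# Hypothetical chain of three viewpoint nodes: a - b - c.
adjacency = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0},
}
scores = propagate_labels(adjacency, known={"a": 1.0, "c": 0.0})
print(scores)
```

In this toy chain the unlabeled middle node settles at the average of its two labeled neighbors; edge weights such as similarity scores would tilt that average toward the more closely related viewpoint.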

Article 371

Title@2025-05-28 (3): Ensuring User-side Fairness in Dynamic Recommender Systems

Title: Ensuring User-side Fairness in Dynamic Recommender Systems

Gewährleistung der benutzerseitigen Fairness in dynamischen Recommender-Systemen

确保动态建议系统在用户方面的公平公正 2308.15651v3

Authors: Hyunsik Yoo, Zhichen Zeng, Jian Kang, Ruizhong Qiu, David Zhou, Zhining Liu, Fei Wang, Charlie Xu, Eunice Chan, Hanghang Tong

User-side group fairness is crucial for modern recommender systems, aiming to alleviate performance disparities among user groups defined by sensitive attributes like gender, race, or age. In the ever-evolving landscape of user-item interactions, continual adaptation to newly collected data is crucial for recommender systems to stay aligned with the latest user preferences. However, we observe that such continual adaptation often exacerbates performance disparities. This necessitates a thorough investigation into user-side fairness in dynamic recommender systems, an area that has been unexplored in the literature. This problem is challenging due to distribution shifts, frequent model updates, and non-differentiability of ranking metrics. To our knowledge, this paper presents the first principled study on ensuring user-side fairness in dynamic recommender systems. We start with theoretical analyses on fine-tuning vs. retraining, showing that the best practice is incremental fine-tuning with restart. Guided by our theoretical analyses, we propose FAir Dynamic rEcommender (FADE), an end-to-end fine-tuning framework to dynamically ensure user-side fairness over time. To overcome the non-differentiability of recommendation metrics in the fairness loss, we further introduce Differentiable Hit (DH) as an improvement over the recent NeuralNDCG method, not only alleviating its gradient vanishing issue but also achieving higher efficiency. In addition, we address the instability of the fairness loss by leveraging the competing nature of the recommendation loss and the fairness loss. Through extensive experiments on real-world datasets, we demonstrate that FADE effectively and efficiently reduces performance disparities with little sacrifice in the overall recommendation performance.

Article 372

Title@2025-05-28 (3): SP2RINT: Spatially-Decoupled Physics-Inspired Progressive Inverse Optimization for Scalable, PDE-Constrained Meta-Optical Neural Network Training

Title: SP2RINT: Spatially-Decoupled Physics-Inspired Progressive Inverse Optimization for Scalable, PDE-Constrained Meta-Optical Neural Network Training

SP2RINT: Spatially-Decoupled Physics-Inspired Progressive Inverse Optimization für skalierbare, PDE-Constrained Meta-Optical Neural Network Training

SP2RINT: 空间-减速物理激励-渐进式反向优化,用于可缩放、PDE-受培训的元神经网络培训 2505.18377v2

Authors: Pingchuan Ma, Ziang Yin, Qi Jing, Zhengqi Gao, Nicholas Gangi, Boyang Zhang, Tsung-Wei Huang, Zhaoran Huang, Duane S. Boning, Yu Yao, Jiaqi Gu

Diffractive optical neural networks (DONNs) leverage light propagation for efficient analog AI and signal processing. Advances in nanophotonic fabrication and metasurface-based wavefront engineering have opened new pathways to realize high-capacity DONNs across various spectral regimes. Training such DONN systems to determine the metasurface structures remains challenging. Heuristic methods are fast but oversimplify metasurface modulation, often resulting in physically unrealizable designs and significant performance degradation. Simulation-in-the-loop optimizes implementable metasurfaces via adjoint methods, but is computationally prohibitive and unscalable. To address these limitations, we propose SP2RINT, a spatially decoupled, progressive training framework that formulates DONN training as a PDE-constrained learning problem. Metasurface responses are first relaxed into freely trainable transfer matrices with a banded structure. We then progressively enforce physical constraints by alternating between transfer matrix training and adjoint-based inverse design, avoiding per-iteration PDE solves while ensuring final physical realizability. To further reduce runtime, we introduce a physics-inspired, spatially decoupled inverse design strategy based on the natural locality of field interactions. This approach partitions the metasurface into independently solvable patches, enabling scalable and parallel inverse design with system-level calibration. Evaluated across diverse DONN training tasks, SP2RINT achieves digital-comparable accuracy while being 1825 times faster than simulation-in-the-loop approaches. By bridging the gap between abstract DONN models and implementable photonic hardware, SP2RINT enables scalable, high-performance training of physically realizable meta-optical neural systems. Our code is available at https://github.com/ScopeX-ASU/SP2RINT

Article 373

Title@2025-05-28 (3): Defining Foundation Models for Computational Science: A Call for Clarity and Rigor

Title: Defining Foundation Models for Computational Science: A Call for Clarity and Rigor

Fundamentalmodelle für die Computerwissenschaft definieren: Ein Ruf nach Klarheit und Starrheit

界定计算科学基础模型:要求明确和严格 2505.22904v1

Authors: Youngsoo Choi, Siu Wun Cheung, Youngkyu Kim, Ping-Hsuan Tsai, Alejandro N. Diaz, Ivan Zanardi, Seung Whan Chung, Dylan Matthew Copeland, Coleman Kendrick, William Anderson, Traian Iliescu, Matthias Heinkenschloss

The widespread success of foundation models in natural language processing and computer vision has inspired researchers to extend the concept to scientific machine learning and computational science. However, this position paper argues that as the term “foundation model” is an evolving concept, its application in computational science is increasingly used without a universally accepted definition, potentially creating confusion and diluting its precise scientific meaning. In this paper, we address this gap by proposing a formal definition of foundation models in computational science, grounded in the core values of generality, reusability, and scalability. We articulate a set of essential and desirable characteristics that such models must exhibit, drawing parallels with traditional foundational methods, like the finite element and finite volume methods. Furthermore, we introduce the Data-Driven Finite Element Method (DD-FEM), a framework that fuses the modular structure of classical FEM with the representational power of data-driven learning. We demonstrate how DD-FEM addresses many of the key challenges in realizing foundation models for computational science, including scalability, adaptability, and physics consistency. By bridging traditional numerical methods with modern AI paradigms, this work provides a rigorous foundation for evaluating and developing novel approaches toward future foundation models in computational science.

Article 374

Title@2025-05-28 (3): Norm-Bounded Low-Rank Adaptation

Title: Norm-Bounded Low-Rank Adaptation

Normgebundene Low-Rank-Anpassung

适应性 2501.19050v3

Authors: Ruigang Wang, Krishnamurthy Dvijotham, Ian R. Manchester

In this work, we propose norm-bounded low-rank adaptation (NB-LoRA) for parameter-efficient fine-tuning. NB-LoRA is a novel parameterization of low-rank weight adaptations that admits explicit bounds on each singular value of the adaptation matrix, which can thereby satisfy any prescribed unitarily invariant norm bound, including the Schatten norms (e.g., nuclear, Frobenius, spectral norm). The proposed parameterization is unconstrained, smooth, and complete, i.e., it covers all matrices satisfying the prescribed rank and singular-value bounds. Comparative experiments on large language models show that NB-LoRA achieves superior adaptation performance and faster training over a range of models, tasks and ranks. Vision fine-tuning experiments show that NB-LoRA can achieve strong adaptation performance while avoiding catastrophic forgetting, and compared to existing approaches it is substantially more robust to hyperparameters such as adaptation rank, learning rate and number of training epochs.

Article 375

Title@2025-05-28 (3): On the Dynamic Regret of Following the Regularized Leader: Optimism with History Pruning

Title: On the Dynamic Regret of Following the Regularized Leader: Optimism with History Pruning

Zum dynamischen Bedauern, dem regularisierten Führer zu folgen: Optimismus mit Geschichtsveredelung

在追赶正规领导人之后的强烈遗憾:对历史的乐观态度 2505.22899v1

Authors: Naram Mhaisen, George Iosifidis

We revisit the Follow the Regularized Leader (FTRL) framework for Online Convex Optimization (OCO) over compact sets, focusing on achieving dynamic regret guarantees. Prior work has highlighted the framework’s limitations in dynamic environments due to its tendency to produce “lazy” iterates. However, building on insights showing FTRL’s ability to produce “agile” iterates, we show that it can indeed recover known dynamic regret bounds through optimistic composition of future costs and careful linearization of past costs, which can lead to pruning some of them. This new analysis of FTRL against dynamic comparators yields a principled way to interpolate between greedy and agile updates and offers several benefits, including refined control over regret terms, optimism without cyclic dependence, and the application of minimal recursive regularization akin to AdaFTRL. More broadly, we show that it is not the lazy projection style of FTRL that hinders (optimistic) dynamic regret, but the decoupling of the algorithm’s state (linearized history) from its iterates, allowing the state to grow arbitrarily. Instead, pruning synchronizes these two when necessary.

Article 376

Title@2025-05-28 (3): The Geometry of ReLU Networks through the ReLU Transition Graph

Title: The Geometry of ReLU Networks through the ReLU Transition Graph

Die Geometrie von ReLU-Netzwerken durch den ReLU-Übergangsgraphen

通过 ReLU 过渡图绘制 ReLU 网络的几何图 2505.11692v2

Authors: Sahil Rajesh Dhayalkar

We develop a novel theoretical framework for analyzing ReLU neural networks through the lens of a combinatorial object we term the ReLU Transition Graph (RTG). In this graph, each node corresponds to a linear region induced by the network’s activation patterns, and edges connect regions that differ by a single neuron flip. Building on this structure, we derive a suite of new theoretical results connecting RTG geometry to expressivity, generalization, and robustness. Our contributions include tight combinatorial bounds on RTG size and diameter, a proof of RTG connectivity, and graph-theoretic interpretations of VC-dimension. We also relate entropy and average degree of the RTG to generalization error. Each theoretical result is rigorously validated via carefully controlled experiments across varied network depths, widths, and data regimes. This work provides the first unified treatment of ReLU network structure via graph theory and opens new avenues for compression, regularization, and complexity control rooted in RTG analysis.
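
The graph construction described above can be sketched for a single hidden layer. This is a sampling-based approximation under stated assumptions: only activation patterns hit by the sample points become nodes, and the function names are hypothetical, not taken from the paper.

```python
import itertools

def activation_pattern(x, weights, biases):
    """Tuple of 0/1 flags marking which hidden ReLU units are active at input x."""
    return tuple(
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) + b > 0 else 0
        for w, b in zip(weights, biases)
    )

def build_rtg(weights, biases, samples):
    """Nodes: activation patterns seen on the samples.
    Edges: pairs of patterns that differ by exactly one neuron flip."""
    nodes = {activation_pattern(x, weights, biases) for x in samples}
    edges = {
        (p, q)
        for p, q in itertools.combinations(sorted(nodes), 2)
        if sum(a != b for a, b in zip(p, q)) == 1
    }
    return nodes, edges

# Two axis-aligned neurons split the plane into four linear regions.
weights = [[1.0, 0.0], [0.0, 1.0]]
biases = [0.0, 0.0]
samples = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
nodes, edges = build_rtg(weights, biases, samples)
print(len(nodes), len(edges))  # prints "4 4"
```

For this toy network the graph is a 4-cycle: each quadrant is one region, and neighbouring quadrants differ in one neuron's state.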

Article 377

Title@2025-05-28 (3): Neural Networks as Universal Finite-State Machines: A Constructive Deterministic Finite Automaton Theory

Title: Neural Networks as Universal Finite-State Machines: A Constructive Deterministic Finite Automaton Theory

Neurale Netzwerke als universelle Finite-State-Maschinen: Eine konstruktive Deterministische Finite-Automaten-Theorie

神经网络作为普遍有限国家机器:具有建设性决定作用的有限自定义理论 2505.11694v2

Authors: Sahil Rajesh Dhayalkar

We present a complete theoretical and empirical framework establishing feedforward neural networks as universal finite-state machines (N-FSMs). Our results prove that finite-depth ReLU and threshold networks can exactly simulate deterministic finite automata (DFAs) by unrolling state transitions into depth-wise neural layers, with formal characterizations of required depth, width, and state compression. We demonstrate that DFA transitions are linearly separable, binary threshold activations allow exponential compression, and Myhill-Nerode equivalence classes can be embedded into continuous latent spaces while preserving separability. We also formalize the expressivity boundary: fixed-depth feedforward networks cannot recognize non-regular languages requiring unbounded memory. Unlike prior heuristic or probing-based studies, we provide constructive proofs and design explicit DFA-unrolled neural architectures that empirically validate every claim. Our results bridge deep learning, automata theory, and neural-symbolic computation, offering a rigorous blueprint for how discrete symbolic processes can be realized in continuous neural systems.
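
A minimal sketch of the unrolling idea, assuming one-hot state and symbol encodings and binary threshold units. The layer layout (one AND unit per state-symbol pair, followed by an OR readout) is an illustrative construction, not necessarily the one used in the paper.

```python
def step(z):
    """Binary threshold activation."""
    return 1 if z > 0 else 0

def dfa_layer(state_onehot, symbol_onehot, delta, n_states):
    """One unrolled transition: hidden AND units, then an OR readout per next state."""
    next_pre = [0] * n_states
    for q, s_q in enumerate(state_onehot):
        for a, x_a in enumerate(symbol_onehot):
            hidden = step(s_q + x_a - 1.5)   # fires only for (state q AND symbol a)
            next_pre[delta[q][a]] += hidden
    return [step(v - 0.5) for v in next_pre]

def run_unrolled(word, delta, start, accepting):
    """Apply one threshold layer per input symbol and test acceptance."""
    n_states = len(delta)
    n_symbols = len(delta[0])
    state = [1 if q == start else 0 for q in range(n_states)]
    for a in word:
        symbol = [1 if a == b else 0 for b in range(n_symbols)]
        state = dfa_layer(state, symbol, delta, n_states)
    return any(state[q] for q in accepting)

# Parity automaton: accept words with an even number of 1s.
parity = [[0, 1], [1, 0]]   # parity[q][a] is the next state
print(run_unrolled([1, 1, 0], parity, start=0, accepting={0}))  # True
print(run_unrolled([1, 0, 0], parity, start=0, accepting={0}))  # False
```

The depth of the unrolled network equals the word length, which is why a fixed-depth network of this kind can only process inputs of bounded length.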

Article 378

Title@2025-05-28 (3): A Combinatorial Theory of Dropout: Subnetworks, Graph Geometry, and Generalization

Title: A Combinatorial Theory of Dropout: Subnetworks, Graph Geometry, and Generalization

A Combinatorial Theory of Dropout: Subnetzwerke, Graphische Geometrie und Generalisierung

辍学综合理论:子网络、图形几何和一般化 2504.14762v2

Authors: Sahil Rajesh Dhayalkar

We propose a combinatorial and graph-theoretic theory of dropout by modeling training as a random walk over a high-dimensional graph of binary subnetworks. Each node represents a masked version of the network, and dropout induces stochastic traversal across this space. We define a subnetwork contribution score that quantifies generalization and show that it varies smoothly over the graph. Using tools from spectral graph theory, PAC-Bayes analysis, and combinatorics, we prove that generalizing subnetworks form large, connected, low-resistance clusters, and that their number grows exponentially with network width. This reveals dropout as a mechanism for sampling from a robust, structured ensemble of well-generalizing subnetworks with built-in redundancy. Extensive experiments validate every theoretical claim across diverse architectures. Together, our results offer a unified foundation for understanding dropout and suggest new directions for mask-guided regularization and subnetwork optimization.

Article 379

Title@2025-05-28 (3): Smart Surrogate Losses for Contextual Stochastic Linear Optimization with Robust Constraints

Title: Smart Surrogate Losses for Contextual Stochastic Linear Optimization with Robust Constraints

Intelligente Surrogatverluste für kontextuelle stochastische Linearoptimierung mit robusten Einschränkungen

具有强力限制的内幕斯托卡式线性优化的智能代谢损失 2505.22881v1

Authors: Hyungki Im, Wyame Benslimane, Paul Grigas

We study an extension of contextual stochastic linear optimization (CSLO) that, in contrast to most of the existing literature, involves inequality constraints that depend on uncertain parameters predicted by a machine learning model. To handle the constraint uncertainty, we use contextual uncertainty sets constructed via methods like conformal prediction. Given a contextual uncertainty set method, we introduce the “Smart Predict-then-Optimize with Robust Constraints” (SPO-RC) loss, a feasibility-sensitive adaptation of the SPO loss that measures decision error of predicted objective parameters. We also introduce a convex surrogate, SPO-RC+, and prove Fisher consistency with SPO-RC. To enhance performance, we train on truncated datasets where true constraint parameters lie within the uncertainty sets, and we correct the induced sample selection bias using importance reweighting techniques. Through experiments on fractional knapsack and alloy production problem instances, we demonstrate that SPO-RC+ effectively handles uncertainty in constraints and that combining truncation with importance reweighting can further improve performance.
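
For orientation, the decision error that the plain SPO loss measures can be sketched on a fractional knapsack instance. The sketch assumes known, positive item weights and a known capacity, so it shows the ordinary SPO regret only; it does not implement the robust-constraint SPO-RC loss or the SPO-RC+ surrogate, and the function names are hypothetical.

```python
def fractional_knapsack(values, weights, capacity):
    """Greedy optimum of: max sum(v_i * w_i) s.t. sum(weights_i * w_i) <= capacity, 0 <= w_i <= 1."""
    order = sorted(range(len(values)), key=lambda i: values[i] / weights[i], reverse=True)
    w = [0.0] * len(values)
    remaining = capacity
    for i in order:
        if values[i] <= 0 or remaining <= 0:
            break
        take = min(1.0, remaining / weights[i])
        w[i] = take
        remaining -= take * weights[i]
    return w

def spo_regret(pred_values, true_values, weights, capacity):
    """True objective lost by optimizing against predicted instead of true values."""
    decide = fractional_knapsack(pred_values, weights, capacity)
    best = fractional_knapsack(true_values, weights, capacity)
    value = lambda w: sum(v * wi for v, wi in zip(true_values, w))
    return value(best) - value(decide)

# The prediction reverses the true ranking, so the worst item is chosen.
print(spo_regret([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [1.0, 1.0, 1.0], 1.0))  # 2.0
```

The regret depends on the prediction only through the decision it induces, which is what makes the loss non-convex and motivates convex surrogates.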

Article 380

Title@2025-05-28 (3): Signal attenuation enables scalable decentralized multi-agent reinforcement learning over networks

Title: Signal attenuation enables scalable decentralized multi-agent reinforcement learning over networks

Signaldämpfung ermöglicht skalierbares dezentrales Multi-Agenten-Verstärkungslernen über Netzwerke

信号减速使可伸缩的分散式多试剂强化学习超越网络 2505.11461v2

Authors: Wesley A Suttle, Vipul K Sharma, Brian M Sadler

Multi-agent reinforcement learning (MARL) methods typically require that agents enjoy global state observability, preventing development of decentralized algorithms and limiting scalability. Recent work has shown that, under assumptions on decaying inter-agent influence, global observability can be replaced by local neighborhood observability at each agent, enabling decentralization and scalability. Real-world applications enjoying such decay properties remain underexplored, however, despite the fact that signal power decay, or signal attenuation, due to path loss is an intrinsic feature of many problems in wireless communications and radar networks. In this paper, we show that signal attenuation enables decentralization in MARL by considering the illustrative special case of performing power allocation for target detection in a radar network. To achieve this, we propose two new constrained multi-agent Markov decision process formulations of this power allocation problem, derive local neighborhood approximations for global value function and policy gradient estimates and establish corresponding error bounds, and develop decentralized saddle point policy gradient algorithms for solving the proposed problems. Our approach, though oriented towards the specific radar network problem we consider, provides a useful model for extensions to additional problems in wireless communications and radar networks.

Article 381

Title@2025-05-28 (3): CFP-Gen: Combinatorial Functional Protein Generation via Diffusion Language Models

Title: CFP-Gen: Combinatorial Functional Protein Generation via Diffusion Language Models

CFP-Gen: Kombinatorische funktionelle Proteinerzeugung über Diffusions-Sprachenmodelle

CFP-Gen:通过传播语言模式生成混合功能性蛋白质 2505.22869v1

Authors: Junbo Yin, Chao Zha, Wenjia He, Chencheng Xu, Xin Gao

Existing protein language models (PLMs) generate protein sequences based on a single-condition constraint from a specific modality, struggling to simultaneously satisfy multiple constraints across different modalities. In this work, we introduce CFP-Gen, a novel diffusion language model for Combinatorial Functional Protein GENeration. CFP-Gen facilitates de novo protein design by integrating multimodal conditions with functional, sequence, and structural constraints. Specifically, an Annotation-Guided Feature Modulation (AGFM) module is introduced to dynamically adjust the protein feature distribution based on composable functional annotations, e.g., GO terms, IPR domains and EC numbers. Meanwhile, the Residue-Controlled Functional Encoding (RCFE) module captures residue-wise interaction to ensure more precise control. Additionally, off-the-shelf 3D structure encoders can be seamlessly integrated to impose geometric constraints. We demonstrate that CFP-Gen enables high-throughput generation of novel proteins with functionality comparable to natural proteins, while achieving a high success rate in designing multifunctional proteins. Code and data are available at https://github.com/yinjunbo/cfpgen.

Article 382

Title@2025-05-28 (3): Multimodal Survival Modeling in the Age of Foundation Models

Title: Multimodal Survival Modeling in the Age of Foundation Models

Multimodale Überlebensmodellierung im Zeitalter der Gründungsmodelle

基金会时代多模式生存模型 2505.07683v2

Authors: Steven Song, Morgan Borjigin-Wang, Irene Madejski, Robert L. Grossman

The Cancer Genome Atlas (TCGA) has enabled novel discoveries and served as a large-scale reference through its harmonized genomics, clinical, and image data. Prior studies have trained bespoke cancer survival prediction models from unimodal or multimodal TCGA data. A modern paradigm in biomedical deep learning is the development of foundation models (FMs) to derive meaningful feature embeddings, agnostic to a specific modeling task. Biomedical text especially has seen growing development of FMs. While TCGA contains free-text data as pathology reports, these have been historically underutilized. Here, we investigate the feasibility of training classical, multimodal survival models over zero-shot embeddings extracted by FMs. We show the ease and additive effect of multimodal fusion, outperforming unimodal models. We demonstrate the benefit of including pathology report text and rigorously evaluate the effect of model-based text summarization and hallucination. Overall, we modernize survival modeling by leveraging FMs and information extraction from pathology reports.

Article 383

Title@2025-05-28 (3): CrossNAS: A Cross-Layer Neural Architecture Search Framework for PIM Systems

Title: CrossNAS: A Cross-Layer Neural Architecture Search Framework for PIM Systems

CrossNAS: Ein Cross-Layer Neural Architecture Search Framework für PIM-Systeme

CrossNAS:PIM系统跨行业神经结构搜索框架 2505.22868v1

Authors: Md Hasibul Amin, Mohammadreza Mohammadi, Jason D. Bakos, Ramtin Zand

In this paper, we propose the CrossNAS framework, an automated approach for exploring a vast, multidimensional search space that spans various design abstraction layers (circuits, architecture, and systems) to optimize the deployment of machine learning workloads on analog processing-in-memory (PIM) systems. CrossNAS combines the single-path one-shot weight-sharing strategy with evolutionary search, for the first time in the context of PIM system mapping and optimization. CrossNAS sets a new benchmark for PIM neural architecture search (NAS), outperforming previous methods in both accuracy and energy efficiency while maintaining comparable or shorter search times.

Article 384

Title@2025-05-28 (3): Scaling Offline RL via Efficient and Expressive Shortcut Models

Title: Scaling Offline RL via Efficient and Expressive Shortcut Models

Skalierung von Offline-RL über effiziente und Expressive Shortcut-Modelle

通过高效和直表达快捷键模式缩放离线 RL 2505.22866v1

Authors: Nicolas Espinosa-Dice, Yiyi Zhang, Yiding Chen, Bradley Guo, Owen Oertell, Gokul Swamy, Kiante Brantley, Wen Sun

Diffusion and flow models have emerged as powerful generative approaches capable of modeling diverse and multimodal behavior. However, applying these models to offline reinforcement learning (RL) remains challenging due to the iterative nature of their noise sampling processes, making policy optimization difficult. In this paper, we introduce Scalable Offline Reinforcement Learning (SORL), a new offline RL algorithm that leverages shortcut models - a novel class of generative models - to scale both training and inference. SORL’s policy can capture complex data distributions and can be trained simply and efficiently in a one-stage training procedure. At test time, SORL introduces both sequential and parallel inference scaling by using the learned Q-function as a verifier. We demonstrate that SORL achieves strong performance across a range of offline RL tasks and exhibits positive scaling behavior with increased test-time compute. We release the code at nico-espinosadice.github.io/projects/sorl.
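
Using a learned Q-function as a verifier can be sketched as best-of-N selection: draw several candidate actions from the policy and keep the one the Q-function scores highest. Best-of-N is an assumed reading of the parallel scaling mentioned above, and the toy policy and Q-function are placeholders, not SORL's models.

```python
import random

def best_of_n(policy_sample, q_function, state, n, rng):
    """Draw n candidate actions and return the one with the highest Q-value."""
    candidates = [policy_sample(state, rng) for _ in range(n)]
    return max(candidates, key=lambda a: q_function(state, a))

def toy_policy(state, rng):
    """Placeholder policy: a uniform action in [-1, 1]."""
    return rng.uniform(-1.0, 1.0)

def toy_q(state, action):
    """Placeholder verifier: prefers actions close to 0.3."""
    return -(action - 0.3) ** 2

single = best_of_n(toy_policy, toy_q, None, 1, random.Random(0))
many = best_of_n(toy_policy, toy_q, None, 64, random.Random(0))
print(toy_q(None, single), toy_q(None, many))
```

With the same seed, the single sample is also the first of the 64 candidates, so the selected Q-value can only stay equal or improve as the sample budget grows.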

Article 385

Title@2025-05-28 (3): Your Data, My Model: Learning Who Really Helps in Federated Learning

Title: Your Data, My Model: Learning Who Really Helps in Federated Learning

Ihre Daten, mein Modell: Lernen, die wirklich hilft beim Federated Learning

您的数据, 我的模型: 学习谁真正帮助联邦学习 2409.02064v3

Authors: Shamsiiat Abdurakhmanova, Amirhossein Mohammadi, Yasmin SarcheshmehPour, Alexander Jung

Many important machine learning applications involve networks of devices, such as wearables or smartphones, that generate local data and train personalized models. A key challenge is determining which peers are most beneficial for collaboration. We propose a simple and privacy-preserving method to select relevant collaborators by evaluating how much a model improves after a single gradient step using another device's data, without sharing raw data. This method naturally extends to non-parametric models by replacing the gradient step with a non-parametric generalization. Our approach enables model-agnostic, data-driven peer selection for personalized federated learning (PersFL).
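
The selection criterion can be sketched for a one-parameter linear model with squared loss. The sketch assumes the gradient is computed where the peer's data lives and only the updated parameter is evaluated locally; the names, learning rate and data are illustrative.

```python
def mse(w, data):
    """Mean squared error of the model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def grad_mse(w, data):
    """Gradient of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

def peer_gain(w, local_data, peer_data, lr=0.1):
    """Reduction in local loss after one gradient step on the peer's data."""
    w_new = w - lr * grad_mse(w, peer_data)
    return mse(w, local_data) - mse(w_new, local_data)

local = [(1.0, 2.0), (2.0, 4.0)]       # local task: y = 2x
peer_a = [(1.0, 2.0), (2.0, 4.0)]      # same relation as the local task
peer_b = [(1.0, -2.0), (2.0, -4.0)]    # opposite relation
print(peer_gain(0.0, local, peer_a), peer_gain(0.0, local, peer_b))  # 7.5 -12.5
```

A positive gain marks a useful collaborator and a negative gain a harmful one, so peers can be ranked by this single number.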

Article 386

Title@2025-05-28 (3): Causal-PIK: Causality-based Physical Reasoning with a Physics-Informed Kernel

Title: Causal-PIK: Causality-based Physical Reasoning with a Physics-Informed Kernel

Causal-PIK: Kausalitätsbasierte Physical Reasoning mit einem physikinformierten Kernel

Authors: Carlota Parés-Morlans, Michelle Yi, Claire Chen, Sarah A. Wu, Rika Antonova, Tobias Gerstenberg, Jeannette Bohg

Tasks that involve complex interactions between objects with unknown dynamics make planning before execution difficult. These tasks require agents to iteratively improve their actions after actively exploring causes and effects in the environment. For these types of tasks, we propose Causal-PIK, a method that leverages Bayesian optimization to reason about causal interactions via a Physics-Informed Kernel to help guide efficient search for the best next action. Experimental results on Virtual Tools and PHYRE physical reasoning benchmarks show that Causal-PIK outperforms state-of-the-art results, requiring fewer actions to reach the goal. We also compare Causal-PIK to human studies, including results from a new user study we conducted on the PHYRE benchmark. We find that Causal-PIK remains competitive on tasks that are very challenging, even for human problem-solvers.

Article 387

Title@2025-05-28 (3): Permissioned LLMs: Enforcing Access Control in Large Language Models

Title: Permissioned LLMs: Enforcing Access Control in Large Language Models

Zugelassene LLMs: Erzwingen der Zugriffskontrolle in großen Sprachmodellen

获得许可的LLMM:在大语言模型中实施访问控制 2505.22860v1

Authors: Bargav Jayaraman, Virendra J. Marathe, Hamid Mozaffari, William F. Shen, Krishnaram Kenthapadi

In enterprise settings, organizational data is segregated, siloed and carefully protected by elaborate access control frameworks. These access control structures can completely break down if an LLM fine-tuned on the siloed data serves requests, for downstream tasks, from individuals with disparate access privileges. We propose Permissioned LLMs (PermLLM), a new class of LLMs that superimpose the organizational data access control structures on query responses they generate. We formalize abstractions underpinning the means to determine whether access control enforcement happens correctly over LLM query responses. Our formalism introduces the notion of a relevant response that can be used to prove whether a PermLLM mechanism has been implemented correctly. We also introduce a novel metric, called access advantage, to empirically evaluate the efficacy of a PermLLM mechanism. We introduce three novel PermLLM mechanisms that build on Parameter Efficient Fine-Tuning to achieve the desired access control. We furthermore present two instantiations of access advantage: (i) Domain Distinguishability Index (DDI) based on Membership Inference Attacks, and (ii) Utility Gap Index (UGI) based on LLM utility evaluation. We demonstrate the efficacy of our PermLLM mechanisms through extensive experiments on four public datasets (GPQA, RCV1, SimpleQA, and WMDP), in addition to evaluating the validity of DDI and UGI metrics themselves for quantifying access control in LLMs.

Article 388

Title@2025-05-28 (3): NGPU-LM: GPU-Accelerated N-Gram Language Model for Context-Biasing in Greedy ASR Decoding

Title: NGPU-LM: GPU-Accelerated N-Gram Language Model for Context-Biasing in Greedy ASR Decoding

NGPU-LM: GPU-beschleunigtes N-Gram-Sprachenmodell für Kontext-Biasing in Greedy ASR-Dekodierung

NGPU-LM: 加速GPU-加速型N-Gram语语模式,用于在贪婪ASR标记中进行背景切换 2505.22857v1

Authors: Vladimir Bataev, Andrei Andrusenko, Lilit Grigoryan, Aleksandr Laptev, Vitaly Lavrukhin, Boris Ginsburg

Statistical n-gram language models are widely used for context-biasing tasks in Automatic Speech Recognition (ASR). However, existing implementations lack computational efficiency due to poor parallelization, making context-biasing less appealing for industrial use. This work rethinks data structures for statistical n-gram language models to enable fast and parallel operations for GPU-optimized inference. Our approach, named NGPU-LM, introduces customizable greedy decoding for all major ASR model types - including transducers, attention encoder-decoder models, and CTC - with less than 7% computational overhead. The proposed approach can eliminate more than 50% of the accuracy gap between greedy and beam search for out-of-domain scenarios while avoiding significant slowdown caused by beam search. The implementation of the proposed NGPU-LM is open-sourced.
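
How an n-gram model can bias a greedy decoding step is sketched below as shallow fusion: the LM log-probability is added to the ASR log-probability with a weight `alpha`. The fusion form, the weight, the fallback probability and the toy scores are assumptions for illustration; the sketch says nothing about NGPU-LM's GPU data structures.

```python
import math

def greedy_step(asr_logprobs, lm_logprobs, alpha, floor=1e-6):
    """Return the token maximizing ASR log-prob plus alpha times LM log-prob.

    Tokens missing from the LM table fall back to log(floor)."""
    def fused(token):
        return asr_logprobs[token] + alpha * lm_logprobs.get(token, math.log(floor))
    return max(asr_logprobs, key=fused)

# Acoustic scores slightly prefer "their"; the context LM strongly prefers "there".
asr = {"their": math.log(0.50), "there": math.log(0.45), "they're": math.log(0.05)}
lm = {"their": math.log(0.10), "there": math.log(0.80), "they're": math.log(0.10)}
print(greedy_step(asr, lm, alpha=0.0))  # their
print(greedy_step(asr, lm, alpha=0.5))  # there
```

Because the fused score is still maximized one token at a time, the decoder stays greedy and avoids maintaining multiple beam hypotheses.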

Article 389

Title: Leveraging Unlabeled Data Sharing through Kernel Function Approximation in Offline Reinforcement Learning

Nutzung von nicht gekennzeichneten Daten durch Kernel-Funktion Annäherung im Offline-Verstärkungs-Lernen

在离线强化学习中,通过 Kernel 函数相近接近的内核功能利用未贴标签的数据分享来利用无标签数据分享 2408.12307v3

Authors: Yen-Ru Lai, Fu-Chieh Chang, Pei-Yuan Wu

Offline reinforcement learning (RL) learns policies from a fixed dataset, but often requires large amounts of data. The challenge arises when labeled datasets are expensive, especially when rewards have to be provided by human labelers for large datasets. In contrast, unlabeled data tends to be less expensive. This situation highlights the importance of finding effective ways to use unlabeled data in offline RL, especially when labeled data is limited or expensive to obtain. In this paper, we present an algorithm that utilizes unlabeled data in offline RL with kernel function approximation, and we give theoretical guarantees for it. We present various eigenvalue decay conditions of $\mathcal{H}_k$ which determine the complexity of the algorithm. In summary, our work provides a promising approach for exploiting the advantages offered by unlabeled data in offline RL, whilst maintaining theoretical assurances.

Article 390

Title@2025-05-28 (3): Point Cloud Synthesis Using Inner Product Transforms

Title: Point Cloud Synthesis Using Inner Product Transforms

Punkt-Cloud-Synthese mit inneren Produkt-Transformationen

使用内产产品变换的点云合成 2410.18987v3

Authors: Ernst Röell, Bastian Rieck

Point-cloud synthesis, i.e., the generation of novel point clouds from an input distribution, remains a challenging task, for which numerous complex machine-learning models have been devised. We develop a novel method that encodes geometrical-topological characteristics of point clouds using inner products, leading to a highly efficient point cloud representation with provable expressivity properties. Integrated into deep learning models, our encoding exhibits high quality in typical tasks like reconstruction, generation, and interpolation, with inference times orders of magnitude faster than existing methods.

Article 391

Title@2025-05-28 (3): RocqStar: Leveraging Similarity-driven Retrieval and Agentic Systems for Rocq generation

Title: RocqStar: Leveraging Similarity-driven Retrieval and Agentic Systems for Rocq generation

RocqStar: Leveraging-ähnliche Retrieval- und Agentiksysteme für die Rocq-Generation

RocqStar:利用利用相似度驱动回收系统和干系统来生成Rocq 2505.22846v1

Authors: Nikita Khramov, Andrei Kozyrev, Gleb Solovev, Anton Podkopaev

Interactive Theorem Proving has repeatedly been shown to be fruitful when combined with Generative Artificial Intelligence. This paper assesses multiple approaches to Rocq generation and illuminates potential avenues for improvement. We highlight the importance of thorough premise selection for generating Rocq proofs and propose a novel approach, leveraging retrieval via a self-attentive embedder model. The evaluation of the designed approach shows up to a 28% relative increase in the generator's performance. We tackle the problem of writing Rocq proofs using a multi-stage agentic system, tailored for formal verification, and demonstrate its high effectiveness. We conduct an ablation study and show the use of multi-agent debate at the planning stage of proof synthesis.

Article 392

Title@2025-05-28 (3): Entropy-regularized Gradient Estimators for Approximate Bayesian Inference

Title: Entropy-regularized Gradient Estimators for Approximate Bayesian Inference

Entropie-regularisierte Gradienten-Estimatoren für ungefähre Bayesische Schlussfolgerung

用于近近贝耶斯推断的全天正规化梯度测算器 2503.11964v3

Authors: Jasmeet Kaur

Effective uncertainty quantification is important for training modern predictive models with limited data, enhancing both accuracy and robustness. While Bayesian methods are effective for this purpose, they can be challenging to scale. When employing approximate Bayesian inference, ensuring the quality of samples from the posterior distribution in a computationally efficient manner is essential. This paper addresses the estimation of the Bayesian posterior to generate diverse samples by approximating the gradient flow of the Kullback-Leibler (KL) divergence and the cross entropy of the target approximation under the metric induced by the Stein Operator. It presents empirical evaluations on classification tasks to assess the method's performance and discusses its effectiveness for Model-Based Reinforcement Learning that uses uncertainty-aware network dynamics models.

Article 393

Title@2025-05-28 (3): Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion

Title: Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion

Jenseits der Permutationssymmetrie der Transformer: Die Rolle der Rotation für die Modellfusion

变异器超越变异对称:变动对模型融合的作用 2502.00264v2

Authors: Binchi Zhang, Zaiyi Zheng, Zhengzhang Chen, Jundong Li

Symmetry in the parameter space of deep neural networks (DNNs) has proven beneficial for various deep learning applications. A well-known example is the permutation symmetry in Multi-Layer Perceptrons (MLPs), where permuting the rows of weight matrices in one layer and applying the inverse permutation to adjacent layers yields a functionally equivalent model. While permutation symmetry fully characterizes the equivalence set for MLPs, its discrete nature limits its utility for transformers. In this paper, we introduce rotation symmetry, a novel form of parameter space symmetry for transformers that generalizes permutation symmetry by rotating parameter matrices in self-attention layers. Unlike permutation symmetry, rotation symmetry operates in a continuous domain, thereby significantly expanding the equivalence set for transformers. Based on this property, we propose a theoretically optimal parameter matching algorithm as a plug-and-play module to enhance model fusion. We evaluate our approach using pre-trained transformers across diverse natural language and vision tasks. Experimental results demonstrate that our rotation symmetry-based matching algorithm substantially improves model fusion, highlighting the potential of parameter space symmetry to facilitate model fusion. Our code is available on https://github.com/zhengzaiyi/RotationSymmetry.
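
The invariance behind rotation symmetry can be checked numerically for the query and key projections: multiplying both by the same orthogonal matrix R leaves the attention logits unchanged, because R times its transpose is the identity. The sketch uses a 2-dimensional head and hypothetical weight values; it covers only the query-key pair and is not the paper's matching algorithm.

```python
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def attention_scores(x, w_q, w_k):
    """Unnormalized attention logits (X W_Q)(X W_K)^T."""
    return matmul(matmul(x, w_q), transpose(matmul(x, w_k)))

def rotate(w, theta):
    """Right-multiply a (d x 2) matrix by a 2 x 2 rotation matrix."""
    r = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta), math.cos(theta)]]
    return matmul(w, r)

x = [[1.0, 2.0], [0.5, -1.0]]          # two tokens, model dimension 2
w_q = [[0.3, -0.2], [0.1, 0.4]]
w_k = [[-0.5, 0.2], [0.7, 0.1]]
before = attention_scores(x, w_q, w_k)
after = attention_scores(x, rotate(w_q, 0.7), rotate(w_k, 0.7))
print(all(abs(b - a) < 1e-9 for rb, ra in zip(before, after) for b, a in zip(rb, ra)))  # True
```

Any rotation angle works, which is the sense in which the symmetry is continuous rather than discrete like a permutation.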

Article 394

Title@2025-05-28 (3): Bayesian Attention Mechanism: A Probabilistic Framework for Positional Encoding and Context Length Extrapolation

Title: Bayesian Attention Mechanism: A Probabilistic Framework for Positional Encoding and Context Length Extrapolation

Bayesian Attention Mechanism: Ein probabilistisches Framework für die Positionskodierung und Kontextlängen-Extrapolation

Bayesian注意机制:定位编码和背景长度外推概率框架 2505.22842v1

Authors: Arthur S. Bianchessi, Rodrigo C. Barros, Lucas S. Kupssinskü

Transformer-based language models rely on positional encoding (PE) to handle token order and support context length extrapolation. However, existing PE methods lack theoretical clarity and rely on limited evaluation metrics to substantiate their extrapolation claims. We propose the Bayesian Attention Mechanism (BAM), a theoretical framework that formulates positional encoding as a prior within a probabilistic model. BAM unifies existing methods (e.g., NoPE and ALiBi) and motivates a new Generalized Gaussian positional prior that substantially improves long-context generalization. Empirically, BAM enables accurate information retrieval at $500\times$ the training context length, outperforming previous state-of-the-art context length generalization in long context retrieval accuracy while maintaining comparable perplexity and introducing minimal additional parameters.
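
Reading positional encoding as a prior can be sketched by adding a log-prior over key positions to the attention logits before the softmax. The ALiBi bias is its standard linear distance penalty; treating it as a log-prior, and the function names, are illustrative assumptions. The Generalized Gaussian prior proposed in the paper is not reproduced here.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def attention_with_prior(scores, log_prior):
    """Attention weights proportional to exp(score) times the positional prior."""
    return softmax([s + lp for s, lp in zip(scores, log_prior)])

def nope_log_prior(n_keys):
    """No positional encoding: a flat prior over key positions."""
    return [0.0] * n_keys

def alibi_log_prior(query_pos, n_keys, slope):
    """ALiBi-style linear penalty on the distance between query and key."""
    return [-slope * (query_pos - j) for j in range(n_keys)]

scores = [0.0, 0.0, 0.0]   # content gives no preference among three keys
print(attention_with_prior(scores, nope_log_prior(3)))
print(attention_with_prior(scores, alibi_log_prior(2, 3, slope=1.0)))
```

With flat content scores, the flat prior spreads attention uniformly while the linear penalty concentrates it on the nearest keys, so the prior alone determines how attention decays with distance.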

Article 395

Title@2025-05-28 (3): Kernel-Smoothed Scores for Denoising Diffusion: A Bias-Variance Study

Title: Kernel-Smoothed Scores for Denoising Diffusion: A Bias-Variance Study

Kernelgeglättete Punktzahlen für die Denoisierung der Diffusion: Eine Bias-Varianz-Studie

Disoising 扩散的内核悬浮分数:生物量变化研究 2505.22841v1

Authors: Franck Gabriel, François Ged, Maria Han Veiga, Emmanuel Schertzer

Diffusion models now set the benchmark in high-fidelity generative sampling, yet they can, in principle, be prone to memorization. In this case, their learned score overfits the finite dataset so that the reverse-time SDE samples are mostly training points. In this paper, we interpret the empirical score as a noisy version of the true score and show that its covariance matrix is asymptotically a re-weighted data PCA. In high dimensions, the small-time limit makes the noise variance blow up while simultaneously reducing spatial correlation. To reduce this variance, we introduce a kernel-smoothed empirical score and analyze its bias-variance trade-off. We derive asymptotic bounds on the Kullback-Leibler divergence between the true distribution and the one generated by the modified reverse SDE. Regularization on the score has the same effect as increasing the size of the training dataset, and thus helps prevent memorization. A spectral decomposition of the forward diffusion suggests better variance control under some regularity conditions of the true data distribution. Reverse diffusion with kernel-smoothed empirical score can be reformulated as a gradient descent drifted toward a Log-Exponential Double-Kernel Density Estimator (LED-KDE). This perspective highlights two regularization mechanisms taking place in denoising diffusions: an initial Gaussian kernel first diffuses mass isotropically in the ambient space, while a second kernel applied in score space concentrates and spreads that mass along the data manifold. Hence, even a straightforward regularization, without any learning, already mitigates memorization and enhances generalization. Numerically, we illustrate our results with several experiments on synthetic and MNIST datasets.
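
The memorization effect of the empirical score can be seen in a one-dimensional sketch. It assumes a variance-exploding forward process, so the noised empirical distribution is an equal-weight Gaussian mixture centred on the training points; the kernel-smoothed score studied in the paper is not implemented here.

```python
import math

def empirical_score(x, data, sigma):
    """Score (gradient of the log-density) at x of the mixture (1/n) * sum_i N(x_i, sigma^2)."""
    logits = [-(x - xi) ** 2 / (2 * sigma ** 2) for xi in data]
    m = max(logits)
    w = [math.exp(l - m) for l in logits]   # softmax weights over training points
    total = sum(w)
    return sum(wi / total * (xi - x) / sigma ** 2 for wi, xi in zip(w, data))

data = [-1.0, 1.0]
print(empirical_score(0.0, data, sigma=0.1))   # 0.0 by symmetry
print(empirical_score(0.9, data, sigma=0.1))   # close to (1.0 - 0.9) / 0.1**2 = 10
```

At small noise levels the softmax weights collapse onto the nearest training point, so the score pulls every sample toward a stored example. This is the overfitting behaviour that smoothing the score is meant to counteract.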

Article 396

Title@2025-05-28 (3): Development and Validation of SXI++ LNM Algorithm for Sepsis Prediction

Title: Development and Validation of SXI++ LNM Algorithm for Sepsis Prediction

Entwicklung und Validierung von SXI++ LNM-Algorithmus für Sepsis-Vorhersage

SXI+++ LNM 测距算法的制定和校验 2505.22840v1

Authors: Dharambir Mahto, Prashant Yadav, Mahesh Banavar, Jim Keany, Alan T Joseph, Srinivas Kilambi

Sepsis is a life-threatening condition affecting over 48.9 million people globally and causing 11 million deaths annually. Despite medical advancements, predicting sepsis remains a challenge due to non-specific symptoms and complex pathophysiology. The SXI++ LNM is a machine learning scoring system that refines sepsis prediction by leveraging multiple algorithms and deep neural networks. This study aims to improve robustness in clinical applications and evaluates the predictive performance of the SXI++ LNM for sepsis prediction. The model, utilizing a deep neural network, was trained and tested using multiple scenarios with different dataset distributions. The model's performance was assessed against unseen test data, and accuracy, precision, and area under the curve (AUC) were calculated. The SXI++ LNM outperformed the state of the art in three use cases, achieving an AUC of 0.99 (95% CI: 0.98-1.00). The model demonstrated a precision of 99.9% (95% CI: 99.8-100.0) and an accuracy of 99.99% (95% CI: 99.98-100.0), maintaining high reliability.

Article 397

Title@2025-05-28 (3): How Do Diffusion Models Improve Adversarial Robustness?

Title: How Do Diffusion Models Improve Adversarial Robustness?

Wie verbessern Diffusionsmodelle die widrige Robustheit?

传播模型如何改善反逆能力? 2505.22839v1

Authors: Liu Yuezhang, Xue-Xin Wei

Recent findings suggest that diffusion models significantly enhance empirical adversarial robustness. While some intuitive explanations have been proposed, the precise mechanisms underlying these improvements remain unclear. In this work, we systematically investigate how and how well diffusion models improve adversarial robustness. First, we observe that diffusion models intriguingly increase, rather than decrease, the $\ell_p$ distance to clean samples, challenging the intuition that purification denoises inputs closer to the original data. Second, we find that the purified images are heavily influenced by the internal randomness of diffusion models, where a compression effect arises within each randomness configuration. Motivated by this observation, we evaluate robustness under fixed randomness and find that the improvement drops to approximately 24% on CIFAR-10, substantially lower than prior reports approaching 70%. Importantly, we show that this remaining robustness gain strongly correlates with the model's ability to compress the input space, revealing the compression rate as a reliable robustness indicator without requiring gradient-based analysis. Our findings provide novel insights into the mechanisms underlying diffusion-based purification, and offer guidance for developing more effective and principled adversarial purification systems.

Article 398

Title@2025-05-28 (3): Bridging Distribution Shift and AI Safety: Conceptual and Methodological Synergies

Title: Bridging Distribution Shift and AI Safety: Conceptual and Methodological Synergies

Bridging Distribution Shift und KI-Sicherheit: Konzeptionelle und methodische Synergien

搭桥分配转变与AI安全:概念与方法的协同作用 2505.22829v1

Authors: Chenruo Liu, Kenan Tang, Yao Qin, Qi Lei

This paper bridges distribution shift and AI safety through a comprehensive analysis of their conceptual and methodological synergies. While prior discussions often focus on narrow cases or informal analogies, we establish two types of connections between specific causes of distribution shift and fine-grained AI safety issues: (1) methods addressing a specific shift type can help achieve corresponding safety goals, or (2) certain shifts and safety issues can be formally reduced to each other, enabling mutual adaptation of their methods. Our findings provide a unified perspective that encourages fundamental integration between distribution shift and AI safety research.

Article 399

Title@2025-05-28 (3): PGLearn – An Open-Source Learning Toolkit for Optimal Power Flow

Title: PGLearn – An Open-Source Learning Toolkit for Optimal Power Flow

PGLearn – Ein Open-Source-Learning-Toolkit für optimalen Stromfluss

PGLearn – – 最佳电力流动开放源学习工具包 2505.22825v1

Authors: Michael Klamkin, Mathieu Tanneau, Pascal Van Hentenryck

Machine Learning (ML) techniques for Optimal Power Flow (OPF) problems have recently garnered significant attention, reflecting a broader trend of leveraging ML to approximate and/or accelerate the resolution of complex optimization problems. These developments are necessitated by the increased volatility and scale in energy production for modern and future grids. However, progress in ML for OPF is hindered by the lack of standardized datasets and evaluation metrics, from generating and solving OPF instances, to training and benchmarking machine learning models. To address this challenge, this paper introduces PGLearn, a comprehensive suite of standardized datasets and evaluation tools for ML and OPF. PGLearn provides datasets that are representative of real-life operating conditions, by explicitly capturing both global and local variability in the data generation, and by, for the first time, including time series data for several large-scale systems. In addition, it supports multiple OPF formulations, including AC, DC, and second-order cone formulations. Standardized datasets are made publicly available to democratize access to this field, reduce the burden of data generation, and enable the fair comparison of various methodologies. PGLearn also includes a robust toolkit for training, evaluating, and benchmarking machine learning models for OPF, with the goal of standardizing performance evaluation across the field. By promoting open, standardized datasets and evaluation metrics, PGLearn aims at democratizing and accelerating research and innovation in machine learning applications for optimal power flow problems. Datasets are available for download at https://www.huggingface.co/PGLearn.

Article 400

Title: Comparing Human and AI Rater Effects Using the Many-Facet Rasch Model

Vergleich menschlicher und KI-Rater-Effekte mit dem Multi-Facet-Rasch-Modell

使用多面 Rasch 模型比较人类和AI Rater效应 2505.18486v2

Authors: Hong Jiao, Dan Song, Won-Chan Lee

Large language models (LLMs) have been widely explored for automated scoring in low-stakes assessment to facilitate learning and instruction. Empirical evidence on which LLM produces the most reliable scores and induces the fewest rater effects needs to be collected before LLMs are used for automated scoring in practice. This study compared ten LLMs (ChatGPT 3.5, ChatGPT 4, ChatGPT 4o, OpenAI o1, Claude 3.5 Sonnet, Gemini 1.5, Gemini 1.5 Pro, Gemini 2.0, DeepSeek V3, and DeepSeek R1) with human expert raters in scoring two types of writing tasks. The accuracy of the holistic and analytic scores from LLMs compared with human raters was evaluated in terms of Quadratic Weighted Kappa. Intra-rater consistency across prompts was compared in terms of Cronbach's alpha. Rater effects of LLMs were evaluated and compared with human raters using the Many-Facet Rasch model. The results in general supported the use of ChatGPT 4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet, which showed high scoring accuracy, better rater reliability, and fewer rater effects.
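
The agreement metric named in this abstract, Quadratic Weighted Kappa, can be sketched in a few lines. This is a minimal illustration of the standard definition, not the authors' evaluation code; the function name and the integer score range arguments are assumptions made for the example.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Quadratic Weighted Kappa between two equally long lists of integer scores."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and equally long")
    num_levels = max_score - min_score + 1
    if num_levels == 1:
        return 1.0  # only one possible score, so agreement is trivially perfect
    n = len(rater_a)

    # Observed joint counts and the per-rater marginal histograms.
    observed = Counter(zip(rater_a, rater_b))
    hist_a = Counter(rater_a)
    hist_b = Counter(rater_b)

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = ((i - j) ** 2) / ((num_levels - 1) ** 2)
            numerator += weight * observed[(i, j)]
            denominator += weight * hist_a[i] * hist_b[j] / n
    if denominator == 0.0:
        return 1.0  # both raters gave the same constant score
    return 1.0 - numerator / denominator
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. The quadratic weights penalize a two-point gap four times as heavily as a one-point gap.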

Article 401

Title@2025-05-28 (3): Hybrid Disagreement-Diversity Active Learning for Bioacoustic Sound Event Detection

Title: Hybrid Disagreement-Diversity Active Learning for Bioacoustic Sound Event Detection

Hybride Disagreement-Diversity Aktives Lernen für die bioakustische Sound-Erkennung

生物声波声音事件探测发现活动积极学习 2505.20956v2

Authors: Shiqi Zhang, Tuomas Virtanen

Bioacoustic sound event detection (BioSED) is crucial for biodiversity conservation but faces practical challenges during model development and training: limited amounts of annotated data, sparse events, species diversity, and class imbalance. To address these challenges efficiently with a limited labeling budget, we apply the mismatch-first farthest-traversal (MFFT), an active learning method integrating committee voting disagreement and diversity analysis. We also refine an existing BioSED dataset specifically for evaluating active learning algorithms. Experimental results demonstrate that MFFT achieves a mAP of 68% when cold-starting and 71% when warm-starting (which is close to the fully-supervised mAP of 75%) while using only 2.3% of the annotations. Notably, MFFT excels in cold-start scenarios and with rare species, which are critical for monitoring endangered species, demonstrating its practical value.
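
The selection idea described above (prioritize samples on which a committee disagrees, then spread the picks out in feature space) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' MFFT implementation: the committee votes, the Euclidean feature distance, and both function names are hypothetical.

```python
import math


def farthest_first(candidates, already_chosen, features, budget):
    """Greedily pick up to `budget` candidates, each farthest from the chosen set."""
    chosen = list(already_chosen)
    remaining = list(candidates)
    picked = []
    while remaining and len(picked) < budget:
        def distance_to_chosen(idx):
            if not chosen:
                return math.inf
            return min(math.dist(features[idx], features[c]) for c in chosen)

        best = max(remaining, key=distance_to_chosen)
        picked.append(best)
        chosen.append(best)
        remaining.remove(best)
    return picked


def mismatch_first_selection(features, committee_votes, labeled, budget):
    """Select samples to annotate: disagreed-on samples first, then the rest."""
    unlabeled = [i for i in range(len(features)) if i not in labeled]
    mismatched = [i for i in unlabeled if len(set(committee_votes[i])) > 1]
    agreed = [i for i in unlabeled if len(set(committee_votes[i])) == 1]

    picked = farthest_first(mismatched, labeled, features, budget)
    if len(picked) < budget:
        # Budget left over: fall back to diversity sampling on the agreed samples.
        picked += farthest_first(
            agreed, list(labeled) + picked, features, budget - len(picked)
        )
    return picked
```

Disagreement decides which pool a sample belongs to, and farthest-first traversal decides the order within a pool, so near-duplicate clips are unlikely to consume the labeling budget.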

Article 402

Title@2025-05-28 (3): Scalable Differentially Private Bayesian Optimization

Title: Scalable Differentially Private Bayesian Optimization

Skalierbare differenzierte private Bayesian-Optimierung

Bayesian优化化 2502.06044v2

Authors: Getoar Sopa, Juraj Marusic, Marco Avella-Medina, John P. Cunningham

In recent years, there has been much work on scaling Bayesian Optimization to high-dimensional problems, for example hyperparameter tuning in large machine learning models. These scalable methods have been successful, finding high objective values much more quickly than traditional global Bayesian Optimization or random search-based methods. At the same time, these large models often use sensitive data, but preservation of Differential Privacy has not scaled alongside these modern Bayesian Optimization procedures. Here we develop a method to privately optimize potentially high-dimensional parameter spaces using privatized Gradient Informative Bayesian Optimization. Our theoretical results show that under suitable conditions, our method converges exponentially fast to a locally optimal parameter configuration, up to a natural privacy error. Moreover, regardless of whether the assumptions are satisfied, we prove that our algorithm maintains privacy, and we empirically show that it outperforms existing methods in the high-dimensional hyperparameter setting.
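
To make the phrase "privatized gradient" concrete, the sketch below shows the generic clip-then-add-Gaussian-noise step commonly used to privatize a gradient estimate. It is an assumption that a step of this kind resembles what the paper does; the abstract does not specify the mechanism, and the function and parameter names are hypothetical. Calibrating `noise_multiplier` to a target privacy budget is outside the scope of this sketch.

```python
import math
import random


def privatize_gradient(gradient, clip_norm, noise_multiplier, rng):
    """Clip a gradient to L2 norm `clip_norm`, then add isotropic Gaussian noise."""
    norm = math.sqrt(sum(g * g for g in gradient))
    # Scale down only when the gradient is longer than the clipping threshold.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in gradient]
    sigma = noise_multiplier * clip_norm
    return [c + rng.gauss(0.0, sigma) for c in clipped]


def ascent_step(params, private_gradient, learning_rate):
    """Move the parameters a small step along the privatized gradient."""
    return [p + learning_rate * g for p, g in zip(params, private_gradient)]
```

Clipping bounds how much any single gradient can move the parameters, and the noise scale is tied to that bound. The residual noise is one way to read the "natural privacy error" the abstract mentions.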

Article 403

Title@2025-05-28 (3): When Collaborative Filtering is not Collaborative: Unfairness of PCA for Recommendations

Title: When Collaborative Filtering is not Collaborative: Unfairness of PCA for Recommendations

Wenn Kollaborative Filterung nicht kollaborativ ist: Unfairness von PCA für Empfehlungen

当协作过滤不是协作过滤时:常设仲裁院不公平以征求建议 2310.09687v2

Authors: David Liu, Jackie Baek, Tina Eliassi-Rad

We study the fairness of dimensionality reduction methods for recommendations. We focus on the fundamental method of principal component analysis (PCA), which identifies latent components and produces a low-rank approximation via the leading components while discarding the trailing components. Prior works have defined notions of “fair PCA”; however, these definitions do not answer the following question: why is PCA unfair? We identify two underlying popularity mechanisms that induce item unfairness in PCA. The first negatively impacts less popular items because less popular items rely on trailing latent components to recover their values. The second negatively impacts highly popular items, since the leading PCA components specialize in individual popular items instead of capturing similarities between items. To address these issues, we develop a polynomial-time algorithm, Item-Weighted PCA, that flexibly up-weights less popular items when optimizing for leading principal components. We theoretically show that PCA, in all cases, and Normalized PCA, in cases of block-diagonal matrices, are instances of Item-Weighted PCA. We empirically show that there exist datasets for which Item-Weighted PCA yields the optimal solution while the baselines do not. In contrast to past dimensionality reduction re-weighting techniques, Item-Weighted PCA solves a convex optimization problem and enforces a hard rank constraint. Our evaluations on real-world datasets show that Item-Weighted PCA not only mitigates both unfairness mechanisms, but also produces recommendations that outperform those of PCA baselines.

Article 404

Title@2025-05-28 (3): Preference Learning with Response Time

Title: Preference Learning with Response Time

Präferenz-Lernen mit Reaktionszeit

具有响应时间的优先学习 2505.22820v1

Authors: Ayush Sawarni, Sahasrajit Sarmasarkar, Vasilis Syrgkanis

This paper investigates the integration of response time data into human preference learning frameworks for more effective reward model elicitation. While binary preference data has become fundamental in fine-tuning foundation models, generative AI systems, and other large-scale models, the valuable temporal information inherent in user decision-making remains largely unexploited. We propose novel methodologies to incorporate response time information alongside binary choice data, leveraging the Evidence Accumulation Drift Diffusion (EZ) model, under which response time is informative of the preference strength. We develop Neyman-orthogonal loss functions that achieve oracle convergence rates for reward model learning, matching the theoretical optimal rates that would be attained if the expected response times for each query were known a priori. Our theoretical analysis demonstrates that for linear reward functions, conventional preference learning suffers from error rates that scale exponentially with reward magnitude. In contrast, our response time-augmented approach reduces this to polynomial scaling, representing a significant improvement in sample efficiency. We extend these guarantees to non-parametric reward function spaces, establishing convergence properties for more complex, realistic reward models. Our extensive experiments validate our theoretical findings in the context of preference learning over images.
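
Why response time carries information about preference strength can be seen from the closed-form moments of a basic drift-diffusion model. The sketch assumes unit noise variance, symmetric decision barriers at plus and minus `barrier`, an unbiased starting point, and no non-decision time; these simplifications and all function names are assumptions for illustration, not the paper's estimator.

```python
import math


def ddm_choice_mean(reward_gap, barrier):
    """Expected choice in {-1, +1} when the drift equals the reward gap."""
    return math.tanh(barrier * reward_gap)


def ddm_response_time_mean(reward_gap, barrier):
    """Expected decision time; the limit as the gap goes to 0 is barrier**2."""
    if reward_gap == 0:
        return barrier ** 2
    return (barrier / reward_gap) * math.tanh(barrier * reward_gap)


def reward_gap_from_means(choice_mean, time_mean, barrier):
    """Recover the reward gap from the ratio of mean choice to mean response time."""
    return barrier * choice_mean / time_mean
```

For large reward gaps the mean choice saturates near 1, so choices alone barely distinguish a strong preference from a very strong one, while the mean response time keeps shrinking roughly like `barrier / reward_gap`. The ratio of the two moments is linear in the gap.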

Article 405

Title@2025-05-28 (3): IMTS is Worth Time $\times$ Channel Patches: Visual Masked Autoencoders for Irregular Multivariate Time Series Prediction

Title: IMTS is Worth Time $\times$ Channel Patches: Visual Masked Autoencoders for Irregular Multivariate Time Series Prediction

IMTS ist Zeit wert $\times$ Channel Patches: Visual Masked Autoencoder für irreguläre Multivariate Time Series Prediction

IMTS 是有价值的时间 $\times$ 频道补丁: 用于非常规多变时间序列预测的视觉蒙面自动编码器 2505.22815v1

Authors: Zhangyi Hu, Jiemin Wu, Hua Xu, Mingqian Liao, Ninghui Feng, Bo Gao, Songning Lai, Yutao Yue

Irregular Multivariate Time Series (IMTS) forecasting is challenging due to the unaligned nature of multi-channel signals and the prevalence of extensive missing data. Existing methods struggle to capture reliable temporal patterns from such data due to significant missing values. While pre-trained foundation models show potential for addressing these challenges, they are typically designed for Regularly Sampled Time Series (RTS). Motivated by the visual Mask AutoEncoder’s (MAE) powerful capability for modeling sparse multi-channel information and its success in RTS forecasting, we propose VIMTS, a framework adapting Visual MAE for IMTS forecasting. To mitigate the effect of missing values, VIMTS first processes IMTS along the timeline into feature patches at equal intervals. These patches are then complemented using learned cross-channel dependencies. Then it leverages visual MAE’s capability in handling sparse multichannel data for patch reconstruction, followed by a coarse-to-fine technique to generate precise predictions from focused contexts. In addition, we integrate self-supervised learning for improved IMTS modeling by adapting the visual MAE to IMTS data. Extensive experiments demonstrate VIMTS’s superior performance and few-shot capability, advancing the application of visual foundation models in more general time series tasks. Our code is available at https://github.com/WHU-HZY/VIMTS.
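
The first step described above, cutting an irregularly sampled channel into patches at equal time intervals, might look like the sketch below. Using the bin mean as the patch feature and returning a presence mask are assumptions made for illustration; the VIMTS repository should be consulted for the actual feature extraction.

```python
def patchify_channel(times, values, start, end, num_patches):
    """Average one channel's irregular observations inside equal-width time bins.

    Returns (patch_means, mask), where mask[i] is 1 if bin i received at least
    one observation and 0 if the bin is empty (a missing patch).
    """
    width = (end - start) / num_patches
    sums = [0.0] * num_patches
    counts = [0] * num_patches
    for t, v in zip(times, values):
        if not (start <= t <= end):
            continue  # ignore observations outside the window
        # Clamp so that t == end falls into the last bin.
        idx = min(int((t - start) / width), num_patches - 1)
        sums[idx] += v
        counts[idx] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    mask = [1 if c else 0 for c in counts]
    return means, mask
```

Applying this per channel yields a regular time-by-channel grid with explicit holes. That is the kind of sparse, image-like input a masked autoencoder is designed to reconstruct.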

Article 406

Title@2025-05-28 (3): Regression and Forecasting of U.S. Stock Returns Based on LSTM

Title: Regression and Forecasting of U.S. Stock Returns Based on LSTM

Regression und Prognose von US-Aktienrenditen basierend auf LSTM

根据LSTM对美国库存收益的回归和预测 2502.05210v3

Authors: Shicheng Zhou, Zizhou Zhang, Rong Zhang, Yuchen Yin, Chia Hong Chang, Qinyan Shen

This paper analyses the investment returns of three stock sectors, Manuf, Hitec, and Other, in the U.S. stock market, based on the Fama-French three-factor model, the Carhart four-factor model, and the Fama-French five-factor model, in order to test the validity of these models for the three sectors of the market. Also, the LSTM model is used to explore the additional factors affecting stock returns. The empirical results show that the Fama-French five-factor model has better validity for the three segments of the market under study, and the LSTM model has the ability to capture the factors affecting the returns of certain industries, and can better regress and predict the stock returns of the relevant industries. Keywords: Fama-French model; Carhart model; Factor model; LSTM model.
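
For reference, the three factor specifications named in this abstract are conventionally written as the time-series regressions below. The coefficient symbols follow common textbook notation and are an assumption here, since the abstract does not give the authors' own notation.

```latex
% Fama-French three-factor model
R_{i,t} - R_{f,t} = \alpha_i + \beta_i \,(R_{m,t} - R_{f,t})
    + s_i\,\mathrm{SMB}_t + h_i\,\mathrm{HML}_t + \varepsilon_{i,t}

% Carhart four-factor model (adds momentum)
R_{i,t} - R_{f,t} = \alpha_i + \beta_i \,(R_{m,t} - R_{f,t})
    + s_i\,\mathrm{SMB}_t + h_i\,\mathrm{HML}_t + m_i\,\mathrm{MOM}_t + \varepsilon_{i,t}

% Fama-French five-factor model (adds profitability and investment)
R_{i,t} - R_{f,t} = \alpha_i + \beta_i \,(R_{m,t} - R_{f,t})
    + s_i\,\mathrm{SMB}_t + h_i\,\mathrm{HML}_t
    + r_i\,\mathrm{RMW}_t + c_i\,\mathrm{CMA}_t + \varepsilon_{i,t}
```

Here $R_{i,t}$ is the return of sector $i$, $R_{f,t}$ the risk-free rate, and $R_{m,t}$ the market return. A model is usually judged valid for a sector when the intercept $\alpha_i$ is statistically indistinguishable from zero.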

Article 407

Title@2025-05-28 (3): X-Factor: Quality Is a Dataset-Intrinsic Property

Title: X-Factor: Quality Is a Dataset-Intrinsic Property

X-Factor: Qualität ist eine datensatzintrinsische Eigenschaft

X 要素: 质量是一个数据集 - Intrins 属性 2505.22813v1

Authors: Josiah Couch, Miao Li, Rima Arnaout, Ramy Arnaout

In the universal quest to optimize machine-learning classifiers, three factors – model architecture, dataset size, and class balance – have been shown to influence test-time performance but do not fully account for it. Previously, evidence was presented for an additional factor that can be referred to as dataset quality, but it was unclear whether this was actually a joint property of the dataset and the model architecture, or an intrinsic property of the dataset itself. If quality is truly dataset-intrinsic and independent of model architecture, dataset size, and class balance, then the same datasets should perform better (or worse) regardless of these other factors. To test this hypothesis, here we create thousands of datasets, each controlled for size and class balance, and use them to train classifiers with a wide range of architectures, from random forests and support-vector machines to deep networks. We find that classifier performance correlates strongly by subset across architectures ($R^2=0.79$), supporting quality as an intrinsic property of datasets independent of dataset size and class balance and of model architecture. Digging deeper, we find that dataset quality appears to be an emergent property of something more fundamental: the quality of datasets’ constituent classes. Thus, quality joins size, class balance, and model architecture as an independent correlate of performance and a separate target for optimizing machine-learning-based classification.

Article 408

Title@2025-05-28 (3): Credit Risk Identification in Supply Chains Using Generative Adversarial Networks

Title: Credit Risk Identification in Supply Chains Using Generative Adversarial Networks

Kreditrisikoidentifizierung in Lieferketten mit generativen Adversarial-Netzwerken

利用产生反逆网络的供应链中的信用风险识别 2501.10348v4

Authors: Zizhou Zhang, Xinshi Li, Yu Cheng, Zhenrui Chen, Qianying Liu

Credit risk management within supply chains has emerged as a critical research area due to its significant implications for operational stability and financial sustainability. The intricate interdependencies among supply chain participants mean that credit risks can propagate across networks, with impacts varying by industry. This study explores the application of Generative Adversarial Networks (GANs) to enhance credit risk identification in supply chains. GANs enable the generation of synthetic credit risk scenarios, addressing challenges related to data scarcity and imbalanced datasets. By leveraging GAN-generated data, the model improves predictive accuracy while effectively capturing dynamic and temporal dependencies in supply chain data. The research focuses on three representative industries, manufacturing (steel), distribution (pharmaceuticals), and services (e-commerce), to assess industry-specific credit risk contagion. Experimental results demonstrate that the GAN-based model outperforms traditional methods, including logistic regression, decision trees, and neural networks, achieving superior accuracy, recall, and F1 scores. The findings underscore the potential of GANs in proactive risk management, offering robust tools for mitigating financial disruptions in supply chains. Future research could expand the model by incorporating external market factors and supplier relationships to further enhance predictive capabilities. Keywords: Generative Adversarial Networks (GANs); Supply Chain Risk; Credit Risk Identification; Machine Learning; Data Augmentation

Article 409

Title@2025-05-28 (3): Highly Efficient and Effective LLMs with Multi-Boolean Architectures

Title: Highly Efficient and Effective LLMs with Multi-Boolean Architectures

Hocheffiziente und effektive LLMs mit Multi-Boolean-Architekturen

多Boolean建筑群高效益的LLMs 2505.22811v1

Authors: Ba-Hien Tran, Van Minh Nguyen

Weight binarization has emerged as a promising strategy to drastically reduce the complexity of large language models (LLMs). It is mainly classified into two approaches: post-training binarization and finetuning with training-aware binarization methods. The first approach, while having low complexity, leads to significant loss of information from the original LLMs, resulting in poor performance. The second approach, on the other hand, relies heavily on full-precision latent weights for gradient approximation of binary weights, which not only remains suboptimal but also introduces substantial complexity. In this paper, we introduce a novel framework that effectively transforms LLMs into multi-kernel Boolean parameters and, for the first time, finetunes them directly in the Boolean domain, eliminating the need for expensive latent weights. This significantly reduces complexity during both finetuning and inference. Through extensive and insightful experiments across a wide range of LLMs, we demonstrate that our method outperforms recent ultra low-bit quantization and binarization methods.

Article 410

Title@2025-05-28 (3): Distribution free M-estimation

Title: Distribution free M-estimation

Verteilungsfreie M-Schätzung

免费分发 M - 估计 2505.22807v1

Authors: John C. Duchi

The basic question of delineating those statistical problems that are solvable without making any assumptions on the underlying data distribution has long animated statistics and learning theory. This paper characterizes when a (univariate) convex M-estimation or stochastic optimization problem is solvable in such an assumption-free setting, providing a precise dividing line between solvable and unsolvable problems. The conditions we identify show, perhaps surprisingly, that Lipschitz continuity of the loss being minimized is not necessary for distribution free minimization, and they are also distinct from classical characterizations of learnability in machine learning.

Article 411

Title: Anomalies by Synthesis: Anomaly Detection using Generative Diffusion Models for Off-Road Navigation

Anomalien durch Synthese: Anomalieerkennung mit generativen Diffusionsmodellen für Off-Road-Navigation

合成反常现象:使用非轨道导航生成扩散模型进行异常检测 2505.22805v1

Authors: Siddharth Ancha, Sunshine Jiang, Travis Manderson, Laura Brandt, Yilun Du, Philip R. Osteen, Nicholas Roy

In order to navigate safely and reliably in off-road and unstructured environments, robots must detect anomalies that are out-of-distribution (OOD) with respect to the training data. We present an analysis-by-synthesis approach for pixel-wise anomaly detection without making any assumptions about the nature of OOD data. Given an input image, we use a generative diffusion model to synthesize an edited image that removes anomalies while keeping the remaining image unchanged. Then, we formulate anomaly detection as analyzing which image segments were modified by the diffusion model. We propose a novel inference approach for guided diffusion by analyzing the ideal guidance gradient and deriving a principled approximation that bootstraps the diffusion model to predict guidance gradients. Our editing technique operates purely at test time and can be integrated into existing workflows without the need for retraining or fine-tuning. Finally, we use a combination of vision-language foundation models to compare pixels in a learned feature space and detect semantically meaningful edits, enabling accurate anomaly detection for off-road navigation. Project website: https://siddancha.github.io/anomalies-by-diffusion-synthesis/

Article 412

Title@2025-05-28 (3): CLUE: Neural Networks Calibration via Learning Uncertainty-Error alignment

Title: CLUE: Neural Networks Calibration via Learning Uncertainty-Error alignment

CLUE: Neurale Netzwerke Kalibrierung über Learning Uncertainty-Error Alignment

CLUE:通过学习不确定性-差错对齐校准神经网络 2505.22803v1

Authors: Pedro Mendes, Paolo Romano, David Garlan

Reliable uncertainty estimation is critical for deploying neural networks (NNs) in real-world applications. While existing calibration techniques often rely on post-hoc adjustments or coarse-grained binning methods, they remain limited in scalability, differentiability, and generalization across domains. In this work, we introduce CLUE (Calibration via Learning Uncertainty-Error Alignment), a novel approach that explicitly aligns predicted uncertainty with observed error during training, grounded in the principle that well-calibrated models should produce uncertainty estimates that match their empirical loss. CLUE adopts a novel loss function that jointly optimizes predictive performance and calibration, using summary statistics of uncertainty and loss as proxies. The proposed method is fully differentiable, domain-agnostic, and compatible with standard training pipelines. Through extensive experiments on vision, regression, and language modeling tasks, including out-of-distribution and domain-shift scenarios, we demonstrate that CLUE achieves superior calibration quality and competitive predictive performance with respect to state-of-the-art approaches without imposing significant computational overhead.

Article 413

Title@2025-05-28 (3): Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning

Title: Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning

Instruct-SkillMix: Eine leistungsstarke Pipeline für LLM Instruction Tuning

指令- SkillMix: 用于LLM 指令导导图的强大管道 2408.14774v4

Authors: Simran Kaur, Simon Park, Anirudh Goyal, Sanjeev Arora

We introduce Instruct-SkillMix, an automated approach for creating diverse, high quality SFT data for instruction-following. The pipeline involves two stages, each leveraging an existing powerful LLM: (1) Skill extraction: uses the LLM to extract core “skills” for instruction-following by directly prompting the model. This is inspired by “LLM metacognition” of Didolkar et al. (2024); (2) Data generation: uses the powerful LLM to generate (instruction, response) data that exhibit a randomly chosen pair of these skills. Here, the use of random skill combinations promotes diversity and difficulty. The estimated cost of creating the dataset is under $600. Vanilla SFT (i.e., no PPO, DPO, or RL methods) on data generated from Instruct-SkillMix leads to strong gains on instruction-following benchmarks such as AlpacaEval 2.0, MT-Bench, and WildBench. With just 4K examples, LLaMA-3-8B-Base achieves 42.76% length-controlled win rate on AlpacaEval 2.0, a level similar to frontier models like Claude 3 Opus and LLaMA-3.1-405B-Instruct. Ablation studies also suggest plausible reasons for why creating open instruction-tuning datasets via naive crowd-sourcing has proved difficult. In our dataset, adding 20% low quality answers (“shirkers”) causes a noticeable degradation in performance. The Instruct-SkillMix pipeline seems flexible and adaptable to other settings.
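
The second stage above, drawing a random pair of skills and asking a strong LLM for data that exercises both, can be sketched as below. The prompt wording, the skill list, and the function names are hypothetical; only the random-pair idea comes from the abstract, and no LLM call is made here.

```python
import random


def sample_skill_pair(skills, rng):
    """Draw two distinct skills uniformly at random from the extracted list."""
    return tuple(rng.sample(skills, 2))


def build_generation_prompt(skill_pair):
    """Compose a hypothetical data-generation prompt for a strong LLM."""
    first, second = skill_pair
    return (
        "Write one instruction and a high-quality response that together "
        f"require both of these skills: {first}; {second}."
    )


if __name__ == "__main__":
    # Illustrative skill names only; the real list would come from stage (1).
    extracted_skills = ["critical thinking", "data analysis", "creative writing"]
    pair = sample_skill_pair(extracted_skills, random.Random(0))
    print(build_generation_prompt(pair))
```

With N extracted skills there are N(N-1)/2 unordered pairs, so even a modest skill list yields many distinct combinations. This is the diversity argument the abstract makes.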

Article 414

Title@2025-05-28 (3): SequentialBreak: Large Language Models Can be Fooled by Embedding Jailbreak Prompts into Sequential Prompt Chains

Title: SequentialBreak: Large Language Models Can be Fooled by Embedding Jailbreak Prompts into Sequential Prompt Chains

SequentialBreak: Große Sprachmodelle können durch Einbetten von Jailbreak Prompts in Sequential Prompt Chains ausgeblendet werden

顺序式布雷克:大语言模型可以通过将破狱线索嵌入顺序式提示链来蒙骗大语言模型 2411.06426v3

Authors: Bijoy Ahmed Saiem, MD Sadik Hossain Shanto, Rakib Ahsan, Md Rafi ur Rashid

As the integration of the Large Language Models (LLMs) into various applications increases, so does their susceptibility to misuse, raising significant security concerns. Numerous jailbreak attacks have been proposed to assess the security defense of LLMs. Current jailbreak attacks mainly rely on scenario camouflage, prompt obfuscation, prompt optimization, and prompt iterative optimization to conceal malicious prompts. In particular, sequential prompt chains in a single query can lead LLMs to focus on certain prompts while ignoring others, facilitating context manipulation. This paper introduces SequentialBreak, a novel jailbreak attack that exploits this vulnerability. We discuss several scenarios, not limited to examples like Question Bank, Dialog Completion, and Game Environment, where the harmful prompt is embedded within benign ones that can fool LLMs into generating harmful responses. The distinct narrative structures of these scenarios show that SequentialBreak is flexible enough to adapt to various prompt formats beyond those discussed. Extensive experiments demonstrate that SequentialBreak uses only a single query to achieve a substantial gain of attack success rate over existing baselines against both open-source and closed-source models. Through our research, we highlight the urgent need for more robust and resilient safeguards to enhance LLM security and prevent potential misuse. All the result files and website associated with this research are available in this GitHub repository: https://anonymous.4open.science/r/JailBreakAttack-4F3B/.

Article 415

Title@2025-05-28 (3): Efficient Preimage Approximation for Neural Network Certification

Title: Efficient Preimage Approximation for Neural Network Certification

Effiziente Preimage-Annäherung für die Neural Network Zertifizierung

神经网络认证的高效预感近似率 2505.22798v1

Authors: Anton Björklund, Mykola Zaitsev, Marta Kwiatkowska

The growing reliance on artificial intelligence in safety- and security-critical applications demands effective neural network certification. A challenging real-world use case is certification against “patch attacks”, where adversarial patches or lighting conditions obscure parts of images, for example traffic signs. One approach to certification, which also gives quantitative coverage estimates, utilizes preimages of neural networks, i.e., the set of inputs that lead to a specified output. However, these preimage approximation methods, including the state-of-the-art PREMAP algorithm, struggle with scalability. This paper presents novel algorithmic improvements to PREMAP involving tighter bounds, adaptive Monte Carlo sampling, and improved branching heuristics. We demonstrate efficiency improvements of at least an order of magnitude on reinforcement learning control benchmarks, and show that our method scales to convolutional neural networks that were previously infeasible. Our results demonstrate the potential of preimage approximation methodology for reliability and robustness certification.
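
The notion of a preimage and its "quantitative coverage" can be illustrated with plain Monte Carlo sampling: draw inputs from a box and count how many produce the specified output. This sketch is not PREMAP, which builds certified polytope approximations with branching; the toy network and all names are assumptions for illustration.

```python
import random


def estimate_preimage_fraction(predicate, lower, upper, num_samples, rng):
    """Monte Carlo estimate of the fraction of a box whose points satisfy `predicate`."""
    hits = 0
    for _ in range(num_samples):
        point = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        if predicate(point):
            hits += 1
    return hits / num_samples


def tiny_relu_net_is_positive(point):
    """Hypothetical two-input, one-unit ReLU network; True when its output is positive."""
    x, y = point
    hidden = max(0.0, x + y - 1.0)
    return hidden > 0.0


if __name__ == "__main__":
    fraction = estimate_preimage_fraction(
        tiny_relu_net_is_positive, [0.0, 0.0], [1.0, 1.0], 20000, random.Random(0)
    )
    print(f"estimated preimage coverage of the unit square: {fraction:.3f}")
```

For this toy network the preimage of a positive output inside the unit square is the triangle x + y > 1, whose exact area is 0.5, so the estimate can be checked by hand. Sampling alone gives no guarantee, which is why certified methods pair it with sound bounds.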

Article 416

Title: DeSocial: Blockchain-based Decentralized Social Networks

DeSocial: Dezentrale soziale Netzwerke auf Blockchain-Basis

社会:基于供应链的权力下放社会网络 2505.21388v2

Authors: Jingyuan Huang, Xi Zhu, Minghao Guo, Yongfeng Zhang

Web 2.0 social platforms are inherently centralized, with user data and algorithmic decisions controlled by the platform. As a result, users can only passively receive social predictions without being able to choose the underlying algorithm, which limits personalization. Fortunately, with the emergence of blockchain, users are allowed to choose algorithms that are tailored to their local situation, improving prediction results in a personalized way. In a blockchain environment, each user possesses their own model to perform the social prediction, capturing different perspectives on social interactions. In our work, we propose DeSocial, a decentralized social network learning framework deployed on an Ethereum (ETH) local development chain that integrates distributed data storage, node-level consensus, and user-driven model selection through Ganache. In the first stage, each user leverages DeSocial to evaluate multiple backbone models on their local subgraph. DeSocial coordinates the execution and returns model-wise prediction results, enabling the user to select the most suitable backbone for personalized social prediction. Then, DeSocial uniformly selects several validation nodes that possess the algorithm specified by each user, and aggregates the prediction results by majority voting, to prevent errors caused by any single model’s misjudgment. Extensive experiments show that DeSocial clearly improves on five classical centralized social network learning models, which promotes user empowerment in blockchain-based decentralized social networks and shows the importance of multi-node validation and personalized algorithm selection based on blockchain. Our implementation is available at: https://github.com/agiresearch/DeSocial.
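
The aggregation step described above, where several validation nodes each return a prediction and the result is decided by majority voting, reduces to a few lines. This sketch ignores the on-chain coordination entirely; the function name and the 0/1 link-prediction labels are assumptions for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common prediction among validator nodes.

    Ties are resolved in favor of the value that appears first in the list.
    """
    if not predictions:
        raise ValueError("need at least one validator prediction")
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Five hypothetical validators predicting whether a link exists (1) or not (0).
    validator_outputs = [1, 0, 1, 1, 0]
    print(majority_vote(validator_outputs))
```

Using an odd number of validators avoids ties for binary predictions. One misjudging validator is then outvoted as long as most of the others are correct.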

Article 417

Title@2025-05-28 (3): The Empirical Mean is Minimax Optimal for Local Glivenko-Cantelli

Title: The Empirical Mean is Minimax Optimal for Local Glivenko-Cantelli

Das Empirische Mittel ist Minimax Optimal für lokale Glivenko-Cantelli

当地格利文科-坎泰利的经验中值为 Minimax 最佳当地格利文科-坎泰利 2410.02835v2

Authors: Doron Cohen, Aryeh Kontorovich, Roi Weiss

We revisit the recently introduced Local Glivenko-Cantelli setting, which studies distribution-dependent uniform convergence rates of the Empirical Mean Estimator (EME). In this work, we investigate generalizations of this setting where arbitrary estimators are allowed rather than just the EME. Can a strictly larger class of measures be learned? Can better risk decay rates be obtained? We provide exhaustive answers to these questions, which are both negative, provided the learner is barred from exploiting some infinite-dimensional pathologies. On the other hand, allowing such exploits does lead to a strictly larger class of learnable measures.

Article 418

Title@2025-05-28 (3): KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Title: KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

KVQuant: In Richtung 10 Millionen Kontextlänge LLM-Inferenz mit KV Cache-Quantisierung

KVQuant: 努力达到1000万个内长长LLM 与 KV 缓存量推论 2401.18079v6

Authors: Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami

LLMs are seeing growing use for applications which require large context windows, and with these large context windows KV cache activations surface as the dominant contributor to memory consumption during inference. Quantization is a promising approach for compressing KV cache activations; however, existing solutions fail to represent activations accurately in sub-4-bit precision. Our work, KVQuant, facilitates low precision KV cache quantization by incorporating several novel methods: (i) Per-Channel Key Quantization, where we adjust the dimension along which we quantize the Key activations to better match the distribution; (ii) Pre-RoPE Key Quantization, where we quantize Key activations before the rotary positional embedding to mitigate its impact on quantization; (iii) Non-Uniform KV Cache Quantization, where we derive per-layer sensitivity-weighted non-uniform datatypes that better represent the distributions; and (iv) Per-Vector Dense-and-Sparse Quantization, where we isolate outliers separately for each vector to minimize skews in quantization ranges. By applying our method to the LLaMA, Llama-2, Llama-3, and Mistral models, we achieve < 0.1 perplexity degradation with 3-bit quantization on both Wikitext-2 and C4, outperforming existing approaches. Our method enables serving LLaMA-7B with a context length of up to 1 million on a single A100-80GB GPU and up to 10 million on an 8-GPU system. We develop custom CUDA kernels for KVQuant, showing that we can achieve up to ~1.7x speedups, compared to baseline fp16 matrix-vector multiplications, for the LLaMA-7B model.
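
Item (i) above, choosing the dimension along which Key activations are quantized, can be illustrated with a toy example. This is a sketch under simplifying assumptions: it uses plain uniform quantization (not the non-uniform datatypes of item (iii)), and the matrix values and helper names are hypothetical.

```python
def quantize_dequantize(values, bits):
    """Uniform asymmetric quantization of a list of floats, then dequantization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels
    return [lo + round((v - lo) / scale) * scale for v in values]

def quantize_matrix(matrix, bits, per_channel):
    """Quantize a tokens-by-channels matrix with one range per column or per row."""
    if per_channel:
        columns = [quantize_dequantize(list(col), bits) for col in zip(*matrix)]
        return [list(row) for row in zip(*columns)]
    return [quantize_dequantize(row, bits) for row in matrix]

def mean_abs_error(a, b):
    """Mean absolute element-wise difference between two equally shaped matrices."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))

# Hypothetical Key activations: 4 tokens x 3 channels; the last channel is large.
keys = [
    [0.1, -0.2, 8.0],
    [0.3, 0.1, 9.5],
    [-0.1, 0.2, 7.5],
    [0.2, -0.3, 9.0],
]
err_token = mean_abs_error(keys, quantize_matrix(keys, 3, per_channel=False))
err_channel = mean_abs_error(keys, quantize_matrix(keys, 3, per_channel=True))
```

In this toy case the large-magnitude channel stretches every per-token range, so the small channels collapse onto a single level; giving each channel its own range yields a lower reconstruction error.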

Article 419

Title@2025-05-28 (3): Navigating the Latent Space Dynamics of Neural Models

Title: Navigating the Latent Space Dynamics of Neural Models

Navigation der latenten Raumdynamik von Neuralmodellen

导航内壳模型的冷层空间动态 2505.22785v1

Authors: Marco Fumero, Luca Moschella, Emanuele Rodolà, Francesco Locatello

Neural networks transform high-dimensional data into compact, structured representations, often modeled as elements of a lower dimensional latent space. In this paper, we present an alternative interpretation of neural models as dynamical systems acting on the latent manifold. Specifically, we show that autoencoder models implicitly define a latent vector field on the manifold, derived by iteratively applying the encoding-decoding map, without any additional training. We observe that standard training procedures introduce inductive biases that lead to the emergence of attractor points within this vector field. Drawing on this insight, we propose to leverage the vector field as a representation for the network, providing a novel tool to analyze the properties of the model and the data. This representation makes it possible to: (i) analyze the generalization and memorization regimes of neural models, even throughout training; (ii) extract prior knowledge encoded in the network’s parameters from the attractors, without requiring any input data; (iii) identify out-of-distribution samples from their trajectories in the vector field. We further validate our approach on vision foundation models, showcasing the applicability and effectiveness of our method in real-world scenarios.
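
The idea of a latent vector field obtained by iterating the encoding-decoding map can be sketched on a one-dimensional toy. The map `encode_decode` below is a hypothetical stand-in for a trained autoencoder's composed map, chosen only because it has two stable fixed points; it is not the authors' model.

```python
import math

def encode_decode(z):
    """Hypothetical stand-in for encoder(decoder(z)) on a 1-D latent space."""
    return math.tanh(2.0 * z)

def vector_field(z):
    """Displacement induced by one round trip: V(z) = f(z) - z."""
    return encode_decode(z) - z

def follow_trajectory(z0, steps=200, tol=1e-10):
    """Iterate z <- f(z) until the displacement is negligible; return the end point."""
    z = z0
    for _ in range(steps):
        z_next = encode_decode(z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z

attractor_pos = follow_trajectory(0.3)
attractor_neg = follow_trajectory(-0.3)
```

Points where the vector field vanishes and nearby trajectories converge play the role of attractors; here, initial conditions on either side of zero end up at two different ones.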

Article 420

Title@2025-05-28 (3): On the definition and importance of interpretability in scientific machine learning

Title: On the definition and importance of interpretability in scientific machine learning

Zur Definition und Bedeutung der Deutbarkeit im wissenschaftlichen maschinellen Lernen

关于科学机器学习中可解释性的定义和重要性 2505.13510v2

Authors: Conor Rowan, Alireza Doostan

Though neural networks trained on large datasets have been successfully used to describe and predict many physical phenomena, there is a sense among scientists that, unlike traditional scientific models comprising simple mathematical expressions, their findings cannot be integrated into the body of scientific knowledge. Critics of machine learning’s inability to produce human-understandable relationships have converged on the concept of “interpretability” as its point of departure from more traditional forms of science. As the growing interest in interpretability has shown, researchers in the physical sciences seek not just predictive models, but also to uncover the fundamental principles that govern a system of interest. However, clarity around a definition of interpretability and the precise role that it plays in science is lacking in the literature. In this work, we argue that researchers in equation discovery and symbolic regression tend to conflate the concept of sparsity with interpretability. We review key papers on interpretable machine learning from outside the scientific community and argue that, though the definitions and methods they propose can inform questions of interpretability for scientific machine learning (SciML), they are inadequate for this new purpose. Noting these deficiencies, we propose an operational definition of interpretability for the physical sciences. Our notion of interpretability emphasizes understanding of the mechanism over mathematical sparsity. Innocuous though it may seem, this emphasis on mechanism shows that sparsity is often unnecessary. It also questions the possibility of interpretable scientific discovery when prior knowledge is lacking. We believe a precise and philosophically informed definition of interpretability in SciML will help focus research efforts toward the most significant obstacles to realizing a data-driven scientific future.

Article 421

Title@2025-05-28 (3): Adaptive Exploration for Multi-Reward Multi-Policy Evaluation

Title: Adaptive Exploration for Multi-Reward Multi-Policy Evaluation

Adaptive Exploration für Multi-Reward Multi-Policy-Bewertung

多方奖励多政策评价的适应性探索 2502.02516v2

Authors: Alessio Russo, Aldo Pacchiano

We study the policy evaluation problem in an online multi-reward multi-policy discounted setting, where multiple reward functions must be evaluated simultaneously for different policies. We adopt an $(\epsilon,\delta)$-PAC perspective to achieve $\epsilon$-accurate estimates with high confidence across finite or convex sets of rewards, a setting that has not been investigated in the literature. Building on prior work on Multi-Reward Best Policy Identification, we adapt the MR-NaS exploration scheme to jointly minimize sample complexity for evaluating different policies across different reward sets. Our approach leverages an instance-specific lower bound revealing how the sample complexity scales with a measure of value deviation, guiding the design of an efficient exploration policy. Although computing this bound entails a hard non-convex optimization, we propose an efficient convex approximation that holds for both finite and convex reward sets. Experiments in tabular domains demonstrate the effectiveness of this adaptive exploration scheme.

Article 422

Title@2025-05-28 (3): Temporal Convolutional Autoencoder for Interference Mitigation in FMCW Radar Altimeters

Title: Temporal Convolutional Autoencoder for Interference Mitigation in FMCW Radar Altimeters

Temporal Convolutional Autoencoder für Interferenzmilderung in FMCW Radar Höhenmessern

FMCC 雷达测高仪中用于减少干扰干扰的时时变自动算器 2505.22783v1

Authors: Charles E. Thornton, Jamie Sloop, Samuel Brown, Aaron Orndorff, William C. Headley, Stephen Young

We investigate the end-to-end altitude estimation performance of a convolutional autoencoder-based interference mitigation approach for frequency-modulated continuous-wave (FMCW) radar altimeters. Specifically, we show that a Temporal Convolutional Network (TCN) autoencoder effectively exploits temporal correlations in the received signal, providing superior interference suppression compared to a Least Mean Squares (LMS) adaptive filter. Unlike existing approaches, the present method operates directly on the received FMCW signal. Additionally, we identify key challenges in applying deep learning to wideband FMCW interference mitigation and outline directions for future research to enhance real-time feasibility and generalization to arbitrary interference conditions.
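
For reference, the Least Mean Squares baseline named above can be sketched in its generic textbook form. The signals, tap count, and step size below are illustrative assumptions and do not reflect the authors' radar configuration.

```python
import math

def lms_filter(x, d, num_taps, step_size):
    """Textbook LMS: adapt FIR weights so the filter output tracks d."""
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Most recent sample first; zero-pad before the start of the signal.
        window = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * xk for wk, xk in zip(w, window))
        e = d[n] - y
        # Stochastic-gradient update of each tap.
        w = [wk + step_size * e * xk for wk, xk in zip(w, window)]
        errors.append(e)
    return w, errors

# Hypothetical test signal and a known 2-tap system to identify.
x = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(5000)]
d = [0.5 * x[n] - 0.3 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
weights, errors = lms_filter(x, d, num_taps=2, step_size=0.1)
```

Because LMS adapts a short linear filter one sample at a time, it can only capture linear structure within its tap window, which is the kind of limitation a learned temporal model aims to overcome.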

Article 423

Title@2025-05-28 (3): Finite-Sample Convergence Bounds for Trust Region Policy Optimization in Mean-Field Games

Title: Finite-Sample Convergence Bounds for Trust Region Policy Optimization in Mean-Field Games

Finite-Sample-Konvergenzgrenzen für die Optimierung der Treuhandregion-Politik in Mittelfeld-Spielen

平地运动会中信任区政策优化 2505.22781v1

Authors: Antonio Ocello, Daniil Tiapkin, Lorenzo Mancini, Mathieu Laurière, Eric Moulines

We introduce Mean-Field Trust Region Policy Optimization (MF-TRPO), a novel algorithm designed to compute approximate Nash equilibria for ergodic Mean-Field Games (MFG) in finite state-action spaces. Building on the well-established performance of TRPO in the reinforcement learning (RL) setting, we extend its methodology to the MFG framework, leveraging its stability and robustness in policy optimization. Under standard assumptions in the MFG literature, we provide a rigorous analysis of MF-TRPO, establishing theoretical guarantees on its convergence. Our results cover both the exact formulation of the algorithm and its sample-based counterpart, where we derive high-probability guarantees and finite sample complexity. This work advances MFG optimization by bridging RL techniques with mean-field decision-making, offering a theoretically grounded approach to solving complex multi-agent problems.

Article 424

Title@2025-05-28 (3): Machine Learning Models Have a Supply Chain Problem

Title: Machine Learning Models Have a Supply Chain Problem

Modelle des maschinellen Lernens haben ein Problem mit der Lieferkette

机器学习模式有供应链问题 2505.22778v1

Authors: Sarah Meiklejohn, Hayden Blauzvern, Mihai Maruseac, Spencer Schrock, Laurent Simon, Ilia Shumailov

Powerful machine learning (ML) models are now readily available online, which creates exciting possibilities for users who lack the deep technical expertise or substantial computing resources needed to develop them. On the other hand, this type of open ecosystem comes with many risks. In this paper, we argue that the current ecosystem for open ML models contains significant supply-chain risks, some of which have been exploited already in real attacks. These include an attacker replacing a model with something malicious (e.g., malware), or a model being trained using a vulnerable version of a framework or on restricted or poisoned data. We then explore how Sigstore, a solution designed to bring transparency to open-source software supply chains, can be used to bring transparency to open ML models, in terms of enabling model publishers to sign their models and prove properties about the datasets they use.

Article 425

Title@2025-05-28 (3): GraphNarrator: Generating Textual Explanations for Graph Neural Networks

Title: GraphNarrator: Generating Textual Explanations for Graph Neural Networks

GraphNarrator: Erzeugen von Texterklärungen für Graph Neuronale Netzwerke

图示记录器:生成图形神经网络的文字解释 2410.15268v2

Authors: Bo Pan, Zhen Xiong, Guanchen Wu, Zheng Zhang, Yifei Zhang, Liang Zhao

Graph representation learning has garnered significant attention due to its broad applications in various domains, such as recommendation systems and social network analysis. Despite advancements in graph learning methods, challenges still remain in explainability when graphs are associated with semantic features. In this paper, we present GraphNarrator, the first method designed to generate natural language explanations for Graph Neural Networks. GraphNarrator employs a generative language model that maps input-output pairs to explanations reflecting the model’s decision-making process. To address the lack of ground truth explanations to train the model, we propose first generating pseudo-labels that capture the model’s decisions from saliency-based explanations, then using Expert Iteration to iteratively train the pseudo-label generator based on training objectives on explanation quality. The high-quality pseudo-labels are finally utilized to train an end-to-end explanation generator model. Extensive experiments are conducted to demonstrate the effectiveness of GraphNarrator in producing faithful, concise, and human-preferred natural language explanations.

Article 426

Title@2025-05-28 (3): The Value of Information in Human-AI Decision-making

Title: The Value of Information in Human-AI Decision-making

Der Wert von Informationen in der Mensch-AI-Entscheidungsfindung

信息在人类-大赦国际决策中的价值 2502.06152v4

Authors: Ziyang Guo, Yifan Wu, Jason Hartline, Jessica Hullman

Multiple agents – including humans and AI models – are increasingly combined to make decisions with the expectation of achieving complementary performance, where the decisions they make together outperform those made individually. However, knowing how to improve the performance of collaborating agents is often difficult without knowing more about what particular information and strategies each agent employs. With a focus on human-AI pairings, we contribute a decision-theoretic framework for characterizing the value of information – and consequently, opportunities for agents to better exploit available information – in AI-assisted decision workflows. We present a novel explanation technique (ILIV-SHAP) that adapts SHAP explanations to highlight human-complementing information. We validate the effectiveness of the framework and ILIV-SHAP through a study of human-AI decision-making. We show that our measure of complementary information can be used to identify which AI model will best complement human decisions. We also find that presenting ILIV-SHAP with AI predictions leads to reliably greater reductions in error over non-AI assisted decisions than presenting vanilla SHAP.

Article 427

Title@2025-05-28 (3): Calibrated Value-Aware Model Learning with Stochastic Environment Models

Title: Calibrated Value-Aware Model Learning with Stochastic Environment Models

Kalibriertes wertbewusstes Modelllernen mit stochastischen Umweltmodellen

使用存储环境模型校准价值软件模型学习 2505.22772v1

Authors: Claas Voelcker, Anastasiia Pedan, Arash Ahmadian, Romina Abachi, Igor Gilitschenski, Amir-massoud Farahmand

The idea of value-aware model learning, that models should produce accurate value estimates, has gained prominence in model-based reinforcement learning. The MuZero loss, which penalizes a model’s value function prediction compared to the ground-truth value function, has been utilized in several prominent empirical works in the literature. However, theoretical investigation into its strengths and weaknesses is limited. In this paper, we analyze the family of value-aware model learning losses, which includes the popular MuZero loss. We show that these losses, as normally used, are uncalibrated surrogate losses, which means that they do not always recover the correct model and value function. Building on this insight, we propose corrections to solve this issue. Furthermore, we investigate the interplay between the loss calibration, latent model architectures, and auxiliary losses that are commonly employed when training MuZero-style agents. We show that while deterministic models can be sufficient to predict accurate values, learning calibrated stochastic models is still advantageous.

Article 428

Title@2025-05-28 (3): Multivariate de Bruijn Graphs: A Symbolic Graph Framework for Time Series Forecasting

Title: Multivariate de Bruijn Graphs: A Symbolic Graph Framework for Time Series Forecasting

Multivariate de Bruijn Graphen: Ein symbolisches Graphen-Framework für die Vorhersage von Zeitreihen

布鲁伊图多变量图:时间序列预测符号图框架 2505.22768v1

Authors: Mert Onur Cakiroglu, Idil Bilge Altun, Hasan Kurban, Elham Buxton, Mehmet Dalkilic

Time series forecasting remains a challenging task for foundation models due to temporal heterogeneity, high dimensionality, and the lack of inherent symbolic structure. In this work, we propose DRAGON (Discrete Representation and Augmented Graph encoding Over deBruijN Graphs), a novel encoder that introduces Multivariate de Bruijn Graphs (MdBGs) to bridge the gap between symbolic representations and neural modeling. DRAGON discretizes continuous input sequences and maps them onto a fixed graph structure, enabling dynamic context recovery via graph-based attention. Integrated as an auxiliary module within a dual-branch architecture, DRAGON augments conventional CNN-based encoders with symbolic, structure-aware representations. All code developed for this study is available at: https://github.com/KurbanIntelligenceLab/MultdBG-Time-Series-Library
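
The discretize-then-map-onto-a-graph step can be sketched for a single variable. This is a simplified illustration: equal-width binning, the k-gram length, and the helper names are assumptions, and the sketch omits the multivariate construction and the graph-based attention of DRAGON.

```python
from collections import defaultdict

def discretize(series, num_bins):
    """Map each value to an equal-width bin index in [0, num_bins - 1]."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / num_bins or 1.0  # avoid zero width for constant series
    return [min(int((v - lo) / width), num_bins - 1) for v in series]

def de_bruijn_edges(symbols, k):
    """Count transitions between consecutive overlapping k-grams."""
    edges = defaultdict(int)
    kgrams = [tuple(symbols[i:i + k]) for i in range(len(symbols) - k + 1)]
    for src, dst in zip(kgrams, kgrams[1:]):
        edges[(src, dst)] += 1
    return dict(edges)

# Hypothetical periodic series.
series = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
symbols = discretize(series, num_bins=2)
graph = de_bruijn_edges(symbols, k=2)
```

Each node is a k-gram of symbols and each edge links k-grams that overlap in k - 1 positions, so repeated temporal patterns show up as edges with higher counts.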

Article 429

Title@2025-05-28 (3): Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks

Title: Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks

Degenerierung von Mess- und Regellösungen über aufgabenorientierte recurrente Neuralnetzwerke hinweg

跨任务技术经常性神经网络的退化 2410.03972v2

Authors: Ann Huang, Satpreet H. Singh, Flavio Martinelli, Kanaka Rajan

Task-trained recurrent neural networks (RNNs) are widely used in neuroscience and machine learning to model dynamical computations. To gain mechanistic insight into how neural systems solve tasks, prior work often reverse-engineers individual trained networks. However, different RNNs trained on the same task and achieving similar performance can exhibit strikingly different internal solutions, a phenomenon known as solution degeneracy. Here, we develop a unified framework to systematically quantify and control solution degeneracy across three levels: behavior, neural dynamics, and weight space. We apply this framework to 3,400 RNNs trained on four neuroscience-relevant tasks (flip-flop memory, sine wave generation, delayed discrimination, and path integration) while systematically varying task complexity, learning regime, network size, and regularization. We find that higher task complexity and stronger feature learning reduce degeneracy in neural dynamics but increase it in weight space, with mixed effects on behavior. In contrast, larger networks and structural regularization reduce degeneracy at all three levels. These findings empirically validate the Contravariance Principle and provide practical guidance for researchers aiming to tailor RNN solutions, whether to uncover shared neural mechanisms or to model individual variability observed in biological systems. This work provides a principled framework for quantifying and controlling solution degeneracy in task-trained RNNs, offering new tools for building more interpretable and biologically grounded models of neural computation.

Article 430

Title@2025-05-28 (3): Test-time augmentation improves efficiency in conformal prediction

Title: Test-time augmentation improves efficiency in conformal prediction

Testzeitvergrößerung verbessert die Effizienz in der konformen Vorhersage

提高试验时间的提高提高符合预测的效率 2505.22764v1

Authors: Divya Shanmugam, Helen Lu, Swami Sankaranarayanan, John Guttag

A conformal classifier produces a set of predicted classes and provides a probabilistic guarantee that the set includes the true class. Unfortunately, it is often the case that conformal classifiers produce uninformatively large sets. In this work, we show that test-time augmentation (TTA), a technique that introduces inductive biases during inference, reduces the size of the sets produced by conformal classifiers. Our approach is flexible, computationally efficient, and effective. It can be combined with any conformal score, requires no model retraining, and reduces prediction set sizes by 10%-14% on average. We conduct an evaluation of the approach spanning three datasets, three models, two established conformal scoring methods, different guarantee strengths, and several distribution shifts to show when and why test-time augmentation is a useful addition to the conformal pipeline.
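
How test-time augmentation slots into a conformal pipeline can be sketched with split conformal prediction. The sketch assumes a simple mean over augmented views and the score 1 - p(class); the aggregation used by the authors may differ, and all numbers below are hypothetical.

```python
import math

def average_augmented_probs(prob_lists):
    """Average class-probability vectors predicted for augmented views of one input."""
    n = len(prob_lists)
    return [sum(p[c] for p in prob_lists) / n for c in range(len(prob_lists[0]))]

def conformal_threshold(cal_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")
    return sorted(cal_scores)[rank - 1]

def prediction_set(probs, threshold):
    """Keep every class whose nonconformity score 1 - p is within the threshold."""
    return [c for c, p in enumerate(probs) if 1.0 - p <= threshold]

# Hypothetical calibration scores (1 - probability of the true class); in a real
# pipeline these would also be computed from the augmented, averaged probabilities.
cal_scores = [0.10, 0.30, 0.20, 0.50, 0.40, 0.15, 0.35]
threshold = conformal_threshold(cal_scores, alpha=0.25)

# Two hypothetical augmented views of one test input, three classes.
views = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
probs = average_augmented_probs(views)
pred_set = prediction_set(probs, threshold)
```

Only the probabilities fed into the score change, so the coverage argument of split conformal prediction is untouched as long as calibration and test inputs are processed the same way.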

Article 431

Title@2025-05-28 (3): Generalizable Representation Learning for fMRI-based Neurological Disorder Identification

Title: Generalizable Representation Learning for fMRI-based Neurological Disorder Identification

Generalisierbares Repräsentationslernen für die fMRI-basierte neurologische Störungserkennung

FMRI基于神经疾病识别的神经疾病学学习 2412.16197v2

Authors: Wenhui Cui, Haleh Akrami, Anand A. Joshi, Richard M. Leahy

Despite the impressive advances achieved using deep learning for functional brain activity analysis, the heterogeneity of functional patterns and the scarcity of imaging data still pose challenges in tasks such as identifying neurological disorders. For functional Magnetic Resonance Imaging (fMRI), while data may be abundantly available from healthy controls, clinical data is often scarce, especially for rare diseases, limiting the ability of models to identify clinically-relevant features. We overcome this limitation by introducing a novel representation learning strategy integrating meta-learning with self-supervised learning to improve the generalization from normal to clinical features. This approach enables generalization to challenging clinical tasks featuring scarce training data. We achieve this by leveraging self-supervised learning on the control dataset to focus on inherent features that are not limited to a particular supervised task and incorporating meta-learning to improve the generalization across domains. To explore the generalizability of the learned representations to unseen clinical applications, we apply the model to four distinct clinical datasets featuring scarce and heterogeneous data for neurological disorder classification. Results demonstrate the superiority of our representation learning strategy on diverse clinically-relevant tasks. Code is publicly available at https://github.com/wenhui0206/MeTSK/tree/main

Article 432

Title@2025-05-28 (3): MIAS-SAM: Medical Image Anomaly Segmentation without thresholding

Title: MIAS-SAM: Medical Image Anomaly Segmentation without thresholding

MIAS-SAM: Medizinische Bildanomalie Segmentierung ohne Schwellenbildung

MIAS-SAM: 医学形象非典型分割,无阈值 2505.22762v1

Authors: Marco Colussi, Dragan Ahmetovic, Sergio Mascetti

This paper presents MIAS-SAM, a novel approach for the segmentation of anomalous regions in medical images. MIAS-SAM uses a patch-based memory bank to store relevant image features, which are extracted from normal data using the SAM encoder. At inference time, the embedding patches extracted from the SAM encoder are compared with those in the memory bank to obtain the anomaly map. Finally, MIAS-SAM computes the center of gravity of the anomaly map to prompt the SAM decoder, obtaining an accurate segmentation from the previously extracted features. Unlike prior work, MIAS-SAM does not require defining a threshold value to obtain the segmentation from the anomaly map. Experimental results on three publicly available datasets, each with a different imaging modality (Brain MRI, Liver CT, and Retina OCT), show accurate anomaly segmentation capabilities, measured using the DICE score. The code is available at: https://github.com/warpcut/MIAS-SAM
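
The center-of-gravity step can be sketched as an intensity-weighted centroid over a 2-D anomaly map. The function name and the toy map are hypothetical; how the resulting coordinate is passed to the SAM decoder as a prompt is not shown here.

```python
def center_of_gravity(anomaly_map):
    """Return the (row, col) intensity-weighted centroid of a 2-D map of non-negative scores."""
    total = sum(sum(row) for row in anomaly_map)
    if total == 0:
        raise ValueError("anomaly map has no mass")
    row_c = sum(r * sum(row) for r, row in enumerate(anomaly_map)) / total
    col_c = sum(c * v for row in anomaly_map for c, v in enumerate(row)) / total
    return row_c, col_c

# Hypothetical 4x4 anomaly map with a 2x2 anomalous region in the middle.
amap = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
point = center_of_gravity(amap)
```

Because the centroid is computed from the whole map, no cutoff on anomaly scores is needed to pick the prompt location.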

Article 433

Title@2025-05-28 (3): Non-convex entropic mean-field optimization via Best Response flow

Title: Non-convex entropic mean-field optimization via Best Response flow

Nicht konvexe entropische Mittelfeld-Optimierung über Best Response Flow

通过最佳反应流程优化非convex 电子中位平均场 2505.22760v1

Authors: Razvan-Andrei Lascu, Mateusz B. Majka

We study the problem of minimizing non-convex functionals on the space of probability measures, regularized by the relative entropy (KL divergence) with respect to a fixed reference measure, as well as the corresponding problem of solving entropy-regularized non-convex-non-concave min-max problems. We utilize the Best Response flow (also known in the literature as the fictitious play flow) and study how its convergence is influenced by the relation between the degree of non-convexity of the functional under consideration, the regularization parameter and the tail behaviour of the reference measure. In particular, we demonstrate how to choose the regularizer, given the non-convex functional, so that the Best Response operator becomes a contraction with respect to the $L^1$-Wasserstein distance, which then ensures the existence of its unique fixed point, which is then shown to be the unique global minimizer for our optimization problem. This extends recent results where the Best Response flow was applied to solve convex optimization problems regularized by the relative entropy with respect to arbitrary reference measures, and with arbitrary values of the regularization parameter. Our results explain precisely how the assumption of convexity can be relaxed, at the expense of making a specific choice of the regularizer. Additionally, we demonstrate how these results can be applied in reinforcement learning in the context of policy optimization for Markov Decision Processes and Markov games with softmax parametrized policies in the mean-field regime.
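
The objects named in the abstract above can be written out as follows. The notation is an assumption made for illustration ($F$ for the functional, $\pi$ for the reference measure, $\sigma$ for the regularization parameter, $\Phi$ for the Best Response operator, $\alpha$ for a rate constant) and may differ from the paper's own symbols.

```latex
% Entropy-regularized problem (assumed notation).
\min_{\mu \in \mathcal{P}(\mathbb{R}^d)} \; F(\mu) + \sigma \, \mathrm{KL}(\mu \,\|\, \pi)

% Best Response operator: a Gibbs measure built from the flat derivative of F at \mu.
\Phi(\mu)(\mathrm{d}x) \;\propto\;
  \exp\!\Big(-\tfrac{1}{\sigma}\,\frac{\delta F}{\delta \mu}(\mu, x)\Big)\,\pi(\mathrm{d}x)

% Best Response (fictitious play) flow: relax \mu_t toward its best response.
\frac{\mathrm{d}\mu_t}{\mathrm{d}t} = \alpha\,\big(\Phi(\mu_t) - \mu_t\big)

% Contraction step: if W_1(\Phi(\mu), \Phi(\nu)) \le L \, W_1(\mu, \nu) with L < 1,
% Banach's fixed-point theorem yields a unique \mu^* with \Phi(\mu^*) = \mu^*.
```

Stationary points of the flow are exactly the fixed points of $\Phi$, which is why a contraction property of the operator translates into uniqueness of the candidate minimizer.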

Article 434

Title@2025-05-28 (3): FlashFormer: Whole-Model Kernels for Efficient Low-Batch Inference

Title: FlashFormer: Whole-Model Kernels for Efficient Low-Batch Inference

FlashFormer: Ganzmodell-Kernel für effiziente Low-Batch-Inferenz

FlashFormer: 用于高效低批量推断的全模块内核 2505.22758v1

Authors: Aniruddha Nrusimha, William Brandon, Mayank Mishra, Yikang Shen, Rameswar Panda, Jonathan Ragan-Kelley, Yoon Kim

The size and compute characteristics of modern large language models have led to an increased interest in developing specialized kernels tailored for training and inference. Existing kernels primarily optimize for compute utilization, targeting the large-batch training and inference settings. However, low-batch inference, where memory bandwidth and kernel launch overheads are significant factors, remains important for many applications of interest, such as edge deployment and latency-sensitive applications. This paper describes FlashFormer, a proof-of-concept kernel for accelerating single-batch inference for transformer-based large language models. Across various model sizes and quantization settings, we observe nontrivial speedups compared to existing state-of-the-art inference kernels.

Article 435

Title@2025-05-28 (3): Decomposing Elements of Problem Solving: What “Math” Does RL Teach?

Title: Decomposing Elements of Problem Solving: What “Math” Does RL Teach?

Zersetzende Elemente der Problemlösung: Was “Math” lehrt RL?

问题解决的分解要素:RL教什么“马思”? 2505.22756v1

Authors: Tian Qin, Core Francisco Park, Mujin Kwun, Aaron Walsman, Eran Malach, Nikhil Anand, Hidenori Tanaka, David Alvarez-Melis

Mathematical reasoning tasks have become prominent benchmarks for assessing the reasoning capabilities of LLMs, especially with reinforcement learning (RL) methods such as GRPO showing significant performance gains. However, accuracy metrics alone do not support fine-grained assessment of capabilities and fail to reveal which problem-solving skills have been internalized. To better understand these capabilities, we propose to decompose problem solving into fundamental capabilities: Plan (mapping questions to sequences of steps), Execute (correctly performing solution steps), and Verify (identifying the correctness of a solution). Empirically, we find that GRPO mainly enhances the execution skill, improving execution robustness on problems the model already knows how to solve, a phenomenon we call temperature distillation. More importantly, we show that RL-trained models struggle with fundamentally new problems, hitting a ‘coverage wall’ due to insufficient planning skills. To explore RL’s impact more deeply, we construct a minimal, synthetic solution-tree navigation task as an analogy for mathematical problem-solving. This controlled setup replicates our empirical findings, confirming RL primarily boosts execution robustness. Importantly, in this setting, we identify conditions under which RL can potentially overcome the coverage wall through improved exploration and generalization to new solution paths. Our findings provide insights into the role of RL in enhancing LLM reasoning, expose key limitations, and suggest a path toward overcoming these barriers. Code is available at https://github.com/cfpark00/RL-Wall.

Article 436

Title@2025-05-28 (3): Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling

Title: Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling

Darstellungsdynamiken von Diffusionsmodellen durch Low-Dimensional Modeling verstehen

通过低多样性建模理解通过低多样性建模传播模型的动态 2502.05743v2

Authors: Xiao Li, Zekai Zhang, Xiang Li, Siyi Chen, Zhihui Zhu, Peng Wang, Qing Qu

Diffusion models, though originally designed for generative tasks, have demonstrated impressive self-supervised representation learning capabilities. A particularly intriguing phenomenon in these models is the emergence of unimodal representation dynamics, where the quality of learned features peaks at an intermediate noise level. In this work, we conduct a comprehensive theoretical and empirical investigation of this phenomenon. Leveraging the inherent low-dimensionality structure of image data, we theoretically demonstrate that the unimodal dynamic emerges when the diffusion model successfully captures the underlying data distribution. The unimodality arises from an interplay between denoising strength and class confidence across noise scales. Empirically, we further show that, in classification tasks, the presence of unimodal dynamics reliably indicates generalization: it emerges when the model generalizes and gradually transitions to a monotonically decreasing curve as the model begins to memorize the training data.

Article 437

Title@2025-05-28 (3): VideoRAG: Retrieval-Augmented Generation over Video Corpus

Title: VideoRAG: Retrieval-Augmented Generation over Video Corpus

VideoRAG: Retrieval-Augmented Generation über Video Corpus

VideoRAG: 利用视频公司回收的原始一代 2501.05874v3

Authors: Soyeong Jeong, Kangsan Kim, Jinheon Baek, Sung Ju Hwang

Retrieval-Augmented Generation (RAG) is a powerful strategy for improving the factual accuracy of models by retrieving external knowledge relevant to queries and incorporating it into the generation process. However, existing approaches primarily focus on text, with some recent advancements considering images, and they largely overlook videos, a rich source of multimodal knowledge capable of representing contextual details more effectively than any other modality. While very recent studies explore the use of videos in response generation, they either predefine query-associated videos without retrieval or convert videos into textual descriptions, losing multimodal richness. To tackle these limitations, we introduce VideoRAG, a framework that not only dynamically retrieves videos based on their relevance to queries but also utilizes both visual and textual information. The operation of VideoRAG is powered by recent Large Video Language Models (LVLMs), which enable the direct processing of video content to represent it for retrieval and the seamless integration of retrieved videos jointly with queries for response generation. Also, because the context size of LVLMs may not be sufficient to process all frames in extremely long videos and not all frames are equally important, we introduce a video frame selection mechanism to extract the most informative subset of frames, along with a strategy to extract textual information from videos (as it can aid the understanding of video content) when their subtitles are not available. We experimentally validate the effectiveness of VideoRAG, showcasing that it is superior to relevant baselines. Code is available at https://github.com/starsuzi/VideoRAG.

Article 438

Title@2025-05-28 (3): Self-orthogonalizing attractor neural networks emerging from the free energy principle

Title: Self-orthogonalizing attractor neural networks emerging from the free energy principle

Selbst-orthogonalisierendes Attraktor-Neuralnetzwerk, das aus dem Prinzip der freien Energie entspringt

根据自由能源原则建立的自我调整的吸引人神经网络 2505.22749v1

Authors: Tamas Spisak, Karl Friston

Attractor dynamics are a hallmark of many complex systems, including the brain. Understanding how such self-organizing dynamics emerge from first principles is crucial for advancing our understanding of neuronal computations and the design of artificial intelligence systems. Here we formalize how attractor networks emerge from the free energy principle applied to a universal partitioning of random dynamical systems. Our approach obviates the need for explicitly imposed learning and inference rules and identifies emergent, but efficient and biologically plausible inference and learning dynamics for such self-organizing systems. These result in a collective, multi-level Bayesian active inference process. Attractors on the free energy landscape encode prior beliefs; inference integrates sensory data into posterior beliefs; and learning fine-tunes couplings to minimize long-term surprise. Analytically and via simulations, we establish that the proposed networks favor approximately orthogonalized attractor representations, a consequence of simultaneously optimizing predictive accuracy and model complexity. These attractors efficiently span the input subspace, enhancing generalization and the mutual information between hidden causes and observable effects. Furthermore, while random data presentation leads to symmetric and sparse couplings, sequential data fosters asymmetric couplings and non-equilibrium steady-state dynamics, offering a natural extension to conventional Boltzmann Machines. Our findings offer a unifying theory of self-organizing attractor networks, providing novel insights for AI and neuroscience.

Article 439

Title@2025-05-28 (3): An unsupervised method for MRI recovery: Deep image prior with structured sparsity

Title: An unsupervised method for MRI recovery: Deep image prior with structured sparsity

Eine unüberwachte Methode für die MRT-Wiederherstellung: Tiefenbild vor mit strukturierter Sparsamkeit

MRI 恢复的一种不受监督的方法: 结构宽度之前的深图像 2501.01482v3

Authors: Muhammad Ahmad Sultan, Chong Chen, Yingmin Liu, Katarzyna Gil, Karolina Zareba, Rizwan Ahmad

Objective: To propose and validate an unsupervised MRI reconstruction method that does not require fully sampled k-space data. Materials and Methods: The proposed method, deep image prior with structured sparsity (DISCUS), extends the deep image prior (DIP) by introducing group sparsity to frame-specific code vectors, enabling the discovery of a low-dimensional manifold for capturing temporal variations. DISCUS was validated using four studies: (I) simulation of a dynamic Shepp-Logan phantom to demonstrate its manifold discovery capabilities, (II) comparison with compressed sensing and DIP-based methods using simulated single-shot late gadolinium enhancement (LGE) image series from six distinct digital cardiac phantoms in terms of normalized mean square error (NMSE) and structural similarity index measure (SSIM), (III) evaluation on retrospectively undersampled single-shot LGE data from eight patients, and (IV) evaluation on prospectively undersampled single-shot LGE data from eight patients, assessed via blind scoring from two expert readers. Results: DISCUS outperformed competing methods, demonstrating superior reconstruction quality in terms of NMSE and SSIM (Studies I–III) and expert reader scoring (Study IV). Discussion: An unsupervised image reconstruction method is presented and validated on simulated and measured data. These developments can benefit applications where acquiring fully sampled data is challenging.
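
For readers unfamiliar with the NMSE metric mentioned above, the following sketch shows one common definition: the squared error between reconstruction and reference, normalized by the energy of the reference. The abstract does not spell out the exact normalization used in the paper, so this form and the `nmse` helper are assumptions.

```python
def nmse(recon, ref):
    """Normalized mean square error: ||recon - ref||^2 / ||ref||^2.

    Inputs are flat sequences of pixel values (real or complex);
    abs() makes the squared magnitude valid for complex MRI data too.
    """
    err = sum(abs(r - x) ** 2 for r, x in zip(recon, ref))
    energy = sum(abs(x) ** 2 for x in ref)
    return err / energy

# Toy example with hypothetical pixel values.
reference = [1.0, 2.0, 2.0]
reconstruction = [1.0, 2.0, 1.0]
score = nmse(reconstruction, reference)
print(score)
```

Lower values indicate a reconstruction closer to the reference; a perfect reconstruction gives 0.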

Article 440

Title@2025-05-28 (3): StarBASE-GP: Biologically-Guided Automated Machine Learning for Genotype-to-Phenotype Association Analysis

Title: StarBASE-GP: Biologically-Guided Automated Machine Learning for Genotype-to-Phenotype Association Analysis

StarBASE-GP: Biologisch geführtes automatisiertes maschinelles Lernen für die Analyse von Genotyp-zu-Phenotyp-Verbindungen

StarBASE-GP: 基因型至极型协会分析的生物辅助自动计算机学习 2505.22746v1

Authors: Jose Guadalupe Hernandez, Attri Ghosh, Philip J. Freda, Yufei Meng, Nicholas Matsumoto, Jason H. Moore

We present the Star-Based Automated Single-locus and Epistasis analysis tool - Genetic Programming (StarBASE-GP), an automated framework for discovering meaningful genetic variants associated with phenotypic variation in large-scale genomic datasets. StarBASE-GP uses a genetic programming-based multi-objective optimization strategy to evolve machine learning pipelines that simultaneously maximize explanatory power ($r^2$) and minimize pipeline complexity. Biological domain knowledge is integrated at multiple stages, including the use of nine inheritance encoding strategies to model deviations from additivity, a custom linkage disequilibrium pruning node that minimizes redundancy among features, and a dynamic variant recommendation system that prioritizes informative candidates for pipeline inclusion. We evaluate StarBASE-GP on a cohort of Rattus norvegicus (brown rat) to identify variants associated with body mass index, benchmarking its performance against a random baseline and a biologically naive version of the tool. StarBASE-GP consistently evolves Pareto fronts with superior performance, yielding higher accuracy in identifying both ground truth and novel quantitative trait loci, highlighting relevant targets for future validation. By incorporating evolutionary search and relevant biological theory into a flexible automated machine learning framework, StarBASE-GP demonstrates robust potential for advancing variant discovery in complex traits.
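
The two objectives named above (higher explanatory power, lower complexity) define a Pareto front over candidate pipelines. The sketch below shows how such a front can be extracted from a set of scored candidates; the candidate names, scores, and helper functions are hypothetical and do not reproduce the tool's own selection procedure.

```python
def dominates(a, b):
    """True if a is no worse than b on both objectives and strictly better on at least one."""
    no_worse = a["r2"] >= b["r2"] and a["complexity"] <= b["complexity"]
    strictly_better = a["r2"] > b["r2"] or a["complexity"] < b["complexity"]
    return no_worse and strictly_better

def pareto_front(candidates):
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]

# Hypothetical pipelines scored by explained variance and a complexity count.
candidates = [
    {"name": "A", "r2": 0.30, "complexity": 3},
    {"name": "B", "r2": 0.25, "complexity": 5},   # dominated by A
    {"name": "C", "r2": 0.40, "complexity": 8},
    {"name": "D", "r2": 0.10, "complexity": 1},
    {"name": "E", "r2": 0.40, "complexity": 10},  # dominated by C
]

front_names = [c["name"] for c in pareto_front(candidates)]
print(front_names)
```

Each pipeline on the front represents a different trade-off; none can be improved on one objective without getting worse on the other.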

Article 441

Title@2025-05-28 (3): Information-Computation Gaps in Quantum Learning via Low-Degree Likelihood

Title: Information-Computation Gaps in Quantum Learning via Low-Degree Likelihood

Informations-Computation Lücken im Quanten-Lernen über Low-Degree Likelihood

通过低贫困风险学习的量子学习中的信息估计差距 2505.22743v1

Authors: Sitan Chen, Weiyuan Gong, Jonas Haferkamp, Yihui Quek

In a variety of physically relevant settings for learning from quantum data, designing protocols that can computationally efficiently extract information remains largely an art, and there are important cases where we believe this to be impossible, that is, where there is an information-computation gap. While there is a large array of tools in the classical literature for giving evidence for average-case hardness of statistical inference problems, the corresponding tools in the quantum literature are far more limited. One such framework in the classical literature, the low-degree method, makes predictions about hardness of inference problems based on the failure of estimators given by low-degree polynomials. In this work, we extend this framework to the quantum setting. We establish a general connection between state designs and low-degree hardness. We use this to obtain the first information-computation gaps for learning Gibbs states of random, sparse, non-local Hamiltonians. We also use it to prove hardness for learning random shallow quantum circuit states in a challenging model where states can be measured in adaptively chosen bases. To our knowledge, the ability to model adaptivity within the low-degree framework was open even in classical settings. In addition, we also obtain a low-degree hardness result for quantum error mitigation against strategies with single-qubit measurements. We define a new quantum generalization of the planted biclique problem and identify the threshold at which this problem becomes computationally hard for protocols that perform local measurements. Interestingly, the complexity landscape for this problem shifts when going from local measurements to more entangled single-copy measurements. We show average-case hardness for the “standard” variant of Learning Stabilizers with Noise and for agnostically learning product states.

Article 442

Title@2025-05-28 (3): Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing

Title: Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing

Darstellung Shattering in Transformers: Synthetische Studie mit Wissensbearbeitung

在变形器中代表变形器:带有知识编辑的合成研究 2410.17194v4

Authors: Kento Nishi, Maya Okawa, Rahul Ramesh, Mikail Khona, Hidenori Tanaka, Ekdeep Singh Lubana

Knowledge Editing (KE) algorithms alter models’ weights to perform targeted updates to incorrect, outdated, or otherwise unwanted factual associations. However, recent work has shown that applying KE can adversely affect models’ broader factual recall accuracy and diminish their reasoning abilities. Although these studies give insights into the potential harms of KE algorithms, e.g., performance evaluations on benchmarks, little is understood about why such destructive failures occur. Motivated by this, we define a novel synthetic task in which a Transformer is trained from scratch to internalize a “structured” knowledge graph. The structure enforces relationships between entities of the graph, such that editing a factual association has “trickling effects” on other entities (e.g., changing X’s parent from Y to Z affects who the parent of X’s siblings is). Through evaluations of edited models on this task, we show that KE inadvertently affects representations of entities beyond the targeted one, distorting relevant structures that allow a model to infer unseen knowledge about an entity. We call this phenomenon representation shattering and demonstrate that it degrades models’ factual recall and reasoning performance. We further corroborate our findings in naturalistic settings with pre-trained Llama and Mamba models. Overall, our work yields a precise mechanistic hypothesis to explain why KE has adverse effects on model abilities.

Article 443

Title@2025-05-28 (3): AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

Title: AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

AutoL2S: Auto-Lang-Short-Reasoning für effiziente große Sprachmodelle

自动L2S:高效大语言模式的自动长期短期理由 2505.22662v1

Authors: Feng Luo, Yu-Neng Chuang, Guanchu Wang, Hoang Anh Duy Le, Shaochen Zhong, Hongyi Liu, Jiayi Yuan, Yang Sui, Vladimir Braverman, Vipin Chaudhary, Xia Hu

Reasoning-capable large language models (LLMs) demonstrate strong performance on complex reasoning tasks but often suffer from overthinking, generating unnecessarily long chain-of-thought (CoT) reasoning paths for easy reasoning questions, thereby increasing inference cost and latency. Recent approaches attempt to address this challenge by manually deciding when to apply long or short reasoning. However, they lack the flexibility to adapt CoT length dynamically based on question complexity. In this paper, we propose Auto Long-Short Reasoning (AutoL2S), a model-agnostic framework that enables LLMs to dynamically compress their generated reasoning path based on the complexity of the reasoning question. AutoL2S enables a learned paradigm, in which LLMs themselves can decide when longer reasoning is necessary and when shorter reasoning suffices, by training on data annotated with our proposed method, which includes both long and short CoT paths and a special token. We then use this token to indicate when the model can skip generating lengthy CoT reasoning. The proposed annotation strategy can enhance the LLMs' ability to generate shorter CoT reasoning paths with improved quality after training. Extensive evaluation results show that AutoL2S reduces the length of reasoning generation by up to 57% without compromising performance, demonstrating the effectiveness of AutoL2S for scalable and efficient LLM reasoning.

Article 444

Title@2025-05-28 (3): 3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model

Title: 3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model

3DLLM-Mem: Langzeit-Raum-Temporal-Speicher für körpereigenes 3D-Großsprachmodell

3DLLM-Mem:3D大语言模型内嵌成的3D大语言长期空间-时间记忆 2505.22657v1

Authors: Wenbo Hu, Yining Hong, Yanjun Wang, Leison Gao, Zibu Wei, Xingcheng Yao, Nanyun Peng, Yonatan Bitton, Idan Szpektor, Kai-Wei Chang

Humans excel at performing complex tasks by leveraging long-term memory across temporal and spatial experiences. In contrast, current Large Language Models (LLMs) struggle to effectively plan and act in dynamic, multi-room 3D environments. We posit that part of this limitation is due to the lack of proper 3D spatial-temporal memory modeling in LLMs. To address this, we first introduce 3DMem-Bench, a comprehensive benchmark comprising over 26,000 trajectories and 2,892 embodied tasks, question-answering and captioning, designed to evaluate an agent’s ability to reason over long-term memory in 3D environments. Second, we propose 3DLLM-Mem, a novel dynamic memory management and fusion model for embodied spatial-temporal reasoning and actions in LLMs. Our model uses working memory tokens, which represent current observations, as queries to selectively attend to and fuse the most useful spatial and temporal features from episodic memory, which stores past observations and interactions. Our approach allows the agent to focus on task-relevant information while maintaining memory efficiency in complex, long-horizon environments. Experimental results demonstrate that 3DLLM-Mem achieves state-of-the-art performance across various tasks, outperforming the strongest baselines by 16.5% in success rate on 3DMem-Bench’s most challenging in-the-wild embodied tasks.
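
The idea of using a working-memory token as a query over episodic memory can be sketched with plain scaled dot-product attention. This is a generic illustration under simplifying assumptions (a single query, memory entries that serve as both keys and values, no learned projections); it is not the model's actual fusion module, and all vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, memory):
    """Attend from one query vector over memory entries (keys double as values)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, entry)) / scale for entry in memory]
    weights = softmax(scores)
    fused = [
        sum(w * entry[i] for w, entry in zip(weights, memory))
        for i in range(len(query))
    ]
    return weights, fused

# Hypothetical working-memory token (current observation) and episodic memory entries.
working_token = [1.0, 0.0]
episodic_memory = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]

weights, fused = attend(working_token, episodic_memory)
print(weights, fused)
```

The memory entry most aligned with the current observation receives the largest weight, so the fused feature is dominated by task-relevant past observations.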

Article 445

Title@2025-05-28 (3): Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents

Title: Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents

Position: Ungewissheitsquantifizierung braucht eine Neubewertung für großsprachige Modellagenten

位置:大语言示范物剂的不确定性量化需求评估 2505.22655v1

Authors: Michael Kirchhof, Gjergji Kasneci, Enkelejda Kasneci

Large-language models (LLMs) and chatbot agents are known to provide wrong outputs at times, and it was recently found that this can never be fully prevented. Hence, uncertainty quantification plays a crucial role, aiming to quantify the level of ambiguity in either one overall number or two numbers for aleatoric and epistemic uncertainty. This position paper argues that this traditional dichotomy of uncertainties is too limited for the open and interactive setup that LLM agents operate in when communicating with a user, and that we need to research avenues that enrich uncertainties in this novel scenario. We review the literature and find that popular definitions of aleatoric and epistemic uncertainties directly contradict each other and lose their meaning in interactive LLM agent settings. Hence, we propose three novel research directions that focus on uncertainties in such human-computer interactions: underspecification uncertainties, for when users do not provide all information or define the exact task at the first go; interactive learning, to ask follow-up questions and reduce the uncertainty about the current context; and output uncertainties, to utilize the rich language and speech space to express uncertainties as more than mere numbers. We expect that these new ways of dealing with and communicating uncertainties will lead to LLM agent interactions that are more transparent, trustworthy, and intuitive.

Article 446

Title@2025-05-28 (3): Sherlock: Self-Correcting Reasoning in Vision-Language Models

Title: Sherlock: Self-Correcting Reasoning in Vision-Language Models

Sherlock: Selbstkorrekte Vernunft in Vision-Sprachen-Modellen

夏洛克:视觉语言模型中的自我校正理由 2505.22651v1

Authors: Yi Ding, Ruqi Zhang

Reasoning Vision-Language Models (VLMs) have shown promising performance on complex multimodal tasks. However, they still face significant challenges: they are highly sensitive to reasoning errors, require large volumes of annotated data or accurate verifiers, and struggle to generalize beyond specific domains. To address these limitations, we explore self-correction as a strategy to enhance reasoning VLMs. We first conduct an in-depth analysis of reasoning VLMs’ self-correction abilities and identify key gaps. Based on our findings, we introduce Sherlock, a self-correction and self-improvement training framework. Sherlock introduces a trajectory-level self-correction objective, a preference data construction method based on visual perturbation, and a dynamic $\beta$ for preference tuning. Once the model acquires self-correction capabilities using only 20k randomly sampled annotated data, it continues to self-improve without external supervision. Built on the Llama3.2-Vision-11B model, Sherlock achieves remarkable results across eight benchmarks, reaching an average accuracy of 64.1 with direct generation and 65.4 after self-correction. It outperforms LLaVA-CoT (63.2), Mulberry (63.9), and LlamaV-o1 (63.4) while using less than 20% of the annotated data.

Article 447

Title@2025-05-28 (3): On Learning Verifiers for Chain-of-Thought Reasoning

Title: On Learning Verifiers for Chain-of-Thought Reasoning

Über das Lernen von Prüfern für die Ketten-of-Thought-Reasoning

关于研究链理由的学习验证符 2505.22650v1

Authors: Maria-Florina Balcan, Avrim Blum, Zhiyuan Li, Dravyansh Sharma

Chain-of-Thought reasoning has emerged as a powerful approach for solving complex mathematical and logical problems. However, it can often veer off track through incorrect or unsubstantiated inferences. Formal mathematical reasoning, which can be checked with a formal verifier, is one approach to addressing this issue. However, currently LLMs are simply not good enough to solve complex problems in a formal way, and even just formalizing an informal problem statement can be challenging. Motivated by this fact, in this work we consider the problem of learning reliable verifiers for natural language Chain-of-Thought reasoning. That is, given a problem statement and step-by-step solution in natural language, the aim of the verifier is to output [Yes] if the reasoning steps in the solution are all valid, and [No] otherwise. In this work we give a formal PAC-learning framework for studying this problem. We propose and analyze several natural verification goals, at different levels of strength, in this framework. We provide sample complexity upper-bounds for learning verifiers satisfying these goals, as well as lower-bound and impossibility results for learning other natural verification objectives without additional assumptions.

Article 448

Title@2025-05-28 (3): Private Rate-Constrained Optimization with Applications to Fair Learning

Title: Private Rate-Constrained Optimization with Applications to Fair Learning

Private Rate-Constrained Optimization mit Anwendungen für faires Lernen

利用公平学习申请实现优化 2505.22703v1

Authors: Mohammad Yaghini, Tudor Cebere, Michael Menart, Aurélien Bellet, Nicolas Papernot

Many problems in trustworthy ML can be formulated as minimization of the model error under constraints on the prediction rates of the model for suitably-chosen marginals, including most group fairness constraints (demographic parity, equality of odds, etc.). In this work, we study such constrained minimization problems under differential privacy (DP). Standard DP optimization techniques like DP-SGD rely on the loss function’s decomposability into per-sample contributions. However, rate constraints introduce inter-sample dependencies, violating the decomposability requirement. To address this, we develop RaCO-DP, a DP variant of the Stochastic Gradient Descent-Ascent (SGDA) algorithm which solves the Lagrangian formulation of rate constraint problems. We demonstrate that the additional privacy cost of incorporating these constraints reduces to privately estimating a histogram over the mini-batch at each optimization step. We prove the convergence of our algorithm through a novel analysis of SGDA that leverages the linear structure of the dual parameter. Finally, empirical results on learning under group fairness constraints demonstrate that our method Pareto-dominates existing private learning approaches in fairness-utility trade-offs.
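
The abstract reduces the extra privacy cost to privately estimating a histogram over each mini-batch. As a rough illustration of that building block only, the sketch below releases a noisy histogram with the standard Laplace mechanism, assuming add/remove-one-sample adjacency (so each count changes by at most 1). The choice of Laplace noise, the bin layout of (group, predicted label) pairs, and the function names are assumptions; this is not the RaCO-DP algorithm, and it ignores subsampling and composition across steps.

```python
import random
from collections import Counter

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_histogram(batch, bins, epsilon, rng):
    """Release per-bin counts with Laplace noise of scale 1/epsilon.

    Adding or removing one sample changes exactly one count by 1,
    so the L1 sensitivity of the histogram is 1.
    """
    counts = Counter(batch)
    scale = 1.0 / epsilon
    return {b: counts.get(b, 0) + laplace_noise(scale, rng) for b in bins}

# Hypothetical mini-batch of (group, predicted label) pairs.
batch = [("A", 1), ("A", 0), ("B", 1), ("A", 1)]
bins = [("A", 0), ("A", 1), ("B", 0), ("B", 1)]

noisy_counts = private_histogram(batch, bins, epsilon=1.0, rng=random.Random(0))
print(noisy_counts)
```

Rate constraints such as demographic parity can then be evaluated from the noisy counts instead of the raw per-sample predictions.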

Article 449

Title@2025-05-28 (3): Spectral Survival Analysis

Title: Spectral Survival Analysis

Spektrale Überlebensanalyse

光谱生存分析 2505.22641v1

Authors: Chengzhi Shi, Stratis Ioannidis

Survival analysis is widely deployed in a diverse set of fields, including healthcare, business, ecology, etc. The Cox Proportional Hazard (CoxPH) model is a semi-parametric model often encountered in the literature. Despite its popularity, wide deployment, and numerous variants, scaling CoxPH to large datasets and deep architectures poses a challenge, especially in the high-dimensional regime. We identify a fundamental connection between rank regression and the CoxPH model: this allows us to adapt and extend the so-called spectral method for rank regression to survival analysis. Our approach is versatile, naturally generalizing to several CoxPH variants, including deep models. We empirically verify our method’s scalability on multiple real-world high-dimensional datasets; our method outperforms legacy methods w.r.t. predictive performance and efficiency.
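
For context, the standard CoxPH objective that the abstract refers to is the negative log partial likelihood, sketched below for untied event times. This shows the conventional baseline formulation only, not the spectral method proposed in the paper; the function name and toy data are hypothetical.

```python
import math

def cox_neg_log_partial_likelihood(times, events, risk_scores):
    """Negative log partial likelihood of the Cox model (assumes no tied event times).

    times:       observed time per subject
    events:      1 if the event was observed, 0 if censored
    risk_scores: model output per subject (log relative hazard)
    """
    nll = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored subjects only appear in risk sets
        # Risk set: everyone still under observation at time t_i.
        denom = sum(math.exp(risk_scores[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        nll -= risk_scores[i] - math.log(denom)
    return nll

# Toy data: three subjects, the last one censored.
times = [1.0, 2.0, 3.0]
events = [1, 1, 0]
scores = [0.0, 0.0, 0.0]

loss = cox_neg_log_partial_likelihood(times, events, scores)
print(loss)
```

Each observed event requires a sum over its risk set, which hints at why scaling this objective to large datasets and deep models is costly.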

Article 450

Title@2025-05-28 (3): SimProcess: High Fidelity Simulation of Noisy ICS Physical Processes

Title: SimProcess: High Fidelity Simulation of Noisy ICS Physical Processes

SimProcess: Hohe Fidelity-Simulation von lärmigen ICS-Physischen Prozessen

中间过程:高菲力模拟有噪音的ICS物理过程 2505.22638v1

Authors: Denis Donadel, Gabriele Crestanello, Giulio Morandini, Daniele Antonioli, Mauro Conti, Massimo Merro

Industrial Control Systems (ICS) manage critical infrastructures like power grids and water treatment plants. Cyberattacks on ICSs can disrupt operations, causing severe economic, environmental, and safety issues. For example, undetected pollution in a water plant can put the lives of thousands at stake. ICS researchers have increasingly turned to honeypots – decoy systems designed to attract attackers, study their behaviors, and eventually improve defensive mechanisms. However, existing ICS honeypots struggle to replicate the ICS physical process, making them susceptible to detection. Accurately simulating the noise in ICS physical processes is challenging because different factors produce it, including sensor imperfections and external interferences. In this paper, we propose SimProcess, a novel framework to rank the fidelity of ICS simulations by evaluating how closely they resemble real-world and noisy physical processes. It measures the simulation distance from a target system by estimating the noise distribution with machine learning models like Random Forest. Unlike existing solutions that require detailed mathematical models or are limited to simple systems, SimProcess operates with only a timeseries of measurements from the real system, making it applicable to a broader range of complex dynamic systems. We demonstrate the framework’s effectiveness through a case study using real-world power grid data from the EPIC testbed. We compare the performance of various simulation methods, including static and generative noise techniques. Our model correctly classifies real samples with a recall of up to 1.0. It also identifies Gaussian and Gaussian Mixture as the best distribution to simulate our power systems, together with a generative solution provided by an autoencoder, thereby helping developers to improve honeypot fidelity. Additionally, we make our code publicly available.

Article 451

Title@2025-05-28 (3): Understanding (Un)Reliability of Steering Vectors in Language Models

Title: Understanding (Un)Reliability of Steering Vectors in Language Models

Verständnis (Un)Zuverlässigkeit von Steuerungsvektoren in Sprachmodellen

(un) 语言模式指导矢量的可靠性 2505.22637v1

Authors: Joschka Braun, Carsten Eickhoff, David Krueger, Seyed Ali Bahrainian, Dmitrii Krasheninnikov

Steering vectors are a lightweight method to control language model behavior by adding a learned bias to the activations at inference time. Although steering demonstrates promising performance, recent work shows that it can be unreliable or even counterproductive in some cases. This paper studies the influence of prompt types and the geometry of activation differences on steering reliability. First, we find that all seven prompt types used in our experiments produce a net positive steering effect, but exhibit high variance across samples, and often give an effect opposite of the desired one. No prompt type clearly outperforms the others, and yet the steering vectors resulting from the different prompt types often differ directionally (as measured by cosine similarity). Second, we show that higher cosine similarity between training set activation differences predicts more effective steering. Finally, we observe that datasets where positive and negative activations are better separated are more steerable. Our results suggest that vector steering is unreliable when the target behavior is not represented by a coherent direction.
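
The quantities discussed above can be made concrete with a small sketch. It assumes the steering vector is the mean of the (positive minus negative) activation differences, which is one common construction but is not confirmed by the abstract, and it measures directional agreement as the mean pairwise cosine similarity of those differences. All activations are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical (positive, negative) activation pairs taken at one layer.
pairs = [
    ([2.0, 1.0], [1.0, 1.0]),
    ([3.0, 0.0], [1.0, 0.0]),
    ([1.0, 2.0], [0.0, 1.0]),
]
diffs = [[p - n for p, n in zip(pos, neg)] for pos, neg in pairs]

# Mean-difference steering vector (an assumed construction).
steering = [sum(d[i] for d in diffs) / len(diffs) for i in range(len(diffs[0]))]

def steer(hidden, vector, alpha=1.0):
    """Add the scaled steering vector to a hidden activation at inference time."""
    return [h + alpha * s for h, s in zip(hidden, vector)]

# Directional agreement among the training-set activation differences.
diff_pairs = list(combinations(diffs, 2))
mean_pairwise_cos = sum(cosine(a, b) for a, b in diff_pairs) / len(diff_pairs)

print(steering, mean_pairwise_cos)
```

When the differences point in very different directions, their mean is a poor summary of the target behavior, which matches the abstract's observation that steering is unreliable without a coherent direction.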

Article 452

Title@2025-05-28 (3): Spatial Knowledge Graph-Guided Multimodal Synthesis

Title: Spatial Knowledge Graph-Guided Multimodal Synthesis

Raumwissen Graph-geführte multimodale Synthese

空间知识图表辅助多模式合成 2505.22633v1

Authors: Yida Xue, Zhen Bi, Jinnan Yang, Jungang Lou, Huajun Chen, Ningyu Zhang

Recent advances in multimodal large language models (MLLMs) have significantly enhanced their capabilities; however, their spatial perception abilities remain a notable limitation. To address this challenge, multimodal data synthesis offers a promising solution. Yet, ensuring that synthesized data adhere to spatial common sense is a non-trivial task. In this work, we introduce SKG2Data, a novel multimodal synthesis approach guided by spatial knowledge graphs, grounded in the concept of knowledge-to-data generation. SKG2Data automatically constructs a Spatial Knowledge Graph (SKG) to emulate human-like perception of spatial directions and distances, which is subsequently utilized to guide multimodal data synthesis. Extensive experiments demonstrate that data synthesized from diverse types of spatial knowledge, including direction and distance, not only enhance the spatial perception and reasoning abilities of MLLMs but also exhibit strong generalization capabilities. We hope that the idea of knowledge-based data synthesis can advance the development of spatial intelligence.

Article 453

Title@2025-05-28 (3): GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks

Title: GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks

GraphOmni: Ein umfassender und erweiterbarer Benchmark-Rahmen für große Sprachmodelle zu graphtheoretischen Aufgaben

图图Omni:图理学任务大语言模型综合和可扩展基准框架 2504.12764v3

Authors: Hao Xu, Xiangru Jian, Xinjian Zhao, Wei Pang, Chao Zhang, Suyuchen Wang, Qixin Zhang, Zhengyuan Dong, Joao Monteiro, Bang Liu, Qiuzhuang Sun, Tianshu Yu

This paper introduces GraphOmni, a comprehensive benchmark designed to evaluate the reasoning capabilities of LLMs on graph-theoretic tasks articulated in natural language. GraphOmni encompasses diverse graph types, serialization formats, and prompting schemes, significantly exceeding prior efforts in both scope and depth. Through extensive systematic evaluation, we identify critical interactions among these dimensions, demonstrating their substantial impact on model performance. Our experiments reveal that state-of-the-art models like Claude-3.5 and o4-mini consistently outperform other models, yet even these leading models exhibit substantial room for improvement. Performance variability is evident depending on the specific combinations of factors we considered, underscoring the necessity of comprehensive evaluations across these interconnected dimensions. Additionally, we observe distinct impacts of serialization and prompting strategies between open-source and closed-source models, encouraging the development of tailored approaches. Motivated by the findings, we also propose a reinforcement learning-inspired framework that adaptively selects the optimal factors influencing LLM reasoning capabilities. This flexible and extendable benchmark not only deepens our understanding of LLM performance on structured tasks but also provides a robust foundation for advancing research in LLM-based graph reasoning. The code and datasets are available at https://github.com/GAI-Community/GraphOmni.
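
To illustrate what varying the serialization format means in practice, the sketch below renders the same undirected graph as an edge list and as an adjacency list before it would be placed in a prompt. The two formats and the helper names are illustrative assumptions and are not necessarily the ones used in the benchmark.

```python
def as_edge_list(edges):
    """One 'u v' line per undirected edge."""
    return "\n".join(f"{u} {v}" for u, v in edges)

def as_adjacency_list(edges):
    """One 'node: neighbors' line per node, with nodes and neighbors sorted."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    lines = []
    for node in sorted(neighbors):
        joined = " ".join(str(n) for n in sorted(neighbors[node]))
        lines.append(f"{node}: {joined}")
    return "\n".join(lines)

# Hypothetical 4-node graph.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]

edge_text = as_edge_list(edges)
adjacency_text = as_adjacency_list(edges)
print(edge_text)
print(adjacency_text)
```

Both strings describe the same graph, yet a model may answer a question such as "is node 3 connected to node 0?" differently depending on which one appears in the prompt.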

Article 454

Title@2025-05-28 (3): SCIZOR: A Self-Supervised Approach to Data Curation for Large-Scale Imitation Learning

Title: SCIZOR: A Self-Supervised Approach to Data Curation for Large-Scale Imitation Learning

SCIZOR: Ein selbstüberwachter Ansatz zur Datenkuration für großflächiges Imitationslernen

SCIZOR: 大规模模拟学习数据计算法的自我监督办法 2505.22626v1

Authors: Yu Zhang, Yuqi Xie, Huihan Liu, Rutav Shah, Michael Wan, Linxi Fan, Yuke Zhu

Imitation learning advances robot capabilities by enabling the acquisition of diverse behaviors from human demonstrations. However, large-scale datasets used for policy training often introduce substantial variability in quality, which can negatively impact performance. As a result, automatically curating datasets by filtering low-quality samples to improve quality becomes essential. Existing robotic curation approaches rely on costly manual annotations and perform curation at a coarse granularity, such as the dataset or trajectory level, failing to account for the quality of individual state-action pairs. To address this, we introduce SCIZOR, a self-supervised data curation framework that filters out low-quality state-action pairs to improve the performance of imitation learning policies. SCIZOR targets two complementary sources of low-quality data: suboptimal data, which hinders learning with undesirable actions, and redundant data, which dilutes training with repetitive patterns. SCIZOR leverages a self-supervised task progress predictor for suboptimal data to remove samples lacking task progression, and a deduplication module operating on joint state-action representation for samples with redundant patterns. Empirically, we show that SCIZOR enables imitation learning policies to achieve higher performance with less data, yielding an average improvement of 15.4% across multiple benchmarks. More information is available at: https://ut-austin-rpl.github.io/SCIZOR/

Article 455

Title@2025-05-28 (3): Principled Out-of-Distribution Generalization via Simplicity

Title: Principled Out-of-Distribution Generalization via Simplicity

Prinzipielle Nicht-Verteilung Verallgemeinerung über Einfachheit

通过简单化普遍化 2505.22622v1

Authors: Jiawei Ge, Amanda Wang, Shange Tang, Chi Jin

Modern foundation models exhibit remarkable out-of-distribution (OOD) generalization, solving tasks far beyond the support of their training data. However, the theoretical principles underpinning this phenomenon remain elusive. This paper investigates this problem by examining the compositional generalization abilities of diffusion models in image generation. Our analysis reveals that while neural network architectures are expressive enough to represent a wide range of models – including many with undesirable behavior on OOD inputs – the true, generalizable model that aligns with human expectations typically corresponds to the simplest among those consistent with the training data. Motivated by this observation, we develop a theoretical framework for OOD generalization via simplicity, quantified using a predefined simplicity metric. We analyze two key regimes: (1) the constant-gap setting, where the true model is strictly simpler than all spurious alternatives by a fixed gap, and (2) the vanishing-gap setting, where the fixed gap is replaced by a smoothness condition ensuring that models close in simplicity to the true model yield similar predictions. For both regimes, we study the regularized maximum likelihood estimator and establish the first sharp sample complexity guarantees for learning the true, generalizable, simple model.

Article 456

Title@2025-05-28 (3): The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Title: The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Der Entropie-Mechanismus des Verstärkten Lernens für sinnvolle Sprachmodelle

理由语言模式强化学习的全英机制 2505.22617v1

Authors: Ganqu Cui, Yuchen Zhang, Jiacheng Chen, Lifan Yuan, Zhi Wang, Yuxin Zuo, Haozhan Li, Yuchen Fan, Huayu Chen, Weize Chen, Zhiyuan Liu, Hao Peng, Lei Bai, Wanli Ouyang, Yu Cheng, Bowen Zhou, Ning Ding

This paper aims to overcome a major obstacle in scaling RL for reasoning with LLMs, namely the collapse of policy entropy. This phenomenon is consistently observed across a large number of RL runs without entropy intervention: policy entropy drops sharply in the early training stage, and the resulting loss of exploratory ability is always accompanied by saturation of policy performance. In practice, we establish a transformation equation $R = -a e^{H} + b$ between entropy $H$ and downstream performance $R$. This empirical law strongly indicates that policy performance is obtained by trading away policy entropy and is thus bottlenecked by its exhaustion, and that the ceiling is fully predictable: at $H = 0$, $R = -a + b$. Our finding necessitates entropy management for continuous exploration toward scaling compute for RL. To this end, we investigate entropy dynamics both theoretically and empirically. Our derivation highlights that the change in policy entropy is driven by the covariance between action probability and the change in logits, which is proportional to its advantage when using Policy Gradient-like algorithms. An empirical study shows that the values of the covariance term and the entropy differences match exactly, supporting the theoretical conclusion. Moreover, the covariance term stays mostly positive throughout training, further explaining why policy entropy decreases monotonically. Building on this understanding of the mechanism behind entropy dynamics, we propose to control entropy by restricting the update of high-covariance tokens. Specifically, we propose two simple yet effective techniques, Clip-Cov and KL-Cov, which respectively clip and apply a KL penalty to tokens with high covariance. Experiments show that these methods encourage exploration, helping the policy escape entropy collapse and achieve better downstream performance.
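
As a worked illustration of the empirical law relating entropy H and performance R (R = -a e^H + b), the following minimal sketch fits a and b by ordinary least squares on (H, R) pairs and reads off the predicted ceiling at H = 0. The function names and the synthetic data are hypothetical and chosen only for illustration; this is not claimed to be the authors' fitting procedure.

```python
import math


def fit_entropy_law(entropies, rewards):
    """Least-squares fit of R = -a * exp(H) + b; returns (a, b).

    The law is linear in x = exp(H), so a simple linear regression
    of R on x gives slope = -a and intercept = b.
    """
    xs = [math.exp(h) for h in entropies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = sum(rewards) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, rewards))
    slope = sxr / sxx
    intercept = mean_r - slope * mean_x
    return -slope, intercept


def predicted_ceiling(a, b):
    """Performance predicted when entropy is exhausted (H = 0)."""
    return -a + b


# Synthetic, noise-free data generated from assumed coefficients.
true_a, true_b = 0.2, 0.9
H = [1.5, 1.0, 0.6, 0.3, 0.1]
R = [-true_a * math.exp(h) + true_b for h in H]

a, b = fit_entropy_law(H, R)
ceiling = predicted_ceiling(a, b)
```

On real training curves the fit would be noisy, and the ceiling estimate is only as trustworthy as the law's fit over the observed entropy range.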

Article 457

Title@2025-05-28 (3): Bridging Supervised Learning and Reinforcement Learning in Math Reasoning

Title: Bridging Supervised Learning and Reinforcement Learning in Math Reasoning

Bridging Supervised Learning und Verstärkung Lernen in Mathe-Reasoning

在数学原因方面的受监督学习和强化学习架桥 2505.18116v2

Authors: Huayu Chen, Kaiwen Zheng, Qinsheng Zhang, Ganqu Cui, Yin Cui, Haotian Ye, Tsung-Yi Lin, Ming-Yu Liu, Jun Zhu, Haoxiang Wang

Reinforcement Learning (RL) has played a central role in the recent surge of LLMs’ math abilities by enabling self-improvement through binary verifier signals. In contrast, Supervised Learning (SL) is rarely considered for such verification-driven training, largely due to its heavy reliance on reference answers and inability to reflect on mistakes. In this work, we challenge the prevailing notion that self-improvement is exclusive to RL and propose Negative-aware Fine-Tuning (NFT) – a supervised approach that enables LLMs to reflect on their failures and improve autonomously with no external teachers. In online training, instead of throwing away self-generated negative answers, NFT constructs an implicit negative policy to model them. This implicit policy is parameterized with the same positive LLM we target to optimize on positive data, enabling direct policy optimization on all LLMs’ generations. We conduct experiments on 7B and 32B models in math reasoning tasks. Results consistently show that through the additional leverage of negative feedback, NFT significantly improves over SL baselines like Rejection sampling Fine-Tuning, matching or even surpassing leading RL algorithms like GRPO and DAPO. Furthermore, we demonstrate that NFT and GRPO are actually equivalent in strict-on-policy training, even though they originate from entirely different theoretical foundations. Our experiments and theoretical findings bridge the gap between SL and RL methods in binary-feedback learning systems.

Article 458

Title@2025-05-28 (3): Fully Heteroscedastic Count Regression with Deep Double Poisson Networks

Title: Fully Heteroscedastic Count Regression with Deep Double Poisson Networks

Voll heterogene Grafenregression mit tiefen Doppelpoisson-Netzwerken

带有深双 Poisson 网络的全导流计数回归 2406.09262v4

Authors: Spencer Young, Porter Jenkins, Longchao Da, Jeff Dotson, Hua Wei

Neural networks capable of accurate, input-conditional uncertainty representation are essential for real-world AI systems. Deep ensembles of Gaussian networks have proven highly effective for continuous regression due to their ability to flexibly represent aleatoric uncertainty via unrestricted heteroscedastic variance, which in turn enables accurate epistemic uncertainty estimation. However, no analogous approach exists for count regression, despite many important applications. To address this gap, we propose the Deep Double Poisson Network (DDPN), a novel neural discrete count regression model that outputs the parameters of the Double Poisson distribution, enabling arbitrarily high or low predictive aleatoric uncertainty for count data and improving epistemic uncertainty estimation when ensembled. We formalize and prove that DDPN exhibits robust regression properties similar to heteroscedastic Gaussian models via learnable loss attenuation, and introduce a simple loss modification to control this behavior. Experiments on diverse datasets demonstrate that DDPN outperforms current baselines in accuracy, calibration, and out-of-distribution detection, establishing a new state-of-the-art in deep count regression.
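
For orientation, the sketch below evaluates the unnormalized Double Poisson log-density in Efron's parameterization, with a mean parameter `mu` and an inverse-dispersion parameter `phi`. Approximately, the mean is `mu` and the variance is `mu / phi`, so `phi < 1` corresponds to over-dispersion and `phi > 1` to under-dispersion. The function names are hypothetical, the normalizing constant is omitted, and this is not claimed to be the exact loss used in DDPN.

```python
import math


def double_poisson_log_density(y, mu, phi):
    """Unnormalized log-density of the Double Poisson distribution.

    y   : non-negative integer count
    mu  : positive mean parameter
    phi : positive inverse-dispersion parameter
    """
    # Convention: y * log(y) = 0 when y = 0.
    y_log_y = y * math.log(y) if y > 0 else 0.0
    return (
        0.5 * math.log(phi)
        - phi * mu
        - y + y_log_y - math.lgamma(y + 1)
        + phi * (y * (1.0 + math.log(mu)) - y_log_y)
    )


def poisson_log_pmf(y, mu):
    """Standard Poisson log-pmf, used here as a sanity check."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)
```

With `phi = 1` the expression reduces to the ordinary Poisson log-pmf, which is a convenient check that the terms are assembled correctly.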

Article 459

Title@2025-05-28 (3): Shielded Diffusion: Generating Novel and Diverse Images using Sparse Repellency

Title: Shielded Diffusion: Generating Novel and Diverse Images using Sparse Repellency

Abgeschirmte Diffusion: Erzeugen von neuen und vielfältigen Bildern mit Sparse Repellency

盾牌扩散:利用微缩生成新奇和多样化图像 2410.06025v3

Authors: Michael Kirchhof, James Thornton, Louis Béthune, Pierre Ablin, Eugene Ndiaye, Marco Cuturi

The adoption of text-to-image diffusion models raises concerns over reliability, drawing scrutiny under the lens of various metrics like calibration, fairness, or compute efficiency. We focus in this work on two issues that arise when deploying these models: a lack of diversity when prompting images, and a tendency to recreate images from the training set. To solve both problems, we propose a method that coaxes the sampled trajectories of pretrained diffusion models to land on images that fall outside of a reference set. We achieve this by adding repellency terms to the diffusion SDE throughout the generation trajectory, which are triggered whenever the path is expected to land too closely to an image in the shielded reference set. Our method is sparse in the sense that these repellency terms are zero and inactive most of the time, and even more so towards the end of the generation trajectory. Our method, named SPELL for sparse repellency, can be used either with a static reference set that contains protected images, or dynamically, by updating the set at each timestep with the expected images concurrently generated within a batch, and with the images of previously generated batches. We show that adding SPELL to popular diffusion models improves their diversity while impacting their FID only marginally, and performs comparatively better than other recent training-free diversity methods. We also demonstrate how SPELL can ensure a shielded generation away from a very large set of protected images by considering all 1.2M images from ImageNet as the protected set.
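
The sparsity property described above can be sketched as follows. This is one plausible form of a repellency term, assumed for illustration and not necessarily the authors' exact formulation: the predicted final image `x0_hat` is pushed away from every reference point that lies inside a shield of radius `radius`, and the term is exactly zero when no reference point is that close. The function name, the radius parameter, and the toy 2-D vectors are all hypothetical.

```python
import math


def repellency(x0_hat, reference_set, radius):
    """Sum of pushes away from reference points closer than `radius`.

    Returns a zero vector when x0_hat is outside every shield,
    which is the common (sparse, inactive) case.
    """
    push = [0.0] * len(x0_hat)
    for ref in reference_set:
        diff = [a - b for a, b in zip(x0_hat, ref)]
        dist = math.sqrt(sum(d * d for d in diff))
        # Skip inactive shields; an exact overlap has no defined direction.
        if dist == 0.0 or dist >= radius:
            continue
        # Scale chosen so that a single active shield moves the point
        # exactly onto the shield boundary.
        scale = radius / dist - 1.0
        push = [p + scale * d for p, d in zip(push, diff)]
    return push


references = [[0.0, 0.0], [10.0, 10.0]]
inactive = repellency([5.0, 5.0], references, radius=1.0)
active = repellency([0.5, 0.0], references, radius=1.0)
```

In a sampler, a term like this would be added to the drift at each step using the current estimate of the final image; how it is scaled and scheduled is left open here.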

Article 460

Title@2025-05-28 (3): Solving Inverse Problems with Deep Linear Neural Networks: Global Convergence Guarantees for Gradient Descent with Weight Decay

Title: Solving Inverse Problems with Deep Linear Neural Networks: Global Convergence Guarantees for Gradient Descent with Weight Decay

Inverse Probleme mit tiefen linearen neuralen Netzwerken lösen: Globale Konvergenzgarantien für gradienten Abstieg mit Gewichtsverfall

解决深线神经神经网络的反面问题:全球一致保障渐变后裔与体重衰减 2502.15522v2

Authors: Hannah Laus, Suzanna Parkinson, Vasileios Charisopoulos, Felix Krahmer, Rebecca Willett

Machine learning methods are commonly used to solve inverse problems, wherein an unknown signal must be estimated from few measurements generated via a known acquisition procedure. In particular, neural networks perform well empirically but have limited theoretical guarantees. In this work, we study an underdetermined linear inverse problem that admits several possible solution mappings. A standard remedy (e.g., in compressed sensing) establishing uniqueness of the solution mapping is to assume knowledge of latent low-dimensional structure in the source signal. We ask the following question: do deep neural networks adapt to this low-dimensional structure when trained by gradient descent with weight decay regularization? We prove that mildly overparameterized deep linear networks trained in this manner converge to an approximate solution that accurately solves the inverse problem while implicitly encoding latent subspace structure. To our knowledge, this is the first result to rigorously show that deep linear networks trained with weight decay automatically adapt to latent subspace structure in the data under practical stepsize and weight initialization schemes. Our work highlights that regularization and overparameterization improve generalization, while overparameterization also accelerates convergence during training.

Article 461

Title@2025-05-28 (3): Chest Disease Detection In X-Ray Images Using Deep Learning Classification Method

Title: Chest Disease Detection In X-Ray Images Using Deep Learning Classification Method

Brusterkrankungen Detektion in Röntgenbildern mit Deep Learning-Klassifikationsmethode

利用深学习分类方法在X射线图像中检测胸前疾病 2505.22609v1

Authors: Alanna Hazlett, Naomi Ohashi, Timothy Rodriguez, Sodiq Adewole

In this work, we compare the performance of multiple classification models for classifying chest X-ray images into four categories: COVID-19, pneumonia, tuberculosis (TB), and normal cases. We leveraged transfer learning techniques with state-of-the-art pre-trained Convolutional Neural Network (CNN) models. We fine-tuned these pre-trained architectures on labeled medical X-ray images. The initial results are promising, with high accuracy and strong performance in key classification metrics such as precision, recall, and F1 score. We applied Gradient-weighted Class Activation Mapping (Grad-CAM) for model interpretability to provide visual explanations for classification decisions, improving trust and transparency in clinical applications.

Article 462

Title@2025-05-28 (3): AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling

Title: AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling

AutoElicit: Mit großen Sprachmodellen für vorausschauende Modellierung von Expertenvoraussagen

自动:在预测模拟中使用大语言模型,供专家使用 2411.17284v5

Authors: Alexander Capstick, Rahul G. Krishnan, Payam Barnaghi

Large language models (LLMs) acquire a breadth of information across various domains. However, their computational complexity, cost, and lack of transparency often hinder their direct application for predictive tasks where privacy and interpretability are paramount. In fields such as healthcare, biology, and finance, specialised and interpretable linear models still hold considerable value. In such domains, labelled data may be scarce or expensive to obtain. Well-specified prior distributions over model parameters can reduce the sample complexity of learning through Bayesian inference; however, eliciting expert priors can be time-consuming. We therefore introduce AutoElicit to extract knowledge from LLMs and construct priors for predictive models. We show these priors are informative and can be refined using natural language. We perform a careful study contrasting AutoElicit with in-context learning and demonstrate how to perform model selection between the two methods. We find that AutoElicit yields priors that can substantially reduce error over uninformative priors, using fewer labels, and consistently outperform in-context learning. We show that AutoElicit saves over 6 months of labelling effort when building a new predictive model for urinary tract infections from sensor recordings of people living with dementia.

Article 463

Title@2025-05-28 (3): One Rank at a Time: Cascading Error Dynamics in Sequential Learning

Title: One Rank at a Time: Cascading Error Dynamics in Sequential Learning

Ein Rang zu einer Zeit: Cascading Error Dynamics in Sequential Learning

一次一排: 序列学习中连带错误动态 2505.22602v1

Authors: Mahtab Alizadeh Vandchali, Fangshuo Liao, Anastasios Kyrillidis

Sequential learning – where complex tasks are broken down into simpler, hierarchical components – has emerged as a paradigm in AI. This paper views sequential learning through the lens of low-rank linear regression, focusing specifically on how errors propagate when learning rank-1 subspaces sequentially. We present an analysis framework that decomposes the learning process into a series of rank-1 estimation problems, where each subsequent estimation depends on the accuracy of previous steps. Our contribution is a characterization of the error propagation in this sequential process, establishing bounds on how errors – e.g., due to limited computational budgets and finite precision – affect the overall model accuracy. We prove that these errors compound in predictable ways, with implications for both algorithmic design and stability guarantees.

Article 464

Title@2025-05-28 (3): Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching

Title: Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching

Adjoint Sampling: Hoch skalierbare Diffusions-Probenehmer über Adjoint Matching

联合采样:通过联合配配制的高可缩放扩散采样器 2504.11713v3

Authors: Aaron Havens, Benjamin Kurt Miller, Bing Yan, Carles Domingo-Enrich, Anuroop Sriram, Brandon Wood, Daniel Levine, Bin Hu, Brandon Amos, Brian Karrer, Xiang Fu, Guan-Horng Liu, Ricky T. Q. Chen

We introduce Adjoint Sampling, a highly scalable and efficient algorithm for learning diffusion processes that sample from unnormalized densities, or energy functions. It is the first on-policy approach that allows significantly more gradient updates than the number of energy evaluations and model samples, allowing us to scale to much larger problem settings than previously explored by similar methods. Our framework is theoretically grounded in stochastic optimal control and shares the same theoretical guarantees as Adjoint Matching, being able to train without the need for corrective measures that push samples towards the target distribution. We show how to incorporate key symmetries, as well as periodic boundary conditions, for modeling molecules in both Cartesian and torsional coordinates. We demonstrate the effectiveness of our approach through extensive experiments on classical energy functions, and further scale up to neural network-based energy models where we perform amortized conformer generation across many molecular systems. To encourage further research in developing highly scalable sampling methods, we plan to open-source these challenging benchmarks, where successful methods can directly impact progress in computational chemistry.

Article 465

Title@2025-05-28 (3): Machine Unlearning under Overparameterization

Title: Machine Unlearning under Overparameterization

Maschine Unlearning unter Überparameterisierung

超参数化下脱学机 2505.22601v1

Authors: Jacob L. Block, Aryan Mokhtari, Sanjay Shakkottai

Machine unlearning algorithms aim to remove the influence of specific training samples, ideally recovering the model that would have resulted from training on the remaining data alone. We study unlearning in the overparameterized setting, where many models interpolate the data, and defining the unlearning solution as any loss minimizer over the retained set – as in prior work in the underparameterized setting – is inadequate, since the original model may already interpolate the retained data and satisfy this condition. In this regime, loss gradients vanish, rendering prior methods based on gradient perturbations ineffective, motivating both new unlearning definitions and algorithms. For this setting, we define the unlearning solution as the minimum-complexity interpolator over the retained data and propose a new algorithmic framework that only requires access to model gradients on the retained set at the original solution. We minimize a regularized objective over perturbations constrained to be orthogonal to these model gradients, a first-order relaxation of the interpolation condition. For different model classes, we provide exact and approximate unlearning guarantees, and we demonstrate that an implementation of our framework outperforms existing baselines across various unlearning experiments.

Article 466

Title@2025-05-28 (3): HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym

Title: HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym

HDDLGym: Ein Tool zum Studieren multi-agenter Hierarchischer Probleme, definiert in HDDL mit OpenAI Gym

HDDLGym: 与 OpenAI Gym 一起研究在HDDL 中界定的多代理等级问题的工具 2505.22597v1

Authors: Ngoc La, Ruaridh Mon-Williams, Julie A. Shah

In recent years, reinforcement learning (RL) methods have been widely tested using tools like OpenAI Gym, though many tasks in these environments could also benefit from hierarchical planning. However, no existing tool enables seamless integration of hierarchical planning with RL. Hierarchical Domain Definition Language (HDDL), used in classical planning, introduces a structured approach well-suited for model-based RL to address this gap. To enable this integration, we introduce HDDLGym, a Python-based tool that automatically generates OpenAI Gym environments from HDDL domains and problems. HDDLGym serves as a link between RL and hierarchical planning, supporting multi-agent scenarios and enabling collaborative planning among agents. This paper provides an overview of HDDLGym’s design and implementation, highlighting the challenges and design choices involved in integrating HDDL with the Gym interface, and applying RL policies to support hierarchical planning. We also provide detailed instructions and demonstrations for using the HDDLGym framework, including how to work with existing HDDL domains and problems from International Planning Competitions, exemplified by the Transport domain. Additionally, we offer guidance on creating new HDDL domains for multi-agent scenarios and demonstrate the practical use of HDDLGym in the Overcooked domain. By leveraging the advantages of HDDL and Gym, HDDLGym aims to be a valuable tool for studying RL in hierarchical planning, particularly in multi-agent contexts.

Article 467

Title: SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

SynWorld: 用于改进制剂行动知识的虚拟情景合成 2504.03561v2

Authors: Runnan Fang, Xiaobin Wang, Yuan Liang, Shuofei Qiao, Jialong Wu, Zekun Xi, Ningyu Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

In the interaction between agents and their environments, agents expand their capabilities by planning and executing actions. However, LLM-based agents face substantial challenges when deployed in novel environments or required to navigate unconventional action spaces. To empower agents to autonomously explore environments, optimize workflows, and enhance their understanding of actions, we propose SynWorld, a framework that allows agents to synthesize possible scenarios with multi-step action invocation within the action space and perform Monte Carlo Tree Search (MCTS) exploration to effectively refine their action knowledge in the current environment. Our experiments demonstrate that SynWorld is an effective and general approach to learning action knowledge in new environments. Code is available at https://github.com/zjunlp/SynWorld.

Article 468

Title@2025-05-28 (3): Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning

Title: Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning

Self-Error-Instruct: Verallgemeinern von Fehlern für LLMs Mathematische Begründung

自错误教学法: 数学理由LLMs 的错误一般化 2505.22591v1

Authors: Erxin Yu, Jing Li, Ming Liao, Qi Zhu, Boyang Xue, Minghui Xu, Baojun Wang, Lanqing Hong, Fei Mi, Lifeng Shang

Although large language models demonstrate strong performance across various domains, they still struggle with numerous bad cases in mathematical reasoning. Previous approaches to learning from errors synthesize training data by solely extrapolating from isolated bad cases, thereby failing to generalize the extensive patterns inherent within these cases. This paper presents Self-Error-Instruct (SEI), a framework that addresses these model weaknesses and synthesizes more generalized targeted training data. Specifically, we explore a target model on two mathematical datasets, GSM8K and MATH, to pinpoint bad cases. Then, we generate error keyphrases for these cases based on the instructor model’s (GPT-4o) analysis and identify error types by clustering these keyphrases. Next, we sample a few bad cases during each generation for each identified error type and input them into the instructor model, which synthesizes additional training data using a self-instruct approach. This new data is refined through a one-shot learning process to ensure that only the most effective examples are kept. Finally, we use these curated data to fine-tune the target model, iteratively repeating the process to enhance performance. We apply our framework to various models and observe improvements in their reasoning abilities across both in-domain and out-of-domain mathematics datasets. These results demonstrate the effectiveness of self-error instruction in improving LLMs’ mathematical reasoning through error generalization.

Article 469

Title@2025-05-28 (3): VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use

Title: VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use

VTool-R1: VLMs lernen mit Bildern zu denken, indem sie mehr über multimodale Werkzeugnutzung lernen

VTool-R1:VLMs通过多模式工具使用强化学习学习如何用图像思考 2505.19255v2

Authors: Mingyuan Wu, Jingcheng Yang, Jize Jiang, Meitang Li, Kaizhuo Yan, Hanchao Yu, Minjia Zhang, Chengxiang Zhai, Klara Nahrstedt

Reinforcement Learning Finetuning (RFT) has significantly advanced the reasoning capabilities of large language models (LLMs) by enabling long chains of thought, self-correction, and effective tool use. While recent works attempt to extend RFT to vision-language models (VLMs), these efforts largely produce text-only reasoning conditioned on static image inputs, falling short of true multimodal reasoning in the response. In contrast, test-time methods like Visual Sketchpad incorporate visual steps but lack training mechanisms. We introduce VTool-R1, the first framework that trains VLMs to generate multimodal chains of thought by interleaving text and intermediate visual reasoning steps. VTool-R1 integrates Python-based visual editing tools into the RFT process, enabling VLMs to learn when and how to generate visual reasoning steps that benefit final reasoning. Trained with outcome-based rewards tied to task accuracy, our approach elicits strategic visual tool use for reasoning without relying on process-based supervision. Experiments on structured visual question answering over charts and tables show that VTool-R1 enhances reasoning performance by teaching VLMs to “think with images” and generate multimodal chains of thought with tools.

Article 470

Title@2025-05-28 (3): ReLearn: Unlearning via Learning for Large Language Models

Title: ReLearn: Unlearning via Learning for Large Language Models

ReLearn: Entlernen über Learning for Large Language Models

ReLearn:通过学习大语言模式来重新学习 2502.11190v3

Authors: Haoming Xu, Ningyuan Zhao, Liming Yang, Sendong Zhao, Shumin Deng, Mengru Wang, Bryan Hooi, Nay Oo, Huajun Chen, Ningyu Zhang

Current unlearning methods for large language models usually rely on reverse optimization to reduce target token probabilities. However, this paradigm disrupts the prediction of subsequent tokens, degrading model performance and linguistic coherence. Moreover, existing evaluation metrics overemphasize contextual forgetting while inadequately assessing response fluency and relevance. To address these challenges, we propose ReLearn, a data augmentation and fine-tuning pipeline for effective unlearning, along with a comprehensive evaluation framework. This framework introduces Knowledge Forgetting Rate (KFR) and Knowledge Retention Rate (KRR) to measure knowledge-level preservation, and Linguistic Score (LS) to evaluate generation quality. Our experiments show that ReLearn successfully achieves targeted forgetting while preserving high-quality output. Through mechanistic analysis, we further demonstrate how reverse optimization disrupts coherent text generation, while ReLearn preserves this essential capability. Code is available at https://github.com/zjunlp/unlearn.

Article 471

Title@2025-05-28 (3): Benignity of loss landscape with weight decay requires both large overparametrization and initialization

Title: Benignity of loss landscape with weight decay requires both large overparametrization and initialization

Die Benignität der Verlustlandschaft mit dem Verfall des Gewichts erfordert sowohl große Überparametrierung als auch Initialisierung

损失景观与体重衰减的尊严要求大规模过度平衡和初始化 2505.22578v1

Authors: Etienne Boursier, Matthew Bowditch, Matthias Englert, Ranko Lazic

The optimization of neural networks under weight decay remains poorly understood from a theoretical standpoint. While weight decay is standard practice in modern training procedures, most theoretical analyses focus on unregularized settings. In this work, we investigate the loss landscape of the $\ell_2$-regularized training loss for two-layer ReLU networks. We show that the landscape becomes benign – i.e., free of spurious local minima – under large overparametrization, specifically when the network width $m$ satisfies $m \gtrsim \min(n^d, 2^n)$, where $n$ is the number of data points and $d$ the input dimension. More precisely in this regime, almost all constant activation regions contain a global minimum and no spurious local minima. We further show that this level of overparametrization is not only sufficient but also necessary via the example of orthogonal data. Finally, we demonstrate that such loss landscape results primarily hold relevance in the large initialization regime. In contrast, for small initializations – corresponding to the feature learning regime – optimization can still converge to spurious local minima, despite the global benignity of the landscape.
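
To give a sense of scale for the width condition m ≳ min(n^d, 2^n), the short sketch below evaluates the right-hand side for a few values of n (number of data points) and d (input dimension). The constants hidden in the ≳ notation are ignored, and the function name is hypothetical, so the numbers indicate orders of magnitude only.

```python
def width_threshold(n, d):
    """Right-hand side of m >~ min(n**d, 2**n), ignoring constants."""
    return min(n ** d, 2 ** n)


# Few points in high dimension: the 2**n term is the smaller one.
few_points = width_threshold(n=5, d=10)

# More points in low dimension: the n**d term is the smaller one.
low_dim = width_threshold(n=10, d=3)

# Even moderate problem sizes give very large thresholds.
moderate = width_threshold(n=100, d=10)
```

Here `few_points` is 2^5 = 32 and `low_dim` is 10^3 = 1000, while `moderate` is already 100^10 = 10^20, which illustrates how demanding this level of overparametrization is.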

Article 472

Title@2025-05-28 (3): FNOPE: Simulation-based inference on function spaces with Fourier Neural Operators

Title: FNOPE: Simulation-based inference on function spaces with Fourier Neural Operators

FNOPE: Simulationsbasierte Inferenz auf Funktionsräumen mit Fourier-Neural-Betreibern

FNOPE: Fourier神经操作员对功能空间的模拟推推 2505.22573v1

Authors: Guy Moss, Leah Sophie Muhle, Reinhard Drews, Jakob H. Macke, Cornelius Schröder

Simulation-based inference (SBI) is an established approach for performing Bayesian inference on scientific simulators. SBI so far works best on low-dimensional parametric models. However, it is difficult to infer function-valued parameters, which frequently occur in disciplines that model spatiotemporal processes such as the climate and earth sciences. Here, we introduce an approach for efficient posterior estimation, using a Fourier Neural Operator (FNO) architecture with a flow matching objective. We show that our approach, FNOPE, can perform inference of function-valued parameters at a fraction of the simulation budget of state-of-the-art methods. In addition, FNOPE supports posterior evaluation at arbitrary discretizations of the domain, as well as simultaneous estimation of vector-valued parameters. We demonstrate the effectiveness of our approach on several benchmark tasks and a challenging spatial inference task from glaciology. FNOPE extends the applicability of SBI methods to new scientific domains by enabling the inference of function-valued parameters.

Article 473

Title: PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion

PRISM: Videodatensatz-Kondensation mit progressiver Veredelung und Einfügung für Sparse Motion

PRISM: 视频数据集浓缩,并逐步精化和插入,用于微缩移动 2505.22564v1

Authors: Jaehyun Choi, Jiwan Hur, Gyojin Han, Jaemyung Yu, Junmo Kim

Video dataset condensation has emerged as a critical technique for addressing the computational challenges associated with large-scale video data processing in deep learning applications. While significant progress has been made in image dataset condensation, the video domain presents unique challenges due to the complex interplay between spatial content and temporal dynamics. This paper introduces PRISM (Progressive Refinement and Insertion for Sparse Motion), a novel approach to video dataset condensation that fundamentally reconsiders how video data should be condensed. Unlike the previous method that separates static content from dynamic motion, our method preserves the essential interdependence between these elements. Our approach progressively refines and inserts frames to fully accommodate the motion in an action, taking into account the relation of gradients for each frame, and achieves better performance with less storage. Extensive experiments across standard video action recognition benchmarks demonstrate that PRISM outperforms existing disentangled approaches while maintaining compact representations suitable for resource-constrained environments.

Article 474

Title@2025-05-28 (3): Geometric Hyena Networks for Large-scale Equivariant Learning

Title: Geometric Hyena Networks for Large-scale Equivariant Learning

Geometrische Hyänennetze für großmaßstäbliches Äquivalent-Lernen

大规模平等学习的几何Hyena网络 2505.22560v1

Authors: Artem Moskalev, Mangal Prakash, Junjie Xu, Tianyu Cui, Rui Liao, Tommaso Mansi

Processing global geometric context while preserving equivariance is crucial when modeling biological, chemical, and physical systems. Yet, this is challenging due to the computational demands of equivariance and global context at scale. Standard methods such as equivariant self-attention suffer from quadratic complexity, while local methods such as distance-based message passing sacrifice global information. Inspired by the recent success of state-space and long-convolutional models, we introduce Geometric Hyena, the first equivariant long-convolutional model for geometric systems. Geometric Hyena captures global geometric context at sub-quadratic complexity while maintaining equivariance to rotations and translations. Evaluated on all-atom property prediction of large RNA molecules and full protein molecular dynamics, Geometric Hyena outperforms existing equivariant models while requiring significantly less memory and compute than equivariant self-attention. Notably, our model processes the geometric context of 30k tokens 20x faster than the equivariant transformer and allows 72x longer context within the same budget.

Article 475

Title@2025-05-28 (3): Preference Adaptive and Sequential Text-to-Image Generation

Title: Preference Adaptive and Sequential Text-to-Image Generation

Präferenz Adaptive und sequentielle Text-zu-Bild-Generierung

适应性和顺序性文字到图像生成 2412.10419v2

Authors: Ofir Nabati, Guy Tennenholtz, ChihWei Hsu, Moonkyung Ryu, Deepak Ramachandran, Yinlam Chow, Xiang Li, Craig Boutilier

We address the problem of interactive text-to-image (T2I) generation, designing a reinforcement learning (RL) agent which iteratively improves a set of generated images for a user through a sequence of prompt expansions. Using human raters, we create a novel dataset of sequential preferences, which we leverage, together with large-scale open-source (non-sequential) datasets. We construct user-preference and user-choice models using an EM strategy and identify varying user preference types. We then leverage a large multimodal language model (LMM) and a value-based RL approach to suggest an adaptive and diverse slate of prompt expansions to the user. Our Preference Adaptive and Sequential Text-to-image Agent (PASTA) extends T2I models with adaptive multi-turn capabilities, fostering collaborative co-creation and addressing uncertainty or underspecification in a user’s intent. We evaluate PASTA using human raters, showing significant improvement compared to baseline methods. We also open-source our sequential rater dataset and simulated user-rater interactions to support future research in user-centric multi-turn T2I systems.

Article 476

Title@2025-05-28 (3): Can Copulas Be Used for Feature Selection? A Machine Learning Study on Diabetes Risk Prediction

Title: Can Copulas Be Used for Feature Selection? A Machine Learning Study on Diabetes Risk Prediction

Kann Copulas für die Feature-Auswahl verwendet werden? Eine maschinelle Studie über Diabetes Risikovorhersage

Copulas 能够用来选择特质吗? 糖尿病风险预测的机器学习研究。 2505.22554v1

Authors: Agnideep Aich, Md Monzur Murshed, Amanda Mayeaux, Sameera Hewage

Accurate diabetes risk prediction relies on identifying key features from complex health datasets, but conventional methods like mutual information (MI) filters and genetic algorithms (GAs) often overlook extreme dependencies critical for high-risk subpopulations. In this study we introduce a feature-selection framework using the upper-tail dependence coefficient ($\lambda_U$) of the novel A2 copula, which quantifies how often extremely high values of a predictor co-occur with diabetes diagnoses (target variable). Applied to the CDC Diabetes Health Indicators dataset (n=253,680), our method prioritizes five predictors (self-reported general health, high blood pressure, body mass index, mobility limitations, and high cholesterol levels) based on upper-tail dependencies. These features match or outperform MI- and GA-selected subsets across four classifiers (Random Forest, XGBoost, Logistic Regression, Gradient Boosting), achieving accuracy up to 86.5% (XGBoost) and AUC up to 0.806 (Gradient Boosting), rivaling the full 21-feature model. Permutation importance confirms clinical relevance, with BMI and general health driving accuracy. To our knowledge, this is the first work to apply a copula’s upper-tail dependence for supervised feature selection, bridging extreme-value theory and machine learning to deliver a practical toolkit for diabetes prevention.
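
To make the notion of upper-tail dependence concrete, here is a minimal sketch of a nonparametric, rank-based estimate of P(U > q | V > q) for two continuous variables. It is an illustration under assumptions: the abstract uses the parametric coefficient of the A2 copula, whose formula is not given here, and the function name and the threshold `q = 0.95` are hypothetical choices.

```python
import numpy as np

def empirical_upper_tail_dependence(x, y, q=0.95):
    """Estimate P(F_x(x) > q | F_y(y) > q) from ranks (pseudo-observations)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Rank-transform to (0, 1); ties are broken by order of appearance.
    u = (np.argsort(np.argsort(x)) + 1) / (n + 1)
    v = (np.argsort(np.argsort(y)) + 1) / (n + 1)
    in_tail_y = v > q
    if not in_tail_y.any():
        return 0.0
    return float(np.mean(u[in_tail_y] > q))

rng = np.random.default_rng(0)
a = np.arange(1000.0)
comonotone = empirical_upper_tail_dependence(a, a)          # perfect co-movement
independent = empirical_upper_tail_dependence(
    rng.normal(size=5000), rng.normal(size=5000))           # close to 1 - q
```

Features could then be ranked by this score against the target, with larger values indicating that extremes of the feature and the target tend to occur together.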

Article 477

Title@2025-05-28 (3): Data-Distill-Net: A Data Distillation Approach Tailored for Reply-based Continual Learning

Title: Data-Distill-Net: A Data Distillation Approach Tailored for Reply-based Continual Learning

Data-Distill-Net: Ein Datendestillationsansatz, der auf Reply-based Continual Learning zugeschnitten ist

Data-still-Net:为基于答复的不断学习量身定制的数据蒸馏方法 2505.20135v2

Authors: Wenyang Liao, Quanziang Wang, Yichen Wu, Renzhen Wang, Deyu Meng

Replay-based continual learning (CL) methods assume that models trained on a small subset can also effectively minimize the empirical risk of the complete dataset. These methods maintain a memory buffer that stores a sampled subset of data from previous tasks to consolidate past knowledge. However, this assumption is not guaranteed in practice due to the limited capacity of the memory buffer and the heuristic criteria used for buffer data selection. To address this issue, we propose a new dataset distillation framework tailored for CL, which maintains a learnable memory buffer to distill the global information from the current task data and accumulated knowledge preserved in the previous memory buffer. Moreover, to avoid the computational overhead and overfitting risks associated with parameterizing the entire buffer during distillation, we introduce a lightweight distillation module that can achieve global information distillation solely by generating learnable soft labels for the memory buffer data. Extensive experiments show that our method achieves competitive results and effectively mitigates forgetting across various datasets. The source code will be publicly available.

Article 478

Title@2025-05-28 (3): DES-LOC: Desynced Low Communication Adaptive Optimizers for Training Foundation Models

Title: DES-LOC: Desynced Low Communication Adaptive Optimizers for Training Foundation Models

DES-LOC: Entsynced Low Communication Adaptive Optimizers for Training Foundation Models

DES-LOC:为培训基金会模型提供发光的低通信适应性适应性优化剂 2505.22549v1

Authors: Alex Iacob, Lorenzo Sani, Mher Safaryan, Paris Giampouras, Samuel Horváth, Andrej Jovanovic, Meghdad Kurmanji, Preslav Aleksandrov, William F. Shen, Xinchi Qiu, Nicholas D. Lane

Scaling foundation model training with Distributed Data Parallel (DDP) methods is bandwidth-limited. Existing infrequent communication methods like Local SGD were designed to synchronize only model parameters and cannot be trivially applied to adaptive optimizers due to additional optimizer states. Current approaches extending Local SGD either lack convergence guarantees or require synchronizing all optimizer states, tripling communication costs. We propose Desynced Low Communication Adaptive Optimizers (DES-LOC), a family of optimizers assigning independent synchronization periods to parameters and momenta, enabling lower communication costs while preserving convergence. Through extensive experiments on language models of up to 1.7B, we show that DES-LOC can communicate 170x less than DDP and 2x less than the previous state-of-the-art Local ADAM. Furthermore, unlike previous heuristic approaches, DES-LOC is suited for practical training scenarios prone to system failures. DES-LOC offers a scalable, bandwidth-efficient, and fault-tolerant solution for foundation model training.
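
The communication argument can be sketched with simple accounting. The model below is an assumption for illustration, not the paper's analysis: it counts one model-sized all-reduce per synchronized state, ignores compression and overlap, and uses hypothetical synchronization periods.

```python
def relative_communication(param_period, first_moment_period, second_moment_period):
    """Per-step traffic relative to DDP, in units of one model-sized all-reduce."""
    # DDP all-reduces one model-sized gradient tensor every step -> cost 1.0.
    return (1.0 / param_period
            + 1.0 / first_moment_period
            + 1.0 / second_moment_period)

# Synchronizing all three Adam states together every 64 steps (Local-Adam style).
local_adam = relative_communication(64, 64, 64)
# Desynced: same parameter period, but momenta synchronized far less often.
desynced = relative_communication(64, 256, 512)
```

Under these assumed periods, synchronizing everything together costs 3/64 of DDP's traffic, while the desynced schedule costs 11/512, which is less than half of that.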

Article 479

Title@2025-05-28 (3): A Human-Centric Approach to Explainable AI for Personalized Education

Title: A Human-Centric Approach to Explainable AI for Personalized Education

Ein menschlich-zentraler Ansatz zur erklärbaren KI für die personalisierte Bildung

以人文文化方式解释个人个性化教育的可解释的AI 2505.22541v1

Authors: Vinitra Swamy

Deep neural networks form the backbone of artificial intelligence research, with potential to transform the human experience in areas ranging from autonomous driving to personal assistants, healthcare to education. However, their integration into the daily routines of real-world classrooms remains limited. It is not yet common for a teacher to assign students individualized homework targeting their specific weaknesses, provide students with instant feedback, or simulate student responses to a new exam question. While these models excel in predictive performance, this lack of adoption can be attributed to a significant weakness: the lack of explainability of model decisions, leading to a lack of trust from students, parents, and teachers. This thesis aims to bring human needs to the forefront of eXplainable AI (XAI) research, grounded in the concrete use case of personalized learning and teaching. We frame the contributions along two verticals: technical advances in XAI and their aligned human studies. We investigate explainability in AI for education, revealing systematic disagreements between post-hoc explainers and identifying a need for inherently interpretable model architectures. We propose four novel technical contributions in interpretability with a multimodal modular architecture (MultiModN), an interpretable mixture-of-experts model (InterpretCC), adversarial training for explainer stability, and a theory-driven LLM-XAI framework to present explanations to students (iLLuMinaTE), which we evaluate in diverse settings with professors, teachers, learning scientists, and university students. By combining empirical evaluations of existing explainers with novel architectural designs and human studies, our work lays a foundation for human-centric AI systems that balance state-of-the-art performance with built-in transparency and trust.

Article 480

Title@2025-05-28 (3): Uncertainty Quantification with Proper Scoring Rules: Adjusting Measures to Prediction Tasks

Title: Uncertainty Quantification with Proper Scoring Rules: Adjusting Measures to Prediction Tasks

Ungewissheitsquantifizierung mit korrekten Bewertungsregeln: Anpassung von Maßnahmen an Vorhersageaufgaben

以适当排序规则对不确定性进行量化:预测任务调整措施 2505.22538v1

Authors: Paul Hofman, Yusuf Sale, Eyke Hüllermeier

We address the problem of uncertainty quantification and propose measures of total, aleatoric, and epistemic uncertainty based on a known decomposition of (strictly) proper scoring rules, a specific type of loss function, into a divergence and an entropy component. This leads to a flexible framework for uncertainty quantification that can be instantiated with different losses (scoring rules), which makes it possible to tailor uncertainty quantification to the use case at hand. We show that this flexibility is indeed advantageous. In particular, we analyze the task of selective prediction and show that the scoring rule should ideally match the task loss. In addition, we perform experiments on two other common tasks. For out-of-distribution detection, our results confirm that a widely used measure of epistemic uncertainty, mutual information, performs best. Moreover, in the setting of active learning, our measure of epistemic uncertainty based on the zero-one-loss consistently outperforms other uncertainty measures.
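
As a concrete instance of an entropy-plus-divergence decomposition, the sketch below uses the log score with an ensemble of predictive distributions: total uncertainty is the entropy of the averaged prediction, aleatoric uncertainty is the expected entropy, and their difference is the mutual information. This is the familiar ensemble-based form and is given under that assumption; the paper's exact definitions for other scoring rules may differ, and the function names are hypothetical.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p), axis=axis)

def log_score_uncertainties(member_probs):
    """member_probs: (n_members, n_classes) predictive distributions."""
    member_probs = np.asarray(member_probs, dtype=float)
    mean_probs = member_probs.mean(axis=0)
    total = entropy(mean_probs)                 # entropy of the averaged prediction
    aleatoric = entropy(member_probs).mean()    # expected entropy of the members
    epistemic = total - aleatoric               # expected KL to the mean (mutual information)
    return total, aleatoric, epistemic

# Members agree on a coin flip: uncertainty is purely aleatoric.
agree = log_score_uncertainties([[0.5, 0.5], [0.5, 0.5]])
# Members are confident but contradict each other: mostly epistemic.
disagree = log_score_uncertainties([[0.99, 0.01], [0.01, 0.99]])
```

Both toy ensembles have the same total uncertainty (ln 2), yet the split between the two components is very different, which is what makes the decomposition useful for tasks such as out-of-distribution detection.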

Article 481

Title@2025-05-28 (3): TabularQGAN: A Quantum Generative Model for Tabular Data

Title: TabularQGAN: A Quantum Generative Model for Tabular Data

TabularQGAN: Ein Quantum Generatives Modell für Tabulardaten

表格QGAN:表格数据量子生成模型 2505.22533v1

Authors: Pallavi Bhardwaj, Caitlin Jones, Lasse Dierich, Aleksandar Vučković

In this paper, we introduce a novel quantum generative model for synthesizing tabular data. Synthetic data is valuable in scenarios where real-world data is scarce or private, as it can be used to augment or replace existing datasets. Real-world enterprise data is predominantly tabular and heterogeneous, often comprising a mixture of categorical and numerical features, making it highly relevant across various industries such as healthcare, finance, and software. We propose a quantum generative adversarial network architecture with flexible data encoding and a novel quantum circuit ansatz to effectively model tabular data. The proposed approach is tested on the MIMIC III healthcare and Adult Census datasets, with extensive benchmarking against leading classical models, CTGAN, and CopulaGAN. Experimental results demonstrate that our quantum model outperforms classical models by an average of 8.5% with respect to an overall similarity score from SDMetrics, while using only 0.072% of the parameters of the classical models. Additionally, we evaluate the generalization capabilities of the models using two custom-designed metrics that demonstrate the ability of the proposed quantum model to generate useful and novel samples. To our knowledge, this is one of the first demonstrations of a successful quantum generative model for handling tabular data, indicating that this task could be well-suited to quantum computers.

Article 482

Title@2025-05-28 (3): Prediction of the Most Fire-Sensitive Point in Building Structures with Differentiable Agents for Thermal Simulators

Title: Prediction of the Most Fire-Sensitive Point in Building Structures with Differentiable Agents for Thermal Simulators

Vorhersage des feuerempfindlichsten Punkts in Gebäudestrukturen mit differenzierbaren Agenten für thermische Simulatoren

预测热模拟器使用不同物剂建造结构时最能防火的火敏度点 2502.03424v4

Authors: Yuan Xinjie, Khalid M. Mosalam

Fire safety is crucial for ensuring the stability of building structures, yet evaluating whether a structure meets fire safety requirements is challenging. Fires can originate at any point within a structure, and simulating every potential fire scenario is both expensive and time-consuming. To address this challenge, we propose the concept of the Most Fire-Sensitive Point (MFSP) and an efficient machine learning framework for its identification. The MFSP is defined as the location at which a fire, if initiated, would cause the most severe detrimental impact on the building’s stability, effectively representing the worst-case fire scenario. In our framework, a Graph Neural Network (GNN) serves as an efficient and differentiable agent for conventional Finite Element Analysis (FEA) simulators by predicting the Maximum Interstory Drift Ratio (MIDR) under fire, which then guides the training and evaluation of the MFSP predictor. Additionally, we enhance our framework with a novel edge update mechanism and a transfer learning-based training scheme. Evaluations on a large-scale simulation dataset demonstrate the good performance of the proposed framework in identifying the MFSP, offering a transformative tool for optimizing fire safety assessments in structural design. All developed datasets and codes are open-sourced online.

Article 483

Title@2025-05-28 (3): Training RL Agents for Multi-Objective Network Defense Tasks

Title: Training RL Agents for Multi-Objective Network Defense Tasks

Schulung von RL-Agenten für multi-objektive Netzwerkverteidigungsaufgaben

多目标网络防御任务培训RL代理 2505.22531v1

Authors: Andres Molina-Markham, Luis Robaina, Sean Steinle, Akash Trivedi, Derek Tsui, Nicholas Potteiger, Lauren Brandt, Ransom Winder, Ahmed Ridley

Open-ended learning (OEL) – which emphasizes training agents that achieve broad capability over narrow competency – is emerging as a paradigm to develop artificial intelligence (AI) agents to achieve robustness and generalization. However, despite promising results that demonstrate the benefits of OEL, applying OEL to develop autonomous agents for real-world cybersecurity applications remains a challenge. We propose a training approach, inspired by OEL, to develop autonomous network defenders. Our results demonstrate that, as in other domains, OEL principles can translate into more robust and generalizable agents for cyber defense. To apply OEL to network defense, it is necessary to address several technical challenges. Most importantly, it is critical to provide a task representation approach over a broad universe of tasks that maintains a consistent interface over goals, rewards and action spaces. This way, the learning agent can train with varying network conditions, attacker behaviors, and defender goals while being able to build on previously gained knowledge. With our tools and results, we aim to fundamentally impact research that applies AI to solve cybersecurity problems. Specifically, as researchers develop gyms and benchmarks for cyber defense, it is paramount that they consider diverse tasks with consistent representations, such as those we propose in our work.

Article 484

Title@2025-05-28 (3): Symplectic Generative Networks (SGNs): A Hamiltonian Framework for Invertible Deep Generative Modeling

Title: Symplectic Generative Networks (SGNs): A Hamiltonian Framework for Invertible Deep Generative Modeling

Symplektische Generative Netzwerke (SGNs): Ein Hamiltonsches Framework für invertible Deep Generative Modeling

症状产生网络:一个汉密尔顿框架,用于可垂直产生深层产生模型的建立 2505.22527v1

Authors: Agnideep Aich, Ashit Aich, Bruce Wade

We introduce the Symplectic Generative Network (SGN), a deep generative model that leverages Hamiltonian mechanics to construct an invertible, volume-preserving mapping between a latent space and the data space. By endowing the latent space with a symplectic structure and modeling data generation as the time evolution of a Hamiltonian system, SGN achieves exact likelihood evaluation without incurring the computational overhead of Jacobian determinant calculations. In this work, we provide a rigorous mathematical foundation for SGNs through a comprehensive theoretical framework that includes: (i) complete proofs of invertibility and volume preservation, (ii) a formal complexity analysis with theoretical comparisons to Variational Autoencoders and Normalizing Flows, (iii) strengthened universal approximation results with quantitative error bounds, (iv) an information-theoretic analysis based on the geometry of statistical manifolds, and (v) an extensive stability analysis with adaptive integration guarantees. These contributions highlight the fundamental advantages of SGNs and establish a solid foundation for future empirical investigations and applications to complex, high-dimensional data.
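
Invertibility and volume preservation can be seen in a few lines with a leapfrog integrator, a standard symplectic scheme. The sketch below assumes a separable Hamiltonian H(q, p) = V(q) + 0.5 * ||p||^2 with a toy quadratic potential; whether SGN uses this integrator or this form of Hamiltonian is not stated in the abstract, so treat every name here as hypothetical.

```python
import numpy as np

def grad_potential(q):
    # Assumed toy potential V(q) = 0.5 * ||q||^2, so dV/dq = q.
    return q

def leapfrog(q, p, step=0.1, n_steps=10):
    """Integrate H(q, p) = V(q) + 0.5 * ||p||^2 with the leapfrog scheme."""
    q, p = q.copy(), p.copy()
    for _ in range(n_steps):
        p = p - 0.5 * step * grad_potential(q)   # half kick
        q = q + step * p                         # full drift
        p = p - 0.5 * step * grad_potential(q)   # half kick
    return q, p

def leapfrog_inverse(q, p, step=0.1, n_steps=10):
    """Exact inverse: flip the momentum, integrate forward, flip it back."""
    q_back, p_back = leapfrog(q, -p, step, n_steps)
    return q_back, -p_back

rng = np.random.default_rng(0)
q0, p0 = rng.normal(size=4), rng.normal(size=4)
q1, p1 = leapfrog(q0, p0)
q_rec, p_rec = leapfrog_inverse(q1, p1)
```

Each kick and drift is a shear with unit Jacobian determinant, so the composed map preserves volume and the log-determinant term in the change-of-variables formula vanishes.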

Article 485

Title@2025-05-28 (3): Test-Time Alignment of Discrete Diffusion Models with Sequential Monte Carlo

Title: Test-Time Alignment of Discrete Diffusion Models with Sequential Monte Carlo

Test-Time Alignment von diskreten Diffusionsmodellen mit Sequential Monte Carlo

使用顺序式蒙特卡洛的分解传播模型的测试时间对齐 2505.22524v1

Authors: Chinmay Pani, Zijing Ou, Yingzhen Li

Discrete diffusion models have become highly effective across various domains. However, real-world applications often require the generative process to adhere to certain constraints without task-specific fine-tuning. To this end, we propose a training-free method based on Sequential Monte Carlo (SMC) to sample from the reward-aligned target distribution at test time. Our approach leverages twisted SMC with an approximate locally optimal proposal, obtained via a first-order Taylor expansion of the reward function. To address the challenge of ill-defined gradients in discrete spaces, we incorporate a Gumbel-Softmax relaxation, enabling efficient gradient-based approximation within the discrete generative framework. Empirical results on both synthetic datasets and image modelling validate the effectiveness of our approach.
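
The Gumbel-Softmax relaxation mentioned above replaces a hard categorical draw with a smooth point on the probability simplex, so that a reward can be differentiated with respect to it. A minimal NumPy sketch of the sampling step follows; the temperature and logits are illustrative, and how the relaxation is wired into the SMC proposal is not shown.

```python
import numpy as np

def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed one-hot sample from Categorical(softmax(logits))."""
    uniform = rng.uniform(low=1e-12, high=1.0, size=np.shape(logits))
    gumbel_noise = -np.log(-np.log(uniform))
    scores = (np.asarray(logits, dtype=float) + gumbel_noise) / temperature
    scores = scores - scores.max()          # stabilize the exponentials
    weights = np.exp(scores)
    return weights / weights.sum()

rng = np.random.default_rng(0)
logits = np.log([0.7, 0.2, 0.1])
sample = gumbel_softmax(logits, temperature=0.5, rng=rng)
```

As the temperature approaches zero the sample approaches a one-hot vector, and the index of its largest entry is distributed according to the original categorical probabilities.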

Article 486

Title@2025-05-28 (3): Evaluating Supervised Learning Models for Fraud Detection: A Comparative Study of Classical and Deep Architectures on Imbalanced Transaction Data

Title: Evaluating Supervised Learning Models for Fraud Detection: A Comparative Study of Classical and Deep Architectures on Imbalanced Transaction Data

Bewertung von überwachten Lernmodellen für Betrugserkennung: Eine vergleichende Studie klassischer und tiefer Architekturen zu unausgewogenen Transaktionsdaten

评价受监督的欺诈侦查学习模式:关于不平衡交易数据的经典和深层结构比较研究 2505.22521v1

Authors: Chao Wang, Chuanhao Nie, Yunbo Liu

Fraud detection remains a critical task in high-stakes domains such as finance and e-commerce, where undetected fraudulent transactions can lead to significant economic losses. In this study, we systematically compare the performance of four supervised learning models - Logistic Regression, Random Forest, Light Gradient Boosting Machine (LightGBM), and a Gated Recurrent Unit (GRU) network - on a large-scale, highly imbalanced online transaction dataset. While ensemble methods such as Random Forest and LightGBM demonstrated superior performance in both overall and class-specific metrics, Logistic Regression offered a reliable and interpretable baseline. The GRU model showed strong recall for the minority fraud class, though at the cost of precision, highlighting a trade-off relevant for real-world deployment. Our evaluation emphasizes not only weighted averages but also per-class precision, recall, and F1-scores, providing a nuanced view of each model’s effectiveness in detecting rare but consequential fraudulent activity. The findings underscore the importance of choosing models based on the specific risk tolerance and operational needs of fraud detection systems.
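
The point about per-class metrics is easy to see on toy numbers. The labels and predictions below are made up for illustration and are not results from the study; the function name is hypothetical.

```python
import numpy as np

def per_class_report(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one class treated as the positive class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return float(precision), float(recall), float(f1)

# Toy labels: 95 legitimate (0) and 5 fraudulent (1) transactions.
y_true = np.array([0] * 95 + [1] * 5)
# A hypothetical high-recall detector: catches 4 of 5 frauds, raises 6 false alarms.
y_pred = np.array([1] * 6 + [0] * 89 + [1] * 4 + [0] * 1)
fraud = per_class_report(y_true, y_pred, positive=1)
accuracy = float(np.mean(y_true == y_pred))
```

Here overall accuracy is 0.93, while the fraud class has precision 0.4 and recall 0.8, the kind of trade-off that an aggregate score hides.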

Article 487

Title@2025-05-28 (3): IGNIS: A Neural Network Framework for Robust Parameter Estimation in Archimedean Copulas

Title: IGNIS: A Neural Network Framework for Robust Parameter Estimation in Archimedean Copulas

IGNIS: Ein neurales Netzwerk-Framework für robuste Parameterschätzungen in Archimedischen Copulas

INGNIS: Archimedean Copuulas 强参数估计神经网络框架 2505.22518v1

Authors: Agnideep Aich, Ashit Baran Aich, Bruce Wade

Parameter estimation for Archimedean copulas remains a challenging problem, particularly for the recently developed A1 and A2 families that exhibit complex dependency structures. Traditional methods, such as the Method of Moments (MoM), Maximum Likelihood Estimation (MLE), and Maximum Pseudo-Likelihood (MPL), often struggle because of non-monotonic relationships with dependency measures such as Kendall’s tau (as in the case of A1) and because of numerical instability. In this paper, we present the IGNIS Network, a novel, unified neural framework that learns a direct mapping from observable dependency measures to copula parameters, thereby overcoming the limitations of classical approaches. Our approach is trained on simulated data spanning five Archimedean copula families including Clayton, Gumbel, Frank, A1, and A2, ensuring its general applicability across the entire family. Extensive simulation studies demonstrate that the IGNIS Network reduces estimation errors compared to MoM, while inherently enforcing parameter constraints through theory-guided post-processing. We further validate the practical utility of our method on diverse real-world datasets, including financial returns (AAPL-MSFT), healthcare metrics (CDC Diabetes indicators), and environmental measurements (PM2.5 air quality). Our results underscore the transformative potential of neural methods for robust and accurate dependence modeling in modern applications.
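
For context, the classical Method of Moments baseline inverts a known relation between Kendall's tau and the copula parameter. The sketch below covers only the Clayton and Gumbel families, whose relations are closed-form and monotone; it is not the IGNIS network, and the helper names are hypothetical.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau-a by direct pair counting, O(n^2); fine for small samples."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    n = len(x)
    # The sum counts every unordered pair twice, matching the n * (n - 1) divisor.
    return float(np.sum(dx * dy) / (n * (n - 1)))

def clayton_theta_from_tau(tau):
    # Clayton copula: tau = theta / (theta + 2)  =>  theta = 2 * tau / (1 - tau)
    return 2.0 * tau / (1.0 - tau)

def gumbel_theta_from_tau(tau):
    # Gumbel copula: tau = 1 - 1 / theta  =>  theta = 1 / (1 - tau)
    return 1.0 / (1.0 - tau)

tau_hat = kendall_tau([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])   # 8 concordant, 2 discordant pairs
theta_clayton = clayton_theta_from_tau(tau_hat)
```

When the tau-to-parameter map is not monotone, as the abstract notes for A1, this inversion is no longer unique, which is the gap a learned estimator is meant to fill.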

Article 488

Title@2025-05-28 (3): Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers?

Title: Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers?

Kolmogorov-Arnold Achtung: Ist erlernbare Aufmerksamkeit besser für Vision Transformer?

科尔莫戈罗夫-阿诺尔德关注:对愿景转变者来说,学习关注是否更好? 2503.10632v2

Authors: Subhajit Maity, Killian Hitsman, Xin Li, Aritra Dutta

Kolmogorov-Arnold networks (KANs) are a remarkable innovation consisting of learnable activation functions with the potential to capture more complex relationships from data. Presently, KANs are deployed by replacing multilayer perceptrons (MLPs) in deep networks, including advanced architectures such as vision Transformers (ViTs). This work asks whether a similar replacement in the attention can bring benefits. In this paper, we design the first learnable attention called Kolmogorov-Arnold Attention (KArAt) for ViTs that can operate on any basis, ranging from Fourier, Wavelets, Splines, to Rational Functions. However, learnable activations in attention cause a memory explosion. To remedy this, we propose a modular version of KArAt that uses a low-rank approximation. By adopting the Fourier basis, Fourier-KArAt and its variants, in some cases, outperform their traditional softmax counterparts, or show comparable performance on CIFAR-10, CIFAR-100, and ImageNet-1K datasets. We also deploy Fourier KArAt to ConViT and Swin-Transformer, and use it in detection and segmentation with ViT-Det. We dissect these architectures’ performance by analyzing their loss landscapes, weight distributions, optimizer path, attention visualization, and transferability to other datasets. KArAt’s learnable activation shows a better attention score across all ViTs, indicating better token-to-token interactions, contributing to better inference. Still, its generalizability does not scale with larger ViTs. However, many factors, including the present computing interface, affect the performance of parameter- and memory-heavy KArAts. We note that the goal of this paper is not to produce efficient attention or challenge the traditional activations; by designing KArAt, we are the first to show that attention can be learned and encourage researchers to explore KArAt in conjunction with more advanced architectures.

Article 489

Title@2025-05-28 (3): Accelerating Optimization via Differentiable Stopping Time

Title: Accelerating Optimization via Differentiable Stopping Time

Beschleunigung der Optimierung durch differenzierbare Stoppzeit

通过有区别的停止时间加速优化 2505.22509v1

Authors: Zhonglin Xie, Yiman Fong, Haoran Yuan, Zaiwen Wen

Optimization is an important module of modern machine learning applications. Tremendous efforts have been made to accelerate optimization algorithms. A common formulation is achieving a lower loss at a given time. This enables a differentiable framework with respect to the algorithm hyperparameters. In contrast, its dual, minimizing the time to reach a target loss, is believed to be non-differentiable, as the time is not differentiable. As a result, it usually serves as a conceptual framework or is optimized using zeroth-order methods. To address this limitation, we propose a differentiable stopping time and theoretically justify it based on differential equations. An efficient algorithm is designed to backpropagate through it. As a result, the proposed differentiable stopping time enables a new differentiable formulation for accelerating algorithms. We further discuss its applications, such as online hyperparameter tuning and learning to optimize. Our proposed methods show superior performance in comprehensive experiments across various problems, which confirms their effectiveness.
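
One way to see why a stopping time can have a well-defined derivative is the implicit function theorem applied to the level-set condition loss(T, eta) = eps. The sketch below works this out on a toy gradient flow with a closed-form solution and checks it against finite differences. The problem, constants, and function names are assumptions for illustration; this is not the paper's algorithm.

```python
import numpy as np

A, X0, EPS = 2.0, 1.0, 1e-3   # assumed curvature, start point, target loss

def loss_at(t, eta):
    # Loss along the gradient flow dx/dt = -eta * A * x for f(x) = 0.5 * A * x^2.
    return 0.5 * A * X0**2 * np.exp(-2.0 * eta * A * t)

def stopping_time(eta):
    # First time the loss hits EPS (closed form for this toy problem).
    return np.log(0.5 * A * X0**2 / EPS) / (2.0 * eta * A)

def stopping_time_grad(eta):
    # Implicit function theorem on loss_at(T, eta) = EPS:
    # dT/d_eta = -(d loss / d eta) / (d loss / d t), evaluated at t = T.
    t = stopping_time(eta)
    dloss_deta = -2.0 * A * t * loss_at(t, eta)
    dloss_dt = -2.0 * eta * A * loss_at(t, eta)
    return -dloss_deta / dloss_dt

eta = 0.3
h = 1e-6
finite_diff = (stopping_time(eta + h) - stopping_time(eta - h)) / (2 * h)
```

The derivative is negative, as expected: a larger step-size parameter reaches the target loss sooner, and its magnitude tells a tuner how much time a small change in eta would save.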

Article 490

Title@2025-05-28 (3): Closed-Form Training Dynamics Reveal Learned Features and Linear Structure in Word2Vec-like Models

Title: Closed-Form Training Dynamics Reveal Learned Features and Linear Structure in Word2Vec-like Models

Closed-Form Training Dynamics Reveal Erlernte Funktionen und lineare Struktur in Word2Vec-ähnlichen Modellen

类似Word2Vec 模型中的封闭形式培训动态观测发现特性和线形结构 2502.09863v2

Authors: Dhruva Karkada, James B. Simon, Yasaman Bahri, Michael R. DeWeese

Self-supervised word embedding algorithms such as word2vec provide a minimal setting for studying representation learning in language modeling. We examine the quartic Taylor approximation of the word2vec loss around the origin, and we show that both the resulting training dynamics and the final performance on downstream tasks are empirically very similar to those of word2vec. Our main contribution is to analytically solve for both the gradient flow training dynamics and the final word embeddings in terms of only the corpus statistics and training hyperparameters. The solutions reveal that these models learn orthogonal linear subspaces one at a time, each one incrementing the effective rank of the embeddings until model capacity is saturated. Training on Wikipedia, we find that each of the top linear subspaces represents an interpretable topic-level concept. Finally, we apply our theory to describe how linear representations of more abstract semantic concepts emerge during training; these can be used to complete analogies via vector addition.

Article 491

Title@2025-05-28 (3): Sparsification and Reconstruction from the Perspective of Representation Geometry

Title: Sparsification and Reconstruction from the Perspective of Representation Geometry

Sparsifikation und Rekonstruktion aus Sicht der Repräsentationsgeometrie

从代表制角度看分解与重建 2505.22506v1

Authors: Wenjie Sun, Bingzhe Wu, Zhile Yang, Chengke Wu

Sparse Autoencoders (SAEs) have emerged as a predominant tool in mechanistic interpretability, aiming to identify interpretable monosemantic features. However, how does sparse encoding organize the representations of activation vectors from language models? What is the relationship between this organizational paradigm and feature disentanglement as well as reconstruction performance? To address these questions, we propose the SAEMA, which validates the stratified structure of the representation by observing the variability of the rank of the symmetric semipositive definite (SSPD) matrix corresponding to the modal tensor unfolded along the latent tensor with the level of noise added to the residual stream. To systematically investigate how sparse encoding alters representational structures, we define local and global representations, demonstrating that they amplify inter-feature distinctions by merging similar semantic features and introducing additional dimensionality. Furthermore, we intervene on the global representation from an optimization perspective, proving a significant causal relationship between its separability and the reconstruction performance. This study explains the principles of sparsity from the perspective of representational geometry and demonstrates the impact of changes in representational structure on reconstruction performance. In particular, it emphasizes the necessity of understanding representations and incorporating representational constraints, providing empirical references for developing new interpretable tools and improving SAEs. The code is available at https://github.com/wenjie1835/SAERepGeo.

Article 492

Title@2025-05-28 (3): Geometric GNNs for Charged Particle Tracking at GlueX

Title: Geometric GNNs for Charged Particle Tracking at GlueX

Geometrische GNNs für geladene Partikelverfolgung bei GlueX

GNNs 用于凝胶X充电粒子跟踪的几何 GNNs 2505.22504v1

Authors: Ahmed Hossam Mohammed, Kishansingh Rajput, Simon Taylor, Denis Furletov, Sergey Furletov, Malachi Schram

Nuclear physics experiments are aimed at uncovering the fundamental building blocks of matter. The experiments involve high-energy collisions that produce complex events with many particle trajectories. Tracking charged particles resulting from collisions in the presence of a strong magnetic field is critical to enable the reconstruction of particle trajectories and precise determination of interactions. It is traditionally achieved through combinatorial approaches that scale worse than linearly as the number of hits grows. Since particle hit data naturally form a 3-dimensional point cloud and can be structured as graphs, Graph Neural Networks (GNNs) emerge as an intuitive and effective choice for this task. In this study, we evaluate the GNN model for track finding on the data from the GlueX experiment at Jefferson Lab. We use simulation data to train the model and test on both simulation and real GlueX measurements. We demonstrate that GNN-based track finding outperforms the currently used traditional method at GlueX in terms of segment-based efficiency at a fixed purity while providing faster inferences. We show that the GNN model can achieve significant speedup by processing multiple events in batches, which exploits the parallel computation capability of Graphics Processing Units (GPUs). Finally, we compare the GNN implementation on GPU and FPGA and describe the trade-off.

Article 493

Title@2025-05-28 (3): Assessing Quantum Advantage for Gaussian Process Regression

Title: Assessing Quantum Advantage for Gaussian Process Regression

Bewertung des Quantenvorteils für Gaussian Process Regression

评估高山进程倒退的量度优势 2505.22502v1

Authors: Dominic Lowe, M. S. Kim, Roberto Bondesan

Gaussian Process Regression is a well-known machine learning technique for which several quantum algorithms have been proposed. We show here that in a wide range of scenarios these algorithms show no exponential speedup. We achieve this by rigorously proving that the condition number of a kernel matrix scales at least linearly with the matrix size under general assumptions on the data and kernel. We additionally prove that the sparsity and Frobenius norm of a kernel matrix scale linearly under similar assumptions. The implications for the quantum algorithms’ runtime are independent of the complexity of loading classical data on a quantum computer and also apply to dequantised algorithms. We supplement our theoretical analysis with numerical verification for popular kernels in machine learning.
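
The growth of the condition number with matrix size is easy to observe numerically. The sketch below uses a Laplace (exponential) kernel on an evenly spaced grid in [0, 1]; the kernel, grid, and length scale are assumptions chosen for a stable toy experiment and are not the paper's setup.

```python
import numpy as np

def laplace_kernel_condition_number(n, length_scale=1.0):
    """Condition number of K_ij = exp(-|x_i - x_j| / length_scale) on a grid in [0, 1]."""
    x = np.linspace(0.0, 1.0, n)
    kernel = np.exp(-np.abs(x[:, None] - x[None, :]) / length_scale)
    return float(np.linalg.cond(kernel))

sizes = [10, 20, 40]
conds = [laplace_kernel_condition_number(n) for n in sizes]
```

In this toy setting the condition number exceeds the matrix size and keeps growing as points are added. Since quantum linear-system solvers generally have runtimes that grow with the condition number, this kind of scaling erodes the hoped-for speedup.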

Article 494

Title@2025-05-28 (3): Novelty Detection in Reinforcement Learning with World Models

Title: Novelty Detection in Reinforcement Learning with World Models

Neuheitserkennung im Verstärkungslernen mit Weltmodellen

利用世界模式加强学习新颖发现 2310.08731v4

Authors: Geigh Zollicoffer, Kenneth Eaton, Jonathan Balloch, Julia Kim, Wei Zhou, Robert Wright, Mark O. Riedl

Reinforcement learning (RL) using world models has found significant recent successes. However, when a sudden change to world mechanics or properties occurs, agent performance and reliability can decline dramatically. We refer to the sudden change in visual properties or state transitions as novelties. Implementing novelty detection within generated world model frameworks is a crucial task for protecting the agent when deployed. In this paper, we propose straightforward bounding approaches to incorporate novelty detection into world model RL agents, by utilizing the misalignment of the world model’s hallucinated states and the true observed states as an anomaly score. We provide effective approaches to detecting novelties in a distribution of transitions learned by an agent in a world model. Finally, we show the advantage of our work in a novel environment compared to traditional machine learning novelty detection methods as well as currently accepted RL-focused novelty detection algorithms.
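The idea of using the mismatch between hallucinated and observed states as an anomaly score can be sketched as follows. This is a minimal illustration, not the authors' method. The Euclidean distance, the mean-plus-k-standard-deviations bound, and all function names are assumptions introduced here.

```python
import math
from statistics import mean, stdev

def anomaly_score(predicted_state, observed_state):
    """Euclidean distance between the world model's predicted state and the observation."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted_state, observed_state)))

def calibrate_bound(nominal_scores, k=3.0):
    """Upper bound on scores collected from nominal (novelty-free) transitions.

    The mean + k * stdev rule is an assumed calibration, chosen for simplicity.
    """
    return mean(nominal_scores) + k * stdev(nominal_scores)

def is_novel(predicted_state, observed_state, bound):
    """Flag a transition whose score exceeds the calibrated bound."""
    return anomaly_score(predicted_state, observed_state) > bound

if __name__ == "__main__":
    bound = calibrate_bound([0.10, 0.12, 0.09, 0.11])
    print(is_novel([0.0, 0.0], [0.05, 0.05], bound))  # small mismatch
    print(is_novel([0.0, 0.0], [1.0, 1.0], bound))    # large mismatch
```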

Article 495

Title@2025-05-28 (3): ProSpero: Active Learning for Robust Protein Design Beyond Wild-Type Neighborhoods

Title: ProSpero: Active Learning for Robust Protein Design Beyond Wild-Type Neighborhoods

ProSpero: Aktives Lernen für robustes Proteindesign jenseits von Wild-Typ-Nachbarschaften

ProSpero:在野生部落邻里以外积极学习巨型蛋白设计 2505.22494v1

Authors: Michal Kmicikiewicz, Vincent Fortuin, Ewa Szczurek

Designing protein sequences of both high fitness and novelty is a challenging task in data-efficient protein engineering. Exploration beyond wild-type neighborhoods often leads to biologically implausible sequences or relies on surrogate models that lose fidelity in novel regions. Here, we propose ProSpero, an active learning framework in which a frozen pre-trained generative model is guided by a surrogate updated from oracle feedback. By integrating fitness-relevant residue selection with biologically-constrained Sequential Monte Carlo sampling, our approach enables exploration beyond wild-type neighborhoods while preserving biological plausibility. We show that our framework remains effective even when the surrogate is misspecified. ProSpero consistently outperforms or matches existing methods across diverse protein engineering tasks, retrieving sequences of both high fitness and novelty.

Article 496

Title@2025-05-28 (3): Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation

Title: Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation

Entmystifizierung des Paradoxon der wichtigen Probenahme mit einer geschätzten historisch-nachfolgenden Verhaltenspolitik in der Off-Policy-Bewertung

以非政策评价中的估计历史依赖者行为政策来解开重要性抽样反常现象的神秘化 2505.22492v1

Authors: Hongyi Zhou, Josiah P. Hanna, Jin Zhu, Ying Yang, Chengchun Shi

This paper studies off-policy evaluation (OPE) in reinforcement learning with a focus on behavior policy estimation for importance sampling. Prior work has shown empirically that estimating a history-dependent behavior policy can lead to lower mean squared error (MSE) even when the true behavior policy is Markovian. However, the question of why the use of history should lower MSE remains open. In this paper, we theoretically demystify this paradox by deriving a bias-variance decomposition of the MSE of ordinary importance sampling (IS) estimators, demonstrating that history-dependent behavior policy estimation decreases their asymptotic variances while increasing their finite-sample biases. Additionally, as the estimated behavior policy conditions on a longer history, we show a consistent decrease in variance. We extend these findings to a range of other OPE estimators, including the sequential IS estimator, the doubly robust estimator and the marginalized IS estimator, with the behavior policy estimated either parametrically or non-parametrically.
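To make the setting concrete, here is a small sketch of ordinary importance sampling with a count-based estimate of the behavior policy, where the estimate may condition on a window of past state-action pairs. The tabular estimator, the undiscounted return, and the names `context`, `estimate_behavior_policy`, and `ordinary_is` are assumptions for illustration. The paper's estimators and analysis are more general.

```python
from collections import Counter

def context(traj, t, history_len):
    """Current state plus the last `history_len` (state, action) pairs."""
    start = max(0, t - history_len)
    history = tuple((s, a) for s, a, _ in traj[start:t])
    return (history, traj[t][0])

def estimate_behavior_policy(trajectories, history_len=0):
    """Count-based estimate of pi_b(a | context) from the logged data itself."""
    joint, marginal = Counter(), Counter()
    for traj in trajectories:
        for t, (_s, a, _r) in enumerate(traj):
            ctx = context(traj, t, history_len)
            joint[(ctx, a)] += 1
            marginal[ctx] += 1

    def pi_b(ctx, a):
        return joint[(ctx, a)] / marginal[ctx]

    return pi_b

def ordinary_is(trajectories, target_policy, history_len=0):
    """Ordinary IS estimate: mean of (product of ratios) * (undiscounted return)."""
    pi_b = estimate_behavior_policy(trajectories, history_len)
    total = 0.0
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            weight *= target_policy(s, a) / pi_b(context(traj, t, history_len), a)
            ret += r
        total += weight * ret
    return total / len(trajectories)
```

Setting `history_len=0` gives a Markovian estimate. Larger values make the estimated policy condition on more history, which is the comparison the abstract analyzes.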

Article 497

Title@2025-05-28 (3): On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling

Title: On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling

Über die überraschende Wirksamkeit großer Lernraten unter Standardbreitenskalierung

根据标准宽宽度比例扩大的大型学习率的惊人效果 2505.22491v1

Authors: Moritz Haas, Sebastian Bordt, Ulrike von Luxburg, Leena Chennuru Vankadara

The dominant paradigm for training large-scale vision and language models is He initialization and a single global learning rate (standard parameterization, SP). Despite its practical success, standard parameterization remains poorly understood from a theoretical perspective: Existing infinite-width theory would predict instability under large learning rates and vanishing feature learning under stable learning rates. However, empirically optimal learning rates consistently decay much slower than theoretically predicted. By carefully studying neural network training dynamics, we demonstrate that this discrepancy is not fully explained by finite-width phenomena such as catapult effects or a lack of alignment between weights and incoming activations. We instead show that the apparent contradiction can be fundamentally resolved by taking the loss function into account: In contrast to Mean Squared Error (MSE) loss, we prove that under cross-entropy (CE) loss, an intermediate controlled divergence regime emerges, where logits diverge but loss, gradients, and activations remain stable. Stable training under large learning rates enables persistent feature evolution at scale in all hidden layers, which is crucial for the practical success of SP. In experiments across optimizers (SGD, Adam), architectures (MLPs, GPT) and data modalities (vision, language), we validate that neural networks operate in this controlled divergence regime under CE loss but not under MSE loss. Our empirical evidence suggests that width-scaling considerations are surprisingly useful for predicting empirically optimal learning rate exponents. Finally, our analysis clarifies the effectiveness and limitations of recently proposed layerwise learning rate scalings for standard initialization.

Article 498

Title@2025-05-28 (3): Understanding Adversarial Training with Energy-based Models

Title: Understanding Adversarial Training with Energy-based Models

Verständnis von Adversarial Training mit energiebasierten Modellen

与基于能源模式的对等培训的谅解 2505.22486v1

Authors: Mujtaba Hussain Mirza, Maria Rosaria Briglia, Filippo Bartolucci, Senad Beadini, Giuseppe Lisanti, Iacopo Masi

We aim to use the Energy-based Model (EBM) framework to better understand adversarial training (AT) in classifiers, and additionally to analyze the intrinsic generative capabilities of robust classifiers. By viewing standard classifiers through an energy lens, we begin by analyzing how the energies of adversarial examples, generated by various attacks, differ from those of the natural samples. The central focus of our work is to understand the critical phenomena of Catastrophic Overfitting (CO) and Robust Overfitting (RO) in AT from an energy perspective. We analyze the impact of existing AT approaches on the energy of samples during training and observe that the behavior of the “delta energy” – the change in energy between an original sample and its adversarial counterpart – diverges significantly when CO or RO occurs. After a thorough analysis of these energy dynamics and their relationship with overfitting, we propose a novel regularizer, the Delta Energy Regularizer (DER), designed to smoothen the energy landscape during training. We demonstrate that DER is effective in mitigating both CO and RO across multiple benchmarks. We further show that robust classifiers, when used as generative models, have limits in handling the trade-off between image quality and variability. We propose an improved technique based on a local class-wise principal component analysis (PCA) and energy-based guidance for better class-specific initialization and adaptive stopping, enhancing sample diversity and generation quality. Considering that we do not explicitly train for generative modeling, we achieve a competitive Inception Score (IS) and Fréchet inception distance (FID) compared to hybrid discriminative-generative models.
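The energy view of a classifier can be sketched in a few lines. The code below assumes the common convention that the marginal energy of an input is the negative log-sum-exp of its logits, and that the delta energy is the adversarial energy minus the clean energy. Both the convention and the sign are assumptions, since the abstract does not give the formulas, and the function names are hypothetical.

```python
import math

def energy(logits):
    """Marginal energy E(x) = -logsumexp(logits), computed in a numerically stable way."""
    m = max(logits)
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))

def delta_energy(logits_clean, logits_adv):
    """Change in energy between a clean sample and its adversarial counterpart.

    Sign convention (adversarial minus clean) is an assumption.
    """
    return energy(logits_adv) - energy(logits_clean)

if __name__ == "__main__":
    clean = [4.0, 0.5, -1.0]  # hypothetical logits for a clean input
    adv = [1.0, 1.2, 0.8]     # hypothetical logits after an attack
    print(delta_energy(clean, adv))
```

Tracking this quantity per batch during adversarial training is one way to observe the divergence that the abstract associates with CO and RO.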

Article 499

Title@2025-05-28 (3): Intrinsic User-Centric Interpretability through Global Mixture of Experts

Title: Intrinsic User-Centric Interpretability through Global Mixture of Experts

Intrinsische Benutzer-Centric-Interpretability durch globale Mischung von Experten

通过全球专家混合解释 2402.02933v4

Authors: Vinitra Swamy, Syrielle Montariol, Julian Blackwell, Jibril Frej, Martin Jaggi, Tanja Käser

In human-centric settings like education or healthcare, model accuracy and model explainability are key factors for user adoption. Towards these two goals, intrinsically interpretable deep learning models have gained popularity, focusing on accurate predictions alongside faithful explanations. However, there exists a gap in the human-centeredness of these approaches, which often produce nuanced and complex explanations that are not easily actionable for downstream users. We present InterpretCC (interpretable conditional computation), a family of intrinsically interpretable neural networks at a unique point in the design space that optimizes for ease of human understanding and explanation faithfulness, while maintaining comparable performance to state-of-the-art models. InterpretCC achieves this through adaptive sparse activation of features before prediction, allowing the model to use a different, minimal set of features for each instance. We extend this idea into an interpretable, global mixture-of-experts (MoE) model that allows users to specify topics of interest, discretely separates the feature space for each data point into topical subnetworks, and adaptively and sparsely activates these topical subnetworks for prediction. We apply InterpretCC for text, time series and tabular data across several real-world datasets, demonstrating comparable performance with non-interpretable baselines and outperforming intrinsically interpretable baselines. Through a user study involving 56 teachers, InterpretCC explanations are found to have higher actionability and usefulness over other intrinsically interpretable approaches.

Article 500

Title@2025-05-28 (3): A Closer Look at Multimodal Representation Collapse

Title: A Closer Look at Multimodal Representation Collapse

Ein genauerer Blick auf multimodale Darstellungskollaps

更仔细地审视多模式代表制的崩溃 2505.22483v1

Authors: Abhra Chaudhuri, Anjan Dutta, Tu Bui, Serban Georgescu

We aim to develop a fundamental understanding of modality collapse, a recently observed empirical phenomenon wherein models trained for multimodal fusion tend to rely only on a subset of the modalities, ignoring the rest. We show that modality collapse happens when noisy features from one modality are entangled, via a shared set of neurons in the fusion head, with predictive features from another, effectively masking out positive contributions from the predictive features of the former modality and leading to its collapse. We further prove that cross-modal knowledge distillation implicitly disentangles such representations by freeing up rank bottlenecks in the student encoder, denoising the fusion-head outputs without negatively impacting the predictive features from either modality. Based on the above findings, we propose an algorithm that prevents modality collapse through explicit basis reallocation, with applications in dealing with missing modalities. Extensive experiments on multiple multimodal benchmarks validate our theoretical claims. Project page: https://abhrac.github.io/mmcollapse/.

Article 501

Title@2025-05-28 (3): Hypothesis Testing in Imaging Inverse Problems

Title: Hypothesis Testing in Imaging Inverse Problems

Hypothesenprüfung in bildgebenden Inversen Problemen

想象反反问题假设测试 2505.22481v1

Authors: Yiming Xi, Konstantinos Zygalakis, Marcelo Pereyra

This paper proposes a framework for semantic hypothesis testing tailored to imaging inverse problems. Modern imaging methods struggle to support hypothesis testing, a core component of the scientific method that is essential for the rigorous interpretation of experiments and robust interfacing with decision-making processes. There are three main reasons why image-based hypothesis testing is challenging. First, the difficulty of using a single observation to simultaneously reconstruct an image, formulate hypotheses, and quantify their statistical significance. Second, the hypotheses encountered in imaging are mostly of semantic nature, rather than quantitative statements about pixel values. Third, it is challenging to control test error probabilities because the null and alternative distributions are often unknown. Our proposed approach addresses these difficulties by leveraging concepts from self-supervised computational imaging, vision-language models, and non-parametric hypothesis testing with e-values. We demonstrate our proposed framework through numerical experiments related to image-based phenotyping, where we achieve excellent power while robustly controlling Type I errors.

Article 502

Title@2025-05-28 (3): Position: Don’t Use the CLT in LLM Evals With Fewer Than a Few Hundred Datapoints

Title: Position: Don’t Use the CLT in LLM Evals With Fewer Than a Few Hundred Datapoints

Position: Verwenden Sie den CLT nicht in LLM-Evalen mit weniger als ein paar hundert Datenpunkten

位置: 不要在LLM Evals中使用 CLT, 其数据点小于几百个数据点 2503.01747v3

Authors: Sam Bowyer, Laurence Aitchison, Desi R. Ivanova

Rigorous statistical evaluations of large language models (LLMs), including valid error bars and significance testing, are essential for meaningful and reliable performance assessment. Currently, when such statistical measures are reported, they typically rely on the Central Limit Theorem (CLT). In this position paper, we argue that while CLT-based methods for uncertainty quantification are appropriate when benchmarks consist of thousands of examples, they fail to provide adequate uncertainty estimates for LLM evaluations that rely on smaller, highly specialized benchmarks. In these small-data settings, we demonstrate that CLT-based methods perform very poorly, usually dramatically underestimating uncertainty (i.e. producing error bars that are too small). We give recommendations for alternative frequentist and Bayesian methods that are both easy to implement and more appropriate in these increasingly common scenarios. We provide a simple Python library for these Bayesian methods at https://github.com/sambowyer/bayes_evals .
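The failure mode is easy to reproduce on a small benchmark. The sketch below compares a CLT-based interval with a Wilson score interval and a Beta-posterior credible interval under a uniform prior. These two alternatives are stand-ins chosen here. The abstract does not say which specific methods the authors recommend, and this code does not use or imitate the `bayes_evals` library.

```python
import math
import random

Z95 = 1.959963984540054  # 97.5th percentile of the standard normal

def clt_interval(correct, n, z=Z95):
    """Normal-approximation interval for an accuracy of correct / n."""
    p = correct / n
    half = z * math.sqrt(p * (1.0 - p) / n)
    return p - half, p + half

def wilson_interval(correct, n, z=Z95):
    """Wilson score interval, a frequentist alternative for binomial proportions."""
    p = correct / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

def beta_credible_interval(correct, n, level=0.95, draws=20000, seed=0):
    """Monte Carlo credible interval from a Beta(1 + correct, 1 + n - correct) posterior."""
    rng = random.Random(seed)
    a, b = 1 + correct, 1 + n - correct
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lo = samples[int((1 - level) / 2 * draws)]
    hi = samples[int((1 + level) / 2 * draws) - 1]
    return lo, hi

if __name__ == "__main__":
    # A model that answers all 20 questions of a small benchmark correctly.
    print(clt_interval(20, 20))
    print(wilson_interval(20, 20))
    print(beta_credible_interval(20, 20))
```

With 20 out of 20 correct, the CLT interval collapses to zero width, while the other two still report uncertainty about the true accuracy.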

Article 503

Title@2025-05-28 (3): Non-Asymptotic Analysis of (Sticky) Track-and-Stop

Title: Non-Asymptotic Analysis of (Sticky) Track-and-Stop

Nicht-asymptotische Analyse von (Sticky) Track-and-Stop

对(Sticky)轨道和停止的非症状分析 2505.22475v1

Authors: Riccardo Poiani, Martino Bernasconi, Andrea Celli

In pure exploration problems, a statistician sequentially collects information to answer a question about some stochastic and unknown environment. The probability of returning a wrong answer should not exceed a maximum risk parameter $\delta$, and good algorithms make as few queries to the environment as possible. The Track-and-Stop algorithm is a pioneering method to solve these problems. Specifically, it is well-known that it enjoys asymptotically optimal sample complexity guarantees for $\delta\to 0$ whenever the map from the environment to its correct answers is single-valued (e.g., best-arm identification with a unique optimal arm). The Sticky Track-and-Stop algorithm extends these results to settings where, for each environment, there might exist multiple correct answers (e.g., $\epsilon$-optimal arm identification). Although both methods are optimal in the asymptotic regime, their non-asymptotic guarantees remain unknown. In this work, we fill this gap and provide non-asymptotic guarantees for both algorithms.

Article 504

Title@2025-05-28 (3): Bridging Language, Vision and Action: Multimodal VAEs in Robotic Manipulation Tasks

Title: Bridging Language, Vision and Action: Multimodal VAEs in Robotic Manipulation Tasks

Überbrückung von Sprache, Vision und Aktion: Multimodale VAE in Robotermanipulationsaufgaben

架桥语言、愿景和行动:机器人操纵任务中的多式机动性 2404.01932v2

Authors: Gabriela Sejnova, Michal Vavrecka, Karla Stepanova

In this work, we focus on unsupervised vision-language-action mapping in the area of robotic manipulation. Recently, multiple approaches employing pre-trained large language and vision models have been proposed for this task. However, they are computationally demanding and require careful fine-tuning of the produced outputs. A more lightweight alternative would be the implementation of multimodal Variational Autoencoders (VAEs) which can extract the latent features of the data and integrate them into a joint representation, as has been demonstrated mostly on image-image or image-text data for the state-of-the-art models. Here we explore whether and how multimodal VAEs can be employed in unsupervised robotic manipulation tasks in a simulated environment. Based on the obtained results, we propose a model-invariant training alternative that improves the models’ performance in a simulator by up to 55%. Moreover, we systematically evaluate the challenges raised by the individual tasks such as object or robot position variability, number of distractors or the task length. Our work thus also sheds light on the potential benefits and limitations of using the current multimodal VAEs for unsupervised learning of robotic motion trajectories based on vision and language.

Article 505

Title@2025-05-28 (3): Forecasting Multivariate Urban Data via Decomposition and Spatio-Temporal Graph Analysis

Title: Forecasting Multivariate Urban Data via Decomposition and Spatio-Temporal Graph Analysis

Voraussichtliche Multivariate Stadtdaten durch Zersetzung und räumlich-Temporale Graphenanalyse

通过分解和时空空间图分析预测多变量城市数据 2505.22474v1

Authors: Amirhossein Sohrabbeig, Omid Ardakanian, Petr Musilek

The forecasting of multivariate urban data presents a complex challenge due to the intricate dependencies between various urban metrics such as weather, air pollution, carbon intensity, and energy demand. This paper introduces a novel multivariate time-series forecasting model that utilizes advanced Graph Neural Networks (GNNs) to capture spatial dependencies among different time-series variables. The proposed model incorporates a decomposition-based preprocessing step, isolating trend, seasonal, and residual components to enhance the accuracy and interpretability of forecasts. By leveraging the dynamic capabilities of GNNs, the model effectively captures interdependencies and improves the forecasting performance. Extensive experiments on real-world datasets, including electricity usage, weather metrics, carbon intensity, and air pollution data, demonstrate the effectiveness of the proposed approach across various forecasting scenarios. The results highlight the potential of the model to optimize smart infrastructure systems, contributing to energy-efficient urban development and enhanced public well-being.
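The decomposition-based preprocessing step can be illustrated with a classical additive decomposition. The abstract does not say which decomposition the authors use, so the moving-average trend, the per-phase seasonal means, and the function name `decompose` are assumptions made for this sketch.

```python
def decompose(series, period):
    """Additive decomposition: series = trend + seasonal + residual.

    Trend: moving average over a window of about one period (shrunk at the edges).
    Seasonal: mean of the detrended values at each phase of the period.
    Requires len(series) >= period.
    """
    n = len(series)
    half = period // 2
    trend = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    detrended = [x - t for x, t in zip(series, trend)]
    seasonal_means = []
    for phase in range(period):
        values = detrended[phase::period]
        seasonal_means.append(sum(values) / len(values))
    seasonal = [seasonal_means[i % period] for i in range(n)]
    residual = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, residual

if __name__ == "__main__":
    # Hypothetical signal: slow upward drift plus a spike every 4 steps.
    demand = [0.1 * i + (1.0 if i % 4 == 0 else 0.0) for i in range(16)]
    trend, seasonal, residual = decompose(demand, period=4)
    print(seasonal[:4])
```

In a pipeline like the one described, each component could then be forecast separately or passed to the graph model as an additional input channel. That wiring is not specified in the abstract.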

Article 506

Title@2025-05-28 (3): Pure Exploration with Infinite Answers

Title: Pure Exploration with Infinite Answers

Reine Exploration mit unendlichen Antworten

纯探索无无限答案 2505.22473v1

Authors: Riccardo Poiani, Martino Bernasconi, Andrea Celli

We study pure exploration problems where the set of correct answers is possibly infinite, e.g., the regression of any continuous function of the means of the bandit. We derive an instance-dependent lower bound for these problems. By analyzing it, we discuss why existing methods (i.e., Sticky Track-and-Stop) for finite answer problems fail to be asymptotically optimal in this more general setting. Finally, we present a framework, Sticky-Sequence Track-and-Stop, which generalizes both Track-and-Stop and Sticky Track-and-Stop and enjoys asymptotic optimality. Due to its generality, our analysis also highlights special cases where existing methods enjoy optimality.

Article 507

Title@2025-05-28 (3): CPINN-ABPI: Physics-Informed Neural Networks for Accurate Power Estimation in MPSoCs

Title: CPINN-ABPI: Physics-Informed Neural Networks for Accurate Power Estimation in MPSoCs

CPINN-ABPI: Physik-informierte Neuralnetze für genaue Leistungsschätzung in MPSoCs

CPINN-ABPI: MPSoCs中精确功率估计物理内建神经网络 2505.22469v1

Authors: Mohamed R. Elshamy, Mehdi Elahi, Ahmad Patooghy, Abdel-Hameed A. Badawy

Efficient thermal and power management in modern multiprocessor systems-on-chip (MPSoCs) demands accurate power consumption estimation. One of the state-of-the-art approaches, Alternative Blind Power Identification (ABPI), theoretically eliminates the dependence on steady-state temperatures, addressing a major shortcoming of previous approaches. However, ABPI performance has remained unverified in actual hardware implementations. In this study, we conduct the first empirical validation of ABPI on commercial hardware using the NVIDIA Jetson Xavier AGX platform. Our findings reveal that, while ABPI provides computational efficiency and independence from steady-state temperature, it exhibits considerable accuracy deficiencies in real-world scenarios. To overcome these limitations, we introduce a novel approach that integrates Custom Physics-Informed Neural Networks (CPINNs) with the underlying thermal model of ABPI. Our approach employs a specialized loss function that harmonizes physical principles with data-driven learning, complemented by multi-objective genetic algorithm optimization to balance estimation accuracy and computational cost. In experimental validation, CPINN-ABPI reduces the mean absolute error (MAE) relative to ABPI by 84.7% on the CPU and 73.9% on the GPU, with the weighted mean absolute percentage error (WMAPE) improving from 47%–81% to about 12%. The method maintains real-time performance with 195.3 $\mu$s of inference time, with similar 85%–99% accuracy gains across heterogeneous SoCs.

Article 508

Title@2025-05-28 (3): FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation

Title: FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation

FitCF: Ein Framework für die automatische Feature-Importanz-geführte kontrafaktische Beispielgenerierung

FitCF: 自动地物、重要引导反事实实例生成框架 2501.00777v3

Authors: Qianli Wang, Nils Feldhus, Simon Ostermann, Luis Felipe Villa-Arenas, Sebastian Möller, Vera Schmitt

Counterfactual examples are widely used in natural language processing (NLP) as valuable data to improve models, and in explainable artificial intelligence (XAI) to understand model behavior. The automated generation of counterfactual examples remains a challenging task even for large language models (LLMs), despite their impressive performance on many tasks. In this paper, we first introduce ZeroCF, a faithful approach for leveraging important words derived from feature attribution methods to generate counterfactual examples in a zero-shot setting. Second, we present a new framework, FitCF, which further verifies the aforementioned counterfactuals by label flip verification and then inserts them as demonstrations for few-shot prompting, outperforming two state-of-the-art baselines. Through ablation studies, we identify the importance of each of FitCF’s core components in improving the quality of counterfactuals, as assessed through flip rate, perplexity, and similarity measures. Furthermore, we show the effectiveness of LIME and Integrated Gradients as backbone attribution methods for FitCF and find that the number of demonstrations has the largest effect on performance. Finally, we reveal a strong correlation between the faithfulness of feature attribution scores and the quality of generated counterfactuals, which we hope will serve as an important finding for future research in this direction.

Article 509

Title@2025-05-28 (3): Embedding Safety into RL: A New Take on Trust Region Methods

Title: Embedding Safety into RL: A New Take on Trust Region Methods

Einbettung der Sicherheit in RL: Ein neuer Ansatz für Methoden der Vertrauensregion

将安全嵌入RL:信任区域方法的新做法 2411.02957v3

Authors: Nikola Milosevic, Johannes Müller, Nico Scherf

Reinforcement Learning (RL) agents can solve diverse tasks but often exhibit unsafe behavior. Constrained Markov Decision Processes (CMDPs) address this by enforcing safety constraints, yet existing methods either sacrifice reward maximization or allow unsafe training. We introduce Constrained Trust Region Policy Optimization (C-TRPO), which reshapes the policy space geometry to ensure trust regions contain only safe policies, guaranteeing constraint satisfaction throughout training. We analyze its theoretical properties and connections to TRPO, Natural Policy Gradient (NPG), and Constrained Policy Optimization (CPO). Experiments show that C-TRPO reduces constraint violations while maintaining competitive returns.

Article 510

Title@2025-05-28 (3): OptiMindTune: A Multi-Agent Framework for Intelligent Hyperparameter Optimization

Title: OptiMindTune: A Multi-Agent Framework for Intelligent Hyperparameter Optimization

OptiMindTune: Multi-Agenten-Framework für intelligente Hyperparameter-Optimierung

OptiMindTunne: 智能超参数优化的多机构框架 2505.19205v2

Authors: Meher Bhaskar Madiraju, Meher Sai Preetam Madiraju

Hyperparameter optimization (HPO) is a critical yet challenging aspect of machine learning model development, significantly impacting model performance and generalization. Traditional HPO methods often struggle with high dimensionality, complex interdependencies, and computational expense. This paper introduces OptiMindTune, a novel multi-agent framework designed to intelligently and efficiently optimize hyperparameters. OptiMindTune leverages the collaborative intelligence of three specialized AI agents – a Recommender Agent, an Evaluator Agent, and a Decision Agent – each powered by Google’s Gemini models. These agents address distinct facets of the HPO problem, from model selection and hyperparameter suggestion to robust evaluation and strategic decision-making. By fostering dynamic interactions and knowledge sharing, OptiMindTune aims to converge to optimal hyperparameter configurations more rapidly and robustly than existing single-agent or monolithic approaches. Our framework integrates principles from advanced large language models and adaptive search to achieve scalable and intelligent AutoML. We posit that this multi-agent paradigm offers a promising avenue for tackling the increasing complexity of modern machine learning model tuning.

Article 511

Title@2025-05-28 (3): Depth-Based Matrix Classification for the HHL Quantum Algorithm

Title: Depth-Based Matrix Classification for the HHL Quantum Algorithm

Tiefenbasierte Matrix-Klassifikation für den HHL-Quantenalgorithmus

HHL 量图算法的深度矩阵分类 2505.22454v1

Authors: Mark Danza, Sonia Lopez Alarcon, Cory Merkel

As the error-corrected era of quantum computing nears, it is necessary to understand the suitability of certain post-NISQ algorithms for practical problems. One of the most promising and applicable, yet difficult to implement in practice, is the Harrow, Hassidim and Lloyd (HHL) algorithm for linear systems of equations. An enormous number of problems can be expressed as linear systems of equations, from Machine Learning to fluid dynamics. However, in most cases, HHL will not be able to provide a practical, reasonable solution to these problems. This paper investigates whether problems can be labeled using Machine Learning classifiers as suitable or unsuitable for HHL implementation when some numerical information about the problem is known beforehand. This work demonstrates that training on significantly representative data distributions is critical to achieve good classifications of the problems based on the numerical properties of the matrix representing the system of equations. Accurate classification is possible through Multi-Layer Perceptrons, although it requires careful design of the training data distribution and classifier parameters.

Article 512

Title: Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Unüberwachte Nachschulung für Multi-Modal LLM Reasoning via GRPO

无人监督的多模式LLM通过GROPO进行多模式LLM进修培训后培训 2505.22453v1

Authors: Lai Wei, Yuting Li, Chen Wang, Yue Wang, Linghe Kong, Weiran Huang, Lichao Sun

Improving Multi-modal Large Language Models (MLLMs) in the post-training stage typically relies on supervised fine-tuning (SFT) or reinforcement learning (RL). However, these supervised methods require expensive and manually annotated multi-modal data, an ultimately unsustainable resource. While recent efforts have explored unsupervised post-training, their methods are complex and difficult to iterate. In this work, we are the first to investigate the use of GRPO, a stable and scalable online RL algorithm, for enabling continual self-improvement without any external supervision. We propose MM-UPT, a simple yet effective framework for unsupervised post-training of MLLMs. MM-UPT builds upon GRPO, replacing traditional reward signals with a self-rewarding mechanism based on majority voting over multiple sampled responses. Our experiments demonstrate that MM-UPT significantly improves the reasoning ability of Qwen2.5-VL-7B (e.g., 66.3% $\rightarrow$ 72.9% on MathVista, 62.9% $\rightarrow$ 68.7% on We-Math), using standard datasets without ground-truth labels. MM-UPT also outperforms prior unsupervised baselines and even approaches the results of supervised GRPO. Furthermore, we show that incorporating synthetic questions, generated solely by the MLLM itself, can boost performance as well, highlighting a promising approach for scalable self-improvement. Overall, MM-UPT offers a new paradigm for continual, autonomous enhancement of MLLMs in the absence of external supervision. Our code is available at https://github.com/waltonfuture/MM-UPT.
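The self-rewarding mechanism can be sketched as follows: sample several responses to one question, treat the most frequent final answer as a pseudo-label, and standardize the resulting rewards within the group as GRPO does. This is an illustration under assumptions, not the MM-UPT code. The binary reward, the tie-breaking rule, the population standard deviation, and both function names are choices made here.

```python
from collections import Counter
from statistics import mean, pstdev

def majority_vote_rewards(answers):
    """Reward 1.0 for responses that match the most common answer, else 0.0.

    Ties are broken in favor of the answer that appears first (Counter ordering).
    """
    majority, _count = Counter(answers).most_common(1)[0]
    return [1.0 if a == majority else 0.0 for a in answers]

def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize rewards within the sampled group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

if __name__ == "__main__":
    # Hypothetical final answers extracted from four sampled responses.
    sampled = ["12", "12", "7", "12"]
    rewards = majority_vote_rewards(sampled)
    print(rewards)
    print(group_relative_advantages(rewards))
```

When all sampled answers agree, every advantage is zero and the group contributes no gradient signal, which is a known property of group-standardized rewards rather than something stated in the abstract.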

Article 513

Title@2025-05-28 (3): Position: All Current Generative Fidelity and Diversity Metrics are Flawed

Title: Position: All Current Generative Fidelity and Diversity Metrics are Flawed

Position: Alle aktuellen Generativen Fidelity und Diversity Metrics sind abgeflacht

位置:所有当前产生分裂性和多样性 2505.22450v1

Authors: Ossi Räisä, Boris van Breugel, Mihaela van der Schaar

Any method’s development and practical application is limited by our ability to measure its reliability. The popularity of generative modeling emphasizes the importance of good synthetic data metrics. Unfortunately, previous works have found many failure cases in current metrics, for example lack of outlier robustness and unclear lower and upper bounds. We propose a list of desiderata for synthetic data metrics, and a suite of sanity checks: carefully chosen simple experiments that aim to detect specific and known generative modeling failure modes. Based on these desiderata and the results of our checks, we arrive at our position: all current generative fidelity and diversity metrics are flawed. This significantly hinders practical use of synthetic data. Our aim is to convince the research community to spend more effort in developing metrics, instead of models. Additionally, through analyzing how current metrics fail, we provide practitioners with guidelines on how these metrics should (not) be used.

Article 514

Title@2025-05-28 (3): SOReL and TOReL: Two Methods for Fully Offline Reinforcement Learning

Title: SOReL and TOReL: Two Methods for Fully Offline Reinforcement Learning

SOReL und TOReL: Zwei Methoden für vollständiges Offline-Verstärkungslernen

SOReL和TOReL: 完全脱线强化学习的两种方法 2505.22442v1

Authors: Mattie Fellows, Clarisse Wibault, Uljad Berdica, Johannes Forkel, Jakob N. Foerster, Michael A. Osborne

Sample efficiency remains a major obstacle for real world adoption of reinforcement learning (RL): success has been limited to settings where simulators provide access to essentially unlimited environment interactions, which in reality are typically costly or dangerous to obtain. Offline RL in principle offers a solution by exploiting offline data to learn a near-optimal policy before deployment. In practice, however, current offline RL methods rely on extensive online interactions for hyperparameter tuning, and have no reliable bound on their initial online performance. To address these two issues, we introduce two algorithms. Firstly, SOReL: an algorithm for safe offline reinforcement learning. Using only offline data, our Bayesian approach infers a posterior over environment dynamics to obtain a reliable estimate of the online performance via the posterior predictive uncertainty. Crucially, all hyperparameters are also tuned fully offline. Secondly, we introduce TOReL: a tuning for offline reinforcement learning algorithm that extends our information rate based offline hyperparameter tuning methods to general offline RL approaches. Our empirical evaluation confirms SOReL’s ability to accurately estimate regret in the Bayesian setting whilst TOReL’s offline hyperparameter tuning achieves competitive performance with the best online hyperparameter tuning methods using only offline data. Thus, SOReL and TOReL make a significant step towards safe and reliable offline RL, unlocking the potential for RL in the real world. Our implementations are publicly available: https://github.com/CWibault/sorel_torel.

Article 515

Title@2025-05-28 (3): Variational Positive-incentive Noise: How Noise Benefits Models

Title: Variational Positive-incentive Noise: How Noise Benefits Models

Variational Positiv-incentive Noise: Wie Lärm Vorteile Modelle

变化式积极积极激励噪音:如何创造噪音效益模式 2306.07651v2

Authors: Hongyuan Zhang, Sida Huang, Yubin Guo, Xuelong Li

A large number of works aim to alleviate the impact of noise due to an underlying conventional assumption of the negative role of noise. However, some existing works show that the assumption does not always hold. In this paper, we investigate how to benefit the classical models by random noise under the framework of Positive-incentive Noise (Pi-Noise). Since the ideal objective of Pi-Noise is intractable, we propose to optimize its variational bound instead, namely variational Pi-Noise (VPN). With the variational inference, a VPN generator implemented by neural networks is designed for enhancing base models and simplifying the inference of base models, without changing the architecture of base models. Benefiting from the independent design of base models and VPN generators, the VPN generator can work with most existing models. From the experiments, it is shown that the proposed VPN generator can improve the base models. It is appealing that the trained variational VPN generator prefers to blur the irrelevant ingredients in complicated images, which meets our expectations.

Article 516

Title@2025-05-28 (3): LAMBDA: A Large Model Based Data Agent

Title: LAMBDA: A Large Model Based Data Agent

LAMBDA: Ein großer modellbasierter Datenagent

LAMBDA:一个大型模型数据代理 2407.17535v3

Authors: Maojun Sun, Ruijian Han, Binyan Jiang, Houduo Qi, Defeng Sun, Yancheng Yuan, Jian Huang

We introduce LArge Model Based Data Agent (LAMBDA), a novel open-source, code-free multi-agent data analysis system that leverages the power of large language models. LAMBDA is designed to address data analysis challenges in data-driven applications through innovatively designed data agents using natural language. At the core of LAMBDA are two key agent roles: the programmer and the inspector, which are engineered to work together seamlessly. Specifically, the programmer generates code based on the user’s instructions and domain-specific knowledge, while the inspector debugs the code when necessary. To ensure robustness and handle adverse scenarios, LAMBDA features a user interface that allows direct user intervention. Moreover, LAMBDA can flexibly integrate external models and algorithms through our proposed Knowledge Integration Mechanism, catering to the needs of customized data analysis. LAMBDA has demonstrated strong performance on various data analysis tasks. It has the potential to enhance data analysis paradigms by seamlessly integrating human and artificial intelligence, making it more accessible, effective, and efficient for users from diverse backgrounds. The strong performance of LAMBDA in solving data analysis problems is demonstrated using real-world data examples. The code for LAMBDA is available at https://github.com/AMA-CMFAI/LAMBDA and videos of three case studies can be viewed at https://www.polyu.edu.hk/ama/cmfai/lambda.html.

Article 517

Title@2025-05-28 (3): Data-Driven Antenna Miniaturization: A Knowledge-Based System Integrating Quantum PSO and Predictive Machine Learning Models

Title: Data-Driven Antenna Miniaturization: A Knowledge-Based System Integrating Quantum PSO and Predictive Machine Learning Models

Datengetriebene Antenne Miniaturisierung: Ein wissensbasiertes System zur Integration von Quanten-PSO und vorausschauenden Machine Learning-Modellen

数据驱动天线微型化:以知识为基础的系统综合量子PSO和可预测性机器学习模型 2505.22440v1

Authors: Khan Masood Parvez, Sk Md Abidar Rahaman, Ali Shiri Sichani

The rapid evolution of wireless technologies necessitates automated design frameworks to address antenna miniaturization and performance optimization within constrained development cycles. This study demonstrates a machine-learning-enhanced workflow integrating Quantum-Behaved Dynamic Particle Swarm Optimization (QDPSO) with ANSYS HFSS simulations to accelerate antenna design. The QDPSO algorithm autonomously optimized loop dimensions in 11.53 seconds, achieving a resonance frequency of 1.4208 GHz, a 12.7 percent reduction compared to conventional 1.60 GHz designs. Machine learning models (SVM, Random Forest, XGBoost, and Stacked ensembles) predicted resonance frequencies in 0.75 seconds using 936 simulation datasets, with stacked models showing superior training accuracy ($R^2$=0.9825) and SVM demonstrating optimal validation performance ($R^2$=0.7197). The complete design cycle, encompassing optimization, prediction, and ANSYS validation, required 12.42 minutes on standard desktop hardware (Intel i5-8500, 16GB RAM), contrasting sharply with the 50-hour benchmark of PSADEA-based approaches. This roughly 240-fold acceleration eliminates traditional trial-and-error methods that often extend beyond seven expert-led days. The system enables precise specification of performance targets with automated generation of fabrication-ready parameters, particularly benefiting compact consumer devices requiring rapid frequency tuning. By bridging AI-driven optimization with CAD validation, this framework reduces engineering workloads while ensuring production-ready designs, establishing a scalable paradigm for next-generation RF systems in 6G and IoT applications.
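
The quoted acceleration factor can be checked from the two timings given in the abstract (a 50-hour benchmark versus a 12.42-minute design cycle). The short calculation below uses only those two figures; the variable names are illustrative.

```python
# Timings taken from the abstract.
baseline_minutes = 50 * 60   # 50-hour PSADEA-based benchmark
cycle_minutes = 12.42        # complete design cycle reported for the workflow

speedup = baseline_minutes / cycle_minutes
print(f"speedup: {speedup:.1f}x")
```

The ratio comes out slightly above 240, consistent with the rounded figure in the text.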

Article 518

Title@2025-05-28 (3): Synonymous Variational Inference for Perceptual Image Compression

Title: Synonymous Variational Inference for Perceptual Image Compression

Synonyme Variationsableitung für Wahrnehmungsbildkompression

感知图像压缩的同义同义变异推理 2505.22438v1

Authors: Zijian Liang, Kai Niu, Changshuo Wang, Jin Xu, Ping Zhang

Recent contributions of semantic information theory reveal the set-element relationship between semantic and syntactic information, represented as synonymous relationships. In this paper, we propose a synonymous variational inference (SVI) method based on this synonymity viewpoint to re-analyze the perceptual image compression problem. It takes perceptual similarity as a typical synonymous criterion to build an ideal synonymous set (Synset), and approximate the posterior of its latent synonymous representation with a parametric density by minimizing a partial semantic KL divergence. This analysis theoretically proves that the optimization direction of perception image compression follows a triple tradeoff that can cover the existing rate-distortion-perception schemes. Additionally, we introduce synonymous image compression (SIC), a new image compression scheme that corresponds to the analytical process of SVI, and implement a progressive SIC codec to fully leverage the model’s capabilities. Experimental results demonstrate comparable rate-distortion-perception performance using a single progressive SIC codec, thus verifying the effectiveness of our proposed analysis method.

Article 519

Title@2025-05-28 (3): Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models

Title: Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models

Ausgelagerte Diffusionsprobenahme: Effiziente hintere Inferenz in latenten Räumen generativer Modelle

外部外包扩散采样:在基因变异模型潜在空间中有效的后继推论 2502.06999v2

Authors: Siddarth Venkatraman, Mohsin Hasan, Minsu Kim, Luca Scimeca, Marcin Sendera, Yoshua Bengio, Glen Berseth, Nikolay Malkin

Any well-behaved generative model over a variable $\mathbf{x}$ can be expressed as a deterministic transformation of an exogenous (‘outsourced’) Gaussian noise variable $\mathbf{z}$: $\mathbf{x}=f_\theta(\mathbf{z})$. In such a model (e.g., a VAE, GAN, or continuous-time flow-based model), sampling of the target variable $\mathbf{x} \sim p_\theta(\mathbf{x})$ is straightforward, but sampling from a posterior distribution of the form $p(\mathbf{x}\mid\mathbf{y}) \propto p_\theta(\mathbf{x})r(\mathbf{x},\mathbf{y})$, where $r$ is a constraint function depending on an auxiliary variable $\mathbf{y}$, is generally intractable. We propose to amortize the cost of sampling from such posterior distributions with diffusion models that sample a distribution in the noise space ($\mathbf{z}$). These diffusion samplers are trained by reinforcement learning algorithms to enforce that the transformed samples $f_\theta(\mathbf{z})$ are distributed according to the posterior in the data space ($\mathbf{x}$). For many models and constraints, the posterior in noise space is smoother than in data space, making it more suitable for amortized inference. Our method enables conditional sampling under unconditional GAN, (H)VAE, and flow-based priors, comparing favorably with other inference methods. We demonstrate the proposed outsourced diffusion sampling in several experiments with large pretrained prior models: conditional image generation, reinforcement learning with human feedback, and protein structure generation.
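
The noise-space target implied by the abstract can be written out explicitly. The derivation below is a sketch using the abstract's symbols plus two assumptions: the noise prior is taken to be a standard Gaussian, and $g$ is an arbitrary test function introduced only for the argument.

```latex
% Data-space posterior from the abstract:
%   p(x | y) \propto p_\theta(x) \, r(x, y), with x = f_\theta(z), z ~ N(0, I).
% Candidate noise-space target:
\[
  p(\mathbf{z} \mid \mathbf{y}) \;\propto\;
  \mathcal{N}(\mathbf{z}; \mathbf{0}, \mathbf{I}) \,
  r\bigl(f_\theta(\mathbf{z}), \mathbf{y}\bigr).
\]
% Check: for any test function g,
\[
  \mathbb{E}_{\mathbf{z} \sim p(\mathbf{z} \mid \mathbf{y})}
    \bigl[ g(f_\theta(\mathbf{z})) \bigr]
  \;\propto\;
  \mathbb{E}_{\mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})}
    \bigl[ g(f_\theta(\mathbf{z})) \, r(f_\theta(\mathbf{z}), \mathbf{y}) \bigr]
  \;=\;
  \mathbb{E}_{\mathbf{x} \sim p_\theta}
    \bigl[ g(\mathbf{x}) \, r(\mathbf{x}, \mathbf{y}) \bigr],
\]
% so pushing samples of p(z | y) through f_\theta yields samples of p(x | y).
```

The argument does not require $f_\theta$ to be invertible, which is why it applies to GAN and VAE decoders as well as flows.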

Article 520

Title@2025-05-28 (3): C-LoRA: Contextual Low-Rank Adaptation for Uncertainty Estimation in Large Language Models

Title: C-LoRA: Contextual Low-Rank Adaptation for Uncertainty Estimation in Large Language Models

C-LoRA: Kontextuelle Low-Rank-Anpassung für Unsicherheitsabschätzungen in großen Sprachmodellen

C-LORA:用于大语言模型中不确定性估算的不确定性估算的上下文性低风险适应 2505.17773v2

Authors: Amir Hossein Rahmati, Sanket Jantre, Weifeng Zhang, Yucheng Wang, Byung-Jun Yoon, Nathan M. Urban, Xiaoning Qian

Low-Rank Adaptation (LoRA) offers a cost-effective solution for fine-tuning large language models (LLMs), but it often produces overconfident predictions in data-scarce few-shot settings. To address this issue, several classical statistical learning approaches have been repurposed for scalable uncertainty-aware LoRA fine-tuning. However, these approaches neglect how input characteristics affect the predictive uncertainty estimates. To address this limitation, we propose Contextual Low-Rank Adaptation (C-LoRA) as a novel uncertainty-aware and parameter-efficient fine-tuning approach, by developing new lightweight LoRA modules contextualized to each input data sample to dynamically adapt uncertainty estimates. Incorporating data-driven contexts into the parameter posteriors, C-LoRA mitigates overfitting, achieves well-calibrated uncertainties, and yields robust predictions. Extensive experiments demonstrate that C-LoRA consistently outperforms the state-of-the-art uncertainty-aware LoRA methods in both uncertainty quantification and model generalization. Ablation studies further confirm the critical role of our contextual modules in capturing sample-specific uncertainties. C-LoRA sets a new standard for robust, uncertainty-aware LLM fine-tuning in few-shot regimes.

Article 521

Title@2025-05-28 (3): AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy

Title: AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy

AstroVisBench: Ein Code-Bench für wissenschaftliche Computing und Visualisierung in der Astronomie

AstroVisbench:天文科学计算和可视化标准 2505.20538v2

Authors: Sebastian Antony Joseph, Syed Murtaza Husain, Stella S. R. Offner, Stéphanie Juneau, Paul Torrey, Adam S. Bolton, Juan P. Farias, Niall Gaffney, Greg Durrett, Junyi Jessy Li

Large Language Models (LLMs) are being explored for applications in scientific research, including their capabilities to synthesize literature, answer research questions, generate research ideas, and even conduct computational experiments. Ultimately, our goal is for these to help scientists derive novel scientific insights. In many areas of science, such insights often arise from processing and visualizing data to understand its patterns. However, determining whether an LLM-mediated scientific workflow produces outputs conveying the correct scientific insights is challenging and has not been addressed in past work. We introduce AstroVisBench, the first benchmark for both scientific computing and visualization in the astronomy domain. AstroVisBench judges a language model’s ability to both (1) create astronomy-specific workflows to process and analyze data and (2) visualize the results of these workflows through complex plots. Our evaluation of visualizations uses a novel LLM-as-a-judge workflow, which is validated against annotation by five professional astronomers. Using AstroVisBench we present an evaluation of state-of-the-art language models, showing a significant gap in their ability to engage in astronomy research as useful assistants. This evaluation provides a strong end-to-end evaluation for AI scientists that offers a path forward for the development of visualization-based workflows, which are central to a broad range of domains from physics to biology.

Article 522

Title@2025-05-28 (3): Scaling Reasoning without Attention

Title: Scaling Reasoning without Attention

Skalierung ohne Aufmerksamkeit

无人注意的调整理由 2505.22425v1

Authors: Xueliang Zhao, Wei Wu, Lingpeng Kong

Large language models (LLMs) have made significant advances in complex reasoning tasks, yet they remain bottlenecked by two core challenges: architectural inefficiency due to reliance on Transformers, and a lack of structured fine-tuning for high-difficulty domains. We introduce \ourmodel, an attention-free language model that addresses both issues through architectural and data-centric innovations. Built on the state space dual (SSD) layers of Mamba-2, our model eliminates the need for self-attention and key-value caching, enabling fixed-memory, constant-time inference. To train it for complex reasoning, we propose a two-phase curriculum fine-tuning strategy based on the PromptCoT synthesis paradigm, which generates pedagogically structured problems via abstract concept selection and rationale-guided generation. On benchmark evaluations, \ourmodel-7B outperforms strong Transformer and hybrid models of comparable scale, and even surpasses the much larger Gemma3-27B by 2.6% on AIME 24, 0.6% on AIME 25, and 3.0% on Livecodebench. These results highlight the potential of state space models as efficient and scalable alternatives to attention-based architectures for high-capacity reasoning.

Article 523

Title@2025-05-28 (3): STaR-Bets: Sequential Target-Recalculating Bets for Tighter Confidence Intervals

Title: STaR-Bets: Sequential Target-Recalculating Bets for Tighter Confidence Intervals

StaR-Bets: Sequentielle Target-Rekalkulationswetten für engere Vertrauensintervalle

STaR-Bets: 更密切信任间隔的序列目标-计算重新计算保证 2505.22422v1

Authors: Václav Voráček, Francesco Orabona

The construction of confidence intervals for the mean of a bounded random variable is a classical problem in statistics with numerous applications in machine learning and virtually all scientific fields. In particular, obtaining the tightest possible confidence intervals is vital whenever sampling the random variables is expensive. The current state-of-the-art approach constructs confidence intervals using betting algorithms. This is a very successful approach for deriving optimal confidence sequences, even matching the rate of the law of the iterated logarithm. However, in the fixed horizon setting, these approaches are either sub-optimal or based on heuristic solutions with strong empirical performance but without a finite-time guarantee. Hence, no betting-based algorithm guaranteeing the optimal $\mathcal{O}(\sqrt{\frac{\sigma^2\log\frac1\delta}{n}})$ width of the confidence intervals is known. This work bridges this gap. We propose a betting-based algorithm to compute confidence intervals that empirically outperforms the competitors. Our betting strategy uses the optimal strategy in every step (in a certain sense), whereas the standard betting methods choose a constant strategy in advance. Leveraging this fact results in strict improvements even for classical concentration inequalities, such as the ones of Hoeffding or Bernstein. Moreover, we also prove that the width of our confidence intervals is optimal up to a $1+o(1)$ factor diminishing with $n$. The code is available at https://github.com/vvoracek/STaR-bets-confidence-interval.
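
For orientation, the classical Hoeffding interval that the abstract names as a baseline can be sketched in a few lines. This is the textbook bound for samples in [0, 1], not the STaR-Bets algorithm; the function name and the toy data are assumptions for illustration.

```python
import math


def hoeffding_interval(samples, delta):
    """Two-sided (1 - delta) Hoeffding interval for the mean of samples in [0, 1].

    The half-width sqrt(log(2/delta) / (2n)) ignores the variance,
    which is one reason tighter constructions are of interest.
    """
    n = len(samples)
    mean = sum(samples) / n
    half_width = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    return max(0.0, mean - half_width), min(1.0, mean + half_width)


# Toy data: 200 bounded observations with empirical mean 0.5.
data = [0.4, 0.6] * 100
low, high = hoeffding_interval(data, delta=0.05)
print(low, high)
```

The width shrinks at rate $1/\sqrt{n}$ but carries no $\sigma^2$ term, whereas the optimal width quoted in the abstract does.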

Article 524

Title@2025-05-28 (3): Beyond Verifiable Rewards: Scaling Reinforcement Learning for Language Models to Unverifiable Data

Title: Beyond Verifiable Rewards: Scaling Reinforcement Learning for Language Models to Unverifiable Data

Jenseits von überprüfbaren Belohnungen: Skalierung von Verstärkung Lernen für Sprachmodelle zu unüberprüfbaren Daten

超越可核实的奖励:加强语文模式的强化学习,以获得不可核实的数据 2503.19618v2

Authors: Yunhao Tang, Sid Wang, Lovish Madaan, Rémi Munos

We propose to scale RL to unverifiable data with a novel algorithm JEPO (Jensen’s Evidence lower bound Policy Optimization). While most prior efforts on scaling RL for LLMs focus on verifiable data where ground truth answers are typically short-form and can be matched easily, we investigate the case where such assumptions are less valid (e.g., when answers are long-form such as mathematical proofs). To scale RL training to unverifiable data with contemporary training constraints, we propose JEPO. JEPO applies Jensen’s evidence lower bound, a pragmatic simplification of the evidence lower bound which views chain-of-thought as a latent variable in the generative process. We show that on verifiable data (math), JEPO is as effective as RL with verifiable rewards; on semi-verifiable data (numina), JEPO improves on soft-match based evaluations compared to RL with verifiable rewards, which can only leverage a subset of the data source; finally, on unverifiable data (numina-proof), JEPO outperforms SFT and a few ablation baselines on likelihood evaluations.
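
The bound the abstract refers to can be written down by treating the chain-of-thought as a latent variable and applying Jensen's inequality to the log-marginal likelihood of the reference answer. The notation below ($x$ for the prompt, $c$ for the chain-of-thought, $a^\star$ for the reference answer, $\pi_\theta$ for the policy) is an assumption for illustration and may differ from the paper's.

```latex
\[
  \log \pi_\theta(a^\star \mid x)
  \;=\;
  \log \mathbb{E}_{c \sim \pi_\theta(\cdot \mid x)}
    \bigl[ \pi_\theta(a^\star \mid x, c) \bigr]
  \;\ge\;
  \mathbb{E}_{c \sim \pi_\theta(\cdot \mid x)}
    \bigl[ \log \pi_\theta(a^\star \mid x, c) \bigr].
\]
```

The right-hand side needs only the likelihood of the reference answer given a sampled chain-of-thought, so no answer-matching verifier is required.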

Article 525

Title@2025-05-28 (3): Mitigating Overthinking in Large Reasoning Models via Manifold Steering

Title: Mitigating Overthinking in Large Reasoning Models via Manifold Steering

Überdenken in großen Vernunftmodellen durch Manifold Steering verhindern

通过 MManicform 指导减轻大型理性模型中的过度思考 2505.22411v1

Authors: Yao Huang, Huanran Chen, Shouwei Ruan, Yichi Zhang, Xingxing Wei, Yinpeng Dong

Recent advances in Large Reasoning Models (LRMs) have demonstrated remarkable capabilities in solving complex tasks such as mathematics and coding. However, these models frequently exhibit a phenomenon known as overthinking during inference, characterized by excessive validation loops and redundant deliberation, leading to substantial computational overheads. In this paper, we aim to mitigate overthinking by investigating the underlying mechanisms from the perspective of mechanistic interpretability. We first show that the tendency to overthink can be effectively captured by a single direction in the model’s activation space and that the issue can be eased by intervening on the activations along this direction. However, this efficacy soon reaches a plateau and even deteriorates as the intervention strength increases. We therefore systematically explore the activation space and find that the overthinking phenomenon is actually tied to a low-dimensional manifold, which indicates that the limited effect stems from the noise introduced by the high-dimensional steering direction. Based on this insight, we propose Manifold Steering, a novel approach that elegantly projects the steering direction onto the low-dimensional activation manifold given the theoretical approximation of the interference noise. Extensive experiments on DeepSeek-R1 distilled models validate that our method reduces output tokens by up to 71% while maintaining and even improving the accuracy on several mathematical benchmarks. Our method also exhibits robust cross-domain transferability, delivering consistent token reduction performance in code generation and knowledge-based QA tasks. Code is available at: https://github.com/Aries-iai/Manifold_Steering.
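
The core geometric step, projecting a steering direction onto a low-dimensional subspace before applying it, can be sketched as follows. This is an illustration under stated assumptions rather than the released method: the subspace is given here as a hand-written orthonormal basis (how such a basis is estimated, for example by PCA over activations, is not specified in the abstract), and the subtraction sign convention in `steer` is hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_onto_subspace(direction, basis):
    """Project `direction` onto the span of an orthonormal `basis`.

    Components of the direction outside the subspace are discarded;
    in the abstract's terms these are the noisy high-dimensional parts.
    """
    projected = [0.0] * len(direction)
    for b in basis:
        coeff = dot(direction, b)
        projected = [p + coeff * bi for p, bi in zip(projected, b)]
    return projected


def steer(activation, direction, strength):
    """Shift an activation against the (projected) direction.

    Assumption: the direction points toward more overthinking.
    """
    return [a - strength * d for a, d in zip(activation, direction)]


# Toy 4-dimensional example with a 2-dimensional subspace.
basis = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
raw_direction = [0.6, 0.8, 0.3, -0.2]
clean_direction = project_onto_subspace(raw_direction, basis)
steered = steer([1.0, 1.0, 1.0, 1.0], clean_direction, strength=0.5)
print(clean_direction, steered)
```

After projection, the intervention leaves the coordinates outside the subspace untouched, which is the intuition behind avoiding the interference noise described above.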

Article 526

Title@2025-05-28 (3): Decoupled Subgraph Federated Learning

Title: Decoupled Subgraph Federated Learning

Entkoppelter Subgraph Federated Learning

分校分科分科分科分科 2402.19163v3

Authors: Javad Aliakbari, Johan Östman, Alexandre Graell i Amat

We address the challenge of federated learning on graph-structured data distributed across multiple clients. Specifically, we focus on the prevalent scenario of interconnected subgraphs, where interconnections between different clients play a critical role. We present a novel framework for this scenario, named FedStruct, that harnesses deep structural dependencies. To uphold privacy, unlike existing methods, FedStruct eliminates the necessity of sharing or generating sensitive node features or embeddings among clients. Instead, it leverages explicit global graph structure information to capture inter-node dependencies. We validate the effectiveness of FedStruct through experimental results conducted on six datasets for semi-supervised node classification, showcasing performance close to the centralized approach across various scenarios, including different data partitioning methods, varying levels of label availability, and number of clients.

Article 527

Title@2025-05-28 (3): Beyond External Monitors: Enhancing Transparency of Large Language Models for Easier Monitoring

Title: Beyond External Monitors: Enhancing Transparency of Large Language Models for Easier Monitoring

Jenseits von externen Monitoren: Verbesserung der Transparenz von großen Sprachmodellen für eine einfachere Überwachung

外部监测之外的外部监测:提高大语言模型的透明度,促进更易监测 2502.05242v2

Authors: Guanxu Chen, Dongrui Liu, Tao Luo, Lijie Hu, Jing Shao

Large language models (LLMs) are becoming increasingly capable, but the mechanisms of their thinking and decision-making process remain unclear. Chain-of-thoughts (CoTs) have been commonly utilized to monitor LLMs, but this strategy fails to accurately reflect LLMs’ thinking process. Techniques based on LLMs’ hidden representations provide an inner perspective to monitor their latent thinking. However, previous methods only try to develop external monitors instead of making LLMs themselves easier to monitor. In this paper, we propose a novel method TELLME, improving the transparency of LLMs and helping monitors identify unsuitable and sensitive behaviors. Furthermore, we showcase the applications of TELLME on trustworthiness tasks (e.g., safety risk monitoring tasks and detoxification tasks), where LLMs achieve consistent improvement in transparency and task performance. More crucially, we theoretically analyze the improvement of TELLME on LLMs’ generalization ability through optimal transport theory.

Article 528

Title@2025-05-28 (3): BILBO: BILevel Bayesian Optimization

Title: BILBO: BILevel Bayesian Optimization

BILBO: BILevel Bayesian Optimierung

BILBO: BI级巴耶斯最佳优化 2502.02121v2

Authors: Ruth Wan Theng Chew, Quoc Phong Nguyen, Bryan Kian Hsiang Low

Bilevel optimization is characterized by a two-level optimization structure, where the upper-level problem is constrained by optimal lower-level solutions, and such structures are prevalent in real-world problems. The constraint by optimal lower-level solutions poses significant challenges, especially in noisy, constrained, and derivative-free settings, as repeating lower-level optimizations is sample inefficient and predicted lower-level solutions may be suboptimal. We present BILevel Bayesian Optimization (BILBO), a novel Bayesian optimization algorithm for general bilevel problems with blackbox functions, which optimizes both upper- and lower-level problems simultaneously, without the repeated lower-level optimization required by existing methods. BILBO samples from confidence-bounds based trusted sets, which bounds the suboptimality on the lower level. Moreover, BILBO selects only one function query per iteration, where the function query selection strategy incorporates the uncertainty of estimated lower-level solutions and includes a conditional reassignment of the query to encourage exploration of the lower-level objective. The performance of BILBO is theoretically guaranteed with a sublinear regret bound for commonly used kernels and is empirically evaluated on several synthetic and real-world problems.
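
The two-level structure described in the abstract is commonly written as follows. The symbols ($F$ and $f$ for the upper- and lower-level objectives, $x$ and $z$ for the upper- and lower-level variables, $\mathcal{X}$ and $\mathcal{Z}$ for their domains) are generic notation assumed for illustration, and additional constraints are omitted.

```latex
\[
  \min_{x \in \mathcal{X}} \; F\bigl(x, z^\star(x)\bigr)
  \quad \text{subject to} \quad
  z^\star(x) \in \operatorname*{arg\,min}_{z \in \mathcal{Z}} f(x, z).
\]
```

Evaluating the upper-level objective at a single $x$ naively requires solving the whole lower-level problem first, which is the repeated inner optimization that the abstract says BILBO avoids.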

Article 529

Title@2025-05-28 (3): Simultaneously Solving FBSDEs and their Associated Semilinear Elliptic PDEs with Small Neural Operators

Title: Simultaneously Solving FBSDEs and their Associated Semilinear Elliptic PDEs with Small Neural Operators

Gleichzeitige Lösung von FBSDs und ihren zugehörigen semilinearen elliptischen PDEs mit kleinen neuralen Operatoren

与小型神经操作器同时解决FBSDEs及其相关半线性椭圆形粒体 2410.14788v2

Authors: Takashi Furuya, Anastasis Kratsios

Forward-backwards stochastic differential equations (FBSDEs) play an important role in optimal control, game theory, economics, mathematical finance, and reinforcement learning. Unfortunately, the available FBSDE solvers operate on individual FBSDEs, meaning that they cannot provide a computationally feasible strategy for solving large families of FBSDEs, as these solvers must be re-run several times. Neural operators (NOs) offer an alternative approach for simultaneously solving large families of decoupled FBSDEs by directly approximating the solution operator mapping inputs (terminal conditions and dynamics of the backwards process) to outputs (solutions to the associated FBSDE). Though universal approximation theorems (UATs) guarantee the existence of such NOs, these NOs are unrealistically large. Upon making only a few simple theoretically-guided tweaks to the standard convolutional NO build, we confirm that "small" NOs can uniformly approximate the solution operator to structured families of FBSDEs with random terminal time, uniformly on suitable compact sets determined by Sobolev norms using a logarithmic depth, a constant width, and a polynomial rank in the reciprocal approximation error. This result is rooted in our second result, and main contribution to the NOs for PDE literature, showing that our convolutional NOs have similar depth and width but grow only quadratically (at a dimension-free rate) when uniformly approximating the solution operator of the associated class of semilinear Elliptic PDEs to these families of FBSDEs. A key insight we uncover into how NOs work is that the convolutional layers of our NO can approximately implement the fixed point iteration used to prove the existence of a unique solution to these semilinear Elliptic PDEs.

Article 530

Title@2025-05-28 (3): Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

Title: Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

Inferenz-Time Scaling für Flow-Modelle über stochastische Generation und Rollover Budget Forcing

通过存储器生成和滚转预算推力对流动模型的推推时间调整 2503.19385v4

Authors: Jaihoon Kim, Taehoon Yoon, Jisung Hwang, Minhyuk Sung

We propose an inference-time scaling approach for pretrained flow models. Recently, inference-time scaling has gained significant attention in LLMs and diffusion models, improving sample quality or better aligning outputs with user preferences by leveraging additional computation. For diffusion models, particle sampling has allowed more efficient scaling due to the stochasticity at intermediate denoising steps. In contrast, while flow models have gained popularity as an alternative to diffusion models, offering faster generation and high-quality outputs in state-of-the-art image and video generative models, the efficient inference-time scaling methods used for diffusion models cannot be directly applied to them because their generative process is deterministic. To enable efficient inference-time scaling for flow models, we propose three key ideas: 1) SDE-based generation, enabling particle sampling in flow models, 2) Interpolant conversion, broadening the search space and enhancing sample diversity, and 3) Rollover Budget Forcing (RBF), an adaptive allocation of computational resources across timesteps to maximize budget utilization. Our experiments show that SDE-based generation, particularly variance-preserving (VP) interpolant-based generation, improves the performance of particle sampling methods for inference-time scaling in flow models. Additionally, we demonstrate that RBF with VP-SDE achieves the best performance, outperforming all previous inference-time scaling approaches.

Article 531

Title@2025-05-28 (3): Physics-Informed Distillation of Diffusion Models for PDE-Constrained Generation

Title: Physics-Informed Distillation of Diffusion Models for PDE-Constrained Generation

Physik-informierte Destillation von Diffusionsmodellen für PDE-kontrainierte Generation

PDE - 受培训的一代的传播模型的物理改造 2505.22391v1

Authors: Yi Zhang, Difan Zou

Modeling physical systems in a generative manner offers several advantages, including the ability to handle partial observations, generate diverse solutions, and address both forward and inverse problems. Recently, diffusion models have gained increasing attention in the modeling of physical systems, particularly those governed by partial differential equations (PDEs). However, diffusion models only access noisy data $\boldsymbol{x}_t$ at intermediate steps, making it infeasible to directly enforce constraints on the clean sample $\boldsymbol{x}_0$ at each noise level. As a workaround, constraints are typically applied to the expectation of clean samples $\mathbb{E}[\boldsymbol{x}_0 \mid \boldsymbol{x}_t]$, which is estimated using the learned score network. However, imposing PDE constraints on the expectation does not strictly represent the one on the true clean data, a discrepancy known as Jensen’s Gap. This gap creates a trade-off: enforcing PDE constraints may come at the cost of reduced accuracy in generative modeling. To address this, we propose a simple yet effective post-hoc distillation approach, where PDE constraints are not injected directly into the diffusion process, but instead enforced during a post-hoc distillation stage. We term our method Physics-Informed Distillation of Diffusion Models (PIDDM). This distillation not only facilitates single-step generation with improved PDE satisfaction, but also supports both forward and inverse problem solving and reconstruction from randomly partial observations. Extensive experiments across various PDE benchmarks demonstrate that PIDDM significantly improves PDE satisfaction over several recent and competitive baselines, such as PIDM, DiffusionPDE, and ECI-sampling, with less computation overhead. Our approach can shed light on more efficient and effective strategies for incorporating physical constraints into diffusion models.
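
The Jensen's Gap mentioned above can be seen in a one-dimensional toy case: when a constraint is nonlinear, evaluating it at the posterior mean of the clean sample is not the same as averaging it over the clean samples. The constraint `residual` and the two-point distribution below are invented for illustration and have nothing to do with any specific PDE.

```python
def residual(x):
    """Toy nonlinear constraint: satisfied exactly when x**2 == 1."""
    return x * x - 1.0


# Two equally likely clean samples, both of which satisfy the constraint.
clean_samples = [-1.0, 1.0]

posterior_mean = sum(clean_samples) / len(clean_samples)
residual_of_mean = residual(posterior_mean)
mean_of_residual = sum(residual(x) for x in clean_samples) / len(clean_samples)
jensen_gap = mean_of_residual - residual_of_mean

print(posterior_mean, residual_of_mean, mean_of_residual, jensen_gap)
```

Every clean sample satisfies the constraint, yet the posterior mean violates it, so penalizing the residual at the mean would push the model away from valid solutions.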

Article 532

Title@2025-05-28 (3): Revisiting Feature Interactions from the Perspective of Quadratic Neural Networks for Click-through Rate Prediction

Title: Revisiting Feature Interactions from the Perspective of Quadratic Neural Networks for Click-through Rate Prediction

Überprüfung von Feature-Interaktionen aus der Perspektive quadratischer neuraler Netzwerke für Click-through-Rate-Vorhersage

从 “ 点击通速率预测 “ 四方神经网络的角度重新审视地貌相互作用 2505.17999v2

Authors: Honghao Li, Yiwen Zhang, Yi Zhang, Lei Sang, Jieming Zhu

Hadamard Product (HP) has long been a cornerstone in click-through rate (CTR) prediction tasks due to its simplicity, effectiveness, and ability to capture feature interactions without additional parameters. However, the underlying reasons for its effectiveness remain unclear. In this paper, we revisit HP from the perspective of Quadratic Neural Networks (QNN), which leverage quadratic interaction terms to model complex feature relationships. We further reveal QNN’s ability to expand the feature space and provide smooth nonlinear approximations without relying on activation functions. Meanwhile, we find that traditional post-activation does not further improve the performance of the QNN. Instead, mid-activation is a more suitable alternative. Through theoretical analysis and empirical evaluation of 25 QNN neuron formats, we identify a good-performing variant and make further enhancements on it. Specifically, we propose the Multi-Head Khatri-Rao Product as a superior alternative to HP and a Self-Ensemble Loss with dynamic ensemble capability within the same network to enhance computational efficiency and performance. Ultimately, we propose a novel neuron format, QNN-alpha, which is tailored for CTR prediction tasks. Experimental results show that QNN-alpha achieves new state-of-the-art performance on six public datasets while maintaining low inference latency, good scalability, and excellent compatibility. The code, running logs, and detailed hyperparameter configurations are available at: https://github.com/salmon1802/QNN.
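
As a rough illustration of why a Hadamard product yields quadratic interaction terms without an activation function, consider the sketch below. It is not the QNN-alpha neuron from the paper; the helper names and weight values are made up for the example.

```python
# Minimal sketch of a Hadamard-product quadratic neuron (illustrative only;
# the weights are made up and this is not the QNN-alpha architecture).

def matvec(weights, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def hadamard(a, b):
    """Element-wise (Hadamard) product of two equal-length vectors."""
    return [p * q for p, q in zip(a, b)]

def quadratic_neuron(w_a, w_b, x):
    """Element-wise product of two linear maps of x.

    Each output coordinate is a quadratic form in x, so feature
    interactions appear without any activation function.
    """
    return hadamard(matvec(w_a, x), matvec(w_b, x))

w_a = [[1.0, 0.0], [0.0, 1.0]]
w_b = [[0.0, 1.0], [1.0, 1.0]]
x = [2.0, 3.0]

# First output is x1 * x2, second is x2 * (x1 + x2).
out = quadratic_neuron(w_a, w_b, x)

# Doubling the input multiplies every output by four (degree-2 behavior).
out_doubled = quadratic_neuron(w_a, w_b, [2 * v for v in x])
```

The Hadamard product itself adds no parameters; all learnable weights sit in the two linear maps.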

Article 533

Title@2025-05-28 (3): DAM: Domain-Aware Module for Multi-Domain Dataset Condensation

Title: DAM: Domain-Aware Module for Multi-Domain Dataset Condensation

DAM: Domain-Aware-Modul für Multi-Domain-Datensatz-Kondensation

DAM: 多域数据集集中的域- 软件模块 2505.22387v1

Authors: Jaehyun Choi, Gyojin Han, Dong-Jae Lee, Sunghyun Baek, Junmo Kim

Dataset Condensation (DC) has emerged as a promising solution to mitigate the computational and storage burdens associated with training deep learning models. However, existing DC methods largely overlook the multi-domain nature of modern datasets, which are increasingly composed of heterogeneous images spanning multiple domains. In this paper, we extend DC and introduce Multi-Domain Dataset Condensation (MDDC), which aims to condense data that generalizes across both single-domain and multi-domain settings. To this end, we propose the Domain-Aware Module (DAM), a training-time module that embeds domain-related features into each synthetic image via learnable spatial masks. As explicit domain labels are mostly unavailable in real-world datasets, we employ frequency-based pseudo-domain labeling, which leverages low-frequency amplitude statistics. DAM is only active during the condensation process, thus preserving the same images per class (IPC) as prior methods. Experiments show that DAM consistently improves in-domain, out-of-domain, and cross-architecture performance over baseline dataset condensation methods.

Article 534

Title@2025-05-28 (3): When do neural networks learn world models?

Title: When do neural networks learn world models?

Wann lernen neuronale Netzwerke Weltmodelle?

神经网络何时学习世界模型? 2502.09297v3

Authors: Tianren Zhang, Guanyu Chen, Feng Chen

Humans develop world models that capture the underlying generation process of data. Whether neural networks can learn similar world models remains an open problem. In this work, we present the first theoretical results for this problem, showing that in a multi-task setting, models with a low-degree bias provably recover latent data-generating variables under mild assumptions – even if proxy tasks involve complex, non-linear functions of the latents. However, such recovery is sensitive to model architecture. Our analysis leverages Boolean models of task solutions via the Fourier-Walsh transform and introduces new techniques for analyzing invertible Boolean transforms, which may be of independent interest. We illustrate the algorithmic implications of our results and connect them to related research areas, including self-supervised learning, out-of-distribution generalization, and the linear representation hypothesis in large language models.

Article 535

Title@2025-05-28 (3): Infinite-dimensional Mahalanobis Distance with Applications to Kernelized Novelty Detection

Title: Infinite-dimensional Mahalanobis Distance with Applications to Kernelized Novelty Detection

Infinite-dimensionale Mahalanobis-Distanz mit Anwendungen zur kernisierten Neuheitserkennung

无限的马哈拉诺比斯距离,应用内核新闻探测技术 2407.11873v2

Authors: Nikita Zozoulenko, Thomas Cass, Lukas Gonon

The Mahalanobis distance is a classical tool used to measure the covariance-adjusted distance between points in $\mathbb{R}^d$. In this work, we extend the concept of Mahalanobis distance to separable Banach spaces by reinterpreting it as a Cameron-Martin norm associated with a probability measure. This approach leads to a basis-free, data-driven notion of anomaly distance through the so-called variance norm, which can naturally be estimated using empirical measures of a sample. Our framework generalizes the classical $\mathbb{R}^d$, functional $(L^2[0,1])^d$, and kernelized settings; importantly, it incorporates non-injective covariance operators. We prove that the variance norm is invariant under invertible bounded linear transformations of the data, extending previous results which are limited to unitary operators. In the Hilbert space setting, we connect the variance norm to the RKHS of the covariance operator and establish consistency and convergence results for estimation using empirical measures. Using the variance norm, we introduce the notion of a kernelized nearest-neighbour Mahalanobis distance. In an empirical study on 12 real-world data sets, we demonstrate that the kernelized nearest-neighbour Mahalanobis distance outperforms the traditional kernelized Mahalanobis distance for multivariate time series novelty detection, using state-of-the-art time series kernels such as the signature, global alignment, and Volterra reservoir kernels.
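
For orientation, the classical finite-dimensional distance that this work generalizes can be sketched as follows. This covers only the textbook two-dimensional case with an invertible covariance matrix; it does not implement the variance norm or the non-injective case treated in the paper, and the function name `mahalanobis_2d` is an assumption for the example.

```python
import math

def mahalanobis_2d(x, y, cov):
    """Classical Mahalanobis distance between 2-D points.

    cov is a 2x2 covariance matrix given as nested lists; it must be
    invertible here, unlike the more general setting of the paper.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix must be invertible")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [x[0] - y[0], x[1] - y[1]]
    quad = sum(diff[i] * inv[i][j] * diff[j] for i in range(2) for j in range(2))
    return math.sqrt(quad)

# Variance 4 along the first axis, variance 1 along the second.
cov = [[4.0, 0.0], [0.0, 1.0]]
origin = [0.0, 0.0]

d_wide = mahalanobis_2d([2.0, 0.0], origin, cov)    # step along high-variance axis
d_narrow = mahalanobis_2d([0.0, 2.0], origin, cov)  # same step along low-variance axis
```

The same Euclidean step counts as twice as anomalous along the low-variance axis, which is the covariance adjustment the abstract refers to.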

Article 536

Title@2025-05-28 (3): Overcoming Dimensional Factorization Limits in Discrete Diffusion Models through Quantum Joint Distribution Learning

Title: Overcoming Dimensional Factorization Limits in Discrete Diffusion Models through Quantum Joint Distribution Learning

Überwindung von Dimensional Factorization Limits in diskreten Diffusionsmodellen durch Quantum Joint Distribution Learning

通过量子联合分发学习克服分辨传播模式中的分量限制 2505.05151v2

Authors: Chuangtao Chen, Qinglin Zhao, MengChu Zhou, Zhimin He, Haozhen Situ

This study explores quantum-enhanced discrete diffusion models to overcome classical limitations in learning high-dimensional distributions. We rigorously prove that classical discrete diffusion models, which calculate per-dimension transition probabilities to avoid exponential computational cost, exhibit worst-case linear scaling of Kullback-Leibler (KL) divergence with data dimension. To address this, we propose a Quantum Discrete Denoising Diffusion Probabilistic Model (QD3PM), which enables joint probability learning through diffusion and denoising in exponentially large Hilbert spaces. By deriving posterior states through quantum Bayes’ theorem, similar to the crucial role of posterior probabilities in classical diffusion models, and by learning the joint probability, we establish a solid theoretical foundation for quantum-enhanced diffusion models. For denoising, we design a quantum circuit using temporal information for parameter sharing and learnable classical-data-controlled rotations for encoding. Exploiting joint distribution learning, our approach enables single-step sampling from pure noise, eliminating iterative requirements of existing models. Simulations demonstrate the proposed model’s superior accuracy in modeling complex distributions compared to factorization methods. Hence, this paper establishes a new theoretical paradigm in generative models by leveraging the quantum advantage in joint distribution learning.
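
The claim that per-dimension (factorized) models can incur KL divergence growing linearly with dimension can be illustrated on a toy target. The construction below is an assumption made for this example and is not taken from the paper: the target consists of independent pairs of perfectly correlated bits, and the factorized model matches every per-bit marginal but ignores the correlations.

```python
import itertools
import math

def target_prob(bits):
    """Joint target: consecutive bit pairs are perfectly correlated."""
    pairs = len(bits) // 2
    if all(bits[2 * i] == bits[2 * i + 1] for i in range(pairs)):
        return 0.5 ** pairs
    return 0.0

def factorized_prob(bits):
    """Product of the target's per-bit marginals (every bit is uniform)."""
    return 0.5 ** len(bits)

def kl_divergence(dim):
    """KL(target || factorized) in nats, by brute-force enumeration."""
    total = 0.0
    for bits in itertools.product((0, 1), repeat=dim):
        p = target_prob(bits)
        if p > 0.0:
            total += p * math.log(p / factorized_prob(bits))
    return total

# Evaluate for a few even dimensions; enumeration is exponential, so keep dim small.
kl_by_dim = {dim: kl_divergence(dim) for dim in (2, 4, 6)}
```

In this toy case the divergence equals `(dim / 2) * log(2)`, so it grows linearly with the number of dimensions.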

Article 537

Title@2025-05-28 (3): A Divide-and-Conquer Approach for Modeling Arrival Times in Business Process Simulation

Title: A Divide-and-Conquer Approach for Modeling Arrival Times in Business Process Simulation

Ein Divide-and-Conquer-Ansatz für die Modellierung von Ankunftszeiten in der Business Process Simulation

在模拟商业进程中模拟抵达时 2505.22381v1

Authors: Lukas Kirchdorfer, Konrad Özdemir, Stjepan Kusenic, Han van der Aa, Heiner Stuckenschmidt

Business Process Simulation (BPS) is a critical tool for analyzing and improving organizational processes by estimating the impact of process changes. A key component of BPS is the case-arrival model, which determines the pattern of new case entries into a process. Although accurate case-arrival modeling is essential for reliable simulations, as it influences waiting and overall cycle times, existing approaches often rely on oversimplified static distributions of inter-arrival times. These approaches fail to capture the dynamic and temporal complexities inherent in organizational environments, leading to less accurate and reliable outcomes. To address this limitation, we propose Auto Time Kernel Density Estimation (AT-KDE), a divide-and-conquer approach that models arrival times of processes by incorporating global dynamics, day-of-week variations, and intraday distributional changes, ensuring both precision and scalability. Experiments conducted across 20 diverse processes demonstrate that AT-KDE is far more accurate and robust than existing approaches while maintaining sensible execution time efficiency.

Article 538

Title@2025-05-28 (3): Smooth Sailing: Lipschitz-Driven Uncertainty Quantification for Spatial Association

Title: Smooth Sailing: Lipschitz-Driven Uncertainty Quantification for Spatial Association

Smooth Sailing: Lipschitz-Driven Uncertainty Quantification for Spatial Association

Lipschitz-Driven 不确定性为空间协会量化 2502.06067v2

Authors: David R. Burt, Renato Berlinghieri, Stephen Bates, Tamara Broderick

Estimating associations between spatial covariates and responses - rather than merely predicting responses - is central to environmental science, epidemiology, and economics. For instance, public health officials might be interested in whether air pollution has a strictly positive association with a health outcome, and the magnitude of any effect. Standard machine learning methods often provide accurate predictions but offer limited insight into covariate-response relationships. And we show that existing methods for constructing confidence (or credible) intervals for associations fail to provide nominal coverage in the face of model misspecification and distribution shift - despite both being essentially always present in spatial problems. We introduce a method that constructs valid frequentist confidence intervals for associations in spatial settings. Our method requires minimal assumptions beyond a form of spatial smoothness. In particular, we do not require model correctness or covariate overlap between training and target locations. Our approach is the first to guarantee nominal coverage in this setting and outperforms existing techniques in both real and simulated experiments.

Article 539

Title@2025-05-28 (3): Memento No More: Coaching AI Agents to Master Multiple Tasks via Hints Internalization

Title: Memento No More: Coaching AI Agents to Master Multiple Tasks via Hints Internalization

Memento No More: Coaching von KI-Agenten zu Master mehrere Aufgaben durch Hinweise Internalisierung

不再纪念:通过Hints内部化,指导AI代理人员掌握多项任务 2502.01562v2

Authors: Minttu Alakuijala, Ya Gao, Georgy Ananov, Samuel Kaski, Pekka Marttinen, Alexander Ilin, Harri Valpola

As the general capabilities of artificial intelligence (AI) agents continue to evolve, their ability to learn to master multiple complex tasks through experience remains a key challenge. Current LLM agents, particularly those based on proprietary language models, typically rely on prompts to incorporate knowledge about the target tasks. This approach does not allow the agent to internalize this information and instead relies on ever-expanding prompts to sustain its functionality in diverse scenarios. This resembles a system of notes used by a person affected by anterograde amnesia, the inability to form new memories. In this paper, we propose a novel method to train AI agents to incorporate knowledge and skills for multiple tasks without the need for either cumbersome note systems or prior high-quality demonstration data. Our approach employs an iterative process where the agent collects new experiences, receives corrective feedback from humans in the form of hints, and integrates this feedback into its weights via a context distillation training procedure. We demonstrate the efficacy of our approach by implementing it in a Llama-3-based agent that, after only a few rounds of feedback, outperforms advanced models GPT-4o and DeepSeek-V3 in tasksets requiring correct sequencing of information retrieval, tool use, and question answering.

Article 540

Title@2025-05-28 (3): Update Your Transformer to the Latest Release: Re-Basin of Task Vectors

Title: Update Your Transformer to the Latest Release: Re-Basin of Task Vectors

Aktualisieren Sie Ihren Transformer auf die neueste Version: Re-Basin der Task-Vektoren

将您的变换器更新为最新版本: 任务矢量的重新 Basin 2505.22697v1

Authors: Filippo Rinaldi, Giacomo Capitani, Lorenzo Bonicelli, Donato Crisostomi, Federico Bolelli, Elisa Ficarra, Emanuele Rodolà, Simone Calderara, Angelo Porrello

Foundation models serve as the backbone for numerous specialized models developed through fine-tuning. However, when the underlying pretrained model is updated or retrained (e.g., on larger and more curated datasets), the fine-tuned model becomes obsolete, losing its utility and requiring retraining. This raises the question: is it possible to transfer fine-tuning to a new release of the model? In this work, we investigate how to transfer fine-tuning to a new checkpoint without having to re-train, in a data-free manner. To do so, we draw principles from model re-basin and provide a recipe based on weight permutations to re-base the modifications made to the original base model, often called the task vector. In particular, our approach tailors model re-basin for Transformer models, taking into account the challenges of residual connections and multi-head attention layers. Specifically, we propose a two-level method rooted in spectral theory, initially permuting the attention heads and subsequently adjusting parameters within select pairs of heads. Through extensive experiments on visual and textual tasks, we achieve the seamless transfer of fine-tuned knowledge to new pre-trained backbones without relying on a single training step or datapoint. Code is available at https://github.com/aimagelab/TransFusion.
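
The notion of a task vector can be sketched with plain parameter arithmetic. The snippet below shows only the naive transfer (adding the old task vector to a new base checkpoint) and deliberately omits the permutation-based re-basin step that the paper proposes; the parameter names and values are hypothetical.

```python
def task_vector(fine_tuned, base):
    """Per-parameter difference between fine-tuned and base weights."""
    return {
        name: [f - b for f, b in zip(fine_tuned[name], base[name])]
        for name in base
    }

def apply_task_vector(new_base, vector):
    """Add a task vector to a (possibly different) base checkpoint."""
    return {
        name: [w + t for w, t in zip(new_base[name], vector[name])]
        for name in new_base
    }

# Hypothetical checkpoints, stored as name -> flat list of weights.
base = {"layer.weight": [1.0, 2.0]}
fine_tuned = {"layer.weight": [1.5, 1.0]}
new_base = {"layer.weight": [2.0, 2.0]}

tau = task_vector(fine_tuned, base)
# Naive transfer: assumes new_base is already aligned with base.
transferred = apply_task_vector(new_base, tau)
```

Naive addition implicitly assumes that the two base checkpoints share the same arrangement of units and heads; aligning them first is the problem that weight permutations are meant to address.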

Article 541

Title@2025-05-28 (3): An Empirical Evaluation of Rewiring Approaches in Graph Neural Networks

Title: An Empirical Evaluation of Rewiring Approaches in Graph Neural Networks

Eine empirische Bewertung der Verdrahtungsansätze in Graphen-Neuralen Netzwerken

对图形神经网络重新布线方法的经验评价 2305.19717v2

Authors: Alessio Micheli, Domenico Tortorella

Graph neural networks compute node representations by performing multiple message-passing steps that consist of local aggregations of node features. Building deep models that can leverage longer-range interactions between nodes is hindered by the issues of over-smoothing and over-squashing. In particular, the latter is attributed to the graph topology which guides the message-passing, causing a node representation to become insensitive to information contained at distant nodes. Many graph rewiring methods have been proposed to remedy or mitigate this problem. However, properly evaluating the benefits of these methods is made difficult by the coupling of over-squashing with other issues strictly related to model training, such as vanishing gradients. Therefore, we propose an evaluation setting based on message-passing models that do not require training to compute node and graph representations. We perform a systematic experimental comparison on real-world node and graph classification tasks, showing that rewiring the underlying graph rarely confers a practical benefit for message-passing.

Article 542

Title: Topological Eigenvalue Theorems for Tensor Analysis in Multi-Modal Data Fusion

Topologische Eigenwert-Theoreme für die Tensoranalyse in multi-Modal Data Fusion

多模式数据融合中用于天线分析的多模式数据融合中的表光分析的表性地球价值地形学理论论 2409.09392v3

Authors: Ronald Katende

This paper presents a novel framework for tensor eigenvalue analysis in the context of multi-modal data fusion, leveraging topological invariants such as Betti numbers. Traditional approaches to tensor eigenvalue analysis often extend matrix theory, whereas this work introduces a topological perspective to enhance the understanding of tensor structures. By establishing new theorems that link eigenvalues to topological features, the proposed framework provides deeper insights into the latent structure of data, improving both interpretability and robustness. Applications in data fusion demonstrate the theoretical and practical significance of this approach, with potential for broad impact in machine learning and data science.

Article 543

Title@2025-05-28 (3): Computing Optimal Transport Maps and Wasserstein Barycenters Using Conditional Normalizing Flows

Title: Computing Optimal Transport Maps and Wasserstein Barycenters Using Conditional Normalizing Flows

Computing Optimal Transport Maps und Wasserstein Barycenter mit bedingten Normalisierungsflüssen

使用条件性正常流动的最佳运输地图和瓦塞尔斯坦百分点 2505.22364v1

Authors: Gabriele Visentin, Patrick Cheridito

We present a novel method for efficiently computing optimal transport maps and Wasserstein barycenters in high-dimensional spaces. Our approach uses conditional normalizing flows to approximate the input distributions as invertible pushforward transformations from a common latent space. This makes it possible to directly solve the primal problem using gradient-based minimization of the transport cost, unlike previous methods that rely on dual formulations and complex adversarial optimization. We show how this approach can be extended to compute Wasserstein barycenters by solving a conditional variance minimization problem. A key advantage of our conditional architecture is that it enables the computation of barycenters for hundreds of input distributions, which was computationally infeasible with previous methods. Our numerical experiments illustrate that our approach yields accurate results across various high-dimensional tasks and compares favorably with previous state-of-the-art methods.

Article 544

Title@2025-05-28 (3): Directed Homophily-Aware Graph Neural Network

Title: Directed Homophily-Aware Graph Neural Network

Regie führte homophily-aware Graph Neural Network

直导光电图神经网络 2505.22362v1

Authors: Aihu Zhang, Jiaxing Xu, Mengcheng Lan, Shili Xiang, Yiping Ke

Graph Neural Networks (GNNs) have achieved significant success in various learning tasks on graph-structured data. Nevertheless, most GNNs struggle to generalize to heterophilic neighborhoods. Additionally, many GNNs ignore the directional nature of real-world graphs, resulting in suboptimal performance on directed graphs with asymmetric structures. In this work, we propose Directed Homophily-aware Graph Neural Network (DHGNN), a novel framework that addresses these limitations by incorporating homophily-aware and direction-sensitive components. DHGNN employs a resettable gating mechanism to adaptively modulate message contributions based on homophily levels and informativeness, and a structure-aware noise-tolerant fusion module to effectively integrate node representations from the original and reverse directions. Extensive experiments on both homophilic and heterophilic directed graph datasets demonstrate that DHGNN outperforms state-of-the-art methods in node classification and link prediction. In particular, DHGNN improves over the best baseline by up to 15.07% in link prediction. Our analysis further shows that the gating mechanism captures directional homophily gaps and fluctuating homophily across layers, providing deeper insights into message-passing behavior on complex graph structures.
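
As background for the homophily levels mentioned above, a common summary statistic is the edge homophily ratio: the fraction of edges whose endpoints share a label. The sketch below computes it over directed edges on a tiny hypothetical graph; it is a standard measure used for illustration and not the gating mechanism of DHGNN.

```python
def edge_homophily(edges, labels):
    """Fraction of directed edges whose source and target share a label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for src, dst in edges if labels[src] == labels[dst])
    return same / len(edges)

# Hypothetical graph: node id -> class label, plus directed (source, target) edges.
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
edges = [(0, 1), (1, 0), (0, 2), (2, 3)]

ratio = edge_homophily(edges, labels)
```

Values near 1 indicate a homophilic graph and values near 0 a heterophilic one.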

Article 545

Title@2025-05-28 (3): Continuum-armed Bandit Optimization with Batch Pairwise Comparison Oracles

Title: Continuum-armed Bandit Optimization with Batch Pairwise Comparison Oracles

Kontinuierliche Bandit-Optimierung mit Batch Pairwise Vergleich Oracles

以批次对称比较甲骨文优化利用批次对称比较 2505.22361v1

Authors: Xiangyu Chang, Xi Chen, Yining Wang, Zhiyi Zeng

This paper studies a bandit optimization problem where the goal is to maximize a function $f(x)$ over $T$ periods for some unknown strongly concave function $f$. We consider a new pairwise comparison oracle, where the decision-maker chooses a pair of actions $(x, x')$ for a consecutive number of periods and then obtains an estimate of $f(x)-f(x')$. We show that such a pairwise comparison oracle finds important applications to joint pricing and inventory replenishment problems and network revenue management. The challenge in this bandit optimization is twofold. First, the decision-maker needs to determine not only a pair of actions $(x, x')$ but also a stopping time $n$ (i.e., the number of queries based on $(x, x')$). Second, motivated by our inventory application, the estimate of the difference $f(x)-f(x')$ is biased, which is different from existing oracles in the stochastic optimization literature. To address these challenges, we first introduce a discretization technique and local polynomial approximation to relate this problem to linear bandits. Then we develop a tournament successive elimination technique to localize the discretized cell and run an interactive batched version of the LinUCB algorithm on cells. We establish regret bounds that are optimal up to poly-logarithmic factors. Furthermore, we apply our proposed algorithm and analytical framework to the two operations management problems and obtain results that improve on state-of-the-art results in the existing literature.

Article 546

Title@2025-05-28 (3): Multiclass Loss Geometry Matters for Generalization of Gradient Descent in Separable Classification

Title: Multiclass Loss Geometry Matters for Generalization of Gradient Descent in Separable Classification

Multiclass Loss Geometry Matters for Generalization of Gradient Descent in Separable Classification

多级损失多级损失多级损失多级分分分分化中梯源普遍化的多级几何事项 2505.22359v1

Authors: Matan Schliserman, Tomer Koren

We study the generalization performance of unregularized gradient methods for separable linear classification. While previous work mostly deals with the binary case, we focus on the multiclass setting with $k$ classes and establish novel population risk bounds for Gradient Descent for loss functions that decay to zero. In this setting, we show risk bounds that reveal that convergence rates are crucially influenced by the geometry of the loss template, as formalized by Wang and Scott (2024), rather than of the loss function itself. Particularly, we establish risk upper bounds that hold for any decay rate of the loss whose template is smooth with respect to the $p$-norm. In the case of exponentially decaying losses, our results indicate a contrast between the $p=\infty$ case, where the risk exhibits a logarithmic dependence on $k$, and $p=2$ where the risk scales linearly with $k$. To establish this separation formally, we also prove a lower bound in the latter scenario, demonstrating that the polynomial dependence on $k$ is unavoidable. Central to our analysis is a novel bound on the Rademacher complexity of low-noise vector-valued linear predictors with a loss template smooth w.r.t. general $p$-norms.

Article 547

Title@2025-05-28 (3): Budget-Adaptive Adapter Tuning in Orthogonal Subspaces for Continual Learning in LLMs

Title: Budget-Adaptive Adapter Tuning in Orthogonal Subspaces for Continual Learning in LLMs

Budget-Adaptive Adapter Tuning in Orthogonal Subspaces für kontinuierliches Lernen in LLMs

用于LLMM中持续学习的正方形子空间的预算-ADA 预算-ADA 调适器图案 2505.22358v1

Authors: Zhiyi Wan, Wanrou Du, Liang Li, Miao Pan, Xiaoqi Qin

Large language models (LLMs) often suffer from catastrophic forgetting in continual learning (CL) scenarios, where performance on previously learned tasks degrades severely while training on sequentially arriving tasks. Although pioneering CL approaches using orthogonal subspaces can mitigate task interference, they typically employ fixed budget allocation, neglecting the varying complexity across tasks and layers. Besides, recent budget-adaptive tuning methods for LLMs often adopt multi-stage paradigms that decouple optimization and budget allocation. Such decoupling results in potential misalignment, which hinders those approaches’ practical application in CL scenarios. To address these limitations, we propose OA-Adapter, a novel parameter-efficient approach for continual learning in LLMs that unifies dynamic budget adaptation with orthogonal subspace learning in a single end-to-end training stage. Specifically, OA-Adapter introduces a dynamic bottleneck dimension adaptation mechanism that simultaneously allocates an efficient parameter budget and optimizes task objectives without misalignment. To effectively preserve previously acquired knowledge while coordinating with the dynamic budget allocation, orthogonal constraints are applied specifically between the parameter subspace of the current task and the dynamically allocated parameter subspaces of historical tasks. Experimental results on continual learning benchmarks demonstrate that OA-Adapter outperforms state-of-the-art methods in both accuracy and parameter efficiency, achieving higher average accuracy while using 58.5% fewer parameters on the standard CL benchmark.

Article 548

Title@2025-05-28 (3): Suitability Filter: A Statistical Framework for Classifier Evaluation in Real-World Deployment Settings

Title: Suitability Filter: A Statistical Framework for Classifier Evaluation in Real-World Deployment Settings

Eignungsfilter: Ein statistisches Rahmenwerk für die Klassifikator-Evaluierung in Real-World-Einsatzeinstellungen

适用性过滤器:在现实世界部署设置中进行分类评价的统计框架 2505.22356v1

Authors: Angéline Pouget, Mohammad Yaghini, Stephan Rabanser, Nicolas Papernot

Deploying machine learning models in safety-critical domains poses a key challenge: ensuring reliable model performance on downstream user data without access to ground truth labels for direct validation. We propose the suitability filter, a novel framework designed to detect performance deterioration by utilizing suitability signals – model output features that are sensitive to covariate shifts and indicative of potential prediction errors. The suitability filter evaluates whether classifier accuracy on unlabeled user data shows significant degradation compared to the accuracy measured on the labeled test dataset. Specifically, it ensures that this degradation does not exceed a pre-specified margin, which represents the maximum acceptable drop in accuracy. To achieve reliable performance evaluation, we aggregate suitability signals for both test and user data and compare these empirical distributions using statistical hypothesis testing, thus providing insights into decision uncertainty. Our modular method adapts to various models and domains. Empirical evaluations across different classification tasks demonstrate that the suitability filter reliably detects performance deviations due to covariate shift. This enables proactive mitigation of potential failures in high-stakes applications.

Article 549

Title@2025-05-28 (3): Look Within or Look Beyond? A Theoretical Comparison Between Parameter-Efficient and Full Fine-Tuning

Title: Look Within or Look Beyond? A Theoretical Comparison Between Parameter-Efficient and Full Fine-Tuning

Schauen Sie nach innen oder schauen Sie darüber hinaus? Ein theoretischer Vergleich zwischen Parameter-Effizient und Full Fine-Tuning

内观还是外观? 参数有效与完全精准之间的理论比较。 2505.22355v1

Authors: Yongkang Liu, Xingle Xu, Ercong Nie, Zijing Wang, Shi Feng, Daling Wang, Qian Li, Hinrich Schütze

Parameter-Efficient Fine-Tuning (PEFT) methods achieve performance comparable to Full Fine-Tuning (FFT) while requiring significantly fewer computing resources, making them the go-to choice for researchers. We find that although PEFT can achieve competitive results on some benchmarks, its performance falls short of FFT in complex tasks, such as reasoning and instruction-based fine-tuning. In this paper, we compare the characteristics of PEFT and FFT in terms of representational capacity and robustness based on optimization theory. We theoretically demonstrate that PEFT is a strict subset of FFT. By providing theoretical upper bounds for PEFT, we show that the limited parameter space constrains the model’s representational ability, making it more susceptible to perturbations. Experiments on 15 datasets encompassing classification, generation, reasoning, and instruction fine-tuning tasks, as well as 11 adversarial test sets, validate our theories. We hope that these results spark further research beyond the realms of well-established PEFT. The source code is available in the anonymous GitHub repository: https://github.com/misonsky/PEFTEval.

Article 550

Title@2025-05-28 (3): Context-sensitive neocortical neurons transform the effectiveness and efficiency of neural information processing

Title: Context-sensitive neocortical neurons transform the effectiveness and efficiency of neural information processing

Kontext-sensible neocortical Neuronen verwandeln die Wirksamkeit und Effizienz der neuronalen Informationsverarbeitung

环境敏感的新园艺神经元改变神经信息处理的效益和效率 2207.07338v7

Authors: Khubaib Ahmed, Ahsan Adeel, Mario Franco, Mohsin Raza

Deep learning (DL) has big-data processing capabilities that are as good as, or even better than, those of humans in many real-world domains, but at the cost of high energy requirements that may be unsustainable in some applications and of errors that, though infrequent, can be large. We hypothesise that a fundamental weakness of DL lies in its intrinsic dependence on integrate-and-fire point neurons that maximise information transmission irrespective of whether it is relevant in the current context or not. This leads to unnecessary neural firing and to the feedforward transmission of conflicting messages, which makes learning difficult and processing energy inefficient. Here we show how to circumvent these limitations by mimicking the capabilities of context-sensitive neocortical neurons that receive input from diverse sources as a context to amplify and attenuate the transmission of relevant and irrelevant information, respectively. We demonstrate that a deep network composed of such local processors seeks to maximise agreement between the active neurons, thus restricting the transmission of conflicting information to higher levels and reducing the neural activity required to process large amounts of heterogeneous real-world data. Because these two-point neurons are shown to be far more effective and efficient than current forms of DL, this study offers a possible step-change in transforming the cellular foundations of deep network architectures.

Article 551

Title: AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings

AKRMap: Adaptive Kernel-Regression für vertrauenswürdige Visualisierung von Cross-Modal-Embeddings

AKRMap:跨模式嵌入的可信赖可视化的适应性内核倒退 2505.14664v2

Authors: Yilin Ye, Junchao Huang, Xingchen Zeng, Jiazhi Xia, Wei Zeng

Cross-modal embeddings form the foundation for multi-modal models. However, visualization methods for interpreting cross-modal embeddings have been primarily confined to traditional dimensionality reduction (DR) techniques like PCA and t-SNE. These DR methods primarily focus on feature distributions within a single modality, whilst failing to incorporate metrics (e.g., CLIPScore) across multiple modalities. This paper introduces AKRMap, a new DR technique designed to visualize cross-modal embedding metrics with enhanced accuracy by learning kernel regression of the metric landscape in the projection space. Specifically, AKRMap constructs a supervised projection network guided by a post-projection kernel regression loss, and employs adaptive generalized kernels that can be jointly optimized with the projection. This approach enables AKRMap to efficiently generate visualizations that capture complex metric distributions, while also supporting interactive features such as zoom and overlay for deeper exploration. Quantitative experiments demonstrate that AKRMap outperforms existing DR methods in generating more accurate and trustworthy visualizations. We further showcase the effectiveness of AKRMap in visualizing and comparing cross-modal embeddings for text-to-image models. Code and demo are available at https://github.com/yilinye/AKRMap.

Article 552

Title@2025-05-28 (3): Progressive Data Dropout: An Embarrassingly Simple Approach to Faster Training

Title: Progressive Data Dropout: An Embarrassingly Simple Approach to Faster Training

Progressive Data Dropout: Ein verblüffend einfacher Ansatz zum schnelleren Training

渐进数据辍学:快速培训的一个令人尴尬的简单方法 2505.22342v1

Authors: Shriram M S, Xinyue Hao, Shihao Hou, Yang Lu, Laura Sevilla-Lara, Anurag Arnab, Shreyank N Gowda

The success of the machine learning field has reliably depended on training on large datasets. While effective, this trend comes at an extraordinary cost. This is due to two deeply intertwined factors: the size of models and the size of datasets. While promising research efforts focus on reducing the size of models, the other half of the equation remains fairly mysterious. Indeed, it is surprising that the standard approach to training remains to iterate over and over, uniformly sampling the training dataset. In this paper we explore a series of alternative training paradigms that leverage insights from hard-data-mining and dropout, and that are simple enough to implement and use that they could become the new training standard. The proposed Progressive Data Dropout reduces the number of effective epochs to as little as 12.4% of the baseline. These savings do not come at any cost to accuracy. Surprisingly, the proposed method improves accuracy by up to 4.82%. Our approach requires no changes to model architecture or optimizer, and can be applied across standard training pipelines, thus posing an excellent opportunity for wide adoption. Code can be found here: https://github.com/bazyagami/LearningWithRevision

Article 553

Title@2025-05-28 (3): Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start

Title: Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start

Multimodale Reasoning durch verstärktes Lernen mit kaltem Start fördern

通过 “ 冷起 “ 的强化学习推进多模式理由 2505.22334v1

Authors: Lai Wei, Yuting Li, Kaipeng Zheng, Chen Wang, Yue Wang, Linghe Kong, Lichao Sun, Weiran Huang

Recent advancements in large language models (LLMs) have demonstrated impressive chain-of-thought reasoning capabilities, with reinforcement learning (RL) playing a crucial role in this progress. While “aha moment” patterns–where models exhibit self-correction through reflection–are often attributed to emergent properties from RL, we first demonstrate that these patterns exist in multimodal LLMs (MLLMs) prior to RL training but may not necessarily correlate with improved reasoning performance. Building on these insights, we present a comprehensive study on enhancing multimodal reasoning through a two-stage approach: (1) supervised fine-tuning (SFT) as a cold start with structured chain-of-thought reasoning patterns, followed by (2) reinforcement learning via GRPO to further refine these capabilities. Our extensive experiments show that this combined approach consistently outperforms both SFT-only and RL-only methods across challenging multimodal reasoning benchmarks. The resulting models achieve state-of-the-art performance among open-source MLLMs at both 3B and 7B scales, with our 7B model showing substantial improvements over base models (e.g., 66.3% $\rightarrow$ 73.4% on MathVista, 62.9% $\rightarrow$ 70.4% on We-Math) and our 3B model achieving performance competitive with several 7B models. Overall, this work provides practical guidance for building advanced multimodal reasoning models. Our code is available at https://github.com/waltonfuture/RL-with-Cold-Start.

Article 554

Title@2025-05-28 (3): Credal Prediction based on Relative Likelihood

Title: Credal Prediction based on Relative Likelihood

Credal Prediction basierend auf relativer Likelihood

基于相对可能性的裂变预测 2505.22332v1

Authors: Timo Löhr, Paul Hofman, Felix Mohr, Eyke Hüllermeier

Predictions in the form of sets of probability distributions, so-called credal sets, provide a suitable means to represent a learner’s epistemic uncertainty. In this paper, we propose a theoretically grounded approach to credal prediction based on the statistical notion of relative likelihood: The target of prediction is the set of all (conditional) probability distributions produced by the collection of plausible models, namely those models whose relative likelihood exceeds a specified threshold. This threshold has an intuitive interpretation and allows for controlling the trade-off between correctness and precision of credal predictions. We tackle the problem of approximating credal sets defined in this way by means of suitably modified ensemble learning techniques. To validate our approach, we illustrate its effectiveness by experiments on benchmark datasets demonstrating superior uncertainty representation without compromising predictive performance. We also compare our method against several state-of-the-art baselines in credal prediction.
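
Illustrative sketch (not taken from the paper): the relative-likelihood idea described above can be shown on a tiny ensemble. A model is kept as "plausible" when its likelihood divided by the best likelihood reaches a threshold, here called `alpha`, and the credal set is summarised by per-class lower and upper probabilities over the kept models. The function names, the use of `>=` for the threshold, and all numbers are assumptions made for illustration; the paper's ensemble construction is not reproduced here.

```python
import math

def plausible_models(log_likelihoods, alpha):
    """Indices of models whose relative likelihood L_i / max_j L_j is at least alpha."""
    best = max(log_likelihoods)
    return [i for i, ll in enumerate(log_likelihoods)
            if math.exp(ll - best) >= alpha]

def credal_bounds(predictions, keep):
    """Per-class lower and upper probabilities over the retained models."""
    n_classes = len(predictions[0])
    lower = [min(predictions[i][c] for i in keep) for c in range(n_classes)]
    upper = [max(predictions[i][c] for i in keep) for c in range(n_classes)]
    return lower, upper

# Hypothetical ensemble: training log-likelihoods and class probabilities for one query.
log_liks = [-100.0, -100.5, -103.0]
preds = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]]

keep = plausible_models(log_liks, alpha=0.5)   # third model is too unlikely
lower, upper = credal_bounds(preds, keep)
print(keep, lower, upper)
```

Lowering `alpha` admits more models and widens the bounds (more likely to cover the truth, less precise); raising it does the opposite, which is the trade-off the abstract refers to.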

Article 555

Title@2025-05-28 (3): Learning in Stackelberg Games with Non-myopic Agents

Title: Learning in Stackelberg Games with Non-myopic Agents

Lernen in Stackelberg Spiele mit nicht-myopischen Agenten

学习与非中色剂在斯塔克尔贝格运动会中的学习 2208.09407v3

Authors: Nika Haghtalab, Thodoris Lykouris, Sloan Nietert, Alexander Wei

We study Stackelberg games where a principal repeatedly interacts with a non-myopic long-lived agent, without knowing the agent’s payoff function. Although learning in Stackelberg games is well-understood when the agent is myopic, dealing with non-myopic agents poses additional complications. In particular, non-myopic agents may strategize and select actions that are inferior in the present in order to mislead the principal’s learning algorithm and obtain better outcomes in the future. We provide a general framework that reduces learning in presence of non-myopic agents to robust bandit optimization in the presence of myopic agents. Through the design and analysis of minimally reactive bandit algorithms, our reduction trades off the statistical efficiency of the principal’s learning algorithm against its effectiveness in inducing near-best-responses. We apply this framework to Stackelberg security games (SSGs), pricing with unknown demand curve, general finite Stackelberg games, and strategic classification. In each setting, we characterize the type and impact of misspecifications present in near-best responses and develop a learning algorithm robust to such misspecifications. On the way, we improve the state-of-the-art query complexity of learning in SSGs with $n$ targets from $O(n^3)$ to a near-optimal $\widetilde{O}(n)$ by uncovering a fundamental structural property of these games. The latter result is of independent interest beyond learning with non-myopic agents.

Article 556

Title@2025-05-28 (3): When Does Neuroevolution Outcompete Reinforcement Learning in Transfer Learning Tasks?

Title: When Does Neuroevolution Outcompete Reinforcement Learning in Transfer Learning Tasks?

Wann führt Neuroevolution das Verstärkte Lernen in Transfer-Lernaufgaben durch?

在转让学习任务方面,神经革命何时会超越竞争加强学习? 2505.22696v1

Authors: Eleni Nisioti, Joachim Winther Pedersen, Erwan Plantec, Milton L. Montero, Sebastian Risi

The ability to continuously and efficiently transfer skills across tasks is a hallmark of biological intelligence and a long-standing goal in artificial systems. Reinforcement learning (RL), a dominant paradigm for learning in high-dimensional control tasks, is known to suffer from brittleness to task variations and catastrophic forgetting. Neuroevolution (NE) has recently gained attention for its robustness, scalability, and capacity to escape local optima. In this paper, we investigate an understudied dimension of NE: its transfer learning capabilities. To this end, we introduce two benchmarks: a) in stepping gates, neural networks are tasked with emulating logic circuits, with designs that emphasize modular repetition and variation b) ecorobot extends the Brax physics engine with objects such as walls and obstacles and the ability to easily switch between different robotic morphologies. Crucial in both benchmarks is the presence of a curriculum that enables evaluating skill transfer across tasks of increasing complexity. Our empirical analysis shows that NE methods vary in their transfer abilities and frequently outperform RL baselines. Our findings support the potential of NE as a foundation for building more adaptable agents and highlight future challenges for scaling NE to complex, real-world problems.

Article 557

Title@2025-05-28 (3): LLM-ODDR: A Large Language Model Framework for Joint Order Dispatching and Driver Repositioning

Title: LLM-ODDR: A Large Language Model Framework for Joint Order Dispatching and Driver Repositioning

LLM-ODDR: Ein großes Sprachmodell für Joint Order Dispatching und Driver Repositioning

LLM-ODDR:联合调度和司机重新定位大语言示范框架 2505.22695v1

Authors: Tengfei Lyu, Siyuan Feng, Hao Liu, Hai Yang

Ride-hailing platforms face significant challenges in optimizing order dispatching and driver repositioning operations in dynamic urban environments. Traditional approaches based on combinatorial optimization, rule-based heuristics, and reinforcement learning often overlook driver income fairness, interpretability, and adaptability to real-world dynamics. To address these gaps, we propose LLM-ODDR, a novel framework leveraging Large Language Models (LLMs) for joint Order Dispatching and Driver Repositioning (ODDR) in ride-hailing services. LLM-ODDR framework comprises three key components: (1) Multi-objective-guided Order Value Refinement, which evaluates orders by considering multiple objectives to determine their overall value; (2) Fairness-aware Order Dispatching, which balances platform revenue with driver income fairness; and (3) Spatiotemporal Demand-Aware Driver Repositioning, which optimizes idle vehicle placement based on historical patterns and projected supply. We also develop JointDR-GPT, a fine-tuned model optimized for ODDR tasks with domain knowledge. Extensive experiments on real-world datasets from Manhattan taxi operations demonstrate that our framework significantly outperforms traditional methods in terms of effectiveness, adaptability to anomalous conditions, and decision interpretability. To our knowledge, this is the first exploration of LLMs as decision-making agents in ride-hailing ODDR tasks, establishing foundational insights for integrating advanced language models within intelligent transportation systems.

Article 558

Title@2025-05-28 (3): Individualised Counterfactual Examples Using Conformal Prediction Intervals

Title: Individualised Counterfactual Examples Using Conformal Prediction Intervals

Individualisierte gegenfaktische Beispiele mit konformen Vorhersageintervallen

使用非正式预测间隔的个别反事实实例 2505.22326v1

Authors: James M. Adams, Gesine Reinert, Lukasz Szpruch, Carsten Maple, Andrew Elliott

Counterfactual explanations for black-box models aim to provide insight into an algorithmic decision to its recipient. For a binary classification problem an individual counterfactual details which features might be changed for the model to infer the opposite class. High-dimensional feature spaces that are typical of machine learning classification models admit many possible counterfactual examples to a decision, and so it is important to identify additional criteria to select the most useful counterfactuals. In this paper, we explore the idea that the counterfactuals should be maximally informative when considering the knowledge of a specific individual about the underlying classifier. To quantify this information gain we explicitly model the knowledge of the individual, and assess the uncertainty of predictions which the individual makes by the width of a conformal prediction interval. Regions of feature space where the prediction interval is wide correspond to areas where the confidence in decision making is low, and an additional counterfactual example might be more informative to an individual. To explore and evaluate our individualised conformal prediction interval counterfactuals (CPICFs), first we present a synthetic data set on a hypercube which allows us to fully visualise the decision boundary, conformal intervals via three different methods, and resultant CPICFs. Second, in this synthetic data set we explore the impact of a single CPICF on the knowledge of an individual locally around the original query. Finally, in both our synthetic data set and a complex real-world dataset with a combination of continuous and discrete variables, we measure the utility of these counterfactuals via data augmentation, testing the performance on a held-out set.

Article 559

Title@2025-05-28 (3): A Closer Look on Memorization in Tabular Diffusion Model: A Data-Centric Perspective

Title: A Closer Look on Memorization in Tabular Diffusion Model: A Data-Centric Perspective

Ein genauerer Blick auf die Erinnerung an Tabular Diffusion Modell: Eine datenzentrische Perspektive

更仔细地看一看表格传播模型中的记忆化:数据核心视角 2505.22322v1

Authors: Zhengyu Fang, Zhimeng Jiang, Huiyuan Chen, Xiaoge Zhang, Kaiyu Tang, Xiao Li, Jing Li

Diffusion models have shown strong performance in generating high-quality tabular data, but they carry privacy risks by reproducing exact training samples. While prior work focuses on dataset-level augmentation to reduce memorization, little is known about which individual samples contribute most. We present the first data-centric study of memorization dynamics in tabular diffusion models. We quantify memorization for each real sample based on how many generated samples are flagged as replicas, using a relative distance ratio. Our empirical analysis reveals a heavy-tailed distribution of memorization counts: a small subset of samples contributes disproportionately to leakage, confirmed via sample-removal experiments. To understand this, we divide real samples into top- and non-top-memorized groups and analyze their training-time behaviors. We track when each sample is first memorized and monitor per-epoch memorization intensity (AUC). Memorized samples are memorized slightly earlier and show stronger signals in early training. Based on these insights, we propose DynamicCut, a two-stage, model-agnostic mitigation method: (a) rank samples by epoch-wise intensity, (b) prune a tunable top fraction, and (c) retrain on the filtered dataset. Across multiple tabular datasets and models, DynamicCut reduces memorization with minimal impact on data diversity and downstream performance. It also complements augmentation-based defenses. Furthermore, DynamicCut enables cross-model transferability: high-ranked samples identified from one model (e.g., a diffusion model) are also effective for reducing memorization when removed from others, such as GANs and VAEs.
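
Illustrative sketch (not the authors' implementation): the rank-and-prune steps listed above can be written in a few lines, assuming each training sample already has a scalar memorization-intensity score. The function name `prune_top_fraction` and the scores are hypothetical, and retraining on the filtered data is left to whatever generative model is in use.

```python
def prune_top_fraction(intensity, fraction):
    """Return the sorted indices kept after dropping the top `fraction` of samples by score."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    n_drop = int(len(intensity) * fraction)
    # Rank sample indices from highest to lowest intensity.
    ranked = sorted(range(len(intensity)), key=lambda i: intensity[i], reverse=True)
    dropped = set(ranked[:n_drop])
    return [i for i in range(len(intensity)) if i not in dropped]

# Hypothetical per-sample intensity scores.
scores = [0.1, 0.9, 0.3, 0.8, 0.2]
kept = prune_top_fraction(scores, 0.4)
print(kept)  # indices of the samples to retrain on
```

The `fraction` argument plays the role of the tunable top fraction mentioned in the abstract; a larger value removes more of the heavy tail at the cost of discarding more training data.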

Article 560

Title@2025-05-28 (3): Core Context Aware Transformers for Long Context Language Modeling

Title: Core Context Aware Transformers for Long Context Language Modeling

Core Context Aware Transformers für lange Kontext-Sprachenmodellierung

长语语言建模核心认知变型器 2412.12465v2

Authors: Yaofo Chen, Zeng You, Shuhai Zhang, Haokun Li, Yirui Li, Yaowei Wang, Mingkui Tan

Transformer-based Large Language Models (LLMs) have exhibited remarkable success across a wide range of tasks, primarily attributed to the self-attention mechanism, which requires a token to consider all preceding tokens as its context to compute attention. However, when the context length L becomes very large (e.g., 128K), the amount of potentially redundant information in the context tends to increase. The redundant context not only hampers the modeling representation performance but also incurs unnecessary computational and storage overhead. In this paper, we propose a plug-and-play Core Context Aware (CCA) Attention for efficient long-context modeling, comprising two complementary modules: 1) A globality-aware pooling module groups input tokens and dynamically compresses each group into one core token based on their significance. In this way, our method automatically focuses and strengthens core context while diminishing redundancy during the learning process, leading to effective long-term dependency modeling. 2) A locality-preserving module incorporates neighboring tokens to preserve local context for detailed representation. Notably, our CCA-Attention is able to replace the self-attention module in existing LLMs with minimal fine-tuning cost. Extensive experimental results show the superiority of our method in both long-context modeling and computational efficiency over state-of-the-art methods.

Article 561

Title@2025-05-28 (3): Copresheaf Topological Neural Networks: A Generalized Deep Learning Framework

Title: Copresheaf Topological Neural Networks: A Generalized Deep Learning Framework

Copresheaf Topologische neurale Netzwerke: Ein generalisiertes Deep Learning Framework

Copresheaf 地形神经网络:普遍深层学习框架 2505.21251v2

Authors: Mustafa Hajij, Lennart Bastian, Sarah Osentoski, Hardik Kabaria, John L. Davenport, Sheik Dawood, Balaji Cherukuri, Joseph G. Kocheemoolayil, Nastaran Shahmansouri, Adrian Lew, Theodore Papamarkou, Tolga Birdal

We introduce copresheaf topological neural networks (CTNNs), a powerful and unifying framework that encapsulates a wide spectrum of deep learning architectures, designed to operate on structured data: including images, point clouds, graphs, meshes, and topological manifolds. While deep learning has profoundly impacted domains ranging from digital assistants to autonomous systems, the principled design of neural architectures tailored to specific tasks and data types remains one of the field’s most persistent open challenges. CTNNs address this gap by grounding model design in the language of copresheaves, a concept from algebraic topology that generalizes and subsumes most practical deep learning models in use today. This abstract yet constructive formulation yields a rich design space from which theoretically sound and practically effective solutions can be derived to tackle core challenges in representation learning: long-range dependencies, oversmoothing, heterophily, and non-Euclidean domains. Our empirical results on structured data benchmarks demonstrate that CTNNs consistently outperform conventional baselines, particularly in tasks requiring hierarchical or localized sensitivity. These results underscore CTNNs as a principled, multi-scale foundation for the next generation of deep learning architectures.

Article 562

Title@2025-05-28 (3): If Pigs Could Fly… Can LLMs Logically Reason Through Counterfactuals?

Title: If Pigs Could Fly… Can LLMs Logically Reason Through Counterfactuals?

Wenn Schweine fliegen könnten… können LLMs logischerweise durch Gegenfakten denken?

如果猪能飞… 2505.22318v1

Authors: Ishwar B Balappanawar, Vamshi Krishna Bonagiri, Anish R Joishy, Manas Gaur, Krishnaprasad Thirunarayan, Ponnurangam Kumaraguru

Large Language Models (LLMs) demonstrate impressive reasoning capabilities in familiar contexts, but struggle when the context conflicts with their parametric knowledge. To investigate this phenomenon, we introduce CounterLogic, a dataset containing 1,800 examples across 9 logical schemas, explicitly designed to evaluate logical reasoning through counterfactual (hypothetical knowledge-conflicting) scenarios. Our systematic evaluation of 11 LLMs across 6 different datasets reveals a consistent performance degradation, with accuracies dropping by 27% on average when reasoning through counterfactual information. We propose Self-Segregate, a prompting method enabling metacognitive awareness (explicitly identifying knowledge conflicts) before reasoning. Our method dramatically narrows the average performance gap from 27% to just 11%, while significantly increasing the overall accuracy (+7.5%). We discuss the implications of these findings and draw parallels to human cognitive processes, particularly on how humans disambiguate conflicting information during reasoning tasks. Our findings offer practical insights for understanding and enhancing LLMs’ reasoning capabilities in real-world applications, especially where models must logically reason independently of their factual knowledge.

Article 563

Title@2025-05-28 (3): Rethinking BPS: A Utility-Based Evaluation Framework

Title: Rethinking BPS: A Utility-Based Evaluation Framework

Rethinking BPS: Ein Nutzen-basierter Bewertungsrahmen

重新思考BPS:基于公用事业的评价框架 2505.22316v1

Authors: Konrad Özdemir, Lukas Kirchdorfer, Keyvan Amiri Elyasi, Han van der Aa, Heiner Stuckenschmidt

Business process simulation (BPS) is a key tool for analyzing and optimizing organizational workflows, supporting decision-making by estimating the impact of process changes. The reliability of such estimates depends on the ability of a BPS model to accurately mimic the process under analysis, making rigorous accuracy evaluation essential. However, the state-of-the-art approach to evaluating BPS models has two key limitations. First, it treats simulation as a forecasting problem, testing whether models can predict unseen future events. This fails to assess how well a model captures the as-is process, particularly when process behavior changes from train to test period. Thus, it becomes difficult to determine whether poor results stem from an inaccurate model or the inherent complexity of the data, such as unpredictable drift. Second, the evaluation approach strongly relies on Earth Mover’s Distance-based metrics, which can obscure temporal patterns and thus yield misleading conclusions about simulation quality. To address these issues, we propose a novel framework that evaluates simulation quality based on its ability to generate representative process behavior. Instead of comparing simulated logs to future real-world executions, we evaluate whether predictive process monitoring models trained on simulated data perform comparably to those trained on real data for downstream analysis tasks. Empirical results show that our framework not only helps identify sources of discrepancies but also distinguishes between model accuracy and data complexity, offering a more meaningful way to assess BPS quality.

Article 564

Title@2025-05-28 (3): MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections

Title: MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections

MUDDFormer: Breaking Residual Engpässe in Transformatoren über Multiway Dynamic Dense Connections

MUDDFormer:通过多路动态感应连接在变形器中打破残余瓶颈 2502.12170v2

Authors: Da Xiao, Qingye Meng, Shengping Li, Xingyuan Yuan

We propose MUltiway Dynamic Dense (MUDD) connections, a simple yet effective method to address the limitations of residual connections and enhance cross-layer information flow in Transformers. Unlike existing dense connection approaches with static and shared connection weights, MUDD generates connection weights dynamically depending on hidden states at each sequence position and for each decoupled input stream (the query, key, value or residual) of a Transformer block. MUDD connections can be seamlessly integrated into any Transformer architecture to create MUDDFormer. Extensive experiments show that MUDDFormer significantly outperforms Transformers across various model architectures and scales in language modeling, achieving the performance of Transformers trained with 1.8X-2.4X compute. Notably, MUDDPythia-2.8B matches Pythia-6.9B in pretraining ppl and downstream tasks and even rivals Pythia-12B in five-shot settings, while adding only 0.23% parameters and 0.4% computation. Code in JAX and PyTorch and pre-trained models are available at https://github.com/Caiyun-AI/MUDDFormer .

Article 565

Title@2025-05-28 (3): From Dormant to Deleted: Tamper-Resistant Unlearning Through Weight-Space Regularization

Title: From Dormant to Deleted: Tamper-Resistant Unlearning Through Weight-Space Regularization

Von Dormant zu Gelöscht: Tamper-Resistent Unlearning durch Gewicht-Raum-Regularisierung

从杜尔曼特移到删除:通过宽空正规化,让塔帕-较远摆脱学习 2505.22310v1

Authors: Shoaib Ahmed Siddiqui, Adrian Weller, David Krueger, Gintare Karolina Dziugaite, Michael Curtis Mozer, Eleni Triantafillou

Recent unlearning methods for LLMs are vulnerable to relearning attacks: knowledge believed-to-be-unlearned re-emerges by fine-tuning on a small set of (even seemingly-unrelated) examples. We study this phenomenon in a controlled setting for example-level unlearning in vision classifiers. We make the surprising discovery that forget-set accuracy can recover from around 50% post-unlearning to nearly 100% with fine-tuning on just the retain set – i.e., zero examples of the forget set. We observe this effect across a wide variety of unlearning methods, whereas for a model retrained from scratch excluding the forget set (gold standard), the accuracy remains at 50%. We observe that resistance to relearning attacks can be predicted by weight-space properties, specifically, $L_2$-distance and linear mode connectivity between the original and the unlearned model. Leveraging this insight, we propose a new class of methods that achieve state-of-the-art resistance to relearning attacks.
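
Illustrative sketch (not from the paper): the two weight-space quantities named above can be computed for any pair of flattened weight vectors. Here linear mode connectivity is probed with a loss barrier, meaning the largest excess of the loss along the straight line between the two models over the linear interpolation of the endpoint losses. The barrier definition, the function names, and the one-parameter toy loss are assumptions for illustration only.

```python
import math

def l2_distance(w_a, w_b):
    """Euclidean distance between two flattened weight vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w_a, w_b)))

def interpolate(w_a, w_b, t):
    """Point at fraction t on the straight line from w_a to w_b."""
    return [(1 - t) * a + t * b for a, b in zip(w_a, w_b)]

def loss_barrier(w_a, w_b, loss_fn, steps=11):
    """Largest gap between the path loss and the interpolated endpoint losses."""
    loss_a, loss_b = loss_fn(w_a), loss_fn(w_b)
    gaps = []
    for k in range(steps):
        t = k / (steps - 1)
        path_loss = loss_fn(interpolate(w_a, w_b, t))
        gaps.append(path_loss - ((1 - t) * loss_a + t * loss_b))
    return max(gaps)

# Toy loss with two separate minima at w = -1 and w = +1 (hypothetical stand-in for a network loss).
def toy_loss(w):
    return (w[0] ** 2 - 1) ** 2

original, unlearned = [-1.0], [1.0]
distance = l2_distance(original, unlearned)
barrier = loss_barrier(original, unlearned, toy_loss)
print(distance, barrier)
```

A barrier near zero means the two models sit in the same low-loss region; a large barrier together with a large distance indicates that they are separated in weight space.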

Article 566

Title@2025-05-28 (3): FireQ: Fast INT4-FP8 Kernel and RoPE-aware Quantization for LLM Inference Acceleration

Title: FireQ: Fast INT4-FP8 Kernel and RoPE-aware Quantization for LLM Inference Acceleration

FireQ: Schnelle INT4-FP8-Kernel- und RoPE-gestützte Quantisierung für LLM-Inferenzbeschleunigung

FireQ:快速INT4-FP8 内核和RoPE-感知的LLM 推推加速量 2505.20839v2

Authors: Daehyeon Baek, Jieun Choi, Jimyoung Son, Kyungmin Bin, Seungbeom Choi, Kihyo Moon, Minsung Jang, Hyojung Lee

As large language models become increasingly prevalent, memory bandwidth constraints significantly limit inference throughput, motivating post-training quantization (PTQ). In this paper, we propose FireQ, a co-designed PTQ framework and an INT4-FP8 matrix multiplication kernel that accelerates LLM inference across all linear layers. Specifically, FireQ quantizes linear layer weights and key-values to INT4, and activations and queries to FP8, significantly enhancing throughput. Additionally, we introduce a three-stage pipelining for the prefill phase, which modifies the FlashAttention-3 kernel, effectively reducing time-to-first-token in the prefill phase. To minimize accuracy loss from quantization, we develop novel outlier smoothing techniques tailored separately for linear and attention layers. In linear layers, we explicitly use per-tensor scaling to prevent underflow caused by the FP8 quantization scaling factor of INT4 quantization, and channel-wise scaling to compensate for coarse granularity of INT4. In attention layers, we address quantization challenges posed by rotary positional embeddings (RoPE) by combining pre-RoPE and post-RoPE scaling strategies. FireQ significantly outperforms state-of-the-art methods, achieving 1.68x faster inference in feed-forward network layers on Llama2-7B and 1.26x faster prefill phase performance on Llama3-8B compared to QServe, with negligible accuracy loss.
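
Illustrative sketch (generic, not FireQ's kernel): channel-wise scaling for INT4 weights can be shown with plain symmetric quantization, where each output channel (one row here) gets its own scale so that its largest weight maps to 7. The function names and the weight values are hypothetical, and the FP8 activation path and per-tensor scaling described in the abstract are not modelled.

```python
def quantize_int4(row):
    """Symmetric INT4 quantization of one channel: integers in [-8, 7] plus one scale."""
    max_abs = max(abs(w) for w in row)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in row]
    return q, scale

def dequantize(q, scale):
    """Map INT4 codes back to approximate floating-point weights."""
    return [v * scale for v in q]

# Two hypothetical output channels with very different magnitudes.
weights = [[1.75, -0.5, 0.0], [14.0, 5.2, -14.0]]
quantized = [quantize_int4(row) for row in weights]
restored = [dequantize(q, s) for q, s in quantized]
print(quantized)
print(restored)
```

With a single scale shared by both rows, the small-magnitude channel would be squeezed into only a couple of the 16 available levels; a separate scale per channel is the standard way to compensate for that coarse granularity.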

Article 567

Title@2025-05-28 (3): Transformers Pretrained on Procedural Data Contain Modular Structures for Algorithmic Reasoning

Title: Transformers Pretrained on Procedural Data Contain Modular Structures for Algorithmic Reasoning

Transformer vorgebildet auf verfahrenstechnische Daten enthalten modulare Strukturen für algorithmische Vernunft

在包含用于算法理由的模块结构的程序性数据方面受过预先培训的变异器 2505.22308v1

Authors: Zachary Shinnick, Liangze Jiang, Hemanth Saratchandran, Anton van den Hengel, Damien Teney

Pretraining on large, semantically rich datasets is key for developing language models. Surprisingly, recent studies have shown that even synthetic data, generated procedurally through simple semantic-free algorithms, can yield some of the same benefits as natural language pretraining. It is unclear what specific capabilities such simple synthetic data instils in a model, where these capabilities reside in the architecture, and how they manifest within its weights. In this short paper, we identify several beneficial forms of procedural data, together with specific algorithmic reasoning skills that improve in small transformers. Our core finding is that different procedural rules instil distinct but complementary inductive structures in the model. With extensive ablations and partial-transfer experiments, we discover that these structures reside in different parts of the model. Attention layers often carry the most transferable information, but some pretraining rules impart useful structure to MLP blocks instead. Most interestingly, the structures induced by multiple rules can be composed to jointly reinforce multiple capabilities. These results suggest an exciting possibility of disentangling the acquisition of knowledge from reasoning in language models, with the goal of improving their robustness and data efficiency.

Article 568

Title@2025-05-28 (3): Risk-Informed Diffusion Transformer for Long-Tail Trajectory Prediction in the Crash Scenario

Title: Risk-Informed Diffusion Transformer for Long-Tail Trajectory Prediction in the Crash Scenario

Risiko-informierter Diffusionstransformator für langspurige Trajektorien-Vorhersage im Crash-Szenario

崩溃设想情景中长帆轨迹预测风险化传导变异器 2501.16349v2

Authors: Junlan Chen, Pei Liu, Zihao Zhang, Hongyi Zhao, Yufei Ji, Ziyuan Pu

Trajectory prediction methods have been widely applied in autonomous driving technologies. Although the overall accuracy of trajectory prediction is relatively high, the lack of trajectory data for critical scenarios in the training data leads to the long-tail phenomenon. Normally, the trajectories of the tail data are more critical and more difficult to predict and may include rare scenarios such as crashes. To solve this problem, we extracted the trajectory data from real-world crash scenarios, which contain more long-tail data. Meanwhile, based on the trajectory data in this scenario, we integrated graph-based risk information and diffusion with a transformer and proposed the Risk-Informed Diffusion Transformer (RI-DiT) trajectory prediction method. Extensive experiments were conducted on trajectory data in the real-world crash scenario, and the results show that the algorithm we proposed has good performance. When predicting the data of the tail 10% (Top 10%), the minADE and minFDE indicators are 0.016/2.667 m. At the same time, we showed the trajectory conditions of different long-tail distributions. The closer the trajectory data lies to the tail of the distribution, the less smooth the trajectory is. By using trajectory data from real-world crash scenarios, our work expands the methods available to overcome the long-tail challenges in trajectory prediction. Our method, RI-DiT, integrates inverse time to collision (ITTC) and traffic-flow features, which allows it to predict long-tail trajectories more accurately and improve the safety of autonomous driving systems.


Article 569

Title@2025-05-28 (3): Robustness and Cybersecurity in the EU Artificial Intelligence Act

Title: Robustness and Cybersecurity in the EU Artificial Intelligence Act

Robustheit und Cybersicherheit im EU-Gesetz über künstliche Intelligenz

《欧盟人工情报法》中的强力和网络安全 2502.16184v2

Authors: Henrik Nolte, Miriam Rateike, Michèle Finck

The EU Artificial Intelligence Act (AIA) establishes different legal principles for different types of AI systems. While prior work has sought to clarify some of these principles, little attention has been paid to robustness and cybersecurity. This paper aims to fill this gap. We identify legal challenges and shortcomings in provisions related to robustness and cybersecurity for high-risk AI systems (Art. 15 AIA) and general-purpose AI models (Art. 55 AIA). We show that robustness and cybersecurity demand resilience against performance disruptions. Furthermore, we assess potential challenges in implementing these provisions in light of recent advancements in the machine learning (ML) literature. Our analysis informs efforts to develop harmonized standards, guidelines by the European Commission, as well as benchmarks and measurement methodologies under Art. 15(2) AIA. With this, we seek to bridge the gap between legal terminology and ML research, fostering a better alignment between research and implementation efforts.


Article 570

Title@2025-05-28 (3): Versatile Cardiovascular Signal Generation with a Unified Diffusion Transformer

Title: Versatile Cardiovascular Signal Generation with a Unified Diffusion Transformer

Vielseitige kardiovaskuläre Signalgenerierung mit einem Unified Diffusion Transformer

具有统一扩散变异器的心血管心血管信号生成 2505.22306v1

Authors: Zehua Chen, Yuyang Miao, Liyuan Wang, Luyun Fan, Danilo P. Mandic, Jun Zhu

Cardiovascular signals such as photoplethysmography (PPG), electrocardiography (ECG), and blood pressure (BP) are inherently correlated and complementary, together reflecting the health of the cardiovascular system. However, their joint utilization in real-time monitoring is severely limited by diverse acquisition challenges, from noisy wearable recordings to burdensome invasive procedures. Here we propose UniCardio, a multi-modal diffusion transformer that reconstructs low-quality signals and synthesizes unrecorded signals in a unified generative framework. Its key innovations include a specialized model architecture to manage the signal modalities involved in generation tasks and a continual learning paradigm to incorporate varying modality combinations. By exploiting the complementary nature of cardiovascular signals, UniCardio clearly outperforms recent task-specific baselines in signal denoising, imputation, and translation. The generated signals match the performance of ground-truth signals in detecting abnormal health conditions and estimating vital signs, even in unseen domains, while ensuring interpretability for human experts. These advantages position UniCardio as a promising avenue for advancing AI-assisted healthcare.


Article 571

Title@2025-05-28 (3): LLäMmlein: Compact and Competitive German-Only Language Models from Scratch

Title: LLäMmlein: Compact and Competitive German-Only Language Models from Scratch

LLäMmlein: Kompakte und wettbewerbsfähige deutschsprachige Sprachmodelle von Scratch

LläMmlein:来自斯克拉奇的契约和竞争性独德语言模式 2411.11171v4

Authors: Jan Pfister, Julia Wunderle, Andreas Hotho

We create two German-only decoder models, LLäMmlein 120M and 1B, transparently from scratch and publish them, along with the training data, for the German NLP research community to use. The model training involved several key steps, including extensive data preprocessing, the creation of a custom German tokenizer, the training itself, as well as the evaluation of the final models on various benchmarks. Throughout the training process, multiple checkpoints were saved and analyzed using the SuperGLEBer benchmark to monitor the models’ learning dynamics. Compared to state-of-the-art models on the SuperGLEBer benchmark, both LLäMmlein models performed competitively, consistently matching or surpassing models with similar parameter sizes. The results show that the models’ quality scales with size as expected, but performance improvements on some tasks plateaued early, offering valuable insights into resource allocation for future model development.


Article 572

Title@2025-05-28 (3): Diss-l-ECT: Dissecting Graph Data with Local Euler Characteristic Transforms

Title: Diss-l-ECT: Dissecting Graph Data with Local Euler Characteristic Transforms

Diss-l-ECT: Entschlüsselung von Graphendaten mit lokalen Euler-Charakteristik-Transformationen

Diss- l- ECT: 用本地电磁特征变换解析图表数据 2410.02622v2

Authors: Julius von Rohrscheidt, Bastian Rieck

The Euler Characteristic Transform (ECT) is an efficiently-computable geometrical-topological invariant that characterizes the global shape of data. In this paper, we introduce the Local Euler Characteristic Transform ($\ell$-ECT), a novel extension of the ECT particularly designed to enhance expressivity and interpretability in graph representation learning. Unlike traditional Graph Neural Networks (GNNs), which may lose critical local details through aggregation, the $\ell$-ECT provides a lossless representation of local neighborhoods. This approach addresses key limitations in GNNs by preserving nuanced local structures while maintaining global interpretability. Moreover, we construct a rotation-invariant metric based on $\ell$-ECTs for spatial alignment of data spaces. Our method exhibits superior performance compared to standard GNNs on a variety of node-classification tasks, while also offering theoretical guarantees that demonstrate its effectiveness.
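
As a rough illustration of the quantity involved, the sketch below computes an Euler characteristic curve for a graph embedded in the plane along one direction: for each threshold it counts the vertices whose height is at most the threshold, minus the edges whose endpoints both satisfy that bound. Restricting the computation to a node's 1-hop neighborhood is an assumption made here to illustrate the "local" idea; the abstract does not specify how neighborhoods are defined, and the function names are hypothetical.

```python
def euler_characteristic_curve(points, edges, direction, thresholds):
    """Euler characteristic (vertices minus edges) of sublevel sets of a graph."""
    # Height of each vertex = inner product of its coordinates with the direction.
    heights = [sum(c * d for c, d in zip(p, direction)) for p in points]
    curve = []
    for t in thresholds:
        n_vertices = sum(1 for h in heights if h <= t)
        # An edge enters the sublevel set once both of its endpoints have entered.
        n_edges = sum(1 for u, v in edges if max(heights[u], heights[v]) <= t)
        curve.append(n_vertices - n_edges)
    return curve

def one_hop_subgraph(points, edges, center):
    """Assumed notion of locality: the center, its neighbors, and edges among them."""
    nodes = {center}
    for u, v in edges:
        if u == center:
            nodes.add(v)
        if v == center:
            nodes.add(u)
    kept = sorted(nodes)
    index = {node: i for i, node in enumerate(kept)}
    sub_points = [points[node] for node in kept]
    sub_edges = [(index[u], index[v]) for u, v in edges if u in nodes and v in nodes]
    return sub_points, sub_edges

# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to node 1.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 0.0)]
edges = [(0, 1), (1, 2), (0, 2), (1, 3)]

triangle_curve = euler_characteristic_curve(
    points[:3], edges[:3], direction=(1.0, 0.0), thresholds=[-0.5, 0.0, 0.5, 1.0]
)
local_points, local_edges = one_hop_subgraph(points, edges, center=3)
local_curve = euler_characteristic_curve(
    local_points, local_edges, direction=(1.0, 0.0), thresholds=[0.0, 1.0, 2.0]
)
print(triangle_curve, local_curve)
```

A full transform would repeat this over many directions and stack the resulting curves into a feature vector per node.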


Article 573

Title@2025-05-28 (3): 360-LLaMA-Factory: Plug & Play Sequence Parallelism for Long Post-Training

Title: 360-LLaMA-Factory: Plug & Play Sequence Parallelism for Long Post-Training

360-LlaMA-Fabrik: Plug & Play-Sequenz-Parallelität für langes Nachtraining

360-LLamaMA-Factory: 长期培训之后的插件和播放序列平行主义 2505.22296v1

Authors: Haosheng Zou, Xiaowei Lv, Shousheng Jia, Xiangzheng Zhang

Adding sequence parallelism into LLaMA-Factory, we open-sourced 360-LLaMA-Factory at https://github.com/Qihoo360/360-LLaMA-Factory. 360-LLaMA-Factory has received wide recognition and has been used in models such as Light-R1 arXiv:2503.10460, TinyR1 arXiv:2503.04872, and Kaggle AIMO math models, as well as in large companies’ training frameworks. This technical report delves deeper into the different sequence parallel modes behind 360-LLaMA-Factory and discusses our implementation insights.


Article 574

Title@2025-05-28 (3): Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond

Title: Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond

Light-R1: Curriculum SFT, DPO und RL für Long COT aus Scratch und darüber hinaus

Light-R1:SFT、DPO和RL课程,用于Scratch及以后的长期COT 2503.10460v4

Authors: Liang Wen, Yunke Cai, Fenrui Xiao, Xin He, Qi An, Zhenyu Duan, Yimin Du, Junchen Liu, Lifu Tang, Xiaowei Lv, Haosheng Zou, Yongchao Deng, Shousheng Jia, Xiangzheng Zhang

This paper introduces Light-R1, an open-source suite for training long reasoning models using a reproducible and cost-effective methodology. Given the proprietary nature of data used in the DeepSeek-R1 series, we develop an alternative approach leveraging exclusively public data and models. Our curriculum training progressively increases data difficulty, combined with multi-staged post-training. Our Light-R1-32B model, trained from Qwen2.5-32B-Instruct, outperforms DeepSeek-R1-Distill-Qwen-32B in math reasoning. Experimental results show that this curriculum approach becomes more effective when distinct, diverse datasets are available for different training stages: fine-tuning DeepSeek-R1-Distilled models (pre-tuned by the DeepSeek team on proprietary data) with 3,000 challenging examples from our curriculum dataset yielded state-of-the-art 7B and 14B models, while the 32B model, Light-R1-32B-DS, performed comparably to QwQ-32B and DeepSeek-R1. Furthermore, we extend our work by applying GRPO on long reasoning models. Our final Light-R1-14B-DS achieves SOTA performance among 14B models in math, with AIME24 & 25 scores of 74.0 and 60.2 respectively, surpassing many 32B models and DeepSeek-R1-Distill-Llama-70B. Despite math-focused training, Light-R1-14B-DS demonstrates strong cross-domain generalization. Light-R1 represents a significant advancement in making sophisticated reasoning models more accessible and implementable in real-world applications. Our models, training data and code have been made available at https://github.com/Qihoo360/Light-R1.


Article 575

Title@2025-05-28 (3): MoRE: A Mixture of Low-Rank Experts for Adaptive Multi-Task Learning

Title: MoRE: A Mixture of Low-Rank Experts for Adaptive Multi-Task Learning

MoRE: Eine Mischung aus Low-Rank Experten für adaptives Multi-Task Learning

MoRE: 适应性多任务学习低级专家混合组合 2505.22694v1

Authors: Dacao Zhang, Kun Zhang, Shimao Chu, Le Wu, Xin Li, Si Wei

With the rapid development of Large Language Models (LLMs), Parameter-Efficient Fine-Tuning (PEFT) methods, which aim to fine-tune LLMs efficiently with fewer parameters, have gained significant attention. As a representative PEFT method, Low-Rank Adaptation (LoRA) introduces low-rank matrices to approximate the incremental tuning parameters and achieves impressive performance over multiple scenarios. Since then, many refinements of LoRA have been proposed. However, these methods either focus on single-task scenarios or separately train multiple LoRA modules for multi-task scenarios, limiting the efficiency and effectiveness of LoRA in multi-task scenarios. To better adapt to multi-task fine-tuning, in this paper, we propose a novel Mixture of Low-Rank Experts (MoRE) for multi-task PEFT. Specifically, instead of using an individual LoRA for each task, we align different ranks of a LoRA module with different tasks, which we name low-rank experts. Moreover, we design a novel adaptive rank selector to select the appropriate expert for each task. By jointly training low-rank experts, MoRE can enhance the adaptability and efficiency of LoRA in multi-task scenarios. Finally, we conduct extensive experiments over multiple multi-task benchmarks along with different LLMs to verify model performance. Experimental results demonstrate that compared to traditional LoRA and its variants, MoRE significantly improves the performance of LLMs in multi-task scenarios and incurs no additional inference cost. We also release the model and code to facilitate the community.
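
The idea of treating different ranks of one LoRA module as separate experts can be sketched as follows. This is only an illustration under a strong assumption: that a rank-r expert is obtained by truncating a single shared pair of LoRA matrices to their first r components. The abstract does not confirm this construction, and the rank selector is omitted; `lora_forward` and all matrices below are hypothetical.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def lora_forward(W0, A, B, x, rank, scale=1.0):
    """Compute y = W0 x + scale * B[:, :rank] (A[:rank, :] x).

    W0: frozen weight (d_out x d_in); A: (r_max x d_in); B: (d_out x r_max).
    `rank` picks which low-rank expert is active (assumed truncation scheme).
    """
    base = matvec(W0, x)
    z = matvec(A[:rank], x)  # project the input down to `rank` dimensions
    delta = [sum(b * zi for b, zi in zip(row[:rank], z)) for row in B]
    return [b + scale * d for b, d in zip(base, delta)]

# Toy example (hypothetical values): identity base weight, r_max = 2.
W0 = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 2.0]

y_rank0 = lora_forward(W0, A, B, x, rank=0)  # base model only
y_rank1 = lora_forward(W0, A, B, x, rank=1)  # lowest-rank expert
y_rank2 = lora_forward(W0, A, B, x, rank=2)  # full-rank expert
print(y_rank0, y_rank1, y_rank2)
```

Because the update `B A` can be merged into `W0` once a rank is fixed, a scheme of this kind adds no inference cost, which is consistent with the claim in the abstract.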


Article 576

Title@2025-05-28 (3): Rethinking the Unsolvable: When In-Context Search Meets Test-Time Scaling

Title: Rethinking the Unsolvable: When In-Context Search Meets Test-Time Scaling

Das Unlösbare neu denken: Wenn In-Context Search Test-Time Scaling trifft

重新思考无法解答的问题: 当 In-Ctext 搜索遇到测试时间缩放时 2505.22290v1

Authors: Fanzeng Xia, Yidong Luo, Tinko Sebastian Bartels, Yaqi Xu, Tongxin Li

Recent research has highlighted that Large Language Models (LLMs), even when trained to generate extended long reasoning steps, still face significant challenges on hard reasoning problems. However, much of the existing literature relies on direct prompting with simple in-context learning examples for evaluation, which largely overlooks advanced techniques to elicit LLMs’ deliberate reasoning before drawing conclusions that LLMs hit a performance ceiling. In this paper, we systematically explore the combined potential of in-context search and test-time scaling on super hard reasoning tasks. We find that by employing advanced in-context search prompting to LLMs augmented with internal scaling, one can achieve transformative performance breakthroughs on tasks previously deemed “unsolvable” (e.g., reported success rates below 5%). We provide both empirical results and theoretical analysis of how this combination can unleash LLM reasoning capabilities: i) Empirically, on controlled NP-hard tasks and complex real-world planning benchmarks, our approach achieves up to a 30x improvement in success rates compared to previously reported results without any external mechanisms; ii) Theoretically, we show that in-context search prompting, when combined with internal scaling, significantly extends the complexity class of solvable reasoning problems. These findings challenge prevailing assumptions about the limitations of LLMs on complex tasks, indicating that current evaluation paradigms systematically underestimate their true potential. Our work calls for a critical reassessment of how LLM reasoning is benchmarked and a more robust evaluation strategy that fully captures the true capabilities of contemporary LLMs, which can lead to a better understanding of their operational reasoning boundaries in real-world deployments.


Article 577

Title@2025-05-28 (3): A Variational Perspective on Generative Protein Fitness Optimization

Title: A Variational Perspective on Generative Protein Fitness Optimization

Eine abwechslungsreiche Perspektive auf generative Protein-Fitness-Optimierung

关于最优化的生质蛋白质健身的变异视角 2501.19200v2

Authors: Lea Bogensperger, Dominik Narnhofer, Ahmed Allam, Konrad Schindler, Michael Krauthammer

The goal of protein fitness optimization is to discover new protein variants with enhanced fitness for a given use. The vast search space and the sparsely populated fitness landscape, along with the discrete nature of protein sequences, pose significant challenges when trying to determine the gradient towards configurations with higher fitness. We introduce Variational Latent Generative Protein Optimization (VLGPO), a variational perspective on fitness optimization. Our method embeds protein sequences in a continuous latent space to enable efficient sampling from the fitness distribution and combines a (learned) flow matching prior over sequence mutations with a fitness predictor to guide optimization towards sequences with high fitness. VLGPO achieves state-of-the-art results on two different protein benchmarks of varying complexity. Moreover, the variational design with explicit prior and likelihood functions offers a flexible plug-and-play framework that can be easily customized to suit various protein design tasks.


Article 578

Title@2025-05-28 (3): Random Feature Representation Boosting

Title: Random Feature Representation Boosting

Zufällige Merkmalsdarstellung steigert sich

随机特性显示促进 2501.18283v3

Authors: Nikita Zozoulenko, Thomas Cass, Lukas Gonon

We introduce Random Feature Representation Boosting (RFRBoost), a novel method for constructing deep residual random feature neural networks (RFNNs) using boosting theory. RFRBoost uses random features at each layer to learn the functional gradient of the network representation, enhancing performance while preserving the convex optimization benefits of RFNNs. In the case of MSE loss, we obtain closed-form solutions to greedy layer-wise boosting with random features. For general loss functions, we show that fitting random feature residual blocks reduces to solving a quadratically constrained least squares problem. Through extensive numerical experiments on tabular datasets for both regression and classification, we show that RFRBoost significantly outperforms RFNNs and end-to-end trained MLP ResNets in the small- to medium-scale regime where RFNNs are typically applied. Moreover, RFRBoost offers substantial computational benefits, and theoretical guarantees stemming from boosting theory.


Article 579

Title@2025-05-28 (3): Sample Efficient Robot Learning in Supervised Effect Prediction Tasks

Title: Sample Efficient Robot Learning in Supervised Effect Prediction Tasks

Beispiel Effizientes Roboter-Lernen in überwachten Effekt-Vorhersage-Aufgaben

在监督效应预测任务中提高机器人学习效率 2412.02331v2

Authors: Mehmet Arda Eren, Erhan Oztop

In self-supervised robotic learning, agents acquire data through active interaction with their environment, incurring costs such as energy use, human oversight, and experimental time. To mitigate these, sample-efficient exploration is essential. While intrinsic motivation (IM) methods like learning progress (LP) are widely used in robotics, and active learning (AL) is well established for classification in machine learning, few frameworks address continuous, high-dimensional regression tasks typical of world model learning. We propose MUSEL (Model Uncertainty for Sample-Efficient Learning), a novel AL framework tailored for regression tasks in robotics, such as action-effect prediction. MUSEL introduces a model uncertainty metric that combines total predictive uncertainty, learning progress, and input diversity to guide data acquisition. We validate our approach using a Stochastic Variational Deep Kernel Learning (SVDKL) model in two robotic tabletop tasks. Experimental results demonstrate that MUSEL improves both learning accuracy and sample efficiency, validating its effectiveness in learning action effects and selecting informative samples.


Article 580

Title@2025-05-28 (3): From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning

Title: From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning

Von den Kerneln zu den Features: Eine Multi-Scale Adaptive Theorie des Feature Learning

从核心到地貌特征:多尺度适应性地貌学习理论 2502.03210v2

Authors: Noa Rubin, Kirsten Fischer, Javed Lindner, David Dahmen, Inbar Seroussi, Zohar Ringel, Michael Krämer, Moritz Helias

Feature learning in neural networks is crucial for their expressive power and inductive biases, motivating various theoretical approaches. Some approaches describe network behavior after training through a change in kernel scale from initialization, resulting in a generalization power comparable to a Gaussian process. Conversely, in other approaches training results in the adaptation of the kernel to the data, involving directional changes to the kernel. The relationship and respective strengths of these two views have so far remained unresolved. This work presents a theoretical framework of multi-scale adaptive feature learning bridging these two views. Using methods from statistical mechanics, we derive analytical expressions for network output statistics which are valid across scaling regimes and in the continuum between them. A systematic expansion of the network’s probability distribution reveals that mean-field scaling requires only a saddle-point approximation, while standard scaling necessitates additional correction terms. Remarkably, we find across regimes that kernel adaptation can be reduced to an effective kernel rescaling when predicting the mean network output in the special case of a linear network. However, for linear and non-linear networks, the multi-scale adaptive approach captures directional feature learning effects, providing richer insights than what could be recovered from a rescaling of the kernel alone.


Article 581

Title@2025-05-28 (3): Zero-Shot Mono-to-Binaural Speech Synthesis

Title: Zero-Shot Mono-to-Binaural Speech Synthesis

Null-Schuss-Mono-bis-Binaural-Sprachsynthese

零热单声词合成 2412.08356v2

Authors: Alon Levkovitch, Julian Salazar, Soroosh Mariooryad, RJ Skerry-Ryan, Nadav Bar, Bastiaan Kleijn, Eliya Nachmani

We present ZeroBAS, a neural method to synthesize binaural audio from monaural audio recordings and positional information without training on any binaural data. To our knowledge, this is the first published zero-shot neural approach to mono-to-binaural audio synthesis. Specifically, we show that a parameter-free geometric time warping and amplitude scaling based on source location suffices to get an initial binaural synthesis that can be refined by iteratively applying a pretrained denoising vocoder. Furthermore, we find this leads to generalization across room conditions, which we measure by introducing a new dataset, TUT Mono-to-Binaural, to evaluate state-of-the-art monaural-to-binaural synthesis methods on unseen conditions. Our zero-shot method is perceptually on par with supervised methods on the standard mono-to-binaural dataset, and even surpasses them on our out-of-distribution TUT Mono-to-Binaural dataset. Our results highlight the potential of pretrained generative audio models and zero-shot learning to unlock robust binaural audio synthesis.
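
The parameter-free first stage described above can be pictured with a small sketch: each ear receives the mono signal delayed by the propagation time from the source and attenuated with distance. This is a simplified illustration, not the authors' implementation. It assumes a static source, whole-sample delays, a 1/distance gain, and a speed of sound of 343 m/s; the vocoder refinement is not shown, and `render_ear` is a hypothetical name.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second (assumed value)

def render_ear(mono, sample_rate, source_pos, ear_pos):
    """Delay and scale a mono signal according to the source-to-ear distance."""
    distance = math.dist(source_pos, ear_pos)
    # Propagation delay rounded to whole samples (a real system would interpolate).
    delay = round(distance / SPEED_OF_SOUND * sample_rate)
    gain = 1.0 / max(distance, 1e-3)  # inverse-distance amplitude scaling
    out = [0.0] * len(mono)
    for n in range(delay, len(mono)):
        out[n] = gain * mono[n - delay]
    return out

# Toy setup (hypothetical): the sample rate is chosen so that 1 m equals 1 sample.
mono = [1.0, 0.0, 0.0, 0.0]
left = render_ear(mono, 343, source_pos=(2.0, 0.0, 0.0), ear_pos=(0.0, 0.0, 0.0))
right = render_ear(mono, 343, source_pos=(2.0, 0.0, 0.0), ear_pos=(1.0, 0.0, 0.0))
print(left, right)
```

The ear closer to the source hears the impulse earlier and louder, which is the interaural time and level difference that the initial synthesis relies on.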


Article 582

Title@2025-05-28 (3): Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration

Title: Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration

Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration

利用语言代理框架中的双重进程理论促进实时同时人类-AI合作 2502.11882v5

Authors: Shao Zhang, Xihuai Wang, Wenhao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, Ying Wen

Agents built on large language models (LLMs) have excelled in turn-by-turn human-AI collaboration but struggle with simultaneous tasks requiring real-time interaction. Latency issues and the challenge of inferring variable human strategies hinder their ability to make autonomous decisions without explicit instructions. Through experiments with current independent System 1 and System 2 methods, we validate the necessity of using Dual Process Theory (DPT) in real-time tasks. We propose DPT-Agent, a novel language agent framework that integrates System 1 and System 2 for efficient real-time simultaneous human-AI collaboration. DPT-Agent’s System 1 uses a Finite-state Machine (FSM) and code-as-policy for fast, intuitive, and controllable decision-making. DPT-Agent’s System 2 integrates Theory of Mind (ToM) and asynchronous reflection to infer human intentions and perform reasoning-based autonomous decisions. We demonstrate the effectiveness of DPT-Agent through further experiments with rule-based agents and human collaborators, showing significant improvements over mainstream LLM-based frameworks. DPT-Agent can effectively help LLMs convert correct slow thinking and reasoning into executable actions, thereby improving performance. To the best of our knowledge, DPT-Agent is the first language agent framework that achieves successful real-time simultaneous human-AI collaboration autonomously. The code for DPT-Agent can be found at https://github.com/sjtu-marl/DPT-Agent.


Article 583

Title@2025-05-28 (3): TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup

Title: TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup

TransMLA: Migration von GQA-Modellen zu MLA mit voller DeepSeek-Kompatibilität und Speedup

TransMLA:将GQA模型迁移到具有全深搜索兼容性和加速性的司法协助模式 2502.07864v4

Authors: Fanxu Meng, Pingzhi Tang, Zengwei Yao, Xing Sun, Muhan Zhang

In this paper, we present TransMLA, a framework that seamlessly converts any GQA-based pre-trained model into an MLA-based model. Our approach enables direct compatibility with DeepSeek’s codebase, allowing these models to fully leverage DeepSeek-specific optimizations such as vLLM and SGlang. By compressing 93% of the KV cache in LLaMA-2-7B, TransMLA achieves a 10.6x inference speedup at an 8K context length while preserving meaningful output quality. Additionally, the model requires only 6 billion tokens for fine-tuning to regain performance on par with the original across multiple benchmarks. TransMLA offers a practical solution for migrating GQA-based models to the MLA structure. When combined with DeepSeek’s advanced features, such as FP8 quantization and Multi-Token Prediction, even greater inference acceleration can be realized.


Article 584

Title@2025-05-28 (3): Full Domain Analysis in Fluid Dynamics

Title: Full Domain Analysis in Fluid Dynamics

Vollständige Domänenanalyse in Fluiddynamik

流体动态全域分析 2505.22275v1

Authors: Alexander Hagg, Adam Gaier, Dominik Wilde, Alexander Asteroth, Holger Foysi, Dirk Reith

Novel techniques in evolutionary optimization, simulation and machine learning allow for a broad analysis of domains like fluid dynamics, in which computation is expensive and flow behavior is complex. By full domain analysis we mean the ability to efficiently determine the full space of solutions in a problem domain and to analyze the behavior of those solutions in an accessible and interactive manner. The goal of full domain analysis is to deepen our understanding of domains by generating many examples of flow and then diversifying, optimizing and analyzing them. We define a formal model for full domain analysis and describe its current state of the art and the requirements of its subcomponents. Finally, an example is given to show what we can learn by using full domain analysis. Full domain analysis, rooted in optimization and machine learning, can be a helpful tool in understanding complex systems in computational physics and beyond.


Article 585

Title@2025-05-28 (3): EventFlow: Forecasting Temporal Point Processes with Flow Matching

Title: EventFlow: Forecasting Temporal Point Processes with Flow Matching

EventFlow: Vorhersage von zeitlichen Punktprozessen mit Flow Matching

事件:预测与流动匹配的时点进程 2410.07430v2

Authors: Gavin Kerrigan, Kai Nelson, Padhraic Smyth

Continuous-time event sequences, in which events occur at irregular intervals, are ubiquitous across a wide range of industrial and scientific domains. The contemporary modeling paradigm is to treat such data as realizations of a temporal point process, and in machine learning it is common to model temporal point processes in an autoregressive fashion using a neural network. While autoregressive models are successful in predicting the time of a single subsequent event, their performance can degrade when forecasting longer horizons due to cascading errors and myopic predictions. We propose EventFlow, a non-autoregressive generative model for temporal point processes. The model builds on the flow matching framework in order to directly learn joint distributions over event times, side-stepping the autoregressive process. EventFlow is simple to implement and achieves a 20%-53% lower error than the nearest baseline on standard TPP benchmarks while simultaneously using fewer model calls at sampling time.


Article 586

Title@2025-05-28 (3): Reward Generalization in RLHF: A Topological Perspective

Title: Reward Generalization in RLHF: A Topological Perspective

Lohnverallgemeinerung in RLHF: Eine topologische Perspektive

RLHF的奖励普遍化:地形学观点 2402.10184v7

Authors: Tianyi Qiu, Fanzhi Zeng, Jiaming Ji, Dong Yan, Kaile Wang, Jiayi Zhou, Yang Han, Josef Dai, Xuehai Pan, Yaodong Yang

Existing alignment methods share a common topology of information flow, where reward information is collected from humans, modeled with preference learning, and used to tune language models. However, this shared topology has not been systematically characterized, nor have its alternatives been thoroughly explored, leaving the problems of low data efficiency and unreliable generalization unaddressed. As a solution, we introduce a theory of reward generalization in reinforcement learning from human feedback (RLHF), focusing on the topology of information flow at both macro and micro levels. At the macro level, we portray the RLHF information flow as an autoencoding process over behavior distributions, formalizing the RLHF objective of distributional consistency between human preference and model behavior. At the micro level, we present induced Bayesian networks to model the impact of dataset topologies on reward generalization. Combining analysis on both levels, we propose reward modeling from tree-structured preference information. It is shown to reduce reward uncertainty by up to $\Theta(\log n/\log\log n)$ times compared to baselines, where $n$ is the dataset size. Validation on three NLP tasks shows that it achieves an average win rate of 65% against baselines, thus improving reward generalization for free via topology design, while reducing the amount of data requiring annotation.


Article 587

Title@2025-05-28 (3): A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators

Title: A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators

Eine neuartige Charakterisierung des Populationsgebiets unter der Risikodeckungskurve (AURC) und Raten von Finite Sample-Schätzern

风险覆盖曲线下人口区的新特点和有限抽样估计率 2410.15361v3

Authors: Han Zhou, Jordy Van Landeghem, Teodora Popordanoska, Matthew B. Blaschko

The selective classifier (SC) has been proposed for rank-based uncertainty thresholding, which could have applications in safety-critical areas such as medical diagnostics, autonomous driving, and the justice system. The Area Under the Risk-Coverage Curve (AURC) has emerged as the foremost evaluation metric for assessing the performance of SC systems. In this work, we present a formal statistical formulation of population AURC, presenting an equivalent expression that can be interpreted as a reweighted risk function. Through Monte Carlo methods, we derive empirical AURC plug-in estimators for finite sample scenarios. The weight estimators associated with these plug-in estimators are shown to be consistent, with low bias and tightly bounded mean squared error (MSE). The plug-in estimators are proven to converge at a rate of $\mathcal{O}(\sqrt{\ln(n)/n})$, demonstrating statistical consistency. We empirically validate the effectiveness of our estimators through experiments across multiple datasets, model architectures, and confidence score functions (CSFs), demonstrating consistency and effectiveness in fine-tuning AURC performance.
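
To make the "reweighted risk" reading concrete, the sketch below computes the usual empirical AURC (the average selective risk over all coverage levels k/n) and an algebraically identical form in which each sample's loss is multiplied by a weight that depends only on its confidence rank. This is the standard finite-sample identity, shown under the assumption of no ties in confidence; it is not claimed to be the exact estimator analyzed in the paper, and the function names are hypothetical.

```python
def empirical_aurc(confidences, losses):
    """Average of selective risks over coverages 1/n, 2/n, ..., 1."""
    order = sorted(range(len(losses)), key=lambda i: -confidences[i])
    n = len(order)
    running_loss, total = 0.0, 0.0
    for k, i in enumerate(order, start=1):
        running_loss += losses[i]
        total += running_loss / k  # selective risk of the k most confident samples
    return total / n

def reweighted_aurc(confidences, losses):
    """Same quantity written as a weighted sum of per-sample losses."""
    order = sorted(range(len(losses)), key=lambda i: -confidences[i])
    n = len(order)
    weights = [0.0] * n
    tail = 0.0
    # Weight for rank r (1 = most confident) is (1/n) * sum_{k=r}^{n} 1/k.
    for rank in range(n, 0, -1):
        tail += 1.0 / rank
        weights[rank - 1] = tail / n
    return sum(w * losses[i] for w, i in zip(weights, order))

# Toy data (hypothetical): the two least confident predictions are wrong (0/1 loss).
conf = [0.9, 0.8, 0.6, 0.3]
loss = [0.0, 0.0, 1.0, 1.0]
aurc_direct = empirical_aurc(conf, loss)
aurc_weighted = reweighted_aurc(conf, loss)
print(aurc_direct, aurc_weighted)
```

In the weighted form, the most confident samples carry the largest weights, so an error made with high confidence is penalized more heavily than one made with low confidence.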


Article 588

Title@2025-05-28 (3): Improving Rule-based Reasoning in LLMs using Neurosymbolic Representations

Title: Improving Rule-based Reasoning in LLMs using Neurosymbolic Representations

Verbesserung der regelbasierten Reasoning in LLMs mit neurosymbolischen Darstellungen

改进使用新阳性表示法的LLM中基于规则的理据 2502.01657v3

Authors: Varun Dhanraj, Chris Eliasmith

Large language models (LLMs) continue to face challenges in reliably solving reasoning tasks, particularly those that require precise rule following, as often found in mathematical reasoning. This paper introduces a novel neurosymbolic method that improves LLM reasoning by encoding hidden states into neurosymbolic vectors, enabling problem-solving within a neurosymbolic vector space. The results are decoded and merged with the original hidden state, significantly boosting the model’s performance on numerical reasoning tasks. By offloading computation through neurosymbolic representations, this method enhances efficiency, reliability, and interpretability. Experimental results demonstrate an average of 88.6% lower cross-entropy loss and 15.4 times more problems correctly solved on a suite of mathematical reasoning tasks compared to chain-of-thought prompting and supervised fine-tuning (LoRA), without degrading performance on other tasks. We make our code available at: https://github.com/vdhanraj/Neurosymbolic-LLM.


Article 589

Title@2025-05-28 (3): Training on Plausible Counterfactuals Removes Spurious Correlations

Title: Training on Plausible Counterfactuals Removes Spurious Correlations

Training auf Plausible Counterfactals entfernt spurlose Korrelationen

关于可视反事实消除污损的培训 2505.16583v3

Authors: Shpresim Sadiku, Kartikeya Chitranshi, Hiroshi Kera, Sebastian Pokutta

Plausible counterfactual explanations (p-CFEs) are perturbations that minimally modify inputs to change classifier decisions while remaining plausible under the data distribution. In this study, we demonstrate that classifiers can be trained on p-CFEs labeled with induced incorrect target classes to classify unperturbed inputs with the original labels. While previous studies have shown that such learning is possible with adversarial perturbations, we extend this paradigm to p-CFEs. Interestingly, our experiments reveal that learning from p-CFEs is even more effective: the resulting classifiers not only achieve high in-distribution accuracy but also exhibit significantly reduced bias with respect to spurious correlations.

Article 590

Title@2025-05-28 (3): LiDAR Based Semantic Perception for Forklifts in Outdoor Environments

Title: LiDAR Based Semantic Perception for Forklifts in Outdoor Environments

LiDAR basierte semantische Wahrnehmung für Gabelstapler im Freien

室外环境中叉车使用基于 LiDAR 的语义感 2505.22258v1

Authors: Benjamin Serfling, Hannes Reichert, Lorenzo Bayerlein, Konrad Doll, Kati Radkhah-Lens

In this study, we present a novel LiDAR-based semantic segmentation framework tailored for autonomous forklifts operating in complex outdoor environments. Central to our approach is the integration of a dual LiDAR system, which combines forward-facing and downward-angled LiDAR sensors to enable comprehensive scene understanding, specifically tailored for industrial material handling tasks. The dual configuration improves the detection and segmentation of dynamic and static obstacles with high spatial precision. Using high-resolution 3D point clouds captured from two sensors, our method employs a lightweight yet robust approach that segments the point clouds into safety-critical instance classes such as pedestrians, vehicles, and forklifts, as well as environmental classes such as driveable ground, lanes, and buildings. Experimental validation demonstrates that our approach achieves high segmentation accuracy while satisfying strict runtime requirements, establishing its viability for safety-aware, fully autonomous forklift navigation in dynamic warehouse and yard environments.

Article 591

Title@2025-05-28 (3): Something’s Fishy In The Data Lake: A Critical Re-evaluation of Table Union Search Benchmarks

Title: Something’s Fishy In The Data Lake: A Critical Re-evaluation of Table Union Search Benchmarks

Irgendetwas ist Fishy In The Data Lake: Eine kritische Neubewertung der Tabelle Union Suche Benchmarks

“数据湖中的鱼:对表格联合搜索基准的重要重新评估” 2505.21329v2

Authors: Allaa Boutaleb, Bernd Amann, Hubert Naacke, Rafael Angarita

Recent table representation learning and data discovery methods tackle table union search (TUS) within data lakes, which involves identifying tables that can be unioned with a given query table to enrich its content. These methods are commonly evaluated using benchmarks that aim to assess semantic understanding in real-world TUS tasks. However, our analysis of prominent TUS benchmarks reveals several limitations that allow simple baselines to perform surprisingly well, often outperforming more sophisticated approaches. This suggests that current benchmark scores are heavily influenced by dataset-specific characteristics and fail to effectively isolate the gains from semantic understanding. To address this, we propose essential criteria for future benchmarks to enable a more realistic and reliable evaluation of progress in semantic table union search.

Article 592

Title@2025-05-28 (3): Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training

Title: Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training

Revisiting Group Relative Policy Optimization: Einblicke in die On-Policy- und Off-Policy-Schulung

重新审视小组相对政策优化:对政策和非政策培训的深入了解 2505.22257v1

Authors: Youssef Mroueh, Nicolas Dupuis, Brian Belgodere, Apoorva Nitsure, Mattia Rigotti, Kristjan Greenewald, Jiri Navratil, Jerret Ross, Jesus Rios

We revisit Group Relative Policy Optimization (GRPO) in both on-policy and off-policy optimization regimes. Our motivation comes from recent work on off-policy Proximal Policy Optimization (PPO), which improves training stability, sampling efficiency, and memory usage. In addition, a recent analysis of GRPO suggests that estimating the advantage function with off-policy samples could be beneficial. Building on these observations, we adapt GRPO to the off-policy setting. We show that both on-policy and off-policy GRPO objectives yield an improvement in the reward. This result motivates the use of clipped surrogate objectives in the off-policy version of GRPO. We then compare the empirical performance of reinforcement learning with verifiable rewards in post-training using both GRPO variants. Our results show that off-policy GRPO either significantly outperforms or performs on par with its on-policy counterpart.
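
The clipped surrogate objective mentioned above can be sketched for a single sample as follows. This is the standard PPO-style clipping rule, shown as a minimal illustration and not as the authors' off-policy GRPO objective; the function name and the default `clip_eps=0.2` are assumptions.

```python
def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """PPO-style clipped surrogate for one sample.

    ratio: pi_new(a|s) / pi_behaviour(a|s); in an off-policy setting the
    behaviour policy may be several updates old (assumption for illustration).
    """
    # Keep the probability ratio inside [1 - clip_eps, 1 + clip_eps].
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped * advantage)


gain_capped = clipped_surrogate(1.5, 1.0)    # large ratio, positive advantage
loss_kept = clipped_surrogate(0.5, -1.0)     # small ratio, negative advantage
unchanged = clipped_surrogate(1.0, 2.0)      # ratio inside the clip range
```

With a positive advantage the gain from pushing the ratio above `1 + clip_eps` is capped, which is what limits the size of each policy update.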

Article 593

Title@2025-05-28 (3): Train Sparse Autoencoders Efficiently by Utilizing Features Correlation

Title: Train Sparse Autoencoders Efficiently by Utilizing Features Correlation

Bahnsparse Autoencoder effizient durch die Nutzung von Funktionen Korrelation

通过使用地物关联, 高效地列列“ 分散的自动编译器” 。 2505.22255v1

Authors: Vadim Kurochkin, Yaroslav Aksenov, Daniil Laptev, Daniil Gavrilov, Nikita Balagansky

Sparse Autoencoders (SAEs) have demonstrated significant promise in interpreting the hidden states of language models by decomposing them into interpretable latent directions. However, training SAEs at scale remains challenging, especially when large dictionary sizes are used. While decoders can leverage sparse-aware kernels for efficiency, encoders still require computationally intensive linear operations with large output dimensions. To address this, we propose KronSAE, a novel architecture that factorizes the latent representation via Kronecker product decomposition, drastically reducing memory and computational overhead. Furthermore, we introduce mAND, a differentiable activation function approximating the binary AND operation, which improves interpretability and performance in our factorized framework.
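
To illustrate why a Kronecker factorization can shrink the encoder, the sketch below composes a latent of size m*n from two small codes of sizes m and n and compares weight counts. The function names and sizes are hypothetical, this is not the authors' KronSAE implementation, and the mAND activation is not reproduced here (a plain product stands in for it).

```python
def kron(u, v):
    """Kronecker product of two vectors: every pairwise product u[i] * v[j]."""
    return [ui * vj for ui in u for vj in v]


def encoder_weight_counts(d_model, m, n):
    """Weight counts (biases ignored) for a dense vs. a factorized encoder."""
    dense = d_model * (m * n)        # one linear map to all m*n latents
    factorized = d_model * (m + n)   # two small linear maps, composed by kron
    return dense, factorized


u = [1.0, 0.0, 2.0]   # hypothetical code from the first small encoder (m = 3)
v = [0.5, 0.0]        # hypothetical code from the second small encoder (n = 2)
latent = kron(u, v)   # 6 latents; entry (i, j) is nonzero only if u[i] and v[j] both are
dense, factorized = encoder_weight_counts(d_model=512, m=64, n=64)
```

In this toy setting the factorized encoder needs 32 times fewer weights than the dense one for the same number of latents.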

Article 594

Title@2025-05-28 (3): A Unified Online-Offline Framework for Co-Branding Campaign Recommendations

Title: A Unified Online-Offline Framework for Co-Branding Campaign Recommendations

Ein einheitliches Online-Offline-Rahmenwerk für Co-Branding-Kampagnenempfehlungen

联合捆绑运动建议统一在线离线框架 2505.22254v1

Authors: Xiangxiang Dai, Xiaowei Sun, Jinhang Zuo, Xutong Liu, John C. S. Lui

Co-branding has become a vital strategy for businesses aiming to expand market reach within recommendation systems. However, identifying effective cross-industry partnerships remains challenging due to resource imbalances, uncertain brand willingness, and ever-changing market conditions. In this paper, we provide the first systematic study of this problem and propose a unified online-offline framework to enable co-branding recommendations. Our approach begins by constructing a bipartite graph linking “initiating” and “target” brands to quantify co-branding probabilities and assess market benefits. During the online learning phase, we dynamically update the graph in response to market feedback, while striking a balance between exploring new collaborations for long-term gains and exploiting established partnerships for immediate benefits. To address the high initial co-branding costs, our framework mitigates redundant exploration, thereby enhancing short-term performance while ensuring sustainable strategic growth. In the offline optimization phase, our framework consolidates the interests of multiple sub-brands under the same parent brand to maximize overall returns, avoid excessive investment in single sub-brands, and reduce unnecessary costs associated with over-prioritizing a single sub-brand. We present a theoretical analysis of our approach, establishing a highly nontrivial sublinear regret bound for online learning in the complex co-branding problem, and enhancing the approximation guarantee for the NP-hard offline budget allocation optimization. Experiments on both synthetic and real-world co-branding datasets demonstrate the practical effectiveness of our framework, with at least 12% improvement.

Article 595

Title@2025-05-28 (3): B-XAIC Dataset: Benchmarking Explainable AI for Graph Neural Networks Using Chemical Data

Title: B-XAIC Dataset: Benchmarking Explainable AI for Graph Neural Networks Using Chemical Data

B-XAIC Datensatz: Benchmarking Erklärbare KI für Graph Neuronale Netzwerke unter Verwendung chemischer Daten

B-XAIC数据集:使用化学数据的图形神经网络基准可解释的AI 2505.22252v1

Authors: Magdalena Proszewska, Tomasz Danel, Dawid Rymarczyk

Understanding the reasoning behind deep learning model predictions is crucial in cheminformatics and drug discovery, where molecular design determines their properties. However, current evaluation frameworks for Explainable AI (XAI) in this domain often rely on artificial datasets or simplified tasks, employing data-derived metrics that fail to capture the complexity of real-world scenarios and lack a direct link to explanation faithfulness. To address this, we introduce B-XAIC, a novel benchmark constructed from real-world molecular data and diverse tasks with known ground-truth rationales for assigned labels. Through a comprehensive evaluation using B-XAIC, we reveal limitations of existing XAI methods for Graph Neural Networks (GNNs) in the molecular domain. This benchmark provides a valuable resource for gaining deeper insights into the faithfulness of XAI, facilitating the development of more reliable and interpretable models.

Article 596

Title@2025-05-28 (3): Evaluating Compact LLMs for Zero-Shot Iberian Language Tasks on End-User Devices

Title: Evaluating Compact LLMs for Zero-Shot Iberian Language Tasks on End-User Devices

Bewertung kompakter LLMs für blitzfreie iberische Sprachaufgaben auf Endbenutzer-Geräten

评价关于最终用户装置的零 - 低 - 低 - 高 - 伊比利亚语语言任务 2504.03312v2

Authors: Luís Couto Seller, Íñigo Sanz Torres, Adrián Vogel-Fernández, Carlos González Carballo, Pedro Miguel Sánchez Sánchez, Adrián Carruana Martín, Enrique de Miguel Ambite

Large Language Models have significantly advanced natural language processing, achieving remarkable performance in tasks such as language generation, translation, and reasoning. However, their substantial computational requirements restrict deployment to high-end systems, limiting accessibility on consumer-grade devices. This challenge is especially pronounced for under-resourced languages like those spoken in the Iberian Peninsula, where relatively limited linguistic resources and benchmarks hinder effective evaluation. This work presents a comprehensive evaluation of compact state-of-the-art LLMs across several essential NLP tasks tailored for Iberian languages. The results reveal that while some models consistently excel in certain tasks, significant performance gaps remain, particularly for languages such as Basque. These findings highlight the need for further research on balancing model compactness with robust multilingual performance.

Article 597

Title@2025-05-28 (3): UDuo: Universal Dual Optimization Framework for Online Matching

Title: UDuo: Universal Dual Optimization Framework for Online Matching

UDuo: Universal Dual Optimization Framework für Online-Matching

UDuo: 通用双优化在线匹配框架 2505.22243v1

Authors: Bin Li, Diwei Liu, Zehong Hu, Jia Jia

Online resource allocation under budget constraints critically depends on proper modeling of user arrival dynamics. Classical approaches employ stochastic user arrival models to derive near-optimal solutions through fractional matching formulations of exposed users for downstream allocation tasks. However, the stochastic-arrival assumption is no longer reasonable when the environment changes dynamically. In this work, we propose the Universal Dual optimization framework UDuo, a novel paradigm that fundamentally rethinks online allocation through three key innovations: (i) a temporal user arrival representation vector that explicitly captures distribution shifts in user arrival patterns and resource consumption dynamics, (ii) a resource pacing learner with adaptive allocation policies that generalize to heterogeneous constraint scenarios, and (iii) an online time-series forecasting approach for future user arrival distributions that achieves asymptotically optimal solutions with constraint feasibility guarantees in dynamic environments. Experimental results show that UDuo achieves higher efficiency and faster convergence than the traditional stochastic arrival model in real-world pricing while maintaining rigorous theoretical validity for general online allocation problems.

Article 598

Title@2025-05-28 (3): Reinforcement Learning with Verifiable Rewards: GRPO’s Effective Loss, Dynamics, and Success Amplification

Title: Reinforcement Learning with Verifiable Rewards: GRPO’s Effective Loss, Dynamics, and Success Amplification

Verstärktes Lernen mit überprüfbaren Belohnungen: Effektiver Verlust, Dynamik und Erfolgsverstärkung von GRPO

利用可核实的奖励加强学习:GRPO的有效损失、动态和成功扩展 2503.06639v3

Authors: Youssef Mroueh

Group Relative Policy Optimization (GRPO) was introduced recently and used successfully to train DeepSeek-R1 models for promoting reasoning capabilities of LLMs using verifiable or binary rewards. We show in this paper that GRPO with verifiable rewards can be written as a Kullback–Leibler (KL) regularized contrastive loss, where the contrastive samples are synthetic data sampled from the old policy. The optimal GRPO policy $\pi_{n}$ can be expressed explicitly in terms of the binary reward, as well as the first- and second-order statistics of the old policy $\pi_{n-1}$ and the reference policy $\pi_{\text{ref}}$. Iterating this scheme, we obtain a sequence of policies $\pi_{n}$ for which we can quantify the probability of success $p_n$. We show that the probability of success of the policy satisfies a recurrence that converges to a fixed point of a function that depends on the initial probability of success $p_{\text{ref}}$ and the regularization parameter $\beta$ of the KL regularizer. We show that the fixed point $p^*$ is guaranteed to be larger than $p_{\text{ref}}$, thereby demonstrating that GRPO effectively amplifies the probability of success of the policy.
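
As a small illustration of where the first- and second-order statistics enter, the sketch below standardizes binary rewards within one group of sampled responses, which is the usual group-relative advantage in GRPO. The function name and the stabilizer `eps` are assumptions, and the paper's closed-form policy and recurrence are not reproduced.

```python
import math


def group_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of responses to the same prompt."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    # eps (an assumed stabilizer) avoids division by zero when all rewards agree.
    return [(r - mean) / (std + eps) for r in rewards]


rewards = [1, 0, 0, 1]            # verifiable (binary) rewards for 4 samples
adv = group_advantages(rewards)   # correct answers get about +1, wrong ones about -1
```

For binary rewards the group mean is the empirical success rate p and the standard deviation is sqrt(p(1-p)), so the advantage depends only on p and on whether the sample was correct.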

Article 599

Title@2025-05-28 (3): Rethinking GNN Expressive Power from a Distributed Computational Model Perspective

Title: Rethinking GNN Expressive Power from a Distributed Computational Model Perspective

Überdenken von GNN Expressive Power aus einer distributed Computational Model Perspective

从分配的计算模型模型角度重新思考GNN 的表达力 2410.01308v3

Authors: Guanyu Cui, Yuhe Guo, Zhewei Wei, Hsin-Hao Su

The success of graph neural networks (GNNs) has motivated theoretical studies on their expressive power, often through alignments with the Weisfeiler-Lehman (WL) tests. However, such analyses typically focus on the ability of GNNs to distinguish between graph structures, rather than to compute or approximate specific function classes. The latter is more commonly studied in machine learning theory, including results such as the Turing completeness of recurrent networks and the universal approximation property of feedforward networks. We argue that using well-defined computational models, such as a modified CONGEST model with clearly specified preprocessing and postprocessing, offers a more sound framework for analyzing GNN expressiveness. Within this framework, we show that allowing unrestricted preprocessing or incorporating externally computed features, while claiming that these precomputations enhance the expressiveness, can sometimes lead to problems. We also show that the lower bound on a GNN’s capacity (depth multiplied by width) to simulate one iteration of the WL test actually grows nearly linearly with graph size, indicating that the WL test is not locally computable and is misaligned with message-passing GNNs. Despite these negative results, we also present positive results that characterize the effects of virtual nodes and edges from a computational model perspective. Finally, we highlight several open problems regarding GNN expressiveness for further exploration.
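
For readers unfamiliar with the WL test, one iteration of 1-WL color refinement can be sketched as below. The graph encoding (a dict of neighbour lists) and the integer relabeling are choices made for this illustration; the sketch does not model the CONGEST setting or the capacity bound discussed in the abstract.

```python
def wl_iteration(adj, colors):
    """One round of 1-WL color refinement.

    adj: dict mapping node -> list of neighbours.
    colors: dict mapping node -> current (sortable) color.
    """
    # A node's signature is its own color plus the sorted multiset of neighbour colors.
    signatures = {
        node: (colors[node], tuple(sorted(colors[nb] for nb in adj[node])))
        for node in adj
    }
    # Relabel distinct signatures with small integers in a deterministic order.
    palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
    return {node: palette[signatures[node]] for node in adj}


path = {0: [1], 1: [0, 2], 2: [1]}                 # path graph 0 - 1 - 2
colors = wl_iteration(path, {node: 0 for node in path})
```

After one round the two endpoints share a color and the middle node gets a different one, since it has two neighbours instead of one.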

Article 600

Title@2025-05-28 (3): NRFormer: Nationwide Nuclear Radiation Forecasting with Spatio-Temporal Transformer

Title: NRFormer: Nationwide Nuclear Radiation Forecasting with Spatio-Temporal Transformer

NRFormer: landesweite Vorhersage der nuklearen Strahlung mit Spatio-Temporal Transformer

NR 前:利用时空变压器进行全国核辐射预报 2410.11924v3

Authors: Tengfei Lyu, Jindong Han, Hao Liu

Nuclear radiation, which refers to the energy emitted from atomic nuclei during decay, poses significant risks to human health and environmental safety. Recently, advancements in monitoring technology have facilitated the effective recording of nuclear radiation levels and related factors, such as weather conditions. The abundance of monitoring data enables the development of accurate and reliable nuclear radiation forecasting models, which play a crucial role in informing decision-making for individuals and governments. However, this task is challenging due to the imbalanced distribution of monitoring stations over a wide spatial range and the non-stationary radiation variation patterns. In this study, we introduce NRFormer, a novel framework tailored for the nationwide prediction of nuclear radiation variations. By integrating a non-stationary temporal attention module, an imbalance-aware spatial attention module, and a radiation propagation prompting module, NRFormer collectively captures complex spatio-temporal dynamics of nuclear radiation. Extensive experiments on two real-world datasets demonstrate the superiority of our proposed framework against 11 baselines.

Article 601

Title@2025-05-28 (3): On Provable Length and Compositional Generalization

Title: On Provable Length and Compositional Generalization

Auf evable Länge und kompositorische Verallgemeinerung

关于可预见长度和组成式通泛化 2402.04875v6

Authors: Kartik Ahuja, Amin Mansouri

Out-of-distribution generalization capabilities of sequence-to-sequence models can be studied through the lens of two crucial forms of generalization: length generalization, the ability to generalize to longer sequences than those seen during training, and compositional generalization, the ability to generalize to token combinations not seen during training. In this work, we provide the first provable guarantees on length and compositional generalization for common sequence-to-sequence models (deep sets, transformers, state space models, and recurrent neural nets) trained to minimize the prediction error. We show that limited-capacity versions of these different architectures achieve both length and compositional generalization provided the training distribution is sufficiently diverse. In the first part, we study structured limited-capacity variants of different architectures and arrive at the generalization guarantees with limited diversity requirements on the training distribution. In the second part, we study limited-capacity variants with fewer structural assumptions and arrive at generalization guarantees but with more diversity requirements on the training distribution. Further, we also show that chain-of-thought supervision enables length generalization in higher-capacity counterparts of the different architectures we study.

Article 602

Title: Yambda-5B – A Large-Scale Multi-modal Dataset for Ranking And Retrieval

Yambda-5B – Ein multimodaler Datensatz für das Ranking und das Retrieval

Yambda-5B – – 用于排名和检索的大型多模式数据集 2505.22238v1

Authors: A. Ploshkin, V. Tytskiy, A. Pismenny, V. Baikalov, E. Taychinov, A. Permiakov, D. Burlakov, E. Krofto, N. Savushkin

We present Yambda-5B, a large-scale open dataset sourced from the Yandex.Music streaming platform. Yambda-5B contains 4.79 billion user-item interactions from 1 million users across 9.39 million tracks. The dataset includes two primary types of interactions: implicit feedback (listening events) and explicit feedback (likes, dislikes, unlikes and undislikes). In addition, we provide audio embeddings for most tracks, generated by a convolutional neural network trained on audio spectrograms. A key distinguishing feature of Yambda-5B is the inclusion of the is_organic flag, which separates organic user actions from recommendation-driven events. This distinction is critical for developing and evaluating machine learning algorithms, as Yandex.Music relies on recommender systems to personalize track selection for users. To support rigorous benchmarking, we introduce an evaluation protocol based on a Global Temporal Split, allowing recommendation algorithms to be assessed in conditions that closely mirror real-world use. We report benchmark results for standard baselines (ItemKNN, iALS) and advanced models (SANSA, SASRec) using a variety of evaluation metrics. By releasing Yambda-5B to the community, we aim to provide a readily accessible, industrial-scale resource to advance research, foster innovation, and promote reproducible results in recommender systems.
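
A global temporal split, in its simplest form, cuts all interactions at one timestamp shared by every user, so that nothing in the training set postdates any test event. The sketch below shows only that idea; the tuple layout, the function name, and the threshold are hypothetical, and the dataset's actual protocol may add details (for example a gap or a validation window) that are not shown.

```python
def global_temporal_split(interactions, test_start):
    """Split (user, item, timestamp) events at a single global timestamp."""
    train = [e for e in interactions if e[2] < test_start]
    test = [e for e in interactions if e[2] >= test_start]
    return train, test


events = [
    ("u1", "t1", 10),
    ("u2", "t2", 25),
    ("u1", "t3", 30),
    ("u3", "t1", 12),
]
train, test = global_temporal_split(events, test_start=20)
```

Unlike a per-user leave-last-out split, this keeps future events of any user out of training, which is closer to how a deployed recommender sees data.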

Article 603

Title@2025-05-28 (3): Decision-Focused Forecasting: A Differentiable Multistage Optimisation Architecture

Title: Decision-Focused Forecasting: A Differentiable Multistage Optimisation Architecture

Entscheidungsorientierte Prognose: Eine differenzierbare mehrstufige Optimierungsarchitektur

决定重点预测:可区别的多阶段优化结构 2405.14719v2

Authors: Egon Peršak, Miguel F. Anjos

Most decision-focused learning work has focused on single-stage problems, whereas many real-world decision problems are more appropriately modelled using multistage optimisation. In multistage problems, contextual information is revealed over time, decisions have to be taken sequentially, and decisions now have an intertemporal effect on future decisions. Decision-focused forecasting is a recurrent differentiable optimisation architecture that expresses a fully differentiable multistage optimisation approach. This architecture enables us to account for the intertemporal decision effects of forecasts. We show what gradient adjustments are made to account for the state-path caused by forecasting. We apply the model to multistage problems in energy storage arbitrage and portfolio optimisation and report that our model outperforms existing approaches.

Article 604

Title@2025-05-28 (3): Optimal kernel regression bounds under energy-bounded noise

Title: Optimal kernel regression bounds under energy-bounded noise

Optimale Kernel-Regressionsgrenzen unter energiegebundenem Rauschen

在受能源限制的噪音下的最佳内核回归界限 2505.22235v1

Authors: Amon Lahr, Johannes Köhler, Anna Scampicchio, Melanie N. Zeilinger

Non-conservative uncertainty bounds are key both for assessing an estimation algorithm’s accuracy and for downstream tasks, such as its deployment in safety-critical contexts. In this paper, we derive a tight, non-asymptotic uncertainty bound for kernel-based estimation, which can also handle correlated noise sequences. Its computation relies on a mild norm-boundedness assumption on the unknown function and the noise, returning the worst-case function realization within the hypothesis class at an arbitrary query input location. The value of this function is shown to be given in terms of the posterior mean and covariance of a Gaussian process for an optimal choice of the measurement noise covariance. By rigorously analyzing the proposed approach and comparing it with other results in the literature, we show its effectiveness in returning tight and easy-to-compute bounds for kernel-based estimates.

Article 605

Title@2025-05-28 (3): Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models

Title: Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models

Qualität Across-Sprachen beurteilen: Ein mehrsprachiger Ansatz zur Vorschulung von Datenfiltern mit Sprachmodellen

判断各语文的质量:采用多种语文办法,利用语言模式进行培训前数据过滤 2505.22232v1

Authors: Mehdi Ali, Manuel Brack, Max Lübbering, Elias Wendt, Abbas Goher Khan, Richard Rutmann, Alex Jude, Maurice Kraus, Alexander Arno Weber, Felix Stollenwerk, David Kaczér, Florian Mai, Lucie Flek, Rafet Sifa, Nicolas Flores-Herr, Joachim Köhler, Patrick Schramowski, Michael Fromm, Kristian Kersting

High-quality multilingual training data is essential for effectively pretraining large language models (LLMs). Yet, the availability of suitable open-source multilingual datasets remains limited. Existing state-of-the-art datasets mostly rely on heuristic filtering methods, restricting both their cross-lingual transferability and scalability. Here, we introduce JQL, a systematic approach that efficiently curates diverse and high-quality multilingual data at scale while significantly reducing computational demands. JQL distills LLMs’ annotation capabilities into lightweight annotators based on pretrained multilingual embeddings. These models exhibit robust multilingual and cross-lingual performance, even for languages and scripts unseen during training. Evaluated empirically across 35 languages, the resulting annotation pipeline substantially outperforms current heuristic filtering methods like Fineweb2. JQL notably enhances downstream model training quality and increases data retention rates. Our research provides practical insights and valuable resources for multilingual data curation, raising the standards of multilingual dataset development.

Article 606

Title@2025-05-28 (3): You Do Not Fully Utilize Transformer’s Representation Capacity

Title: You Do Not Fully Utilize Transformer’s Representation Capacity

Sie nicht voll nutzen Transformer-Repräsentanz Kapazität

您没有充分利用变换器的代表能力 2502.09245v2

Authors: Gleb Gerasimov, Yaroslav Aksenov, Nikita Balagansky, Viacheslav Sinii, Daniil Gavrilov

In contrast to RNNs, which compress their history into a single hidden state, Transformers can attend to all past tokens directly. However, standard Transformers rely solely on the hidden state from the previous layer to represent the entire context. We show that this design choice induces representation collapse and degrades performance. To address this issue, we introduce Layer-Integrated Memory (LIMe), a lightweight extension that leverages existing key-value buffers and learns per-head, per-layer routing weights to integrate representations from all previous layers with negligible overhead. Through extensive experiments (including language modeling, synthetic reasoning benchmarks, and very deep architectures), LIMe consistently achieves faster convergence, lower perplexity per FLOP, and substantial accuracy improvements on synthetic tasks while preserving higher value-vector entropy and improved token separability. Finally, our analysis of the learned routing weights reveals systematic reuse of both local and long-distance features, demonstrating how LIMe mitigates collapse, unlocks richer representations without increasing hidden-state size, and points to promising directions for future research.

Article 607

Title@2025-05-28 (3): Solver-Free Decision-Focused Learning for Linear Optimization Problems

Title: Solver-Free Decision-Focused Learning for Linear Optimization Problems

Solver-Free decision-focused Learning für lineare Optimierungsprobleme

处理线性优化问题的无解决者决定-集中学习 2505.22224v1

Authors: Senne Berden, Ali İrfan Mahmutoğulları, Dimos Tsouros, Tias Guns

Mathematical optimization is a fundamental tool for decision-making in a wide range of applications. However, in many real-world scenarios, the parameters of the optimization problem are not known a priori and must be predicted from contextual features. This gives rise to predict-then-optimize problems, where a machine learning model predicts problem parameters that are then used to make decisions via optimization. A growing body of work on decision-focused learning (DFL) addresses this setting by training models specifically to produce predictions that maximize downstream decision quality, rather than accuracy. While effective, DFL is computationally expensive, because it requires solving the optimization problem with the predicted parameters at each loss evaluation. In this work, we address this computational bottleneck for linear optimization problems, a common class of problems in both DFL literature and real-world applications. We propose a solver-free training method that exploits the geometric structure of linear optimization to enable efficient training with minimal degradation in solution quality. Our method is based on the insight that a solution is optimal if and only if it achieves an objective value that is at least as good as that of its adjacent vertices on the feasible polytope. Building on this, our method compares the estimated quality of the ground-truth optimal solution with that of its precomputed adjacent vertices, and uses this as loss function. Experiments demonstrate that our method significantly reduces computational cost while maintaining high decision quality.
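
The comparison against adjacent vertices can be sketched as a hinge-style penalty. This is an illustrative form only: the minimization convention, the hinge shape, and all names are assumptions, and the loss actually used in the paper may differ.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def adjacent_vertex_loss(c_pred, x_opt, adjacent):
    """Penalty for a minimization LP, paid whenever the predicted costs make a
    precomputed adjacent vertex look strictly better than the true optimum."""
    base = dot(c_pred, x_opt)
    return sum(max(0.0, base - dot(c_pred, x_adj)) for x_adj in adjacent)


x_opt = [1.0, 0.0]         # ground-truth optimal vertex (hypothetical toy LP)
adjacent = [[0.0, 1.0]]    # its precomputed neighbouring vertex
loss_good = adjacent_vertex_loss([1.0, 2.0], x_opt, adjacent)  # ranking preserved
loss_bad = adjacent_vertex_loss([3.0, 1.0], x_opt, adjacent)   # neighbour looks better
```

No solver call appears in the loss: it needs only inner products with vertices that were computed once before training.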

Article 608

Title@2025-05-28 (3): Taming Recommendation Bias with Causal Intervention on Evolving Personal Popularity

Title: Taming Recommendation Bias with Causal Intervention on Evolving Personal Popularity

Zähmungsempfehlung Bias mit ursächlicher Intervention zur Entwicklung persönlicher Beliebtheit

” 与个人大众演变的因果关系干预 “ 的 “ 比亚斯 “ 和 “ 个人大众演变 “ 的 “ 比亚斯 “ 建议 2505.14310v2

Authors: Shiyin Tan, Dongyuan Li, Renhe Jiang, Zhen Wang, Xingtong Yu, Manabu Okumura

Popularity bias occurs when popular items are recommended far more frequently than they should be, negatively impacting both user experience and recommendation accuracy. Existing debiasing methods often mitigate popularity bias uniformly across all users and only partially consider the time evolution of users or items. However, users have different levels of preference for item popularity, and this preference evolves over time. To address these issues, we propose a novel method called CausalEPP (Causal Intervention on Evolving Personal Popularity) for taming recommendation bias, which accounts for the evolving personal popularity of users. Specifically, we first introduce a metric called Evolving Personal Popularity to quantify each user’s preference for popular items. Then, we design a causal graph that integrates evolving personal popularity into the conformity effect, and apply deconfounded training to mitigate the popularity bias of the causal graph. During inference, we consider the evolution consistency between users and items to achieve a better recommendation. Empirical studies demonstrate that CausalEPP outperforms baseline methods in reducing popularity bias while improving recommendation accuracy.

Article 609

Title@2025-05-28 (3): Quantum framework for Reinforcement Learning: Integrating Markov decision process, quantum arithmetic, and trajectory search

Title: Quantum framework for Reinforcement Learning: Integrating Markov decision process, quantum arithmetic, and trajectory search

Quanten-Framework for Reinforcement Learning: Markov-Entscheidungsprozess, Quantenarithmetik und Flugbahnsuche integrieren

强化学习的量子框架:纳入Markov决策程序、量数算术和轨迹搜索 2412.18208v3

Authors: Thet Htar Su, Shaswot Shresthamali, Masaaki Kondo

This paper introduces a quantum framework for addressing reinforcement learning (RL) tasks, grounded in quantum principles and leveraging a fully quantum model of the classical Markov decision process (MDP). By employing quantum concepts and a quantum search algorithm, this work presents the implementation and optimization of the agent-environment interactions entirely within the quantum domain, eliminating reliance on classical computations. Key contributions include quantum-based state transitions, return calculation, and a trajectory search mechanism that utilize quantum principles to demonstrate the realization of RL processes through quantum phenomena. The implementation emphasizes the fundamental role of quantum superposition in enhancing computational efficiency for RL tasks. Results demonstrate the capacity of a quantum model to achieve quantum enhancement in RL, highlighting the potential of fully quantum implementations in decision-making tasks. This work not only underscores the applicability of quantum computing in machine learning but also contributes to the field of quantum reinforcement learning (QRL) by offering a robust framework for understanding and exploiting quantum computing in RL systems.

Article 610

Title@2025-05-28 (3): Advancing Sequential Numerical Prediction in Autoregressive Models

Title: Advancing Sequential Numerical Prediction in Autoregressive Models

Advancing Sequential Numerical Prediction in Autoregressive Modelle

自动递减模型中推进序列序号预测 2505.13077v2

Authors: Xiang Fei, Jinghui Lu, Qi Sun, Hao Feng, Yanjie Wang, Wei Shi, An-Lan Wang, Jingqun Tang, Can Huang

Autoregressive models have become the de facto choice for sequence generation tasks, but standard approaches treat digits as independent tokens and apply cross-entropy loss, overlooking the coherent structure of numerical sequences. This paper introduces Numerical Token Integrity Loss (NTIL) to address this gap. NTIL operates at two levels: (1) token-level, where it extends the Earth Mover’s Distance (EMD) to preserve ordinal relationships between numerical values, and (2) sequence-level, where it penalizes the overall discrepancy between the predicted and actual sequences. This dual approach improves numerical prediction and integrates effectively with LLMs/MLLMs. Extensive experiments show significant performance improvements with NTIL.
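
The token-level idea can be illustrated with the plain one-dimensional Earth Mover's Distance over digit tokens. This is a minimal sketch, not NTIL itself: the abstract says NTIL extends EMD, and the exact extension is not given here. The function names `digit_emd` and `cross_entropy` are hypothetical.

```python
import numpy as np

def digit_emd(pred_probs, target_digit):
    """Plain 1-D EMD between a predicted distribution over digits 0-9
    and a one-hot target (illustrative; not the paper's exact loss)."""
    pred_probs = np.asarray(pred_probs, dtype=float)
    target = np.zeros_like(pred_probs)
    target[target_digit] = 1.0
    # With unit spacing between digits, 1-D EMD is the L1 distance between CDFs.
    return float(np.abs(np.cumsum(pred_probs) - np.cumsum(target)).sum())

def cross_entropy(pred_probs, target_digit):
    """Standard cross-entropy against a one-hot target."""
    return float(-np.log(np.asarray(pred_probs, dtype=float)[target_digit]))

# Two predictions with the same mass (0.5) on the true digit 5.
near = np.zeros(10); near[[5, 6]] = 0.5   # remaining mass on the adjacent digit 6
far = np.zeros(10); far[[5, 0]] = 0.5     # remaining mass on the distant digit 0

print(cross_entropy(near, 5), cross_entropy(far, 5))  # identical values
print(digit_emd(near, 5), digit_emd(far, 5))          # EMD grows with digit distance
```

Cross-entropy cannot tell the two predictions apart because it only looks at the probability of the true token, whereas the EMD term penalizes mass placed on numerically distant digits more heavily.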

Article 611

Title@2025-05-28 (3): On the Within-class Variation Issue in Alzheimer’s Disease Detection

Title: On the Within-class Variation Issue in Alzheimer’s Disease Detection

Zur klasseninternen Variationsfrage bei der Alzheimer-Erkennung

阿尔茨海默氏氏病检测的类内变化变化问题 2409.16322v2

Authors: Jiawen Kang, Dongrui Han, Lingwei Meng, Jingyan Zhou, Jinchao Li, Xixin Wu, Helen Meng

Alzheimer’s Disease (AD) detection employs machine learning classification models to distinguish between individuals with AD and those without. Unlike in conventional classification tasks, we identify within-class variation as a critical challenge in AD detection: individuals with AD exhibit a spectrum of cognitive impairments. Therefore, simplistic binary AD classification may overlook two crucial aspects: within-class heterogeneity and instance-level imbalance. In this work, we find that a sample score estimator can generate sample-specific soft scores that align with cognitive scores. We subsequently propose two simple yet effective methods, Soft Target Distillation (SoTD) and Instance-level Re-balancing (InRe), targeting these two problems respectively. Based on the ADReSS and CU-MARVEL corpora, we demonstrate and analyze the advantages of the proposed approaches in detection performance. These findings provide insights for developing robust and reliable AD detection models.

Article 612

Title@2025-05-28 (3): Interpreting CLIP with Hierarchical Sparse Autoencoders

Title: Interpreting CLIP with Hierarchical Sparse Autoencoders

CLIP mit Hierarchical Sparse Autoencodern interpretieren

使用等级式的粗度自动解析器解释 CLIP 2502.20578v2

Authors: Vladimir Zaigrajew, Hubert Baniecki, Przemyslaw Biecek

Sparse autoencoders (SAEs) are useful for detecting and steering interpretable features in neural networks, with particular potential for understanding complex multimodal representations. Given their ability to uncover interpretable features, SAEs are particularly valuable for analyzing large-scale vision-language models (e.g., CLIP and SigLIP), which are fundamental building blocks in modern systems yet remain challenging to interpret and control. However, current SAE methods are limited by optimizing both reconstruction quality and sparsity simultaneously, as they rely on either activation suppression or rigid sparsity constraints. To this end, we introduce Matryoshka SAE (MSAE), a new architecture that learns hierarchical representations at multiple granularities simultaneously, enabling a direct optimization of both metrics without compromise. MSAE establishes a new state-of-the-art Pareto frontier between reconstruction quality and sparsity for CLIP, achieving 0.99 cosine similarity and less than 0.1 fraction of variance unexplained while maintaining ~80% sparsity. Finally, we demonstrate the utility of MSAE as a tool for interpreting and controlling CLIP by extracting over 120 semantic concepts from its representation to perform concept-based similarity search and bias analysis in downstream tasks like CelebA. We make the codebase available at https://github.com/WolodjaZ/MSAE.
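
One plausible reading of "hierarchical representations at multiple granularities" is a single encoder whose activations are sparsified at several nested TopK levels, with the reconstruction losses averaged. The sketch below assumes exactly that; the names `topk_mask` and `matryoshka_loss`, the ReLU encoder, and the plain mean over levels are illustrative assumptions, and the linked repository is the authority on the actual MSAE objective.

```python
import numpy as np

def topk_mask(z, k):
    """Keep the k largest activations in each row of z and zero the rest."""
    idx = np.argpartition(z, -k, axis=1)[:, -k:]
    out = np.zeros_like(z)
    np.put_along_axis(out, idx, np.take_along_axis(z, idx, axis=1), axis=1)
    return out

def matryoshka_loss(x, W_enc, W_dec, ks):
    """Average the reconstruction MSE over several sparsity levels ks
    (hypothetical objective for illustration only)."""
    z = np.maximum(x @ W_enc, 0.0)          # shared ReLU encoder activations
    losses = []
    for k in ks:
        x_hat = topk_mask(z, k) @ W_dec     # decode from the k strongest features
        losses.append(float(np.mean((x - x_hat) ** 2)))
    return float(np.mean(losses)), losses

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))                # stand-in for CLIP embeddings
W_enc = rng.normal(size=(16, 64)) * 0.1
W_dec = rng.normal(size=(64, 16)) * 0.1
total, per_level = matryoshka_loss(x, W_enc, W_dec, ks=[4, 16, 64])
print(total, per_level)
```

Training against all levels at once is what would let coarse (small k) and fine (large k) views share one dictionary, instead of fixing a single sparsity budget in advance.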

Article 613

Title@2025-05-28 (3): LaMM: Semi-Supervised Pre-Training of Large-Scale Materials Models

Title: LaMM: Semi-Supervised Pre-Training of Large-Scale Materials Models

LaMM: Halbüberwachte Vorausbildung von großformatigen Werkstoffmodellen

LAMM: 大型材料模型的半监督前培训 2505.22208v1

Authors: Yosuke Oyama, Yusuke Majima, Eiji Ohta, Yasufumi Sakai

Neural network potentials (NNPs) are crucial for accelerating computational materials science by surrogating density functional theory (DFT) calculations. Improving their accuracy is possible through pre-training and fine-tuning, where an NNP model is first pre-trained on a large-scale dataset and then fine-tuned on a smaller target dataset. However, this approach is computationally expensive, mainly due to the cost of DFT-based dataset labeling and load imbalances during large-scale pre-training. To address this, we propose LaMM, a semi-supervised pre-training method incorporating improved denoising self-supervised learning and a load-balancing algorithm for efficient multi-node training. We demonstrate that our approach effectively leverages a large-scale dataset of $\sim$300 million semi-labeled samples to train a single NNP model, resulting in improved fine-tuning performance in terms of both speed and accuracy.

Article 614

Title@2025-05-28 (3): Pitfalls of Rule- and Model-based Verifiers – A Case Study on Mathematical Reasoning

Title: Pitfalls of Rule- and Model-based Verifiers – A Case Study on Mathematical Reasoning

Pitfalls of Rule- and Model-based Verifiers – Eine Fallstudie zur mathematischen Begründung

规则和基于示范的验证符咒 – – 关于数学理由的个案研究 2505.22203v1

Authors: Yuzhen Huang, Weihao Zeng, Xingshan Zeng, Qi Zhu, Junxian He

Trustworthy verifiers are essential for the success of reinforcement learning with verifiable reward (RLVR), which is the core methodology behind various large reasoning models such as DeepSeek-R1. In complex domains like mathematical reasoning, rule-based verifiers have been widely adopted in previous works to train strong reasoning models. However, the reliability of these verifiers and their impact on the RL training process remain poorly understood. In this work, we take mathematical reasoning as a case study and conduct a comprehensive analysis of various verifiers in both static evaluation and RL training scenarios. First, we find that current open-source rule-based verifiers often fail to recognize equivalent answers presented in different formats across multiple commonly used mathematical datasets, resulting in non-negligible false negative rates. This limitation adversely affects RL training performance and becomes more pronounced as the policy model gets stronger. Subsequently, we investigate model-based verifiers as a potential solution to address these limitations. While the static evaluation shows that model-based verifiers achieve significantly higher verification accuracy, further analysis and RL training results imply that they are highly susceptible to hacking, where they misclassify certain patterns in responses as correct (i.e., false positives). This vulnerability is exploited during policy model optimization, leading to artificially inflated rewards. Our findings underscore the unique risks inherent to both rule-based and model-based verifiers, aiming to offer valuable insights to develop more robust reward systems in reinforcement learning.
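
The false-negative failure mode of rule-based verifiers can be shown with a toy example. This is a hypothetical sketch, not any of the open-source verifiers studied in the paper; `exact_match` and `numeric_match` are invented names.

```python
from fractions import Fraction

def exact_match(pred: str, gold: str) -> bool:
    """Naive rule: the answer strings must be identical after trimming."""
    return pred.strip() == gold.strip()

def numeric_match(pred: str, gold: str) -> bool:
    """Slightly better rule: compare as exact rationals when both parse,
    otherwise fall back to string equality."""
    try:
        return Fraction(pred.strip()) == Fraction(gold.strip())
    except (ValueError, ZeroDivisionError):
        return exact_match(pred, gold)

print(exact_match("0.5", "1/2"))              # equivalent answers, rejected
print(numeric_match("0.5", "1/2"))            # accepted after normalization
print(numeric_match("\\frac{1}{2}", "1/2"))   # LaTeX form still slips through
```

Each added normalization rule closes one gap but leaves others open, which is the kind of residual false-negative rate the abstract describes. A model-based verifier avoids hand-written rules, at the cost of the reward-hacking risk the abstract also reports.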

Article 615

Title@2025-05-28 (3): Enhancing Uncertainty Estimation and Interpretability via Bayesian Non-negative Decision Layer

Title: Enhancing Uncertainty Estimation and Interpretability via Bayesian Non-negative Decision Layer

Verbesserung der Unsicherheitsabschätzung und -interpretierbarkeit über Bayesian Non-negative Decision Layer

通过Bayesian非负决定层加强不确定性的估算和解释 2505.22199v1

Authors: Xinyue Hu, Zhibin Duan, Bo Chen, Mingyuan Zhou

Although deep neural networks have demonstrated significant success due to their powerful expressiveness, most models struggle to meet practical requirements for uncertainty estimation. Concurrently, the entangled nature of deep neural networks leads to a multifaceted problem, where various localized explanation techniques reveal that multiple unrelated features influence the decisions, thereby undermining interpretability. To address these challenges, we develop a Bayesian Non-negative Decision Layer (BNDL), which reformulates deep neural networks as a conditional Bayesian non-negative factor analysis. By leveraging stochastic latent variables, the BNDL can model complex dependencies and provide robust uncertainty estimation. Moreover, the sparsity and non-negativity of the latent variables encourage the model to learn disentangled representations and decision layers, thereby improving interpretability. We also offer theoretical guarantees that BNDL can achieve effective disentangled learning. In addition, we developed a corresponding variational inference method utilizing a Weibull variational inference network to approximate the posterior distribution of the latent variables. Our experimental results demonstrate that with enhanced disentanglement capabilities, BNDL not only improves the model’s accuracy but also provides reliable uncertainty estimation and improved interpretability.

Article 616

Title@2025-05-28 (3): An Augmentation-Aware Theory for Self-Supervised Contrastive Learning

Title: An Augmentation-Aware Theory for Self-Supervised Contrastive Learning

Eine Augmentations-Bewusst-Theorie für selbstüberwachtes kontrastives Lernen

自我监督违规学习的增强- 软件软件理论 2505.22196v1

Authors: Jingyi Cui, Hongwei Wen, Yisen Wang

Self-supervised contrastive learning has emerged as a powerful tool in machine learning and computer vision to learn meaningful representations from unlabeled data. Meanwhile, its empirical success has encouraged many theoretical studies to reveal the learning mechanisms. However, in the existing theoretical research, the role of data augmentation is still under-exploited, especially the effects of specific augmentation types. To fill in the blank, we for the first time propose an augmentation-aware error bound for self-supervised contrastive learning, showing that the supervised risk is bounded not only by the unsupervised risk, but also explicitly by a trade-off induced by data augmentation. Then, under a novel semantic label assumption, we discuss how certain augmentation methods affect the error bound. Lastly, we conduct both pixel- and representation-level experiments to verify our proposed theoretical results.

Article 617

Title@2025-05-28 (3): Physics-inspired Generative AI models via real hardware-based noisy quantum diffusion

Title: Physics-inspired Generative AI models via real hardware-based noisy quantum diffusion

Physik-inspirierte Generative KI-Modelle über reale Hardware-basierte laute Quantendiffusion

通过实实在在的硬件噪音量子扩散产生人工智能模型 2505.22193v1

Authors: Marco Parigi, Stefano Martina, Francesco Aldo Venturelli, Filippo Caruso

Quantum Diffusion Models (QDMs) are an emerging paradigm in Generative AI that aims to use quantum properties to improve the performance of their classical counterparts. However, existing algorithms are not easily scalable due to the limitations of near-term quantum devices. Following our previous work on QDMs, here we propose and implement two physics-inspired protocols. In the first, we use the formalism of quantum stochastic walks, showing that a specific interplay of quantum and classical dynamics in the forward process produces statistically more robust models generating sets of MNIST images with lower Fréchet Inception Distance (FID) than using totally classical dynamics. In the second approach, we realize an algorithm to generate images by exploiting the intrinsic noise of real IBM quantum hardware with only four qubits. Our work could be a starting point to pave the way for new scenarios for large-scale algorithms in quantum Generative AI, where quantum noise is neither mitigated nor corrected, but instead exploited as a useful resource.

Article 618

Title@2025-05-28 (3): Beyond RMSE and MAE: Introducing EAUC to unmask hidden bias and unfairness in dyadic regression models

Title: Beyond RMSE and MAE: Introducing EAUC to unmask hidden bias and unfairness in dyadic regression models

Jenseits von RMSE und MAE: Einführung des EAUC zur Enttarnung versteckter Bias und Ungerechtigkeit in dyadischen Regressionsmodellen

超越RMSE和MAE:引入EAUC以揭示dyadic回归模型中隐蔽的偏见和不公平现象 2401.10690v5

Authors: Jorge Paz-Ruza, Amparo Alonso-Betanzos, Bertha Guijarro-Berdiñas, Brais Cancela, Carlos Eiras-Franco

Dyadic regression models, which output real-valued predictions for pairs of entities, are fundamental in many domains (e.g. obtaining user-product ratings in Recommender Systems) and promising and under exploration in others (e.g. tuning patient-drug dosages in precision pharmacology). In this work, we prove that non-uniform observed value distributions of individual entities lead to severe biases in state-of-the-art models, skewing predictions towards the average of observed past values for the entity and providing worse-than-random predictive power in eccentric yet crucial cases; we name this phenomenon eccentricity bias. We show that global error metrics like Root Mean Squared Error (RMSE) are insufficient to capture this bias, and we introduce Eccentricity-Area Under the Curve (EAUC) as a novel metric that can quantify it in all studied domains and models. We prove the intuitive interpretation of EAUC by experimenting with naive post-training bias corrections, and theorize other options to use EAUC to guide the construction of fair models. This work contributes a bias-aware evaluation of dyadic regression to prevent unfairness in critical real-world applications of such systems.

Article 619

Title@2025-05-28 (3): LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently

Title: LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently

LoRA-One: Ein-Schritt-Full Gradient könnte genug für feines Tuning von großen Sprachmodellen sein, wahrscheinlich und effizient

LoRA-One: 精巧、高效、可预见和高效的微调大语言模型的单步全步可满足需要 2502.01235v2

Authors: Yuanhe Zhang, Fanghui Liu, Yudong Chen

This paper explores how theory can guide and enhance practical algorithms, using Low-Rank Adaptation (LoRA, Hu et al. 2022) in large language models as a case study. We rigorously prove that, under gradient descent, LoRA adapters align with specific singular subspaces of the one-step full fine-tuning gradient. This result suggests that properly initializing the adapters with the one-step full gradient achieves subspace alignment immediately, for both linear and nonlinear models. Building on this theory, we propose a theory-driven algorithm, LoRA-One, for which we establish linear convergence (as well as generalization) and show that incorporating preconditioners helps mitigate the effects of ill-conditioning. In addition, our theory reveals connections between LoRA-One and other gradient-alignment-based methods, helping to clarify misconceptions in the design of such algorithms. LoRA-One achieves significant empirical improvements over LoRA and its variants across benchmarks in natural language understanding, mathematical reasoning, and code generation. Code is available at: https://github.com/YuanheZ/LoRA-One.

Article 620

Title@2025-05-28 (3): LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits

Title: LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits

LC-Tsallis-INF: Generalisierte Best-of-Both-Worlds Lineare Kontextbanditen

LC-Tsallis-INF: 普遍化的两世界最佳线性线性直线性范围内的强盗 2403.03219v3

Authors: Masahiro Kato, Shinji Ito

We investigate the linear contextual bandit problem with independent and identically distributed (i.i.d.) contexts. In this problem, we aim to develop a Best-of-Both-Worlds (BoBW) algorithm with regret upper bounds in both stochastic and adversarial regimes. We develop an algorithm based on Follow-The-Regularized-Leader (FTRL) with Tsallis entropy, referred to as the $\alpha$-Linear-Contextual (LC)-Tsallis-INF. We show that its regret is at most $O(\log(T))$ in the stochastic regime under the assumption that the suboptimality gap is uniformly bounded from below, and at most $O(\sqrt{T})$ in the adversarial regime. Furthermore, our regret analysis is extended to more general regimes characterized by the margin condition with a parameter $\beta \in (1, \infty]$, which imposes a milder assumption on the suboptimality gap. We show that the proposed algorithm achieves $O\left(\log(T)^{\frac{1+\beta}{2+\beta}}T^{\frac{1}{2+\beta}}\right)$ regret under the margin condition.
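
As a quick sanity check on the margin-condition bound, its two ends can be worked out directly from the stated exponents. Reading $\beta = \infty$ as the uniformly-bounded-gap case is an interpretation, not something the abstract states explicitly.

```latex
\[
\begin{aligned}
\beta \to \infty:&\quad
\frac{1+\beta}{2+\beta} \to 1,\qquad \frac{1}{2+\beta} \to 0
\;\;\Longrightarrow\;\;
O\!\left(\log(T)^{\frac{1+\beta}{2+\beta}}\,T^{\frac{1}{2+\beta}}\right)
\to O(\log(T)),\\[4pt]
\beta \downarrow 1:&\quad
\frac{1+\beta}{2+\beta} \to \frac{2}{3},\qquad \frac{1}{2+\beta} \to \frac{1}{3}
\;\;\Longrightarrow\;\;
O\!\left(\log(T)^{\frac{2}{3}}\,T^{\frac{1}{3}}\right).
\end{aligned}
\]
```

So the bound interpolates between the logarithmic rate quoted for the stochastic regime and a $T^{1/3}$-type rate (up to logarithmic factors) as the margin assumption weakens, staying below the $O(\sqrt{T})$ adversarial rate for large $T$.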

Article 621

Title@2025-05-28 (3): Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes

Title: Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes

Kontinuierliche und diskrete Diffusion mit nicht gleichzeitigen Diffusionsprozessen

与非平行扩散进程一起进行连续和分解的不连续和分解文本传播 2505.22165v1

Authors: Bocheng Li, Zhujin Gao, Linli Xu

Diffusion models have emerged as a promising approach for text generation, with recent works falling into two main categories: discrete and continuous diffusion models. Discrete diffusion models apply token corruption independently using categorical distributions, allowing for different diffusion progress across tokens but lacking fine-grained control. Continuous diffusion models map tokens to continuous spaces and apply fine-grained noise, but the diffusion progress is uniform across tokens, limiting their ability to capture semantic nuances. To address these limitations, we propose Non-simultaneous Continuous Diffusion Models (NeoDiff), a novel diffusion model that integrates the strengths of both discrete and continuous approaches. NeoDiff introduces a Poisson diffusion process for the forward process, enabling a flexible and fine-grained noising paradigm, and employs a time predictor for the reverse process to adaptively modulate the denoising progress based on token semantics. Furthermore, NeoDiff utilizes an optimized schedule for inference to ensure more precise noise control and improved performance. Our approach unifies the theories of discrete and continuous diffusion models, offering a more principled and effective framework for text generation. Experimental results on several text generation tasks demonstrate NeoDiff’s superior performance compared to baselines of non-autoregressive continuous and discrete diffusion models, iterative-based methods and autoregressive diffusion-based methods. These results highlight NeoDiff’s potential as a powerful tool for generating high-quality text and advancing the field of diffusion-based text generation.

Article 622

Title@2025-05-28 (3): AgriFM: A Multi-source Temporal Remote Sensing Foundation Model for Crop Mapping

Title: AgriFM: A Multi-source Temporal Remote Sensing Foundation Model for Crop Mapping

AgriFM: Multi-Source-Modell für die zeitliche Fernerkundung

AgriFM:多种来源的时空遥感基金会作物绘图模型 2505.21357v2

Authors: Wenyuan Li, Shunlin Liang, Keyan Chen, Yongzhe Chen, Han Ma, Jianglei Xu, Yichuan Ma, Shikang Guan, Husheng Fang, Zhenwei Shi

Accurate crop mapping fundamentally relies on modeling multi-scale spatiotemporal patterns, where spatial scales range from individual field textures to landscape-level context, and temporal scales capture both short-term phenological transitions and full growing-season dynamics. Transformer-based remote sensing foundation models (RSFMs) offer promising potential for crop mapping due to their innate ability for unified spatiotemporal processing. However, current RSFMs remain suboptimal for crop mapping: they either employ fixed spatiotemporal windows that ignore the multi-scale nature of crop systems or completely disregard temporal information by focusing solely on spatial patterns. To bridge these gaps, we present AgriFM, a multi-source remote sensing foundation model specifically designed for agricultural crop mapping. Our approach begins by establishing the necessity of simultaneous hierarchical spatiotemporal feature extraction, leading to the development of a modified Video Swin Transformer architecture where temporal down-sampling is synchronized with spatial scaling operations. This modified backbone enables efficient unified processing of long time-series satellite inputs. AgriFM leverages temporally rich data streams from three satellite sources including MODIS, Landsat-8/9 and Sentinel-2, and is pre-trained on a global representative dataset comprising over 25 million image samples supervised by land cover products. The resulting framework incorporates a versatile decoder architecture that dynamically fuses these learned spatiotemporal representations, supporting diverse downstream tasks. Comprehensive evaluations demonstrate AgriFM’s superior performance over conventional deep learning approaches and state-of-the-art general-purpose RSFMs across all downstream tasks. Codes will be available at https://github.com/flyakon/AgriFM.

Article 623

Title@2025-05-28 (3): The informativeness of the gradient revisited

Title: The informativeness of the gradient revisited

Die Aufschlusskraft des Gradienten wurde überarbeitet

重新讨论的梯度信息性 2505.22158v1

Authors: Rustem Takhanov

In the past decade gradient-based deep learning has revolutionized several applications. However, this rapid advancement has highlighted the need for a deeper theoretical understanding of its limitations. Research has shown that, in many practical learning tasks, the information contained in the gradient is so minimal that gradient-based methods require an exceedingly large number of iterations to achieve success. The informativeness of the gradient is typically measured by its variance with respect to the random selection of a target function from a hypothesis class. We use this framework and give a general bound on the variance in terms of a parameter related to the pairwise independence of the target function class and the collision entropy of the input distribution. Our bound scales as $ \tilde{\mathcal{O}}(\varepsilon+e^{-\frac{1}{2}\mathcal{E}_c}) $, where $ \tilde{\mathcal{O}} $ hides factors related to the regularity of the learning model and the loss function, $ \varepsilon $ measures the pairwise independence of the target function class and $\mathcal{E}_c$ is the collision entropy of the input distribution. To demonstrate the practical utility of our bound, we apply it to the class of Learning with Errors (LWE) mappings and high-frequency functions. In addition to the theoretical analysis, we present experiments to understand better the nature of recent deep learning-based attacks on LWE.
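
To get a feel for the $e^{-\frac{1}{2}\mathcal{E}_c}$ term, the collision entropy of a discrete input distribution can be computed directly. The sketch below assumes the standard definition $\mathcal{E}_c = -\log \sum_x p(x)^2$ with the natural logarithm (the abstract does not state the base), and the function names are hypothetical.

```python
import numpy as np

def collision_entropy(p):
    """Collision (Renyi order-2) entropy of a discrete distribution p,
    assuming the natural logarithm."""
    p = np.asarray(p, dtype=float)
    return float(-np.log(np.sum(p ** 2)))

def entropy_term(p):
    """The e^{-E_c / 2} factor appearing in the variance bound."""
    return float(np.exp(-0.5 * collision_entropy(p)))

uniform = np.full(1024, 1.0 / 1024)    # high-entropy inputs
point_mass = np.eye(1024)[0]           # a single deterministic input

print(entropy_term(uniform))     # equals 1/sqrt(1024) for the uniform case
print(entropy_term(point_mass))  # equals 1 when there is no input randomness
```

For a uniform distribution over $n$ inputs the term is $n^{-1/2}$, so under this definition the entropy part of the bound shrinks quickly as the input space grows, leaving $\varepsilon$ as the dominant contribution.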

Article 624

Title@2025-05-28 (3): Towards Practical Defect-Focused Automated Code Review

Title: Towards Practical Defect-Focused Automated Code Review

Auf dem Weg zu einer praktischen fehlerorientierten automatisierten Code-Überprüfung

走向实际失效-受污染的自动编码审查 2505.17928v2

Authors: Junyi Lu, Lili Jiang, Xiaojia Li, Jianbing Fang, Fengjun Zhang, Li Yang, Chun Zuo

The complexity of code reviews has driven efforts to automate review comments, but prior approaches oversimplify this task by treating it as snippet-level code-to-text generation and relying on text similarity metrics like BLEU for evaluation. These methods overlook repository context, real-world merge request evaluation, and defect detection, limiting their practicality. To address these issues, we explore the full automation pipeline within the online recommendation service of a company with nearly 400 million daily active users, analyzing industry-grade C++ codebases comprising hundreds of thousands of lines of code. We identify four key challenges: 1) capturing relevant context, 2) improving key bug inclusion (KBI), 3) reducing false alarm rates (FAR), and 4) integrating human workflows. To tackle these, we propose 1) code slicing algorithms for context extraction, 2) a multi-role LLM framework for KBI, 3) a filtering mechanism for FAR reduction, and 4) a novel prompt design for better human interaction. Our approach, validated on real-world merge requests from historical fault reports, achieves a 2x improvement over standard LLMs and a 10x gain over previous baselines. While the presented results focus on C++, the underlying framework design leverages language-agnostic principles (e.g., AST-based analysis), suggesting potential for broader applicability.

Article 625

Title@2025-05-28 (3): Uncertainty Estimation for Heterophilic Graphs Through the Lens of Information Theory

Title: Uncertainty Estimation for Heterophilic Graphs Through the Lens of Information Theory

Ungewissheitsschätzung für heterophile Graphen durch die Linse der Informationstheorie

信息镜头信息理论流流中异血哲学图谱的不确定性估计 2505.22152v1

Authors: Dominik Fuchsgruber, Tom Wollschläger, Johannes Bordne, Stephan Günnemann

While uncertainty estimation for graphs recently gained traction, most methods rely on homophily and deteriorate in heterophilic settings. We address this by analyzing message passing neural networks (MPNNs) from an information-theoretic perspective and developing a suitable analog of the data processing inequality to quantify information throughout the model’s layers. In contrast to non-graph domains, information about the node-level prediction target can increase with model depth if a node’s features are semantically different from its neighbors. Therefore, on heterophilic graphs, the latent embeddings of an MPNN each provide different information about the data distribution, unlike in homophilic settings. This reveals that considering all node representations simultaneously is a key design principle for epistemic uncertainty estimation on graphs beyond homophily. We empirically confirm this with a simple post-hoc density estimator on the joint node embedding space that provides state-of-the-art uncertainty on heterophilic graphs. At the same time, it matches prior work on homophilic graphs without explicitly exploiting homophily through post-processing.

Article 626

Title@2025-05-28 (3): Oryx: a Performant and Scalable Algorithm for Many-Agent Coordination in Offline MARL

Title: Oryx: a Performant and Scalable Algorithm for Many-Agent Coordination in Offline MARL

Oryx: ein performanter und skalierbarer Algorithmus für viele-Agenten-Koordination in Offline MARL

Oryx: MARL 离线下许多机构协调的性能和可缩放的数值 2505.22151v1

Authors: Claude Formanek, Omayma Mahjoub, Louay Ben Nessir, Sasha Abramowitz, Ruan de Kock, Wiem Khlifi, Simon Du Toit, Felix Chalumeau, Daniel Rajaonarivonivelomanantsoa, Arnol Fokam, Siddarth Singh, Ulrich Mbou Sob, Arnu Pretorius

A key challenge in offline multi-agent reinforcement learning (MARL) is achieving effective many-agent multi-step coordination in complex environments. In this work, we propose Oryx, a novel algorithm for offline cooperative MARL to directly address this challenge. Oryx adapts the recently proposed retention-based architecture Sable and combines it with a sequential form of implicit constraint Q-learning (ICQ), to develop a novel offline auto-regressive policy update scheme. This allows Oryx to solve complex coordination challenges while maintaining temporal coherence over lengthy trajectories. We evaluate Oryx across a diverse set of benchmarks from prior works (SMAC, RWARE, and Multi-Agent MuJoCo) covering tasks of both discrete and continuous control, varying in scale and difficulty. Oryx achieves state-of-the-art performance on more than 80% of the 65 tested datasets, outperforming prior offline MARL methods and demonstrating robust generalisation across domains with many agents and long horizons. Finally, we introduce new datasets to push the limits of many-agent coordination in offline MARL, and demonstrate Oryx’s superior ability to scale effectively in such settings. We will make all of our datasets, experimental data, and code available upon publication.

Article 627

Title@2025-05-28 (3): Gradient Boosting Reinforcement Learning

Title: Gradient Boosting Reinforcement Learning

Gradientenfördernde Stärkung des Lernens

逐步推进强化学习 2407.08250v2

Authors: Benjamin Fuhrer, Chen Tessler, Gal Dalal

We present Gradient Boosting Reinforcement Learning (GBRL), a framework that adapts the strengths of gradient boosting trees (GBT) to reinforcement learning (RL) tasks. While neural networks (NNs) have become the de facto choice for RL, they face significant challenges with structured and categorical features and tend to generalize poorly to out-of-distribution samples. These are challenges for which GBTs have traditionally excelled in supervised learning. However, GBT’s application in RL has been limited. The design of traditional GBT libraries is optimized for static datasets with fixed labels, making them incompatible with RL’s dynamic nature, where both state distributions and reward signals evolve during training. GBRL overcomes this limitation by continuously interleaving tree construction with environment interaction. Through extensive experiments, we demonstrate that GBRL outperforms NNs in domains with structured observations and categorical features while maintaining competitive performance on standard continuous control benchmarks. Like its supervised learning counterpart, GBRL demonstrates superior robustness to out-of-distribution samples and better handles irregular state-action relationships.

Article 628

Title@2025-05-28 (3): Bridging Arbitrary and Tree Metrics via Differentiable Gromov Hyperbolicity

Title: Bridging Arbitrary and Tree Metrics via Differentiable Gromov Hyperbolicity

Überbrückung von Willkür- und Baummetrics durch differenzierbare Gromov-Hyperbolizität

通过差别化格罗莫夫双向主义 2505.21073v2

Authors: Pierre Houedry, Nicolas Courty, Florestan Martin-Baillon, Laetitia Chapel, Titouan Vayer

Trees and the associated shortest-path tree metrics provide a powerful framework for representing hierarchical and combinatorial structures in data. Given an arbitrary metric space, its deviation from a tree metric can be quantified by Gromov’s $\delta$-hyperbolicity. Nonetheless, designing algorithms that bridge an arbitrary metric to its closest tree metric is still an active subject of interest, as most common approaches are either heuristic and lack guarantees, or perform only moderately well. In this work, we introduce a novel differentiable optimization framework, coined DeltaZero, that solves this problem. Our method leverages a smooth surrogate for Gromov’s $\delta$-hyperbolicity which enables gradient-based optimization with tractable complexity. The corresponding optimization procedure is derived from a problem with better worst-case guarantees than existing bounds, and is justified statistically. Experiments on synthetic and real-world datasets demonstrate that our method consistently achieves state-of-the-art distortion.
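
For reference, the exact (non-smooth) $\delta$-hyperbolicity of a small finite metric can be computed by brute force from Gromov products. This is a minimal sketch of the standard definition, not DeltaZero's surrogate; the function name `gromov_delta` is hypothetical and the cost is roughly $O(n^4)$.

```python
import numpy as np

def gromov_delta(D):
    """Exact Gromov delta of a finite metric given its distance matrix D:
    the max over w, x, y, z of min((x|z)_w, (z|y)_w) - (x|y)_w."""
    D = np.asarray(D, dtype=float)
    delta = 0.0
    for w in range(D.shape[0]):
        # Gromov products (x|y)_w = (d(x,w) + d(y,w) - d(x,y)) / 2
        G = 0.5 * (D[:, [w]] + D[[w], :] - D)
        # maxmin[x, y] = max over z of min(G[x, z], G[z, y])
        maxmin = np.max(np.minimum(G[:, :, None], G[None, :, :]), axis=1)
        delta = max(delta, float((maxmin - G).max()))
    return delta

# Four points on a line: a tree metric, so delta is 0.
idx = np.arange(4)
path = np.abs(idx[:, None] - idx[None, :])

# Shortest-path metric of a 4-cycle with unit edges: not a tree metric.
cycle = np.array([[0, 1, 2, 1],
                  [1, 0, 1, 2],
                  [2, 1, 0, 1],
                  [1, 2, 1, 0]])

print(gromov_delta(path), gromov_delta(cycle))
```

The hard `max` and `min` here are what make the quantity non-differentiable. A common way to smooth such operators is log-sum-exp, though whether DeltaZero uses that particular surrogate is not stated in the abstract.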

Article 629

Title@2025-05-28 (3): Limited Generalizability in Argument Mining: State-Of-The-Art Models Learn Datasets, Not Arguments

Title: Limited Generalizability in Argument Mining: State-Of-The-Art Models Learn Datasets, Not Arguments

Begrenzte Verallgemeinerbarkeit im Argumentbergbau: State-of-The-Art-Modelle lernen Datensätze, keine Argumente

《争议采矿业的限制性通用性:国家与艺术中的模式学习数据集,非论据》 2505.22137v1

Authors: Marc Feger, Katarina Boland, Stefan Dietze

Identifying arguments is a necessary prerequisite for various tasks in automated discourse analysis, particularly within contexts such as political debates, online discussions, and scientific reasoning. In addition to theoretical advances in understanding the constitution of arguments, a significant body of research has emerged around practical argument mining, supported by a growing number of publicly available datasets. On these benchmarks, BERT-like transformers have consistently performed best, reinforcing the belief that such models are broadly applicable across diverse contexts of debate. This study offers the first large-scale re-evaluation of such state-of-the-art models, with a specific focus on their ability to generalize in identifying arguments. We evaluate four transformers, three standard and one enhanced with contrastive pre-training for better generalization, on 17 English sentence-level datasets as most relevant to the task. Our findings show that, to varying degrees, these models tend to rely on lexical shortcuts tied to content words, suggesting that apparent progress may often be driven by dataset-specific cues rather than true task alignment. While the models achieve strong results on familiar benchmarks, their performance drops markedly when applied to unseen datasets. Nonetheless, incorporating both task-specific pre-training and joint benchmark training proves effective in enhancing both robustness and generalization.

Article 630

Title@2025-05-28 (3): RAD: Redundancy-Aware Distillation for Hybrid Models via Self-Speculative Decoding

Title: RAD: Redundancy-Aware Distillation for Hybrid Models via Self-Speculative Decoding

RAD: Redundanz-Bewusst-Destillation für Hybridmodelle über selbstspekulative Decodierung

RAD: 通过自投机代号为混合模型进行再利用-软件蒸馏 2505.22135v1

Authors: Yuichiro Hoshino, Hideyuki Tachibana, Muneyoshi Inahara, Hiroto Takegawa

Hybrid models combining Transformers and State Space Models (SSMs) are promising for balancing performance and efficiency. However, optimizing these hybrid models, particularly by addressing the potential redundancy inherent within the Transformer components, remains a significant challenge. In this paper, we propose RAD (Redundancy-Aware Distillation), a novel framework that uses self-speculative decoding as a diagnostic tool to identify redundant attention layers within the model. These identified layers are then selectively replaced with SSM components, followed by targeted (self-)distillation. Specifically, RAD focuses knowledge transfer on the components identified as redundant, considering architectural changes and specific weight initialization strategies. We experimentally demonstrate that self-distillation using RAD significantly surpasses the performance of the original base model on mathematical and coding tasks. Furthermore, RAD is also effective in standard knowledge distillation settings, achieving up to approximately 2x faster convergence compared to baseline methods. Notably, while a baseline model distilled from a Llama-3.1 70B teacher achieves scores of 46.17 on GSM8K and 22.75 on CRUX, RAD achieves significantly higher scores of 71.27 on GSM8K and 28.25 on CRUX, even when using a much smaller Llama-3.1 8B teacher. RAD offers a new pathway for efficient optimization and performance enhancement in the distillation of hybrid models.

Article 631

Title@2025-05-28 (3): JEDI: Latent End-to-end Diffusion Mitigates Agent-Human Performance Asymmetry in Model-Based Reinforcement Learning

Title: JEDI: Latent End-to-end Diffusion Mitigates Agent-Human Performance Asymmetry in Model-Based Reinforcement Learning

JEDI: Latent End-to-End-Diffusion mildert die Asymmetrie von Agent-Human Performance im modellbasierten Verstärkungslernen

JEDI: 以模型为基础的加强学习中前端至终端扩散消化剂-人类性能对称性 2505.19698v2

Authors: Jing Yu Lim, Zarif Ikram, Samson Yu, Haozhe Ma, Tze-Yun Leong, Dianbo Liu

Recent advances in model-based reinforcement learning (MBRL) have achieved super-human level performance on the Atari100k benchmark, driven by reinforcement learning agents trained on powerful diffusion world models. However, we find that current aggregate metrics mask a major performance asymmetry: MBRL agents dramatically outperform humans in some tasks while drastically underperforming in others, with the former inflating the aggregate metrics. This is especially pronounced in pixel-based agents trained with diffusion world models. In this work, we address the pronounced asymmetry observed in pixel-based agents as an initial attempt to reverse the worrying upward trend observed in them. We address the problematic aggregates by delineating all tasks as Agent-Optimal or Human-Optimal and advocate for equal importance on metrics from both sets. Next, we hypothesize that this pronounced asymmetry is due to the lack of a temporally-structured latent space trained with the World Model objective in pixel-based methods. Lastly, to address this issue, we propose Joint Embedding DIffusion (JEDI), a novel latent diffusion world model trained end-to-end with the self-consistency objective. JEDI outperforms SOTA models in human-optimal tasks while staying competitive across the Atari100k benchmark, and runs 3 times faster with 43% lower memory than the latest pixel-based diffusion baseline. Overall, our work rethinks what it truly means to cross human-level performance in Atari100k.
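
The masking effect described above can be shown with a few lines of arithmetic. The sketch below uses the standard human-normalized score and splits tasks at a score of 1.0; that threshold, the task names, and all raw scores are made up for illustration and are not taken from the paper.

```python
def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Standard human-normalized score: 0 = random play, 1 = human level."""
    return (agent - random) / (human - random)


def split_by_optimality(raw_scores):
    """Split tasks into agent-optimal (score >= 1) and human-optimal (score < 1).

    The 1.0 threshold is an assumption made for this sketch.
    """
    agent_optimal, human_optimal = {}, {}
    for task, (agent, random, human) in raw_scores.items():
        score = human_normalized_score(agent, random, human)
        target = agent_optimal if score >= 1.0 else human_optimal
        target[task] = score
    return agent_optimal, human_optimal


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical (agent, random, human) raw scores for two made-up tasks.
raw_scores = {
    "task_a": (300.0, 0.0, 100.0),  # agent far above human
    "task_b": (20.0, 0.0, 100.0),   # agent far below human
}

agent_optimal, human_optimal = split_by_optimality(raw_scores)
overall = mean({**agent_optimal, **human_optimal}.values())

print("overall mean:", overall)
print("agent-optimal mean:", mean(agent_optimal.values()))
print("human-optimal mean:", mean(human_optimal.values()))
```

In this toy case the overall mean sits above human level even though one of the two tasks is far below it, which is why reporting the two sets separately is informative.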

Article 632

Title@2025-05-28 (3): Optimize Cardinality Estimation Model Pretraining by Simplifying the Training Datasets

Title: Optimize Cardinality Estimation Model Pretraining by Simplifying the Training Datasets

Kardinalitätsabschätzungsmodell optimieren Vorschulung durch Vereinfachung der Trainingsdatensätze

通过简化培训数据集,优化红红心估计模型预培训模式 2502.14350v2

Authors: Boyang Fang

Cardinality estimation is a key aspect of query optimization research, and its performance has improved significantly with the integration of machine learning. To overcome the “cold start” problem, or the lack of model transferability in learned cardinality estimators, several pre-trained cardinality estimation models have been proposed that learn across multiple datasets and their corresponding workloads. These models typically train on a dataset created by uniformly sampling from many datasets, but this approach may not be optimal. By applying the Group Distributionally Robust Optimization (Group DRO) algorithm to training datasets, we find that some specific training datasets contribute more significantly to model performance than others. Based on this observation, we conduct extensive experiments to delve deeper into pre-training cardinality estimators. Our results show how the performance of these models can be influenced by the datasets and corresponding workloads. Finally, we introduce a simplified training dataset, which has been reduced to a fraction of the size of existing pretraining datasets. Extensive experiments demonstrate that the pre-trained cardinality estimator based on this simplified dataset can still achieve performance comparable to existing models in zero-shot setups.
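
For readers unfamiliar with Group DRO, the sketch below shows the group-weight update commonly associated with it: an exponentiated-gradient step that shifts weight toward groups (here, training datasets) with higher loss. The per-dataset losses and the step size are invented for the example, and the abstract does not say how the authors configured the algorithm.

```python
import numpy as np


def group_dro_step(
    weights: np.ndarray, group_losses: np.ndarray, step_size: float
) -> np.ndarray:
    """One exponentiated-gradient update of the group weights."""
    unnormalized = weights * np.exp(step_size * group_losses)
    return unnormalized / unnormalized.sum()


# Three hypothetical training datasets with made-up, fixed average losses.
losses = np.array([0.2, 1.5, 0.4])
weights = np.full(3, 1.0 / 3.0)  # start from uniform sampling

for _ in range(10):
    weights = group_dro_step(weights, losses, step_size=0.5)

# The model would be trained on this weighted loss instead of the plain mean.
robust_loss = float(weights @ losses)
print(weights, robust_loss)
```

In real training the losses change after every model update; they are held fixed here only to make the drift of the weights toward the hardest dataset easy to see.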

Article 633

Title@2025-05-28 (3): Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL

Title: Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL

Neuvisualisierung von Schwach-zu-Strong-Verallgemeinerung in Theorie und Praxis: Reverse KL vs. Forward KL

重新审视理论和实践中弱到强的简单化:反向 KL vs. fward KL 2502.11107v3

Authors: Wei Yao, Wenkai Yang, Ziqiao Wang, Yankai Lin, Yong Liu

As large language models advance toward superhuman performance, ensuring their alignment with human values and abilities grows increasingly complex. Weak-to-strong generalization offers a promising approach by leveraging predictions from weaker models to guide stronger systems, but its effectiveness could be constrained by the inherent noise and inaccuracies in these weak predictions. To address this, we propose a theoretically grounded approach that replaces forward KL divergence (whose mass-covering behavior risks overfitting to imperfect weak signals) with reverse KL divergence. Reverse KL divergence’s zero-forcing effect prioritizes high-confidence predictions, effectively mitigating the influence of unreliable weak supervision. Theoretically, we extend existing bounds and derive tighter lower bounds for both forward and reverse KL divergence, establishing that reverse KL achieves at least comparable guarantees to forward KL. Notably, when a sufficiently pre-trained strong model is fine-tuned on the last linear layer, reverse KL guarantees that it outperforms its weak supervisor by the magnitude of their disagreement. Empirically, we demonstrate that reverse KL and reverse cross-entropy enable strong models to outperform those trained with forward KL and standard cross-entropy across most settings, highlighting the practical advantages of these reverse losses.
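
The difference between the two divergences can be checked numerically. The sketch below assumes the usual convention that forward KL puts the supervisor's distribution first, KL(weak || strong), and reverse KL puts the student's first, KL(strong || weak); the two probability vectors are made up to represent an uncertain weak label and a confident strong model.

```python
import numpy as np


def kl_divergence(p, q, eps: float = 1e-12) -> float:
    """KL(p || q) for categorical distributions, in nats."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))


weak_label = np.array([0.5, 0.5])      # hypothetical noisy, uncertain supervisor
strong_pred = np.array([0.99, 0.01])   # hypothetical confident student

forward_kl = kl_divergence(weak_label, strong_pred)  # mass-covering direction
reverse_kl = kl_divergence(strong_pred, weak_label)  # zero-forcing direction

print("forward KL:", forward_kl)
print("reverse KL:", reverse_kl)
```

For this pair the forward direction penalizes the confident student much more heavily than the reverse direction does, which matches the intuition in the abstract that forward KL pushes the student to cover the weak supervisor's uncertainty.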

Article 634

Title@2025-05-28 (3): BiMi Sheets: Infosheets for bias mitigation methods

Title: BiMi Sheets: Infosheets for bias mitigation methods

BiMi Sheets: Infosheets für Methoden zur Biasminderung

BiMi 工作表:用于减少偏差方法的信息表 2505.22114v1

Authors: MaryBeth Defrance, Guillaume Bied, Maarten Buyl, Jefrey Lijffijt, Tijl De Bie

Over the past 15 years, hundreds of bias mitigation methods have been proposed in the pursuit of fairness in machine learning (ML). However, algorithmic biases are domain-, task-, and model-specific, leading to a ‘portability trap’: bias mitigation solutions in one context may not be appropriate in another. Thus, a myriad of design choices have to be made when creating a bias mitigation method, such as the formalization of fairness it pursues, and where and how it intervenes in the ML pipeline. This creates challenges in benchmarking and comparing the relative merits of different bias mitigation methods, and limits their uptake by practitioners. We propose BiMi Sheets as a portable, uniform guide to document the design choices of any bias mitigation method. This enables researchers and practitioners to quickly learn its main characteristics and to compare them with their desiderata. Furthermore, the sheets’ structure allows for the creation of a structured database of bias mitigation methods. In order to foster the sheets’ adoption, we provide a platform for finding and creating BiMi Sheets at bimisheet.com.

Article 635

Title@2025-05-28 (3): Understanding Model Ensemble in Transferable Adversarial Attack

Title: Understanding Model Ensemble in Transferable Adversarial Attack

Model-Ensemble in übertragbarem Widersacher-Angriff verstehen

理解可转让反向攻击中可相互转让攻击的示范组合 2410.06851v3

Authors: Wei Yao, Zeliang Zhang, Huayi Tang, Yong Liu

Model ensemble adversarial attack has become a powerful method for generating transferable adversarial examples that can target even unknown models, but its theoretical foundation remains underexplored. To address this gap, we provide early theoretical insights that serve as a roadmap for advancing model ensemble adversarial attack. We first define transferability error to measure the error in adversarial transferability, alongside concepts of diversity and empirical model ensemble Rademacher complexity. We then decompose the transferability error into vulnerability, diversity, and a constant, which rigorously explains the origin of transferability error in model ensemble attack: the vulnerability of an adversarial example to ensemble components, and the diversity of ensemble components. Furthermore, we apply recent mathematical tools from information theory to bound the transferability error using complexity and generalization terms, leading to three practical guidelines for reducing transferability error: (1) incorporating more surrogate models, (2) increasing their diversity, and (3) reducing their complexity in cases of overfitting. Finally, extensive experiments with 54 models validate our theoretical framework, representing a significant step forward in understanding transferable model ensemble adversarial attacks.

Article 636

Title@2025-05-28 (3): The quest for the GRAph Level autoEncoder (GRALE)

Title: The quest for the GRAph Level autoEncoder (GRALE)

Die Suche nach dem GRAph Level AutoEncoder (GRALE)

寻求GRALE(GRALE)的GRAP 高级自动编码器(GRALE) 2505.22109v1

Authors: Paul Krzakala, Gabriel Melo, Charlotte Laclau, Florence d’Alché-Buc, Rémi Flamary

Although graph-based learning has attracted a lot of attention, graph representation learning is still a challenging task whose resolution may impact key application fields such as chemistry or biology. To this end, we introduce GRALE, a novel graph autoencoder that encodes and decodes graphs of varying sizes into a shared embedding space. GRALE is trained using an Optimal Transport-inspired loss that compares the original and reconstructed graphs and leverages a differentiable node matching module, which is trained jointly with the encoder and decoder. The proposed attention-based architecture relies on Evoformer, the core component of AlphaFold, which we extend to support both graph encoding and decoding. We show, in numerical experiments on simulated and molecular data, that GRALE enables a highly general form of pre-training, applicable to a wide range of downstream tasks, from classification and regression to more complex tasks such as graph interpolation, editing, matching, and prediction.

Article 637

Title@2025-05-28 (3): Inclusive, Differentially Private Federated Learning for Clinical Data

Title: Inclusive, Differentially Private Federated Learning for Clinical Data

Inklusives, differenziert privates Federated Learning für klinische Daten

包容性、差异化私联校临床数据学习 2505.22108v1

Authors: Santhosh Parampottupadam, Melih Coşğun, Sarthak Pati, Maximilian Zenk, Saikat Roy, Dimitrios Bounias, Benjamin Hamm, Sinem Sav, Ralf Floca, Klaus Maier-Hein

Federated Learning (FL) offers a promising approach for training clinical AI models without centralizing sensitive patient data. However, its real-world adoption is hindered by challenges related to privacy, resource constraints, and compliance. Existing Differential Privacy (DP) approaches often apply uniform noise, which disproportionately degrades model performance, even among well-compliant institutions. In this work, we propose a novel compliance-aware FL framework that enhances DP by adaptively adjusting noise based on quantifiable client compliance scores. Additionally, we introduce a compliance scoring tool based on key healthcare and security standards to promote secure, inclusive, and equitable participation across diverse clinical settings. Extensive experiments on public datasets demonstrate that integrating under-resourced, less compliant clinics with highly regulated institutions yields accuracy improvements of up to 15% over traditional FL. This work advances FL by balancing privacy, compliance, and performance, making it a viable solution for real-world clinical workflows in global healthcare.

Article 638

Title@2025-05-28 (3): Curse of High Dimensionality Issue in Transformer for Long-context Modeling

Title: Curse of High Dimensionality Issue in Transformer for Long-context Modeling

Fluch der Hochdimensionalitätsfrage im Transformer für die Langkontextmodellierung

变异器中高多维度问题的诅咒,用于长期建模 2505.22107v1

Authors: Shuhai Zhang, Zeng You, Yaofo Chen, Zhiquan Wen, Qianyue Wang, Zhijie Qiu, Yuanqing Li, Mingkui Tan

Transformer-based large language models (LLMs) excel in natural language processing tasks by capturing long-range dependencies through self-attention mechanisms. However, long-context modeling faces significant computational inefficiencies due to redundant attention computations: while attention weights are often sparse, all tokens consume equal computational resources. In this paper, we reformulate traditional probabilistic sequence modeling as a supervised learning task, enabling the separation of relevant and irrelevant tokens and providing a clearer understanding of redundancy. Based on this reformulation, we theoretically analyze attention sparsity, revealing that only a few tokens significantly contribute to predictions. Building on this, we formulate attention optimization as a linear coding problem and propose a group coding strategy, theoretically showing its ability to improve robustness against random noise and enhance learning efficiency. Motivated by this, we propose Dynamic Group Attention (DGA), which leverages the group coding to explicitly reduce redundancy by aggregating less important tokens during attention computation. Empirical results show that our DGA significantly reduces computational costs while maintaining competitive performance. Code is available at https://github.com/bolixinyu/DynamicGroupAttention.

Article 639

Title@2025-05-28 (3): Devil is in the Details: Density Guidance for Detail-Aware Generation with Flow Models

Title: Devil is in the Details: Density Guidance for Detail-Aware Generation with Flow Models

Devil ist in den Details: Dichte-Anleitung für Detail-Aware-Generation mit Flow-Modellen

魔鬼在细节中: 使用流动模型生成详细软件的密度指导 2502.05807v2

Authors: Rafał Karczewski, Markus Heinonen, Vikas Garg

Diffusion models have emerged as a powerful class of generative models, capable of producing high-quality images by mapping noise to a data distribution. However, recent findings suggest that image likelihood does not align with perceptual quality: high-likelihood samples tend to be smooth, while lower-likelihood ones are more detailed. Controlling sample density is thus crucial for balancing realism and detail. In this paper, we analyze an existing technique, Prior Guidance, which scales the latent code to influence image detail. We introduce score alignment, a condition that explains why this method works and show that it can be tractably checked for any continuous normalizing flow model. We then propose Density Guidance, a principled modification of the generative ODE that enables exact log-density control during sampling. Finally, we extend Density Guidance to stochastic sampling, ensuring precise log-density control while allowing controlled variation in structure or fine details. Our experiments demonstrate that these techniques provide fine-grained control over image detail without compromising sample quality. Code is available at https://github.com/Aalto-QuML/density-guidance.

Article 640

Title@2025-05-28 (3): Visuospatial Cognitive Assistant

Title: Visuospatial Cognitive Assistant

Visuospatial Cognitive Assistant

活性呼吸空间感知助理 2505.12312v3

Authors: Qi Feng

Video-based spatial cognition is vital for robotics and embodied AI but challenges current Vision-Language Models (VLMs). This paper makes two key contributions. First, we introduce ViCA (Visuospatial Cognitive Assistant)-322K, a diverse dataset of 322,003 QA pairs from real-world indoor videos (ARKitScenes, ScanNet, ScanNet++), offering supervision for 3D metadata-grounded queries and video-based complex reasoning. Second, we develop ViCA-7B, fine-tuned on ViCA-322K, which achieves new state-of-the-art on all eight VSI-Bench tasks, outperforming existing models, including larger ones (e.g., +26.1 on Absolute Distance). For interpretability, we present ViCA-Thinking-2.68K, a dataset with explicit reasoning chains, and fine-tune ViCA-7B to create ViCA-7B-Thinking, a model that articulates its spatial reasoning. Our work highlights the importance of targeted data and suggests paths for improved temporal-spatial modeling. We release all resources to foster research in robust visuospatial intelligence.

Article 641

Title@2025-05-28 (3): Efficient Dynamic Shielding for Parametric Safety Specifications

Title: Efficient Dynamic Shielding for Parametric Safety Specifications

Effiziente dynamische Abschirmung für parametrische Sicherheitsspezifikationen

用于参数安全规格的有效动态防护 2505.22104v1

Authors: Davide Corsi, Kaushik Mallik, Andoni Rodriguez, Cesar Sanchez

Shielding has emerged as a promising approach for ensuring safety of AI-controlled autonomous systems. The algorithmic goal is to compute a shield, which is a runtime safety enforcement tool that monitors the AI controller’s actions and intervenes when safety would otherwise be compromised. Traditional shields are designed statically for a specific safety requirement. Therefore, if the safety requirement changes at runtime due to changing operating conditions, the shield needs to be recomputed from scratch, causing delays that could be fatal. We introduce dynamic shields for parametric safety specifications, which are succinctly represented sets of all possible safety specifications that may be encountered at runtime. Our dynamic shields are statically designed for a given safety parameter set, and are able to dynamically adapt as the true safety specification (permissible by the parameters) is revealed at runtime. The main algorithmic novelty lies in the dynamic adaptation procedure, which is a simple and fast algorithm that utilizes known features of standard safety shields, like maximal permissiveness. We report experimental results for a robot navigation problem in unknown territories, where the safety specification evolves as new obstacles are discovered at runtime. In our experiments, the dynamic shields took a few minutes for their offline design, and took between a fraction of a second and a few seconds for online adaptation at each step, whereas the brute-force online recomputation approach was up to 5 times slower.

Article 642

Title@2025-05-28 (3): Towards Visuospatial Cognition via Hierarchical Fusion of Visual Experts

Title: Towards Visuospatial Cognition via Hierarchical Fusion of Visual Experts

Auf dem Weg zur Visuospatialen Kognition durch hierarchische Fusion von visuellen Experten

争取通过视觉专家的等级化融合实现纵向空间聚合 2505.12363v3

Authors: Qi Feng

While Multimodal Large Language Models (MLLMs) excel at general vision-language tasks, visuospatial cognition (reasoning about spatial layouts, relations, and dynamics) remains a significant challenge. Existing models often lack the necessary architectural components and specialized training data for fine-grained spatial understanding. We introduce ViCA2 (Visuospatial Cognitive Assistant 2), a novel MLLM designed to enhance spatial reasoning. ViCA2 features a dual vision encoder architecture integrating SigLIP for semantics and Hiera for spatial structure, coupled with a token ratio control mechanism for efficiency. We also developed ViCA-322K, a new large-scale dataset with over 322,000 spatially grounded question-answer pairs for targeted instruction tuning. On the challenging VSI-Bench benchmark, our ViCA2-7B model achieves a state-of-the-art average score of 56.8, significantly surpassing larger open-source models (e.g., LLaVA-NeXT-Video-72B, 40.9) and leading proprietary models (Gemini-1.5 Pro, 45.4). This demonstrates the effectiveness of our approach in achieving strong visuospatial intelligence with a compact model. We release ViCA2, its codebase, and the ViCA-322K dataset to facilitate further research.

Article 643

Title@2025-05-28 (3): Conditional Denoising Meets Polynomial Modeling: A Flexible Decoupled Framework for Time Series Forecasting

Title: Conditional Denoising Meets Polynomial Modeling: A Flexible Decoupled Framework for Time Series Forecasting

Bedingtes Stören trifft auf Polynommodellierung: Ein flexibles entkoppeltes Framework für die Zeitreihenprognose

满足多面性建模:时间序列预测灵活拆分框架 2410.13253v6

Authors: Jintao Zhang, Mingyue Cheng, Xiaoyu Tao, Zhiding Liu, Daoyu Wang

Time series forecasting models are becoming increasingly prevalent due to their critical role in decision-making across various domains. However, most existing approaches represent the coupled temporal patterns, often neglecting the distinction between their specific components. In particular, fluctuating patterns and smooth trends within time series exhibit distinct characteristics. In this work, to model complicated temporal patterns, we propose a Conditional Denoising Polynomial Modeling (CDPM) framework, where probabilistic diffusion models and deterministic linear models are trained end-to-end. Instead of modeling the coupled time series, CDPM decomposes it into trend and seasonal components and models them separately. To capture the fluctuating seasonal component, we employ a probabilistic diffusion model based on statistical properties from the historical window. For the smooth trend component, a module is proposed to enhance linear models by incorporating historical dependencies, thereby preserving underlying trends and mitigating noise distortion. Extensive experiments conducted on six benchmarks demonstrate the effectiveness of our framework, highlighting the potential of combining probabilistic and deterministic models. Our code is available at https://github.com/zjt-gpu/CDPM.
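
To make the trend/seasonal split concrete, here is a minimal sketch using a centered moving average, a common decomposition in forecasting pipelines. Whether CDPM uses this particular decomposition is an assumption; the window size and the synthetic series are also invented for the example.

```python
import numpy as np


def decompose(series: np.ndarray, window: int):
    """Split a 1-D series into a moving-average trend and the remainder."""
    pad_left = (window - 1) // 2
    pad_right = window - 1 - pad_left
    # Repeat the edge values so the trend has the same length as the input.
    padded = np.pad(series, (pad_left, pad_right), mode="edge")
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode="valid")
    seasonal = series - trend
    return trend, seasonal


# Synthetic series: linear trend plus a periodic fluctuation.
t = np.arange(48, dtype=float)
series = 0.1 * t + np.sin(2.0 * np.pi * t / 12.0)

trend, seasonal = decompose(series, window=12)
print(trend[:3], seasonal[:3])
```

In a decoupled framework of the kind the abstract describes, the smooth `trend` would go to the deterministic model and the fluctuating `seasonal` part to the probabilistic one; the two forecasts are then added back together.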

Article 644

Title@2025-05-28 (3): On the Transferability and Discriminability of Representation Learning in Unsupervised Domain Adaptation

Title: On the Transferability and Discriminability of Representation Learning in Unsupervised Domain Adaptation

Über die Übertragbarkeit und Diskriminierbarkeit von Representation Learning in unüberwachter Domain-Anpassung

关于无监督域适应中可转让性和可转让性 2505.22099v1

Authors: Wenwen Qiang, Ziyin Gu, Lingyu Si, Jiangmeng Li, Changwen Zheng, Fuchun Sun, Hui Xiong

In this paper, we addressed the limitation of relying solely on distribution alignment and source-domain empirical risk minimization in Unsupervised Domain Adaptation (UDA). Our information-theoretic analysis showed that this standard adversarial-based framework neglects the discriminability of target-domain features, leading to suboptimal performance. To bridge this theoretical-practical gap, we defined “good representation learning” as guaranteeing both transferability and discriminability, and proved that an additional loss term targeting target-domain discriminability is necessary. Building on these insights, we proposed a novel adversarial-based UDA framework that explicitly integrates a domain alignment objective with a discriminability-enhancing constraint. Instantiated as Domain-Invariant Representation Learning with Global and Local Consistency (RLGLC), our method leverages Asymmetrically-Relaxed Wasserstein of Wasserstein Distance (AR-WWD) to address class imbalance and semantic dimension weighting, and employs a local consistency mechanism to preserve fine-grained target-domain discriminative information. Extensive experiments across multiple benchmark datasets demonstrate that RLGLC consistently surpasses state-of-the-art methods, confirming the value of our theoretical perspective and underscoring the necessity of enforcing both transferability and discriminability in adversarial-based UDA.

Article 645

Title@2025-05-28 (3): Knowledge Base Construction for Knowledge-Augmented Text-to-SQL

Title: Knowledge Base Construction for Knowledge-Augmented Text-to-SQL

Knowledge Base Construction für wissensbasierte Text-zu-SQL

知识强化文字到SQL知识基础建设 2505.22096v1

Authors: Jinheon Baek, Horst Samulowitz, Oktie Hassanzadeh, Dharmashankar Subramanian, Sola Shirai, Alfio Gliozzo, Debarun Bhattacharjya

Text-to-SQL aims to translate natural language queries into SQL statements, which is practical as it enables anyone to easily retrieve the desired information from databases. Recently, many approaches have tackled this problem with Large Language Models (LLMs), leveraging their strong capability in understanding user queries and generating corresponding SQL code. Yet, the parametric knowledge in LLMs may be insufficient to cover all the diverse and domain-specific queries that require grounding in various database schemas, which often makes the generated SQL less accurate. To tackle this, we propose constructing a knowledge base for text-to-SQL, a foundational source of knowledge from which we retrieve and generate the necessary knowledge for given queries. In particular, unlike existing approaches that either manually annotate knowledge or generate only a few pieces of knowledge for each query, our knowledge base is comprehensive: it is constructed from a combination of all the available questions, their associated database schemas, and their relevant knowledge, and it can be reused for unseen databases from different datasets and domains. We validate our approach on multiple text-to-SQL datasets, considering both the overlapping and non-overlapping database scenarios, where it outperforms relevant baselines substantially.

Article 646

Title@2025-05-28 (3): Diffusion Models as Cartoonists: The Curious Case of High Density Regions

Title: Diffusion Models as Cartoonists: The Curious Case of High Density Regions

Diffusionsmodelle als Karikaturisten: Der seltsame Fall von Regionen mit hoher Dichte

作为漫画家的传播模型:高密度地区令人好奇的案例 2411.01293v4

Authors: Rafał Karczewski, Markus Heinonen, Vikas Garg

We investigate what kind of images lie in the high-density regions of diffusion models. We introduce a theoretical mode-tracking process capable of pinpointing the exact mode of the denoising distribution, and we propose a practical high-density sampler that consistently generates images of higher likelihood than usual samplers. Our empirical findings reveal the existence of significantly higher likelihood samples that typical samplers do not produce, often manifesting as cartoon-like drawings or blurry images depending on the noise level. Curiously, these patterns emerge in datasets devoid of such examples. We also present a novel approach to track sample likelihoods in diffusion SDEs, which remarkably incurs no additional computational cost. Code is available at https://github.com/Aalto-QuML/high-density-diffusion.

Article 647

Title@2025-05-28 (3): High Volume Rate 3D Ultrasound Reconstruction with Diffusion Models

Title: High Volume Rate 3D Ultrasound Reconstruction with Diffusion Models

Hohe Lautstärke 3D-Ultraschall-Rekonstruktion mit Diffusions-Modellen

3D超声波重建,采用传播模型 2505.22090v1

Authors: Tristan S. W. Stevens, Oisín Nolan, Oudom Somphone, Jean-Luc Robert, Ruud J. G. van Sloun

Three-dimensional ultrasound enables real-time volumetric visualization of anatomical structures. Unlike traditional 2D ultrasound, 3D imaging reduces the reliance on precise probe orientation, potentially making ultrasound more accessible to clinicians with varying levels of experience and improving automated measurements and post-exam analysis. However, achieving both high volume rates and high image quality remains a significant challenge. While 3D diverging waves can provide high volume rates, they suffer from limited tissue harmonic generation and increased multipath effects, which degrade image quality. One compromise is to retain the focusing in elevation while leveraging unfocused diverging waves in the lateral direction to reduce the number of transmissions per elevation plane. Reaching the volume rates achieved by full 3D diverging waves, however, requires dramatically undersampling the number of elevation planes. Subsequently, to render the full volume, simple interpolation techniques are applied. This paper introduces a novel approach to 3D ultrasound reconstruction from a reduced set of elevation planes by employing diffusion models (DMs) to achieve increased spatial and temporal resolution. We compare both traditional and supervised deep learning-based interpolation methods on a 3D cardiac ultrasound dataset. Our results show that DM-based reconstruction consistently outperforms the baselines in image quality and downstream task performance. Additionally, we accelerate inference by leveraging the temporal consistency inherent to ultrasound sequences. Finally, we explore the robustness of the proposed method by exploiting the probabilistic nature of diffusion posterior sampling to quantify reconstruction uncertainty and demonstrate improved recall on out-of-distribution data with synthetic anomalies under strong subsampling.

Article 648

Title@2025-05-28 (3): Base and Exponent Prediction in Mathematical Expressions using Multi-Output CNN

Title: Base and Exponent Prediction in Mathematical Expressions using Multi-Output CNN

Basis- und Exponentvorhersage in mathematischen Ausdrücken mit Multi-Output CNN

利用有线电视新闻网的多种产出对数学表达式进行基础和指数预测 2407.14967v2

Authors: Md Laraib Salam, Akash S Balsaraf, Gaurav Gupta, Ashish Rajeshwar Kulkarni

The use of neural networks and deep learning techniques in image processing has significantly advanced the field, enabling highly accurate recognition results. However, achieving high recognition rates often necessitates complex network models, which can be challenging to train and require substantial computational resources. This research presents a simplified yet effective approach to predicting both the base and exponent from images of mathematical expressions using a multi-output Convolutional Neural Network (CNN). The model is trained on 10,900 synthetically generated images containing exponent expressions, incorporating random noise, font size variations, and blur intensity to simulate real-world conditions. The proposed CNN model demonstrates robust performance with efficient training time. The experimental results indicate that the model achieves high accuracy in predicting the base and exponent values, proving the efficacy of this approach in handling noisy and varied input images.

Article 649

Title@2025-05-28 (3): Domain-Specific Pruning of Large Mixture-of-Experts Models with Few-shot Demonstrations

Title: Domain-Specific Pruning of Large Mixture-of-Experts Models with Few-shot Demonstrations

Domain-spezifisches Pruning von großen Mixture-of-Experts-Modellen mit nur wenigen Demonstrationen

大型混合型专家模型的域特定情景,少发示范 2504.06792v2

Authors: Zican Dong, Han Peng, Peiyu Liu, Wayne Xin Zhao, Dong Wu, Feng Xiao, Zhifeng Wang

Mixture-of-Experts (MoE) models achieve a favorable trade-off between performance and inference efficiency by activating only a subset of experts. However, the memory overhead of storing all experts remains a major limitation, especially in large-scale MoE models such as DeepSeek-R1 (671B). In this study, we investigate domain specialization and expert redundancy in large-scale MoE models and uncover a consistent behavior we term few-shot expert localization: with only a few in-domain demonstrations, the model consistently activates a sparse and stable subset of experts on tasks within the same domain. Building on this observation, we propose a simple yet effective pruning framework, EASY-EP, that leverages a few domain-specific demonstrations to identify and retain only the most relevant experts. EASY-EP comprises two key components: output-aware expert importance assessment and expert-level token contribution estimation. The former evaluates the importance of each expert for the current token by considering the gating scores and the L2 norms of the outputs of activated experts, while the latter assesses the contribution of tokens based on representation similarities before and after routed experts. Experiments on DeepSeek-R1 and DeepSeek-V3-0324 show that, with only half the experts, our method achieves performance comparable to the full model and $2.99\times$ throughput under the same memory budget.
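
The two components described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the product of gate score and output norm, the `1 - cosine` token weight, and the function names `expert_scores` and `select_experts` are all assumptions made for the example.

```python
import numpy as np

def expert_scores(gates, out_norms, h_before, h_after):
    """Aggregate a per-expert relevance score over the demonstration tokens.

    gates:     (T, E) gating scores, zero for experts not activated on a token
    out_norms: (T, E) L2 norm of each expert's output for each token
    h_before:  (T, D) hidden states entering the routed experts
    h_after:   (T, D) hidden states leaving the routed experts
    """
    # Output-aware importance (assumed form): gate score times output norm,
    # normalised per token so every token distributes one unit of importance.
    importance = gates * out_norms
    importance = importance / (importance.sum(axis=1, keepdims=True) + 1e-12)

    # Token contribution (assumed form): 1 - cosine similarity, so tokens whose
    # representation is changed most by the routed experts weigh more.
    cos = np.sum(h_before * h_after, axis=1) / (
        np.linalg.norm(h_before, axis=1) * np.linalg.norm(h_after, axis=1) + 1e-12
    )
    token_weight = 1.0 - cos

    return token_weight @ importance  # shape (E,)

def select_experts(scores, keep):
    """Indices of the `keep` highest-scoring experts, in ascending order."""
    return np.sort(np.argsort(scores)[::-1][:keep])

# Toy demonstration: 2 tokens, 4 experts, 2 experts active per token.
gates = np.array([[0.7, 0.3, 0.0, 0.0],
                  [0.6, 0.0, 0.4, 0.0]])
out_norms = np.ones_like(gates)
h_before = np.array([[1.0, 0.0], [1.0, 0.0]])
h_after = np.array([[0.0, 1.0], [0.0, 1.0]])
scores = expert_scores(gates, out_norms, h_before, h_after)
kept = select_experts(scores, keep=2)
```

In this toy setup expert 0 is activated on both tokens, so it collects the largest score and is retained together with expert 2.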

Article 650

Title@2025-05-28 (3): PADAM: Parallel averaged Adam reduces the error for stochastic optimization in scientific machine learning

Title: PADAM: Parallel averaged Adam reduces the error for stochastic optimization in scientific machine learning

PADAM: Parallel gemittelter Adam reduziert Fehler bei stochastischer Optimierung im wissenschaftlichen maschinellen Lernen

PADAM: 平行平均 Adam 减少科学机器学习中随机优化的错误 2505.22085v1

Authors: Arnulf Jentzen, Julian Kranz, Adrian Riekert

Averaging techniques such as Ruppert–Polyak averaging and exponential moving averaging (EMA) are powerful approaches to accelerate optimization procedures of stochastic gradient descent (SGD) optimization methods such as the popular ADAM optimizer. However, depending on the specific optimization problem under consideration, the type and the parameters for the averaging need to be adjusted to achieve the smallest optimization error. In this work we propose an averaging approach, which we refer to as parallel averaged ADAM (PADAM), in which we compute different averaged variants of ADAM in parallel and, during the training process, dynamically select the variant with the smallest optimization error. A central feature of this approach is that it requires no more gradient evaluations than the usual ADAM optimizer, as each of the averaged trajectories relies on the same underlying ADAM trajectory and thus on the same underlying gradients. We test the proposed PADAM optimizer in 13 stochastic optimization and deep neural network (DNN) learning problems and compare its performance with known optimizers from the literature such as standard SGD, momentum SGD, Adam with and without EMA, and ADAMW. In particular, we apply the compared optimizers to physics-informed neural network, deep Galerkin, deep backward stochastic differential equation and deep Kolmogorov approximations for boundary value partial differential equation problems from scientific machine learning, as well as to DNN approximations for optimal control and optimal stopping problems. In nearly all of the considered examples PADAM achieves, sometimes among others and sometimes exclusively, essentially the smallest optimization error. This work thus strongly suggests considering PADAM for scientific machine learning problems and also motivates further research on adaptive averaging procedures within the training of DNNs.
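
The idea of sharing one Adam trajectory among several averaged variants can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' algorithm: only EMA variants are tracked, the decay rates are arbitrary, the best variant is chosen once at the end rather than dynamically during training, and `padam_sketch` is a hypothetical name.

```python
import numpy as np

def padam_sketch(grad_fn, loss_fn, theta0, steps=2000, lr=1e-2,
                 decays=(0.0, 0.9, 0.99, 0.999),
                 beta1=0.9, beta2=0.999, eps=1e-8):
    """Run one Adam trajectory and keep several EMA-averaged copies of it."""
    theta = np.array(theta0, dtype=float)
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    # One running average per decay rate; decay 0.0 is the raw Adam iterate.
    averages = [theta.copy() for _ in decays]

    for t in range(1, steps + 1):
        g = grad_fn(theta)  # single gradient evaluation shared by all variants
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
        averages = [d * a + (1 - d) * theta for d, a in zip(decays, averages)]

    # Select the averaged variant with the smallest (estimated) loss.
    losses = [loss_fn(a) for a in averages]
    best = int(np.argmin(losses))
    return averages[best], decays[best], losses

# Toy problem: noisy gradients of the quadratic 0.5 * ||x||^2.
rng = np.random.default_rng(0)
loss = lambda x: 0.5 * float(x @ x)
noisy_grad = lambda x: x + rng.standard_normal(x.shape)
best_theta, best_decay, losses = padam_sketch(noisy_grad, loss, np.ones(5))
```

Because every average is a function of the same iterates, adding more variants costs memory and a few vector operations per step but no extra gradient evaluations.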

Article 651

Title@2025-05-28 (3): Hyperbolic recurrent neural network as the first type of non-Euclidean neural quantum state ansatz

Title: Hyperbolic recurrent neural network as the first type of non-Euclidean neural quantum state ansatz

Hyperbolisches rezidivierendes neuronales Netzwerk als erste Art von nicht-euklidischen neuronalen Quantenzustandsansatz

超双曲经常性神经网络,作为第一种非欧洲的神经量子状态 ansatz 2505.22083v1

Authors: H. L. Dao

In this work, we introduce the first type of non-Euclidean neural quantum state (NQS) ansatz, in the form of the hyperbolic GRU (a variant of recurrent neural networks (RNNs)), to be used in the Variational Monte Carlo method of approximating the ground state wavefunction for quantum many-body systems. In particular, we examine the performances of NQS ansatzes constructed from both conventional (Euclidean) RNN/GRU and hyperbolic GRU in the prototypical settings of the one- and two-dimensional transverse field Ising models (TFIM) of up to 100 spins and the one-dimensional Heisenberg $J_1J_2$ and $J_1J_2J_3$ systems of up to 50 spins. Since, in all of the experiments performed in this work, hyperbolic GRU yields performances comparable to or better than Euclidean RNNs, which have been extensively studied in these settings in the literature, our work is a proof-of-concept for the viability of hyperbolic GRU as the first type of non-Euclidean NQS ansatz for quantum many-body systems. Furthermore, in settings where the Hamiltonian displays a clear hierarchical interaction structure, such as the 1D Heisenberg $J_1J_2$ and $J_1J_2J_3$ systems with the 1st, 2nd and even 3rd nearest neighbor interactions, our results show that hyperbolic GRU definitively outperforms its Euclidean version in all instances. These results are reminiscent of established ones from natural language processing, where hyperbolic GRU almost always outperforms Euclidean RNNs when the training data exhibit a tree-like or hierarchical structure. This leads us to hypothesize that the hyperbolic GRU NQS ansatz would likely outperform the Euclidean RNN/GRU NQS ansatz in quantum spin systems that involve different degrees of nearest neighbor interactions. Finally, with this work, we hope to initiate future studies of other types of non-Euclidean NQS beyond hyperbolic GRU.

Article 652

Title@2025-05-28 (3): Improved Bounds for Swap Multicalibration and Swap Omniprediction

Title: Improved Bounds for Swap Multicalibration and Swap Omniprediction

Verbesserte Bounds für Swap Multikalibrierung und Swap Omniprediction

用于交换多校准和交换面宽度的改进宽度 2505.20885v2

Authors: Haipeng Luo, Spandan Senapati, Vatsal Sharan

In this paper, we consider the related problems of multicalibration (a multigroup fairness notion) and omniprediction (a simultaneous loss minimization paradigm), both in the distributional and online settings. The recent work of Garg et al. (2024) raised the open problem of whether it is possible to efficiently achieve $O(\sqrt{T})$ $\ell_{2}$-multicalibration error against bounded linear functions. In this paper, we answer this question in a strongly affirmative sense. We propose an efficient algorithm that achieves $O(T^{\frac{1}{3}})$ $\ell_{2}$-swap multicalibration error (both in high probability and expectation). On propagating this bound onward, we obtain significantly improved rates for $\ell_{1}$-swap multicalibration and swap omniprediction for a loss class of convex Lipschitz functions. In particular, we show that our algorithm achieves $O(T^{\frac{2}{3}})$ $\ell_{1}$-swap multicalibration and swap omniprediction errors, thereby improving upon the previous best-known bound of $O(T^{\frac{7}{8}})$. As a consequence of our improved online results, we further obtain several improved sample complexity rates in the distributional setting. In particular, we establish an $O(\varepsilon ^ {-3})$ sample complexity of efficiently learning an $\varepsilon$-swap omnipredictor for the class of convex and Lipschitz functions, an $O(\varepsilon ^{-2.5})$ sample complexity of efficiently learning an $\varepsilon$-swap agnostic learner for the squared loss, and $O(\varepsilon ^ {-5}), O(\varepsilon ^ {-2.5})$ sample complexities of learning $\ell_{1}, \ell_{2}$-swap multicalibrated predictors against linear functions, all of which significantly improve on the previous best-known bounds.
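
The step from the $\ell_{2}$ rate to the $\ell_{1}$ rate is consistent with a standard Cauchy-Schwarz argument. The following is a heuristic sketch in a simplified calibration-style setting, not the paper's proof: $n_p$ (the number of rounds with prediction $p$) and $b_p$ (the average bias on those rounds) are notation assumed here, and unnormalized errors are used.

$$
\underbrace{\sum_{p} n_p\,|b_p|}_{\ell_1\text{ error}}
= \sum_{p} \sqrt{n_p}\cdot\sqrt{n_p}\,|b_p|
\le \sqrt{\sum_{p} n_p}\;\sqrt{\sum_{p} n_p\,b_p^{2}}
= \sqrt{T\cdot\underbrace{\textstyle\sum_{p} n_p\,b_p^{2}}_{\ell_2\text{ error}}}
$$

Substituting an $\ell_{2}$ error of $O(T^{\frac{1}{3}})$ gives $\sqrt{T\cdot T^{\frac{1}{3}}} = T^{\frac{2}{3}}$, which matches the stated $\ell_{1}$ rate. Likewise, an average error of $T^{\frac{2}{3}}/T = T^{-\frac{1}{3}} \le \varepsilon$ requires $T \ge \varepsilon^{-3}$, in line with the $O(\varepsilon^{-3})$ sample complexity quoted for swap omniprediction.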

Article 653

Title@2025-05-28 (3): LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation

Title: LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation

LongReD: Degradierung von Langtext-Großen Sprachmodellen durch Restaurationsdestillation

LongReD:通过恢复蒸馏减少长长长大语言模型的短期退化 2502.07365v3

Authors: Zican Dong, Junyi Li, Jinhao Jiang, Mingyu Xu, Wayne Xin Zhao, Bingning Wang, Weipeng Chen

Large language models (LLMs) have gained extended context windows through scaling positional encodings and lightweight continual pre-training. However, this often leads to degraded performance on short-text tasks, while the reasons for this degradation remain insufficiently explored. In this work, we identify two primary factors contributing to this issue: distribution drift in hidden states and attention scores, and catastrophic forgetting during continual pre-training. To address these challenges, we propose Long Context Pre-training with Restoration Distillation (LongReD), a novel approach designed to mitigate short-text performance degradation through minimizing the distribution discrepancy between the extended and original models. Besides training on long texts, LongReD distills the hidden state of selected layers from the original model on short texts. Additionally, LongReD also introduces a short-to-long distillation, aligning the output distribution on short texts with that on long texts by leveraging skipped positional indices. Experiments on common text benchmarks demonstrate that LongReD effectively preserves the model’s short-text performance while maintaining comparable or even better capacity to handle long texts than baselines. Our code is available at https://github.com/RUCAIBox/LongReD.

Article 654

Title@2025-05-28 (3): A Hybrid Multi-Factor Network with Dynamic Sequence Modeling for Early Warning of Intraoperative Hypotension

Title: A Hybrid Multi-Factor Network with Dynamic Sequence Modeling for Early Warning of Intraoperative Hypotension

Hybrides Multi-Factor-Netzwerk mit dynamischer Sequenzmodellierung zur Frühwarnung von intraoperativer Hypotonie

混合多要素网络,具有动态序列模型模型,以及早警告不合作水分的不合作状态; 2409.11064v3

Authors: Mingyue Cheng, Jintao Zhang, Zhiding Liu, Chunli Liu

Intraoperative hypotension (IOH) prediction using past physiological signals is crucial, as IOH may lead to inadequate organ perfusion and significantly elevate the risk of severe complications and mortality. However, current methods often rely on static modeling, overlooking the complex temporal dependencies and the inherently non-stationary nature of physiological signals. We propose a Hybrid Multi-Factor (HMF) network that formulates IOH prediction as a dynamic sequence forecasting task, explicitly capturing both temporal dependencies and physiological non-stationarity. We represent signal dynamics as multivariate time series and decompose them into trend and seasonal components, enabling separate modeling of long-term and periodic variations. Each component is encoded with a patch-based Transformer to balance computational efficiency and feature representation. To address distributional drift from evolving signals, we introduce a symmetric normalization mechanism. Experiments on both public and real-world clinical datasets show that HMF significantly outperforms competitive baselines. We hope HMF offers new insights into IOH prediction and ultimately promotes safer surgical care. Our code is available at https://github.com/Mingyue-Cheng/HMF.
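
A trend/seasonal split of a multivariate series can be sketched as below. The abstract does not specify how the decomposition is computed, so this example assumes a plain moving-average trend with edge padding (a common choice in time-series forecasting); the function name `decompose` and the kernel size are hypothetical.

```python
import numpy as np

def decompose(x, kernel=25):
    """Split a (T, C) series into a moving-average trend and the remainder.

    Assumption: the trend is a centred moving average with edge replication;
    the 'seasonal' part is simply whatever the trend does not explain.
    """
    pad_left = (kernel - 1) // 2
    pad_right = kernel - 1 - pad_left
    padded = np.concatenate(
        [np.repeat(x[:1], pad_left, axis=0), x, np.repeat(x[-1:], pad_right, axis=0)],
        axis=0,
    )
    window = np.ones(kernel) / kernel
    # 'valid' convolution over the padded series returns exactly T samples.
    trend = np.stack(
        [np.convolve(padded[:, c], window, mode="valid") for c in range(x.shape[1])],
        axis=1,
    )
    return trend, x - trend

# Toy signal: a linear drift plus an oscillation with period 25.
t = np.arange(200, dtype=float)
signal = (0.05 * t + np.sin(2 * np.pi * t / 25))[:, None]
trend, seasonal = decompose(signal, kernel=25)
```

With the window equal to one full period, the oscillation averages out away from the edges, so the trend recovers the linear drift and the remainder carries the periodic part. Each component could then be encoded separately, as the abstract describes.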

Article 655

Title@2025-05-28 (3): Can Test-time Computation Mitigate Memorization Bias in Neural Symbolic Regression?

Title: Can Test-time Computation Mitigate Memorization Bias in Neural Symbolic Regression?

Kann Testzeit-Computation Mitigate Memorization Bias in Neural Symbolische Regression?

测试时计算在神经符号回落中是否可模拟记忆回弹? 2505.22081v1

Authors: Shun Sato, Issei Sato

Symbolic regression aims to discover mathematical equations that fit given numerical data. It has been applied in various fields of scientific research, such as producing human-readable expressions that explain physical phenomena. Recently, Neural symbolic regression (NSR) methods that involve Transformers pre-trained on large-scale synthetic datasets have gained attention. While these methods offer advantages such as short inference time, they suffer from low performance, particularly when the number of input variables is large. In this study, we hypothesized that this limitation stems from the memorization bias of Transformers in symbolic regression. We conducted a quantitative evaluation of this bias in Transformers using a synthetic dataset and found that Transformers rarely generate expressions not present in the training data. Additional theoretical analysis reveals that this bias arises from the Transformer’s inability to construct expressions compositionally while verifying their numerical validity. We finally examined if tailoring test-time strategies can lead to reduced memorization bias and better performance. We empirically demonstrate that providing additional information to the model at test time can significantly mitigate memorization bias. On the other hand, we also find that reducing memorization bias does not necessarily correlate with improved performance. These findings contribute to a deeper understanding of the limitations of NSR approaches and offer a foundation for designing more robust, generalizable symbolic regression methods. Code is available at https://github.com/Shun-0922/Mem-Bias-NSR .

Article 656

Title@2025-05-28 (3): The Resurrection of the ReLU

Title: The Resurrection of the ReLU

Die Auferstehung der ReLU

鲁鲁的复活, 2505.22074v1

Authors: Coşku Can Horuz, Geoffrey Kasenbacher, Saya Higuchi, Sebastian Kairat, Jendrik Stoltz, Moritz Pesl, Bernhard A. Moser, Christoph Linse, Thomas Martinetz, Sebastian Otte

Modeling sophisticated activation functions within deep learning architectures has evolved into a distinct research direction. Functions such as GELU, SELU, and SiLU offer smooth gradients and improved convergence properties, making them popular choices in state-of-the-art models. Despite this trend, the classical ReLU remains appealing due to its simplicity, inherent sparsity, and other advantageous topological characteristics. However, ReLU units are prone to becoming irreversibly inactive - a phenomenon known as the dying ReLU problem - which limits their overall effectiveness. In this work, we introduce surrogate gradient learning for ReLU (SUGAR) as a novel, plug-and-play regularizer for deep architectures. SUGAR preserves the standard ReLU function during the forward pass but replaces its derivative in the backward pass with a smooth surrogate that avoids zeroing out gradients. We demonstrate that SUGAR, when paired with a well-chosen surrogate function, substantially enhances generalization performance over convolutional network architectures such as VGG-16 and ResNet-18, providing sparser activations while effectively resurrecting dead ReLUs. Moreover, we show that even in modern architectures like Conv2NeXt and Swin Transformer - which typically employ GELU - substituting these with SUGAR yields competitive and even slightly superior performance. These findings challenge the prevailing notion that advanced activation functions are necessary for optimal performance. Instead, they suggest that the conventional ReLU, particularly with appropriate gradient handling, can serve as a strong, versatile revived classic across a broad range of deep learning vision models.
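
The forward/backward asymmetry described above can be illustrated with a hand-written backward pass. This is a minimal sketch, not the authors' code: the surrogate used here is the derivative of SiLU, chosen only as one plausible smooth surrogate, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sugar_forward(x):
    """Forward pass: the ordinary ReLU, so activations stay sparse."""
    return np.maximum(x, 0.0)

def relu_backward(x, grad_out):
    """Standard ReLU gradient: exactly zero wherever the input is negative."""
    return grad_out * (x > 0)

def sugar_backward(x, grad_out):
    """Surrogate gradient (assumed: derivative of SiLU, x * sigmoid(x))."""
    s = sigmoid(x)
    surrogate = s * (1.0 + x * (1.0 - s))
    return grad_out * surrogate

x = np.array([-2.0, -0.5, 0.5, 2.0])
upstream = np.ones_like(x)
out = sugar_forward(x)
g_relu = relu_backward(x, upstream)
g_sugar = sugar_backward(x, upstream)
```

For the two negative inputs the standard gradient is zero, so a unit stuck there receives no update, whereas the surrogate gradient is nonzero and lets the unit move again.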

Article 657

Title@2025-05-28 (3): PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models

Title: PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models

PRMBench: Ein feinkörniger und anspruchsvoller Benchmark für Prozess-Level-Reward-Modelle

PRMBBench:进程一级奖励模式的精细和质疑基准 2501.03124v4

Authors: Mingyang Song, Zhaochen Su, Xiaoye Qu, Jiawei Zhou, Yu Cheng

Process-level Reward Models (PRMs) are crucial for complex reasoning and decision-making tasks, where each intermediate step plays an important role in the reasoning process. Since language models are prone to various types of errors during the reasoning process, PRMs are required to possess nuanced capabilities for detecting various implicit error types in real-world scenarios. However, current benchmarks primarily focus on step correctness, failing to evaluate PRMs’ performance systematically. To address this gap, we introduce PRMBench, a process-level benchmark specifically designed to assess the fine-grained error detection capabilities of PRMs. PRMBench comprises 6,216 carefully designed problems and 83,456 step-level labels, evaluating models across multiple dimensions, including simplicity, soundness, and sensitivity. In our experiments on 15 models, spanning both open-source PRMs and closed-source large language models prompted as critic models, we uncover significant weaknesses in current PRMs. These findings underscore the challenges inherent in process-level evaluation and highlight key directions for future research. We hope PRMBench can be a robust bench for advancing research on PRM evaluation and development.

Article 658

Title@2025-05-28 (3): Message-Passing GNNs Fail to Approximate Sparse Triangular Factorizations

Title: Message-Passing GNNs Fail to Approximate Sparse Triangular Factorizations

Message-Passing-GNNs fehlschlagen an ungefähren Sparse Dreiecks-Fabrizierungen

投送信件 GNN 失败于近似偏差的三角三角因子化 2502.01397v2

Authors: Vladislav Trifonov, Ekaterina Muravleva, Ivan Oseledets

Graph Neural Networks (GNNs) have been proposed as a tool for learning sparse matrix preconditioners, which are key components in accelerating linear solvers. This position paper argues that message-passing GNNs are fundamentally incapable of approximating sparse triangular factorizations. We demonstrate that message-passing GNNs fundamentally fail to approximate sparse triangular factorizations for classes of matrices for which high-quality preconditioners exist but require non-local dependencies. To illustrate this, we construct a set of baselines using both synthetic matrices and real-world examples from the SuiteSparse collection. Across a range of GNN architectures, including Graph Attention Networks and Graph Transformers, we observe severe performance degradation compared to exact or K-optimal factorizations, with cosine similarity dropping below $0.6$ in key cases. Our theoretical and empirical results suggest that architectural innovations beyond message-passing are necessary for applying GNNs to scientific computing tasks such as matrix factorization. Experiments demonstrate that overcoming non-locality alone is insufficient. Tailored architectures are necessary to capture the required dependencies since even a completely non-local Graph Transformer fails to match the proposed baselines.

Article 659

Title@2025-05-28 (3): Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head

Title: Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head

Dual-Head-Wissensdestillation: Optimierung der Logits-Nutzung mit Hilfe eines Hilfskopfes

双头知识蒸馏:用辅助头加强登录的使用 2411.08937v2

Authors: Penghui Yang, Chen-Chen Zong, Sheng-Jun Huang, Lei Feng, Bo An

Traditional knowledge distillation focuses on aligning the student’s predicted probabilities with both ground-truth labels and the teacher’s predicted probabilities. However, the transition to predicted probabilities from logits would obscure certain indispensable information. To address this issue, it is intuitive to additionally introduce a logit-level loss function as a supplement to the widely used probability-level loss function, for exploiting the latent information of logits. Unfortunately, we empirically find that the amalgamation of the newly introduced logit-level loss and the previous probability-level loss will lead to performance degeneration, even trailing behind the performance of employing either loss in isolation. We attribute this phenomenon to the collapse of the classification head, which is verified by our theoretical analysis based on the neural collapse theory. Specifically, the gradients of the two loss functions exhibit contradictions in the linear classifier yet display no such conflict within the backbone. Drawing from the theoretical analysis, we propose a novel method called dual-head knowledge distillation, which partitions the linear classifier into two classification heads responsible for different losses, thereby preserving the beneficial effects of both losses on the backbone while eliminating adverse influences on the classification head. Extensive experiments validate that our method can effectively exploit the information inside the logits and achieve superior performance against state-of-the-art counterparts. Our code is available at: https://github.com/penghui-yang/DHKD.

Article 660

Title@2025-05-28 (3): Learning Latent Graph Structures and their Uncertainty

Title: Learning Latent Graph Structures and their Uncertainty

Lernen Latent Graph Structures und ihre Unsicherheit

学习后边图结构及其不确定性 2405.19933v2

Authors: Alessandro Manenti, Daniele Zambon, Cesare Alippi

Graph neural networks use relational information as an inductive bias to enhance prediction performance. Often, however, the task-relevant relations are unknown, and graph structure learning approaches have been proposed to learn them from data. Given their latent nature, no graph observations are available to provide a direct training signal to the learnable relations. Therefore, graph topologies are typically learned on the prediction task alongside the other graph neural network parameters. In this paper, we demonstrate that minimizing point-prediction losses does not guarantee proper learning of the latent relational information and its associated uncertainty. Conversely, we prove that suitable loss functions on the stochastic model outputs make it possible to solve two tasks simultaneously: (i) learning the unknown distribution of the latent graph and (ii) achieving optimal predictions of the target variable. Finally, we propose a sampling-based method that solves this joint learning task. Empirical results validate our theoretical claims and demonstrate the effectiveness of the proposed approach.

Article 661

Title@2025-05-28 (3): Towards Resilient and Sustainable Global Industrial Systems: An Evolutionary-Based Approach

Title: Towards Resilient and Sustainable Global Industrial Systems: An Evolutionary-Based Approach

Auf dem Weg zu stabilen und nachhaltigen globalen Industriesystemen: ein evolutionärer Ansatz

走向具有复原力和可持续的全球工业系统:基于演变的方法 2503.11688v2

Authors: Václav Jirkovský, Jiří Kubalík, Petr Kadera, Arnd Schirrmann, Andreas Mitschke, Andreas Zindel

This paper presents a new complex optimization problem in the field of automatic design of advanced industrial systems and proposes a hybrid optimization approach to solve the problem. The problem is multi-objective as it aims at finding solutions that minimize CO2 emissions, transportation time, and costs. The optimization approach combines an evolutionary algorithm and classical mathematical programming to design resilient and sustainable global manufacturing networks. Further, it makes use of the OWL ontology for data consistency and constraint management. The experimental validation demonstrates the effectiveness of the approach in both single and double sourcing scenarios. The proposed methodology, in general, can be applied to any industry case with complex manufacturing and supply chain challenges.

Article 662

Title@2025-05-28 (3): Quantum Kernel Learning for Small Dataset Modeling in Semiconductor Fabrication: Application to Ohmic Contact

Title: Quantum Kernel Learning for Small Dataset Modeling in Semiconductor Fabrication: Application to Ohmic Contact

Quanten-Kernel-Lernen für kleine Datensätze Modellierung in Halbleiterfertigung: Anwendung auf Ohm-Kontakt

半导体制造中小型数据集建模的量子核心学习: Ohmic 接触的应用 2409.10803v3

Authors: Zeheng Wang, Fangzhou Wang, Liang Li, Zirui Wang, Timothy van der Laan, Ross C. C. Leon, Jing-Kai Huang, Muhammad Usman

Modeling complex semiconductor fabrication processes such as Ohmic contact formation remains challenging due to high-dimensional parameter spaces and limited experimental data. While classical machine learning (CML) approaches have been successful in many domains, their performance degrades in small-sample, nonlinear scenarios. In this work, we investigate quantum machine learning (QML) as an alternative, exploiting quantum kernels to capture intricate correlations from compact datasets. Using only 159 experimental GaN HEMT samples, we develop a quantum kernel-aligned regressor (QKAR) combining a shallow Pauli-Z feature map with a trainable quantum kernel alignment (QKA) layer. All models, including seven baseline CML regressors, are evaluated under a unified PCA-based preprocessing pipeline to ensure a fair comparison. QKAR consistently outperforms classical baselines across multiple metrics (MAE, MSE, RMSE), achieving a mean absolute error of 0.338 $\Omega\cdot$mm when validated on experimental data. We further assess noise robustness and generalization through cross-validation and new device fabrication. These findings suggest that carefully constructed QML models could provide predictive advantages in data-constrained semiconductor modeling, offering a foundation for practical deployment on near-term quantum hardware. While challenges remain for both QML and CML, this study demonstrates QML’s potential as a complementary approach in complex process modeling tasks.

Article 663

Title@2025-05-28 (3): A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Title: A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Eine umfassende Umfrage in LLM(-Agent) Full Stack Sicherheit: Daten, Schulung und Bereitstellung

用LLLM(-代理)全堆安全:数据、培训和部署进行的全面调查 2504.15585v3

Authors: Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Junyuan Mao, Yu Wang, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Wenjie Qu, Yue Liu, Chengwei Liu, Yifan Zhang, Qiankun Li, Chongye Guo, Yalan Qin, Zhaoxin Fan, Yi Ding, Donghai Hong, Jiaming Ji, Yingxin Lai, Zitong Yu, Xinfeng Li, Yifan Jiang, Yanhui Li, Xinyu Deng, Junlin Wu, Dongxia Wang, Yihao Huang, Yufei Guo, Jen-tse Huang, Qiufeng Wang, Wenxuan Wang, Dongrui Liu, Yanwei Yue, Wenke Huang, Guancheng Wan, Heng Chang, Tianlin Li, Yi Yu, Chenghao Li, Jiawei Li, Lei Bai, Jie Zhang, Qing Guo, Jingyi Wang, Tianlong Chen, Joey Tianyi Zhou, Xiaojun Jia, Weisong Sun, Cong Wu, Jing Chen, Xuming Hu, Yiming Li, Xiao Wang, Ningyu Zhang, Luu Anh Tuan, Guowen Xu, Jiaheng Zhang, Tianwei Zhang, Xingjun Ma, Jindong Gu, Xiang Wang, Bo An, Jun Sun, Mohit Bansal, Shirui Pan, Lingjuan Lyu, Yuval Elovici, Bhavya Kailkhura, Yaodong Yang, Hongwei Li, Wenyuan Xu, Yizhou Sun, Wei Wang, Qing Li, Ke Tang, Yu-Gang Jiang, Felix Juefei-Xu, Hui Xiong, Xiaofeng Wang, Dacheng Tao, Philip S. Yu, Qingsong Wen, Yang Liu

The remarkable success of Large Language Models (LLMs) has illuminated a promising pathway toward achieving Artificial General Intelligence for both academic and industrial communities, owing to their unprecedented performance across various applications. As LLMs continue to gain prominence in both research and commercial domains, their security and safety implications have become a growing concern, not only for researchers and corporations but also for every nation. Currently, existing surveys on LLM safety primarily focus on specific stages of the LLM lifecycle, e.g., deployment phase or fine-tuning phase, lacking a comprehensive understanding of the entire “lifechain” of LLMs. To address this gap, this paper introduces, for the first time, the concept of “full-stack” safety to systematically consider safety issues throughout the entire process of LLM training, deployment, and eventual commercialization. Compared to the off-the-shelf LLM safety surveys, our work demonstrates several distinctive advantages: (I) Comprehensive Perspective. We define the complete LLM lifecycle as encompassing data preparation, pre-training, post-training, deployment and final commercialization. To our knowledge, this represents the first safety survey to encompass the entire lifecycle of LLMs. (II) Extensive Literature Support. Our research is grounded in an exhaustive review of over 800 papers, ensuring comprehensive coverage and systematic organization of security issues within a more holistic understanding. (III) Unique Insights. Through systematic literature analysis, we have developed reliable roadmaps and perspectives for each chapter. Our work identifies promising research directions, including safety in data generation, alignment techniques, model editing, and LLM-based agent systems. These insights provide valuable guidance for researchers pursuing future work in this field.

Article 664

Title@2025-05-28 (3): ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation

Title: ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation

ORIGEN: Zero-Shot 3D-Orientierungsgrundierung in Text-zu-Bild-Generierung

将零热3D定向定位作为产生文字到图像的基础 2503.22194v2

Authors: Yunhong Min, Daehyeon Choi, Kyeongmin Yeo, Jihyun Lee, Minhyuk Sung

We introduce ORIGEN, the first zero-shot method for 3D orientation grounding in text-to-image generation across multiple objects and diverse categories. While previous work on spatial grounding in image generation has mainly focused on 2D positioning, it lacks control over 3D orientation. To address this, we propose a reward-guided sampling approach using a pretrained discriminative model for 3D orientation estimation and a one-step text-to-image generative flow model. While gradient-ascent-based optimization is a natural choice for reward-based guidance, it struggles to maintain image realism. Instead, we adopt a sampling-based approach using Langevin dynamics, which extends gradient ascent by simply injecting random noise, requiring just a single additional line of code. Additionally, we introduce adaptive time rescaling based on the reward function to accelerate convergence. Our experiments show that ORIGEN outperforms both training-based and test-time guidance methods across quantitative metrics and user studies.
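
The relationship between gradient ascent and Langevin dynamics mentioned above can be sketched on a toy problem. This is a generic unadjusted Langevin update, not ORIGEN itself: the quadratic reward, the step size, and the names `guided_step` and `run` are assumptions for illustration, and the adaptive time rescaling from the abstract is omitted.

```python
import numpy as np

def guided_step(x, grad_reward, step, rng=None):
    """One update of reward-guided sampling.

    With rng=None this is plain gradient ascent; passing a generator adds the
    Gaussian noise term that turns it into an (unadjusted) Langevin step.
    """
    x = x + step * grad_reward(x)  # gradient ascent on the reward
    if rng is not None:
        x = x + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)  # the extra line
    return x

# Toy reward: log-density of a unit Gaussian centred at mu (an assumption).
mu = np.array([1.0, -2.0])
grad_reward = lambda x: -(x - mu)

def run(n_steps, rng=None):
    x = np.zeros((500, 2))  # 500 independent chains started at the origin
    for _ in range(n_steps):
        x = guided_step(x, grad_reward, 0.01, rng)
    return x

ascent = run(1000)
langevin = run(1000, np.random.default_rng(0))
```

Gradient ascent drives every chain to the same maximiser, whereas the Langevin chains spread out around it and approximately sample the distribution induced by the reward, which is the property the abstract relies on to preserve realism.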

Article 665

Title@2025-05-28 (3): Reinforced Reasoning for Embodied Planning

Title: Reinforced Reasoning for Embodied Planning

Verstärkte Begründung für die körperbetonte Planung

强化规划强化理由 2505.22050v1

Authors: Di Wu, Jiaxin Fan, Junzhe Zang, Guanbo Wang, Wei Yin, Wenhao Li, Bo Jin

Embodied planning requires agents to make coherent multi-step decisions based on dynamic visual observations and natural language goals. While recent vision-language models (VLMs) excel at static perception tasks, they struggle with the temporal reasoning, spatial understanding, and commonsense grounding needed for planning in interactive environments. In this work, we introduce a reinforcement fine-tuning framework that brings R1-style reasoning enhancement into embodied planning. We first distill a high-quality dataset from a powerful closed-source model and perform supervised fine-tuning (SFT) to equip the model with structured decision-making priors. We then design a rule-based reward function tailored to multi-step action quality and optimize the policy via Generalized Reinforced Preference Optimization (GRPO). Our approach is evaluated on Embench, a recent benchmark for interactive embodied tasks, covering both in-domain and out-of-domain scenarios. Experimental results show that our method significantly outperforms models of similar or larger scale, including GPT-4o-mini and 70B+ open-source baselines, and exhibits strong generalization to unseen environments. This work highlights the potential of reinforcement-driven reasoning to advance long-horizon planning in embodied AI.

Article 666

Title@2025-05-28 (3): Differentiable Generalized Sliced Wasserstein Plans

Title: Differentiable Generalized Sliced Wasserstein Plans

Unterschiedliche generalisierte Wasserstein-Pläne

刀切瓦西斯坦计划 2505.22049v1

Authors: Laetitia Chapel, Romain Tavenard, Samuel Vaiter

Optimal Transport (OT) has attracted significant interest in the machine learning community, not only for its ability to define meaningful distances between probability distributions, such as the Wasserstein distance, but also for its formulation of OT plans. Its computational complexity remains a bottleneck, though, and slicing techniques have been developed to scale OT to large datasets. Recently, a novel slicing scheme, dubbed min-SWGG, lifts a single one-dimensional plan back to the original multidimensional space, finally selecting the slice that yields the lowest Wasserstein distance as an approximation of the full OT plan. Despite its computational and theoretical advantages, min-SWGG inherits typical limitations of slicing methods: (i) the number of required slices grows exponentially with the data dimension, and (ii) it is constrained to linear projections. Here, we reformulate min-SWGG as a bilevel optimization problem and propose a differentiable approximation scheme to efficiently identify the optimal slice, even in high-dimensional settings. We furthermore define its generalized extension to accommodate data living on manifolds. Finally, we demonstrate the practical value of our approach in various applications, including gradient flows on manifolds and high-dimensional spaces, as well as a novel sliced OT-based conditional flow matching for image generation, where fast computation of transport plans is essential.
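
The slicing idea described above (solve a one-dimensional problem by sorting, lift the matching back to the original space, keep the best slice) can be sketched as follows. This is a minimal illustration, assuming two point clouds of equal size with uniform weights; the function names are hypothetical, and the random search over directions stands in for the differentiable bilevel scheme proposed in the paper, which is not reproduced here.

```python
import math
import random

def project(points, theta):
    # Inner product of each point with the direction theta.
    return [sum(a * b for a, b in zip(p, theta)) for p in points]

def lifted_cost(xs, ys, theta):
    # Sort both point sets along theta; matching the i-th smallest projections
    # is the optimal 1D plan for equal-size, uniformly weighted sets.
    px, py = project(xs, theta), project(ys, theta)
    ix = sorted(range(len(xs)), key=px.__getitem__)
    iy = sorted(range(len(ys)), key=py.__getitem__)
    # Lift the 1D matching back to the original space and score it there.
    total = sum(sum((a - b) ** 2 for a, b in zip(xs[i], ys[j]))
                for i, j in zip(ix, iy))
    return total / len(xs)

def random_direction(dim, rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def min_over_slices(xs, ys, n_slices=50, seed=0):
    # Random search over directions (an assumption made for this sketch).
    rng = random.Random(seed)
    dim = len(xs[0])
    return min(lifted_cost(xs, ys, random_direction(dim, rng))
               for _ in range(n_slices))
```

Because every lifted matching is a feasible transport plan, each `lifted_cost` value is an upper bound on the squared 2-Wasserstein distance, and taking the minimum over slices tightens that bound.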

Article 667

Title@2025-05-28 (3): Learning Curves of Stochastic Gradient Descent in Kernel Regression

Title: Learning Curves of Stochastic Gradient Descent in Kernel Regression

Lernkurven des stochastischen Gradienten Abstiegs in Kernel-Regression

内核倒退中尾部渐变源的学习曲线 2505.22048v1

Authors: Haihan Zhang, Weicheng Lin, Yuanshi Liu, Cong Fang

This paper considers a canonical problem in kernel regression: how well does a model perform when it is trained by popular online first-order algorithms, compared with offline methods such as ridge and ridgeless regression? We analyze the foundational single-pass Stochastic Gradient Descent (SGD) in kernel regression under a source condition where the optimal predictor may not even belong to the RKHS, i.e., the model is misspecified. Specifically, we focus on the inner product kernel over the sphere and characterize the exact orders of the excess risk curves under different scalings of the sample size $n$ relative to the input dimension $d$. Surprisingly, we show that SGD achieves min-max optimal rates up to constants across all scales, without suffering from saturation, a prevalent phenomenon observed in (ridge) regression, except when the model is highly misspecified and the learning is in a final stage where $n\gg d^{\gamma}$ with any constant $\gamma >0$. The main reason SGD overcomes the curse of saturation is the exponentially decaying step size schedule, a common practice in deep neural network training. As a byproduct, we provide the first provable advantage of this scheme over the iterative averaging method in the common setting.
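
To make "single-pass SGD with an exponentially decaying step size" concrete, here is a small sketch. Several things are assumptions for illustration only: the stage layout (halving the step about log2(n) times over n samples), the function names, and the use of a plain linear model in place of the kernel predictor analyzed in the paper.

```python
import math
import random

def exp_decay_step(eta0, t, n):
    # Halve the step size once per stage; about log2(n) stages over n samples.
    # This particular stage layout is an assumption for illustration.
    n_stages = max(1, int(math.log2(n)))
    stage_len = max(1, n // n_stages)
    return eta0 / (2 ** (t // stage_len))

def single_pass_sgd(samples, dim, eta0):
    # One pass over the data: each (x, y) pair is used exactly once.
    w = [0.0] * dim
    n = len(samples)
    for t, (x, y) in enumerate(samples):
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        eta = exp_decay_step(eta0, t, n)
        w = [wi - eta * residual * xi for wi, xi in zip(w, x)]
    return w

def make_data(n, w_true, noise, seed=0):
    # Synthetic least-squares data with Gaussian inputs and Gaussian noise.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w_true]
        y = sum(wi * xi for wi, xi in zip(w_true, x)) + noise * rng.gauss(0.0, 1.0)
        data.append((x, y))
    return data
```

The last iterate is returned directly, with no averaging of iterates, which mirrors the contrast the abstract draws between the decaying schedule and iterative averaging.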

Article 668

Title@2025-05-28 (3): Learning to Steer Learners in Games

Title: Learning to Steer Learners in Games

Lernen zu Steer Learners in Spielen

在运动会中学习向运动会中的稳坐学生学习 2502.20770v2

Authors: Yizhou Zhang, Yi-An Ma, Eric Mazumdar

We consider the problem of learning to exploit learning algorithms through repeated interactions in games. Specifically, we focus on the case of repeated two-player, finite-action games, in which an optimizer aims to steer a no-regret learner to a Stackelberg equilibrium without knowledge of its payoffs. We first show that this is impossible if the optimizer only knows that the learner is using an algorithm from the general class of no-regret algorithms. This suggests that the optimizer requires more information about the learner’s objectives or algorithm to successfully exploit them. Building on this intuition, we reduce the problem for the optimizer to that of recovering the learner’s payoff structure. We demonstrate the effectiveness of this approach when the learner’s algorithm is drawn from a smaller class by analyzing two examples: one where the learner uses an ascent algorithm, and another where the learner uses stochastic mirror ascent with known regularizer and step sizes.

Article 669

Title@2025-05-28 (3): PUATE: Efficient Average Treatment Effect Estimation from Treated (Positive) and Unlabeled Units

Title: PUATE: Efficient Average Treatment Effect Estimation from Treated (Positive) and Unlabeled Units

PUATE: Effiziente Schätzung des durchschnittlichen Behandlungseffekts aus behandelten (Positiven) und nicht gekennzeichneten Einheiten

PUATE: 高效平均处理效果估算处理(积极)单位和无标签单位的高效平均处理效果 2501.19345v2

Authors: Masahiro Kato, Fumiaki Kozai, Ryo Inokuchi

The estimation of average treatment effects (ATEs), defined as the difference in expected outcomes between treatment and control groups, is a central topic in causal inference. This study develops semiparametric efficient estimators for ATE in a setting where only a treatment group and an unlabeled group, consisting of units whose treatment status is unknown, are observed. This scenario constitutes a variant of learning from positive and unlabeled data (PU learning) and can be viewed as a special case of ATE estimation with missing data. For this setting, we derive the semiparametric efficiency bounds, which characterize the lowest achievable asymptotic variance for regular estimators. We then construct semiparametric efficient ATE estimators that attain these bounds. Our results contribute to the literature on causal inference with missing data and weakly supervised learning.
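
For reference, the estimand described in words above can be written in the usual potential-outcomes notation. The symbols $Y(1)$ and $Y(0)$ (outcomes under treatment and under control) are standard notation assumed here, not taken from the abstract.

```latex
\tau_{\mathrm{ATE}} \;=\; \mathbb{E}\bigl[Y(1)\bigr] - \mathbb{E}\bigl[Y(0)\bigr]
\;=\; \mathbb{E}\bigl[Y(1) - Y(0)\bigr]
```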

Article 670

Title@2025-05-28 (3): MultiScale Contextual Bandits for Long Term Objectives

Title: MultiScale Contextual Bandits for Long Term Objectives

MultiScale Contextual Bandits für langfristige Ziele

长期目标多层次背景影响 2503.17674v2

Authors: Richa Rastogi, Yuta Saito, Thorsten Joachims

The feedback that AI systems (e.g., recommender systems, chatbots) collect from user interactions is a crucial source of training data. While short-term feedback (e.g., clicks, engagement) is widely used for training, there is ample evidence that optimizing short-term feedback does not necessarily achieve the desired long-term objectives. Unfortunately, directly optimizing for long-term objectives is challenging, and we identify the disconnect in the timescales of short-term interventions (e.g., rankings) and the long-term feedback (e.g., user retention) as one of the key obstacles. To overcome this disconnect, we introduce the framework of MultiScale Policy Learning, which reconciles, in a context-dependent way, the need for AI systems to act and optimize feedback at multiple interdependent timescales. Following a PAC-Bayes motivation, we show how the lower timescales with more plentiful data can provide a data-dependent hierarchical prior for faster learning at higher scales, where data is scarcer. As a result, the policies at all levels effectively optimize for the long term. We instantiate the framework with MultiScale Off-Policy Bandit Learning (MSBL) and demonstrate its effectiveness on three tasks relating to recommender and conversational systems.

Article 671

Title@2025-05-28 (3): Latent Mamba Operator for Partial Differential Equations

Title: Latent Mamba Operator for Partial Differential Equations

Latent Mamba Operator für partielle Differentialgleichungen

部分差异方程的中端 Mamba 运算符 2505.19105v2

Authors: Karn Tiwari, Niladri Dutta, N M Anoop Krishnan, Prathosh A P

Neural operators have emerged as powerful data-driven frameworks for solving Partial Differential Equations (PDEs), offering significant speedups over numerical methods. However, existing neural operators struggle with scalability in high-dimensional spaces, incur high computational costs, and face challenges in capturing continuous and long-range dependencies in PDE dynamics. To address these limitations, we introduce the Latent Mamba Operator (LaMO), which integrates the efficiency of state-space models (SSMs) in latent space with the expressive power of kernel integral formulations in neural operators. We also establish a theoretical connection between SSMs and the kernel integral of neural operators. In extensive experiments across diverse PDE benchmarks on regular grids, structured meshes, and point clouds covering solid and fluid physics datasets, LaMO achieves consistent state-of-the-art (SOTA) performance, with a 32.3% improvement over existing baselines in solution operator approximation, highlighting its efficacy in modeling complex PDE solutions.

Article 672

Title@2025-05-28 (3): Estimating the Effects of Sample Training Orders for Large Language Models without Retraining

Title: Estimating the Effects of Sample Training Orders for Large Language Models without Retraining

Bewertung der Auswirkungen von Mustertrainingsaufträgen für große Sprachmodelle ohne Umschulung

估计无再培训的大语言模式抽样培训令的影响 2505.22042v1

Authors: Hao Yang, Haoxuan Li, Mengyue Yang, Xu Chen, Mingming Gong

The order of training samples plays a crucial role in large language models (LLMs), significantly impacting both their external performance and internal learning dynamics. Traditional methods for investigating this effect generally require retraining the model with various sample orders, which is computationally infeasible for LLMs. In this work, we improve on traditional methods by designing a retraining-free framework. By approximating Adam optimizer updates with first- and second-order Taylor expansions and utilizing random projection methods to store intermediate checkpoints, our framework can efficiently estimate model parameters for arbitrary training sample orders. Next, we apply our framework to two downstream research problems: (1) Training curriculum design for LLMs: we build on our retraining-free framework to propose a novel curriculum learning strategy that augments curriculum proposals with estimated model performances, enabling more informed sample scheduling. (2) LLMs’ memorization and generalization effect analysis: we use our retraining-free framework to estimate how the positions of training samples influence LLMs’ capacity for memorization and generalization. We conduct extensive experiments to validate the effectiveness of our retraining-free framework in reproducing the true model performances, and further demonstrate its potential in optimizing LLM training curricula and analyzing the memorization and generalization effects of LLMs.

Article 673

Title@2025-05-28 (3): Detecting Undesired Process Behavior by Means of Retrieval Augmented Generation

Title: Detecting Undesired Process Behavior by Means of Retrieval Augmented Generation

Erkennung von unerwünschtem Prozessverhalten mittels retrievaler Augmented Generation

通过回收增加一代的手段检测不想要的流程行为 2505.22041v1

Authors: Michael Grohs, Adrian Rebmann, Jana-Rebecca Rehse

Conformance checking techniques detect undesired process behavior by comparing process executions that are recorded in event logs to desired behavior that is captured in a dedicated process model. If such models are not available, conformance checking techniques are not applicable, but organizations might still be interested in detecting undesired behavior in their processes. To enable this, existing approaches use Large Language Models (LLMs), assuming that they can learn to distinguish desired from undesired behavior through fine-tuning. However, fine-tuning is highly resource-intensive and the fine-tuned LLMs often do not generalize well. To address these limitations, we propose an approach that requires neither a dedicated process model nor resource-intensive fine-tuning to detect undesired process behavior. Instead, we use Retrieval Augmented Generation (RAG) to provide an LLM with direct access to a knowledge base that contains both desired and undesired process behavior from other processes, assuming that the LLM can transfer this knowledge to the process at hand. Our evaluation shows that our approach outperforms fine-tuned LLMs in detecting undesired behavior, demonstrating that RAG is a viable alternative to resource-intensive fine-tuning, particularly when enriched with relevant context from the event log, such as frequent traces and activities.

Article 674

Title@2025-05-28 (3): Revisiting In-Context Learning with Long Context Language Models

Title: Revisiting In-Context Learning with Long Context Language Models

Das In-Context-Lernen mit langen Kontext-Sprachmodellen

以长方语言模式重新研究内文学习 2412.16926v3

Authors: Jinheon Baek, Sun Jae Lee, Prakhar Gupta, Geunseob Oh, Siddharth Dalmia, Prateek Kolhar

In-Context Learning (ICL) is a technique by which language models make predictions based on examples provided in their input context. Previously, their context window size imposed a limit on the number of examples that can be shown, making example selection techniques crucial for identifying the maximally effective set of examples. However, the recent advent of Long Context Language Models (LCLMs) has significantly increased the number of examples that can be included in context, raising an important question of whether ICL performance in a many-shot regime is still sensitive to the method of sample selection. To answer this, we revisit these approaches in the context of LCLMs through extensive experiments on 18 datasets spanning 4 tasks. Surprisingly, we observe that sophisticated example selection techniques do not yield significant improvements over a simple random sample selection method. Instead, we discover that the advent of LCLMs has fundamentally shifted the challenge of ICL from that of selecting the most effective examples to that of collecting sufficient examples to fill the context window. Specifically, in certain datasets, including all available examples does not fully utilize the context window; however, by augmenting the examples in context with a simple data augmentation approach, we substantially improve ICL performance by 5%.
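
The "simple random sample selection" baseline, combined with the goal of filling the context window, can be sketched as below. This is a hypothetical illustration: the prompt template, the function name, and the use of whitespace word counts in place of a real tokenizer are all assumptions, and the data augmentation step mentioned in the abstract is not shown.

```python
import random

def build_many_shot_prompt(pool, query, token_budget, seed=0):
    # Shuffle the pool and add demonstrations until the budget is exhausted.
    # Whitespace word counts stand in for a real tokenizer (an assumption).
    rng = random.Random(seed)
    candidates = pool[:]
    rng.shuffle(candidates)
    query_text = f"Input: {query}\nOutput:"
    used = len(query_text.split())
    shots = []
    for inp, out in candidates:
        shot = f"Input: {inp}\nOutput: {out}"
        cost = len(shot.split())
        if used + cost > token_budget:
            break
        shots.append(shot)
        used += cost
    # Demonstrations first, the unanswered query last.
    return "\n\n".join(shots + [query_text]), len(shots)
```

If the loop ends because the pool is exhausted rather than because the budget is reached, the context window is underused, which is the situation the abstract addresses with augmentation.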

Article 675

Title@2025-05-28 (3): Weakly-Supervised Contrastive Learning for Imprecise Class Labels

Title: Weakly-Supervised Contrastive Learning for Imprecise Class Labels

Schwachüberwachtes Kontrastives Lernen für ungenaue Klassen-Etiketten

简便类标签的微弱监督反竞争学习 2505.22028v1

Authors: Zi-Hao Zhou, Jun-Jie Wang, Tong Wei, Min-Ling Zhang

Contrastive learning has achieved remarkable success in learning effective representations, with supervised contrastive learning often outperforming self-supervised approaches. However, in real-world scenarios, data annotations are often ambiguous or inaccurate, meaning that class labels may not reliably indicate whether two examples belong to the same class. This limitation restricts the applicability of supervised contrastive learning. To address this challenge, we introduce the concept of “continuous semantic similarity” to define positive and negative pairs. Instead of directly relying on imprecise class labels, we measure the semantic similarity between example pairs, which quantifies how closely they belong to the same category by iteratively refining weak supervisory signals. Based on this concept, we propose a graph-theoretic framework for weakly-supervised contrastive learning, where semantic similarity serves as the graph weights. Our framework is highly versatile and can be applied to many weakly-supervised learning scenarios. We demonstrate its effectiveness through experiments in two common settings, i.e., noisy label and partial label learning, where existing methods can be easily integrated to significantly improve performance. Theoretically, we establish an error bound for our approach, showing that it can approximate supervised contrastive learning under mild conditions. The implementation code is available at https://github.com/Speechless-10308/WSC.

Article 676

Title@2025-05-28 (3): Evaluation of the impact of expert knowledge: How decision support scores impact the effectiveness of automatic knowledge-driven feature engineering (aKDFE)

Title: Evaluation of the impact of expert knowledge: How decision support scores impact the effectiveness of automatic knowledge-driven feature engineering (aKDFE)

Bewertung der Auswirkungen von Expertenwissen: Wie die Entscheidungsunterstützung die Wirksamkeit des automatischen wissensbasierten Feature Engineerings beeinflusst (aKDFE)

评价专家知识的影响:决策支持的评分如何影响知识驱动的自动知识特性工程(KDFE)的有效性 2504.05928v2

Authors: Olof Björneld, Tora Hammar, Daniel Nilsson, Alisa Lincke, Welf Löwe

Adverse Drug Events (ADEs), harmful medication effects, pose significant healthcare challenges, impacting patient safety and costs. This study evaluates automatic Knowledge-Driven Feature Engineering (aKDFE) for improved ADE prediction from Electronic Health Record (EHR) data, comparing it with automated event-based Knowledge Discovery in Databases (KDD). We investigated how incorporating domain-specific ADE risk scores for prolonged heart QT interval, extracted from the Janusmed Riskprofile (Janusmed) Clinical Decision Support System (CDSS), affects prediction performance using EHR data and medication handling events. Results indicate that, while aKDFE step 1 (event-based feature generation) alone did not significantly improve ADE prediction performance, aKDFE step 2 (patient-centric transformation) enhances the prediction performance. High Area Under the Receiver Operating Characteristic curve (AUROC) values suggest strong feature correlations to the outcome, aligning with the predictive power of patients’ prior healthcare history for ADEs. Statistical analysis did not confirm that incorporating the Janusmed information (i) risk scores and (ii) medication route of administration into the model’s feature set enhanced predictive performance. However, the patient-centric transformation applied by aKDFE proved to be a highly effective feature engineering approach. Limitations include a single-project focus, potential bias from machine learning pipeline methods, and reliance on AUROC. In conclusion, aKDFE, particularly with patient-centric transformation, improves ADE prediction from EHR data. Future work will explore attention-based models, event feature sequences, and automatic methods for incorporating domain knowledge into the aKDFE framework.

Article 677

Title@2025-05-28 (3): Efficient Online Reinforcement Learning for Diffusion Policy

Title: Efficient Online Reinforcement Learning for Diffusion Policy

Effizientes Online-Verstärkungslernen für die Diffusionspolitik

高效在线强化学习促进传播政策 2502.00361v3

Authors: Haitong Ma, Tianyi Chen, Kai Wang, Na Li, Bo Dai

Diffusion policies have achieved superior performance in imitation learning and offline reinforcement learning (RL) due to their rich expressiveness. However, the conventional diffusion training procedure requires samples from the target distribution, which is impossible in online RL since we cannot sample from the optimal policy. Backpropagating the policy gradient through the diffusion process incurs huge computational costs and instability, making it expensive and not scalable. To enable efficient training of diffusion policies in online RL, we generalize the conventional denoising score matching by reweighting the loss function. The resulting Reweighted Score Matching (RSM) preserves the optimal solution and low computational cost of denoising score matching, while eliminating the need to sample from the target distribution and allowing learning to optimize value functions. We introduce two tractable reweighted loss functions to solve two commonly used policy optimization problems, policy mirror descent and max-entropy policy, resulting in two practical algorithms named Diffusion Policy Mirror Descent (DPMD) and Soft Diffusion Actor-Critic (SDAC). We conducted comprehensive comparisons on MuJoCo benchmarks. The empirical results show that the proposed algorithms outperform recent online RL methods for diffusion policies on most tasks, and that DPMD improves by more than 120% over soft actor-critic on Humanoid and Ant.

Article 678

Title@2025-05-28 (3): Model Diffusion for Certifiable Few-shot Transfer Learning

Title: Model Diffusion for Certifiable Few-shot Transfer Learning

Modell-Diffusion für zertifizierbares Transfer-Lernen mit wenigen Fotos

可核证的 “ 几光 “ 转让学习模型传播 2502.06970v2

Authors: Fady Rezk, Royson Lee, Henry Gouk, Timothy Hospedales, Minyoung Kim

In contemporary deep learning, a prevalent and effective workflow for solving low-data problems is adapting powerful pre-trained foundation models (FMs) to new tasks via parameter-efficient fine-tuning (PEFT). However, while empirically effective, the resulting solutions lack generalisation guarantees to certify their accuracy - which may be required for ethical or legal reasons prior to deployment in high-importance applications. In this paper we develop a novel transfer learning approach that is designed to facilitate non-vacuous learning theoretic generalisation guarantees for downstream tasks, even in the low-shot regime. Specifically, we first use upstream tasks to train a distribution over PEFT parameters. We then learn the downstream task by a sample-and-evaluate procedure – sampling plausible PEFTs from the trained diffusion model and selecting the one with the highest likelihood on the downstream data. Crucially, this confines our model hypothesis to a finite set of PEFT samples. In contrast to the typical continuous hypothesis spaces of neural network weights, this facilitates tighter risk certificates. We instantiate our bound and show non-trivial generalization guarantees compared to existing learning approaches which lead to vacuous bounds in the low-shot regime.
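
Why a finite set of sampled hypotheses helps can be seen from the textbook Hoeffding-plus-union bound for a finite class. The sketch below is an assumption-laden illustration, not the paper's certificate: it uses a loss bounded in [0, 1] and selects by lowest empirical loss rather than highest likelihood, and the function names are hypothetical.

```python
import math

def finite_class_bound(emp_risk, n, num_hypotheses, delta):
    # Hoeffding plus a union bound over a finite class, for losses in [0, 1]:
    # holds simultaneously for all hypotheses with probability >= 1 - delta.
    slack = math.sqrt((math.log(num_hypotheses) + math.log(1.0 / delta)) / (2.0 * n))
    return emp_risk + slack

def sample_and_evaluate(candidates, loss_fn, data, delta=0.05):
    # candidates: finite list of sampled hypotheses (e.g. PEFT parameter draws),
    # assumed to have been drawn without looking at the downstream data.
    risks = [sum(loss_fn(h, z) for z in data) / len(data) for h in candidates]
    best = min(range(len(candidates)), key=risks.__getitem__)
    bound = finite_class_bound(risks[best], len(data), len(candidates), delta)
    return candidates[best], bound
```

The slack grows only with the logarithm of the number of candidates, so even with few downstream examples the bound can stay below 1, provided the candidates were fixed before seeing those examples.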

Article 679

Title@2025-05-28 (3): Learning in Compact Spaces with Approximately Normalized Transformers

Title: Learning in Compact Spaces with Approximately Normalized Transformers

Lernen in kompakten Räumen mit etwa normalisierten Transformatoren

学习与大约正常化变异器的紧凑空间的学习 2505.22014v1

Authors: Jörg K. H. Franke, Urs Spiegelhalter, Marianna Nezhurina, Jenia Jitsev, Frank Hutter, Michael Hefenbrock

In deep learning, regularization and normalization are common solutions for challenges such as overfitting, numerical instabilities, and the increasing variance in the residual stream. An alternative approach is to force all parameters and representations to lie on a hypersphere. This removes the need for regularization and increases convergence speed, but comes with additional costs. In this work, we propose a more holistic but approximate normalization (anTransformer). Our approach constrains the norm of parameters and normalizes all representations via scalar multiplications motivated by the tight concentration of the norms of high-dimensional random vectors. When applied to GPT training, we observe 40% faster convergence compared to models with QK normalization, with less than 3% additional runtime. Deriving scaling laws for anGPT, we find that our method enables training with larger batch sizes and fewer hyperparameters, while matching the favorable scaling characteristics of classic GPT architectures.
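
The concentration effect that motivates replacing exact normalization with a fixed scalar can be checked numerically. This is a sketch under stated assumptions: standard Gaussian vectors, a fixed factor of 1/sqrt(dim), and hypothetical function names; the actual scaling factors used in anTransformer are not given in the abstract.

```python
import math
import random

def norm_ratio_stats(dim, n_vectors=20, seed=0):
    # Returns (min, max) of ||x|| / sqrt(dim) over standard Gaussian vectors.
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_vectors):
        sq = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
        ratios.append(math.sqrt(sq / dim))
    return min(ratios), max(ratios)

def approx_normalize(x):
    # Fixed scalar multiplication by 1/sqrt(dim) in place of an exact
    # division by the vector's own norm.
    scale = 1.0 / math.sqrt(len(x))
    return [scale * xi for xi in x]
```

For large `dim` the ratios returned by `norm_ratio_stats` cluster tightly around 1, so `approx_normalize` yields vectors of nearly unit norm without computing a per-vector norm; for small `dim` the spread is much wider and the approximation is poor.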

Article 680

Title@2025-05-28 (3): SageAttention2++: A More Efficient Implementation of SageAttention2

Title: SageAttention2++: A More Efficient Implementation of SageAttention2

SageAttention2++: Effizientere Umsetzung von SageAttention2

SageAttention2++:更有效地实施SageAttention2 2505.21136v2

Authors: Jintao Zhang, Xiaoming Xu, Jia Wei, Haofeng Huang, Pengle Zhang, Chendong Xiang, Jun Zhu, Jianfei Chen

The efficiency of attention is critical because its time complexity grows quadratically with sequence length. SageAttention2 addresses this by utilizing quantization to accelerate matrix multiplications (Matmul) in attention. To further accelerate SageAttention2, we propose to utilize the faster instruction of FP8 Matmul accumulated in FP16. The instruction is 2x faster than the FP8 Matmul used in SageAttention2. Our experiments show that SageAttention2++ achieves a 3.9x speedup over FlashAttention while maintaining the same attention accuracy as SageAttention2. This means SageAttention2++ effectively accelerates various models, including those for language, image, and video generation, with negligible end-to-end metrics loss. The code will be available at https://github.com/thu-ml/SageAttention.

Article 681

Title@2025-05-28 (3): A Comprehensive Real-World Assessment of Audio Watermarking Algorithms: Will They Survive Neural Codecs?

Title: A Comprehensive Real-World Assessment of Audio Watermarking Algorithms: Will They Survive Neural Codecs?

Eine umfassende Real-World Bewertung von Audio Watermarking Algorithmen: Werden sie überleben Neural Codecs?

对音频水标定法的全面现实世界评估:它们能否生存神经规范? 2505.19663v2

Authors: Yigitcan Özer, Woosung Choi, Joan Serrà, Mayank Kumar Singh, Wei-Hsiang Liao, Yuki Mitsufuji

We introduce the Robust Audio Watermarking Benchmark (RAW-Bench), a benchmark for evaluating deep learning-based audio watermarking methods with standardized and systematic comparisons. To simulate real-world usage, we introduce a comprehensive audio attack pipeline with various distortions such as compression, background noise, and reverberation, along with a diverse test dataset including speech, environmental sounds, and music recordings. Evaluating four existing watermarking methods on RAW-Bench reveals two main insights: (i) neural compression techniques pose the most significant challenge, even when algorithms are trained with such compressions; and (ii) training with audio attacks generally improves robustness, although it is insufficient in some cases. Furthermore, we find that specific distortions, such as polarity inversion, time stretching, or reverb, seriously affect certain methods. The evaluation framework is accessible at github.com/SonyResearch/raw_bench.

Article 682

Title@2025-05-28 (3): Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains

Title: Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains

Domaino1s: Leitende LLM-Gründung für erklärbare Antworten in High-Stakes-Domains

域1:在高占用域中解释可解答案的指导性LLM 2501.14431v2

Authors: Xu Chu, Zhijie Tan, Hanlin Xue, Guanyu Wang, Tong Mo, Weiping Li

Large Language Models (LLMs) are widely applied to downstream domains. However, current LLMs for high-stakes domain tasks, such as financial investment and legal QA, typically generate brief answers without reasoning processes and explanations. This limits users’ confidence in making decisions based on their responses. While original CoT shows promise, it lacks self-correction mechanisms during reasoning. This work introduces Domaino1s, which enhances LLMs’ reasoning capabilities on domain tasks through supervised fine-tuning and tree search. We construct CoT-stock-2k and CoT-legal-2k datasets for fine-tuning models that activate domain-specific reasoning steps based on their judgment. Additionally, we propose Selective Tree Exploration to spontaneously explore solution spaces and sample optimal reasoning paths to improve performance. We also introduce PROOF-Score, a new metric for evaluating domain models’ explainability, complementing traditional accuracy metrics with richer assessment dimensions. Extensive experiments on stock investment recommendation and legal reasoning QA tasks demonstrate Domaino1s’s leading performance and explainability. Our code is available at https://github.com/Hyalinesky/Domaino1s.

Article 683

Title@2025-05-28 (3): Align-DA: Align Score-based Atmospheric Data Assimilation with Multiple Preferences

Title: Align-DA: Align Score-based Atmospheric Data Assimilation with Multiple Preferences

Align-DA: Align Score-basierte atmosphärische Daten Assimilation mit mehreren Präferenzen

Aleign-DA: 与多重优惠相仿的一致计分大气数据 2505.22008v1

Authors: Jing-An Sun, Hang Fan, Junchao Gong, Ben Fei, Kun Chen, Fenghua Ling, Wenlong Zhang, Wanghan Xu, Li Yan, Pierre Gentine, Lei Bai

Data assimilation (DA) aims to estimate the full state of a dynamical system by combining partial and noisy observations with a prior model forecast, commonly referred to as the background. In atmospheric applications, this problem is fundamentally ill-posed due to the sparsity of observations relative to the high-dimensional state space. Traditional methods address this challenge by simplifying background priors to regularize the solution, which are empirical and require continual tuning for application. Inspired by alignment techniques in text-to-image diffusion models, we propose Align-DA, which formulates DA as a generative process and uses reward signals to guide background priors, replacing manual tuning with data-driven alignment. Specifically, we train a score-based model in the latent space to approximate the background-conditioned prior, and align it using three complementary reward signals for DA: (1) assimilation accuracy, (2) forecast skill initialized from the assimilated state, and (3) physical adherence of the analysis fields. Experiments with multiple reward signals demonstrate consistent improvements in analysis quality across different evaluation metrics and observation-guidance strategies. These results show that preference alignment, implemented as a soft constraint, can automatically adapt complex background priors tailored to DA, offering a promising new direction for advancing the field.

Article 684

Title@2025-05-28 (3): Generalization Analysis for Supervised Contrastive Representation Learning under Non-IID Settings

Title: Generalization Analysis for Supervised Contrastive Representation Learning under Non-IID Settings

Generalisierungsanalyse für überwachtes Kontrastives Repräsentationslernen unter Nicht-IID-Einstellungen

在非IID设置下受监督的违反代表制学习的通用分析 2505.04937v3

Authors: Nong Minh Hieu, Antoine Ledent

Contrastive Representation Learning (CRL) has achieved impressive success in various domains in recent years. Nevertheless, the theoretical understanding of the generalization behavior of CRL has remained limited. Moreover, to the best of our knowledge, the current literature only analyzes generalization bounds under the assumption that the data tuples used for contrastive learning are independently and identically distributed. However, in practice, we are often limited to a fixed pool of reusable labeled data points, making it inevitable to recycle data across tuples to create sufficiently large datasets. Therefore, the tuple-wise independence condition imposed by previous works is invalidated. In this paper, we provide a generalization analysis for the CRL framework under non-i.i.d. settings that reflects practice more realistically. Drawing inspiration from the literature on U-statistics, we derive generalization bounds which indicate that the required number of samples in each class scales as the logarithm of the covering number of the class of learnable feature representations associated with that class. Next, we apply our main results to derive excess risk bounds for common function classes such as linear maps and neural networks.

Article 685

Title@2025-05-28 (3): Locking-Free Training of Physics-Informed Neural Network for Solving Nearly Incompressible Elasticity Equations

Title: Locking-Free Training of Physics-Informed Neural Network for Solving Nearly Incompressible Elasticity Equations

Locking-Free Training of Physics-informed Neural Network for Solving Fast Incompressible Elasticity Equations

用于解决近不压缩弹性等量的物理内成神经网络的无锁化培训 2505.21994v1

Authors: Josef Dick, Seungchan Ko, Kassem Mustapha, Sanghyeon Park

Due to divergence instability, the accuracy of low-order conforming finite element methods for nearly incompressible homogeneous elasticity equations deteriorates as the Lamé coefficient $\lambda\to\infty$, or equivalently as the Poisson ratio $\nu\to1/2$. This phenomenon, known as locking or non-robustness, remains not fully understood despite extensive investigation. In this paper, we propose a robust method based on a fundamentally different, machine-learning-driven approach. Leveraging recently developed Physics-Informed Neural Networks (PINNs), we address the numerical solution of linear elasticity equations governing nearly incompressible materials. The core idea of our method is to appropriately decompose the given equations to alleviate the extreme imbalance in the coefficients, while simultaneously solving both the forward and inverse problems to recover the solutions of the decomposed systems as well as the associated external conditions. Through various numerical experiments, including constant, variable and parametric Lamé coefficients, we illustrate the efficiency of the proposed methodology.
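
The equivalence between $\lambda\to\infty$ and $\nu\to1/2$ follows from the standard relations of isotropic linear elasticity, recalled here for convenience. Young's modulus $E$, the shear modulus $\mu$, the strain tensor $\varepsilon(u)$, and the body force $f$ are standard symbols assumed for this note; they do not appear in the abstract.

```latex
\lambda = \frac{E\,\nu}{(1+\nu)(1-2\nu)}, \qquad
\mu = \frac{E}{2(1+\nu)}, \qquad
-\nabla\cdot\bigl(2\mu\,\varepsilon(u) + \lambda\,(\nabla\cdot u)\,I\bigr) = f
```

For fixed $E$, the factor $(1-2\nu)$ in the denominator drives $\lambda$ to infinity as $\nu\to1/2$ while $\mu$ stays bounded, which is the coefficient imbalance the abstract refers to.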

Article 686

Title@2025-05-28 (3): Identifying Causal Direction via Variational Bayesian Compression

Title: Identifying Causal Direction via Variational Bayesian Compression

Identifizierung der Kausalrichtung durch variationale Bayesische Kompression

通过变异贝耶斯压缩确定因果方向 2505.07503v3

Authors: Quang-Duy Tran, Bao Duong, Phuoc Nguyen, Thin Nguyen

Telling apart the cause and effect between two random variables with purely observational data is a challenging problem that finds applications in various scientific disciplines. A key principle utilized in this task is the algorithmic Markov condition, which postulates that the joint distribution, when factorized according to the causal direction, yields a more succinct codelength compared to the anti-causal direction. Previous approaches approximate these codelengths by relying on simple functions or Gaussian processes (GPs) with easily evaluable complexity, compromising between model fitness and computational complexity. To overcome these limitations, we propose leveraging the variational Bayesian learning of neural networks as an interpretation of the codelengths. Consequently, we can enhance the model fitness and promote the succinctness of the codelengths while avoiding the significant computational complexity of the GP-based approaches. Extensive experiments on both synthetic and real-world benchmarks in cause-effect identification demonstrate the effectiveness of our proposed method, surpassing the overall performance of related complexity-based and structural causal model regression-based approaches.

Article 687

Title@2025-05-28 (3): ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning

Title: ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning

ACE: Exploring Activation Cosine Ähnlichkeit und Varianz für genaues und kalibrationseffizientes LLM Pruning

ACE: 探索在准确度和校准-有效LLM Pruning 方面活跃共生相近性和差异 2505.21987v1

Authors: Zhendong Mi, Zhenglun Kong, Geng Yuan, Shaoyi Huang

With the rapid expansion of large language models (LLMs), the demand for memory and computational resources has grown significantly. Recent advances in LLM pruning aim to reduce the size and computational cost of these models. However, existing methods often suffer from either suboptimal pruning performance or low time efficiency during the pruning process. In this work, we propose an efficient and effective pruning method that simultaneously achieves high pruning performance and fast pruning speed with improved calibration efficiency. Our approach introduces two key innovations: (1) An activation cosine similarity loss-guided pruning metric, which considers the angular deviation of the output activation between the dense and pruned models. (2) An activation variance-guided pruning metric, which helps preserve semantic distinctions in output activations after pruning, enabling effective pruning with shorter input sequences. These two components can be readily combined to enhance LLM pruning in both accuracy and efficiency. Experimental results show that our method achieves up to an 18% reduction in perplexity and up to 63% decrease in pruning time on prevalent LLMs such as LLaMA, LLaMA-2, and OPT.
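
The first idea in the abstract above, scoring a pruning decision by the angular deviation between dense and pruned output activations, can be illustrated with a minimal sketch. This is not the authors' metric: the toy layer, the magnitude-based `prune_smallest` stand-in, and all names below are assumptions made only for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two activation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def linear(weights, x):
    """Output activation of a bias-free linear layer."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def prune_smallest(weights, k):
    """Zero the k smallest-magnitude weights (a stand-in criterion, not ACE)."""
    ranked = sorted(
        (abs(w), i, j) for i, row in enumerate(weights) for j, w in enumerate(row)
    )
    pruned = [row[:] for row in weights]
    for _, i, j in ranked[:k]:
        pruned[i][j] = 0.0
    return pruned

# Hypothetical 2x3 layer and one calibration input.
W = [[0.9, -0.05, 0.4], [0.02, 1.1, -0.3]]
x = [1.0, 2.0, -1.0]

dense_out = linear(W, x)
pruned_out = linear(prune_smallest(W, 2), x)

# 1 - cos is zero when pruning leaves the output direction unchanged.
angular_loss = 1.0 - cosine_similarity(dense_out, pruned_out)
print(f"angular loss after pruning: {angular_loss:.6f}")
```

A loss near zero means the pruned layer still points its output in almost the same direction as the dense layer, which is the quantity a cosine-similarity-guided criterion would try to preserve.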

Article 688

Title@2025-05-28 (3): Reward-Independent Messaging for Decentralized Multi-Agent Reinforcement Learning

Title: Reward-Independent Messaging for Decentralized Multi-Agent Reinforcement Learning

Reward-independent Messaging für dezentralisiertes Mehr-Agenten-Verstärkungs-Lernen

权力下放多机构加强学习分权式多机构加强学习的回报独立通信 2505.21985v1

Authors: Naoto Yoshida, Tadahiro Taniguchi

In multi-agent reinforcement learning (MARL), effective communication improves agent performance, particularly under partial observability. We propose MARL-CPC, a framework that enables communication among fully decentralized, independent agents without parameter sharing. MARL-CPC incorporates a message learning model based on collective predictive coding (CPC) from emergent communication research. Unlike conventional methods that treat messages as part of the action space and assume cooperation, MARL-CPC links messages to state inference, supporting communication in non-cooperative, reward-independent settings. We introduce two algorithms, Bandit-CPC and IPPO-CPC, and evaluate them in non-cooperative MARL tasks. Benchmarks show that both outperform standard message-as-action approaches, establishing effective communication even when messages offer no direct benefit to the sender. These results highlight MARL-CPC’s potential for enabling coordination in complex, decentralized environments.

Article 689

Title@2025-05-28 (3): How to Synthesize Text Data without Model Collapse?

Title: How to Synthesize Text Data without Model Collapse?

Wie können Sie Textdaten ohne Modellkollaps synthesieren?

如何在没有模式折叠的情况下合成文本数据 ? 2412.14689v3

Authors: Xuekai Zhu, Daixuan Cheng, Hengli Li, Kaiyan Zhang, Ermo Hua, Xingtai Lv, Ning Ding, Zhouhan Lin, Zilong Zheng, Bowen Zhou

Model collapse in synthetic data indicates that iterative training on self-generated data leads to a gradual decline in performance. With the proliferation of AI models, synthetic data will fundamentally reshape the web data ecosystem. Future GPT-$n$ models will inevitably be trained on a blend of synthetic and human-produced data. In this paper, we focus on two questions: what is the impact of synthetic data on language model training, and how can data be synthesized without model collapse? We first pre-train language models across different proportions of synthetic data, revealing a negative correlation between the proportion of synthetic data and model performance. We further conduct statistical analysis on synthetic data to uncover a distributional shift phenomenon and an over-concentration of n-gram features. Inspired by the above findings, we propose token editing on human-produced data to obtain semi-synthetic data. As a proof of concept, we theoretically demonstrate that token-level editing can prevent model collapse, as the test error is constrained by a finite upper bound. We conduct extensive experiments on pre-training from scratch, continual pre-training, and supervised fine-tuning. The results validate our theoretical proof that token-level editing improves model performance.

Article 690

Title@2025-05-28 (3): Latent Weight Diffusion: Generating reactive policies instead of trajectories

Title: Latent Weight Diffusion: Generating reactive policies instead of trajectories

Latent Weight Diffusion: Erzeugen von reaktiven Strategien anstelle von Trajektorien

负负重扩散: 产生反应性政策, 而不是轨迹 2410.14040v2

Authors: Shashank Hegde, Satyajeet Das, Gautam Salhotra, Gaurav S. Sukhatme

With the increasing availability of open-source robotic data, imitation learning has emerged as a viable approach for both robot manipulation and locomotion. Currently, large generalized policies are trained to predict controls or trajectories using diffusion models, which have the desirable property of learning multimodal action distributions. However, generalizability comes with a cost, namely, larger model size and slower inference. This is especially an issue for robotic tasks that require high control frequency. Further, there is a known trade-off between performance and action horizon for Diffusion Policy (DP), a popular model for generating trajectories: fewer diffusion queries accumulate greater trajectory tracking errors. For these reasons, it is common practice to run these models at high inference frequency, subject to robot computational constraints. To address these limitations, we propose Latent Weight Diffusion (LWD), a method that uses diffusion to generate closed-loop policies (weights for neural policies) for robotic tasks, rather than generating trajectories. Learning the behavior distribution through parameter space over trajectory space offers two key advantages: longer action horizons (fewer diffusion queries) and robustness to perturbations while retaining high performance; and a lower inference compute cost. To this end, we show that LWD has higher success rates than DP when the action horizon is longer and when stochastic perturbations exist in the environment. Furthermore, LWD achieves multitask performance comparable to DP while requiring just ~1/45th of the inference-time FLOPS.

Article 691

Title@2025-05-28 (3): Two-Stage Feature Generation with Transformer and Reinforcement Learning

Title: Two-Stage Feature Generation with Transformer and Reinforcement Learning

Zweistufige Feature-Generierung mit Transformer und Verstärkungslernen

具有变换器和强化学习的两阶段特色生成 2505.21978v1

Authors: Wanfu Gao, Zengyao Man, Zebin He, Yuhao Tang, Jun Gao, Kunpeng Liu

Feature generation is a critical step in machine learning, aiming to enhance model performance by capturing complex relationships within the data and generating meaningful new features. Traditional feature generation methods heavily rely on domain expertise and manual intervention, making the process labor-intensive and challenging to adapt to different scenarios. Although automated feature generation techniques address these issues to some extent, they often face challenges such as feature redundancy, inefficiency in feature space exploration, and limited adaptability to diverse datasets and tasks. To address these problems, we propose a Two-Stage Feature Generation (TSFG) framework, which integrates a Transformer-based encoder-decoder architecture with Proximal Policy Optimization (PPO). The encoder-decoder model in TSFG leverages the Transformer’s self-attention mechanism to efficiently represent and transform features, capturing complex dependencies within the data. PPO further enhances TSFG by dynamically adjusting the feature generation strategy based on task-specific feedback, optimizing the process for improved performance and adaptability. TSFG dynamically generates high-quality feature sets, significantly improving the predictive performance of machine learning models. Experimental results demonstrate that TSFG outperforms existing state-of-the-art methods in terms of feature quality and adaptability.

Article 692

Title@2025-05-28 (3): Judging LLMs on a Simplex

Title: Judging LLMs on a Simplex

LLMs auf einem Simplex zu urteilen

以简单方式判断LLMs 2505.21972v1

Authors: Patrick Vossler, Fan Xia, Yifan Mai, Jean Feng

Automated evaluation of free-form outputs from large language models (LLMs) is challenging because many distinct answers can be equally valid. A common practice is to use LLMs themselves as judges, but the theoretical properties of this approach are not yet well understood. We show that a geometric framework that represents both judges and candidates as points on a probability simplex can provide helpful insight on what is or is not identifiable using LLM judges. Our theoretical analysis uncovers a “phase transition” in ranking identifiability: for binary scoring systems, true rankings are identifiable even with weak judges under mild assumptions, while rankings become non-identifiable for three or more scoring levels even with infinite data, absent additional prior knowledge. This non-identifiability highlights how uncertainty in rankings stems from not only aleatoric uncertainty (i.e., inherent stochasticity in the data) but also epistemic uncertainty regarding which assumptions hold, an aspect that has received limited attention until now. To integrate both types of uncertainty, we use Bayesian inference to encode assumptions as priors and conduct sensitivity analysis of ranking estimates and credible intervals. Empirical evaluations across multiple benchmarks demonstrate that Bayesian inference yields more accurate rankings and substantially improves coverage rates. These results underscore the importance of taking a more holistic approach to uncertainty quantification when using LLMs as judges.

Article 693

Title@2025-05-28 (3): Mitigating Heterogeneous Token Overfitting in LLM Knowledge Editing

Title: Mitigating Heterogeneous Token Overfitting in LLM Knowledge Editing

Heterogene Token-Übertragung in LLM-Wissensbearbeitung abmildern

减轻LLLM知识编辑中变异式 Tok 超称 2502.00602v2

Authors: Tianci Liu, Ruirui Li, Zihan Dong, Hui Liu, Xianfeng Tang, Qingyu Yin, Linjun Zhang, Haoyu Wang, Jing Gao

Large language models (LLMs) have achieved remarkable performance on various natural language tasks. However, they are trained on static corpora and their knowledge can become outdated quickly in the fast-changing world. This motivates the development of knowledge editing (KE) to update specific knowledge in LLMs without changing unrelated knowledge or compromising their pre-trained capabilities. Previous efforts sought to update a small number of parameters of an LLM and proved effective for making selective updates. Nonetheless, the edited LLM often exhibits degraded ability to reason about the new knowledge. In this work, we identify a key issue: heterogeneous token overfitting (HTO), where the LLM overfits different tokens in the provided knowledge at varying rates. To tackle this, we propose OVERTONE, a token-level smoothing method that mitigates HTO by adaptively refining the target distribution. Theoretically, OVERTONE offers better parameter updates with negligible computation overhead. It also induces an implicit DPO but does not require preference data pairs. Extensive experiments across four editing methods, two LLMs, and diverse scenarios demonstrate the effectiveness and versatility of our method.

Article 694

Title@2025-05-28 (3): Robust Reward Alignment via Hypothesis Space Batch Cutting

Title: Robust Reward Alignment via Hypothesis Space Batch Cutting

Robuste Belohnung Ausrichtung durch Hypothesis Raum Batch Schneiden

通过假设空间批量切割进行强力奖励调整 2502.02921v3

Authors: Zhixian Xie, Haode Zhang, Yizhe Feng, Wanxin Jin

Reward design in reinforcement learning and optimal control is challenging. Preference-based alignment addresses this by enabling agents to learn rewards from ranked trajectory pairs provided by humans. However, existing methods often suffer from poor robustness to unknown false human preferences. In this work, we propose a robust and efficient reward alignment method based on a novel and geometrically interpretable perspective: hypothesis space batched cutting. Our method iteratively refines the reward hypothesis space through “cuts” based on batches of human preferences. Within each batch, human preferences, queried based on disagreement, are grouped using a voting function to determine the appropriate cut, ensuring a bounded human query complexity. To handle unknown erroneous preferences, we introduce a conservative cutting method within each batch, preventing erroneous human preferences from making overly aggressive cuts to the hypothesis space. This guarantees provable robustness against false preferences, while eliminating the need to explicitly identify them. We evaluate our method in a model predictive control setting across diverse tasks. The results demonstrate that our framework achieves comparable or superior performance to state-of-the-art methods in error-free settings while significantly outperforming existing methods when handling a high percentage of erroneous human preferences.

Article 695

Title@2025-05-28 (3): Cooperation of Experts: Fusing Heterogeneous Information with Large Margin

Title: Cooperation of Experts: Fusing Heterogeneous Information with Large Margin

Kooperation von Experten: Verschmelzende Heterogene Informationen mit großer Spanne

专家合作:利用具有较大边际效应的异种信息 2505.20853v2

Authors: Shuo Wang, Shunyang Huang, Jinghui Yuan, Zhixiang Shen, Zhao Kang

Fusing heterogeneous information remains a persistent challenge in modern data analysis. While significant progress has been made, existing approaches often fail to account for the inherent heterogeneity of object patterns across different semantic spaces. To address this limitation, we propose the Cooperation of Experts (CoE) framework, which encodes multi-typed information into unified heterogeneous multiplex networks. By overcoming modality and connection differences, CoE provides a powerful and flexible model for capturing the intricate structures of real-world complex data. In our framework, dedicated encoders act as domain-specific experts, each specializing in learning distinct relational patterns in specific semantic spaces. To enhance robustness and extract complementary knowledge, these experts collaborate through a novel large margin mechanism supported by a tailored optimization strategy. Rigorous theoretical analyses guarantee the framework’s feasibility and stability, while extensive experiments across diverse benchmarks demonstrate its superior performance and broad applicability. Our code is available at https://github.com/strangeAlan/CoE.

Article 696

Title@2025-05-28 (3): EnsemW2S: Enhancing Weak-to-Strong Generalization with Large Language Model Ensembles

Title: EnsemW2S: Enhancing Weak-to-Strong Generalization with Large Language Model Ensembles

EnsemW2S: Verbesserung der Schwach-zu-Strong-Verallgemeinerung mit großsprachigen Modellensembles

EnsemW2S:用大语言模型组合加强弱至强的通用化 2505.21959v1

Authors: Aakriti Agrawal, Mucong Ding, Zora Che, Chenghao Deng, Anirudh Satheesh, Bang An, Bayan Bruss, John Langford, Furong Huang

With Large Language Models (LLMs) rapidly approaching and potentially surpassing human-level performance, it has become imperative to develop approaches capable of effectively supervising and enhancing these powerful models using smaller, human-level models exposed to only human-level data. We address this critical weak-to-strong (W2S) generalization challenge by proposing a novel method aimed at improving weak experts, by training on the same limited human-level data, enabling them to generalize to complex, super-human-level tasks. Our approach, called EnsemW2S, employs a token-level ensemble strategy that iteratively combines multiple weak experts, systematically addressing the shortcomings identified in preceding iterations. By continuously refining these weak models, we significantly enhance their collective ability to supervise stronger student models. We extensively evaluate the generalization performance of both the ensemble of weak experts and the subsequent strong student model across in-distribution (ID) and out-of-distribution (OOD) datasets. For OOD, we specifically introduce question difficulty as an additional dimension for defining distributional shifts. Our empirical results demonstrate notable improvements of 4% and 3.2% on ID datasets and up to 6% and 2.28% on OOD datasets for experts and student models, respectively, underscoring the effectiveness of our proposed method in advancing W2S generalization.

Article 697

Title@2025-05-28 (3): A Stochastic Approximation Approach for Efficient Decentralized Optimization on Random Networks

Title: A Stochastic Approximation Approach for Efficient Decentralized Optimization on Random Networks

Ein stochastischer Annäherungsansatz für eine effiziente dezentralisierte Optimierung von Random Networks

随机网络高效分散优化优化的斯托卡接近方法 2410.18774v2

Authors: Chung-Yiu Yau, Haoming Liu, Hoi-To Wai

A challenging problem in decentralized optimization is to develop algorithms with fast convergence on random and time-varying topologies under unreliable and bandwidth-constrained communication networks. This paper studies a stochastic approximation approach with a Fully Stochastic Primal Dual Algorithm (FSPDA) framework. Our framework relies on a novel observation that randomness in a time-varying topology can be incorporated in a stochastic augmented Lagrangian formulation, whose expected value admits saddle points that coincide with stationary solutions of the decentralized optimization problem. With the FSPDA framework, we develop two new algorithms supporting efficient sparsified communication on random time-varying topologies – FSPDA-SA allows agents to execute multiple local gradient steps depending on the time-varying topology to accelerate convergence, and FSPDA-STORM further incorporates a variance reduction step to improve sample complexity. For problems with a smooth (possibly non-convex) objective function, within $T$ iterations, we show that FSPDA-SA (resp. FSPDA-STORM) finds an $\mathcal{O}( 1/\sqrt{T} )$-stationary (resp. $\mathcal{O}( 1/T^{2/3} )$-stationary) solution. Numerical experiments show the benefits of the FSPDA algorithms.

Article 698

Title@2025-05-28 (3): Kimi k1.5: Scaling Reinforcement Learning with LLMs

Title: Kimi k1.5: Scaling Reinforcement Learning with LLMs

Kimi k1.5: Skalierungs-Verstärkungs-Lernen mit LLMs

Kimi k1.5:利用LLMs加强加强学习 2501.12599v3

Authors: Kimi Team, Angang Du, Bofei Gao, Bowei Xing, Changjiu Jiang, Cheng Chen, Cheng Li, Chenjun Xiao, Chenzhuang Du, Chonghua Liao, Chuning Tang, Congcong Wang, Dehao Zhang, Enming Yuan, Enzhe Lu, Fengxiang Tang, Flood Sung, Guangda Wei, Guokun Lai, Haiqing Guo, Han Zhu, Hao Ding, Hao Hu, Hao Yang, Hao Zhang, Haotian Yao, Haotian Zhao, Haoyu Lu, Haoze Li, Haozhen Yu, Hongcheng Gao, Huabin Zheng, Huan Yuan, Jia Chen, Jianhang Guo, Jianlin Su, Jianzhou Wang, Jie Zhao, Jin Zhang, Jingyuan Liu, Junjie Yan, Junyan Wu, Lidong Shi, Ling Ye, Longhui Yu, Mengnan Dong, Neo Zhang, Ningchen Ma, Qiwei Pan, Qucheng Gong, Shaowei Liu, Shengling Ma, Shupeng Wei, Sihan Cao, Siying Huang, Tao Jiang, Weihao Gao, Weimin Xiong, Weiran He, Weixiao Huang, Wenhao Wu, Wenyang He, Xianghui Wei, Xianqing Jia, Xingzhe Wu, Xinran Xu, Xinxing Zu, Xinyu Zhou, Xuehai Pan, Y. Charles, Yang Li, Yangyang Hu, Yangyang Liu, Yanru Chen, Yejie Wang, Yibo Liu, Yidao Qin, Yifeng Liu, Ying Yang, Yiping Bao, Yulun Du, Yuxin Wu, Yuzhi Wang, Zaida Zhou, Zhaoji Wang, Zhaowei Li, Zhen Zhu, Zheng Zhang, Zhexu Wang, Zhilin Yang, Zhiqi Huang, Zihao Huang, Ziyao Xu, Zonghan Yang, Zongyu Lin

Language model pretraining with next token prediction has proved effective for scaling compute but is limited by the amount of available training data. Scaling reinforcement learning (RL) unlocks a new axis for the continued improvement of artificial intelligence, with the promise that large language models (LLMs) can scale their training data by learning to explore with rewards. However, prior published work has not produced competitive results. In light of this, we report on the training practice of Kimi k1.5, our latest multi-modal LLM trained with RL, including its RL training techniques, multi-modal data recipes, and infrastructure optimization. Long context scaling and improved policy optimization methods are key ingredients of our approach, which establishes a simplistic, effective RL framework without relying on more complex techniques such as Monte Carlo tree search, value functions, and process reward models. Notably, our system achieves state-of-the-art reasoning performance across multiple benchmarks and modalities – e.g., 77.5 on AIME, 96.2 on MATH 500, 94th percentile on Codeforces, 74.9 on MathVista – matching OpenAI’s o1. Moreover, we present effective long2short methods that use long-CoT techniques to improve short-CoT models, yielding state-of-the-art short-CoT reasoning results – e.g., 60.8 on AIME, 94.6 on MATH500, 47.3 on LiveCodeBench – outperforming existing short-CoT models such as GPT-4o and Claude Sonnet 3.5 by a large margin (up to +550%).

Article 699

Title@2025-05-28 (3): Stochastic Primal-Dual Double Block-Coordinate for Two-way Partial AUC Maximization

Title: Stochastic Primal-Dual Double Block-Coordinate for Two-way Partial AUC Maximization

Stochastische primäre Doppelblockkoordinate für Zwei-Wege-Partielle AUC-Maximierung

双向部分AUC 最大化 2505.21944v1

Authors: Linli Zhou, Bokun Wang, My T. Thai, Tianbao Yang

Two-way partial AUC (TPAUC) is a critical performance metric for binary classification with imbalanced data, as it focuses on specific ranges of the true positive rate (TPR) and false positive rate (FPR). However, stochastic algorithms for TPAUC optimization remain under-explored, with existing methods either limited to approximated TPAUC loss functions or burdened by sub-optimal complexities. To overcome these limitations, we introduce two innovative stochastic primal-dual double block-coordinate algorithms for TPAUC maximization. These algorithms utilize stochastic block-coordinate updates for both the primal and dual variables, catering to both convex and non-convex settings. We provide theoretical convergence rate analyses, demonstrating significant improvements over prior approaches. Our experimental results, based on multiple benchmark datasets, validate the superior performance of our algorithms, showcasing faster convergence and better generalization. This work advances the state of the art in TPAUC optimization and offers practical tools for real-world machine learning applications.
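
To make the metric in the abstract above concrete, here is a minimal sketch of one common way to estimate a two-way partial AUC empirically: restrict attention to the lowest-scored positives and the highest-scored negatives, then count correctly ordered pairs among them. The function name, the `frac_pos`/`frac_neg` parameterization, and the normalization are assumptions for illustration; conventions vary across papers, and this is not the paper's primal-dual algorithm.

```python
import math

def two_way_partial_auc(pos_scores, neg_scores, frac_pos, frac_neg):
    """Fraction of correctly ranked pairs among the hardest examples.

    frac_pos: fraction of positives kept, taken from the lowest scores.
    frac_neg: fraction of negatives kept, taken from the highest scores.
    Both parameters are hypothetical names chosen for this sketch.
    """
    k_pos = max(1, math.ceil(frac_pos * len(pos_scores)))
    k_neg = max(1, math.ceil(frac_neg * len(neg_scores)))
    hard_pos = sorted(pos_scores)[:k_pos]                # lowest-scored positives
    hard_neg = sorted(neg_scores, reverse=True)[:k_neg]  # highest-scored negatives
    correct = 0.0
    for p in hard_pos:
        for n in hard_neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5  # ties count as half, as in the usual AUC estimator
    return correct / (k_pos * k_neg)

pos = [0.9, 0.8, 0.4, 0.3]
neg = [0.7, 0.35, 0.2, 0.1]
print(two_way_partial_auc(pos, neg, 1.0, 1.0))  # full AUC over all pairs
print(two_way_partial_auc(pos, neg, 0.5, 0.5))  # only the hardest half of each class
```

With both fractions set to 1.0 the function reduces to the ordinary AUC estimate; shrinking them focuses the score on the pairs a classifier is most likely to rank incorrectly, which is why the partial metric is stricter on imbalanced data.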

Article 700

Title@2025-05-28 (3): Continual Learning Beyond Experience Rehearsal and Full Model Surrogates

Title: Continual Learning Beyond Experience Rehearsal and Full Model Surrogates

Kontinuierliches Lernen über die Erfahrung hinaus Proben und vollständige Modellüberlagerungen

排练和全模模范代理公司 2505.21942v1

Authors: Prashant Bhat, Laurens Niesten, Elahe Arani, Bahram Zonooz

Continual learning (CL) has remained a significant challenge for deep neural networks as learning new tasks erases previously acquired knowledge, either partially or completely. Existing solutions often rely on experience rehearsal or full model surrogates to mitigate this catastrophic forgetting (CF). While effective, these approaches introduce substantial memory and computational overhead, limiting their scalability and applicability in real-world scenarios. To address this, we propose SPARC, a scalable CL approach that eliminates the need for experience rehearsal and full-model surrogates. By effectively combining task-specific working memories and task-agnostic semantic memory for cross-task knowledge consolidation, SPARC achieves remarkable parameter efficiency, using only 6% of the parameters required by full-model surrogates. Despite its lightweight design, SPARC achieves superior performance on Seq-TinyImageNet and matches rehearsal-based methods on various CL benchmarks. Additionally, weight re-normalization in the classification layer mitigates task-specific biases, establishing SPARC as a practical and scalable solution for CL under stringent efficiency constraints.

Article 701

Title@2025-05-28 (3): Go With the Flow: Fast Diffusion for Gaussian Mixture Models

Title: Go With the Flow: Fast Diffusion for Gaussian Mixture Models

Mit dem Fluss gehen: Schnelle Diffusion für Gaussian Mixture Models

随流而去:高山混合模型的快速扩散 2412.09059v4

Authors: George Rapakoulias, Ali Reza Pedram, Fengjiao Liu, Lingjiong Zhu, Panagiotis Tsiotras

Schrödinger Bridges (SBs) are diffusion processes that steer, in finite time, a given initial distribution to another final one while minimizing a suitable cost functional. Although various methods for computing SBs have recently been proposed in the literature, most of these approaches require computationally expensive training schemes, even for solving low-dimensional problems. In this work, we propose an analytic parametrization of a set of feasible policies for steering the distribution of a dynamical system from one Gaussian Mixture Model (GMM) to another. Instead of relying on standard non-convex optimization techniques, the optimal policy within the set can be approximated as the solution of a low-dimensional linear program whose dimension scales linearly with the number of components in each mixture. The proposed method generalizes naturally to more general classes of dynamical systems, such as controllable linear time-varying systems, enabling efficient solutions to multi-marginal momentum SB between GMMs, a challenging distribution interpolation problem. We showcase the potential of this approach in low-to-moderate dimensional problems such as image-to-image translation in the latent space of an autoencoder, learning of cellular dynamics using multi-marginal momentum SB problems, and various other examples. We also test our approach on an Entropic Optimal Transport (EOT) benchmark problem and show that it outperforms state-of-the-art methods in cases where the boundary distributions are mixture models while requiring virtually no training.

Article 702

Title@2025-05-28 (3): Practical Adversarial Attacks on Stochastic Bandits via Fake Data Injection

Title: Practical Adversarial Attacks on Stochastic Bandits via Fake Data Injection

Praktische Adversarialangriffe auf stochastische Banditen durch gefälschte Dateninjektion

通过假数据注射,实际对抗性攻击斯托卡强盗 2505.21938v1

Authors: Qirun Zeng, Eric He, Richard Hoffmann, Xuchuang Wang, Jinhang Zuo

Adversarial attacks on stochastic bandits have traditionally relied on some unrealistic assumptions, such as per-round reward manipulation and unbounded perturbations, limiting their relevance to real-world systems. We propose a more practical threat model, Fake Data Injection, which reflects realistic adversarial constraints: the attacker can inject only a limited number of bounded fake feedback samples into the learner’s history, simulating legitimate interactions. We design efficient attack strategies under this model, explicitly addressing both magnitude constraints (on reward values) and temporal constraints (on when and how often data can be injected). Our theoretical analysis shows that these attacks can mislead both Upper Confidence Bound (UCB) and Thompson Sampling algorithms into selecting a target arm in nearly all rounds while incurring only sublinear attack cost. Experiments on synthetic and real-world datasets validate the effectiveness of our strategies, revealing significant vulnerabilities in widely used stochastic bandit algorithms under practical adversarial scenarios.
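
The threat model described above can be illustrated with a small simulation. This is a simplified sketch, not the attack strategies from the paper: it assumes Bernoulli rewards in [0, 1], a standard UCB1 learner, and an attacker who injects all of its bounded fake samples (reward 0) into the non-target arms' history before learning starts. The function `run_ucb` and its parameters are hypothetical names.

```python
import math
import random

def run_ucb(true_means, target_arm, n_fake, horizon, seed=0):
    """Run UCB1 after injecting n_fake zero-reward samples per non-target arm."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms   # samples the learner believes it has per arm
    sums = [0.0] * n_arms   # reward totals the learner believes it has per arm

    # Attack (simplified assumption): fake feedback with the minimum bounded
    # reward 0.0 is added to the history of every arm except the target.
    for arm in range(n_arms):
        if arm != target_arm:
            counts[arm] += n_fake

    pulls = [0] * n_arms    # genuine pulls made by the learner
    for _ in range(horizon):
        total = sum(counts)

        def ucb_index(arm):
            if counts[arm] == 0:
                return float("inf")  # unexplored arms are tried first
            mean = sums[arm] / counts[arm]
            return mean + math.sqrt(2.0 * math.log(total) / counts[arm])

        arm = max(range(n_arms), key=ucb_index)
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        pulls[arm] += 1
    return pulls

# Arm 0 is truly better (mean 0.9); the attacker promotes arm 1 (mean 0.5).
clean = run_ucb([0.9, 0.5], target_arm=1, n_fake=0, horizon=500)
attacked = run_ucb([0.9, 0.5], target_arm=1, n_fake=200, horizon=500)
print("pulls without attack:", clean)
print("pulls with attack:   ", attacked)
```

The sketch only shows why polluted history misleads an index-based learner; it makes no attempt to reproduce the sublinear attack cost or the temporal constraints analyzed in the paper.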

Article 703

Title@2025-05-28 (3): ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation

Title: ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation

ReQFlow: Rektifizierter Quaternionsfluss für effiziente und hochwertige Protein-Backbone-Generation

ReQFlow:为高效和高品质蛋白后骨生成而调整的四量流动 2502.14637v3

Authors: Angxiao Yue, Zichong Wang, Hongteng Xu

Protein backbone generation plays a central role in de novo protein design and is significant for many biological and medical applications. Although diffusion and flow-based generative models provide potential solutions to this challenging task, they often generate proteins with undesired designability and suffer from computational inefficiency. In this study, we propose a novel rectified quaternion flow (ReQFlow) matching method for fast and high-quality protein backbone generation. In particular, our method generates a local translation and a 3D rotation from random noise for each residue in a protein chain, representing each 3D rotation as a unit quaternion and constructing its flow by spherical linear interpolation (SLERP) in an exponential format. We train the model by quaternion flow (QFlow) matching with guaranteed numerical stability and rectify the QFlow model to accelerate its inference and improve the designability of generated protein backbones, leading to the proposed ReQFlow model. Experiments show that ReQFlow achieves on-par performance in protein backbone generation while requiring far fewer sampling steps and significantly less inference time (e.g., being 37x faster than RFDiffusion and 63x faster than Genie2 when generating a backbone of length 300), demonstrating its effectiveness and efficiency. The code is available at https://github.com/AngxiaoYue/ReQFlow.
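
For readers unfamiliar with SLERP, the following is a minimal sketch of spherical linear interpolation between two unit quaternions in the textbook sine-weighted form. It is not taken from the ReQFlow code base, and the abstract's "exponential format" may be written differently; the `(w, x, y, z)` component order and the function names are assumptions.

```python
import math

def normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def slerp(q0, q1, t):
    """Interpolate from rotation q0 (t=0) to rotation q1 (t=1) along the shorter arc."""
    q0, q1 = normalize(q0), normalize(q1)
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q encode the same rotation; flip one to take the shorter path.
        q1 = tuple(-c for c in q1)
        dot = -dot
    theta = math.acos(min(1.0, dot))  # angle between the two quaternions
    if theta < 1e-8:
        # Nearly identical rotations: fall back to normalized linear interpolation.
        return normalize(tuple((1.0 - t) * a + t * b for a, b in zip(q0, q1)))
    s = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return tuple(w0 * a + w1 * b for a, b in zip(q0, q1))

# Halfway between the identity and a 90-degree rotation about the z-axis
# is a 45-degree rotation about the z-axis.
identity = (1.0, 0.0, 0.0, 0.0)
z90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
print(slerp(identity, z90, 0.5))
```

Because the interpolant stays on the unit sphere for every `t`, each intermediate point is itself a valid rotation, which is the property that makes SLERP a natural choice for building a flow over rotations.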

Article 704

Title@2025-05-28 (3): Higher-Order Group Synchronization

Title: Higher-Order Group Synchronization

Gruppensynchronisierung mit höherer Ordnung

高级分级组同步化 2505.21932v1

Authors: Adriana L. Duncan, Joe Kileel

Group synchronization is the problem of determining reliable global estimates from noisy local measurements on networks. The typical task for group synchronization is to assign elements of a group to the nodes of a graph in a way that respects group elements given on the edges which encode information about local pairwise relationships between the nodes. In this paper, we introduce a novel higher-order group synchronization problem which operates on a hypergraph and seeks to synchronize higher-order local measurements on the hyperedges to obtain global estimates on the nodes. Higher-order group synchronization is motivated by applications to computer vision and image processing, among other computational problems. First, we define the problem of higher-order group synchronization and discuss its mathematical foundations. Specifically, we give necessary and sufficient synchronizability conditions which establish the importance of cycle consistency in higher-order group synchronization. Then, we propose the first computational framework for general higher-order group synchronization; it acts globally and directly on higher-order measurements using a message passing algorithm. We discuss theoretical guarantees for our framework, including convergence analyses under outliers and noise. Finally, we show potential advantages of our method through numerical experiments. In particular, we show that in certain cases our higher-order method applied to rotational and angular synchronization outperforms standard pairwise synchronization methods and is more robust to outliers. We also show that our method has comparable performance on simulated cryo-electron microscopy (cryo-EM) data compared to a standard cryo-EM reconstruction package.
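
The abstract above stresses cycle consistency as the key synchronizability condition. The sketch below illustrates the idea only for the standard pairwise angular case (not the higher-order hypergraph setting the paper introduces): relative angles summed around a closed cycle should vanish modulo 2π when the measurements are noise-free. All names (`wrap`, `relative_measurements`, `cycle_residual`) are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi

def wrap(angle):
    """Map an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % TWO_PI - math.pi

def relative_measurements(angles, edges):
    """Noise-free pairwise measurements theta_i - theta_j (mod 2*pi)."""
    return {(i, j): (angles[i] - angles[j]) % TWO_PI for i, j in edges}

def cycle_residual(measurements, cycle):
    """Wrapped sum of edge measurements around a closed cycle."""
    return wrap(sum(measurements[edge] for edge in cycle))

angles = [0.3, 1.2, 2.5]
cycle = [(0, 1), (1, 2), (2, 0)]
clean = relative_measurements(angles, cycle)
clean_residual = cycle_residual(clean, cycle)

# Corrupt one edge: the cycle no longer closes, exposing the outlier.
corrupted = dict(clean)
corrupted[(1, 2)] += 0.4
corrupted_residual = cycle_residual(corrupted, cycle)
```

A nonzero residual signals that at least one measurement on the cycle is inconsistent, which is the kind of information robust synchronization methods exploit.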


Article 705

Title@2025-05-28 (3): Exploring Criteria of Loss Reweighting to Enhance LLM Unlearning

Title: Exploring Criteria of Loss Reweighting to Enhance LLM Unlearning

Ermittlung von Kriterien für die Neugewichtung von Verlusten zur Verbesserung des LLM-Entlernens

探索损失重新加权标准,加强LLM 重新学习 2505.11953v2

Authors: Puning Yang, Qizhou Wang, Zhuo Huang, Tongliang Liu, Chengqi Zhang, Bo Han

Loss reweighting has shown significant benefits for machine unlearning with large language models (LLMs). However, their exact functionalities are left unclear and the optimal strategy remains an open question, thus impeding the understanding and improvement of existing methodologies. In this paper, we identify two distinct goals of loss reweighting, namely, Saturation and Importance – the former indicates that those insufficiently optimized data should be emphasized, while the latter stresses some critical data that are most influential for loss minimization. To study their usefulness, we design specific reweighting strategies for each goal and evaluate their respective effects on unlearning. We conduct extensive empirical analyses on well-established benchmarks, and summarize some important observations as follows: (i) Saturation enhances efficacy more than importance-based reweighting, and their combination can yield additional improvements. (ii) Saturation typically allocates lower weights to data with lower likelihoods, whereas importance-based reweighting does the opposite. (iii) The efficacy of unlearning is also largely influenced by the smoothness and granularity of the weight distributions. Based on these findings, we propose SatImp, a simple reweighting method that combines the advantages of both saturation and importance. Empirical results on extensive datasets validate the efficacy of our method, potentially bridging existing research gaps and indicating directions for future research. Our code is available at https://github.com/tmlr-group/SatImp.


Article 706

Title@2025-05-28 (3): Efficient Ensemble for Fine-tuning Language Models on Multiple Datasets

Title: Efficient Ensemble for Fine-tuning Language Models on Multiple Datasets

Effizientes Ensemble für die Feinabstimmung von Sprachmodellen auf mehreren Datensätzen

多个数据集微调语言模型高效组合组合 2505.21930v1

Authors: Dongyue Li, Ziniu Zhang, Lu Wang, Hongyang R. Zhang

This paper develops an ensemble method for fine-tuning a language model to multiple datasets. Existing methods, such as quantized LoRA (QLoRA), are efficient when adapting to a single dataset. When training on multiple datasets of different tasks, a common setup in practice, it remains unclear how to design an efficient adaptation for fine-tuning language models. We propose to use an ensemble of multiple smaller adapters instead of a single adapter per task. We design an efficient algorithm that partitions $n$ datasets into $m$ groups, where $m$ is typically much smaller than $n$ in practice, and train one adapter for each group before taking a weighted combination to form the ensemble. The algorithm leverages a first-order approximation property of low-rank adaptation to quickly obtain the fine-tuning performances of dataset combinations since methods like LoRA stay close to the base model. Hence, we use the gradients of the base model to estimate its behavior during fine-tuning. Empirically, this approximation holds with less than $1\%$ error on models with up to $34$ billion parameters, leading to an estimation of true fine-tuning performances under $5\%$ error while speeding up computation compared to base fine-tuning by $105$ times. When applied to fine-tune Llama and GPT models on ten text classification tasks, our approach provides up to $10\%$ higher average test accuracy over QLoRA, with only $9\%$ more FLOPs. On a Llama model with $34$ billion parameters, an ensemble of QLoRA increases test accuracy by $3\%$ compared to QLoRA, with only $8\%$ more FLOPs.
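
The abstract above relies on a first-order approximation: because low-rank adapters stay close to the base model, the loss after fine-tuning can be estimated from base-model gradients. The toy sketch below shows the underlying Taylor expansion, loss(w0 + delta) ≈ loss(w0) + ⟨grad(w0), delta⟩, on a single logistic-loss example. The model, data, and function names are assumptions made for illustration and are unrelated to the paper's implementation.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def logistic_loss(w, x, y):
    """Logistic loss log(1 + exp(-y * <w, x>)) for a label y in {-1, +1}."""
    return math.log1p(math.exp(-y * dot(w, x)))

def logistic_grad(w, x, y):
    """Gradient of the logistic loss with respect to w."""
    coeff = -y / (1.0 + math.exp(y * dot(w, x)))
    return [coeff * xi for xi in x]

def first_order_estimate(w0, delta, x, y):
    """Estimate the loss at w0 + delta using only quantities at w0."""
    return logistic_loss(w0, x, y) + dot(logistic_grad(w0, x, y), delta)

w0 = [0.5, -0.2]        # "base model" weights (toy values)
delta = [0.01, -0.01]   # small update, standing in for a low-rank adapter
x, y = [1.0, 2.0], 1

true_loss = logistic_loss([a + b for a, b in zip(w0, delta)], x, y)
estimate = first_order_estimate(w0, delta, x, y)
```

The estimate is accurate only while the update stays small, which matches the abstract's remark that methods like LoRA stay close to the base model.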


Article 707

Title@2025-05-28 (3): Efficient Logit-based Knowledge Distillation of Deep Spiking Neural Networks for Full-Range Timestep Deployment

Title: Efficient Logit-based Knowledge Distillation of Deep Spiking Neural Networks for Full-Range Timestep Deployment

Effiziente Logit-basierte Wissensdestillation von Tiefen-Spiking-Neural-Netzwerken für die Bereitstellung von Vollstrecken-Zeitschritten

用于全红时间步骤部署的深渗透神经网络的高效基于逻辑的知识蒸馏 2501.15925v2

Authors: Chengting Yu, Xiaochen Zhao, Lei Liu, Shu Yang, Gaoang Wang, Erping Li, Aili Wang

Spiking Neural Networks (SNNs) are emerging as a brain-inspired alternative to traditional Artificial Neural Networks (ANNs), prized for their potential energy efficiency on neuromorphic hardware. Despite this, SNNs often suffer from accuracy degradation compared to ANNs and face deployment challenges due to fixed inference timesteps, which require retraining for adjustments, limiting operational flexibility. To address these issues, our work considers the spatio-temporal property inherent in SNNs, and proposes a novel distillation framework for deep SNNs that optimizes performance across full-range timesteps without specific retraining, enhancing both efficacy and deployment adaptability. We provide both theoretical analysis and empirical validations to illustrate that training guarantees the convergence of all implicit models across full-range timesteps. Experimental results on CIFAR-10, CIFAR-100, CIFAR10-DVS, and ImageNet demonstrate state-of-the-art performance among distillation-based SNNs training methods. Our code is available at https://github.com/Intelli-Chip-Lab/snn_temporal_decoupling_distillation.


Article 708

Title@2025-05-28 (3): Subspecialty-Specific Foundation Model for Intelligent Gastrointestinal Pathology

Title: Subspecialty-Specific Foundation Model for Intelligent Gastrointestinal Pathology

Subspezialitätsspezifisches Stiftungsmodell für intelligente Gastrointestinalpathologie

智能气胃肠道病理学 2505.21928v1

Authors: Lianghui Zhu, Xitong Ling, Minxi Ouyang, Xiaoping Liu, Mingxi Fu, Tian Guan, Fanglei Fu, Xuanyu Wang, Maomao Zeng, Mingxi Zhu, Yibo Jin, Liming Liu, Song Duan, Qiming He, Yizhi Wang, Luxi Xie, Houqiang Li, Yonghong He, Sufang Tian

Gastrointestinal (GI) diseases represent a clinically significant burden, necessitating precise diagnostic approaches to optimize patient outcomes. Conventional histopathological diagnosis, heavily reliant on the subjective interpretation of pathologists, suffers from limited reproducibility and diagnostic variability. To overcome these limitations and address the lack of pathology-specific foundation models for GI diseases, we develop Digepath, a specialized foundation model for GI pathology. Our framework introduces a dual-phase iterative optimization strategy combining pretraining with fine-screening, specifically designed to address the detection of sparsely distributed lesion areas in whole-slide images. Digepath is pretrained on more than 353 million image patches from over 200,000 hematoxylin and eosin-stained slides of GI diseases. It attains state-of-the-art performance on 33 out of 34 tasks related to GI pathology, including pathological diagnosis, molecular prediction, gene mutation prediction, and prognosis evaluation, particularly in diagnostically ambiguous cases and resolution-agnostic tissue classification. We further translate the intelligent screening module for early GI cancer and achieve near-perfect 99.6% sensitivity across 9 independent medical institutions nationwide. The outstanding performance of Digepath highlights its potential to bridge critical gaps in histopathological practice. This work not only advances AI-driven precision pathology for GI diseases but also establishes a transferable paradigm for other pathology subspecialties.


Article 709

Title@2025-05-28 (3): RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination

Title: RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination

RenderFormer: Transformer-basiertes Neural-Rendering von Dreiecksnetzen mit globaler Beleuchtung

成形前:以变形器为基础的以全球光化为工具的三角三角光板的神经成形 2505.21925v1

Authors: Chong Zeng, Yue Dong, Pieter Peers, Hongzhi Wu, Xin Tong

We present RenderFormer, a neural rendering pipeline that directly renders an image from a triangle-based representation of a scene with full global illumination effects and that does not require per-scene training or fine-tuning. Instead of taking a physics-centric approach to rendering, we formulate rendering as a sequence-to-sequence transformation where a sequence of tokens representing triangles with reflectance properties is converted to a sequence of output tokens representing small patches of pixels. RenderFormer follows a two stage pipeline: a view-independent stage that models triangle-to-triangle light transport, and a view-dependent stage that transforms a token representing a bundle of rays to the corresponding pixel values guided by the triangle-sequence from the view-independent stage. Both stages are based on the transformer architecture and are learned with minimal prior constraints. We demonstrate and evaluate RenderFormer on scenes with varying complexity in shape and light transport.


Article 710

Title@2025-05-28 (3): FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design

Title: FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design

FALCON: Ein ML-Framework für vollautomatisierte Layout-Kontrainierte analoge Schaltungen

FALCON: 完全自动布局约束模拟电路设计 ML 框架 2505.21923v1

Authors: Asal Mehradfar, Xuzhe Zhao, Yilun Huang, Emir Ceyani, Yankai Yang, Shihao Han, Hamidreza Aghasi, Salman Avestimehr

Designing analog circuits from performance specifications is a complex, multi-stage process encompassing topology selection, parameter inference, and layout feasibility. We introduce FALCON, a unified machine learning framework that enables fully automated, specification-driven analog circuit synthesis through topology selection and layout-constrained optimization. Given a target performance, FALCON first selects an appropriate circuit topology using a performance-driven classifier guided by human design heuristics. Next, it employs a custom, edge-centric graph neural network trained to map circuit topology and parameters to performance, enabling gradient-based parameter inference through the learned forward model. This inference is guided by a differentiable layout cost, derived from analytical equations capturing parasitic and frequency-dependent effects, and constrained by design rules. We train and evaluate FALCON on a large-scale custom dataset of 1M analog mm-wave circuits, generated and simulated using Cadence Spectre across 20 expert-designed topologies. Through this evaluation, FALCON demonstrates >99% accuracy in topology inference, <10% relative error in performance prediction, and efficient layout-aware design that completes in under 1 second per instance. Together, these results position FALCON as a practical and extensible foundation model for end-to-end analog circuit design automation.


Article 711

Title@2025-05-28 (3): Self-supervised Learning Method Using Transformer for Multi-dimensional Sensor Data Processing

Title: Self-supervised Learning Method Using Transformer for Multi-dimensional Sensor Data Processing

Selbstüberwachte Lernmethode mit Transformer für mehrdimensionale Sensordatenverarbeitung

利用变压器进行多维传感器数据处理的自监督学习方法 2505.21918v1

Authors: Haruki Kai, Tsuyoshi Okita

We developed a deep learning algorithm for human activity recognition using sensor signals as input. In this study, we built a pretrained language model based on the Transformer architecture, which is widely used in natural language processing. By leveraging this pretrained model, we aimed to improve performance on the downstream task of human activity recognition. While this task can be addressed using a vanilla Transformer, we propose an enhanced n-dimensional numerical processing Transformer that incorporates three key features: embedding n-dimensional numerical data through a linear layer, binning-based pre-processing, and a linear transformation in the output layer. We evaluated the effectiveness of our proposed model across five different datasets. Compared to the vanilla Transformer, our model demonstrated 10%-15% improvements in accuracy.
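
The abstract above lists binning-based pre-processing as one of the model's three key features without giving details. As a minimal sketch under the assumption of uniform-width bins over a known value range (the paper may bin differently), continuous sensor readings can be mapped to discrete bin indices as follows; `bin_values` and its parameters are hypothetical.

```python
def bin_values(values, lo, hi, n_bins):
    """Map each reading to a bin index in [0, n_bins - 1].

    Assumes uniform-width bins on [lo, hi]; out-of-range readings are
    clipped to the first or last bin.
    """
    width = (hi - lo) / n_bins
    indices = []
    for v in values:
        idx = int((v - lo) / width)
        indices.append(min(max(idx, 0), n_bins - 1))
    return indices

# One axis of an accelerometer trace, normalized to [-1, 1] (toy values).
readings = [-1.0, 0.0, 0.49, 1.0]
tokens = bin_values(readings, lo=-1.0, hi=1.0, n_bins=4)
```

Applying the same function per axis yields one index per dimension, which could then be embedded or combined with the raw values fed to a linear layer.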


Article 712

Title@2025-05-28 (3): SlimLLM: Accurate Structured Pruning for Large Language Models

Title: SlimLLM: Accurate Structured Pruning for Large Language Models

SlimLLM: Genau strukturiertes Pruning für große Sprachmodelle

SlimLLM:大型语言模型的准确结构审慎 2505.22689v1

Authors: Jialong Guo, Xinghao Chen, Yehui Tang, Yunhe Wang

Large language models (LLMs) have garnered significant attention and demonstrated impressive capabilities in a wide range of applications. However, due to their enormous computational costs, the deployment and application of LLMs are often severely limited. To address this issue, structured pruning is an effective solution to compress the parameters of LLMs. Determining the importance of each sub-module in LLMs and minimizing performance loss are critical issues that need to be carefully addressed in structured pruning. In this paper, we propose an effective and fast structured pruning method named SlimLLM for large language models. For channel and attention head pruning, we evaluate the importance based on the entire channel or head, rather than merely aggregating the importance of individual elements within a sub-module. This approach enables a more holistic consideration of the interdependence among elements within the sub-module. In addition, we design a simple linear regression strategy for the output matrix to quickly recover performance. We also propose a layer-based importance ratio to determine the pruning ratio for each layer. Based on the LLaMA benchmark results, our SlimLLM outperforms other methods and achieves state-of-the-art performance.


Article 713

Title@2025-05-28 (3): Understanding the behavior of representation forgetting in continual learning

Title: Understanding the behavior of representation forgetting in continual learning

Das Verhalten der Repräsentation verstehen vergessen im kontinuierlichen Lernen

理解在不断学习中遗忘的代言人行为 2505.20970v2

Authors: Joonkyu Kim, Yejin Kim, Jy-yong Sohn

In continual learning scenarios, catastrophic forgetting of previously learned tasks is a critical issue, making it essential to effectively measure such forgetting. Recently, there has been growing interest in representation forgetting, the forgetting measured at the hidden layer. In this paper, we provide the first theoretical analysis of representation forgetting and use this analysis to better understand the behavior of continual learning. First, we introduce a new metric called representation discrepancy, which measures the difference between representation spaces constructed by two snapshots of a model trained through continual learning. We demonstrate that our proposed metric serves as an effective surrogate for representation forgetting while remaining analytically tractable. Second, through mathematical analysis of our metric, we derive several key findings about the dynamics of representation forgetting: the forgetting occurs more rapidly and to a higher degree as the layer index increases, while increasing the width of the network slows down the forgetting process. Third, we support our theoretical findings through experiments on real image datasets, including Split-CIFAR100 and ImageNet1K.


Article 714

Title@2025-05-28 (3): ExpProof : Operationalizing Explanations for Confidential Models with ZKPs

Title: ExpProof : Operationalizing Explanations for Confidential Models with ZKPs

ExpProof : Operationalisierung von Erklärungen für vertrauliche Modelle mit ZKPs

利用:对ZKPs的机密模型的解释投入运作 2502.03773v3

Authors: Chhavi Yadav, Evan Monroe Laufer, Dan Boneh, Kamalika Chaudhuri

In principle, explanations are intended as a way to increase trust in machine learning models and are often obligated by regulations. However, many circumstances where these are demanded are adversarial in nature, meaning the involved parties have misaligned interests and are incentivized to manipulate explanations for their purpose. As a result, explainability methods fail to be operational in such settings despite the demand (Bordt et al., 2022). In this paper, we take a step towards operationalizing explanations in adversarial scenarios with Zero-Knowledge Proofs (ZKPs), a cryptographic primitive. Specifically, we explore ZKP-amenable versions of the popular explainability algorithm LIME and evaluate their performance on Neural Networks and Random Forests. Our code is publicly available at https://github.com/emlaufer/ExpProof.


Article 715

Title@2025-05-28 (3): Taming Transformer Without Using Learning Rate Warmup

Title: Taming Transformer Without Using Learning Rate Warmup

Zähmung Transformer ohne Verwendung von Lernrate Warmup

塔姆变形器不使用学习速率暖化 2505.21910v1

Authors: Xianbiao Qi, Yelin He, Jiaquan Ye, Chun-Guang Li, Bojia Zi, Xili Dai, Qin Zou, Rong Xiao

Scaling Transformers to a large scale without technical tricks such as learning rate warmup or a markedly lower learning rate is an extremely challenging task that is gaining increasing attention. In this paper, we provide a theoretical analysis of the Transformer training process and reveal the rationale behind the model crash phenomenon in training, termed spectral energy concentration of $\mathbf{W}_q^{\top} \mathbf{W}_k$, which is the reason for a malignant entropy collapse, where $\mathbf{W}_q$ and $\mathbf{W}_k$ are the projection matrices for the query and the key in Transformer, respectively. To remedy this problem, motivated by Weyl’s inequality, we present a novel optimization strategy, i.e., making the weight updates in successive steps smooth: if the ratio $\frac{\sigma_{1}(\nabla \mathbf{W}_t)}{\sigma_{1}(\mathbf{W}_{t-1})}$ is larger than a threshold, we automatically bound the learning rate to a weighted multiple of $\frac{\sigma_{1}(\mathbf{W}_{t-1})}{\sigma_{1}(\nabla \mathbf{W}_t)}$, where $\nabla \mathbf{W}_t$ is the updating quantity in step $t$. Such an optimization strategy can prevent spectral energy from concentrating in only a few directions, and thus can avoid the malignant entropy collapse that triggers the model crash. We conduct extensive experiments using ViT, Swin-Transformer and GPT, showing that our optimization strategy can effectively and stably train these Transformers without using learning rate warmup.
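
The rule in the abstract above compares the largest singular value of the update with that of the current weights and caps the learning rate when their ratio is too large. The sketch below is one possible reading of that rule in plain Python, not the authors' implementation: `threshold` and `weight` are hypothetical hyperparameters, and the largest singular value is estimated by power iteration.

```python
import math

def spectral_norm(mat, iters=200):
    """Estimate the largest singular value by power iteration on A^T A.

    Starts from a uniform vector, so it can fail if that vector happens to be
    orthogonal to the top right-singular vector; adequate for a sketch.
    """
    rows, cols = len(mat), len(mat[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        u = [sum(mat[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(mat[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        v = [x / norm_w for x in w]
    u = [sum(mat[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in u))

def bounded_lr(base_lr, weights, update, threshold=1.0, weight=0.01):
    """Cap the learning rate when sigma_1(update) / sigma_1(weights) > threshold."""
    s_w = spectral_norm(weights)
    s_u = spectral_norm(update)
    if s_w == 0.0 or s_u == 0.0:
        return base_lr
    if s_u / s_w > threshold:
        # Bound the step size to a weighted multiple of sigma_1(W) / sigma_1(update).
        return min(base_lr, weight * s_w / s_u)
    return base_lr

W = [[2.0, 0.0], [0.0, 1.0]]            # sigma_1 = 2
big_update = [[4.0, 0.0], [0.0, 0.5]]   # sigma_1 = 4, ratio 2 > threshold
small_update = [[0.2, 0.0], [0.0, 0.1]] # sigma_1 = 0.2, ratio 0.1

lr_big = bounded_lr(0.1, W, big_update)
lr_small = bounded_lr(0.1, W, small_update)
```

In this toy case the large update is scaled down to a learning rate of about 0.005, while the small update keeps the base rate of 0.1.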


Article 716

Title@2025-05-28 (3): Criticality and Safety Margins for Reinforcement Learning

Title: Criticality and Safety Margins for Reinforcement Learning

Kritizität und Sicherheitsmargen für verstärktes Lernen

强化学习的临界和安全边缘 2409.18289v2

Authors: Alexander Grushin, Walt Woods, Alvaro Velasquez, Simon Khan

State-of-the-art reinforcement learning methods sometimes encounter unsafe situations. Identifying when these situations occur is of interest both for post-hoc analysis and during deployment, where it might be advantageous to call out to a human overseer for help. Methods for gauging the criticality of different points in time have been developed, but their accuracy is not well established due to a lack of ground truth, and they are not designed to be easily interpretable by end users. Therefore, we seek to define a criticality framework with both a quantifiable ground truth and a clear significance to users. We introduce true criticality as the expected drop in reward when an agent deviates from its policy for n consecutive random actions. We also introduce the concept of proxy criticality, a low-overhead metric that has a statistically monotonic relationship to true criticality. Safety margins make these interpretable, when defined as the number of random actions for which performance loss will not exceed some tolerance with high confidence. We demonstrate this approach in several environment-agent combinations; for an A3C agent in an Atari Beamrider environment, the lowest 5% of safety margins contain 47% of agent losses; i.e., supervising only 5% of decisions could potentially prevent roughly half of an agent’s errors. This criticality framework measures the potential impacts of bad decisions, even before those decisions are made, allowing for more effective debugging and oversight of autonomous agents.


Article 717

Title: Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding

Verstärktes Lernen für Out-of-Distribution-Reasoning in LLMs: Eine empirische Studie zur diagnostischen Gruppencodierung

在LLMM中加强分配外原因的强化学习:诊断相关群体编码经验研究 2505.21908v1

Authors: Hanyin Wang, Zhenbang Wu, Gururaj Kolar, Hariprasad Korsapati, Brian Bartlett, Bryan Hull, Jimeng Sun

Diagnosis-Related Group (DRG) codes are essential for hospital reimbursement and operations but require labor-intensive assignment. Large Language Models (LLMs) struggle with DRG coding due to the out-of-distribution (OOD) nature of the task: pretraining corpora rarely contain private clinical or billing data. We introduce DRG-Sapphire, which uses large-scale reinforcement learning (RL) for automated DRG coding from clinical notes. Built on Qwen2.5-7B and trained with Group Relative Policy Optimization (GRPO) using rule-based rewards, DRG-Sapphire introduces a series of RL enhancements to address domain-specific challenges not seen in previous mathematical tasks. Our model achieves state-of-the-art accuracy on the MIMIC-IV benchmark and generates physician-validated reasoning for DRG assignments, significantly enhancing explainability. Our study further sheds light on broader challenges of applying RL to knowledge-intensive, OOD tasks. We observe that RL performance scales approximately linearly with the logarithm of the number of supervised fine-tuning (SFT) examples, suggesting that RL effectiveness is fundamentally constrained by the domain knowledge encoded in the base model. For OOD tasks like DRG coding, strong RL performance requires sufficient knowledge infusion prior to RL. Consequently, scaling SFT may be more effective and computationally efficient than scaling RL alone for such tasks.


Article 718

Title@2025-05-28 (3): OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models

Title: OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models

OVERT: Ein Benchmark für eine überwiderrechtliche Bewertung von Text-zu-Bild-Modellen

GUT: 对文本到图像模型的反否决评价基准 2505.21347v2

Authors: Ziheng Cheng, Yixiao Huang, Hui Xu, Somayeh Sojoudi, Xuandong Zhao, Dawn Song, Song Mei

Text-to-Image (T2I) models have achieved remarkable success in generating visual content from text inputs. Although multiple safety alignment strategies have been proposed to prevent harmful outputs, they often lead to overly cautious behavior – rejecting even benign prompts – a phenomenon known as over-refusal that reduces the practical utility of T2I models. Despite over-refusal having been observed in practice, there is no large-scale benchmark that systematically evaluates this phenomenon for T2I models. In this paper, we present an automatic workflow to construct synthetic evaluation data, resulting in OVERT (OVEr-Refusal evaluation on Text-to-image models), the first large-scale benchmark for assessing over-refusal behaviors in T2I models. OVERT includes 4,600 seemingly harmful but benign prompts across nine safety-related categories, along with 1,785 genuinely harmful prompts (OVERT-unsafe) to evaluate the safety-utility trade-off. Using OVERT, we evaluate several leading T2I models and find that over-refusal is a widespread issue across various categories (Figure 1), underscoring the need for further research to enhance the safety alignment of T2I models without compromising their functionality. As a preliminary attempt to reduce over-refusal, we explore prompt rewriting; however, we find it often compromises faithfulness to the meaning of the original prompts. Finally, we demonstrate the flexibility of our generation framework in accommodating diverse safety requirements by generating customized evaluation data adapting to user-defined policies.


Article 719

Title@2025-05-28 (3): Geometry-Informed Neural Operator Transformer

Title: Geometry-Informed Neural Operator Transformer

Geometrie-informierter Neuraloperator Transformer

智能神经操作器变换器 2504.19452v3

Authors: Qibang Liu, Vincient Zhong, Hadi Meidani, Diab Abueidda, Seid Koric, Philippe Geubelle

Machine-learning-based surrogate models offer significant computational efficiency and faster simulations compared to traditional numerical methods, especially for problems requiring repeated evaluations of partial differential equations. This work introduces the Geometry-Informed Neural Operator Transformer (GINOT), which integrates the transformer architecture with the neural operator framework to enable forward predictions for arbitrary geometries. GINOT encodes the surface point cloud of a geometry using a sampling and grouping mechanism combined with an attention mechanism, ensuring invariance to point order and padding while maintaining robustness to variations in point density. The geometry information is seamlessly integrated with query points in the solution decoder through the attention mechanism. The performance of GINOT is validated on multiple challenging datasets, showcasing its high accuracy and strong generalization capabilities for complex and arbitrary 2D and 3D geometries.


Article 720

Title@2025-05-28 (3): Integrating Intermediate Layer Optimization and Projected Gradient Descent for Solving Inverse Problems with Diffusion Models

Title: Integrating Intermediate Layer Optimization and Projected Gradient Descent for Solving Inverse Problems with Diffusion Models

Integration von Intermediate Layer Optimization und projizierter Gradient Descent zur Lösung inverser Probleme mit Diffusionsmodellen

整合中间层优化和预测梯度,以解决传播模型的反向问题 2505.20789v2

Authors: Yang Zheng, Wen Li, Zhaoqiang Liu

Inverse problems (IPs) involve reconstructing signals from noisy observations. Recently, diffusion models (DMs) have emerged as a powerful framework for solving IPs, achieving remarkable reconstruction performance. However, existing DM-based methods frequently encounter issues such as heavy computational demands and suboptimal convergence. In this work, building upon the idea of the recent work DMPlug, we propose two novel methods, DMILO and DMILO-PGD, to address these challenges. Our first method, DMILO, employs intermediate layer optimization (ILO) to alleviate the memory burden inherent in DMPlug. Additionally, by introducing sparse deviations, we expand the range of DMs, enabling the exploration of underlying signals that may lie outside the range of the diffusion model. We further propose DMILO-PGD, which integrates ILO with projected gradient descent (PGD), thereby reducing the risk of suboptimal convergence. We provide an intuitive theoretical analysis of our approaches under appropriate conditions and validate their superiority through extensive experiments on diverse image datasets, encompassing both linear and nonlinear IPs. Our results demonstrate significant performance gains over state-of-the-art methods, highlighting the effectiveness of DMILO and DMILO-PGD in addressing common challenges in DM-based IP solvers.


Article 721

Title@2025-05-28 (3): Combinatorial Reinforcement Learning with Preference Feedback

Title: Combinatorial Reinforcement Learning with Preference Feedback

Kombinatorisches Stärkungslernen mit Präferenz-Feedback

结合强化学习与优先反馈 2502.10158v2

Authors: Joongkyu Lee, Min-hwan Oh

In this paper, we consider combinatorial reinforcement learning with preference feedback, where a learning agent sequentially offers an action (an assortment of multiple items) to a user, whose preference feedback follows a multinomial logistic (MNL) model. This framework allows us to model real-world scenarios, particularly those involving long-term user engagement, such as in recommender systems and online advertising. However, this framework faces two main challenges: (1) the unknown value of each item, unlike traditional MNL bandits that only address single-step preference feedback, and (2) the difficulty of ensuring optimism while maintaining tractable assortment selection in the combinatorial action space with unknown values. In this paper, we assume a contextual MNL preference model, where the mean utilities are linear, and the value of each item is approximated by a general function. We propose an algorithm, MNL-VQL, that addresses these challenges, making it both computationally and statistically efficient. As a special case, for linear MDPs (with the MNL preference feedback), we establish the first regret lower bound in this framework and show that MNL-VQL achieves nearly minimax-optimal regret. To the best of our knowledge, this is the first work to provide statistical guarantees in combinatorial RL with preference feedback.
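
The abstract above assumes a contextual MNL preference model with linear mean utilities. A minimal sketch of the resulting choice probabilities is given below; it assumes the common convention of an outside "no choice" option with utility 0, which the abstract does not state, and the names `theta`, `mnl_choice_probs`, and the feature vectors are hypothetical.

```python
import math

def linear_utility(theta, features):
    """Mean utility of one item: inner product of parameters and item features."""
    return sum(t * f for t, f in zip(theta, features))

def mnl_choice_probs(theta, assortment):
    """Return (per-item choice probabilities, no-choice probability).

    Assumes an outside option with utility 0, as is common in MNL bandits.
    """
    utilities = [linear_utility(theta, item) for item in assortment]
    shift = max([0.0] + utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - shift) for u in utilities]
    outside = math.exp(-shift)
    denom = outside + sum(exps)
    return [e / denom for e in exps], outside / denom

theta = (1.0, -1.0)
assortment = [(1.0, 1.0), (0.0, 0.0)]  # both items have utility 0 under theta
probs, no_choice = mnl_choice_probs(theta, assortment)
```

With all utilities equal to zero, the two items and the outside option are each chosen with probability 1/3, and the probabilities always sum to one.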


Article 722

Title@2025-05-28 (3): ReGNet: Reciprocal Space-Aware Long-Range Modeling for Crystalline Property Prediction

Title: ReGNet: Reciprocal Space-Aware Long-Range Modeling for Crystalline Property Prediction

ReGNet: Reziproke Raum-Bewusst-Langstrecken-Modellierung für kristalline Eigenschaftsvorhersage

ReGNet:水晶财产预测的对等空间-软件长距离模型模型 2502.02748v2

Authors: Jianan Nie, Peiyao Xiao, Kaiyi Ji, Peng Gao

Predicting properties of crystals from their structures is a fundamental yet challenging task in materials science. Unlike molecules, crystal structures exhibit infinite periodic arrangements of atoms, requiring methods capable of capturing both local and global information effectively. However, most current works fall short of capturing long-range interactions within periodic structures. To address this limitation, we leverage reciprocal space to efficiently encode long-range interactions with learnable filters within Fourier transforms. We introduce Reciprocal Geometry Network (ReGNet), a novel architecture that integrates geometric GNNs and reciprocal blocks to model short-range and long-range interactions, respectively. Experimental results on JARVIS, Materials Project, and MatBench demonstrate that ReGNet achieves state-of-the-art predictive accuracy across a range of crystal property prediction tasks. Additionally, we explore a model extension that employs the mixture-of-experts for multi-property prediction with promising results and high computational efficiency. These findings highlight the potential of our model as a scalable and accurate solution for crystal property prediction. The code will be released upon paper acceptance.

Article 723

Title@2025-05-28 (3): Language-Enhanced Representation Learning for Single-Cell Transcriptomics

Title: Language-Enhanced Representation Learning for Single-Cell Transcriptomics

Sprachverstärktes Repräsentationslernen für Single-Cell-Transkriptomik

单一计算机转基因学的提高语言代表性学习 2503.09427v3

Authors: Yaorui Shi, Jiaqi Yang, Changhao Nai, Sihang Li, Junfeng Fang, Xiang Wang, Zhiyuan Liu, Yang Zhang

Single-cell RNA sequencing (scRNA-seq) offers detailed insights into cellular heterogeneity. Recent advancements leverage single-cell large language models (scLLMs) for effective representation learning. These models focus exclusively on transcriptomic data, neglecting complementary biological knowledge from textual descriptions. To overcome this limitation, we propose scMMGPT, a novel multimodal framework designed for language-enhanced representation learning in single-cell transcriptomics. Unlike existing methods, scMMGPT employs robust cell representation extraction, preserving quantitative gene expression data, and introduces an innovative two-stage pre-training strategy combining discriminative precision with generative flexibility. Extensive experiments demonstrate that scMMGPT significantly outperforms unimodal and multimodal baselines across key downstream tasks, including cell annotation and clustering, and exhibits superior generalization in out-of-distribution scenarios.

Article 724

Title@2025-05-28 (3): Federated Continual Graph Learning

Title: Federated Continual Graph Learning

Föderiertes kontinuierliches Graphenlernen

联邦连续图学习 2411.18919v3

Authors: Yinlin Zhu, Miao Hu, Di Wu

Managing evolving graph data presents substantial challenges in storage and privacy, and training graph neural networks (GNNs) on such data often leads to catastrophic forgetting, impairing performance on earlier tasks. Despite existing continual graph learning (CGL) methods mitigating this to some extent, they rely on centralized architectures and ignore the potential of distributed graph databases to leverage collective intelligence. To this end, we propose Federated Continual Graph Learning (FCGL) to adapt GNNs across multiple evolving graphs under storage and privacy constraints. Our empirical study highlights two core challenges: local graph forgetting (LGF), where clients lose prior knowledge when adapting to new tasks, and global expertise conflict (GEC), where the global GNN exhibits sub-optimal performance in both adapting to new tasks and retaining old ones, arising from inconsistent client expertise during server-side parameter aggregation. To address these, we introduce POWER, a framework that preserves experience nodes with maximum local-global coverage locally to mitigate LGF, and leverages pseudo-prototype reconstruction with trajectory-aware knowledge transfer to resolve GEC. Experiments on various graph datasets demonstrate POWER’s superiority over federated adaptations of CGL baselines and vision-centric federated continual learning approaches.

Article 725

Title@2025-05-28 (3): Towards Large Reasoning Models for Agriculture

Title: Towards Large Reasoning Models for Agriculture

Auf dem Weg zu groß angelegten Konzepten für die Landwirtschaft

争取实现农业大理由解释模式 2505.19259v2

Authors: Hossein Zaremehrjerdi, Shreyan Ganguly, Ashlyn Rairdin, Elizabeth Tranel, Benjamin Feuer, Juan Ignacio Di Salvo, Srikanth Panthulugiri, Hernan Torres Pacin, Victoria Moser, Sarah Jones, Joscif G Raigne, Yanben Shen, Heidi M. Dornath, Aditya Balu, Adarsh Krishnamurthy, Asheesh K Singh, Arti Singh, Baskar Ganapathysubramanian, Chinmay Hegde, Soumik Sarkar

Agricultural decision-making involves complex, context-specific reasoning, where choices about crops, practices, and interventions depend heavily on geographic, climatic, and economic conditions. Traditional large language models (LLMs) often fall short in navigating this nuanced problem due to limited reasoning capacity. We hypothesize that recent advances in large reasoning models (LRMs) can better handle such structured, domain-specific inference. To investigate this, we introduce AgReason, the first expert-curated open-ended science benchmark with 100 questions for agricultural reasoning. Evaluations across thirteen open-source and proprietary models reveal that LRMs outperform conventional ones, though notable challenges persist, with the strongest Gemini-based baseline achieving 36% accuracy. We also present AgThoughts, a large-scale dataset of 44.6K question-answer pairs generated with human oversight and equipped with synthetically generated reasoning traces. Using AgThoughts, we develop AgThinker, a suite of small reasoning models that can be run on consumer-grade GPUs, and show that our dataset can be effective in unlocking agricultural reasoning abilities in LLMs. Our project page is here: https://baskargroup.github.io/Ag_reasoning/

Article 726

Title@2025-05-28 (3): Compressing Sine-Activated Low-Rank Adapters through Post-Training Quantization

Title: Compressing Sine-Activated Low-Rank Adapters through Post-Training Quantization

Komprimierende Sine-Activated Low-Rank-Adapter durch Quantisierung nach dem Training

通过培训后定量化压缩松状活动低Rank适应器 2505.21895v1

Authors: Cameron Gordon, Yiping Ji, Hemanth Saratchandran, Paul Albert, Simon Lucey

Low-Rank Adaptation (LoRA) has become a standard approach for parameter-efficient fine-tuning, offering substantial reductions in trainable parameters by modeling updates as the product of two low-rank matrices. While effective, the low-rank constraint inherently limits representational capacity, often resulting in reduced performance compared to full-rank fine-tuning. Recent work by Ji et al. (2025) has addressed this limitation by applying a fixed-frequency sinusoidal transformation to low-rank adapters, increasing their stable rank without introducing additional parameters. This raises a crucial question: can the same sine-activated technique be successfully applied within the context of Post-Training Quantization to retain benefits even after model compression? In this paper, we investigate this question by extending the sinusoidal transformation framework to quantized LoRA adapters. We develop a theoretical analysis showing that the stable rank of a quantized adapter is tightly linked to that of its full-precision counterpart, motivating the use of such rank-enhancing functions even under quantization. Our results demonstrate that the expressivity gains from a sinusoidal non-linearity persist after quantization, yielding highly compressed adapters with negligible loss in performance. We validate our approach across a range of fine-tuning tasks for language, vision, and text-to-image generation, achieving significant memory savings while maintaining competitive accuracy.
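
To make the stable-rank claim concrete, here is a minimal sketch comparing a rank-1 product with its elementwise sine transform. It uses the standard definition, stable rank = squared Frobenius norm divided by squared spectral norm. The frequency `omega`, the matrix sizes, and the absence of any scaling factor are assumptions for illustration and are not taken from the paper.

```python
import math
import random

def stable_rank(a, iters=200):
    """Stable rank ||a||_F^2 / ||a||_2^2, with ||a||_2 estimated by power iteration."""
    rows, cols = len(a), len(a[0])
    frob_sq = sum(x * x for row in a for x in row)
    v = [1.0 / math.sqrt(cols)] * cols  # unit-norm starting vector
    for _ in range(iters):
        av = [sum(row[j] * v[j] for j in range(cols)) for row in a]           # a @ v
        w = [sum(a[i][j] * av[i] for i in range(rows)) for j in range(cols)]  # a.T @ (a @ v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = [sum(row[j] * v[j] for j in range(cols)) for row in a]
    spec_sq = sum(x * x for x in av)  # Rayleigh quotient, approaches sigma_max^2
    return frob_sq / spec_sq

rng = random.Random(0)
m, n, omega = 8, 8, 30.0  # omega: hypothetical fixed frequency
u = [rng.uniform(-1.0, 1.0) for _ in range(m)]  # rank-1 factors standing in for
v = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # the two low-rank LoRA matrices
low_rank = [[ui * vj for vj in v] for ui in u]
sine_act = [[math.sin(omega * x) for x in row] for row in low_rank]

sr_low = stable_rank(low_rank)   # a rank-1 matrix has stable rank 1
sr_sine = stable_rank(sine_act)  # the sine transform spreads energy over more directions
```

Both matrices are built from the same `m + n` numbers, which is the sense in which the transformation raises stable rank without adding parameters.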

Article 727

Title@2025-05-28 (3): SDPO: Importance-Sampled Direct Preference Optimization for Stable Diffusion Training

Title: SDPO: Importance-Sampled Direct Preference Optimization for Stable Diffusion Training

SDPO: Importance-Sampled Direct Preference Optimierung für stabile Diffusionsschulungen

SDPO: 稳定传播培训的重要性抽样直接优惠优化 2505.21893v1

Authors: Xiaomeng Yang, Zhiyu Tan, Junyan Wang, Zhijian Zhou, Hao Li

Preference learning has become a central technique for aligning generative models with human expectations. Recently, it has been extended to diffusion models through methods like Direct Preference Optimization (DPO). However, existing approaches such as Diffusion-DPO suffer from two key challenges: timestep-dependent instability, caused by a mismatch between the reverse and forward diffusion processes and by high gradient variance in early noisy timesteps, and off-policy bias arising from the mismatch between optimization and data collection policies. We begin by analyzing the reverse diffusion trajectory and observe that instability primarily occurs at early timesteps with low importance weights. To address these issues, we first propose DPO-C&M, a practical strategy that improves stability by clipping and masking uninformative timesteps while partially mitigating off-policy bias. Building on this, we introduce SDPO (Importance-Sampled Direct Preference Optimization), a principled framework that incorporates importance sampling into the objective to fully correct for off-policy bias and emphasize informative updates during the diffusion process. Experiments on CogVideoX-2B, CogVideoX-5B, and Wan2.1-1.3B demonstrate that both methods outperform standard Diffusion-DPO, with SDPO achieving superior VBench scores, human preference alignment, and training robustness. These results highlight the importance of timestep-aware, distribution-corrected optimization in diffusion-based preference learning.

Article 728

Title@2025-05-28 (3): ControlTac: Force- and Position-Controlled Tactile Data Augmentation with a Single Reference Image

Title: ControlTac: Force- and Position-Controlled Tactile Data Augmentation with a Single Reference Image

ControlTac: Kraft- und positionsgesteuerte taktile Datenvergrößerung mit einem einzigen Referenzbild

控制塔克: 带有单一参考图像的力控和位置控轨迹数据增强 2505.20498v2

Authors: Dongyu Luo, Kelin Yu, Amir-Hossein Shahidzadeh, Cornelia Fermüller, Yiannis Aloimonos, Ruohan Gao

Vision-based tactile sensing has been widely used in perception, reconstruction, and robotic manipulation. However, collecting large-scale tactile data remains costly due to the localized nature of sensor-object interactions and inconsistencies across sensor instances. Existing approaches to scaling tactile data, such as simulation and free-form tactile generation, often suffer from unrealistic output and poor transferability to downstream tasks. To address this, we propose ControlTac, a two-stage controllable framework that generates realistic tactile images conditioned on a single reference tactile image, contact force, and contact position. With those physical priors as control input, ControlTac generates physically plausible and varied tactile images that can be used for effective data augmentation. Through experiments on three downstream tasks, we demonstrate that ControlTac can effectively augment tactile datasets and lead to consistent gains. Our three real-world experiments further validate the practical utility of our approach. Project page: https://dongyuluo.github.io/controltac.

Article 729

Title@2025-05-28 (3): Almost Linear Convergence under Minimal Score Assumptions: Quantized Transition Diffusion

Title: Almost Linear Convergence under Minimal Score Assumptions: Quantized Transition Diffusion

Fast lineare Konvergenz unter Minimal-Score Annahmen: Quantisierte Transition Diffusion

在最低分数假设下几乎线性聚合:量化过渡扩散 2505.21892v1

Authors: Xunpeng Huang, Yingyu Lin, Nikki Lijing Kuang, Hanze Dong, Difan Zou, Yian Ma, Tong Zhang

Continuous diffusion models have demonstrated remarkable performance in data generation across various domains, yet their efficiency remains constrained by two critical limitations: (1) the local adjacency structure of the forward Markov process, which restricts long-range transitions in the data space, and (2) inherent biases introduced during the simulation of time-inhomogeneous reverse denoising processes. To address these challenges, we propose Quantized Transition Diffusion (QTD), a novel approach that integrates data quantization with discrete diffusion dynamics. Our method first transforms the continuous data distribution $p_*$ into a discrete one $q_*$ via histogram approximation and binary encoding, enabling efficient representation in a structured discrete latent space. We then design a continuous-time Markov chain (CTMC) with Hamming distance-based transitions as the forward process, which inherently supports long-range movements in the original data space. For reverse-time sampling, we introduce a truncated uniformization technique to simulate the reverse CTMC, which can provably provide unbiased generation from $q_*$ under minimal score assumptions. Through a novel KL dynamic analysis of the reverse CTMC, we prove that QTD can generate samples with $O(d\ln^2(d/\epsilon))$ score evaluations in expectation to approximate the $d$-dimensional target distribution $p_*$ within an $\epsilon$ error tolerance. Our method not only establishes state-of-the-art inference efficiency but also advances the theoretical foundations of diffusion-based generative modeling by unifying discrete and continuous diffusion paradigms.
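
The link between binary encoding and long-range movement can be seen in a small sketch. It assumes scalar data in [0, 1), equal-width histogram bins, and plain binary codes; these choices and the helper names are hypothetical, and the paper's actual quantization may differ.

```python
def quantize_to_bits(x, n_bits):
    """Map x in [0, 1) to the binary code of its equal-width histogram bin."""
    n_bins = 2 ** n_bits
    index = min(int(x * n_bins), n_bins - 1)  # clamp guards against x rounding up to 1.0
    return format(index, f"0{n_bits}b")

def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(ca != cb for ca, cb in zip(a, b))

near_zero = quantize_to_bits(0.03, 4)  # bin 0 -> "0000"
near_half = quantize_to_bits(0.53, 4)  # bin 8 -> "1000"
one_flip = hamming(near_zero, near_half)
```

The two points are about 0.5 apart in data space, yet their codes differ in a single bit, so a transition that flips one bit can cover a distance that a locally adjacent process would need many steps to reach.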

Article 730

Title@2025-05-28 (3): Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models

Title: Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models

Auf dem Weg zu robuster automatisierter Wahrnehmungsqualitätsbewertung mit Sprachstiftungsmodellen

以语音基金会模式进行强有力的自主声音质量评估 2505.21356v2

Authors: Whenty Ariyanti, Kuan-Yu Chen, Sabato Marco Siniscalchi, Hsin-Min Wang, Yu Tsao

Perceptual voice quality assessment is essential for diagnosing and monitoring voice disorders. Traditionally, expert raters use scales such as the CAPE-V and GRBAS. However, these are subjective and prone to inter-rater variability, motivating the need for automated, objective assessment methods. This study proposes VOQANet, a deep learning framework with an attention mechanism that leverages a Speech Foundation Model (SFM) to extract high-level acoustic and prosodic information from raw speech. To improve robustness and interpretability, we introduce VOQANet+, which integrates handcrafted acoustic features such as jitter, shimmer, and harmonics-to-noise ratio (HNR) with SFM embeddings into a hybrid representation. Unlike prior work focusing only on vowel-based phonation (PVQD-A subset) from the Perceptual Voice Quality Dataset (PVQD), we evaluate our models on both vowel-based and sentence-level speech (PVQD-S subset) for better generalizability. Results show that sentence-based input outperforms vowel-based input, particularly at the patient level, highlighting the benefit of longer utterances for capturing voice attributes. VOQANet consistently surpasses baseline methods in root mean squared error and Pearson correlation across CAPE-V and GRBAS dimensions, with VOQANet+ achieving further improvements. Additional tests under noisy conditions show that VOQANet+ maintains high prediction accuracy, supporting its use in real-world and telehealth settings. These findings demonstrate the value of combining SFM embeddings with domain-informed acoustic features for interpretable and robust voice quality assessment.

Article 731

Title@2025-05-28 (3): Symbolic Foundation Regressor on Complex Networks

Title: Symbolic Foundation Regressor on Complex Networks

Symbolischer Foundation-Regressor auf komplexen Netzwerken

复杂网络上的反射器 2505.21879v1

Authors: Weiting Liu, Jiaxu Cui, Jiao Hu, En Wang, Bo Yang

In science, we are interested not only in forecasting but also in understanding how predictions are made, specifically what the interpretable underlying model looks like. Data-driven machine learning technology can significantly streamline the complex and time-consuming traditional manual process of discovering scientific laws, helping us gain insights into fundamental issues in modern science. In this work, we introduce a pre-trained symbolic foundation regressor that can effectively compress complex data with numerous interacting variables while producing interpretable physical representations. Our model has been rigorously tested on non-network symbolic regression, symbolic regression on complex networks, and the inference of network dynamics across various domains, including physics, biochemistry, ecology, and epidemiology. The results indicate a remarkable improvement in equation inference efficiency, being three times more effective than baseline approaches while maintaining accurate predictions. Furthermore, we apply our model to uncover more intuitive laws of interaction transmission from global epidemic outbreak data, achieving optimal data fitting. This model extends the application boundary of pre-trained symbolic regression models to complex networks, and we believe it provides a foundational solution for revealing the hidden mechanisms behind changes in complex phenomena, enhancing interpretability, and inspiring further scientific discoveries.

Article 732

Title@2025-05-28 (3): Hybrid Batch Normalisation: Resolving the Dilemma of Batch Normalisation in Federated Learning

Title: Hybrid Batch Normalisation: Resolving the Dilemma of Batch Normalisation in Federated Learning

Hybride Batch-Normalisierung: Lösung des Dilemmas der Batch-Normalisierung im Federated Learning

混合批次正常化:解决联邦学习中批次正常化的难题 2505.21877v1

Authors: Hongyao Chen, Tianyang Xu, Xiaojun Wu, Josef Kittler

Batch Normalisation (BN) is widely used in conventional deep neural network training to harmonise the input-output distributions for each batch of data. However, federated learning, a distributed learning paradigm, faces the challenge of dealing with non-independent and identically distributed data among the client nodes. Due to the lack of a coherent methodology for updating BN statistical parameters, standard BN degrades the federated learning performance. To this end, it is urgent to explore an alternative normalisation solution for federated learning. In this work, we resolve the dilemma of the BN layer in federated learning by developing a customised normalisation approach, Hybrid Batch Normalisation (HBN). HBN separates the update of statistical parameters (i.e., means and variances used for evaluation) from that of learnable parameters (i.e., parameters that require gradient updates), obtaining unbiased estimates of global statistical parameters in distributed scenarios. In contrast with the existing solutions, we emphasise the supportive power of global statistics for federated learning. The HBN layer introduces a learnable hybrid distribution factor, allowing each computing node to adaptively mix the statistical parameters of the current batch with the global statistics. Our HBN can serve as a powerful plugin to advance federated learning performance. It reflects promising merits across a wide range of federated learning settings, especially for small batch sizes and heterogeneous data.
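
One way to picture mixing batch statistics with global statistics is a convex combination controlled by a single factor. The sketch below assumes exactly that form; the function name, the scalar `mix` argument, and the 1-D batch are hypothetical, and the abstract does not state the actual mixing rule or how the factor is learned.

```python
import math

def hybrid_normalise(batch, global_mean, global_var, mix, eps=1e-5):
    """Normalise a 1-D batch with statistics mixed between batch and global estimates.

    mix in [0, 1]: 1.0 uses only batch statistics, 0.0 only global statistics.
    In a trained layer, mix would be a learnable parameter (assumption).
    """
    n = len(batch)
    batch_mean = sum(batch) / n
    batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
    mean = mix * batch_mean + (1.0 - mix) * global_mean
    var = mix * batch_var + (1.0 - mix) * global_var
    return [(x - mean) / math.sqrt(var + eps) for x in batch]

batch = [1.0, 2.0, 3.0]
only_global = hybrid_normalise(batch, 0.0, 1.0, mix=0.0, eps=0.0)  # unchanged by N(0, 1) stats
only_batch = hybrid_normalise(batch, 0.0, 1.0, mix=1.0)            # ordinary batch normalisation
```

With a small or skewed local batch, a value of `mix` closer to 0 leans on the global estimates, which matches the abstract's emphasis on small batch sizes and heterogeneous data.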

Article 733

Title@2025-05-28 (3): Targeted Unlearning Using Perturbed Sign Gradient Methods With Applications On Medical Images

Title: Targeted Unlearning Using Perturbed Sign Gradient Methods With Applications On Medical Images

Gezieltes Lernen mit gestörten Zeichen Gradient Methoden mit Anwendungen auf medizinischen Bildern

采用固定信号渐进方法,在医学图像上应用医学图象,有针对性地取消学习 2505.21872v1

Authors: George R. Nahass, Zhu Wang, Homa Rashidisabet, Won Hwa Kim, Sasha Hubschman, Jeffrey C. Peterson, Ghasem Yazdanpanah, Chad A. Purnell, Pete Setabutr, Ann Q. Tran, Darvin Yi, Sathya N. Ravi

Machine unlearning aims to remove the influence of specific training samples from a trained model without full retraining. While prior work has largely focused on privacy-motivated settings, we recast unlearning as a general-purpose tool for post-deployment model revision. Specifically, we focus on utilizing unlearning in clinical contexts where data shifts, device deprecation, and policy changes are common. To this end, we propose a bilevel optimization formulation of boundary-based unlearning that can be solved using iterative algorithms. We provide convergence guarantees when first-order algorithms are used to unlearn. Our method introduces tunable loss design for controlling the forgetting-retention tradeoff and supports novel model composition strategies that merge the strengths of distinct unlearning runs. Across benchmark and real-world clinical imaging datasets, our approach outperforms baselines on both forgetting and retention metrics, including scenarios involving imaging devices and anatomical outliers. This work establishes machine unlearning as a modular, practical alternative to retraining for real-world model maintenance in clinical applications.

Article 734

Title@2025-05-28 (3): Coarse-to-fine Q-Network with Action Sequence for Data-Efficient Robot Learning

Title: Coarse-to-fine Q-Network with Action Sequence for Data-Efficient Robot Learning

Coarse-to-fine Q-Network mit Aktionssequenz für dateneffizientes Roboterlernen

Coarse 至 fine Q 网络与数据效率机器人学习行动序列 2411.12155v4

Authors: Younggyo Seo, Pieter Abbeel

Predicting a sequence of actions has been crucial in the success of recent behavior cloning algorithms in robotics. Can similar ideas improve reinforcement learning (RL)? We answer affirmatively by observing that incorporating action sequences when predicting ground-truth return-to-go leads to lower validation loss. Motivated by this, we introduce Coarse-to-fine Q-Network with Action Sequence (CQN-AS), a novel value-based RL algorithm that learns a critic network that outputs Q-values over a sequence of actions, i.e., explicitly training the value function to learn the consequence of executing action sequences. Our experiments show that CQN-AS outperforms several baselines on a variety of sparse-reward humanoid control and tabletop manipulation tasks from BiGym and RLBench.

Article 735

Title@2025-05-28 (3): Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures

Title: Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures

Mini-Batch Coresets für speichereffiziente Sprachmodellschulungen auf Datenmischungen

记忆效率语言数据混合模型培训微型批量核心数据集 2407.19580v4

Authors: Dang Nguyen, Wenhan Yang, Rathul Anand, Yu Yang, Baharan Mirzasoleiman

Training with larger mini-batches improves the convergence rate and can yield superior performance. However, training with large mini-batches becomes prohibitive for Large Language Models (LLMs), due to the large GPU memory requirement. To address this problem, an effective approach is finding small mini-batch coresets that closely match the gradient of larger mini-batches. However, this approach becomes infeasible and ineffective for LLMs, due to the highly imbalanced mixture of sources in language data, use of the Adam optimizer, and the very large gradient dimensionality of LLMs. In this work, we address the above challenges by proposing Coresets for Training LLMs (CoLM). First, we show that mini-batch coresets found by gradient matching do not contain representative examples of the small sources w.h.p., and thus including all examples of the small sources in the mini-batch coresets is crucial for optimal performance. Second, we normalize the gradients by their historical exponential to find mini-batch coresets for training with Adam. Finally, we leverage zeroth-order methods to find a smooth gradient of the last V-projection matrix and sparsify it to keep the dimensions with the largest normalized gradient magnitude. We apply CoLM to fine-tuning Phi-2, Phi-3, Zephyr, and Llama-3 models with LoRA on the MathInstruct and SuperGLUE benchmarks. Remarkably, CoLM reduces the memory requirement of fine-tuning by 2x and even outperforms training with 4x larger mini-batches. Moreover, CoLM seamlessly integrates with existing memory-efficient training methods like LoRA, further reducing the memory requirements of training LLMs. Our code is available at https://github.com/BigML-CS-UCLA/CoLM.

Article 736

Title@2025-05-28 (3): Revisiting Bayesian Model Averaging in the Era of Foundation Models

Title: Revisiting Bayesian Model Averaging in the Era of Foundation Models

Bayesianisches Modell im Zeitalter der Gründungsmodelle neu besuchen

重新审查基金会模式时代的贝耶斯模式 2505.21857v1

Authors: Mijung Park

We revisit the classical, full-fledged Bayesian model averaging (BMA) paradigm to ensemble pre-trained and/or lightly-finetuned foundation models to enhance the classification performance on image and text data. To make BMA tractable under foundation models, we introduce trainable linear classifiers that take frozen features from the pre-trained foundation models as inputs. The model posteriors over the linear classifiers tell us which linear heads and frozen features are better suited for a given dataset, resulting in a principled model ensembling method. Furthermore, we propose a computationally cheaper, optimizable model averaging scheme (OMA). In OMA, we directly optimize the model ensemble weights, just like those weights based on model posterior distributions in BMA, by reducing the amount of surprise (expected entropy of the predictions) we get from predictions of ensembled models. With the rapid development of foundation models, these approaches will enable the incorporation of future, possibly significantly better foundation models to enhance the performance of challenging classification tasks.
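
The ensembling step of BMA can be sketched in a few lines: each candidate model receives a posterior weight proportional to its prior times its marginal likelihood, and predictions are averaged with those weights. The inputs below are made-up numbers, the function names are hypothetical, and how the paper computes the evidence of each linear head is not covered here.

```python
import math

def posterior_weights(log_evidences, log_priors=None):
    """Posterior model probabilities from log marginal likelihoods (uniform prior by default)."""
    if log_priors is None:
        log_priors = [0.0] * len(log_evidences)
    scores = [e + p for e, p in zip(log_evidences, log_priors)]
    top = max(scores)  # subtract the maximum before exponentiating for stability
    unnorm = [math.exp(s - top) for s in scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def bma_predict(weights, model_probs):
    """Weighted average of per-model class-probability vectors."""
    n_classes = len(model_probs[0])
    return [sum(w * probs[c] for w, probs in zip(weights, model_probs))
            for c in range(n_classes)]

# Model 2 has three times the evidence of model 1, so it gets weight 0.75.
weights = posterior_weights([0.0, math.log(3.0)])
averaged = bma_predict(weights, [[1.0, 0.0], [0.0, 1.0]])
```

OMA, as described in the abstract, would instead treat `weights` as free parameters and optimize them directly against the expected entropy of the ensembled predictions.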

Article 737

Title@2025-05-28 (3): Meta Co-Training: Two Views are Better than One

Title: Meta Co-Training: Two Views are Better than One

Meta Co-Training: Zwei Ansichten sind besser als eine

Meta联合培训:两种观点比一种观点更好 2311.18083v5

Authors: Jay C. Rothenberger, Dimitrios I. Diochnos

In many critical computer vision scenarios, unlabeled data is plentiful, but labels are scarce and difficult to obtain. As a result, semi-supervised learning, which leverages unlabeled data to boost the performance of supervised classifiers, has received significant attention in recent literature. One representative class of semi-supervised algorithms is co-training algorithms. Co-training algorithms leverage two different models which have access to different independent and sufficient representations or “views” of the data to jointly make better predictions. Each of these models creates pseudo-labels on unlabeled points which are used to improve the other model. We show that in the common case where independent views are not available, we can construct such views inexpensively using pre-trained models. Co-training on the constructed views yields a performance improvement over any of the individual views we construct and performance comparable with recent approaches in semi-supervised learning. We present Meta Co-Training, a novel semi-supervised learning algorithm, which has two advantages over co-training: (i) learning is more robust when there is a large discrepancy between the information content of the different views, and (ii) it does not require retraining from scratch on each iteration. Our method achieves new state-of-the-art performance on ImageNet-10%, achieving a ~4.7% reduction in error rate over prior work. Our method also outperforms prior semi-supervised work on several other fine-grained image classification datasets.

Article 738

Title@2025-05-28 (3): Investigating the effectiveness of multimodal data in forecasting SARS-COV-2 case surges

Title: Investigating the effectiveness of multimodal data in forecasting SARS-COV-2 case surges

Untersuchung der Wirksamkeit multimodaler Daten bei der Prognose von SARS-COV-2-Fallfluten

调查多式联运数据在预测SARS-COV-2案件激增方面的有效性 2505.22688v1

Authors: Palur Venkata Raghuvamsi, Siyuan Brandon Loh, Prasanta Bhattacharya, Joses Ho, Raphael Lee Tze Chuen, Alvin X. Han, Sebastian Maurer-Stroh

The COVID-19 pandemic response relied heavily on statistical and machine learning models to predict key outcomes such as case prevalence and fatality rates. These predictions were instrumental in enabling timely public health interventions that helped break transmission cycles. While most existing models are grounded in traditional epidemiological data, the potential of alternative datasets, such as those derived from genomic information and human behavior, remains underexplored. In the current study, we investigated the usefulness of diverse modalities of feature sets in predicting case surges. Our results highlight the relative effectiveness of biological (e.g., mutations), public health (e.g., case counts, policy interventions) and human behavioral features (e.g., mobility and social media conversations) in predicting country-level case surges. Importantly, we uncover considerable heterogeneity in predictive performance across countries and feature modalities, suggesting that surge prediction models may need to be tailored to specific national contexts and pandemic phases. Overall, our work highlights the value of integrating alternative data sources into existing disease surveillance frameworks to enhance the prediction of pandemic dynamics.

Article 739

Title@2025-05-28 (3): Multi-Label Bayesian Active Learning with Inter-Label Relationships

Title: Multi-Label Bayesian Active Learning with Inter-Label Relationships

Multi-Label Bayesian Aktives Lernen mit inter-Label Beziehungen

多标签贝耶斯人积极学习与跨标签关系 2411.17941v2

Authors: Yuanyuan Qi, Jueqing Lu, Xiaohao Yang, Joanne Enticott, Lan Du

The primary challenge of multi-label active learning, which distinguishes it from multi-class active learning, lies in assessing the informativeness of an indefinite number of labels while also accounting for the inherited label correlation. Existing studies either require substantial computational resources to leverage correlations or fail to fully explore label dependencies. Additionally, real-world scenarios often require addressing intrinsic biases stemming from imbalanced data distributions. In this paper, we propose a new multi-label active learning strategy to address both challenges. Our method incorporates progressively updated positive and negative correlation matrices to capture co-occurrence and disjoint relationships within the label space of annotated samples, enabling a holistic assessment of uncertainty rather than treating labels as isolated elements. Furthermore, alongside diversity, our model employs ensemble pseudo labeling and beta scoring rules to address data imbalances. Extensive experiments on four realistic datasets demonstrate that our strategy consistently achieves more reliable and superior performance compared to several established methods.

Article 740

Title@2025-05-28 (3): Improving the Variance of Differentially Private Randomized Experiments through Clustering

Title: Improving the Variance of Differentially Private Randomized Experiments through Clustering

Verbesserung der Varianz von differenziert privaten Randomisierten Experimenten durch Clustering

通过集群化改进差异私人随机化实验的差异 2308.00957v3

Authors: Adel Javanmard, Vahab Mirrokni, Jean Pouget-Abadie

Estimating causal effects from randomized experiments is only possible if participants are willing to disclose their potentially sensitive responses. Differential privacy, a widely used framework for ensuring an algorithm's privacy guarantees, can encourage participants to share their responses without the risk of de-anonymization. However, many mechanisms achieve differential privacy by adding noise to the original dataset, which reduces the precision of causal effect estimation. This introduces a fundamental trade-off between privacy and variance when performing causal analyses on differentially private data. In this work, we propose a new differentially private mechanism, “Cluster-DP”, which leverages a given cluster structure in the data to improve the privacy-variance trade-off. While our results apply to any clustering, we demonstrate that selecting higher-quality clusters, according to a quality metric we introduce, can decrease the variance penalty without compromising privacy guarantees. Finally, we evaluate the theoretical and empirical performance of our Cluster-DP algorithm on both real and simulated data, comparing it to common baselines, including two special cases of our algorithm: its unclustered version and a uniform-prior version.

Article 741

Title@2025-05-28 (3): ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model

Title: ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model

ItDPDM: Informationstheoretisches Diskretes Poisson-Diffusionsmodell

ITDDDM:信息-理论分辨偏异Poisson传播模型 2505.05082v3

Authors: Sagnik Bhattacharya, Abhiram Gorle, Ahsan Bilal, Connor Ding, Amit Kumar Singh Yadav, Tsachy Weissman

Generative modeling of non-negative, discrete data, such as symbolic music, remains challenging due to two persistent limitations in existing methods. Firstly, many approaches rely on modeling continuous embeddings, which is suboptimal for inherently discrete data distributions. Secondly, most models optimize variational bounds rather than exact data likelihood, resulting in inaccurate likelihood estimates and degraded sampling quality. While recent diffusion-based models have addressed these issues separately, we tackle them jointly. In this work, we introduce the Information-Theoretic Discrete Poisson Diffusion Model (ItDPDM), inspired by the photon arrival process, which combines exact likelihood estimation with fully discrete-state modeling. Central to our approach is an information-theoretic Poisson Reconstruction Loss (PRL) that has a provable exact relationship with the true data likelihood. ItDPDM achieves improved likelihood and sampling performance over prior discrete and continuous diffusion models on a variety of synthetic discrete datasets. Furthermore, on real-world datasets such as symbolic music and images, ItDPDM attains superior likelihood estimates and competitive generation quality, demonstrating a proof of concept for distribution-robust discrete generative modeling.

Article 742

Title@2025-05-28 (3): Solving Empirical Bayes via Transformers

Title: Solving Empirical Bayes via Transformers

Lösen von Empirischen Buchten über Transformer

通过变换器解决实证贝贝 2502.09844v2

Authors: Anzo Teh, Mark Jabbour, Yury Polyanskiy

This work applies modern AI tools (transformers) to solving one of the oldest statistical problems: Poisson means under the empirical Bayes (Poisson-EB) setting. In Poisson-EB, a high-dimensional mean vector $\theta$ (with iid coordinates sampled from an unknown prior $\pi$) is estimated on the basis of $X=\mathrm{Poisson}(\theta)$. A transformer model is pre-trained on a set of synthetically generated pairs $(X,\theta)$ and learns to do in-context learning (ICL) by adapting to unknown $\pi$. Theoretically, we show that a sufficiently wide transformer can achieve vanishing regret with respect to an oracle estimator that knows $\pi$ as the dimension grows to infinity. Practically, we discover that even very small models (100k parameters) are able to outperform the best classical algorithm (non-parametric maximum likelihood, or NPMLE) both in runtime and validation loss, which we compute on out-of-distribution synthetic data as well as real-world datasets (NHL hockey, MLB baseball, BookCorpusOpen). Finally, by using linear probes, we confirm that the transformer’s EB estimator appears to internally work differently from either NPMLE or Robbins’ estimators.
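
For orientation, the classical Robbins estimator named at the end of the abstract can be sketched in a few lines. This is the textbook baseline, not the transformer method from the paper; the function name `robbins_estimate` and the toy counts are assumptions made for illustration. For an observed count x, the estimate is (x + 1) * N(x + 1) / N(x), where N(k) is the number of observations equal to k.

```python
from collections import Counter


def robbins_estimate(xs):
    """Robbins' empirical Bayes estimate of the Poisson mean for each observation.

    theta_hat(x) = (x + 1) * N(x + 1) / N(x), with N(k) the count of k in xs.
    """
    counts = Counter(xs)
    # counts[x] is always >= 1 here because x itself appears in xs.
    return [(x + 1) * counts.get(x + 1, 0) / counts[x] for x in xs]


# Toy data (hypothetical): two zeros, three ones, one two.
estimates = robbins_estimate([0, 0, 1, 1, 1, 2])
```

Note that the largest observed count always receives an estimate of zero, one of the known weaknesses of this estimator on small samples.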

Article 743

Title@2025-05-28 (3): Continuous Thought Machines

Title: Continuous Thought Machines

Kontinuierliche Gedankenmaschinen

连续思考机 2505.05522v3

Authors: Luke Darlow, Ciaran Regan, Sebastian Risi, Jeffrey Seely, Llion Jones

Biological brains demonstrate complex neural activity, where the timing and interplay between neurons is critical to how brains process information. Most deep learning architectures simplify neural activity by abstracting away temporal dynamics. In this paper we challenge that paradigm. By incorporating neuron-level processing and synchronization, we can effectively reintroduce neural timing as a foundational element. We present the Continuous Thought Machine (CTM), a model designed to leverage neural dynamics as its core representation. The CTM has two core innovations: (1) neuron-level temporal processing, where each neuron uses unique weight parameters to process a history of incoming signals; and (2) neural synchronization employed as a latent representation. The CTM aims to strike a balance between oversimplified neuron abstractions that improve computational efficiency, and biological realism. It operates at a level of abstraction that effectively captures essential temporal dynamics while remaining computationally tractable for deep learning. We demonstrate the CTM’s strong performance and versatility across a range of challenging tasks, including ImageNet-1K classification, solving 2D mazes, sorting, parity computation, question-answering, and RL tasks. Beyond displaying rich internal representations and offering a natural avenue for interpretation owing to its internal process, the CTM is able to perform tasks that require complex sequential reasoning. The CTM can also leverage adaptive compute, where it can stop earlier for simpler tasks, or keep computing when faced with more challenging instances. The goal of this work is to share the CTM and its associated innovations, rather than pushing for new state-of-the-art results. To that end, we believe the CTM represents a significant step toward developing more biologically plausible and powerful artificial intelligence systems.

Article 744

Title@2025-05-28 (3): Statistical Inference for Temporal Difference Learning with Linear Function Approximation

Title: Statistical Inference for Temporal Difference Learning with Linear Function Approximation

Statistische Schlussfolgerung für zeitliches Differenzlernen mit linearer Funktionsannäherung

与线性函数接近一致的时空差异学习统计推推 2410.16106v3

Authors: Weichen Wu, Gen Li, Yuting Wei, Alessandro Rinaldo

We investigate the statistical properties of Temporal Difference (TD) learning with Polyak-Ruppert averaging, arguably one of the most widely used algorithms in reinforcement learning, for the task of estimating the parameters of the optimal linear approximation to the value function. We make three significant contributions that improve the current state-of-the-art results: (i) we derive sharper high-probability convergence guarantees that depend explicitly on the asymptotic variance and hold under weaker conditions than those normally assumed; (ii) we establish refined high-dimensional Berry-Esseen bounds over the class of convex sets, achieving faster rates than those previously established in the literature; and (iii) we propose and analyze a novel, computationally efficient online plug-in estimator of the asymptotic covariance matrix. These results enable the construction of confidence regions and simultaneous confidence intervals for the linear parameters of the value function approximation, with guaranteed finite-sample coverage. We demonstrate the applicability of our theoretical findings through numerical experiments.
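
As a reminder of the algorithm being analyzed, here is a minimal sketch of TD(0) with linear function approximation and Polyak-Ruppert averaging. It illustrates the standard update only, not the paper's inference procedure; the function name `td0_polyak`, the constant step size, and the toy one-state chain are assumptions chosen for illustration.

```python
def td0_polyak(transitions, features, dim, gamma=0.9, step=0.1):
    """TD(0) with linear value approximation; returns last and averaged iterates.

    transitions: iterable of (state, reward, next_state) tuples.
    features:    function mapping a state to a list of `dim` floats.
    """
    theta = [0.0] * dim      # current TD iterate
    theta_bar = [0.0] * dim  # Polyak-Ruppert running average of the iterates
    for t, (s, r, s_next) in enumerate(transitions, start=1):
        phi, phi_next = features(s), features(s_next)
        v = sum(w * f for w, f in zip(theta, phi))
        v_next = sum(w * f for w, f in zip(theta, phi_next))
        delta = r + gamma * v_next - v  # temporal-difference error
        theta = [w + step * delta * f for w, f in zip(theta, phi)]
        theta_bar = [b + (w - b) / t for b, w in zip(theta_bar, theta)]
    return theta, theta_bar


# Toy chain (hypothetical): one state, reward 1, gamma 0.5, so the value is 2.
data = [(0, 1.0, 0)] * 2000
theta, theta_bar = td0_polyak(data, lambda s: [1.0], dim=1, gamma=0.5)
```

The averaged iterate `theta_bar` is the quantity whose fluctuations the confidence regions in the abstract would be built around.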

Article 745

Title@2025-05-28 (3): A Provable Approach for End-to-End Safe Reinforcement Learning

Title: A Provable Approach for End-to-End Safe Reinforcement Learning

Ein realistischer Ansatz für das Ende-zu-Ende sichere Stärkungslernen

最终至最终安全强化学习的可行办法 2505.21852v1

Authors: Akifumi Wachi, Kohei Miyaguchi, Takumi Tanabe, Rei Sato, Youhei Akimoto

A longstanding goal in safe reinforcement learning (RL) is a method to ensure the safety of a policy throughout the entire process, from learning to operation. However, existing safe RL paradigms inherently struggle to achieve this objective. We propose a method, called Provably Lifetime Safe RL (PLS), that integrates offline safe RL with safe policy deployment to address this challenge. Our proposed method learns a policy offline using return-conditioned supervised learning and then deploys the resulting policy while cautiously optimizing a limited set of parameters, known as target returns, using Gaussian processes (GPs). Theoretically, we justify the use of GPs by analyzing the mathematical relationship between target and actual returns. We then prove that PLS finds near-optimal target returns while guaranteeing safety with high probability. Empirically, we demonstrate that PLS outperforms baselines both in safety and reward performance, thereby achieving the longstanding goal to obtain high rewards while ensuring the safety of a policy throughout the lifetime from learning to operation.

Article 746

Title@2025-05-28 (3): Streaming Flow Policy: Simplifying diffusion/flow-matching policies by treating action trajectories as flow trajectories

Title: Streaming Flow Policy: Simplifying diffusion/flow-matching policies by treating action trajectories as flow trajectories

Streaming Flow Policy: Vereinfachende Diffusion$/$ Flow-Matching-Richtlinien durch Behandlung von Aktionsbahnen als Flow-Trajektorien

流流流流流流政策:通过将行动轨迹作为流动轨迹处理,简化以美元/美元/美元的流量匹配政策 2505.21851v1

Authors: Sunshine Jiang, Xiaolin Fang, Nicholas Roy, Tomás Lozano-Pérez, Leslie Pack Kaelbling, Siddharth Ancha

Recent advances in diffusion/flow-matching policies have enabled imitation learning of complex, multi-modal action trajectories. However, they are computationally expensive because they sample a trajectory of trajectories: a diffusion/flow trajectory of action trajectories. They discard intermediate action trajectories, and must wait for the sampling process to complete before any actions can be executed on the robot. We simplify diffusion/flow policies by treating action trajectories as flow trajectories. Instead of starting from pure noise, our algorithm samples from a narrow Gaussian around the last action. Then, it incrementally integrates a velocity field learned via flow matching to produce a sequence of actions that constitute a single trajectory. This enables actions to be streamed to the robot on-the-fly during the flow sampling process, and is well-suited for receding horizon policy execution. Despite streaming, our method retains the ability to model multi-modal behavior. We train flows that stabilize around demonstration trajectories to reduce distribution shift and improve imitation learning performance. Streaming flow policy outperforms prior methods while enabling faster policy execution and tighter sensorimotor loops for learning-based robot control. Project website: https://streaming-flow-policy.github.io/
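
The sampling loop described in the abstract can be sketched roughly as follows. This is a one-dimensional toy under stated assumptions, not the authors' implementation: the learned velocity field is replaced by a hand-written stand-in (`demo_velocity`) that tracks a hypothetical demonstration a(t) = t with a stabilizing term, and simple Euler integration is assumed.

```python
import random


def demo_velocity(action, t, gain=5.0):
    """Stand-in for a learned velocity field (hypothetical).

    Follows the demonstration a(t) = t (velocity 1) and pulls the
    action back toward it with a stabilizing gain.
    """
    return 1.0 + gain * (t - action)


def stream_actions(last_action, velocity, n_steps=10, sigma=0.01, seed=0):
    """Yield actions one at a time while integrating the velocity field."""
    rng = random.Random(seed)
    # Start from a narrow Gaussian around the last action, not pure noise.
    action = last_action + rng.gauss(0.0, sigma)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        action += dt * velocity(action, i * dt)  # one Euler step
        yield action  # can be sent to the robot immediately


actions = list(stream_actions(0.0, demo_velocity))
```

Because the function is a generator, each action is available as soon as its integration step finishes, which is the streaming property the abstract emphasizes.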

Article 747

Title@2025-05-28 (3): Spectral clustering for dependent community Hawkes process models of temporal networks

Title: Spectral clustering for dependent community Hawkes process models of temporal networks

Spektrales Clustering für abhängige Community Hawkes Prozessmodelle von zeitlichen Netzwerken

依赖依赖性社区霍克斯时间网络过程模型光谱群群群 2505.21845v1

Authors: Lingfei Zhao, Hadeel Soliman, Kevin S. Xu, Subhadeep Paul

Temporal networks observed continuously over time through timestamped relational events data are commonly encountered in application settings including online social media communications, financial transactions, and international relations. Temporal networks often exhibit community structure and strong dependence patterns among node pairs. This dependence can be modeled through mutual excitations, where an interaction event from a sender to a receiver node increases the possibility of future events among other node pairs. We provide statistical results for a class of models that we call dependent community Hawkes (DCH) models, which combine the stochastic block model with mutually exciting Hawkes processes to model community structure and dependence among node pairs, respectively. We derive a non-asymptotic upper bound on the misclustering error of spectral clustering on the event count matrix as a function of the number of nodes and communities, time duration, and the amount of dependence in the model. Our result leverages recent results on bounding an appropriate distance between a multivariate Hawkes process count vector and a Gaussian vector, along with results from random matrix theory. We also propose a DCH model that incorporates only self and reciprocal excitation along with highly scalable parameter estimation using a Generalized Method of Moments (GMM) estimator that we demonstrate to be consistent for growing network size and time duration.

Article 748

Title@2025-05-28 (3): A Physics-Informed Learning Framework to Solve the Infinite-Horizon Optimal Control Problem

Title: A Physics-Informed Learning Framework to Solve the Infinite-Horizon Optimal Control Problem

Ein physikinformiertes Lernrahmenwerk zur Lösung des Unendlichen-Horizon-Optimalen Steuerungsproblems

解决无限 – – 霍里佐最佳控制问题的物理综合学习框架 2505.21842v1

Authors: Filippos Fotiadis, Kyriakos G. Vamvoudakis

We propose a physics-informed neural networks (PINNs) framework to solve the infinite-horizon optimal control problem of nonlinear systems. In particular, since PINNs are generally able to solve a class of partial differential equations (PDEs), they can be employed to learn the value function of the infinite-horizon optimal control problem by solving the associated steady-state Hamilton-Jacobi-Bellman (HJB) equation. However, the steady-state HJB equation generally admits multiple solutions; hence, if PINNs are applied to it directly, they may end up approximating a solution that differs from the optimal value function of the problem. We tackle this by instead applying PINNs to a finite-horizon variant of the steady-state HJB that has a unique solution, and which uniformly approximates the optimal value function as the horizon increases. An algorithm to verify if the chosen horizon is large enough is also given, as well as a method to extend it – with reduced computations and robustness to approximation errors – in case it is not. Unlike many existing methods, the proposed technique works well with non-polynomial basis functions, does not require prior knowledge of a stabilizing controller, and does not perform iterative policy evaluations. Simulations are performed, which verify and clarify theoretical findings.

Article 749

Title@2025-05-28 (3): An Optimistic Algorithm for online CMDPS with Anytime Adversarial Constraints

Title: An Optimistic Algorithm for online CMDPS with Anytime Adversarial Constraints

Optimistischer Algorithmus für Online-CMDPS mit jederzeit feindlichen Einschränkungen

带有任何时间的反逆限制的在线 CMDPS 优化算法 2505.21841v1

Authors: Jiahui Zhu, Kihyun Yu, Dabeen Lee, Xin Liu, Honghao Wei

Online safe reinforcement learning (RL) plays a key role in dynamic environments, with applications in autonomous driving, robotics, and cybersecurity. The objective is to learn optimal policies that maximize rewards while satisfying safety constraints modeled by constrained Markov decision processes (CMDPs). Existing methods achieve sublinear regret under stochastic constraints but often fail in adversarial settings, where constraints are unknown, time-varying, and potentially adversarially designed. In this paper, we propose the Optimistic Mirror Descent Primal-Dual (OMDPD) algorithm, the first to address online CMDPs with anytime adversarial constraints. OMDPD achieves optimal regret $O(\sqrt{K})$ and strong constraint violation $O(\sqrt{K})$ without relying on Slater’s condition or the existence of a strictly known safe policy. We also show that access to accurate estimates of rewards and transitions can further improve these bounds. Our results offer practical guarantees for safe decision-making in adversarial environments.

Article 750

Title@2025-05-28 (3): Natural Language Reinforcement Learning

Title: Natural Language Reinforcement Learning

Natürliche Sprache Stärkung Lernen

自然语言强化学习 2411.14251v3

Authors: Xidong Feng, Bo Liu, Yan Song, Haotian Fu, Ziyu Wan, Girish A. Koushik, Zhiyuan Hu, Mengyue Yang, Ying Wen, Jun Wang

Artificial intelligence is progressing towards the “Era of Experience,” where agents are expected to learn from continuous, grounded interaction. We argue that traditional Reinforcement Learning (RL), which typically represents value as a scalar, can restrict an agent’s deep understanding of environments and hinder the active, deliberative learning crucial for navigating this new paradigm. To address the issue, we introduce Natural Language Reinforcement Learning (NLRL), a framework that extends RL principles into natural language counterparts. Central to NLRL is the Language Value Function (LVF), which redefines value as an interpretable linguistic narrative articulating the rationale behind an evaluation. NLRL further extends this concept to core RL components, including policy, the Bellman equation, and policy iteration. Leveraging recent advancements in Large Language Models (LLMs), NLRL can be practically implemented to achieve RL-like policy and value training through unsupervised environment interactions. Experiments over 4 multi-step agentic tasks demonstrate NLRL’s effectiveness, efficiency, and its potential to foster deeper understanding and more active learning strategies.

Article 751

Title@2025-05-28 (3): UniMoGen: Universal Motion Generation

Title: UniMoGen: Universal Motion Generation

UniMoGen: Universal Motion Generation

UniMoGen: 宇宙运动一代 2505.21837v1

Authors: Aliasghar Khani, Arianna Rampini, Evan Atherton, Bruno Roy

Motion generation is a cornerstone of computer graphics, animation, gaming, and robotics, enabling the creation of realistic and varied character movements. A significant limitation of existing methods is their reliance on specific skeletal structures, which restricts their versatility across different characters. To overcome this, we introduce UniMoGen, a novel UNet-based diffusion model designed for skeleton-agnostic motion generation. UniMoGen can be trained on motion data from diverse characters, such as humans and animals, without the need for a predefined maximum number of joints. By dynamically processing only the necessary joints for each character, our model achieves both skeleton agnosticism and computational efficiency. Key features of UniMoGen include controllability via style and trajectory inputs, and the ability to continue motions from past frames. We demonstrate UniMoGen’s effectiveness on the 100style dataset, where it outperforms state-of-the-art methods in diverse character motion generation. Furthermore, when trained on both the 100style and LAFAN1 datasets, which use different skeletons, UniMoGen achieves high performance and improved efficiency across both skeletons. These results highlight UniMoGen’s potential to advance motion generation by providing a flexible, efficient, and controllable solution for a wide range of character animations.

Article 752

Title@2025-05-27 (2): Inferring Traffic Models in Terminal Airspace from Flight Tracks and Procedures

Title: Inferring Traffic Models in Terminal Airspace from Flight Tracks and Procedures

Ableiten von Verkehrsmodellen im Terminal-Luftraum von Flugspuren und -verfahren

从飞行轨道和程序中推断终端航空空间的交通模式 2303.09981v3

Authors: Soyeon Jung, Amelia Hardy, Mykel J. Kochenderfer

Realistic aircraft trajectory models are useful in the design and validation of air traffic management (ATM) systems. Models of aircraft operated under instrument flight rules (IFR) require capturing the variability inherent in how aircraft follow standard flight procedures. The variability in aircraft behavior differs among flight stages. In this paper, we propose a simple probabilistic model that can learn this variability from procedural data and flight tracks collected from radar surveillance data. For each segment, we use a Gaussian mixture model to learn the deviations of aircraft trajectories from their procedures. Given new procedures, we generate synthetic trajectories by sampling a series of deviations from the Gaussian mixture model and reconstructing the aircraft trajectory using the deviations and the procedures. We extend this method to capture pairwise correlations between aircraft and show how a pairwise model can be used to generate traffic involving an arbitrary number of aircraft. We demonstrate the proposed models on the arrival tracks and procedures of the John F. Kennedy International Airport. Distributional similarity between the original and the synthetic trajectory dataset was evaluated using the Jensen-Shannon divergence between the empirical distributions of different variables and we provide qualitative analyses of the synthetic trajectories generated.
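
The evaluation metric named in the abstract, the Jensen-Shannon divergence between two empirical distributions, can be sketched as below. This is the standard definition for discrete histograms using base-2 logarithms; the function name `js_divergence` and the toy histograms are assumptions made for illustration, and the paper's binning of trajectory variables is not reproduced.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    p and q are equal-length sequences of probabilities that each sum to 1.
    The result lies in [0, 1]: 0 for identical inputs, 1 for disjoint support.
    """
    mix = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


# Hypothetical normalized histograms of one variable (for example, altitude bins).
original = [0.2, 0.5, 0.3]
synthetic = [0.25, 0.45, 0.3]
score = js_divergence(original, synthetic)
```

A score near zero would indicate that the synthetic histogram closely matches the original one for that variable.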

Article 753

Title@2025-05-27 (2): TuneComp: Joint Fine-tuning and Compression for Large Foundation Models

Title: TuneComp: Joint Fine-tuning and Compression for Large Foundation Models

TuneComp: Gemeinsame Feinabstimmung und Kompression für große Fundamentmodelle

TununComp:大型基金会模型的联合微调和压缩 2505.21835v1

Authors: Xiangyu Chen, Jing Liu, Ye Wang, Matthew Brand, Pu Wang, Toshiaki Koike-Akino

To reduce model size during post-training, compression methods, including knowledge distillation, low-rank approximation, and pruning, are often applied after fine-tuning the model. However, sequential fine-tuning and compression sacrifices performance while creating a larger-than-necessary model as an intermediate step. In this work, we aim to reduce this gap by directly constructing a smaller model guided by the downstream task. We propose to jointly fine-tune and compress the model by gradually distilling it to a pruned low-rank structure. Experiments demonstrate that joint fine-tuning and compression significantly outperforms other sequential compression methods.

Article 754

Title@2025-05-27 (2): Constrained Discrete Diffusion

Title: Constrained Discrete Diffusion

Beschränkte diskrete Diffusion

限制的分解扩散 2503.09790v2

Authors: Michael Cardei, Jacob K Christopher, Thomas Hartvigsen, Brian R. Bartoldson, Bhavya Kailkhura, Ferdinando Fioretto

Discrete diffusion models are a class of generative models that construct sequences by progressively denoising samples from a categorical noise distribution. Beyond their rapidly growing ability to generate coherent natural language, these models present a new and important opportunity to enforce sequence-level constraints, a capability that current autoregressive models cannot natively provide. This paper capitalizes on this opportunity by introducing Constrained Discrete Diffusion (CDD), a novel integration of differentiable constraint optimization within the diffusion process to ensure adherence to constraints, logic rules, or safety requirements for generated sequences. Unlike conventional text generators that often rely on post-hoc filtering or model retraining for controllable generation, CDD directly imposes constraints in the discrete diffusion sampling process, resulting in a training-free and effective approach. Experiments in toxicity-controlled text generation, property-constrained molecule design, and instruction-constrained text completion demonstrate that CDD achieves zero constraint violations across a diverse array of tasks, preserving fluency, novelty, and coherence while outperforming autoregressive and existing discrete diffusion approaches.

Article 755

Title@2025-05-27 (2): In Search of Adam’s Secret Sauce

Title: In Search of Adam’s Secret Sauce

Auf der Suche nach Adams geheimer Sauce

寻找亚当的秘密香肠 2505.21829v1

Authors: Antonio Orvieto, Robert Gower

Understanding the remarkable efficacy of Adam when training transformer-based language models has become a central research topic within the optimization community. To gain deeper insights, several simplifications of Adam have been proposed, such as the signed gradient and signed momentum methods. In this work, we conduct an extensive empirical study, training over 1,300 language models across different data configurations and scales, and compare Adam to several known simplified variants. We find that signed momentum methods are faster than SGD, but consistently underperform relative to Adam, even after careful tuning of momentum, clipping setting and learning rates. However, our analysis reveals a compelling option that preserves near-optimal performance while allowing for new insightful reformulations: constraining the Adam momentum parameters to be equal. Beyond robust performance, this choice affords new theoretical insights, highlights the “secret sauce” on top of signed momentum, and grants a precise statistical interpretation: we show that Adam in this setting implements a natural online algorithm for estimating the mean and variance of gradients, one that arises from a mean-field Gaussian variational inference perspective.
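
One way to read the equal-momentum statement is sketched below for a scalar parameter. This is an illustrative reading under stated assumptions, not the authors' derivation: with the same decay `beta` for both moments, the bias-corrected first moment acts as a running mean of the gradients and the second moment minus the squared mean acts as a running variance, so the usual Adam step can be rewritten as mean / sqrt(mean^2 + variance). The function name `adam_equal_betas` and the default hyperparameters are hypothetical.

```python
import math


def adam_equal_betas(grads, beta=0.95, lr=1e-3, eps=1e-8):
    """Scalar Adam with beta1 == beta2 == beta.

    Returns the list of parameter steps and the final running mean and
    variance estimates of the gradient stream.
    """
    m = v = 0.0
    mean = var = 0.0
    steps = []
    for t, g in enumerate(grads, start=1):
        m = beta * m + (1 - beta) * g      # first moment (running mean)
        v = beta * v + (1 - beta) * g * g  # second moment
        correction = 1 - beta ** t         # the same bias correction for both
        mean = m / correction
        # Running variance; clamped at 0 to guard against rounding error.
        var = max(v / correction - mean * mean, 0.0)
        # Same as the usual m_hat / (sqrt(v_hat) + eps), written via mean and variance.
        steps.append(-lr * mean / (math.sqrt(mean * mean + var) + eps))
    return steps, mean, var


steps, mean, var = adam_equal_betas([2.0] * 50)
```

For a constant gradient stream the variance estimate is zero and the step has magnitude close to `lr`; noisier gradients increase the variance estimate and shrink the step.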

Article 756

Title@2025-05-27 (2): Music Source Restoration

Title: Music Source Restoration

Restaurierung der Musikquelle

音乐来源恢复 2505.21827v1

Authors: Yongyi Zang, Zheqi Dai, Mark D. Plumbley, Qiuqiang Kong

We introduce Music Source Restoration (MSR), a novel task addressing the gap between idealized source separation and real-world music production. Current Music Source Separation (MSS) approaches assume mixtures are simple sums of sources, ignoring signal degradations employed during music production like equalization, compression, and reverb. MSR models mixtures as degraded sums of individually degraded sources, with the goal of recovering original, undegraded signals. Due to the lack of data for MSR, we present RawStems, a dataset annotation of 578 songs with unprocessed source signals organized into 8 primary and 17 secondary instrument groups, totaling 354.13 hours. To the best of our knowledge, RawStems is the first dataset that contains unprocessed music stems with hierarchical categories. We consider spectral filtering, dynamic range compression, harmonic distortion, reverb, and lossy codecs as possible degradations, and establish U-Former as a baseline method, demonstrating the feasibility of MSR on our dataset. We publicly release the RawStems dataset annotations, degradation simulation pipeline, training code, and pre-trained models.

Article 757

Title@2025-05-27 (2): From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization

Title: From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization

Von EduVisBench zu EduVisAgent: Ein Benchmark- und Multi-Agent-Framework für eine sinnvolle pädagogische Visualisierung

从Edu Visb bench到Edu Visbench-Edu VisbearAgender:有理性的可视化教育基准和多机构框架 2505.16832v2

Authors: Haonian Ji, Shi Qiu, Siyang Xin, Siwei Han, Zhaorun Chen, Dake Zhang, Hongyi Wang, Huaxiu Yao

While foundation models (FMs), such as diffusion models and large vision-language models (LVLMs), have been widely applied in educational contexts, their ability to generate pedagogically effective visual explanations remains limited. Most existing approaches focus primarily on textual reasoning, overlooking the critical role of structured and interpretable visualizations in supporting conceptual understanding. To better assess the visual reasoning capabilities of FMs in educational settings, we introduce EduVisBench, a multi-domain, multi-level benchmark. EduVisBench features diverse STEM problem sets requiring visually grounded solutions, along with a fine-grained evaluation rubric informed by pedagogical theory. Our empirical analysis reveals that existing models frequently struggle with the inherent challenge of decomposing complex reasoning and translating it into visual representations aligned with human cognitive processes. To address these limitations, we propose EduVisAgent, a multi-agent collaborative framework that coordinates specialized agents for instructional planning, reasoning decomposition, metacognitive prompting, and visualization design. Experimental results show that EduVisAgent substantially outperforms all baselines, achieving a 40.2% improvement and delivering more educationally aligned visualizations. EduVisBench and EduVisAgent are available at https://github.com/aiming-lab/EduVisBench and https://github.com/aiming-lab/EduVisAgent.

Article 758

Title@2025-05-27 (2): Let Me Think! A Long Chain-of-Thought Can Be Worth Exponentially Many Short Ones

Title: Let Me Think! A Long Chain-of-Thought Can Be Worth Exponentially Many Short Ones

Lassen Sie mich nachdenken! Eine lange Kette des Denkens kann es wert sein, auf jeden Fall viele kurze Menschen

让我想想吧!一个长期的思考链可能值得一试有很多短一个 2505.21825v1

Authors: Parsa Mirtaheri, Ezra Edelman, Samy Jelassi, Eran Malach, Enric Boix-Adsera

Inference-time computation has emerged as a promising scaling axis for improving large language model reasoning. However, despite yielding impressive performance, the optimal allocation of inference-time computation remains poorly understood. A central question is whether to prioritize sequential scaling (e.g., longer chains of thought) or parallel scaling (e.g., majority voting across multiple short chains of thought). In this work, we seek to illuminate the landscape of test-time scaling by demonstrating the existence of reasoning settings where sequential scaling offers an exponential advantage over parallel scaling. These settings are based on graph connectivity problems in challenging distributions of graphs. We validate our theoretical findings with comprehensive experiments across a range of language models, including models trained from scratch for graph connectivity with different chain-of-thought strategies as well as large reasoning models.

Article 759

Title@2025-05-27 (2): Unsupervised Latent Pattern Analysis for Estimating Type 2 Diabetes Risk in Undiagnosed Populations

Title: Unsupervised Latent Pattern Analysis for Estimating Type 2 Diabetes Risk in Undiagnosed Populations

Unüberwachte Latent Pattern Analyse zur Schätzung des Typ-2-Diabetes-Risikos in nicht diagnostizierten Populationen

未经监督的对未诊断的人群2型糖尿病风险估算的 2505.21824v1

Authors: Praveen Kumar, Vincent T. Metzger, Scott A. Malec

The global prevalence of diabetes, particularly type 2 diabetes mellitus (T2DM), is rapidly increasing, posing significant health and economic challenges. T2DM not only disrupts blood glucose regulation but also damages vital organs such as the heart, kidneys, eyes, nerves, and blood vessels, leading to substantial morbidity and mortality. In the US alone, the economic burden of diagnosed diabetes exceeded $400 billion in 2022. Early detection of individuals at risk is critical to mitigating these impacts. While machine learning approaches for T2DM prediction are increasingly adopted, many rely on supervised learning, which is often limited by the lack of confirmed negative cases. To address this limitation, we propose a novel unsupervised framework that integrates Non-negative Matrix Factorization (NMF) with statistical techniques to identify individuals at risk of developing T2DM. Our method identifies latent patterns of multimorbidity and polypharmacy among diagnosed T2DM patients and applies these patterns to estimate the T2DM risk in undiagnosed individuals. By leveraging data-driven insights from comorbidity and medication usage, our approach provides an interpretable and scalable solution that can assist healthcare providers in implementing timely interventions, ultimately improving patient outcomes and potentially reducing the future health and economic burden of T2DM.

Article 760

Title@2025-05-27 (2): An Innovative Data-Driven and Adaptive Reinforcement Learning Approach for Context-Aware Prescriptive Process Monitoring

Title: An Innovative Data-Driven and Adaptive Reinforcement Learning Approach for Context-Aware Prescriptive Process Monitoring

Ein innovativer datengetriebener und adaptiver Weiterbildungsansatz für die kontext-aware Prescriptive Prozessüberwachung

采用创新型数据驱动和适应性强化学习方法,用于内容软件指令程序监测 2501.10543v2

Authors: Mostafa Abbasi, Maziyar Khadivi, Maryam Ahang, Patricia Lasserre, Yves Lucet, Homayoun Najjaran

The application of artificial intelligence and machine learning in business process management has advanced significantly; however, the full potential of these technologies remains largely unexplored, primarily due to challenges related to data quality and availability. We present a novel framework called Fine-Tuned Offline Reinforcement Learning Augmented Process Sequence Optimization (FORLAPS), which aims to identify optimal execution paths in business processes by leveraging reinforcement learning enhanced with a state-dependent reward shaping mechanism, thereby enabling context-sensitive prescriptions. Additionally, to compare FORLAPS with the existing models (Permutation Feature Importance and multi-task Long Short Term Memory model), we ran experiments to evaluate its effectiveness in terms of resource savings and process time reduction. The experimental results on real-life event logs validate that FORLAPS achieves 31% savings in resource time spent and a 23% reduction in process time span. To further enhance learning, we introduce an innovative process-aware data augmentation technique that selectively increases the average estimated Q-values in sampled batches, enabling automatic fine-tuning of the reinforcement learning model. Robustness was assessed through both prefix-level and trace-level evaluations, using the Damerau-Levenshtein distance as the primary metric. Finally, the model’s adaptability across industries was further validated through diverse case studies, including healthcare treatment pathways, financial services workflows, permit applications from regulatory bodies, and operations management. In each domain, the proposed model demonstrated exceptional performance, outperforming existing state-of-the-art approaches in prescriptive decision-making and demonstrating its capability to prescribe optimal next steps and predict the best next activities within a process trace.
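
The robustness metric mentioned in the abstract, the Damerau-Levenshtein distance, can be sketched for activity traces as follows. The abstract does not say which variant is used, so this sketch assumes the restricted form (optimal string alignment), in which each adjacent transposition counts as one edit; the function name `osa_distance` and the example traces are hypothetical.

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    a and b are sequences (strings or lists of activity labels). Allowed
    edits: insertion, deletion, substitution, adjacent transposition.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


# Two hypothetical process traces that differ by swapping adjacent activities.
prescribed = ["register", "check", "approve", "notify"]
observed = ["register", "approve", "check", "notify"]
distance = osa_distance(prescribed, observed)
```

Counting a swap of two adjacent activities as a single edit is what distinguishes this metric from plain Levenshtein distance, which would count it as two.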

Article 761

Title@2025-05-27 (2): DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra

Title: DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra

DiffMS: Diffusionserzeugung von Molekülen auf Massenspektren

DiffMS: 受质量光谱约束的分子的扩散生成 2502.09571v2

Authors: Montgomery Bohde, Mrunali Manjrekar, Runzhong Wang, Shuiwang Ji, Connor W. Coley

Mass spectrometry plays a fundamental role in elucidating the structures of unknown molecules and subsequent scientific discoveries. One formulation of the structure elucidation task is the conditional de novo generation of molecular structure given a mass spectrum. Toward a more accurate and efficient scientific discovery pipeline for small molecules, we present DiffMS, a formula-restricted encoder-decoder generative network that achieves state-of-the-art performance on this task. The encoder utilizes a transformer architecture and models mass spectra domain knowledge such as peak formulae and neutral losses, and the decoder is a discrete graph diffusion model restricted by the heavy-atom composition of a known chemical formula. To develop a robust decoder that bridges latent embeddings and molecular structures, we pretrain the diffusion decoder with fingerprint-structure pairs, which are available in virtually infinite quantities, compared to structure-spectrum pairs that number in the tens of thousands. Extensive experiments on established benchmarks show that DiffMS outperforms existing models on de novo molecule generation. We provide several ablations to demonstrate the effectiveness of our diffusion and pretraining approaches and show consistent performance scaling with increasing pretraining dataset size. DiffMS code is publicly available at https://github.com/coleygroup/DiffMS.

Article 762

Title@2025-05-27 (2): Representative Language Generation

Title: Representative Language Generation

Repräsentative Sprachgenerierung

代表性语言生成 2505.21819v1

Authors: Charlotte Peale, Vinod Raman, Omer Reingold

We introduce “representative generation,” extending the theoretical framework for generation proposed by Kleinberg et al. (2024) and formalized by Li et al. (2024), to additionally address diversity and bias concerns in generative models. Our notion requires outputs of a generative model to proportionally represent groups of interest from the training data. We characterize representative uniform and non-uniform generation, introducing the “group closure dimension” as a key combinatorial quantity. For representative generation in the limit, we analyze both information-theoretic and computational aspects, demonstrating feasibility for countably infinite hypothesis classes and collections of groups under certain conditions, but proving a negative result for computability using only membership queries. This contrasts with Kleinberg et al.’s (2024) positive results for standard generation in the limit. Our findings provide a rigorous foundation for developing more diverse and representative generative models.

Article 763

Title@2025-05-27 (2): Optimizing Data Augmentation through Bayesian Model Selection

Title: Optimizing Data Augmentation through Bayesian Model Selection

Optimierung der Datenvergrößerung durch Bayesian Model Selection

通过Bayesian模式选择优化数据增加 2505.21813v1

Authors: Madi Matymov, Ba-Hien Tran, Michael Kampffmeyer, Markus Heinonen, Maurizio Filippone

Data Augmentation (DA) has become an essential tool to improve robustness and generalization of modern machine learning. However, when deciding on DA strategies it is critical to choose parameters carefully, and this can be a daunting task which is traditionally left to trial-and-error or expensive optimization based on validation performance. In this paper, we counter these limitations by proposing a novel framework for optimizing DA. In particular, we take a probabilistic view of DA, which leads to the interpretation of augmentation parameters as model (hyper)-parameters, and the optimization of the marginal likelihood with respect to these parameters as a Bayesian model selection problem. Due to its intractability, we derive a tractable Evidence Lower BOund (ELBO), which allows us to optimize augmentation parameters jointly with model parameters. We provide extensive theoretical results on variational approximation quality, generalization guarantees, invariance properties, and connections to empirical Bayes. Through experiments on computer vision tasks, we show that our approach improves calibration and yields robust performance over fixed or no augmentation. Our work provides a rigorous foundation for optimizing DA through Bayesian principles with significant potential for robust machine learning.

Article 764

Title@2025-05-27 (2): Learning Enhanced Ensemble Filters

Title: Learning Enhanced Ensemble Filters

Enhanced Ensemble Filter lernen

学习增强的组合过滤器 2504.17836v2

Authors: Eviatar Bach, Ricardo Baptista, Edoardo Calvello, Bohan Chen, Andrew Stuart

The filtering distribution in hidden Markov models evolves according to the law of a mean-field model in state-observation space. The ensemble Kalman filter (EnKF) approximates this mean-field model with an ensemble of interacting particles, employing a Gaussian ansatz for the joint distribution of the state and observation at each observation time. These methods are robust, but the Gaussian ansatz limits accuracy. This shortcoming is addressed by approximating the mean-field evolution using a novel form of neural operator taking probability distributions as input: a measure neural mapping (MNM). An MNM is used to design a novel approach to filtering, the MNM-enhanced ensemble filter (MNMEF), which is defined in both the mean-field limit and for interacting ensemble particle approximations. The ensemble approach uses empirical measures as input to the MNM and is implemented using the set transformer, which is invariant to ensemble permutation and allows for different ensemble sizes. The derivation of methods from a mean-field formulation allows a single parameterization of the algorithm to be deployed at different ensemble sizes. In practice, fine-tuning a small number of parameters for specific ensemble sizes further enhances the accuracy of the scheme. The promise of the approach is demonstrated by its superior root-mean-square-error performance relative to leading methods in filtering the Lorenz 96 and Kuramoto-Sivashinsky models.
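
For readers unfamiliar with the baseline being improved, the following is a minimal sketch of one analysis step of a stochastic (perturbed-observation) EnKF, which applies the Gaussian ansatz to the joint ensemble of states and simulated observations. It assumes a linear observation operator `H` and NumPy; the function name and the toy dimensions are illustrative, and this is the classical EnKF update, not the MNMEF method of the paper.

```python
import numpy as np


def enkf_analysis(X, y, H, R, rng):
    """One stochastic EnKF analysis step.

    X: (d, N) forecast ensemble, y: (k,) observation,
    H: (k, d) linear observation operator, R: (k, k) noise covariance.
    """
    N = X.shape[1]
    # Simulate an observation for every ensemble member.
    noise = rng.multivariate_normal(np.zeros(len(y)), R, size=N).T
    Y = H @ X + noise
    # Empirical covariances of the joint (state, observation) ensemble.
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    C_xy = Xc @ Yc.T / (N - 1)
    C_yy = Yc @ Yc.T / (N - 1)
    # Kalman gain K = C_xy C_yy^{-1}, computed via a linear solve.
    K = np.linalg.solve(C_yy, C_xy.T).T
    # Shift each member toward the actual observation.
    return X + K @ (y[:, None] - Y)


rng = np.random.default_rng(0)
X = rng.normal(size=(1, 500))  # prior ensemble centred near 0
Xa = enkf_analysis(X, np.array([2.0]), np.eye(1), np.array([[0.1]]), rng)
print(X.mean(), Xa.mean())     # the analysis mean moves toward y = 2
```

The gain here depends on the ensemble only through its first two moments, which is the accuracy limit the abstract attributes to the Gaussian ansatz.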

Article 765

Title@2025-05-27 (2): ThinkGuard: Deliberative Slow Thinking Leads to Cautious Guardrails

Title: ThinkGuard: Deliberative Slow Thinking Leads to Cautious Guardrails

ThinkGuard: Besonnenes langsames Denken führt zu voreiligen Wärtern

思考指南:慎重考虑的慢思考引领谨慎警卫车 2502.13458v2

Authors: Xiaofei Wen, Wenxuan Zhou, Wenjie Jacky Mo, Muhao Chen

Ensuring the safety of large language models (LLMs) is critical as they are deployed in real-world applications. Existing guardrails rely on rule-based filtering or single-pass classification, limiting their ability to handle nuanced safety violations. To address this, we propose ThinkGuard, a critique-augmented guardrail model that distills knowledge from high-capacity LLMs by generating structured critiques alongside safety labels. Fine-tuned on critique-augmented data, the captured deliberative thinking ability drastically enhances the guardrail’s cautiousness and interpretability. Evaluated on multiple safety benchmarks, ThinkGuard achieves the highest average F1 and AUPRC, outperforming all baselines. Compared to LLaMA Guard 3, ThinkGuard improves accuracy by 16.1% and macro F1 by 27.0%. Moreover, it surpasses label-only fine-tuned models, confirming that structured critiques enhance both classification precision and nuanced safety reasoning while maintaining computational efficiency.

Article 766

Title@2025-05-27 (2): Voice Quality Dimensions as Interpretable Primitives for Speaking Style for Atypical Speech and Affect

Title: Voice Quality Dimensions as Interpretable Primitives for Speaking Style for Atypical Speech and Affect

Sprachqualitätsdimensionen als Interpretierbare Primitive für sprechenden Stil für atypische Sprache und Affekt

语音质量方面作为非非典型演讲和影响说话风格的可解释的原始语言 2505.21809v1

Authors: Jaya Narain, Vasudha Kowtha, Colin Lea, Lauren Tooley, Dianna Yee, Vikramjit Mitra, Zifang Huang, Miquel Espi Marques, Jon Huang, Carlos Avendano, Shirley Ren

Perceptual voice quality dimensions describe key characteristics of atypical speech and other speech modulations. Here we develop and evaluate voice quality models for seven voice and speech dimensions (intelligibility, imprecise consonants, harsh voice, naturalness, monoloudness, monopitch, and breathiness). Probes were trained on the public Speech Accessibility Project (SAP) dataset with 11,184 samples from 434 speakers, using embeddings from frozen pre-trained models as features. We found that our probes had both strong performance and strong generalization across speech elicitation categories in the SAP dataset. We further validated zero-shot performance on additional datasets, encompassing unseen languages and tasks: Italian atypical speech, English atypical speech, and affective speech. The strong zero-shot performance and the interpretability of results across an array of evaluations suggest the utility of using voice quality dimensions in speaking style-related tasks.

Article 767

Title@2025-05-27 (2): Towards Operational Automated Greenhouse Gas Plume Detection

Title: Towards Operational Automated Greenhouse Gas Plume Detection

Auf dem Weg zu einer operationell automatisierten Treibhausgas-Plume-Erkennung

实现操作性自动温室气体管道探测 2505.21806v1

Authors: Brian D. Bue, Jake H. Lee, Andrew K. Thorpe, Philip G. Brodrick, Daniel Cusworth, Alana Ayasse, Vassiliki Mancoridis, Anagha Satish, Shujun Xiong, Riley Duren

Operational deployment of a fully automated greenhouse gas (GHG) plume detection system remains an elusive goal for imaging spectroscopy missions, despite recent advances in deep learning approaches. With the dramatic increase in data availability, however, automation continues to increase in importance for natural and anthropogenic emissions monitoring. This work reviews and addresses several key obstacles in the field: data and label quality control, prevention of spatiotemporal biases, and correctly aligned modeling objectives. We demonstrate through rigorous experiments using multicampaign data from airborne and spaceborne instruments that convolutional neural networks (CNNs) are able to achieve operational detection performance when these obstacles are alleviated. We demonstrate that a multitask model that learns both instance detection and pixelwise segmentation simultaneously can successfully lead towards an operational pathway. We evaluate the model’s plume detectability across emission source types and regions, identifying thresholds for operational deployment. Finally, we provide analysis-ready data, models, and source code for reproducibility, and work to define a set of best practices and validation standards to facilitate future contributions to the field.

Article 768

Title@2025-05-27 (2): From Directions to Cones: Exploring Multidimensional Representations of Propositional Facts in LLMs

Title: From Directions to Cones: Exploring Multidimensional Representations of Propositional Facts in LLMs

Von der Anfahrt zu den Cones: Erforschung multidimensionaler Darstellungen von Propositional Facts in LLMs

从方向到锥体:探索液晶中各种潜在事实的多层面代表 2505.21800v1

Authors: Stanley Yu, Vaidehi Bulusu, Oscar Yasunaga, Clayton Lau, Cole Blondin, Sean O’Brien, Kevin Zhu, Vasu Sharma

Large Language Models (LLMs) exhibit strong conversational abilities but often generate falsehoods. Prior work suggests that the truthfulness of simple propositions can be represented as a single linear direction in a model’s internal activations, but this may not fully capture its underlying geometry. In this work, we extend the concept cone framework, recently introduced for modeling refusal, to the domain of truth. We identify multi-dimensional cones that causally mediate truth-related behavior across multiple LLM families. Our results are supported by three lines of evidence: (i) causal interventions reliably flip model responses to factual statements, (ii) learned cones generalize across model architectures, and (iii) cone-based interventions preserve unrelated model behavior. These findings reveal the richer, multidirectional structure governing simple true/false propositions in LLMs and highlight concept cones as a promising tool for probing abstract behaviors.

Article 769

Title@2025-05-27 (2): PolarGrad: A Class of Matrix-Gradient Optimizers from a Unifying Preconditioning Perspective

Title: PolarGrad: A Class of Matrix-Gradient Optimizers from a Unifying Preconditioning Perspective

PolarGrad: Eine Klasse von Matrix-Gradienten-Optimierern aus einer einheitlichen Sicht der Vorkonditionierung

极地格:从统一前置角度出发的矩阵-高压优化器类别 2505.21799v1

Authors: Tim Tsz-Kit Lau, Qi Long, Weijie Su

The ever-growing scale of deep learning models and datasets underscores the critical importance of efficient optimization methods. While preconditioned gradient methods such as Adam and AdamW are the de facto optimizers for training neural networks and large language models, structure-aware preconditioned optimizers like Shampoo and Muon, which utilize the matrix structure of gradients, have demonstrated promising evidence of faster convergence. In this paper, we introduce a unifying framework for analyzing “matrix-aware” preconditioned methods, which not only sheds light on the effectiveness of Muon and related optimizers but also leads to a class of new structure-aware preconditioned methods. A key contribution of this framework is its precise distinction between preconditioning strategies that treat neural network weights as vectors (addressing curvature anisotropy) versus those that consider their matrix structure (addressing gradient anisotropy). This perspective provides new insights into several empirical phenomena in language model pre-training, including Adam’s training instabilities, Muon’s accelerated convergence, and the necessity of learning rate warmup for Adam. Building upon this framework, we introduce PolarGrad, a new class of preconditioned optimization methods based on the polar decomposition of matrix-valued gradients. As a special instance, PolarGrad includes Muon with updates scaled by the nuclear norm of the gradients. We provide numerical implementations of these methods, leveraging efficient numerical polar decomposition algorithms for enhanced convergence. Our extensive evaluations across diverse matrix optimization problems and language model pre-training tasks demonstrate that PolarGrad outperforms both Adam and Muon.
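
To make the central object concrete, here is a minimal sketch of an update built from the polar decomposition of a matrix-valued gradient and scaled by its nuclear norm, as the abstract describes. It assumes NumPy and computes the polar factor with a full SVD for clarity; the function names and learning rate are hypothetical, and the paper's own numerical polar decomposition algorithms, momentum, and other details are not reproduced here.

```python
import numpy as np


def polar_factor(G):
    """Return the orthogonal polar factor of G and its nuclear norm.

    With the thin SVD G = U diag(s) V^T, the polar factor is U V^T
    and the nuclear norm is the sum of the singular values.
    """
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return U @ Vt, s.sum()


def polar_step(W, G, lr):
    """One plain (momentum-free) step along the nuclear-norm-scaled polar factor."""
    Q, nuc = polar_factor(G)
    return W - lr * nuc * Q


rng = np.random.default_rng(0)
G = rng.normal(size=(4, 3))        # stand-in for a weight-matrix gradient
Q, nuc = polar_factor(G)
print(np.allclose(Q.T @ Q, np.eye(3)))  # columns of Q are orthonormal
print(np.isclose(np.sum(G * Q), nuc))   # <G, Q> equals the nuclear norm
```

The second check shows why the nuclear norm is a natural scale: it is exactly the inner product between the gradient and its polar factor.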

Article 770

Title@2025-05-27 (2): A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging

Title: A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging

Ein General-Purpose-Theorem für hochwahrscheinliche Grenzen stochastischer Annäherung mit Polyak Average

具有聚氨基挥动作用的斯托克相吸合高概率波断的普通用途理论 2505.21796v1

Authors: Sajad Khodadadian, Martin Zubeldia

Polyak-Ruppert averaging is a widely used technique to achieve the optimal asymptotic variance of stochastic approximation (SA) algorithms, yet its high-probability performance guarantees remain underexplored in general settings. In this paper, we present a general framework for establishing non-asymptotic concentration bounds for the error of averaged SA iterates. Our approach assumes access to individual concentration bounds for the unaveraged iterates and yields a sharp bound on the averaged iterates. We also construct an example, showing the tightness of our result up to constant multiplicative factors. As direct applications, we derive tight concentration bounds for contractive SA algorithms and for algorithms such as temporal difference learning and Q-learning with averaging, obtaining new bounds in settings where traditional analysis is challenging.
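
As a small illustration of the technique under study, the sketch below runs a scalar stochastic approximation recursion with a decaying step size and maintains the Polyak-Ruppert running average of the iterates. The target value, noise model, and step-size exponent are assumptions chosen for the example; the paper's bounds concern far more general settings.

```python
import numpy as np


def sa_with_averaging(n_steps, rng, theta_star=1.0):
    """Stochastic approximation for the root of f(theta) = theta - theta_star.

    Returns the last iterate and the Polyak-Ruppert average of all iterates.
    """
    theta, avg = 0.0, 0.0
    for k in range(1, n_steps + 1):
        step = 1.0 / k ** 0.75                    # step size decaying slower than 1/k
        noisy_f = (theta - theta_star) + rng.normal()  # noisy evaluation of f
        theta -= step * noisy_f                   # unaveraged SA update
        avg += (theta - avg) / k                  # running mean of the iterates
    return theta, avg


rng = np.random.default_rng(0)
last, averaged = sa_with_averaging(20_000, rng)
print(abs(last - 1.0), abs(averaged - 1.0))
```

In runs like this the averaged iterate typically sits much closer to the target than the last iterate, which is the behaviour that high-probability bounds of the kind described aim to quantify.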

Article 771

Title@2025-05-27 (2): End-to-End Breast Cancer Radiotherapy Planning via LMMs with Consistency Embedding

Title: End-to-End Breast Cancer Radiotherapy Planning via LMMs with Consistency Embedding

End-to-End-Brustkrebs-Radiotherapie Planung über LMMs mit Konsistenz-Embedding

通过具有一致嵌入的LMMs进行端至端乳腺癌放射治疗规划 2311.15876v4

Authors: Kwanyoung Kim, Yujin Oh, Sangjoon Park, Hwa Kyung Byun, Joongyo Lee, Jin Sung Kim, Yong Bae Kim, Jong Chul Ye

Recent advances in AI foundation models have significant potential for lightening the clinical workload by mimicking the comprehensive and multi-faceted approaches used by medical professionals. In the field of radiation oncology, the integration of multiple modalities is of great importance, so opportunities for foundation models are abundant. Inspired by this, we present RO-LMM, a multi-purpose, comprehensive large multimodal model (LMM) tailored for the field of radiation oncology. This model effectively manages a series of tasks within the clinical workflow, including clinical context summarization, radiation treatment plan suggestion, and plan-guided target volume segmentation, by leveraging the capabilities of LMMs. In particular, to perform consecutive clinical tasks without error accumulation, we present a novel Consistency Embedding Fine-Tuning (CEFTune) technique, which boosts the LMM’s robustness to noisy inputs while preserving the consistency of handling clean inputs. We further extend this concept to an LMM-driven segmentation framework, leading to a novel Consistency Embedding Segmentation (CESEG) technique. Experimental results, including multi-centre validation, confirm that RO-LMM with CEFTune and CESEG achieves promising performance on multiple clinical tasks with generalization capabilities.

Article 772

Title@2025-05-27 (2): Multimodal Federated Learning: A Survey through the Lens of Different FL Paradigms

Title: Multimodal Federated Learning: A Survey through the Lens of Different FL Paradigms

Multimodales Federated Learning: Eine Umfrage durch die Linse verschiedener FL-Paradigmen

多模式联邦学习:通过不同FL范式的镜头进行调查 2505.21792v1

Authors: Yuanzhe Peng, Jieming Bian, Lei Wang, Yin Huang, Jie Xu

Multimodal Federated Learning (MFL) lies at the intersection of two pivotal research areas: leveraging complementary information from multiple modalities to improve downstream inference performance and enabling distributed training to enhance efficiency and preserve privacy. Despite the growing interest in MFL, there is currently no comprehensive taxonomy that organizes MFL through the lens of different Federated Learning (FL) paradigms. This perspective is important because multimodal data introduces distinct challenges across various FL settings. These challenges, including modality heterogeneity, privacy heterogeneity, and communication inefficiency, are fundamentally different from those encountered in traditional unimodal or non-FL scenarios. In this paper, we systematically examine MFL within the context of three major FL paradigms: horizontal FL (HFL), vertical FL (VFL), and hybrid FL. For each paradigm, we present the problem formulation, review representative training algorithms, and highlight the most prominent challenge introduced by multimodal data in distributed settings. We also discuss open challenges and provide insights for future research. By establishing this taxonomy, we aim to uncover the novel challenges posed by multimodal data from the perspective of different FL paradigms and to offer a new lens through which to understand and advance the development of MFL.

Article 773

Title@2025-05-27 (2): LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models

Title: LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models

LV-XAttn: Verteilte Cross-Attention für lange visuelle Eingänge in multimodalen großen Sprachmodellen

LV-XAttn:多式大语言模型中长视输入分布式交叉注意 2502.02406v3

Authors: Tzu-Tao Chang, Shivaram Venkataraman

Cross-attention is commonly adopted in multimodal large language models (MLLMs) for integrating visual information into the language backbone. However, in applications with large visual inputs, such as video understanding, processing a large number of visual tokens in cross-attention layers leads to high memory demands and often necessitates distributed computation across multiple GPUs. Existing distributed attention mechanisms face significant communication overheads, making cross-attention layers a critical bottleneck for efficient training and inference of MLLMs. To address this, we propose LV-XAttn, a distributed, exact cross-attention mechanism with minimal communication overhead. We observe that in applications involving large visual inputs, the size of the query block is typically much smaller than that of the key-value blocks. Thus, in LV-XAttn we keep the large key-value blocks locally on each GPU and exchange smaller query blocks across GPUs. We also introduce an efficient activation recomputation technique to support longer visual context. We theoretically analyze the communication benefits of LV-XAttn and show that it can achieve speedups for a wide range of models. Our evaluations with Llama 3-V, mPLUG-Owl3 and OpenFlamingo models find that LV-XAttn achieves up to 10.62$\times$ end-to-end speedup compared to existing approaches.
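
The idea of keeping key-value blocks in place and moving only the small query block can be sketched in a single process. The example below is an assumption-laden simulation, not the authors' implementation: each list entry stands in for one GPU's local key-value shard, the query block "visits" the shards in turn, and partial results are merged with a running softmax (a standard online-softmax trick assumed here as the way to keep the result exact). All names are hypothetical and NumPy is assumed.

```python
import numpy as np


def full_attention(Q, K, V):
    """Reference single-device cross-attention."""
    S = Q @ K.T / np.sqrt(Q.shape[1])
    P = np.exp(S - S.max(axis=1, keepdims=True))
    return (P / P.sum(axis=1, keepdims=True)) @ V


def sharded_attention(Q, K_shards, V_shards):
    """Exact attention with K/V split into shards that never move.

    Only Q and the small running statistics (m, l, acc) travel between shards.
    """
    n_q, d = Q.shape
    m = np.full((n_q, 1), -np.inf)             # running row-wise max of scores
    l = np.zeros((n_q, 1))                     # running softmax denominator
    acc = np.zeros((n_q, V_shards[0].shape[1]))  # running weighted sum of values
    for K, V in zip(K_shards, V_shards):
        S = Q @ K.T / np.sqrt(d)
        m_new = np.maximum(m, S.max(axis=1, keepdims=True))
        scale = np.exp(m - m_new)              # rescale earlier partial results
        P = np.exp(S - m_new)
        l = l * scale + P.sum(axis=1, keepdims=True)
        acc = acc * scale + P @ V
        m = m_new
    return acc / l


rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))                    # few text queries
K = rng.normal(size=(64, 8))                   # many visual keys
V = rng.normal(size=(64, 8))
out = sharded_attention(Q, np.split(K, 4), np.split(V, 4))
print(np.allclose(out, full_attention(Q, K, V)))
```

Because the query block and its running statistics are much smaller than the key-value blocks when visual inputs are long, moving them instead of the keys and values is what reduces communication in this picture.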

Article 774

Title@2025-05-27 (2): Global Minimizers of $\ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks

Title: Global Minimizers of $\ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks

Global Minimizers of $\ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks

以美元为单位、以美元为单位、以美元为单位、以美元为单位、以目标为单位的全球最小化器 2505.21791v1

Authors: Julia Nakhleh, Robert D. Nowak

Overparameterized neural networks can interpolate a given dataset in many different ways, prompting the fundamental question: which among these solutions should we prefer, and what explicit regularization strategies will provably yield these solutions? This paper addresses the challenge of finding the sparsest interpolating ReLU network – i.e., the network with the fewest nonzero parameters or neurons – a goal with wide-ranging implications for efficiency, generalization, interpretability, theory, and model compression. Unlike post hoc pruning approaches, we propose a continuous, almost-everywhere differentiable training objective whose global minima are guaranteed to correspond to the sparsest single-hidden-layer ReLU networks that fit the data. This result marks a conceptual advance: it recasts the combinatorial problem of sparse interpolation as a smooth optimization task, potentially enabling the use of gradient-based training methods. Our objective is based on minimizing $\ell^p$ quasinorms of the weights for $0 < p < 1$, a classical sparsity-promoting strategy in finite-dimensional settings. However, applying these ideas to neural networks presents new challenges: the function class is infinite-dimensional, and the weights are learned using a highly nonconvex objective. We prove that, under our formulation, global minimizers correspond exactly to sparsest solutions. Our work lays a foundation for understanding when and how continuous sparsity-inducing objectives can be leveraged to recover sparse networks through training.
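
A two-coordinate example shows why $\ell^p$ penalties with $0 < p < 1$ favor sparsity where the $\ell^1$ norm is indifferent. The sketch below is a finite-dimensional illustration only, with a hypothetical helper name; it does not implement the paper's network objective.

```python
def lp_penalty(w, p):
    """Sum of |w_i|^p, i.e. the p-th power of the l^p quasinorm for 0 < p < 1."""
    return sum(abs(x) ** p for x in w)


sparse = [1.0, 0.0]   # one nonzero weight
dense = [0.5, 0.5]    # same l^1 norm spread over two weights

print(lp_penalty(sparse, 1.0), lp_penalty(dense, 1.0))  # equal under l^1
print(lp_penalty(sparse, 0.5), lp_penalty(dense, 0.5))  # sparse is cheaper for p = 0.5
```

With $p = 0.5$ the dense vector costs $2\sqrt{0.5} \approx 1.41$ against $1$ for the sparse one, so a minimizer of the penalty prefers concentrating mass on fewer coordinates.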

Article 775

Title@2025-05-27 (2): Faster Rates for Private Adversarial Bandits

Title: Faster Rates for Private Adversarial Bandits

Schnellere Preise für private Adversarial Bandits

私人反盗贼的速率 2505.21790v1

Authors: Hilal Asi, Vinod Raman, Kunal Talwar

We design new differentially private algorithms for the problems of adversarial bandits and bandits with expert advice. For adversarial bandits, we give a simple and efficient conversion of any non-private bandit algorithm to a private bandit algorithm. Instantiating our conversion with existing non-private bandit algorithms gives a regret upper bound of $O\left(\frac{\sqrt{KT}}{\sqrt{\epsilon}}\right)$, improving upon the existing upper bound $O\left(\frac{\sqrt{KT \log(KT)}}{\epsilon}\right)$ for all $\epsilon \leq 1$. In particular, our algorithms allow for sublinear expected regret even when $\epsilon \leq \frac{1}{\sqrt{T}}$, establishing the first known separation between central and local differential privacy for this problem. For bandits with expert advice, we give the first differentially private algorithms, with expected regret $O\left(\frac{\sqrt{NT}}{\sqrt{\epsilon}}\right), O\left(\frac{\sqrt{KT\log(N)}\log(KT)}{\epsilon}\right)$, and $\tilde{O}\left(\frac{N^{1/6}K^{1/2}T^{2/3}\log(NT)}{\epsilon ^{1/3}} + \frac{N^{1/2}\log(NT)}{\epsilon}\right)$, where $K$ and $N$ are the number of actions and experts respectively. These rates allow us to get sublinear regret for different combinations of small and large $K, N$ and $\epsilon.$
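
To see what the improved dependence on $\epsilon$ buys, the sketch below evaluates the two adversarial-bandit bounds from the abstract with all hidden constants set to 1 (an assumption, since $O(\cdot)$ notation suppresses them). The function names and the values of $K$ and $T$ are illustrative.

```python
import math


def new_bound(K, T, eps):
    """sqrt(KT) / sqrt(eps), constants dropped."""
    return math.sqrt(K * T) / math.sqrt(eps)


def prior_bound(K, T, eps):
    """sqrt(KT log(KT)) / eps, constants dropped."""
    return math.sqrt(K * T * math.log(K * T)) / eps


K, T = 10, 10**8
eps = 1 / math.sqrt(T)          # the small-epsilon regime highlighted in the abstract
print(new_bound(K, T, eps) / T)    # well below 1: regret is sublinear in T
print(prior_bound(K, T, eps) / T)  # above 1: the older bound is vacuous here
```

Analytically, at $\epsilon = T^{-1/2}$ the new bound scales as $\sqrt{K}\,T^{3/4}$, while the earlier one scales as $T\sqrt{K\log(KT)}$, which exceeds the trivial regret bound of $T$.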

Article 776

Title@2025-05-27 (2): Wanda++: Pruning Large Language Models via Regional Gradients

Title: Wanda++: Pruning Large Language Models via Regional Gradients

Wanda++: Beschneiden großer Sprachmodelle über regionale Gradienten

Wanda+++:通过区域渐变来保护大语言模式 2503.04992v3

Authors: Yifan Yang, Kai Zhen, Bhavana Ganesh, Aram Galstyan, Goeric Huybrechts, Markus Müller, Jonas M. Kübler, Rupak Vignesh Swaminathan, Athanasios Mouchtaris, Sravan Babu Bodapati, Nathan Susanj, Zheng Zhang, Jack FitzGerald, Abhishek Kumar

Large language model (LLM) pruning seeks to remove unimportant weights for inference speedup with minimal accuracy impact. However, existing methods often suffer from accuracy degradation without full-model sparsity-aware fine-tuning. This paper presents Wanda++, a novel pruning framework that outperforms the state-of-the-art methods by utilizing decoder-block-level regional gradients. Specifically, Wanda++ improves the pruning score with regional gradients for the first time and proposes an efficient regional optimization method to minimize pruning-induced output discrepancies between the dense and sparse decoder output. Notably, Wanda++ improves perplexity by up to 32% over Wanda in the language modeling task and generalizes effectively to downstream tasks. Moreover, despite updating weights with regional optimization, Wanda++ remains orthogonal to sparsity-aware fine-tuning, further reducing perplexity with LoRA to a great extent. Our approach is lightweight, pruning a 7B LLaMA model in under 10 minutes on a single H100 GPU.

Article 777

Title@2025-05-27 (2): Born a Transformer – Always a Transformer?

Title: Born a Transformer – Always a Transformer?

Geboren ein Transformer - immer ein Transformer?

天生的变形人 - - 总是变形人? 2505.21785v1

Authors: Yana Veitsman, Mayank Jobanputra, Yash Sarrof, Aleksandra Bakalova, Vera Demberg, Ellie Pavlick, Michael Hahn

Transformers have theoretical limitations in modeling certain sequence-to-sequence tasks, yet it remains largely unclear if these limitations play a role in large-scale pretrained LLMs, or whether LLMs might effectively overcome these constraints in practice due to the scale of both the models themselves and their pretraining data. We explore how these architectural constraints manifest after pretraining, by studying a family of $\textit{retrieval}$ and $\textit{copying}$ tasks inspired by Liu et al. [2024]. We use the recently proposed C-RASP framework for studying length generalization [Huang et al., 2025b] to provide guarantees for each of our settings. Empirically, we observe an $\textit{induction-versus-anti-induction}$ asymmetry, where pretrained models are better at retrieving tokens to the right (induction) rather than the left (anti-induction) of a query token. This asymmetry disappears upon targeted fine-tuning if length-generalization is guaranteed by theory. Mechanistic analysis reveals that this asymmetry is connected to the differences in the strength of induction versus anti-induction circuits within pretrained Transformers. We validate our findings through practical experiments on real-world tasks demonstrating reliability risks. Our results highlight that pretraining selectively enhances certain Transformer capabilities, but does not overcome fundamental length-generalization limits.
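
The induction versus anti-induction distinction can be stated as a pair of tiny reference functions. The sketch below is a simplified reading of the task description in the abstract: it assumes the query token occurs once in the context and is not at either end, and the function names are hypothetical. The paper's actual prompt format may differ.

```python
def induction_target(tokens, query):
    """Token immediately to the right of the query's occurrence."""
    i = tokens.index(query)
    return tokens[i + 1]


def anti_induction_target(tokens, query):
    """Token immediately to the left of the query's occurrence."""
    i = tokens.index(query)
    return tokens[i - 1]


context = ["k", "q", "z", "m", "t"]
print(induction_target(context, "z"))       # 'm'
print(anti_induction_target(context, "z"))  # 'q'
```

Both targets are equally easy to compute symbolically, which is what makes the reported asymmetry in pretrained models notable.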

Article 778

Title@2025-05-27 (2): Universal Approximation of Mean-Field Models via Transformers

Title: Universal Approximation of Mean-Field Models via Transformers

Universelle Annäherung von Mittelwert-Feld-Modellen über Transformer

通过变压器实现平均实地模型普遍接近 2410.16295v2

Authors: Shiba Biswal, Karthik Elamvazhuthi, Rishi Sonthalia

This paper investigates the use of transformers to approximate the mean-field dynamics of interacting particle systems exhibiting collective behavior. Such systems are fundamental in modeling phenomena across physics, biology, and engineering, including opinion formation, biological networks, and swarm robotics. The key characteristic of these systems is that the particles are indistinguishable, leading to permutation-equivariant dynamics. First, we empirically demonstrate that transformers are well-suited for approximating a variety of mean field models, including the Cucker-Smale model for flocking and milling, and the mean-field system for training two-layer neural networks. We validate our numerical experiments via mathematical theory. Specifically, we prove that if a finite-dimensional transformer effectively approximates the finite-dimensional vector field governing the particle system, then the $L_2$ distance between the \textit{expected transformer} and the infinite-dimensional mean-field vector field can be uniformly bounded by a function of the number of particles observed during training. Leveraging this result, we establish theoretical bounds on the distance between the true mean-field dynamics and those obtained using the transformer.

Article 779

Title@2025-05-27 (2): Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models

Title: Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models

Wasserzeichen im Sand: Unmöglichkeit der starken Wasserzeichen für generative Modelle

沙沙中的水印:在生成模型中使用强水标志的可能性 2311.04378v5

Authors: Hanlin Zhang, Benjamin L. Edelman, Danilo Francati, Daniele Venturi, Giuseppe Ateniese, Boaz Barak

Watermarking generative models consists of planting a statistical signal (watermark) in a model’s output so that it can be later verified that the output was generated by the given model. A strong watermarking scheme satisfies the property that a computationally bounded attacker cannot erase the watermark without causing significant quality degradation. In this paper, we study the (im)possibility of strong watermarking schemes. We prove that, under well-specified and natural assumptions, strong watermarking is impossible to achieve. This holds even in the private detection algorithm setting, where the watermark insertion and detection algorithms share a secret key, unknown to the attacker. To prove this result, we introduce a generic efficient watermark attack; the attacker is not required to know the private key of the scheme or even which scheme is used. Our attack is based on two assumptions: (1) The attacker has access to a “quality oracle” that can evaluate whether a candidate output is a high-quality response to a prompt, and (2) The attacker has access to a “perturbation oracle” which can modify an output with a nontrivial probability of maintaining quality, and which induces an efficiently mixing random walk on high-quality outputs. We argue that both assumptions can be satisfied in practice by an attacker with weaker computational capabilities than the watermarked model itself, to which the attacker has only black-box access. Furthermore, our assumptions will likely only be easier to satisfy over time as models grow in capabilities and modalities. We demonstrate the feasibility of our attack by instantiating it to attack three existing watermarking schemes for large language models: Kirchenbauer et al. (2023), Kuditipudi et al. (2023), and Zhao et al. (2023). The same attack successfully removes the watermarks planted by all three schemes, with only minor quality degradation.
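
The structure of the generic attack, a random walk over high-quality outputs driven by the two oracles, can be sketched as a short loop. Everything concrete below is a toy assumption: the "perturbation oracle" swaps in a random vocabulary word and the "quality oracle" accepts only synonyms of a fixed reference sentence. Real oracles would be models, and the paper's instantiation is not reproduced here.

```python
import random


def random_walk_attack(prompt, output, perturb, is_high_quality, n_steps, rng):
    """Repeatedly perturb the output, keeping only high-quality candidates."""
    current = output
    for _ in range(n_steps):
        candidate = perturb(current, rng)         # perturbation oracle
        if is_high_quality(prompt, candidate):    # quality oracle
            current = candidate                   # accept the step of the walk
    return current


# Toy stand-ins for the two oracles (hypothetical).
REFERENCE = ["the", "big", "dog", "runs", "fast"]
ALLOWED = [{"the"}, {"big", "large", "huge"}, {"dog", "hound"},
           {"runs", "sprints"}, {"fast", "quickly", "rapidly"}]
VOCAB = sorted(set().union(*ALLOWED) | {"cat", "slowly"})


def toy_perturb(words, rng):
    """Replace one randomly chosen position with a random vocabulary word."""
    new = list(words)
    new[rng.randrange(len(new))] = rng.choice(VOCAB)
    return new


def toy_quality(prompt, words):
    """Accept only if every position holds an allowed synonym."""
    return all(w in ok for w, ok in zip(words, ALLOWED))


rng = random.Random(0)
print(random_walk_attack("describe the dog", list(REFERENCE),
                         toy_perturb, toy_quality, 200, rng))
```

The attacker never needs the watermark key: after enough accepted steps the output is a quality-preserving rewrite whose relation to the original token choices has been scrambled.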

Article 780

Title@2025-05-27 (2): P-DROP: Poisson-Based Dropout for Graph Neural Networks

Title: P-DROP: Poisson-Based Dropout for Graph Neural Networks

P-DROP: Poisson-basiertes Dropout für Graphen-Neural-Netzwerke

PDROP: 石形神经网络的 Poisson-Poisson 辍学 2505.21783v1

Authors: Hyunsik Yun

Over-smoothing remains a major challenge in Graph Neural Networks (GNNs), where repeated message passing causes node representations to converge and lose discriminative power. To address this, we propose a novel node selection strategy based on Poisson processes, introducing stochastic but structure-aware updates. Specifically, we equip each node with an independent Poisson clock, enabling asynchronous and localized updates that preserve structural diversity. We explore two applications of this strategy: as a replacement for dropout-based regularization and as a dynamic subgraph training scheme. Experimental results on standard benchmarks (Cora, Citeseer, Pubmed) demonstrate that our Poisson-based method yields competitive or improved accuracy compared to traditional Dropout, DropEdge, and DropNode approaches, particularly in later training stages.
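
As a rough illustration of the per-node Poisson clock idea, the sketch below marks a node as active when its clock rings at least once in a time window, and applies a mean-aggregation update only to active nodes. The function names, the window length `dt`, and the mean aggregator are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random

def poisson_active_nodes(num_nodes, rate, dt, rng):
    """Nodes whose independent Poisson clock rings at least once in a window dt."""
    p_fire = 1.0 - math.exp(-rate * dt)  # P(at least one arrival in dt)
    return [v for v in range(num_nodes) if rng.random() < p_fire]

def async_mean_update(features, neighbors, active):
    """Mean-aggregate over neighbors for active nodes; others keep their value."""
    new = list(features)
    for v in active:
        group = [features[v]] + [features[u] for u in neighbors[v]]
        new[v] = sum(group) / len(group)
    return new

rng = random.Random(0)
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # path graph (toy example)
features = [0.0, 1.0, 2.0, 3.0]
active = poisson_active_nodes(4, rate=1.0, dt=0.5, rng=rng)
updated = async_mean_update(features, neighbors, active)
```

Because inactive nodes keep their previous representation, only part of the graph is smoothed in any one step, which is the intuition behind using the clocks against over-smoothing.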

Article 781

Title@2025-05-27 (2): Diffusion Adversarial Post-Training for One-Step Video Generation

Title: Diffusion Adversarial Post-Training for One-Step Video Generation

Diffusions-Adversarial-Post-Training für die One-Step-Videogenerierung

单步制录像制作单步制片后培训 2501.08316v2

Authors: Shanchuan Lin, Xin Xia, Yuxi Ren, Ceyuan Yang, Xuefeng Xiao, Lu Jiang

Diffusion models are widely used for image and video generation, but their iterative generation process is slow and expensive. While existing distillation approaches have demonstrated the potential for one-step generation in the image domain, they still suffer from significant quality degradation. In this work, we propose Adversarial Post-Training (APT) against real data following diffusion pre-training for one-step video generation. To improve the training stability and quality, we introduce several improvements to the model architecture and training procedures, along with an approximated R1 regularization objective. Empirically, our experiments show that our adversarial post-trained model, Seaweed-APT, can generate 2-second, 1280x720, 24fps videos in real time using a single forward evaluation step. Additionally, our model is capable of generating 1024px images in a single step, achieving quality comparable to state-of-the-art methods.

Article 782

Title@2025-05-27 (2): Memorization to Generalization: Emergence of Diffusion Models from Associative Memory

Title: Memorization to Generalization: Emergence of Diffusion Models from Associative Memory

Erinnerung an die Verallgemeinerung: Entstehung von Diffusionsmodellen aus dem assoziativen Gedächtnis

记忆化为普遍化:共同内存传播模型的出现 2505.21777v1

Authors: Bao Pham, Gabriel Raya, Matteo Negri, Mohammed J. Zaki, Luca Ambrogioni, Dmitry Krotov

Hopfield networks are associative memory (AM) systems, designed for storing and retrieving patterns as local minima of an energy landscape. In the classical Hopfield model, an interesting phenomenon occurs when the amount of training data reaches its critical memory load: spurious states, or unintended stable points, emerge at the end of the retrieval dynamics, leading to incorrect recall. In this work, we examine diffusion models, commonly used in generative modeling, from the perspective of AMs. The training phase of a diffusion model is conceptualized as memory encoding (training data is stored in the memory). The generation phase is viewed as an attempt at memory retrieval. In the small data regime, the diffusion model exhibits a strong memorization phase, where the network creates distinct basins of attraction around each sample in the training set, akin to the Hopfield model below the critical memory load. In the large data regime, a different phase appears where an increase in the size of the training set fosters the creation of new attractor states that correspond to manifolds of the generated samples. Spurious states appear at the boundary of this transition and correspond to emergent attractor states, which are absent in the training set, but, at the same time, have distinct basins of attraction around them. Our findings provide: a novel perspective on the memorization-generalization phenomenon in diffusion models via the lens of AMs, a theoretical prediction of the existence of spurious states, and empirical validation of this prediction in commonly used diffusion models.
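
For readers unfamiliar with the classical Hopfield model referenced above, here is a minimal sketch of storage (Hebbian rule) and retrieval (sign updates) well below the critical memory load. It illustrates the associative-memory side only, not the diffusion models studied in the paper; the two stored patterns are toy choices.

```python
def train_hopfield(patterns):
    """Hebbian weights: w[i][j] = (1/n) * sum over patterns of p[i] * p[j]."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / n
    return w

def energy(w, s):
    """Hopfield energy E(s) = -1/2 * sum_ij w[i][j] * s[i] * s[j]."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def retrieve(w, state, sweeps=5):
    """Asynchronous sign updates; each sweep visits every unit once."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
    return s

p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]  # orthogonal to p1
w = train_hopfield([p1, p2])
noisy = [-1] + p1[1:]              # p1 with its first bit flipped
recalled = retrieve(w, noisy)
```

Here the corrupted input falls inside the basin of attraction of `p1`, so retrieval restores it and lowers the energy; spurious states arise only as the number of stored patterns approaches the critical load.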

Article 783

Title@2025-05-27 (2): DualSchool: How Reliable are LLMs for Optimization Education?

Title: DualSchool: How Reliable are LLMs for Optimization Education?

DualSchool: Wie zuverlässig sind LLMs für die Optimierungsbildung?

两所学校:优化教育LLMs有多可靠? 2505.21775v1

Authors: Michael Klamkin, Arnaud Deza, Sikai Cheng, Haoruo Zhao, Pascal Van Hentenryck

Consider the following task taught in introductory optimization courses which addresses challenges articulated by the community at the intersection of (generative) AI and OR: generate the dual of a linear program. LLMs, being trained at web-scale, have the conversion process and many instances of Primal to Dual Conversion (P2DC) at their disposal. Students may thus reasonably expect that LLMs would perform well on the P2DC task. To assess this expectation, this paper introduces DualSchool, a comprehensive framework for generating and verifying P2DC instances. The verification procedure of DualSchool uses the Canonical Graph Edit Distance, going well beyond existing evaluation methods for optimization models, which exhibit many false positives and negatives when applied to P2DC. Experiments performed by DualSchool reveal interesting findings. Although LLMs can recite the conversion procedure accurately, state-of-the-art open LLMs fail to consistently produce correct duals. This finding holds even for the smallest two-variable instances and for derivative tasks, such as correctness, verification, and error classification. The paper also discusses the implications for educators, students, and the development of large reasoning systems.
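
To make the P2DC task concrete, the sketch below handles one textbook special case only: a primal of the form min c^T x subject to Ax >= b, x >= 0, whose dual is max b^T y subject to A^T y <= c, y >= 0. The dictionary layout and function names are illustrative assumptions and are unrelated to the DualSchool code.

```python
def dual_of_canonical_lp(c, A, b):
    """Dual of  min c^T x  s.t.  A x >= b, x >= 0  (canonical form only)."""
    A_T = [list(col) for col in zip(*A)]  # transpose of the constraint matrix
    return {
        "sense": "max",
        "objective": list(b),   # dual objective uses the primal right-hand side
        "matrix": A_T,          # one dual constraint per primal variable
        "rhs": list(c),         # dual right-hand side uses the primal costs
        "constraint": "<=",
        "variables": ">= 0",
    }

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Primal: min 2*x1 + 3*x2  s.t.  x1 + 2*x2 >= 4,  3*x1 + x2 >= 6,  x >= 0
c, A, b = [2.0, 3.0], [[1.0, 2.0], [3.0, 1.0]], [4.0, 6.0]
dual = dual_of_canonical_lp(c, A, b)

# Weak duality check on one feasible pair (x, y): b^T y <= c^T x.
x, y = [2.0, 1.0], [1.0, 0.2]
primal_value, dual_value = dot(c, x), dot(b, y)
```

Weak duality gives a cheap sanity check on any candidate dual: every dual-feasible objective value must lie below every primal-feasible one.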

Article 784

Title@2025-05-27 (2): Backdoors in DRL: Four Environments Focusing on In-distribution Triggers

Title: Backdoors in DRL: Four Environments Focusing on In-distribution Triggers

Hintertüren in DRL: Vier Umgebungen mit Fokus auf In-Distribution Trigger

DRL的后门:四个环境,侧重于内部分配触发器 2505.17248v2

Authors: Chace Ashcraft, Ted Staley, Josh Carney, Cameron Hickert, Kiran Karra, Nathan Drenkow

Backdoor attacks, or trojans, pose a security risk by concealing undesirable behavior in deep neural network models. Open-source neural networks are downloaded from the internet daily, possibly containing backdoors, and third-party model developers are common. To advance research on backdoor attack mitigation, we develop several trojans for deep reinforcement learning (DRL) agents. We focus on in-distribution triggers, which occur within the agent’s natural data distribution, since they pose a more significant security threat than out-of-distribution triggers due to their ease of activation by the attacker during model deployment. We implement backdoor attacks in four reinforcement learning (RL) environments: LavaWorld, Randomized LavaWorld, Colorful Memory, and Modified Safety Gymnasium. We train various models, both clean and backdoored, to characterize these attacks. We find that in-distribution triggers can require additional effort to implement and be more challenging for models to learn, but are nevertheless viable threats in DRL even using basic data poisoning attacks.

Article 785

Title@2025-05-27 (2): Beyond 1D: Vision Transformers and Multichannel Signal Images for PPG-to-ECG Reconstruction

Title: Beyond 1D: Vision Transformers and Multichannel Signal Images for PPG-to-ECG Reconstruction

Beyond 1D: Vision Transformers und Multichannel Signal Images für PPG-zu-ECG-Rekonstruktion

1D之后:为重建PPPG至ECG提供愿景变形器和多通道信号图像 2505.21767v1

Authors: Xiaoyan Li, Shixin Xu, Faisal Habib, Arvind Gupta, Huaxiong Huang

Reconstructing ECG from PPG is a promising yet challenging task. While recent advancements in generative models have significantly improved ECG reconstruction, accurately capturing fine-grained waveform features remains a key challenge. To address this, we propose a novel PPG-to-ECG reconstruction method that leverages a Vision Transformer (ViT) as the core network. Unlike conventional approaches that rely on single-channel PPG, our method employs a four-channel signal image representation, incorporating the original PPG, its first-order difference, second-order difference, and area under the curve. This multi-channel design enriches feature extraction by preserving both temporal and physiological variations within the PPG. By leveraging the self-attention mechanism in ViT, our approach effectively captures both inter-beat and intra-beat dependencies, leading to more robust and accurate ECG reconstruction. Experimental results demonstrate that our method consistently outperforms existing 1D convolution-based approaches, achieving up to 29% reduction in PRD and 15% reduction in RMSE. The proposed approach also produces improvements in other evaluation metrics, highlighting its robustness and effectiveness in reconstructing ECG signals. Furthermore, to ensure a clinically relevant evaluation, we introduce new performance metrics, including QRS area error, PR interval error, RT interval error, and RT amplitude difference error. Our findings suggest that integrating a four-channel signal image representation with the self-attention mechanism of ViT enables more effective extraction of informative PPG features and improved modeling of beat-to-beat variations for PPG-to-ECG mapping. Beyond demonstrating the potential of PPG as a viable alternative for heart activity monitoring, our approach opens new avenues for cyclic signal analysis and prediction.
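
The four-channel input described above can be sketched as follows. The zero padding of the difference channels and the cumulative trapezoidal rule for the area channel are assumptions; the abstract does not state how the authors compute or align these channels.

```python
def four_channel(ppg):
    """Stack PPG, first difference, second difference, and running area."""
    n = len(ppg)
    # First-order difference, zero-padded so all channels share one length.
    d1 = [0.0] + [ppg[i] - ppg[i - 1] for i in range(1, n)]
    # Second-order difference, computed from the padded first difference.
    d2 = [0.0] + [d1[i] - d1[i - 1] for i in range(1, n)]
    # Running area under the curve (trapezoidal rule, unit sample spacing).
    auc = [0.0]
    for i in range(1, n):
        auc.append(auc[-1] + 0.5 * (ppg[i] + ppg[i - 1]))
    return [list(ppg), d1, d2, auc]

channels = four_channel([0.0, 1.0, 4.0, 9.0])
```

Each channel has the same length as the input, so the four rows can be stacked into a single image-like array for a ViT-style encoder.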

Article 786

Title: Explainable Multi-modal Time Series Prediction with LLM-in-the-Loop

Erklärbare multimodale Zeitreihenvorhersage mit LLM-in-the-Loop

与LLM in-Loop的可解释的多时时间序列预测 2503.01013v2

Authors: Yushan Jiang, Wenchao Yu, Geon Lee, Dongjin Song, Kijung Shin, Wei Cheng, Yanchi Liu, Haifeng Chen

Time series analysis provides essential insights for real-world system dynamics and informs downstream decision-making, yet most existing methods overlook the rich contextual signals present in auxiliary modalities. To bridge this gap, we introduce TimeXL, a multi-modal prediction framework that integrates a prototype-based time series encoder with three collaborating Large Language Models (LLMs) to deliver more accurate predictions and interpretable explanations. First, a multi-modal prototype-based encoder processes both time series and textual inputs to generate preliminary forecasts alongside case-based rationales. These outputs then feed into a prediction LLM, which refines the forecasts by reasoning over the encoder’s predictions and explanations. Next, a reflection LLM compares the predicted values against the ground truth, identifying textual inconsistencies or noise. Guided by this feedback, a refinement LLM iteratively enhances text quality and triggers encoder retraining. This closed-loop workflow – prediction, critique (reflect), and refinement – continuously boosts the framework’s performance and interpretability. Empirical evaluations on four real-world datasets demonstrate that TimeXL achieves up to 8.9% improvement in AUC and produces human-centric, multi-modal explanations, highlighting the power of LLM-driven reasoning for time series prediction.

Article 787

Title@2025-05-27 (2): TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster

Title: TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster

TS-RAG: Retrieval-Augmented Generation basierte Time Series Foundation Modelle sind stärker Zero-Shot Forecaster

TS-RAG:基于时间序列的回收-养殖一代基于时间序列的基础模型是更强的零热预测仪 2503.07649v3

Authors: Kanghui Ning, Zijie Pan, Yu Liu, Yushan Jiang, James Y. Zhang, Kashif Rasul, Anderson Schneider, Lintao Ma, Yuriy Nevmyvaka, Dongjin Song

Large Language Models (LLMs) and Foundation Models (FMs) have recently become prevalent for time series forecasting tasks. While fine-tuning LLMs enables domain adaptation, they often struggle to generalize across diverse and unseen datasets. Moreover, existing Time Series Foundation Models (TSFMs) still face challenges in handling non-stationary dynamics and distribution shifts, largely due to the lack of effective mechanisms for adaptation. To this end, we present TS-RAG, a retrieval-augmented generation framework for time series forecasting that enhances the generalization and interpretability of TSFMs. Specifically, TS-RAG leverages pre-trained time series encoders to retrieve semantically relevant segments from a dedicated knowledge base, enriching the contextual representation of the input query. Furthermore, we propose an Adaptive Retrieval Mixer (ARM) module that dynamically fuses the retrieved patterns with the TSFM’s internal representation, improving forecasting accuracy without requiring task-specific fine-tuning. Thorough empirical studies on seven public benchmark datasets demonstrate that TS-RAG achieves state-of-the-art zero-shot forecasting performance, outperforming the existing TSFMs by up to 6.84% across diverse domains while also providing desirable interpretability.

Article 788

Title@2025-05-27 (2): Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and Over-Parameterization

Title: Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and Over-Parameterization

Puristische Korrelationen in der hochdimensionalen Regression: Die Rollen der Regularisierung, der Einfachheit Bias und der Überparameterisierung

高度倒退中的纯净误值:常规化、简易生物和过度计量化的作用 2502.01347v2

Authors: Simone Bombari, Marco Mondelli

Learning models have been shown to rely on spurious correlations between non-predictive features and the associated labels in the training data, with negative implications on robustness, bias and fairness. In this work, we provide a statistical characterization of this phenomenon for high-dimensional regression, when the data contains a predictive core feature $x$ and a spurious feature $y$. Specifically, we quantify the amount of spurious correlations $C$ learned via linear regression, in terms of the data covariance and the strength $\lambda$ of the ridge regularization. As a consequence, we first capture the simplicity of $y$ through the spectrum of its covariance, and its correlation with $x$ through the Schur complement of the full data covariance. Next, we prove a trade-off between $C$ and the in-distribution test loss $L$, by showing that the value of $\lambda$ that minimizes $L$ lies in an interval where $C$ is increasing. Finally, we investigate the effects of over-parameterization via the random features model, by showing its equivalence to regularized linear regression. Our theoretical results are supported by numerical experiments on Gaussian, Color-MNIST, and CIFAR-10 datasets.

Article 789

Title: FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering

FRAMES-VQA: Benchmarking Fine-Tuning Robustheit über Multi-Modal Shifts in der visuellen Fragestellung

FRAMES-VQA:确定视觉问题解答中多模式变化的精确调整强度基准 2505.21755v1

Authors: Chengyue Huang, Brisa Maneechotesuwan, Shivang Chopra, Zsolt Kira

Visual question answering (VQA) systems face significant challenges when adapting to real-world data shifts, especially in multi-modal contexts. While robust fine-tuning strategies are essential for maintaining performance across in-distribution (ID) and out-of-distribution (OOD) scenarios, current evaluation settings are primarily unimodal or particular to some types of OOD, offering limited insight into the complexities of multi-modal contexts. In this work, we propose a new benchmark FRAMES-VQA (Fine-Tuning Robustness across Multi-Modal Shifts in VQA) for evaluating robust fine-tuning for VQA tasks. We utilize ten existing VQA benchmarks, including VQAv2, IV-VQA, VQA-CP, OK-VQA and others, and categorize them into ID, near and far OOD datasets covering uni-modal, multi-modal and adversarial distribution shifts. We first conduct a comprehensive comparison of existing robust fine-tuning methods. We then quantify the distribution shifts by calculating the Mahalanobis distance using uni-modal and multi-modal embeddings extracted from various models. Further, we perform an extensive analysis to explore the interactions between uni- and multi-modal shifts as well as modality importance for ID and OOD samples. These analyses offer valuable guidance on developing more robust fine-tuning methods to handle multi-modal distribution shifts. The code is available at https://github.com/chengyuehuang511/FRAMES-VQA .
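
The Mahalanobis distance used to quantify shift measures how far a sample lies from a reference distribution in units of that distribution's covariance. The sketch below is a two-dimensional toy, assuming a Gaussian fit to in-distribution embeddings; the real embeddings are high-dimensional and the helper names are hypothetical.

```python
import math

def fit_gaussian_2d(points):
    """Mean and (population) covariance entries of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_2d(x, mean, cov):
    """sqrt((x - mean)^T Sigma^{-1} (x - mean)) via the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

id_embeddings = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mean, cov = fit_gaussian_2d(id_embeddings)
near = mahalanobis_2d((1.0, 1.0), mean, cov)  # sample at the ID mean
far = mahalanobis_2d((4.0, 1.0), mean, cov)   # sample shifted along x
```

Averaging such distances over a test set gives one scalar per dataset, which is one plausible way to rank datasets from near to far OOD.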

Article 790

Title@2025-05-27 (2): Path Planning for Masked Diffusion Model Sampling

Title: Path Planning for Masked Diffusion Model Sampling

Pfadplanung für maskierte Diffusions-Modell-Probenahme

蒙面扩散模型取样规划路径 2502.03540v4

Authors: Fred Zhangzhi Peng, Zachary Bezemek, Sawan Patel, Jarrid Rector-Brooks, Sherwood Yao, Avishek Joey Bose, Alexander Tong, Pranam Chatterjee

Any-order generation of discrete data using masked diffusion models (MDMs) offers a compelling alternative to traditional autoregressive models, especially in domains that lack a natural causal ordering of data. However, current popular MDMs depart from their successful continuous diffusion model counterparts with simplified masked inference wherein unmasked tokens cannot be iteratively refined, even if there is a mistake. In this paper, we extract the full power of MDMs by introducing a novel inference sampling strategy termed Path Planning (P2) that decomposes each generation step into two sub-stages: planning and denoising. Under P2, the planner at every step selects appropriate tokens that are marked to be updated, which can then be sampled using the denoiser. We demonstrate that P2 generalizes all existing sampling strategies for MDMs and critically enhances generative quality through the new capability of refining and updating existing unmasked tokens. We theoretically prove that P2 establishes a (new) expanded evidence lower bound (ELBO) on the log marginal likelihood of data. We instantiate P2 with a family of planners including: 1) Self-Planning, 2) BERT-Planning, and 3) Trained-Planning with a learned planner, leading to SOTA generative performance for MDMs on a suite of domains. Specifically, solely using P2 inference, we observe relative improvements of 22% in protein sequence foldability, 8% in RNA sequence pLDDT, 4% in math reasoning, 68% in story generation (ROUGE score), and 33% in code generation for the challenging pass@1 metric.
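
The plan-then-denoise decomposition can be illustrated with a toy loop. This is a sketch of the general idea only: `toy_planner` and `toy_denoiser` are hypothetical oracles that know the target sequence, and picking the `k` lowest-scored positions is an assumed selection rule, not the paper's planner.

```python
MASK = None  # placeholder for a masked position

def p2_step(tokens, planner, denoiser, k):
    """Plan which k positions to update, then resample them with the denoiser."""
    scores = planner(tokens)
    # Planning: choose the k positions with the lowest planner score.
    # Unmasked positions are eligible, so earlier mistakes can be revised.
    chosen = sorted(range(len(tokens)), key=lambda i: scores[i])[:k]
    updated = list(tokens)
    for i in chosen:
        updated[i] = denoiser(tokens, i)  # denoising: sample the chosen position
    return updated

TARGET = ["a", "b", "c", "d"]  # toy ground truth, used only by the toy oracles

def toy_denoiser(tokens, i):
    return TARGET[i]

def toy_planner(tokens):
    return [1.0 if tok == TARGET[i] else 0.0 for i, tok in enumerate(tokens)]

tokens = ["a", "x", MASK, MASK]  # position 1 holds an unmasked mistake
for _ in range(2):
    tokens = p2_step(tokens, toy_planner, toy_denoiser, k=2)
```

The point of the example is that position 1, although already unmasked, is selected and corrected; a sampler that only ever fills masked positions would leave the mistake in place.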

Article 791

Title@2025-05-27 (2): Hierarchical Reinforcement Learning with Uncertainty-Guided Diffusional Subgoals

Title: Hierarchical Reinforcement Learning with Uncertainty-Guided Diffusional Subgoals

Hierarchisches Stärkungslernen mit unsicheren, diffusionalen Unterzielen

具有不确定性的梯级强化学习,有不确定的辅助分传播目标 2505.21750v1

Authors: Vivienne Huiling Wang, Tinghuai Wang, Joni Pajarinen

Hierarchical reinforcement learning (HRL) learns to make decisions on multiple levels of temporal abstraction. A key challenge in HRL is that the low-level policy changes over time, making it difficult for the high-level policy to generate effective subgoals. To address this issue, the high-level policy must capture a complex subgoal distribution while also accounting for uncertainty in its estimates. We propose an approach that trains a conditional diffusion model regularized by a Gaussian Process (GP) prior to generate a complex variety of subgoals while leveraging principled GP uncertainty quantification. Building on this framework, we develop a strategy that selects subgoals from both the diffusion policy and GP’s predictive mean. Our approach outperforms prior HRL methods in both sample efficiency and performance on challenging continuous control benchmarks.

Article 792

Title@2025-05-27 (2): Revisiting Bi-Linear State Transitions in Recurrent Neural Networks

Title: Revisiting Bi-Linear State Transitions in Recurrent Neural Networks

Bi-Lineare State Transitions in recurrenten neuralen Netzwerken erneut besuchen

在经常性神经网络中重新审查双利那尔州过渡 2505.21749v1

Authors: M. Reza Ebrahimi, Roland Memisevic

The role of hidden units in recurrent neural networks is typically seen as modeling memory, with research focusing on enhancing information retention through gating mechanisms. A less explored perspective views hidden units as active participants in the computation performed by the network, rather than passive memory stores. In this work, we revisit bi-linear operations, which involve multiplicative interactions between hidden units and input embeddings. We demonstrate theoretically and empirically that they constitute a natural inductive bias for representing the evolution of hidden states in state tracking tasks. These are the simplest type of task that require hidden units to actively contribute to the behavior of the network. We also show that bi-linear state updates form a natural hierarchy corresponding to state tracking tasks of increasing complexity, with popular linear recurrent networks such as Mamba residing at the lowest-complexity center of that hierarchy.
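
A bi-linear state update multiplies the hidden state by a matrix that depends linearly on the input. The sketch below tracks parity, a simple state tracking task, with hand-set weights; the tensor layout `W[i][j][k]` and the absence of bias and nonlinearity are assumptions chosen for illustration, and nothing here is learned.

```python
def bilinear_step(W, x, h):
    """h_new[j] = sum_i sum_k x[i] * W[i][j][k] * h[k]."""
    n = len(h)
    return [
        sum(x[i] * W[i][j][k] * h[k] for i in range(len(x)) for k in range(n))
        for j in range(n)
    ]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
SWAP = [[0.0, 1.0], [1.0, 0.0]]
W = [IDENTITY, SWAP]  # input symbol 0 keeps the state, symbol 1 flips it

def one_hot(bit):
    return [1.0, 0.0] if bit == 0 else [0.0, 1.0]

h = [1.0, 0.0]  # one-hot hidden state: [even, odd], starting at "even"
for bit in [1, 0, 1, 1]:
    h = bilinear_step(W, one_hot(bit), h)
```

With one-hot inputs, each symbol simply selects its own transition matrix, which is why the update can represent a finite-state machine exactly.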

Article 793

Title@2025-05-27 (2): Privacy for Free in the Overparameterized Regime

Title: Privacy for Free in the Overparameterized Regime

Privatsphäre kostenlos im überparameterisierten Regime

过度计量制度中的免费隐私 2410.14787v2

Authors: Simone Bombari, Marco Mondelli

Differentially private gradient descent (DP-GD) is a popular algorithm to train deep learning models with provable guarantees on the privacy of the training data. In the last decade, the problem of understanding its performance cost with respect to standard GD has received remarkable attention from the research community, which formally derived upper bounds on the excess population risk $R_{P}$ in different learning settings. However, existing bounds typically degrade with over-parameterization, i.e., as the number of parameters $p$ gets larger than the number of training samples $n$ – a regime which is ubiquitous in current deep-learning practice. As a result, the lack of theoretical insights leaves practitioners without clear guidance, leading some to reduce the effective number of trainable parameters to improve performance, while others use larger models to achieve better results through scale. In this work, we show that in the popular random features model with quadratic loss, for any sufficiently large $p$, privacy can be obtained for free, i.e., $\left| R_{P} \right| = o(1)$, not only when the privacy parameter $\varepsilon$ has constant order, but also in the strongly private setting $\varepsilon = o(1)$. This challenges the common wisdom that over-parameterization inherently hinders performance in private learning.
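
For context, a generic DP-GD step clips each per-sample gradient and adds Gaussian noise before the update. The sketch below shows that commonly used recipe on a toy linear regression; the clipping norm, noise multiplier, and learning rate are illustrative assumptions, and the exact variant analyzed in the paper may differ.

```python
import math
import random

def dp_gd_step(theta, data, lr, clip, noise_mult, rng):
    """One full-batch step: clip each per-sample gradient, add noise, average."""
    dim = len(theta)
    total = [0.0] * dim
    for x, y in data:
        residual = sum(t * xi for t, xi in zip(theta, x)) - y
        grad = [2.0 * residual * xi for xi in x]  # gradient of squared error
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, clip / norm) if norm > 0 else 1.0
        for j in range(dim):
            total[j] += scale * grad[j]
    # Gaussian noise with standard deviation noise_mult * clip per coordinate.
    noisy = [total[j] + rng.gauss(0.0, noise_mult * clip) for j in range(dim)]
    return [theta[j] - lr * noisy[j] / len(data) for j in range(dim)]

data = [([1.0], 2.0), ([2.0], 4.0), ([3.0], 6.0)]  # toy data with y = 2x
no_noise = dp_gd_step([0.0], data, lr=0.1, clip=1.0, noise_mult=0.0,
                      rng=random.Random(0))
private = dp_gd_step([0.0], data, lr=0.1, clip=1.0, noise_mult=1.0,
                     rng=random.Random(0))
```

Clipping bounds the influence of any single sample on the summed gradient, and the noise scale is tied to that bound; the privacy parameter $\varepsilon$ then follows from the noise multiplier and the number of steps.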

Article 794

Title@2025-05-27 (2): Learning to See More: UAS-Guided Super-Resolution of Satellite Imagery for Precision Agriculture

Title: Learning to See More: UAS-Guided Super-Resolution of Satellite Imagery for Precision Agriculture

Mehr erfahren: UAS-geführte Super-Resolution von Satellitenbildern für Präzisionslandwirtschaft

学习更多见:UAS-UAS指导的精密农业卫星图像超级分辨率 2505.21746v1

Authors: Arif Masrur, Peder A. Olsen, Paul R. Adler, Carlan Jackson, Matthew W. Myers, Nathan Sedghi, Ray R. Weil

Unmanned Aircraft Systems (UAS) and satellites are key data sources for precision agriculture, yet each presents trade-offs. Satellite data offer broad spatial, temporal, and spectral coverage but lack the resolution needed for many precision farming applications, while UAS provide high spatial detail but are limited by coverage and cost, especially for hyperspectral data. This study presents a novel framework that fuses satellite and UAS imagery using super-resolution methods. By integrating data across spatial, spectral, and temporal domains, we leverage the strengths of both platforms cost-effectively. We use estimation of cover crop biomass and nitrogen (N) as a case study to evaluate our approach. By spectrally extending UAS RGB data to the vegetation red edge and near-infrared regions, we generate high-resolution Sentinel-2 imagery and improve biomass and N estimation accuracy by 18% and 31%, respectively. Our results show that UAS data need only be collected from a subset of fields and time points. Farmers can then 1) enhance the spectral detail of UAS RGB imagery; 2) increase the spatial resolution by using satellite data; and 3) extend these enhancements spatially and across the growing season at the frequency of the satellite flights. Our SRCNN-based spectral extension model shows considerable promise for model transferability over other cropping systems in the Upper and Lower Chesapeake Bay regions. Additionally, it remains effective even when cloud-free satellite data are unavailable, relying solely on the UAS RGB input. The spatial extension model produces better biomass and N predictions than models built on raw UAS RGB images. Once trained with targeted UAS RGB data, the spatial extension model allows farmers to stop repeated UAS flights. While we introduce super-resolution advances, the core contribution is a lightweight and scalable system for affordable on-farm use.

Article 795

Title@2025-05-27 (2): Simulating the Unseen: Crash Prediction Must Learn from What Did Not Happen

Title: Simulating the Unseen: Crash Prediction Must Learn from What Did Not Happen

Das Unsichtbare simulieren: Crash Prediction muss lernen, was nicht passiert ist

模拟看不见:崩溃预测必须从没有发生的事情中吸取教训 2505.21743v1

Authors: Zihao Li, Xinyuan Cao, Xiangbo Gao, Kexin Tian, Keshu Wu, Mohammad Anis, Hao Zhang, Keke Long, Jiwan Jiang, Xiaopeng Li, Yunlong Zhang, Tianbao Yang, Dominique Lord, Zhengzhong Tu, Yang Zhou

Traffic safety science has long been hindered by a fundamental data paradox: the crashes we most wish to prevent are precisely those events we rarely observe. Existing crash-frequency models and surrogate safety metrics rely heavily on sparse, noisy, and under-reported records, while even sophisticated, high-fidelity simulations undersample the long-tailed situations that trigger catastrophic outcomes such as fatalities. We argue that the path to achieving Vision Zero, i.e., the complete elimination of traffic fatalities and severe injuries, requires a paradigm shift from traditional crash-only learning to a new form of counterfactual safety learning: reasoning not only about what happened, but also about the vast set of plausible yet perilous scenarios that could have happened under slightly different circumstances. To operationalize this shift, our proposed agenda bridges macro to micro. Guided by crash-rate priors, generative scene engines, diverse driver models, and causal learning, near-miss events are synthesized and explained. A crash-focused digital twin testbed links micro scenes to macro patterns, while a multi-objective validator ensures that simulations maintain statistical realism. This pipeline transforms sparse crash data into rich signals for crash prediction, enabling the stress-testing of vehicles, roads, and policies before deployment. By learning from crashes that almost happened, we can shift traffic safety from reactive forensics to proactive prevention, advancing Vision Zero.

Article 796

Title@2025-05-27 (2): Outlier-Robust Linear System Identification Under Heavy-tailed Noise

Title: Outlier-Robust Linear System Identification Under Heavy-tailed Noise

Ausreißer-Robust Lineare System-Identifikation unter stark verdichtetem Lärm

在重尾噪音下识别线性系统 2501.00421v2

Authors: Vinay Kanakeri, Aritra Mitra

We consider the problem of estimating the state transition matrix of a linear time-invariant (LTI) system, given access to multiple independent trajectories sampled from the system. Several recent papers have conducted a non-asymptotic analysis of this problem, relying crucially on the assumption that the process noise is either Gaussian or sub-Gaussian, i.e., “light-tailed”. In sharp contrast, we work under a significantly weaker noise model, assuming nothing more than the existence of the fourth moment of the noise distribution. For this setting, we provide the first set of results demonstrating that one can obtain sample-complexity bounds for linear system identification that are nearly of the same order as under sub-Gaussian noise. To achieve such results, we develop a novel robust system identification algorithm that relies on constructing multiple weakly-concentrated estimators, and then boosting their performance using suitable tools from high-dimensional robust statistics. Interestingly, our analysis reveals how the kurtosis of the noise distribution, a measure of heavy-tailedness, affects the number of trajectories needed to achieve desired estimation error bounds. Finally, we show that our algorithm and analysis technique can be easily extended to account for scenarios where an adversary can arbitrarily corrupt a small fraction of the collected trajectory data. Our work takes the first steps towards building a robust statistical learning theory for control under non-ideal assumptions on the data-generating process.
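
The "many weak estimators, then boost" idea can be illustrated in one dimension. The sketch below is a deliberate simplification, not the paper's algorithm: it uses a scalar system $x_{t+1} = a x_t + w_t$, a least-squares estimate per bucket of trajectories, and a plain median in place of the high-dimensional robust aggregation the authors describe. The Student-t noise with 5 degrees of freedom is an assumed example of a distribution with a finite fourth moment.

```python
import random
import statistics

def heavy_tailed_noise(rng, df=5):
    """Student-t sample (df > 4, so the fourth moment exists)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / (chi2 / df) ** 0.5

def simulate(a, length, rng):
    """One trajectory of x_{t+1} = a * x_t + w_t starting from x_0 = 0."""
    x = [0.0]
    for _ in range(length):
        x.append(a * x[-1] + heavy_tailed_noise(rng))
    return x

def ols_estimate(trajs):
    """Least-squares estimate of a from a group of trajectories."""
    num = sum(x[t + 1] * x[t] for x in trajs for t in range(len(x) - 1))
    den = sum(x[t] ** 2 for x in trajs for t in range(len(x) - 1))
    return num / den

def median_of_buckets(trajs, num_buckets):
    """Split trajectories into buckets, estimate per bucket, take the median."""
    buckets = [trajs[i::num_buckets] for i in range(num_buckets)]
    return statistics.median(ols_estimate(b) for b in buckets)

rng = random.Random(0)
trajectories = [simulate(0.5, 50, rng) for _ in range(40)]
estimate = median_of_buckets(trajectories, num_buckets=5)
```

The median tolerates a minority of badly behaved buckets, which is the same intuition that lets the approach extend to adversarially corrupted trajectories.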

Article 797

Title@2025-05-27 (2): What is Adversarial Training for Diffusion Models?

Title: What is Adversarial Training for Diffusion Models?

Was ist ein Adversarial Training für Diffusionsmodelle?

传播模型的反向培训是什么? 2505.21742v1

Authors: Briglia Maria Rosaria, Mujtaba Hussain Mirza, Giuseppe Lisanti, Iacopo Masi

We answer the question in the title, showing that adversarial training (AT) for diffusion models (DMs) fundamentally differs from classifiers: while AT in classifiers enforces output invariance, AT in DMs requires equivariance to keep the diffusion process aligned with the data distribution. AT is a way to enforce smoothness in the diffusion flow, improving robustness to outliers and corrupted data. Unlike prior art, our method makes no assumptions about the noise model and integrates seamlessly into diffusion training by adding random noise, similar to randomized smoothing, or adversarial noise, akin to AT. This enables intrinsic capabilities such as handling noisy data, dealing with extreme variability such as outliers, preventing memorization, and improving robustness. We rigorously evaluate our approach with proof-of-concept datasets with known distributions in low- and high-dimensional space, thereby taking a perfect measure of errors; we further evaluate on standard benchmarks such as CIFAR-10, CelebA and LSUN Bedroom, showing strong performance under severe noise, data corruption, and iterative adversarial attacks.

Article 798

Title@2025-05-27 (2): Polynomial Chaos Expanded Gaussian Process

Title: Polynomial Chaos Expanded Gaussian Process

Polynomisches Chaos erweiterter Gauß-Prozess

扩大的高斯进程 2405.01052v2

Authors: Dominik Polke, Tim Kösters, Elmar Ahle, Dirk Söffker

In complex and unknown processes, global models are initially generated over the entire experimental space but often fail to provide accurate predictions in local areas. A common approach is to use local models, which requires partitioning the experimental space and training multiple models, adding significant complexity. Recognizing this limitation, this study addresses the need for models that effectively represent both global and local experimental spaces. It introduces a novel machine learning (ML) approach: Polynomial Chaos Expanded Gaussian Process (PCEGP), leveraging polynomial chaos expansion (PCE) to calculate input-dependent hyperparameters of the Gaussian process (GP). This provides a mathematically interpretable approach that incorporates non-stationary covariance functions and heteroscedastic noise estimation to generate locally adapted models. The model performance is compared to different algorithms in benchmark tests for regression tasks. The results demonstrate low prediction errors of the PCEGP, highlighting model performance that is often competitive with or better than previous methods. A key advantage of the presented model is its interpretable hyperparameters along with training and prediction runtimes comparable to those of a GP.

Article 799

Title@2025-05-27 (2): Moment kernels: a simple and scalable approach for equivariance to rotations and reflections in deep convolutional networks

Title: Moment kernels: a simple and scalable approach for equivariance to rotations and reflections in deep convolutional networks

Momentkerne: ein einfacher und skalierbarer Ansatz für Gleichmäßigkeit zu Rotationen und Reflexionen in tiefen konvolutionären Netzwerken

动力核心:一种简单和可伸缩的方法,在深刻的革命网络中,对轮换和反射的等同性采取简单和可伸缩的办法 2505.21736v1

Authors: Zachary Schlamowitz, Andrew Bennecke, Daniel J. Tward

The principle of translation equivariance (if an input image is translated, the output image should be translated by the same amount) led to the development of convolutional neural networks that revolutionized machine vision. Other symmetries, like rotations and reflections, play a similarly critical role, especially in biomedical image analysis, but exploiting these symmetries has not seen wide adoption. We hypothesize that this is partially due to the mathematical complexity of methods used to exploit these symmetries, which often rely on representation theory, a bespoke concept in differential geometry and group theory. In this work, we show that the same equivariance can be achieved using a simple form of convolution kernels that we call "moment kernels," and prove that all equivariant kernels must take this form. These are a set of radially symmetric functions of a spatial position $x$, multiplied by powers of the components of $x$ or the identity matrix. We implement equivariant neural networks using standard convolution modules, and provide architectures to execute several biomedical image analysis tasks that depend on equivariance principles: classification (outputs are invariant under orthogonal transforms), 3D image registration (outputs transform like a vector), and cell segmentation (quadratic forms defining ellipses transform like a matrix).
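Illustrative sketch (not the authors' code): a first-order kernel of the form described above, a radial profile multiplied by the components of $x$, checked numerically for equivariance under a 90-degree rotation. The Gaussian profile and the names `radial`, `moment_kernel_order1`, `rotate90`, and `is_equivariant` are hypothetical choices for this example.

```python
import math

def radial(r):
    """Hypothetical radial profile; any function of |x| alone would do."""
    return math.exp(-r * r)

def moment_kernel_order1(x):
    """Vector-valued kernel: radial profile times the components of x."""
    r = math.hypot(x[0], x[1])
    return (radial(r) * x[0], radial(r) * x[1])

def rotate90(v):
    """Rotate a 2D vector by 90 degrees counter-clockwise."""
    return (-v[1], v[0])

def is_equivariant(kernel, points, tol=1e-12):
    """Check k(Rx) == R k(x) on the sample points, with R the 90-degree rotation."""
    for x in points:
        lhs = kernel(rotate90(x))
        rhs = rotate90(kernel(x))
        if any(abs(a - b) > tol for a, b in zip(lhs, rhs)):
            return False
    return True

# Offsets of a 3x3 convolution stencil.
GRID = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]
```

Calling `is_equivariant(moment_kernel_order1, GRID)` passes because the radial factor is unchanged by rotation and the remaining factor is $x$ itself. A kernel that keeps only one component of $x$ fails the same check.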

Article 800

Title@2025-05-27 (2): Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization

Title: Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization

Adressierung von Konzept-Mislabeling in Konzept-Bottleneck-Modellen durch Preference-Optimierung

通过优先优化处理概念瓶颈模式中的概念误贴标签问题 2504.18026v2

Authors: Emiliano Penaloza, Tianyue H. Zhan, Laurent Charlin, Mateo Espinosa Zarlenga

Concept Bottleneck Models (CBMs) propose to enhance the trustworthiness of AI systems by constraining their decisions on a set of human-understandable concepts. However, CBMs typically assume that datasets contain accurate concept labels, an assumption often violated in practice, which we show can significantly degrade performance (by 25% in some cases). To address this, we introduce the Concept Preference Optimization (CPO) objective, a new loss function based on Direct Preference Optimization, which effectively mitigates the negative impact of concept mislabeling on CBM performance. We provide an analysis of some key properties of the CPO objective, showing it directly optimizes for the concept’s posterior distribution, and contrast it against Binary Cross Entropy (BCE), showing that CPO is inherently less sensitive to concept noise. We empirically confirm our analysis, finding that CPO consistently outperforms BCE on three real-world datasets with and without added label noise.

Article 801

Title@2025-05-27 (2): Non-Markovian Discrete Diffusion with Causal Language Models

Title: Non-Markovian Discrete Diffusion with Causal Language Models

Nicht-Markovianische Diskrepanz mit kausalen Sprachmodellen

非马尔科维语非马尔科维语分辨语言模式的传播 2502.09767v2

Authors: Yangtian Zhang, Sizhuang He, Daniel Levine, Lawrence Zhao, David Zhang, Syed A Rizvi, Emanuele Zappala, Rex Ying, David van Dijk

Discrete diffusion models offer a flexible, controllable approach to structured sequence generation, yet they still lag behind causal language models in expressive power. A key limitation lies in their reliance on the Markovian assumption, which restricts each step to condition only on the current state, leading to potential uncorrectable error accumulation. In this paper, we introduce CaDDi, a discrete diffusion model that conditions on the entire generative trajectory, thereby lifting the Markov constraint and allowing the model to revisit and improve past states. By unifying sequential (causal) and temporal (diffusion) reasoning in a single non-Markovian transformer, CaDDi also treats standard causal language models as a special case and permits the direct reuse of pretrained LLM weights with no architectural changes. Empirically, CaDDi outperforms state-of-the-art discrete diffusion baselines on natural-language benchmarks, substantially narrowing the remaining gap to large autoregressive transformers.

Article 802

Title: MIND-Stack: Modular, Interpretable, End-to-End Differentiability for Autonomous Navigation

MIND-Stack: Modular, interpretierbar, End-to-End-Unterscheidbarkeit für die autonome Navigation

MIND-Stack: 自主航行的模块、可解释、端到端至端差异 2505.21734v1

Authors: Felix Jahncke, Johannes Betz

Developing robust, efficient navigation algorithms is challenging. Rule-based methods offer interpretability and modularity but struggle with learning from large datasets, while end-to-end neural networks excel in learning but lack transparency and modularity. In this paper, we present MIND-Stack, a modular software stack consisting of a localization network and a Stanley Controller with intermediate human interpretable state representations and end-to-end differentiability. Our approach enables the upstream localization module to reduce the downstream control error, extending its role beyond state estimation. Unlike existing research on differentiable algorithms that either lack modules of the autonomous stack to span from sensor input to actuator output or real-world implementation, MIND-Stack offers both capabilities. We conduct experiments that demonstrate the ability of the localization module to reduce the downstream control loss through its end-to-end differentiability while offering better performance than state-of-the-art algorithms. We showcase sim-to-real capabilities by deploying the algorithm on a real-world embedded autonomous platform with limited computation power and demonstrate simultaneous training of both the localization and controller towards one goal. While MIND-Stack shows good results, we discuss the incorporation of additional modules from the autonomous navigation pipeline in the future, promising even greater stability and performance in the next iterations of the framework.

Article 803

Title@2025-05-27 (2): LaX: Boosting Low-Rank Training of Foundation Models via Latent Crossing

Title: LaX: Boosting Low-Rank Training of Foundation Models via Latent Crossing

LaX: Förderung der Low-Rank-Schulung von Stiftungsmodellen durch Latent Crossing

LaX:通过中转交叉促进基金会模型的低射速培训 2505.21732v1

Authors: Ruijie Zhang, Ziyue Liu, Zhengyang Wang, Zheng Zhang

Training foundation models such as ViTs and LLMs requires tremendous computing cost. Low-rank matrix or tensor factorization offers a parameter-efficient alternative, but often downgrades performance due to the restricted parameter space. In this work, we introduce Latent Crossing (LaX), a simple yet effective plug-and-play module that enhances the capacity of low-rank models by enabling information flow across low-rank subspaces. We extensively validate the benefits of LaX on pre-training tasks with ViT-Base/Large and LLaMA-like models ranging from 60M to 1B parameters. LaX boosts low-rank model performance to match or exceed the full-rank baselines while using 2-3$\times$ fewer parameters. When equipped with low-rank adapters (i.e., LoRA) for fine-tuning LLaMA-7/13B, LaX consistently improves performance on arithmetic and common sense reasoning tasks with negligible cost.

Article 804

Title@2025-05-27 (2): Deep Reinforcement Learning Agents are not even close to Human Intelligence

Title: Deep Reinforcement Learning Agents are not even close to Human Intelligence

Deep Enforcement Learning Agents sind nicht einmal der menschlichen Intelligenz nahe

深强化学习代理机构甚至离人类情报机构不近 2505.21731v1

Authors: Quentin Delfosse, Jannis Blüml, Fabian Tatai, Théo Vincent, Bjarne Gregori, Elisabeth Dillies, Jan Peters, Constantin Rothkopf, Kristian Kersting

Deep reinforcement learning (RL) agents achieve impressive results in a wide variety of tasks, but they lack zero-shot adaptation capabilities. While most robustness evaluations focus on task complexifications, on which humans also struggle to maintain performance, no evaluation has been performed on task simplifications. To tackle this issue, we introduce HackAtari, a set of task variations of the Arcade Learning Environments. We use it to demonstrate that, contrary to humans, RL agents systematically exhibit huge performance drops on simpler versions of their training tasks, uncovering agents’ consistent reliance on shortcuts. Our analysis across multiple algorithms and architectures highlights the persistent gap between RL agents and human behavioral intelligence, underscoring the need for new benchmarks and methodologies that enforce systematic generalization testing beyond static evaluation protocols. Training and testing in the same environment is not enough to obtain agents equipped with human-like intelligence.

Article 805

Title@2025-05-27 (2): Are Statistical Methods Obsolete in the Era of Deep Learning?

Title: Are Statistical Methods Obsolete in the Era of Deep Learning?

Sind statistische Methoden im Zeitalter des tiefen Lernens überholt?

统计方法是否在深层学习时代过时? 2505.21723v1

Authors: Skyler Wu, Shihao Yang, S. C. Kou

In the era of AI, neural networks have become increasingly popular for modeling, inference, and prediction, largely due to their potential for universal approximation. With the proliferation of such deep learning models, a question arises: are leaner statistical methods still relevant? To shed light on this question, we employ the mechanistic nonlinear ordinary differential equation (ODE) inverse problem as a testbed, using the physics-informed neural network (PINN) as a representative of the deep learning paradigm and manifold-constrained Gaussian process inference (MAGI) as a representative of statistically principled methods. Through case studies involving the SEIR model from epidemiology and the Lorenz model from chaotic dynamics, we demonstrate that statistical methods are far from obsolete, especially when working with sparse and noisy observations. On tasks such as parameter inference and trajectory reconstruction, statistically principled methods consistently achieve lower bias and variance, while using far fewer parameters and requiring less hyperparameter tuning. Statistical methods can also decisively outperform deep learning models on out-of-sample future prediction, where the absence of relevant data often leads overparameterized models astray. Additionally, we find that statistically principled approaches are more robust to the accumulation of numerical imprecision and can represent the underlying system more faithfully to the true governing ODEs.

Article 806

Title@2025-05-27 (2): Saddle-To-Saddle Dynamics in Deep ReLU Networks: Low-Rank Bias in the First Saddle Escape

Title: Saddle-To-Saddle Dynamics in Deep ReLU Networks: Low-Rank Bias in the First Saddle Escape

Sattel-zu-Sattel-Dynamik in Deep ReLU Networks: Low-Rank Bias bei der ersten Sattelflucht

深 ReLU 网络中的套装到套接的动态动态: 第一次套装逃跑中的低兰克比亚 2505.21722v1

Authors: Ioannis Bantzis, James B. Simon, Arthur Jacot

When a deep ReLU network is initialized with small weights, gradient descent (GD) is at first dominated by the saddle at the origin in parameter space. We study the so-called escape directions, which play a role similar to that of the eigenvectors of the Hessian for strict saddles. We show that the optimal escape direction features a low-rank bias in its deeper layers: the first singular value of the $\ell$-th layer weight matrix is at least $\ell^{\frac{1}{4}}$ larger than any other singular value. We also prove a number of related results about these escape directions. We argue that this result is a first step in proving Saddle-to-Saddle dynamics in deep ReLU networks, where GD visits a sequence of saddles with increasing bottleneck rank.

Article 807

Title@2025-05-27 (2): CTBENCH: A Library and Benchmark for Certified Training

Title: CTBENCH: A Library and Benchmark for Certified Training

CTBENCH: Eine Bibliothek und Benchmark für zertifizierte Ausbildung

CTBENCH: 注册培训的图书馆和基准 2406.04848v4

Authors: Yuhao Mao, Stefan Balauca, Martin Vechev

Training certifiably robust neural networks is an important but challenging task. While many algorithms for (deterministic) certified training have been proposed, they are often evaluated on different training schedules, certification methods, and systematically under-tuned hyperparameters, making it difficult to compare their performance. To address this challenge, we introduce CTBench, a unified library and a high-quality benchmark for certified training that evaluates all algorithms under fair settings and systematically tuned hyperparameters. We show that (1) almost all algorithms in CTBench surpass the corresponding reported performance in literature in the magnitude of algorithmic improvements, thus establishing new state-of-the-art, and (2) the claimed advantage of recent algorithms drops significantly when we enhance the outdated baselines with a fair training schedule, a fair certification method and well-tuned hyperparameters. Based on CTBench, we provide new insights into the current state of certified training, including (1) certified models have less fragmented loss surface, (2) certified models share many mistakes, (3) certified models have more sparse activations, (4) reducing regularization cleverly is crucial for certified training especially for large radii and (5) certified training has the potential to improve out-of-distribution generalization. We are confident that CTBench will serve as a benchmark and testbed for future research in certified training.

Article 808

Title@2025-05-27 (2): Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference

Title: Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference

Nahezu dimensionsunabhängige Konvergenz des mittleren Feldes Black-Box Variationale Schlussfolgerung

中 - 现场黑 - 生物- 黑 - 生物- 黑 - 生物- 2505.21721v1

Authors: Kyurae Kim, Yi-An Ma, Trevor Campbell, Jacob R. Gardner

We prove that, given a mean-field location-scale variational family, black-box variational inference (BBVI) with the reparametrization gradient converges at an almost dimension-independent rate. Specifically, for strongly log-concave and log-smooth targets, the number of iterations for BBVI with a sub-Gaussian family to achieve an objective $\epsilon$-close to the global optimum is $\mathrm{O}(\log d)$, which improves over the $\mathrm{O}(d)$ dependence of full-rank location-scale families. For heavy-tailed families, we provide a weaker $\mathrm{O}(d^{2/k})$ dimension dependence, where $k$ is the number of finite moments. Additionally, if the Hessian of the target log-density is constant, the complexity is free of any explicit dimension dependence. We also prove that our bound on the gradient variance, which is key to our result, cannot be improved using only spectral bounds on the Hessian of the target log-density.

Article 809

Title@2025-05-27 (2): Simple Guidance Mechanisms for Discrete Diffusion Models

Title: Simple Guidance Mechanisms for Discrete Diffusion Models

Einfache Leitmechanismen für diskrete Diffusionsmodelle

分辨传播模型的简单指导机制 2412.10193v3

Authors: Yair Schiff, Subham Sekhar Sahoo, Hao Phung, Guanghan Wang, Sam Boshar, Hugo Dalla-torre, Bernardo P. de Almeida, Alexander Rush, Thomas Pierrot, Volodymyr Kuleshov

Diffusion models for continuous data gained widespread adoption owing to their high quality generation and control mechanisms. However, controllable diffusion on discrete data faces challenges given that continuous guidance methods do not directly apply to discrete diffusion. Here, we provide a straightforward derivation of classifier-free and classifier-based guidance for discrete diffusion, as well as a new class of diffusion models that leverage uniform noise and that are more guidable because they can continuously edit their outputs. We improve the quality of these models with a novel continuous-time variational lower bound that yields state-of-the-art performance, especially in settings involving guidance or fast generation. Empirically, we demonstrate that our guidance mechanisms combined with uniform noise diffusion improve controllable generation relative to autoregressive and diffusion baselines on several discrete data domains, including genomic sequences, small molecule design, and discretized image generation.

Article 810

Title@2025-05-27 (2): Training Dynamics of In-Context Learning in Linear Attention

Title: Training Dynamics of In-Context Learning in Linear Attention

Trainingsdynamik des In-Context-Lernens in linearer Aufmerksamkeit

线线性关注的内文学习培训动态 2501.16265v2

Authors: Yedi Zhang, Aaditya K. Singh, Peter E. Latham, Andrew Saxe

While attention-based models have demonstrated the remarkable ability of in-context learning (ICL), the theoretical understanding of how these models acquired this ability through gradient descent training is still preliminary. Towards answering this question, we study the gradient descent dynamics of multi-head linear self-attention trained for in-context linear regression. We examine two parametrizations of linear self-attention: one with the key and query weights merged as a single matrix (common in theoretical studies), and one with separate key and query matrices (closer to practical settings). For the merged parametrization, we show that the training dynamics has two fixed points and the loss trajectory exhibits a single, abrupt drop. We derive an analytical time-course solution for a certain class of datasets and initialization. For the separate parametrization, we show that the training dynamics has exponentially many fixed points and the loss exhibits saddle-to-saddle dynamics, which we reduce to scalar ordinary differential equations. During training, the model implements principal component regression in context with the number of principal components increasing over training time. Overall, we provide a theoretical description of how ICL abilities evolve during gradient descent training of linear attention, revealing abrupt acquisition or progressive improvements depending on how the key and query are parametrized.

Article 811

Title@2025-05-27 (2): Network classification through random walks

Title: Network classification through random walks

Netzwerkklassifizierung durch zufällige Spaziergänge

通过随机行走进行网络分类 2505.21706v1

Authors: Gonzalo Travieso, Joao Merenda, Odemir M. Bruno

Network models have been widely used to study diverse systems and analyze their dynamic behaviors. Given the structural variability of networks, an intriguing question arises: Can we infer the type of system represented by a network based on its structure? This classification problem involves extracting relevant features from the network. Existing literature has proposed various methods that combine structural measurements and dynamical processes for feature extraction. In this study, we introduce a novel approach to characterize networks using statistics from random walks, which can be particularly informative about network properties. We present the employed statistical metrics and compare their performance on multiple datasets with other state-of-the-art feature extraction methods. Our results demonstrate that the proposed method is effective in many cases, often outperforming existing approaches, although some limitations are observed across certain datasets.
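Illustrative sketch (not the authors' method): a minimal way to turn random walks into network features. The abstract does not list the statistics used, so the coverage statistic (fraction of distinct nodes visited), the toy graphs, and the names `random_walk` and `walk_features` are assumptions for illustration.

```python
import random

def random_walk(adj, start, steps, rng):
    """Return the node sequence of a simple random walk on an adjacency dict."""
    path = [start]
    for _ in range(steps):
        path.append(rng.choice(adj[path[-1]]))
    return path

def walk_features(adj, steps=20, walks_per_node=10, seed=0):
    """Summarize walks by coverage: the fraction of distinct nodes visited."""
    rng = random.Random(seed)
    n = len(adj)
    coverage = []
    for node in adj:
        for _ in range(walks_per_node):
            path = random_walk(adj, node, steps, rng)
            coverage.append(len(set(path)) / n)
    mean = sum(coverage) / len(coverage)
    var = sum((c - mean) ** 2 for c in coverage) / len(coverage)
    return {"coverage_mean": mean, "coverage_var": var}

# Two toy graphs (hypothetical data): a 6-node ring and a 6-node star.
RING = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
STAR = {0: [1, 2, 3, 4, 5], **{i: [0] for i in range(1, 6)}}
```

Feature dictionaries such as `walk_features(RING)` and `walk_features(STAR)` could then be passed to any standard classifier. Which statistics actually discriminate well between network types is the empirical question the paper studies.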

Article 812

Title@2025-05-27 (2): AMSFL: Adaptive Multi-Step Federated Learning via Gradient Difference-Based Error Modeling

Title: AMSFL: Adaptive Multi-Step Federated Learning via Gradient Difference-Based Error Modeling

AMSFL: Adaptives Multi-Step-Federated Learning über gradient Difference-based Error Modeling

ASFL:通过基于差异的渐进错误建模进行适应性多阶段联邦学习 2505.21695v1

Authors: Ganglou Xu

Federated learning faces critical challenges in balancing communication efficiency and model accuracy. One key issue lies in the approximation of update errors without incurring high computational costs. In this paper, we propose a lightweight yet effective method called Gradient Difference Approximation (GDA), which leverages first-order information to estimate local error trends without computing the full Hessian matrix. The proposed method forms a key component of the Adaptive Multi-Step Federated Learning (AMSFL) framework and provides a unified error modeling strategy for large-scale multi-step adaptive training environments.
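Illustrative sketch (an assumption, not the paper's definition of GDA): the abstract does not give a formula, so the example below shows the standard finite-difference identity that approximates a Hessian-vector product from two gradient evaluations, which is one common way to obtain curvature information from first-order quantities only. The quadratic objective and the names `grad_quadratic` and `hessian_vector_product` are hypothetical.

```python
def grad_quadratic(w):
    """Gradient of f(w) = 0.5 * (3*w0^2 + 2*w0*w1 + 2*w1^2); Hessian is [[3, 1], [1, 2]]."""
    return [3.0 * w[0] + w[1], w[0] + 2.0 * w[1]]

def hessian_vector_product(grad_fn, w, v, eps=1e-5):
    """Approximate H(w) @ v as (grad(w + eps*v) - grad(w)) / eps, without forming H."""
    shifted = [wi + eps * vi for wi, vi in zip(w, v)]
    g0 = grad_fn(w)
    g1 = grad_fn(shifted)
    return [(b - a) / eps for a, b in zip(g0, g1)]
```

The cost is two gradient evaluations regardless of dimension, whereas forming the full Hessian scales quadratically with the number of parameters.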

Article 813

Title@2025-05-27 (2): What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization

Title: What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization

Welche Daten ermöglichen optimale Entscheidungen? Eine genaue Charakterisierung für lineare Optimierung

什么数据能使最佳决定实现最佳决定? 线性优化的精确属性 2505.21692v1

Authors: Omar Bennouna, Amine Bennouna, Saurabh Amin, Asuman Ozdaglar

We study the fundamental question of how informative a dataset is for solving a given decision-making task. In our setting, the dataset provides partial information about unknown parameters that influence task outcomes. Focusing on linear programs, we characterize when a dataset is sufficient to recover an optimal decision, given an uncertainty set on the cost vector. Our main contribution is a sharp geometric characterization that identifies the directions of the cost vector that matter for optimality, relative to the task constraints and uncertainty set. We further develop a practical algorithm that, for a given task, constructs a minimal or least-costly sufficient dataset. Our results reveal that small, well-chosen datasets can often fully determine optimal decisions – offering a principled foundation for task-aware data selection.

Article 814

Title@2025-05-27 (2): LLMPR: A Novel LLM-Driven Transfer Learning based Petition Ranking Model

Title: LLMPR: A Novel LLM-Driven Transfer Learning based Petition Ranking Model

LLMPR: Ein neuartiges LLM-getriebenes Transfer-Learning-basiertes Petitions-Ranking-Modell

LLMPR:基于请愿排级的新式LLM-驱动转移学习模式 2505.21689v1

Authors: Avijit Gayen, Somyajit Chakraborty, Mainak Sen, Soham Paul, Angshuman Jana

The persistent accumulation of unresolved legal cases, especially within the Indian judiciary, significantly hampers the timely delivery of justice. Manual methods of prioritizing petitions are often prone to inefficiencies and subjective biases, further exacerbating delays. To address this issue, we propose LLMPR (Large Language Model-based Petition Ranking), an automated framework that utilizes transfer learning and machine learning to assign priority rankings to legal petitions based on their contextual urgency. Leveraging the ILDC dataset comprising 7,593 annotated petitions, we process unstructured legal text and extract features through various embedding techniques, including DistilBERT, LegalBERT, and MiniLM. These textual embeddings are combined with quantitative indicators such as gap days, rank scores, and word counts to train multiple machine learning models, including Random Forest, Decision Tree, XGBoost, LightGBM, and CatBoost. Our experiments demonstrate that Random Forest and Decision Tree models yield superior performance, with accuracy exceeding 99% and a Spearman rank correlation of 0.99. Notably, models using only numerical features achieve nearly optimal ranking results ($R^2$ = 0.988, $\rho$ = 0.998), while LLM-based embeddings offer only marginal gains. These findings suggest that automated petition ranking can effectively streamline judicial workflows, reduce case backlog, and improve fairness in legal prioritization.
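Illustrative sketch (not the authors' evaluation code): how a Spearman rank correlation like the one reported above can be computed for a predicted ranking. The function names are hypothetical, and the closed-form expression used here assumes no tied values.

```python
def ranks(values):
    """Assign ranks 1..n by ascending value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(a, b):
    """Spearman correlation via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid without ties."""
    n = len(a)
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))
```

A value near 1 means the predicted priority order almost matches the reference order, independent of the scale of the underlying scores.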

Article 815

Title@2025-05-27 (2): Empirical analysis of binding precedent efficiency in Brazilian Supreme Court via case classification

Title: Empirical analysis of binding precedent efficiency in Brazilian Supreme Court via case classification

Empirische Analyse der verbindlichen Präzedenzeffizienz im brasilianischen Obersten Gerichtshof über die Fallklassifizierung

通过案件分类对巴西最高法院具有约束力的先例效率进行经验分析 2407.07004v3

Authors: Raphaël Tinarrage, Henrique Ennes, Lucas Resck, Lucas T. Gomes, Jean R. Ponciano, Jorge Poco

Binding precedents (súmulas vinculantes) constitute a juridical instrument unique to the Brazilian legal system and whose objectives include the protection of the Federal Supreme Court against repetitive demands. Studies of the effectiveness of these instruments in decreasing the Court’s exposure to similar cases, however, indicate that they tend to fail in such a direction, with some of the binding precedents seemingly creating new demands. We empirically assess the legal impact of five binding precedents, 11, 14, 17, 26, and 37, at the highest Court level through their effects on the legal subjects they address. This analysis is only possible through the comparison of the Court’s ruling about the precedents’ themes before they are created, which means that these decisions should be detected through techniques of Similar Case Retrieval, which we tackle from the angle of Case Classification. The contributions of this article are therefore twofold: on the mathematical side, we compare the use of different methods of Natural Language Processing (TF-IDF, LSTM, Longformer, and regex) for Case Classification, whereas on the legal side, we contrast the inefficiency of these binding precedents with a set of hypotheses that may justify their repeated usage. We observe that the TF-IDF models performed slightly better than LSTM and Longformer when compared through common metrics; however, the deep learning models were able to detect certain important legal events that TF-IDF missed. On the legal side, we argue that the reasons for binding precedents to fail in responding to repetitive demand are heterogeneous and case-dependent, making it impossible to single out a specific cause. We identify five main hypotheses, which are found in different combinations in each of the precedents studied.

Article 816

Title@2025-05-27 (2): Probabilistic Reasoning with LLMs for k-anonymity Estimation

Title: Probabilistic Reasoning with LLMs for k-anonymity Estimation

Probabilistische Begründung mit LLMs für k-Anonymitätsschätzung

K-匿名性估计法LLMs的概率推理 2503.09674v3

Authors: Jonathan Zheng, Sauvik Das, Alan Ritter, Wei Xu

Probabilistic reasoning is a key aspect of both human and artificial intelligence that allows for handling uncertainty and ambiguity in decision-making. In this paper, we introduce a new numerical reasoning task under uncertainty for large language models, focusing on estimating the privacy risk of user-generated documents containing privacy-sensitive information. We propose BRANCH, a new LLM methodology that estimates the k-privacy value of a text, that is, the size of the population matching the given information. BRANCH factorizes a joint probability distribution of personal information as random variables. The probability of each factor in a population is estimated separately using a Bayesian network and combined to compute the final k-value. Our experiments show that this method successfully estimates the k-value 73% of the time, a 13% increase compared to o3-mini with chain-of-thought reasoning. We also find that LLM uncertainty is a good indicator for accuracy, as high-variance predictions are 37.47% less accurate on average.
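Illustrative sketch (not the BRANCH implementation): the arithmetic behind combining per-factor probabilities into a k-value. It assumes each probability is already conditioned on the factors before it, as in a chain-rule factorization. The function name and all numbers are hypothetical.

```python
def estimate_k(population, conditional_probs):
    """Expected number of people matching all attributes.

    conditional_probs[i] is assumed to be P(attribute_i | attributes before i),
    so the product is the joint probability of the full description.
    """
    joint = 1.0
    for p in conditional_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        joint *= p
    return population * joint
```

For example, with a hypothetical population of 1,000,000 and factor probabilities 0.5, 0.1, and 0.02, the estimate is about 1,000 matching people. Smaller k means a higher re-identification risk.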

Article 817

Title@2025-05-27 (2): Improving User Behavior Prediction: Leveraging Annotator Metadata in Supervised Machine Learning Models

Title: Improving User Behavior Prediction: Leveraging Annotator Metadata in Supervised Machine Learning Models

Verbesserung der Benutzerverhaltensvorhersage: Annotator-Metadaten in überwachten Machine Learning-Modellen nutzen

改进用户行为预测:在受监督的机器学习模型中利用标记元数据 2503.21000v2

Authors: Lynnette Hui Xian Ng, Kokil Jaidka, Kaiyuan Tay, Hansin Ahuja, Niyati Chhaya

Supervised machine-learning models often underperform in predicting user behaviors from conversational text, hindered by poor crowdsourced label quality and low NLP task accuracy. We introduce the Metadata-Sensitive Weighted-Encoding Ensemble Model (MSWEEM), which integrates annotator meta-features like fatigue and speeding. First, our results show MSWEEM outperforms standard ensembles by 14% on held-out data and 12% on an alternative dataset. Second, we find that incorporating signals of annotator behavior, such as speed and fatigue, significantly boosts model performance. Third, we find that annotators with higher qualifications, such as Master’s, deliver more consistent and faster annotations. Given the increasing uncertainty over annotation quality, our experiments show that understanding annotator patterns is crucial for enhancing model accuracy in user behavior prediction.

Article 818

Title@2025-05-27 (2): tenSVD algorithm for compression

Title: tenSVD algorithm for compression

tenSVD-Algorithmus zur Kompression

用于压缩的 10SVD 算法 2505.21686v1

Authors: Michele Gallo

Tensors provide a robust framework for managing high-dimensional data. Consequently, tensor analysis has emerged as an active research area in various domains, including machine learning, signal processing, computer vision, graph analysis, and data mining. This study introduces an efficient image storage approach utilizing tensors, aiming to minimize the memory needed for storage, the bandwidth needed for transmission, and the energy needed for processing. The proposed method organizes original data into a higher-order tensor and applies the Tucker model for compression. Implemented in R, this method is compared to a baseline algorithm. The evaluation focuses on the efficiency of the algorithm, measured in terms of computational time and the quality of information preserved, using both simulated and real datasets. A detailed analysis of the results is conducted, employing established quantitative metrics, with significant attention paid to sustainability in terms of energy consumption across algorithms.
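Illustrative sketch (not the tenSVD implementation, which the abstract says is written in R): the storage arithmetic that makes a Tucker model a compression scheme. A Tucker decomposition stores one core tensor plus one factor matrix per mode. The function names, image shape, and ranks below are hypothetical.

```python
from math import prod

def tucker_storage(shape, ranks):
    """Number of stored values: core tensor plus one (n_i x r_i) factor per mode."""
    core = prod(ranks)
    factors = sum(n * r for n, r in zip(shape, ranks))
    return core + factors

def compression_ratio(shape, ranks):
    """Ratio of original element count to Tucker element count."""
    return prod(shape) / tucker_storage(shape, ranks)
```

For a hypothetical 256 x 256 x 3 image with ranks (32, 32, 3), the Tucker form stores 19,465 values instead of 196,608, a ratio of roughly 10. The reconstruction quality at those ranks has to be measured separately.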

Article 819

Title@2025-05-27 (2): Edit Distance Robust Watermarks via Indexing Pseudorandom Codes

Title: Edit Distance Robust Watermarks via Indexing Pseudorandom Codes

Entfernung bearbeiten Robuste Wasserzeichen über Indexierung Pseudorandom Codes

通过索引化 Peredorandom 代码编辑远程硬体水印 2406.02633v2

Authors: Noah Golowich, Ankur Moitra

Motivated by the problem of detecting AI-generated text, we consider the problem of watermarking the output of language models with provable guarantees. We aim for watermarks which satisfy: (a) undetectability, a cryptographic notion introduced by Christ, Gunn & Zamir (2024) which stipulates that it is computationally hard to distinguish watermarked language model outputs from the model’s actual output distribution; and (b) robustness to channels which introduce a constant fraction of adversarial insertions, substitutions, and deletions to the watermarked text. Earlier schemes could only handle stochastic substitutions and deletions, and thus we are aiming for a more natural and appealing robustness guarantee that holds with respect to edit distance. Our main result is a watermarking scheme which achieves both undetectability and robustness to edits when the alphabet size for the language model is allowed to grow as a polynomial in the security parameter. To derive such a scheme, we follow an approach introduced by Christ & Gunn (2024), which proceeds via first constructing pseudorandom codes satisfying undetectability and robustness properties analogous to those above; our key idea is to handle adversarial insertions and deletions by interpreting the symbols as indices into the codeword, which we call indexing pseudorandom codes. Additionally, our codes rely on weaker computational assumptions than used in previous work. Then we show that there is a generic transformation from such codes over large alphabets to watermarking schemes for arbitrary language models.
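Illustrative sketch (not part of the paper's construction): the edit distance that the robustness guarantee above is stated in, computed with the standard dynamic program over insertions, deletions, and substitutions. The helper `within_budget` and its `fraction` parameter are hypothetical and only show what "a constant fraction of edits" means operationally.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions from a to b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

def within_budget(original, edited, fraction):
    """True if the edited text is within fraction * len(original) edits."""
    return edit_distance(original, edited) <= fraction * len(original)
```

A scheme robust in this sense should still detect the watermark in any `edited` text for which `within_budget` holds at the guaranteed fraction.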

Article 820

Title@2025-05-27 (2): Incentivizing Permissionless Distributed Learning of LLMs

Title: Incentivizing Permissionless Distributed Learning of LLMs

Anreize für das unbefugte Lernen von LLMs

激励对LLMM的无自由分配的学习 2505.21684v1

Authors: Joel Lidin, Amir Sarfi, Evangelos Pappas, Samuel Dare, Eugene Belilovsky, Jacob Steeves

We describe an incentive system for distributed deep learning of foundational models where peers are rewarded for contributions. The incentive system, \textit{Gauntlet}, has been deployed on the bittensor blockchain and used to train a 1.2B LLM with completely permissionless contributions of pseudo-gradients: no control over the users that can register or their hardware. \textit{Gauntlet} can be applied to any synchronous distributed training scheme that relies on aggregating updates or pseudo-gradients. We rely on a two-stage mechanism for fast filtering of peer uptime, reliability, and synchronization, combined with the core component that estimates the loss before and after individual pseudo-gradient contributions. We utilized an OpenSkill rating system to track competitiveness of pseudo-gradient scores across time. Finally, we introduce a novel mechanism to ensure peers on the network perform unique computations. Our live 1.2B run, which has paid out real-valued tokens to participants based on the value of their contributions, yielded a competitive (on a per-iteration basis) 1.2B model that demonstrates the utility of our incentive system.

Article 821

Title@2025-05-27 (2): multivariateGPT: a decoder-only transformer for multivariate categorical and numeric data

Title: multivariateGPT: a decoder-only transformer for multivariate categorical and numeric data

multivariateGPT: ein nur Decoder-Transformator für multivariate kategoriale und numerische Daten

多个变量GPT: 用于多变量绝对数据和数字数据的解码器专用变压器 2505.21680v1

Authors: Andrew J. Loza, Jun Yup Kim, Shangzheng Song, Yihang Liu, Joseph J. Y. Sung, R Andrew Taylor, Dennis L. Shung

Real-world processes often generate data that are a mix of categorical and numeric values that are recorded at irregular and informative intervals. Discrete token-based approaches are limited in numeric representation capacity while methods like neural ordinary differential equations are not well suited for categorical data or informative sampling and require augmentation to handle certain classes of trajectories. Here, we present multivariateGPT, a single architecture for modeling sequences of mixed categorical (including tokenized text) and numeric data. This is accomplished with an autoregressive sequence decomposition, embedding scheme, and loss function that extend the next token prediction task to likelihood estimation of the joint distribution of next token class and value. We demonstrate how this approach can efficiently learn to generalize patterns in simple physical systems and model complex time series including electrocardiograms and multivariate electronic health record data. This work extends the utility of transformer based models to additional classes of data.

Article 822

Title@2025-05-27 (2): Fast meta-solvers for 3D complex-shape scatterers using neural operators trained on a non-scattering problem

Title: Fast meta-solvers for 3D complex-shape scatterers using neural operators trained on a non-scattering problem

Schnelle Meta-Lösung für 3D-Komplex-Spritzer mit neuronalen Operatoren, die auf einem nicht-streuenden Problem geschult sind

使用神经操作员就非碎裂问题接受培训的3D复合碎片散散射器快速元解析器 2405.12380v2

Authors: Youngkyu Lee, Shanqing Liu, Zongren Zou, Adar Kahana, Eli Turkel, Rishikesh Ranade, Jay Pathak, George Em Karniadakis

Three-dimensional target identification using scattering techniques requires high accuracy solutions and very fast computations for real-time predictions in some critical applications. We first train a deep neural operator (DeepONet) to solve wave propagation problems described by the Helmholtz equation in a domain \textit{without scatterers} but at different wavenumbers and with a complex absorbing boundary condition. We then design two classes of fast meta-solvers by combining DeepONet with either relaxation methods, such as Jacobi and Gauss-Seidel, or with Krylov methods, such as GMRES and BiCGStab, using the trunk basis of DeepONet as a coarse-scale preconditioner. We leverage the spectral bias of neural networks to account for the lower part of the spectrum in the error distribution while the upper part is handled inexpensively using relaxation methods or fine-scale preconditioners. The meta-solvers are then applied to solve scattering problems with different shapes of scatterers, at no extra training cost. We first demonstrate that the resulting meta-solvers are shape-agnostic, fast, and robust, whereas the standard standalone solvers may even fail to converge without the DeepONet. We then apply both classes of meta-solvers to scattering from a submarine, a complex three-dimensional problem. We achieve very fast solutions, especially with the DeepONet-Krylov methods, which require orders of magnitude fewer iterations than any of the standalone solvers.

Article 823

Title@2025-05-27 (2): Robust LLM Alignment via Distributionally Robust Direct Preference Optimization

Title: Robust LLM Alignment via Distributionally Robust Direct Preference Optimization

Robuste LLM-Ausrichtung über distributiv robuste Direktpräferenzoptimierung

通过分布式强力直接首选项优化对齐 2502.01930v2

Authors: Zaiyan Xu, Sushil Vemuri, Kishan Panaganti, Dileep Kalathil, Rahul Jain, Deepak Ramachandran

A major challenge in aligning large language models (LLMs) with human preferences is the issue of distribution shift. LLM alignment algorithms rely on static preference datasets, assuming that they accurately represent real-world user preferences. However, user preferences vary significantly across geographical regions, demographics, linguistic patterns, and evolving cultural trends. This preference distribution shift leads to catastrophic alignment failures in many real-world applications. We address this problem using the principled framework of distributionally robust optimization, and develop two novel distributionally robust direct preference optimization (DPO) algorithms, namely, Wasserstein DPO (WDPO) and Kullback-Leibler DPO (KLDPO). We characterize the sample complexity of learning the optimal policy parameters for WDPO and KLDPO. Moreover, we propose scalable gradient descent-style learning algorithms by developing suitable approximations for the challenging minimax loss functions of WDPO and KLDPO. Our empirical experiments using benchmark data sets and LLMs demonstrate the superior performance of WDPO and KLDPO in substantially improving the alignment when there is a preference distribution shift.

Article 824

Title@2025-05-27 (2): What happens when generative AI models train recursively on each others’ generated outputs?

Title: What happens when generative AI models train recursively on each others’ generated outputs?

Was passiert, wenn generative KI-Modelle rekursiv auf den jeweils anderen generierten Ausgängen trainieren?

当基因化的AI模型对彼此产生的产出进行回溯性培训时会怎样呢? 2505.21677v1

Authors: Hung Ahn Vu, Galen Reeves, Emily Wenger

The internet is full of AI-generated content while also serving as a common source of training data for generative AI (genAI) models. This duality raises the possibility that future genAI models may be trained on other models’ generated outputs. Prior work has studied consequences of models training on their own generated outputs, but limited work has considered what happens if models ingest content produced by other models. Given society’s increasing dependence on genAI tools, understanding downstream effects of such data-mediated model interactions is critical. To this end, we provide empirical evidence for how data-mediated interactions might unfold in practice, develop a theoretical model for this interactive training process, and show experimentally possible long-term results of such interactions. We find that data-mediated interactions can benefit models by exposing them to novel concepts perhaps missed in original training data, but also can homogenize their performance on shared tasks.

Article 825

Title@2025-05-27 (2): In-Context Linear Regression Demystified: Training Dynamics and Mechanistic Interpretability of Multi-Head Softmax Attention

Title: In-Context Linear Regression Demystified: Training Dynamics and Mechanistic Interpretability of Multi-Head Softmax Attention

In-Context Lineare Regression Demystified: Trainingsdynamik und mechanistische Interpretierbarkeit von Multi-Head Softmax Achtung

内负线倒退:对多头软体注意力进行动态和机械解释的培训 2503.12734v2

Authors: Jianliang He, Xintian Pan, Siyu Chen, Zhuoran Yang

We study how multi-head softmax attention models are trained to perform in-context learning on linear data. Through extensive empirical experiments and rigorous theoretical analysis, we demystify the emergence of elegant attention patterns: a diagonal and homogeneous pattern in the key-query (KQ) weights, and a last-entry-only and zero-sum pattern in the output-value (OV) weights. Remarkably, these patterns consistently appear from gradient-based training starting from random initialization. Our analysis reveals that such emergent structures enable multi-head attention to approximately implement a debiased gradient descent predictor – one that outperforms single-head attention and nearly achieves Bayesian optimality up to a proportional factor. Furthermore, compared to linear transformers, the softmax attention readily generalizes to sequences longer than those seen during training. We also extend our study to scenarios with anisotropic covariates and multi-task linear regression. In the former, multi-head attention learns to implement a form of pre-conditioned gradient descent. In the latter, we uncover an intriguing regime where the interplay between head number and task number triggers a superposition phenomenon that efficiently resolves multi-task in-context learning. Our results reveal that in-context learning ability emerges from the trained transformer as an aggregated effect of its architecture and the underlying data distribution, paving the way for deeper understanding and broader applications of in-context learning.

Article 826

Title@2025-05-27 (2): Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations

Title: Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations

Schnelles lebenslanges Adaptives Inverses Verstärktes Lernen aus Demonstrationen

从示范活动中学习 2209.11908v8

Authors: Letian Chen, Sravan Jayanthi, Rohan Paleja, Daniel Martin, Viacheslav Zakharov, Matthew Gombolay

Learning from Demonstration (LfD) approaches empower end-users to teach robots novel tasks via demonstrations of the desired behaviors, democratizing access to robotics. However, current LfD frameworks are capable of neither fast adaptation to heterogeneous human demonstrations nor large-scale deployment in ubiquitous robotics applications. In this paper, we propose a novel LfD framework, Fast Lifelong Adaptive Inverse Reinforcement learning (FLAIR). Our approach (1) leverages learned strategies to construct policy mixtures for fast adaptation to new demonstrations, allowing for quick end-user personalization; (2) distills common knowledge across demonstrations, achieving accurate task inference; and (3) expands its model only when needed in lifelong deployments, maintaining a concise set of prototypical strategies that can approximate all behaviors via policy mixtures. We empirically validate that FLAIR achieves adaptability (i.e., the robot adapts to heterogeneous, user-specific task preferences), efficiency (i.e., the robot achieves sample-efficient adaptation), and scalability (i.e., the model grows sublinearly with the number of demonstrations while maintaining high performance). FLAIR surpasses benchmarks across three control tasks with an average 57% improvement in policy returns and an average 78% fewer episodes required for demonstration modeling using policy mixtures. Finally, we demonstrate the success of FLAIR in a table tennis task and find users rate FLAIR as having higher task (p<.05) and personalization (p<.05) performance.

Article 827

Title@2025-05-27 (2): Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing

Title: Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing

Adaptive Frontier Exploration von Graphen mit Anwendungen für netzwerkbasierte Krankheitstests

适应性边界探索应用网络基疾病测试图图的适应性边界探索 2505.21671v1

Authors: Davin Choo, Yuqi Pan, Tonghan Wang, Milind Tambe, Alastair van Heerden, Cheryl Johnson

We study a sequential decision-making problem on a $n$-node graph $G$ where each node has an unknown label from a finite set $\mathbf{\Sigma}$, drawn from a joint distribution $P$ that is Markov with respect to $G$. At each step, selecting a node reveals its label and yields a label-dependent reward. The goal is to adaptively choose nodes to maximize expected accumulated discounted rewards. We impose a frontier exploration constraint, where actions are limited to neighbors of previously selected nodes, reflecting practical constraints in settings such as contact tracing and robotic exploration. We design a Gittins index-based policy that applies to general graphs and is provably optimal when $G$ is a forest. Our implementation runs in $O(n^2 \cdot |\mathbf{\Sigma}|^2)$ time while using $O(n \cdot |\mathbf{\Sigma}|^2)$ oracle calls to $P$ and $O(n^2 \cdot |\mathbf{\Sigma}|)$ space. Experiments on synthetic and real-world graphs show that our method consistently outperforms natural baselines, including in non-tree, budget-limited, and undiscounted settings. For example, in HIV testing simulations on real-world sexual interaction networks, our policy detects nearly all positive cases with only half the population tested, substantially outperforming other baselines.

Article 828

Title@2025-05-27 (2): Efficient Controllable Diffusion via Optimal Classifier Guidance

Title: Efficient Controllable Diffusion via Optimal Classifier Guidance

Effiziente steuerbare Diffusion über Optimal Classifier Guidance

通过最佳分类指南有效控制可控扩散 2505.21666v1

Authors: Owen Oertell, Shikun Sun, Yiding Chen, Jin Peng Zhou, Zhiyong Wang, Wen Sun

The controllable generation of diffusion models aims to steer the model to generate samples that optimize some given objective functions. It is desirable for a variety of applications including image generation, molecule generation, and DNA/sequence generation. Reinforcement Learning (RL) based fine-tuning of the base model is a popular approach but it can overfit the reward function while requiring significant resources. We frame controllable generation as a problem of finding a distribution that optimizes a KL-regularized objective function. We present SLCD – Supervised Learning based Controllable Diffusion, which iteratively generates online data and trains a small classifier to guide the generation of the diffusion model. Similar to the standard classifier-guided diffusion, SLCD’s key computation primitive is classification and does not involve any complex concepts from RL or control. Via a reduction to no-regret online learning analysis, we show that under KL divergence, the output from SLCD provably converges to the optimal solution of the KL-regularized objective. Further, we empirically demonstrate that SLCD can generate high quality samples with nearly the same inference time as the base model in both image generation with continuous diffusion and biological sequence generation with discrete diffusion. Our code is available at https://github.com/Owen-Oertell/slcd
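As a toy illustration of the KL-regularized objective mentioned above, the sketch below works on a small discrete distribution, where the maximizer of E_p[r] - beta * KL(p || p_ref) has the well-known closed form p*(x) proportional to p_ref(x) * exp(r(x) / beta). This is not the SLCD algorithm itself; the reference distribution, rewards, and beta are made-up values.

```python
import math


def kl_regularized_optimum(p_ref, rewards, beta):
    """Closed-form maximizer of E_p[r] - beta * KL(p || p_ref) on a finite set."""
    weights = [p * math.exp(r / beta) for p, r in zip(p_ref, rewards)]
    z = sum(weights)  # normalizing constant
    return [w / z for w in weights], z


def objective(p, p_ref, rewards, beta):
    """KL-regularized objective: expected reward minus beta times KL(p || p_ref)."""
    expected_reward = sum(pi * r for pi, r in zip(p, rewards))
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, p_ref) if pi > 0)
    return expected_reward - beta * kl


# Hypothetical three-outcome example.
p_ref = [0.5, 0.3, 0.2]
rewards = [0.0, 1.0, 2.0]
beta = 1.0

p_star, z = kl_regularized_optimum(p_ref, rewards, beta)
print([round(p, 4) for p in p_star])
print(objective(p_star, p_ref, rewards, beta) > objective(p_ref, p_ref, rewards, beta))
```

At the optimum the objective equals beta * log(z), which gives a quick sanity check on any candidate solution in this discrete setting.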

Article 829

Title@2025-05-27 (2): Constraint-Adaptive Policy Switching for Offline Safe Reinforcement Learning

Title: Constraint-Adaptive Policy Switching for Offline Safe Reinforcement Learning

Constraint-Adaptive Policy Switching für Offline-sicheres Ausbau-Lernen

离线安全强化学习约束性强化政策转换 2412.18946v2

Authors: Yassine Chemingui, Aryan Deshwal, Honghao Wei, Alan Fern, Janardhan Rao Doppa

Offline safe reinforcement learning (OSRL) involves learning a decision-making policy that maximizes rewards from a fixed batch of training data while satisfying pre-defined safety constraints. However, adapting to varying safety constraints during deployment without retraining remains an under-explored challenge. To address this challenge, we introduce constraint-adaptive policy switching (CAPS), a wrapper framework around existing offline RL algorithms. During training, CAPS uses offline data to learn multiple policies with a shared representation that optimize different reward and cost trade-offs. During testing, CAPS switches between those policies by selecting at each state the policy that maximizes future rewards among those that satisfy the current cost constraint. Our experiments on 38 tasks from the DSRL benchmark demonstrate that CAPS consistently outperforms existing methods, establishing a strong wrapper-based baseline for OSRL. The code is publicly available at https://github.com/yassineCh/CAPS.
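The test-time switching rule described above can be sketched as a small selection function. This is an illustrative reading of the abstract, not the released CAPS code: the per-policy reward and cost estimates are assumed to be given as plain lists, and the fallback to the lowest-cost policy when nothing is feasible is an assumption made here.

```python
def select_policy(reward_values, cost_values, cost_limit):
    """Return the index of the highest-reward policy whose estimated cost fits the limit."""
    feasible = [i for i, cost in enumerate(cost_values) if cost <= cost_limit]
    if not feasible:
        # Assumption (not from the abstract): if no policy is feasible,
        # fall back to the one with the lowest estimated cost.
        return min(range(len(cost_values)), key=cost_values.__getitem__)
    return max(feasible, key=reward_values.__getitem__)


# Hypothetical estimates for three policies at the current state.
rewards = [10.0, 7.0, 3.0]
costs = [9.0, 4.0, 1.0]

print(select_policy(rewards, costs, cost_limit=5.0))
print(select_policy(rewards, costs, cost_limit=10.0))
```

Because the constraint enters only through `cost_limit`, the same set of trained policies can serve different safety budgets without retraining.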

Article 830

Title@2025-05-27 (2): PreGenie: An Agentic Framework for High-quality Visual Presentation Generation

Title: PreGenie: An Agentic Framework for High-quality Visual Presentation Generation

PreGenie: Agentisches Framework für hochwertige visuelle Präsentationsgeneration

PreGenie:高质量视觉演示制作的代理框架 2505.21660v1

Authors: Xiaojie Xu, Xinli Xu, Sirui Chen, Haoyu Chen, Fan Zhang, Ying-Cong Chen

Visual presentations are vital for effective communication. Early attempts to automate their creation using deep learning often faced issues such as poorly organized layouts, inaccurate text summarization, and a lack of image understanding, leading to mismatched visuals and text. These limitations restrict their application in formal contexts like business and scientific research. To address these challenges, we propose PreGenie, an agentic and modular framework powered by multimodal large language models (MLLMs) for generating high-quality visual presentations. PreGenie is built on the Slidev presentation framework, where slides are rendered from Markdown code. It operates in two stages: (1) Analysis and Initial Generation, which summarizes multimodal input and generates initial code, and (2) Review and Re-generation, which iteratively reviews intermediate code and rendered slides to produce final, high-quality presentations. Each stage leverages multiple MLLMs that collaborate and share information. Comprehensive experiments demonstrate that PreGenie excels in multimodal understanding, outperforming existing models in both aesthetics and content consistency, while aligning more closely with human design preferences.

Article 831

Title@2025-05-27 (2): STACI: Spatio-Temporal Aleatoric Conformal Inference

Title: STACI: Spatio-Temporal Aleatoric Conformal Inference

STACI: Spatio-Temporale aleatorische Konforme Schlussfolgerung

STACI: 斯帕迪奥-时空空气迁移 2505.21658v1

Authors: Brandon R. Feng, David Keetae Park, Xihaier Luo, Arantxa Urdangarin, Shinjae Yoo, Brian J. Reich

Fitting Gaussian Processes (GPs) provides interpretable aleatoric uncertainty quantification for estimation of spatio-temporal fields. Spatio-temporal deep learning models, while scalable, typically assume a simplistic independent covariance matrix for the response, failing to capture the underlying correlation structure. However, spatio-temporal GPs suffer from issues of scalability and various forms of approximation bias resulting from restrictive assumptions of the covariance kernel function. We propose STACI, a novel framework consisting of a variational Bayesian neural network approximation of non-stationary spatio-temporal GP along with a novel spatio-temporal conformal inference algorithm. STACI is highly scalable, taking advantage of GPU training capabilities for neural network models, and provides statistically valid prediction intervals for uncertainty quantification. STACI outperforms competing GPs and deep methods in accurately approximating spatio-temporal processes and we show it easily scales to datasets with millions of observations.

Article 832

Title@2025-05-27 (2): Explainability of Large Language Models using SMILE: Statistical Model-agnostic Interpretability with Local Explanations

Title: Explainability of Large Language Models using SMILE: Statistical Model-agnostic Interpretability with Local Explanations

Erklärbarkeit großer Sprachmodelle mit SMILE: Statistische Modell-agnostische Interpretierbarkeit mit lokalen Erklärungen

使用SMILE解释大语言模型的可解释性:统计模型 – – 与当地解释的可解释性 2505.21657v1

Authors: Zeinab Dehghani, Koorosh Aslansefat, Adil Khan, Mohammed Naveed Akram

Large language models like GPT, LLAMA, and Claude have become incredibly powerful at generating text, but they are still black boxes, so it is hard to understand how they decide what to say. That lack of transparency can be problematic, especially in fields where trust and accountability matter. To help with this, we introduce SMILE, a new method that explains how these models respond to different parts of a prompt. SMILE is model-agnostic and works by slightly changing the input, measuring how the output changes, and then highlighting which words had the most impact. It then creates simple visual heat maps showing which parts of a prompt matter the most. We tested SMILE on several leading LLMs and used metrics such as accuracy, consistency, stability, and fidelity to show that it gives clear and reliable explanations. By making these models easier to understand, SMILE brings us one step closer to making AI more transparent and trustworthy.
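The perturb-and-measure idea can be illustrated with a much simpler stand-in: drop one word at a time and record how much a score changes. This leave-one-word-out sketch is not the SMILE procedure; `toy_score` is a hypothetical placeholder for whatever model output would be measured.

```python
def word_importance(prompt, score_fn):
    """Score each word by how much removing it changes the output of score_fn."""
    words = prompt.split()
    base = score_fn(" ".join(words))
    importances = []
    for i, word in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])  # prompt without word i
        importances.append((word, abs(base - score_fn(perturbed))))
    return importances


def toy_score(text):
    """Hypothetical stand-in for a model: responds only to the word 'refund'."""
    return 1.0 if "refund" in text.split() else 0.0


for word, weight in word_importance("please issue a refund today", toy_score):
    print(f"{word:>8}  {weight:.1f}")
```

The per-word weights are the values a heat map over the prompt would display.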

Article 833

Title@2025-05-27 (2): BACON: A fully explainable AI model with graded logic for decision making problems

Title: BACON: A fully explainable AI model with graded logic for decision making problems

BACON: Ein voll erklärbares KI-Modell mit abgestufter Logik für Entscheidungsprobleme

具有决策问题分级逻辑的完全可解释的AI模型 2505.14510v3

Authors: Haishi Bai, Jozo Dujmovic, Jianwu Wang

As machine learning models and autonomous agents are increasingly deployed in high-stakes, real-world domains such as healthcare, security, finance, and robotics, the need for transparent and trustworthy explanations has become critical. To ensure end-to-end transparency of AI decisions, we need models that are not only accurate but also fully explainable and human-tunable. We introduce BACON, a novel framework for automatically training explainable AI models for decision making problems using graded logic. BACON achieves high predictive accuracy while offering full structural transparency and precise, logic-based symbolic explanations, enabling effective human-AI collaboration and expert-guided refinement. We evaluate BACON with a diverse set of scenarios: classic Boolean approximation, Iris flower classification, house purchasing decisions and breast cancer diagnosis. In each case, BACON provides high-performance models while producing compact, human-verifiable decision logic. These results demonstrate BACON’s potential as a practical and principled approach for delivering crisp, trustworthy explainable AI.

Article 834

Title@2025-05-27 (2): AutoSGD: Automatic Learning Rate Selection for Stochastic Gradient Descent

Title: AutoSGD: Automatic Learning Rate Selection for Stochastic Gradient Descent

AutoSGD: Automatische Lernrate-Auswahl für stochastische Gradient Descent

AutoSGD: 存储渐变后代自动学习率选择 2505.21651v1

Authors: Nikola Surjanovic, Alexandre Bouchard-Côté, Trevor Campbell

The learning rate is an important tuning parameter for stochastic gradient descent (SGD) and can greatly influence its performance. However, appropriate selection of a learning rate schedule across all iterations typically requires a non-trivial amount of user tuning effort. To address this, we introduce AutoSGD: an SGD method that automatically determines whether to increase or decrease the learning rate at a given iteration and then takes appropriate action. We introduce theory supporting the convergence of AutoSGD, along with its deterministic counterpart for standard gradient descent. Empirical results suggest strong performance of the method on a variety of traditional optimization problems and machine learning tasks.

Article 835

Title@2025-05-27 (2): QuARI: Query Adaptive Retrieval Improvement

Title: QuARI: Query Adaptive Retrieval Improvement

QUARI: Abfrage Adaptive Verbesserung des Retrievals

QuARI: 查询适应性检索改进 2505.21647v1

Authors: Eric Xing, Abby Stylianou, Robert Pless, Nathan Jacobs

Massive-scale pretraining has made vision-language models increasingly popular for image-to-image and text-to-image retrieval across a broad collection of domains. However, these models do not perform well when used for challenging retrieval tasks, such as instance retrieval in very large-scale image collections. Recent work has shown that linear transformations of VLM features trained for instance retrieval can improve performance by emphasizing subspaces that relate to the domain of interest. In this paper, we explore a more extreme version of this specialization by learning to map a given query to a query-specific feature space transformation. Because this transformation is linear, it can be applied with minimal computational cost to millions of image embeddings, making it effective for large-scale retrieval or re-ranking. Results show that this method consistently outperforms state-of-the-art alternatives, including those that require many orders of magnitude more computation at query time.
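A minimal sketch of why a linear, query-specific transformation is cheap to apply: it is one matrix-vector product per embedding, after which ranking proceeds as usual. How the transformation is predicted from the query is not shown, and applying it to both the query and the gallery is an assumption made here; the matrix and embeddings are toy values.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rank_gallery(query, gallery, transform=None):
    """Rank gallery embeddings by dot-product similarity, optionally after a linear map."""
    if transform is not None:
        query = matvec(transform, query)
        gallery = [matvec(transform, g) for g in gallery]
    scores = [dot(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


# Toy 2-D embeddings; the transform emphasizes the first dimension.
query = [0.6, 0.8]
gallery = [[1.0, 0.0], [0.0, 1.0]]
transform = [[2.0, 0.0], [0.0, 0.1]]

print(rank_gallery(query, gallery))
print(rank_gallery(query, gallery, transform))
```

In this toy case the emphasized subspace flips the ranking, which is the kind of effect a learned transformation is meant to produce for the domain of interest.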

Article 836

Title@2025-05-27 (2): PrivATE: Differentially Private Confidence Intervals for Average Treatment Effects

Title: PrivATE: Differentially Private Confidence Intervals for Average Treatment Effects

Private: Differenzielle private Vertrauensintervalle für durchschnittliche Behandlungseffekte

普里瓦特:对平均待遇影响有区别的私人信任互换 2505.21641v1

Authors: Maresa Schröder, Justin Hartenstein, Stefan Feuerriegel

The average treatment effect (ATE) is widely used to evaluate the effectiveness of drugs and other medical interventions. In safety-critical applications like medicine, reliable inferences about the ATE typically require valid uncertainty quantification, such as through confidence intervals (CIs). However, estimating treatment effects in these settings often involves sensitive data that must be kept private. In this work, we present PrivATE, a novel machine learning framework for computing CIs for the ATE under differential privacy. Specifically, we focus on deriving valid privacy-preserving CIs for the ATE from observational data. Our PrivATE framework consists of three steps: (i) estimating a differentially private ATE through output perturbation; (ii) estimating the differentially private variance through a truncated output perturbation mechanism; and (iii) constructing the CIs while accounting for the uncertainty from both the estimation and privatization steps. Our PrivATE framework is model agnostic, doubly robust, and ensures valid CIs. We demonstrate the effectiveness of our framework using synthetic and real-world medical datasets. To the best of our knowledge, we are the first to derive a general, doubly robust framework for valid CIs of the ATE under ($\varepsilon$, $\delta$)-differential privacy.
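Output perturbation, the mechanism named in step (i), can be illustrated with the textbook Laplace mechanism on a bounded mean. This is a generic sketch under stated assumptions, not the PrivATE estimator: it gives pure epsilon-differential privacy for a simple mean with a known sample size, whereas the paper targets the ATE under ($\varepsilon$, $\delta$)-differential privacy. The bounds and data are hypothetical.

```python
import random


def laplace_scale(lower, upper, n, epsilon):
    """Noise scale for the mean of n values clipped to [lower, upper]."""
    sensitivity = (upper - lower) / n  # one record can shift the mean by at most this
    return sensitivity / epsilon


def private_mean(values, lower, upper, epsilon, rng):
    """Clip, average, then add Laplace noise calibrated to the sensitivity."""
    clipped = [min(max(v, lower), upper) for v in values]
    mean = sum(clipped) / len(clipped)
    scale = laplace_scale(lower, upper, len(clipped), epsilon)
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return mean + noise


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
outcomes = [0.2, 0.9, 0.4, 0.7, 0.5] * 200  # hypothetical bounded outcomes
print(private_mean(outcomes, lower=0.0, upper=1.0, epsilon=1.0, rng=rng))
```

A valid confidence interval then has to account for this added noise as well as the sampling error, which is what step (iii) of the framework addresses.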

Article 837

Title@2025-05-27 (2): Efficient Diffusion Models for Symmetric Manifolds

Title: Efficient Diffusion Models for Symmetric Manifolds

Effiziente Diffusionsmodelle für symmetrische Manifolds

高效扩散对称操纵模型 2505.21640v1

Authors: Oren Mangoubi, Neil He, Nisheeth K. Vishnoi

We introduce a framework for designing efficient diffusion models for $d$-dimensional symmetric-space Riemannian manifolds, including the torus, sphere, special orthogonal group and unitary group. Existing manifold diffusion models often depend on heat kernels, which lack closed-form expressions and require either $d$ gradient evaluations or exponential-in-$d$ arithmetic operations per training step. We introduce a new diffusion model for symmetric manifolds with a spatially-varying covariance, allowing us to leverage a projection of Euclidean Brownian motion to bypass heat kernel computations. Our training algorithm minimizes a novel efficient objective derived via Ito’s Lemma, allowing each step to run in $O(1)$ gradient evaluations and nearly-linear-in-$d$ ($O(d^{1.19})$) arithmetic operations, reducing the gap between diffusions on symmetric manifolds and Euclidean space. Manifold symmetries ensure the diffusion satisfies an “average-case” Lipschitz condition, enabling accurate and efficient sample generation. Empirically, our model outperforms prior methods in training speed and improves sample quality on synthetic datasets on the torus, special orthogonal group, and unitary group.

Article 838

Title@2025-05-27 (2): Apprenticeship learning with prior beliefs using inverse optimization

Title: Apprenticeship learning with prior beliefs using inverse optimization

Lehrlingsstudium mit früheren Überzeugungen mit inverser Optimierung

利用反向优化进行具有先入先信的学徒学习 2505.21639v1

Authors: Mauricio Junca, Esteban Leiva

The relationship between inverse reinforcement learning (IRL) and inverse optimization (IO) for Markov decision processes (MDPs) has been relatively underexplored in the literature, despite addressing the same problem. In this work, we revisit the relationship between the IO framework for MDPs, IRL, and apprenticeship learning (AL). We incorporate prior beliefs on the structure of the cost function into the IRL and AL problems, and demonstrate that the convex-analytic view of the AL formalism (Kamoutsi et al., 2021) emerges as a relaxation of our framework. Notably, the AL formalism is a special case in our framework when the regularization term is absent. Focusing on the suboptimal expert setting, we formulate the AL problem as a regularized min-max problem. The regularizer plays a key role in addressing the ill-posedness of IRL by guiding the search for plausible cost functions. To solve the resulting regularized-convex-concave-min-max problem, we use stochastic mirror descent (SMD) and establish convergence bounds for the proposed method. Numerical experiments highlight the critical role of regularization in learning cost vectors and apprentice policies.

Article 839

Title@2025-05-27 (2): Is Your LLM Overcharging You? Tokenization, Transparency, and Incentives

Title: Is Your LLM Overcharging You? Tokenization, Transparency, and Incentives

Ist Ihr LLM überladen Sie? Tokenization, Transparenz, und Incentives

您的法学硕士是否对你太过苛刻? 2505.21627v1

Authors: Ander Artola Velasco, Stratis Tsirtsis, Nastaran Okati, Manuel Gomez-Rodriguez

State-of-the-art large language models require specialized hardware and substantial energy to operate. As a consequence, cloud-based services that provide access to large language models have become very popular. In these services, the price users pay for an output provided by a model depends on the number of tokens the model uses to generate it – they pay a fixed price per token. In this work, we show that this pricing mechanism creates a financial incentive for providers to strategize and misreport the (number of) tokens a model used to generate an output, and users cannot prove, or even know, whether a provider is overcharging them. However, we also show that, if an unfaithful provider is obliged to be transparent about the generative process used by the model, misreporting optimally without raising suspicion is hard. Nevertheless, as a proof-of-concept, we introduce an efficient heuristic algorithm that allows providers to significantly overcharge users without raising suspicion, highlighting the vulnerability of users under the current pay-per-token pricing mechanism. Further, to completely eliminate the financial incentive to strategize, we introduce a simple incentive-compatible token pricing mechanism. Under this mechanism, the price users pay for an output provided by a model depends on the number of characters of the output – they pay a fixed price per character. Along the way, to illustrate and complement our theoretical results, we conduct experiments with several large language models from the $\texttt{Llama}$, $\texttt{Gemma}$ and $\texttt{Ministral}$ families, and input prompts from the LMSYS Chatbot Arena platform.
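The difference between the two pricing rules can be seen in a few lines. In this sketch the same output string is reported under two tokenizations; the token splits and the integer prices are made up for illustration.

```python
def bill_per_token(tokens, price_per_token):
    """Pay-per-token: the bill depends on how many tokens are reported."""
    return len(tokens) * price_per_token


def bill_per_character(tokens, price_per_character):
    """Pay-per-character: the bill depends only on the output string itself."""
    return sum(len(token) for token in tokens) * price_per_character


# Two hypothetical tokenizations of the same output, "tokenization".
short_report = ["token", "ization"]
long_report = ["tok", "en", "iz", "ation"]
assert "".join(short_report) == "".join(long_report)

print(bill_per_token(short_report, 3), bill_per_token(long_report, 3))
print(bill_per_character(short_report, 1), bill_per_character(long_report, 1))
```

Under per-token pricing the longer report doubles the bill for an identical string, while the per-character bill is the same for both, so reporting more tokens brings no gain.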

Article 840

Title@2025-05-27 (2): Localized Weather Prediction Using Kolmogorov-Arnold Network-Based Models and Deep RNNs

Title: Localized Weather Prediction Using Kolmogorov-Arnold Network-Based Models and Deep RNNs

Lokalisierte Wettervorhersage mit Kolmogorov-Arnold-Netzwerk-basierten Modellen und tiefen RNNs

利用Kolmogorov-Arnold网络模型和深区域网网 2505.22686v1

Authors: Ange-Clement Akazan, Verlon Roel Mbingui, Gnankan Landry Regis N’guessan, Issa Karambal

Weather forecasting is crucial for risk management and economic planning, particularly in tropical Africa, where extreme events severely impact livelihoods. Yet, existing forecasting methods often struggle with the region’s complex, non-linear weather patterns. This study benchmarks deep recurrent neural networks such as $\texttt{LSTM}$, $\texttt{GRU}$, $\texttt{BiLSTM}$, and $\texttt{BiGRU}$, and Kolmogorov-Arnold-based models ($\texttt{KAN}$ and $\texttt{TKAN}$) for daily forecasting of temperature, precipitation, and pressure in two tropical cities: Abidjan, Côte d’Ivoire (Ivory Coast) and Kigali (Rwanda). We further introduce two customized variants of $\texttt{TKAN}$ that replace its original $\texttt{SiLU}$ activation function with $\texttt{GeLU}$ and $\texttt{MiSH}$, respectively. Using station-level meteorological data spanning from 2010 to 2024, we evaluate all the models on standard regression metrics. $\texttt{KAN}$ achieves highly accurate temperature prediction ($R^2=0.9986$ in Abidjan, $0.9998$ in Kigali, $\texttt{MSE} < 0.0014~^\circ\mathrm{C}^2$), while $\texttt{TKAN}$ variants minimize absolute errors for precipitation forecasting in low-rainfall regimes. The customized $\texttt{TKAN}$ models demonstrate improvements over the standard $\texttt{TKAN}$ across both datasets. Classical $\texttt{RNNs}$ remain highly competitive for atmospheric pressure ($R^2 \approx 0.83{-}0.86$), outperforming $\texttt{KAN}$-based models in this task. These results highlight the potential of spline-based neural architectures for efficient and data-efficient forecasting.

Article 841

Title@2025-05-27 (2): Learning Where to Learn: Training Distribution Selection for Provable OOD Performance

Title: Learning Where to Learn: Training Distribution Selection for Provable OOD Performance

Lernen, wo man lernen kann: Training Distribution Selection for Provable OOD Performance

学习从何学习:选择培训分布,以选择可实现的OOD业绩 2505.21626v1

Authors: Nicolas Guerra, Nicholas H. Nelsen, Yunan Yang

Out-of-distribution (OOD) generalization remains a fundamental challenge in machine learning. Models trained on one data distribution often experience substantial performance degradation when evaluated on shifted or unseen domains. To address this challenge, the present paper studies the design of training data distributions that maximize average-case OOD performance. First, a theoretical analysis establishes a family of generalization bounds that quantify how the choice of training distribution influences OOD error across a predefined family of target distributions. These insights motivate the introduction of two complementary algorithmic strategies: (i) directly formulating OOD risk minimization as a bilevel optimization problem over the space of probability measures and (ii) minimizing a theoretical upper bound on OOD error. Last, the paper evaluates the two approaches across a range of function approximation and operator learning examples. The proposed methods significantly improve OOD accuracy over standard empirical risk minimization with a fixed distribution. These results highlight the potential of distribution-aware training as a principled and practical framework for robust OOD generalization.

Article 842

Title@2025-05-27 (2): VideoMarkBench: Benchmarking Robustness of Video Watermarking

Title: VideoMarkBench: Benchmarking Robustness of Video Watermarking

VideoMarkBench: Benchmarking Robustheit von Video Watermarking

视频MarkBench:视频水标记基准的坚实性 2505.21620v1

Authors: Zhengyuan Jiang, Moyang Guo, Kecen Li, Yuepeng Hu, Yupu Wang, Zhicong Huang, Cheng Hong, Neil Zhenqiang Gong

The rapid development of video generative models has led to a surge in highly realistic synthetic videos, raising ethical concerns related to disinformation and copyright infringement. Recently, video watermarking has been proposed as a mitigation strategy by embedding invisible marks into AI-generated videos to enable subsequent detection. However, the robustness of existing video watermarking methods against both common and adversarial perturbations remains underexplored. In this work, we introduce VideoMarkBench, the first systematic benchmark designed to evaluate the robustness of video watermarks under watermark removal and watermark forgery attacks. Our study encompasses a unified dataset generated by three state-of-the-art video generative models, across three video styles, incorporating four watermarking methods and seven aggregation strategies used during detection. We comprehensively evaluate 12 types of perturbations under white-box, black-box, and no-box threat models. Our findings reveal significant vulnerabilities in current watermarking approaches and highlight the urgent need for more robust solutions. Our code is available at https://github.com/zhengyuan-jiang/VideoMarkBench.

Article 843

Title@2025-05-27 (2): Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making

Title: Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making

Schweigen ist kein Konsens: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making

沉默不是共识:通过用于临床决策的Catfish代理商在多方代理LLMs中破坏协议的偏见 2505.21503v1

Authors: Yihan Wang, Qiao Yan, Zhenghao Xing, Lihao Liu, Junjun He, Chi-Wing Fu, Xiaowei Hu, Pheng-Ann Heng

Large language models (LLMs) have demonstrated strong potential in clinical question answering, with recent multi-agent frameworks further improving diagnostic accuracy via collaborative reasoning. However, we identify a recurring issue of Silent Agreement, where agents prematurely converge on diagnoses without sufficient critical analysis, particularly in complex or ambiguous cases. We present a new concept called Catfish Agent, a role-specialized LLM designed to inject structured dissent and counter silent agreement. Inspired by the “catfish effect” in organizational psychology, the Catfish Agent is designed to challenge emerging consensus to stimulate deeper reasoning. We formulate two mechanisms to encourage effective and context-aware interventions: (i) a complexity-aware intervention that modulates agent engagement based on case difficulty, and (ii) a tone-calibrated intervention articulated to balance critique and collaboration. Evaluations on nine medical Q&A and three medical VQA benchmarks show that our approach consistently outperforms both single- and multi-agent LLM frameworks, including leading commercial models such as GPT-4o and DeepSeek-R1.

Article 844

Title@2025-05-27 (2): UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents

Title: UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents

UI-Genie: Ein selbstverbesserender Ansatz zur iterativen Steigerung von MLLM-basierten mobilen GUI-Agenten

UI-Genie: 一种自我改进的方法,用于在刺激下促进基于MLLLM的移动图形界面工具 2505.21496v1

Authors: Han Xiao, Guozhi Wang, Yuxiang Chai, Zimu Lu, Weifeng Lin, Hao He, Lue Fan, Liuyang Bian, Rui Hu, Liang Liu, Shuai Ren, Yafei Wen, Xiaoxin Chen, Aojun Zhou, Hongsheng Li

In this paper, we introduce UI-Genie, a self-improving framework addressing two key challenges in GUI agents: verification of trajectory outcome is challenging and high-quality training data are not scalable. These challenges are addressed by a reward model and a self-improving pipeline, respectively. The reward model, UI-Genie-RM, features an image-text interleaved architecture that efficiently processes historical context and unifies action-level and task-level rewards. To support the training of UI-Genie-RM, we develop deliberately designed data generation strategies including rule-based verification, controlled trajectory corruption, and hard negative mining. To address the second challenge, a self-improvement pipeline progressively expands solvable complex GUI tasks by enhancing both the agent and reward models through reward-guided exploration and outcome verification in dynamic environments. For training the model, we generate UI-Genie-RM-517k and UI-Genie-Agent-16k, establishing the first reward-specific dataset for GUI agents while demonstrating high-quality synthetic trajectory generation without manual annotation. Experimental results show that UI-Genie achieves state-of-the-art performance across multiple GUI agent benchmarks with three generations of data-model self-improvement. To facilitate further research, we open-source our complete framework implementation and generated datasets at https://github.com/Euphoria16/UI-Genie.

Article 845

Title@2025-05-27 (2): Reinforcing General Reasoning without Verifiers

Title: Reinforcing General Reasoning without Verifiers

Verstärkung der allgemeinen Vernunft ohne Prüfer

加强一般理由说明,无验证人 2505.21493v1

Authors: Xiangxin Zhou, Zichen Liu, Anya Sims, Haonan Wang, Tianyu Pang, Chongxuan Li, Liang Wang, Min Lin, Chao Du

The recent paradigm shift towards training large language models (LLMs) using DeepSeek-R1-Zero-style reinforcement learning (RL) on verifiable rewards has led to impressive advancements in code and mathematical reasoning. However, this methodology is limited to tasks where rule-based answer verification is possible and does not naturally extend to real-world domains such as chemistry, healthcare, engineering, law, biology, business, and economics. Current practical workarounds use an additional LLM as a model-based verifier; however, this introduces issues such as reliance on a strong verifier LLM, susceptibility to reward hacking, and the practical burden of maintaining the verifier model in memory during training. To address this and extend DeepSeek-R1-Zero-style training to general reasoning domains, we propose a verifier-free method (VeriFree) that bypasses answer verification and instead uses RL to directly maximize the probability of generating the reference answer. We compare VeriFree with verifier-based methods and demonstrate that, in addition to its significant practical benefits and reduced compute requirements, VeriFree matches and even surpasses verifier-based methods on extensive evaluations across MMLU-Pro, GPQA, SuperGPQA, and math-related benchmarks. Moreover, we provide insights into this method from multiple perspectives: as an elegant integration of training both the policy and implicit verifier in a unified model, and as a variational optimization approach. Code is available at https://github.com/sail-sg/VeriFree.

Article 846

Title@2025-05-27 (2): Be Decisive: Noise-Induced Layouts for Multi-Subject Generation

Title: Be Decisive: Noise-Induced Layouts for Multi-Subject Generation

Entscheidend sein: Lärminduzierte Layouts für die mehrteilige Generierung

Be Decisive: 多主题生成的噪音生成布局 2505.21488v1

Authors: Omer Dahary, Yehonathan Cohen, Or Patashnik, Kfir Aberman, Daniel Cohen-Or

Generating multiple distinct subjects remains a challenge for existing text-to-image diffusion models. Complex prompts often lead to subject leakage, causing inaccuracies in quantities, attributes, and visual features. Preventing leakage among subjects necessitates knowledge of each subject’s spatial location. Recent methods provide these spatial locations via an external layout control. However, enforcing such a prescribed layout often conflicts with the innate layout dictated by the sampled initial noise, leading to misalignment with the model’s prior. In this work, we introduce a new approach that predicts a spatial layout aligned with the prompt, derived from the initial noise, and refines it throughout the denoising process. By relying on this noise-induced layout, we avoid conflicts with externally imposed layouts and better preserve the model’s prior. Our method employs a small neural network to predict and refine the evolving noise-induced layout at each denoising step, ensuring clear boundaries between subjects while maintaining consistency. Experimental results show that this noise-aligned strategy achieves improved text-image alignment and more stable multi-subject generation compared to existing layout-guided techniques, while preserving the rich diversity of the model’s original distribution.

Article 847

Title@2025-05-27 (2): Hardware-Efficient Attention for Fast Decoding

Title: Hardware-Efficient Attention for Fast Decoding

Hardware-Effiziente Aufmerksamkeit für schnelle Dekodierung

快速下标记的硬件高效关注 2505.21487v1

Authors: Ted Zadouri, Hubert Strauss, Tri Dao

LLM decoding is bottlenecked for large batches and long contexts by loading the key-value (KV) cache from high-bandwidth memory, which inflates per-token latency, while the sequential nature of decoding limits parallelism. We analyze the interplay among arithmetic intensity, parallelization, and model quality and question whether current architectures fully exploit modern hardware. This work redesigns attention to perform more computation per byte loaded from memory to maximize hardware efficiency without trading off parallel scalability. We first propose Grouped-Tied Attention (GTA), a simple variant that combines and reuses key and value states, reducing memory transfers without compromising model quality. We then introduce Grouped Latent Attention (GLA), a parallel-friendly latent attention paired with low-level optimizations for fast decoding while maintaining high model quality. Experiments show that GTA matches Grouped-Query Attention (GQA) quality while using roughly half the KV cache and that GLA matches Multi-head Latent Attention (MLA) and is easier to shard. Our optimized GLA kernel is up to 2$\times$ faster than FlashMLA, for example, in a speculative decoding setting when the query length exceeds one. Furthermore, by fetching a smaller KV cache per device, GLA reduces end-to-end latency and increases throughput in online serving benchmarks by up to 2$\times$.

Article 848

Title@2025-05-27 (2): Algorithms and SQ Lower Bounds for Robustly Learning Real-valued Multi-index Models

Title: Algorithms and SQ Lower Bounds for Robustly Learning Real-valued Multi-index Models

Algorithmen und SQ Lower Bounds für robustes Lernen Real-valuierte Multi-Index-Modelle

强力学习实时估价多指数模型的等级和 SQ 下角宽度 2505.21475v1

Authors: Ilias Diakonikolas, Giannis Iakovidis, Daniel M. Kane, Lisheng Ren

We study the complexity of learning real-valued Multi-Index Models (MIMs) under the Gaussian distribution. A $K$-MIM is a function $f:\mathbb{R}^d\to \mathbb{R}$ that depends only on the projection of its input onto a $K$-dimensional subspace. We give a general algorithm for PAC learning a broad class of MIMs with respect to the square loss, even in the presence of adversarial label noise. Moreover, we establish a nearly matching Statistical Query (SQ) lower bound, providing evidence that the complexity of our algorithm is qualitatively optimal as a function of the dimension. Specifically, we consider the class of bounded variation MIMs with the property that degree at most $m$ distinguishing moments exist with respect to projections onto any subspace. In the presence of adversarial label noise, the complexity of our learning algorithm is $d^{O(m)}2^{\mathrm{poly}(K/\epsilon)}$. For the realizable and independent noise settings, our algorithm incurs complexity $d^{O(m)}2^{\mathrm{poly}(K)}(1/\epsilon)^{O(K)}$. To complement our upper bound, we show that if for some subspace degree-$m$ distinguishing moments do not exist, then any SQ learner for the corresponding class of MIMs requires complexity $d^{\Omega(m)}$. As an application, we give the first efficient learner for the class of positive-homogeneous $L$-Lipschitz $K$-MIMs. The resulting algorithm has complexity $\mathrm{poly}(d) 2^{\mathrm{poly}(KL/\epsilon)}$. This gives a new PAC learning algorithm for Lipschitz homogeneous ReLU networks with complexity independent of the network size, removing the exponential dependence incurred in prior work.
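
The defining property of a $K$-MIM, that $f$ depends on its input only through a $K$-dimensional projection, can be illustrated with a minimal sketch. The helper `make_mim`, the matrix `W`, and the ReLU link are hypothetical choices made here for illustration; the ReLU link is picked because it is positive-homogeneous and 1-Lipschitz, matching the application mentioned in the abstract.

```python
def make_mim(W, link):
    """Return f(x) = link(W x), where W is a K x d matrix given as a list of rows."""
    def f(x):
        # Project the d-dimensional input onto the K rows of W.
        z = [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in W]
        return link(z)
    return f


# d = 3, K = 1: the relevant subspace is spanned by the first coordinate axis.
W = [[1.0, 0.0, 0.0]]
f = make_mim(W, lambda z: max(0.0, z[0]))  # ReLU link on the 1-D projection

x = [0.5, 2.0, -1.0]
x_moved = [0.5, -7.0, 4.0]  # differs from x only orthogonally to the subspace

# Both inputs share the same projection, so f cannot tell them apart.
assert f(x) == f(x_moved)
print(f(x), f(x_moved))
```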

Article 849

Title: Annealing Flow Generative Models Towards Sampling High-Dimensional and Multi-Modal Distributions

Annealing Flow Generative Modelle zur Probenahme hochdimensionaler und multi-Modalen Verteilungen

用于取样的高多样性和多模式分布和多模式分布的Ananining流程生成模型 2409.20547v4

Authors: Dongze Wu, Yao Xie

Sampling from high-dimensional, multi-modal distributions remains a fundamental challenge across domains such as statistical Bayesian inference and physics-based machine learning. In this paper, we propose Annealing Flow (AF), a method built on Continuous Normalizing Flow (CNF) for sampling from high-dimensional and multi-modal distributions. AF is trained with a dynamic Optimal Transport (OT) objective incorporating Wasserstein regularization, and guided by annealing procedures, facilitating effective exploration of modes in high-dimensional spaces. Compared to recent NF methods, AF greatly improves training efficiency and stability, with minimal reliance on MC assistance. We demonstrate the superior performance of AF compared to state-of-the-art methods through experiments on various challenging distributions and real-world datasets, particularly in high-dimensional and multi-modal settings. We also highlight AF’s potential for sampling the least favorable distributions.

Article 850

Title@2025-05-27 (2): SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge

Title: SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge

SOSBENCH: Benchmarking der Sicherheitsausrichtung auf wissenschaftliche Erkenntnisse

SOSBENCH:以科学知识为安全协调基准 2505.21605v1

Authors: Fengqing Jiang, Fengbo Ma, Zhangchen Xu, Yuetai Li, Bhaskar Ramasubramanian, Luyao Niu, Bo Li, Xianyan Chen, Zhen Xiang, Radha Poovendran

Large language models (LLMs) exhibit advancing capabilities in complex tasks, such as reasoning and graduate-level question answering, yet their resilience against misuse, particularly involving scientifically sophisticated risks, remains underexplored. Existing safety benchmarks typically either focus on instructions requiring minimal knowledge comprehension (e.g., “tell me how to build a bomb”) or utilize prompts that are relatively low-risk (e.g., multiple-choice or classification tasks about hazardous content). Consequently, they fail to adequately assess model safety when handling knowledge-intensive, hazardous scenarios. To address this critical gap, we introduce SOSBench, a regulation-grounded, hazard-focused benchmark encompassing six high-risk scientific domains: chemistry, biology, medicine, pharmacology, physics, and psychology. The benchmark comprises 3,000 prompts derived from real-world regulations and laws, systematically expanded via an LLM-assisted evolutionary pipeline that introduces diverse, realistic misuse scenarios (e.g., detailed explosive synthesis instructions involving advanced chemical formulas). We evaluate frontier models within a unified evaluation framework using our SOSBench. Despite their alignment claims, advanced models consistently disclose policy-violating content across all domains, demonstrating alarmingly high rates of harmful responses (e.g., 79.1% for Deepseek-R1 and 47.3% for GPT-4.1). These results highlight significant safety alignment deficiencies and underscore urgent concerns regarding the responsible deployment of powerful LLMs.

Article 851

Title@2025-05-27 (2): Guide your favorite protein sequence generative model

Title: Guide your favorite protein sequence generative model

Führen Sie Ihre Lieblings-Protein-Sequenz generative Modell

指导您最喜爱的蛋白质序列基因模型 2505.04823v2

Authors: Junhao Xiong, Hunter Nisonoff, Maria Lukarska, Ishan Gaur, Luke M. Oltrogge, David F. Savage, Jennifer Listgarten

Generative machine learning models on sequences are transforming protein engineering. However, no principled framework exists for conditioning these models on auxiliary information, such as experimental data, in a plug-and-play manner. Herein, we present ProteinGuide – a principled and general method for conditioning – by unifying a broad class of protein generative models under a single framework. We demonstrate the applicability of ProteinGuide by guiding two protein generative models, ProteinMPNN and ESM3, to generate amino acid and structure token sequences, conditioned on several user-specified properties such as enhanced stability, enzyme classes, and CATH-labeled folds. We also used ProteinGuide with inverse folding models and our own experimental assay to design adenine base editor sequences for high activity.

Article 852

Title@2025-05-27 (2): When Are Concepts Erased From Diffusion Models?

Title: When Are Concepts Erased From Diffusion Models?

Wann werden Konzepte von Diffusionsmodellen ausgelöscht?

概念何时从传播模型中消失? 2505.17013v3

Authors: Kevin Lu, Nicky Kriplani, Rohit Gandikota, Minh Pham, David Bau, Chinmay Hegde, Niv Cohen

Concept erasure, the ability to selectively prevent a model from generating specific concepts, has attracted growing interest, with various approaches emerging to address the challenge. However, it remains unclear how thoroughly these methods erase the target concept. We begin by proposing two conceptual models for the erasure mechanism in diffusion models: (i) reducing the likelihood of generating the target concept, and (ii) interfering with the model’s internal guidance mechanisms. To thoroughly assess whether a concept has been truly erased from the model, we introduce a suite of independent evaluations. Our evaluation framework includes adversarial attacks, novel probing techniques, and analysis of the model’s alternative generations in place of the erased concept. Our results shed light on the tension between minimizing side effects and maintaining robustness to adversarial prompts. Broadly, our work underlines the importance of comprehensive evaluation for erasure in diffusion models.

Article 853

Title@2025-05-27 (2): On the Robustness of Adversarial Training Against Uncertainty Attacks

Title: On the Robustness of Adversarial Training Against Uncertainty Attacks

Über die Robustheit des zweifelhaften Trainings gegen Ungewissheitsangriffe

关于防止不确定袭击的反逆训练的有力性 2410.21952v2

Authors: Emanuele Ledda, Giovanni Scodeller, Daniele Angioni, Giorgio Piras, Antonio Emanuele Cinà, Giorgio Fumera, Battista Biggio, Fabio Roli

In learning problems, the noise inherent to the task at hand hinders the possibility to infer without a certain degree of uncertainty. Quantifying this uncertainty, regardless of its wide use, assumes high relevance for security-sensitive applications. Within these scenarios, it becomes fundamental to guarantee good (i.e., trustworthy) uncertainty measures, which downstream modules can securely employ to drive the final decision-making process. However, an attacker may be interested in forcing the system to produce either (i) highly uncertain outputs jeopardizing the system’s availability or (ii) low uncertainty estimates, making the system accept uncertain samples that would instead require a careful inspection (e.g., human intervention). Therefore, it becomes fundamental to understand how to obtain robust uncertainty estimates against these kinds of attacks. In this work, we reveal both empirically and theoretically that defending against adversarial examples, i.e., carefully perturbed samples that cause misclassification, additionally guarantees a more secure, trustworthy uncertainty estimate under common attack scenarios without the need for an ad-hoc defense strategy. To support our claims, we evaluate multiple adversarial-robust models from the publicly available benchmark RobustBench on the CIFAR-10 and ImageNet datasets.

Article 854

Title@2025-05-27 (2): Causal Posterior Estimation

Title: Causal Posterior Estimation

Kausale hintere Schätzung

Causal Posides 估计值 2505.21468v1

Authors: Simon Dirmeier, Antonietta Mira

We present Causal Posterior Estimation (CPE), a novel method for Bayesian inference in simulator models, i.e., models where the evaluation of the likelihood function is intractable or too computationally expensive, but where one can simulate model outputs given parameter values. CPE utilizes a normalizing flow-based (NF) approximation to the posterior distribution which carefully incorporates the conditional dependence structure induced by the graphical representation of the model into the neural network. Thereby it is possible to improve the accuracy of the approximation. We introduce both discrete and continuous NF architectures for CPE and propose a constant-time sampling procedure for the continuous case which reduces the computational complexity of drawing samples to $O(1)$, as for discrete NFs. We show, through an extensive experimental evaluation, that by incorporating the conditional dependencies induced by the graphical model directly into the neural network, rather than learning them from data, CPE is able to conduct highly accurate posterior inference, either outperforming or matching the state of the art in the field.

Article 855

Title@2025-05-27 (2): GeLLMO: Generalizing Large Language Models for Multi-property Molecule Optimization

Title: GeLLMO: Generalizing Large Language Models for Multi-property Molecule Optimization

GeLLMO: Verallgemeinern von großen Sprachmodellen für Multi-Property-Molekül-Optimierung

GELLMO:通用多财产分子优化大语言模型 2502.13398v2

Authors: Vishal Dey, Xiao Hu, Xia Ning

Despite recent advancements, most computational methods for molecule optimization are constrained to single- or double-property optimization tasks and suffer from poor scalability and generalizability to novel optimization tasks. Meanwhile, Large Language Models (LLMs) demonstrate remarkable out-of-domain generalizability to novel tasks. To demonstrate LLMs’ potential for molecule optimization, we introduce MuMOInstruct, the first high-quality instruction-tuning dataset specifically focused on complex multi-property molecule optimization tasks. Leveraging MuMOInstruct, we develop GeLLMOs, a series of instruction-tuned LLMs for molecule optimization. Extensive evaluations across 5 in-domain and 5 out-of-domain tasks demonstrate that GeLLMOs consistently outperform state-of-the-art baselines. GeLLMOs also exhibit outstanding zero-shot generalization to unseen tasks, significantly outperforming powerful closed-source LLMs. Such strong generalizability demonstrates the tremendous potential of GeLLMOs as foundational models for molecule optimization, thereby tackling novel optimization tasks without resource-intensive retraining. MuMOInstruct, models, and code are accessible through https://github.com/ninglab/GeLLMO.

Article 856

Title@2025-05-27 (2): High-Dimensional Calibration from Swap Regret

Title: High-Dimensional Calibration from Swap Regret

Hochdimensionale Kalibrierung aus Swap-Regret

从 Swap Regret 进行高维校准 2505.21460v1

Authors: Maxwell Fishelson, Noah Golowich, Mehryar Mohri, Jon Schneider

We study the online calibration of multi-dimensional forecasts over an arbitrary convex set $\mathcal{P} \subset \mathbb{R}^d$ relative to an arbitrary norm $\Vert\cdot\Vert$. We connect this with the problem of external regret minimization for online linear optimization, showing that if it is possible to guarantee $O(\sqrt{\rho T})$ worst-case regret after $T$ rounds when actions are drawn from $\mathcal{P}$ and losses are drawn from the dual $\Vert \cdot \Vert_*$ unit norm ball, then it is also possible to obtain $\epsilon$-calibrated forecasts after $T = \exp(O(\rho /\epsilon^2))$ rounds. When $\mathcal{P}$ is the $d$-dimensional simplex and $\Vert \cdot \Vert$ is the $\ell_1$-norm, the existence of $O(\sqrt{T\log d})$-regret algorithms for learning with experts implies that it is possible to obtain $\epsilon$-calibrated forecasts after $T = \exp(O(\log{d}/\epsilon^2)) = d^{O(1/\epsilon^2)}$ rounds, recovering a recent result of Peng (2025). Interestingly, our algorithm obtains this guarantee without requiring access to any online linear optimization subroutine or knowledge of the optimal rate $\rho$ – in fact, our algorithm is identical for every setting of $\mathcal{P}$ and $\Vert \cdot \Vert$. Instead, we show that the optimal regularizer for the above OLO problem can be used to upper bound the above calibration error by a swap regret, which we then minimize by running the recent TreeSwap algorithm with Follow-The-Leader as a subroutine. Finally, we prove that any online calibration algorithm that guarantees $\epsilon T$ $\ell_1$-calibration error over the $d$-dimensional simplex requires $T \geq \exp(\mathrm{poly}(1/\epsilon))$ (assuming $d \geq \mathrm{poly}(1/\epsilon)$). This strengthens the corresponding $d^{\Omega(\log{1/\epsilon})}$ lower bound of Peng, and shows that an exponential dependence on $1/\epsilon$ is necessary.
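
The identity used in the experts case is a one-line exponent manipulation. As a sketch, write $c > 0$ for the constant hidden in the $O(\cdot)$ (the symbol $c$ is introduced here for illustration and does not appear in the abstract) and set $\rho = \log d$ in $T = \exp(O(\rho/\epsilon^2))$:

$$\exp\!\left(\frac{c \log d}{\epsilon^2}\right) = \left(e^{\log d}\right)^{c/\epsilon^2} = d^{\,c/\epsilon^2},$$

so that $T = \exp(O(\log d/\epsilon^2)) = d^{O(1/\epsilon^2)}$, which is polynomial in $d$ for fixed $\epsilon$ but exponential in $1/\epsilon^2$.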

Article 857

Title@2025-05-27 (2): Designing Cyclic Peptides via Harmonic SDE with Atom-Bond Modeling

Title: Designing Cyclic Peptides via Harmonic SDE with Atom-Bond Modeling

Konzipieren von Cyclic Peptides über Harmonische SDE mit Atom-Bond-Modellierung

通过使用原子-体型建模的波力SDE, 设计圆性五氯苯并配有原子-体型建模 2505.21452v1

Authors: Xiangxin Zhou, Mingyu Li, Yi Xiao, Jiahan Li, Dongyu Xue, Zaixiang Zheng, Jianzhu Ma, Quanquan Gu

Cyclic peptides offer inherent advantages in pharmaceuticals. For example, cyclic peptides are more resistant to enzymatic hydrolysis compared to linear peptides and usually exhibit excellent stability and affinity. Although deep generative models have achieved great success in linear peptide design, several challenges prevent the development of computational methods for designing diverse types of cyclic peptides. These challenges include the scarcity of 3D structural data on target proteins and associated cyclic peptide ligands, the geometric constraints that cyclization imposes, and the involvement of non-canonical amino acids in cyclization. To address the above challenges, we introduce CpSDE, which consists of two key components: AtomSDE, a generative structure prediction model based on harmonic SDE, and ResRouter, a residue type predictor. Utilizing a routed sampling algorithm that alternates between these two models to iteratively update sequences and structures, CpSDE facilitates the generation of cyclic peptides. By employing explicit all-atom and bond modeling, CpSDE overcomes existing data limitations and is proficient in designing a wide variety of cyclic peptides. Our experimental results demonstrate that the cyclic peptides designed by our method exhibit reliable stability and affinity.

Article 858

Title@2025-05-27 (2): Training neural control variates using correlated configurations

Title: Training neural control variates using correlated configurations

Ausbildung von Neuralsteuerungsvariaten mit korrelierten Konfigurationen

使用相关配置的培训神经控制变异 2505.07719v2

Authors: Hyunwoo Oh

Neural control variates (NCVs) have emerged as a powerful tool for variance reduction in Monte Carlo (MC) simulations, particularly in high-dimensional problems where traditional control variates are difficult to construct analytically. By training neural networks to learn auxiliary functions correlated with the target observable, NCVs can significantly reduce estimator variance while preserving unbiasedness. However, a critical but often overlooked aspect of NCV training is the role of autocorrelated samples generated by Markov Chain Monte Carlo (MCMC). While such samples are typically discarded for error estimation due to their statistical redundancy, they may contain useful information about the structure of the underlying probability distribution that can benefit the training process. In this work, we systematically examine the effect of using correlated configurations in training neural control variates. We demonstrate, both conceptually and numerically, that training on correlated data can improve control variate performance, especially in settings with limited computational resources. Our analysis includes empirical results from $U(1)$ gauge theory and scalar field theory, illustrating when and how autocorrelated samples enhance NCV construction. These findings provide practical guidance for the efficient use of MCMC data in training neural networks.
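
The variance-reduction mechanism that NCVs build on can be sketched with a classical, hand-picked control variate. This is not the paper's method: the toy target $E[e^U]$ with $U \sim \mathrm{Uniform}(0,1)$, the auxiliary function $g(x) = x$, and the use of independent (not autocorrelated MCMC) samples are all assumptions made here for illustration. An NCV would replace the fixed $g$ with a function learned by a neural network.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Toy target: estimate E[exp(U)] for U ~ Uniform(0, 1); the exact value is e - 1.
xs = [random.random() for _ in range(20_000)]
f_vals = [math.exp(x) for x in xs]

# Hand-picked auxiliary function g(x) = x, whose mean 1/2 is known exactly.
g_vals = xs
G_MEAN = 0.5

# The coefficient c = Cov(f, g) / Var(g) minimizes the corrected variance.
# Estimating c from the same samples adds a small O(1/n) bias; using a
# fixed c keeps the estimator exactly unbiased.
f_bar = statistics.fmean(f_vals)
g_bar = statistics.fmean(g_vals)
cov_fg = sum((a - f_bar) * (b - g_bar) for a, b in zip(f_vals, g_vals)) / (len(xs) - 1)
c = cov_fg / statistics.variance(g_vals)

# Subtract the zero-mean correction c * (g - E[g]) from every sample.
corrected = [a - c * (b - G_MEAN) for a, b in zip(f_vals, g_vals)]

plain_estimate = statistics.fmean(f_vals)
cv_estimate = statistics.fmean(corrected)
plain_var = statistics.variance(f_vals)
cv_var = statistics.variance(corrected)
print(plain_estimate, cv_estimate, plain_var, cv_var)
```

The more strongly the auxiliary function correlates with the target observable, the larger the variance reduction, which is why learning that function well is the central problem in NCV training.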


Article 859

Title@2025-05-27 (2): When Two LLMs Debate, Both Think They’ll Win

Title: When Two LLMs Debate, Both Think They’ll Win

Wenn zwei LLMs diskutieren, denken beide, dass sie gewinnen werden

当两个LLM 辩论, 双方都认为他们会赢 2505.19184v2

Authors: Pradyumna Shyama Prasad, Minh Nhat Nguyen

Can LLMs accurately adjust their confidence when facing opposition? Building on previous studies measuring calibration on static fact-based question-answering tasks, we evaluate Large Language Models (LLMs) in a dynamic, adversarial debate setting, uniquely combining two realistic factors: (a) a multi-turn format requiring models to update beliefs as new information emerges, and (b) a zero-sum structure to control for task-related uncertainty, since mutual high-confidence claims imply systematic overconfidence. We organized 60 three-round policy debates among ten state-of-the-art LLMs, with models privately rating their confidence (0-100) in winning after each round. We observed five concerning patterns: (1) Systematic overconfidence: models began debates with average initial confidence of 72.9% vs. a rational 50% baseline. (2) Confidence escalation: rather than reducing confidence as debates progressed, debaters increased their win probabilities, averaging 83% by the final round. (3) Mutual overestimation: in 61.7% of debates, both sides simultaneously claimed >=75% probability of victory, a logical impossibility. (4) Persistent self-debate bias: models debating identical copies increased confidence from 64.1% to 75.2%; even when explicitly informed their chance of winning was exactly 50%, confidence still rose (from 50.0% to 57.1%). (5) Misaligned private reasoning: models’ private scratchpad thoughts sometimes differed from their public confidence ratings, raising concerns about the faithfulness of chain-of-thought reasoning. These results suggest LLMs lack the ability to accurately self-assess or update their beliefs in dynamic, multi-turn tasks, a major concern as LLM outputs are deployed without careful review in assistant roles or agentic settings.
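
The zero-sum argument can be checked mechanically: in a debate with exactly one winner, the two win probabilities should sum to 100, so any excess measures joint overconfidence. The sketch below is only an illustration of that check; the function names and the default threshold argument are assumptions, not the authors' evaluation code.

```python
def mutually_overconfident(conf_a, conf_b, threshold=75.0):
    """True if both debaters claim at least `threshold` percent chance of winning."""
    return conf_a >= threshold and conf_b >= threshold

def excess_confidence(conf_a, conf_b):
    """Percentage points by which the two win probabilities exceed the coherent total of 100."""
    return conf_a + conf_b - 100.0

# Two debaters who both start at the reported average initial confidence of 72.9%.
initial_excess = excess_confidence(72.9, 72.9)
```

With both sides at 72.9, the pair is about 45.8 points over the coherent total, even before any escalation across rounds.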


Article 860

Title@2025-05-27 (2): Leveraging XP and CRISP-DM for Agile Data Science Projects

Title: Leveraging XP and CRISP-DM for Agile Data Science Projects

Nutzung von XP und CRISP-DM für agile Data Science Projekte

利用XP和CRISP-DM为敏感数据科学项目发挥杠杆作用 2505.21603v1

Authors: Andre Massahiro Shimaoka, Renato Cordeiro Ferreira, Alfredo Goldman

This study explores the integration of eXtreme Programming (XP) and the Cross-Industry Standard Process for Data Mining (CRISP-DM) in agile Data Science projects. We conducted a case study at the e-commerce company Elo7 to answer the research question: How can the agility of the XP method be integrated with CRISP-DM in Data Science projects? Data was collected through interviews and questionnaires with a Data Science team consisting of data scientists, ML engineers, and data product managers. The results show that 86% of the team frequently or always applies CRISP-DM, while 71% adopt XP practices in their projects. Furthermore, the study demonstrates that it is possible to combine CRISP-DM with XP in Data Science projects, providing a structured and collaborative approach. Finally, the study generated improvement recommendations for the company.


Article 861

Title@2025-05-27 (2): Can Large Reasoning Models Self-Train?

Title: Can Large Reasoning Models Self-Train?

Können sich große vernünftigen Modelle selbst entwickeln?

大理由模型能够自我培训吗? 2505.21444v1

Authors: Sheikh Shafayat, Fahim Tajwar, Ruslan Salakhutdinov, Jeff Schneider, Andrea Zanette

Scaling the performance of large language models (LLMs) increasingly depends on methods that reduce reliance on human supervision. Reinforcement learning from automated verification offers an alternative, but it incurs scalability limitations due to dependency upon human-designed verifiers. Self-training, where the model’s own judgment provides the supervisory signal, presents a compelling direction. We propose an online self-training reinforcement learning algorithm that leverages the model’s self-consistency to infer correctness signals and train without any ground-truth supervision. We apply the algorithm to challenging mathematical reasoning tasks and show that it quickly reaches performance levels rivaling reinforcement-learning methods trained explicitly on gold-standard answers. Additionally, we analyze inherent limitations of the algorithm, highlighting how the self-generated proxy reward initially correlated with correctness can incentivize reward hacking, where confidently incorrect outputs are favored. Our results illustrate how self-supervised improvement can achieve significant performance gains without external labels, while also revealing its fundamental challenges.
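
A self-consistency signal of the kind described can be sketched as a majority vote over sampled answers, with agreement serving as a proxy reward. This is a minimal sketch under assumptions: the abstract does not give the exact reward definition, and the function name and the binary 0/1 reward are illustrative choices.

```python
from collections import Counter

def self_consistency_rewards(answers):
    """Return (majority answer, per-sample proxy rewards).

    Each sampled answer gets reward 1.0 if it matches the most common answer,
    else 0.0. No ground-truth label is used anywhere.
    """
    majority, _ = Counter(answers).most_common(1)[0]
    return majority, [1.0 if a == majority else 0.0 for a in answers]

# Four hypothetical samples for one math problem.
majority, rewards = self_consistency_rewards(["42", "42", "41", "42"])

# Failure mode: unanimous but wrong samples still receive full reward.
_, hacked = self_consistency_rewards(["7", "7", "7", "7"])
```

The second call shows why such a proxy can invite reward hacking: the signal rewards agreement, not correctness, so a model that becomes confidently wrong is still fully rewarded.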


Article 862

Title@2025-05-27 (2): Autoencoding Random Forests

Title: Autoencoding Random Forests

Zufällige Wälder automatisch kodieren

自动编码随机森林 2505.21441v1

Authors: Binh Duc Vu, Jan Kapar, Marvin Wright, David S. Watson

We propose a principled method for autoencoding with random forests. Our strategy builds on foundational results from nonparametric statistics and spectral graph theory to learn a low-dimensional embedding of the model that optimally represents relationships in the data. We provide exact and approximate solutions to the decoding problem via constrained optimization, split relabeling, and nearest neighbors regression. These methods effectively invert the compression pipeline, establishing a map from the embedding space back to the input space using splits learned by the ensemble’s constituent trees. The resulting decoders are universally consistent under common regularity assumptions. The procedure works with supervised or unsupervised models, providing a window into conditional or joint distributions. We demonstrate various applications of this autoencoder, including powerful new tools for visualization, compression, clustering, and denoising. Experiments illustrate the ease and utility of our method in a wide range of settings, including tabular, image, and genomic data.


Article 863

Title@2025-05-27 (2): ANCHOLIK-NER: A Benchmark Dataset for Bangla Regional Named Entity Recognition

Title: ANCHOLIK-NER: A Benchmark Dataset for Bangla Regional Named Entity Recognition

ANCHOLIK-NER: Ein Benchmark-Datensatz für Bangla Regional Named Entity Recognition

ANCHOLIK-NER:孟加拉地区命名实体识别基准数据集 2502.11198v3

Authors: Bidyarthi Paul, Faika Fairuj Preotee, Shuvashis Sarker, Shamim Rahim Refat, Shifat Islam, Tashreef Muhammad, Mohammad Ashraful Hoque, Shahriar Manzoor

Named Entity Recognition (NER) in regional dialects is a critical yet underexplored area in Natural Language Processing (NLP), especially for low-resource languages like Bangla. While NER systems for Standard Bangla have made progress, no existing resources or models specifically address the challenge of regional dialects such as Barishal, Chittagong, Mymensingh, Noakhali, and Sylhet, which exhibit unique linguistic features that existing models fail to handle effectively. To fill this gap, we introduce ANCHOLIK-NER, the first benchmark dataset for NER in Bangla regional dialects, comprising 17,405 sentences distributed across five regions. The dataset was sourced from publicly available resources and supplemented with manual translations, ensuring alignment of named entities across dialects. We evaluate three transformer-based models (Bangla BERT, Bangla BERT Base, and BERT Base Multilingual Cased) on this dataset. Our findings demonstrate that BERT Base Multilingual Cased performs best in recognizing named entities across regions, with notable performance in Mymensingh (F1-score of 82.611%). Despite strong overall performance, challenges remain in regions like Chittagong, where the models show lower precision and recall. Since no previous NER systems for Bangla regional dialects exist, our work represents a foundational step in addressing this gap. Future work will focus on improving model performance in underperforming regions and expanding the dataset to include more dialects, enhancing the development of dialect-aware NER systems.


Article 864

Title@2025-05-27 (2): Measuring Fine-Grained Relatedness in Multitask Learning via Data Attribution

Title: Measuring Fine-Grained Relatedness in Multitask Learning via Data Attribution

Messung der feinkörnigen Verbundenheit im Multitasking-Lernen über Datenzuweisung

通过数据归责衡量多任务学习中的细微关联 2505.21438v1

Authors: Yiwen Tu, Ziqi Liu, Jiaqi W. Ma, Weijing Tang

Measuring task relatedness and mitigating negative transfer remain critical open challenges in Multitask Learning (MTL). This work extends data attribution, which quantifies the influence of individual training data points on model predictions, to the MTL setting for measuring task relatedness. We propose the MultiTask Influence Function (MTIF), a method that adapts influence functions to MTL models with hard or soft parameter sharing. Compared to conventional task relatedness measurements, MTIF provides a fine-grained, instance-level relatedness measure beyond the entire-task level. This fine-grained relatedness measure enables a data selection strategy to effectively mitigate negative transfer in MTL. Through extensive experiments, we demonstrate that the proposed MTIF efficiently and accurately approximates the performance of models trained on data subsets. Moreover, the data selection strategy enabled by MTIF consistently improves model performance in MTL. Our work establishes a novel connection between data attribution and MTL, offering an efficient and fine-grained solution for measuring task relatedness and enhancing MTL models.
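
For orientation, the classical single-task influence function that this line of work builds on can be written as follows. This is the standard textbook form, not the MTIF definition, which the abstract does not spell out; the symbols $\hat\theta$ (fitted parameters), $L$ (per-example loss), and $H_{\hat\theta}$ (empirical Hessian over $n$ training points) are notation assumed here.

```latex
\mathcal{I}(z, z_{\text{test}})
  = -\,\nabla_\theta L(z_{\text{test}}, \hat\theta)^{\top}
     H_{\hat\theta}^{-1}\,
     \nabla_\theta L(z, \hat\theta),
\qquad
H_{\hat\theta} = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta^{2} L(z_i, \hat\theta).
```

It approximates how the loss on $z_{\text{test}}$ changes when the training point $z$ is infinitesimally upweighted, which is what makes an instance-level relatedness measure possible without retraining.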


Article 865

Title@2025-05-27 (2): Distributional Scaling for Emergent Capabilities

Title: Distributional Scaling for Emergent Capabilities

Verteilungsskalierung für Emergent Capabilities

新兴市场能力分配比例 2502.17356v3

Authors: Rosie Zhao, Tian Qin, David Alvarez-Melis, Sham Kakade, Naomi Saphra

This paper explores the nature of sudden breakthroughs in language model performance at scale, which stand in contrast to smooth improvements governed by scaling laws. While advocates of “emergence” view breakthroughs as unlocked capabilities, others attribute them to thresholding effects on noncontinuous metrics. We propose that breakthroughs are instead driven by continuous changes in the probability distribution of training outcomes when performance is bimodally distributed across random seeds. In synthetic length generalization tasks, we show that different random seeds can produce either highly linear or emergent scaling trends. We reveal that sharp breakthroughs in metrics are produced by underlying continuous changes in their distribution across seeds. Furthermore, we provide a case study of inverse scaling. We validate our distributional scaling framework on realistic settings by measuring MMLU performance in LM populations. These insights emphasize the role of random variation in the effect of scale on LM capabilities.
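
The core claim, that a smooth change in a bimodal distribution across seeds can look like a sharp breakthrough, can be illustrated with a toy mixture. Everything numeric below is hypothetical (the two mode accuracies, the logistic form of `p_success`, and its midpoint at scale 5); it is a sketch of the mechanism, not the paper's experimental setup.

```python
import math

LOW, HIGH = 0.1, 0.9  # hypothetical accuracies of the failed and successful modes

def p_success(scale):
    """Hypothetical smooth probability that a random seed lands in the high mode."""
    return 1.0 / (1.0 + math.exp(-(scale - 5.0)))

def mean_accuracy(scale):
    """Expected accuracy across seeds: varies continuously with scale."""
    p = p_success(scale)
    return (1 - p) * LOW + p * HIGH

def median_accuracy(scale):
    """Median accuracy across seeds: jumps once more than half the seeds succeed."""
    return HIGH if p_success(scale) > 0.5 else LOW
```

Between scale 4.9 and 5.1 the mean moves by only about 0.04, while the median jumps from 0.1 to 0.9, so a single run or a median summary can appear emergent even though the underlying distribution shifts smoothly.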


Article 866

Title@2025-05-27 (2): Attribute-Efficient PAC Learning of Sparse Halfspaces with Constant Malicious Noise Rate

Title: Attribute-Efficient PAC Learning of Sparse Halfspaces with Constant Malicious Noise Rate

Effizientes PAC-Lernen von Sparse-Halbräumen mit konstanter bösartiger Lärmrate

以常态恶意噪音率学习粗微半空空间的属性- 有效 PAC 学习 2505.21430v1

Authors: Shiwei Zeng, Jie Shen

Attribute-efficient learning of sparse halfspaces has been a fundamental problem in machine learning theory. In recent years, machine learning algorithms have faced prevalent data corruptions and even adversarial attacks. It is of central interest to design efficient algorithms that are robust to noise corruptions. In this paper, we consider the setting where a constant fraction of the data is corrupted by malicious noise, and the goal is to learn an underlying $s$-sparse halfspace $w^* \in \mathbb{R}^d$ with $\text{poly}(s,\log d)$ samples. Specifically, we follow a recent line of works and assume that the underlying distribution satisfies a certain concentration condition and a margin condition at the same time. Under such conditions, we show that attribute-efficiency can be achieved by simple variants of existing hinge loss minimization programs. Our key contributions include: 1) an attribute-efficient PAC learning algorithm that works under a constant malicious noise rate; 2) a new gradient analysis that carefully handles the sparsity constraint in hinge loss minimization.


Article 867

Title@2025-05-27 (2): QuForge: A Library for Qudits Simulation

Title: QuForge: A Library for Qudits Simulation

QuForge: Eine Bibliothek für Qudits Simulation

Quforge: Quits 模拟图书馆 2409.17716v2

Authors: Tiago de Souza Farias, Lucas Friedrich, Jonas Maziero

Quantum computing with qudits, an extension of qubits to multiple levels, is a research field less mature than qubit-based quantum computing. However, qudits can offer some advantages over qubits, by representing information with fewer separated components. In this article, we present QuForge, a Python-based library designed to simulate quantum circuits with qudits. This library provides the necessary quantum gates for implementing quantum algorithms, tailored to any chosen qudit dimension. Built on top of differentiable frameworks, QuForge supports execution on accelerating devices such as GPUs and TPUs, significantly speeding up simulations. It also supports sparse operations, leading to a reduction in memory consumption compared to other libraries. Additionally, by constructing quantum circuits as differentiable graphs, QuForge facilitates the implementation of quantum machine learning algorithms, enhancing the capabilities and flexibility of quantum computing research.
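
As a small illustration of what a gate "tailored to any chosen qudit dimension" means, the sketch below builds the generalized X (shift) gate for dimension `d` with plain Python lists. It deliberately does not use QuForge: the function names `shift_gate` and `apply_gate` are hypothetical and say nothing about the library's actual API.

```python
def shift_gate(d):
    """d x d matrix of the generalized X gate, which maps |j> to |(j + 1) mod d>."""
    return [[1 if row == (col + 1) % d else 0 for col in range(d)] for row in range(d)]

def apply_gate(gate, state):
    """Matrix-vector product of a gate with a state vector of amplitudes."""
    return [sum(g * a for g, a in zip(row, state)) for row in gate]

# Qutrit example (d = 3): start in |0> and shift once.
X3 = shift_gate(3)
ket0 = [1, 0, 0]
ket1 = apply_gate(X3, ket0)
```

For a qutrit, one application moves `|0>` to `|1>`, and three applications return the original state, the d-level analogue of applying the qubit X gate twice.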


Article 868

Title@2025-05-27 (2): Stochastic Online Conformal Prediction with Semi-Bandit Feedback

Title: Stochastic Online Conformal Prediction with Semi-Bandit Feedback

Stochastische Online-Konforme Vorhersage mit Halbbandit Feedback

具有半银行反馈的在线非正式预测 2405.13268v3

Authors: Haosen Ge, Hamsa Bastani, Osbert Bastani

Conformal prediction has emerged as an effective strategy for uncertainty quantification by modifying a model to output sets of labels instead of a single label. These prediction sets come with the guarantee that they contain the true label with high probability. However, conformal prediction typically requires a large calibration dataset of i.i.d. examples. We consider the online learning setting, where examples arrive over time, and the goal is to construct prediction sets dynamically. Departing from existing work, we assume semi-bandit feedback, where we only observe the true label if it is contained in the prediction set. For instance, consider calibrating a document retrieval model to a new domain; in this setting, a user would only be able to provide the true label if the target document is in the prediction set of retrieved documents. We propose a novel conformal prediction algorithm targeted at this setting, and prove that it obtains sublinear regret compared to the optimal conformal predictor. We evaluate our algorithm on a retrieval task, an image classification task, and an auction price-setting task, and demonstrate that it empirically achieves good performance compared to several baselines.
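
For readers new to the setting, the following sketch shows standard split conformal prediction plus the semi-bandit observation rule described above. It is not the paper's online algorithm: the nonconformity score `1 - p`, the function names, and the toy calibration scores are assumptions for illustration.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration nonconformity scores."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points: include every label
    return sorted(cal_scores)[k - 1]

def prediction_set(label_probs, threshold):
    """All labels whose nonconformity score 1 - p is within the threshold."""
    return {label for label, p in label_probs.items() if 1.0 - p <= threshold}

def semi_bandit_feedback(true_label, pred_set):
    """The learner sees the true label only if it was inside the prediction set."""
    return true_label if true_label in pred_set else None

# Toy calibration scores and one test example with three candidate labels.
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
threshold = conformal_threshold(cal_scores, alpha=0.25)
pred = prediction_set({"a": 0.6, "b": 0.3, "c": 0.05}, threshold)
```

The difficulty the paper targets is visible in `semi_bandit_feedback`: when the set misses the true label, the learner gets `None` and cannot compute that example's score, so the usual calibration step is no longer directly available.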


Article 869

Title@2025-05-27 (2): R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Title: R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

R2R: Effizientes Navigieren unterschiedlicher Vernunftpfade mit klein-großen Model Token Routing

R2R: 以小型模型调速器有效导航差异性理性路径 2505.21600v1

Authors: Tianyu Fu, Yi Ge, Yichen You, Enshu Liu, Zhihang Yuan, Guohao Dai, Shengen Yan, Huazhong Yang, Yu Wang

Large Language Models (LLMs) achieve impressive reasoning capabilities at the cost of substantial inference overhead, posing substantial deployment challenges. Although distilled Small Language Models (SLMs) significantly enhance efficiency, their performance suffers as they fail to follow LLMs’ reasoning paths. Luckily, we reveal that only a small fraction of tokens genuinely diverge reasoning paths between LLMs and SLMs. Most generated tokens are either identical or exhibit neutral differences, such as minor variations in abbreviations or expressions. Leveraging this insight, we introduce Roads to Rome (R2R), a neural token routing method that selectively utilizes LLMs only for these critical, path-divergent tokens, while leaving the majority of token generation to the SLM. We also develop an automatic data generation pipeline that identifies divergent tokens and generates token-level routing labels to train the lightweight router. We apply R2R to combine R1-1.5B and R1-32B models from the DeepSeek family, and evaluate on challenging math, coding, and QA benchmarks. With an average activated parameter size of 5.6B, R2R surpasses the average accuracy of R1-7B by 1.6x, outperforming even the R1-14B model. Compared to R1-32B, it delivers a 2.8x wall-clock speedup with comparable performance, advancing the Pareto frontier of test-time scaling efficiency. Our code is available at https://github.com/thu-nics/R2R.
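
The routing idea can be sketched as a decoding loop in which the small model proposes every token and a router decides when to defer to the large model. This is a schematic under assumptions: `small_next`, `large_next`, and `needs_large` are hypothetical callables standing in for the SLM, the LLM, and the trained router, and the toy rule below is not the learned routing policy from the paper.

```python
def routed_generate(prompt_tokens, small_next, large_next, needs_large, max_new_tokens):
    """Generate tokens with the small model, deferring to the large one when routed.

    Returns the full token list and the number of large-model calls.
    """
    tokens = list(prompt_tokens)
    large_calls = 0
    for _ in range(max_new_tokens):
        candidate = small_next(tokens)          # cheap proposal from the small model
        if needs_large(tokens, candidate):      # router flags a path-divergent token
            candidate = large_next(tokens)      # expensive correction from the large model
            large_calls += 1
        tokens.append(candidate)
    return tokens, large_calls

# Toy stand-ins: the "router" defers whenever the context length is a multiple of 4.
out, calls = routed_generate(
    ["x"],
    small_next=lambda toks: "a",
    large_next=lambda toks: "B",
    needs_large=lambda toks, cand: len(toks) % 4 == 0,
    max_new_tokens=8,
)
```

In the toy run only 2 of 8 generated tokens come from the large model, which mirrors the claim that most tokens can be left to the SLM.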


Article 870

Title@2025-05-27 (2): Policy Induction: Predicting Startup Success via Explainable Memory-Augmented In-Context Learning

Title: Policy Induction: Predicting Startup Success via Explainable Memory-Augmented In-Context Learning

Politische Induktion: Vorhersage des Startup-Erfolgs durch erklärbares Memory-Augmented In-Context Learning

政策介绍:通过可解释的记忆增强的内文学习预测启动成功 2505.21427v1

Authors: Xianling Mu, Joseph Ternasky, Fuat Alican, Yigit Ihlamur

Early-stage startup investment is a high-risk endeavor characterized by scarce data and uncertain outcomes. Traditional machine learning approaches often require large, labeled datasets and extensive fine-tuning, yet remain opaque and difficult for domain experts to interpret or improve. In this paper, we propose a transparent and data-efficient investment decision framework powered by memory-augmented large language models (LLMs) using in-context learning (ICL). Central to our method is a natural language policy embedded directly into the LLM prompt, enabling the model to apply explicit reasoning patterns and allowing human experts to easily interpret, audit, and iteratively refine the logic. We introduce a lightweight training process that combines few-shot learning with an in-context learning loop, enabling the LLM to update its decision policy iteratively based on structured feedback. With only minimal supervision and no gradient-based optimization, our system predicts startup success far more accurately than existing benchmarks. It is over 20x more precise than random chance, which succeeds 1.9% of the time. It is also 7.1x more precise than the typical 5.6% success rate of top-tier venture capital (VC) firms.
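
The two reported multiples can be turned into an implied precision with simple arithmetic. The abstract does not state the precision directly, so the figures computed below are derived values, and the helper name `implied_precision` is an assumption.

```python
RANDOM_BASELINE = 0.019  # 1.9% success rate of random chance (from the abstract)
VC_BASELINE = 0.056      # 5.6% success rate of top-tier VC firms (from the abstract)

def implied_precision(baseline, multiple):
    """Precision implied by being `multiple` times more precise than `baseline`."""
    return baseline * multiple

lower_bound = implied_precision(RANDOM_BASELINE, 20)  # "over 20x" makes this a lower bound
vc_based = implied_precision(VC_BASELINE, 7.1)
```

Both routes land in the same range: more than 0.38 from the random-chance comparison and about 0.398 from the VC comparison, so the two claims are mutually consistent.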


Article 871

Title@2025-05-27 (2): Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks

Title: Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks

Individuelles Verhalten in agentenbasierten Modellen mit Graph Diffusionsnetzwerken lernen

具有图表传播网络的基于代理模型的学习个人行为 2505.21426v1

Authors: Francesco Cozzi, Marco Pangallo, Alan Perotti, André Panisson, Corrado Monti

Agent-Based Models (ABMs) are powerful tools for studying emergent properties in complex systems. In ABMs, agent behaviors are governed by local interactions and stochastic rules. However, these rules are, in general, non-differentiable, limiting the use of gradient-based methods for optimization, and thus integration with real-world data. We propose a novel framework to learn a differentiable surrogate of any ABM by observing its generated data. Our method combines diffusion models to capture behavioral stochasticity and graph neural networks to model agent interactions. Distinct from prior surrogate approaches, our method introduces a fundamental shift: rather than approximating system-level outputs, it models individual agent behavior directly, preserving the decentralized, bottom-up dynamics that define ABMs. We validate our approach on two ABMs (Schelling’s segregation model and a Predator-Prey ecosystem) showing that it replicates individual-level patterns and accurately forecasts emergent dynamics beyond training. Our results demonstrate the potential of combining diffusion models and graph learning for data-driven ABM simulation.


Article 872

Title@2025-05-27 (2): GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

Title: GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

GenPO: Generative Diffusionsmodelle treffen auf On-Policy-Verstärkungs-Lernen

GENPO: 符合政策强化学习的生成传播模式 2505.18763v2

Authors: Shutong Ding, Ke Hu, Shan Zhong, Haoyang Luo, Weinan Zhang, Jingya Wang, Jun Wang, Ye Shi

Recent advances in reinforcement learning (RL) have demonstrated the powerful exploration capabilities and multimodality of generative diffusion-based policies. While substantial progress has been made in offline RL and off-policy RL settings, integrating diffusion policies into on-policy frameworks like PPO remains underexplored. This gap is particularly significant given the widespread use of large-scale parallel GPU-accelerated simulators, such as IsaacLab, which are optimized for on-policy RL algorithms and enable rapid training of complex robotic tasks. A key challenge lies in computing state-action log-likelihoods under diffusion policies, which is straightforward for Gaussian policies but intractable for flow-based models due to irreversible forward-reverse processes and discretization errors (e.g., Euler-Maruyama approximations). To bridge this gap, we propose GenPO, a generative policy optimization framework that leverages exact diffusion inversion to construct invertible action mappings. GenPO introduces a novel doubled dummy action mechanism that enables invertibility via alternating updates, resolving log-likelihood computation barriers. Furthermore, we also use the action log-likelihood for unbiased entropy and KL divergence estimation, enabling KL-adaptive learning rates and entropy regularization in on-policy updates. Extensive experiments on eight IsaacLab benchmarks, including legged locomotion (Ant, Humanoid, Anymal-D, Unitree H1, Go2), dexterous manipulation (Shadow Hand), aerial control (Quadcopter), and robotic arm tasks (Franka), demonstrate GenPO’s superiority over existing RL baselines. Notably, GenPO is the first method to successfully integrate diffusion policies into on-policy RL, unlocking their potential for large-scale parallelized training and real-world robotic deployment.


Article 873

Title@2025-05-27 (2): A Lightweight Method to Disrupt Memorized Sequences in LLM

Title: A Lightweight Method to Disrupt Memorized Sequences in LLM

Eine leichte Methode zum Disruptieren von gemerkten Sequenzen in LLM

LLM 中破坏记忆序列的轻量方法 2502.05159v2

Authors: Parjanya Prajakta Prashant, Kaustubh Ponkshe, Babak Salimi

As language models scale, their performance improves dramatically across a wide range of tasks, but so does their tendency to memorize and regurgitate parts of their training data verbatim. This tradeoff poses serious legal, ethical, and safety concerns, especially in real-world deployments. Existing mitigation techniques, such as differential privacy or model unlearning, often require retraining or access to internal weights, making them impractical for most users. In this work, we introduce TokenSwap, a lightweight, post-hoc defense designed for realistic settings where the user can only access token-level outputs. Our key insight is that while large models are necessary for high task performance, small models (e.g., DistilGPT-2) are often sufficient to assign fluent, grammatically plausible probabilities to common function words, and crucially, they memorize far less. By selectively swapping token probabilities between models, TokenSwap preserves the capabilities of large models while reducing their propensity for verbatim reproduction. Evaluations on Pythia-6.9B and Llama-3-8B show up to a 10$\times$ drop in exact memorization with negligible task degradation. Our method offers a practical, accessible solution for mitigating memorized generation in deployed LLMs.
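
One way to picture "swapping token probabilities between models" is to take the small model's probability for a fixed set of function words and keep the large model's probability for everything else. The abstract does not specify the exact rule, so the renormalization step, the function name, and the toy distributions below are all assumptions.

```python
def swap_token_probs(main_probs, aux_probs, swap_tokens):
    """Use the auxiliary model's probability for tokens in swap_tokens, then renormalize.

    main_probs and aux_probs map token -> next-token probability.
    """
    mixed = {
        tok: (aux_probs.get(tok, 0.0) if tok in swap_tokens else p)
        for tok, p in main_probs.items()
    }
    total = sum(mixed.values())
    if total <= 0.0:
        raise ValueError("no probability mass left after swapping")
    return {tok: p / total for tok, p in mixed.items()}

# Toy next-token distributions from a large (main) and a small (auxiliary) model.
main = {"the": 0.7, "cat": 0.2, "dog": 0.1}
aux = {"the": 0.3, "cat": 0.5, "dog": 0.2}
mixed = swap_token_probs(main, aux, swap_tokens={"the"})
```

Only output probabilities are touched, which matches the stated constraint that the user has token-level access but no access to weights.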


Article 874

Title@2025-05-27 (2): Can Large Language Models Understand Symbolic Graphics Programs?

Title: Can Large Language Models Understand Symbolic Graphics Programs?

Können große Sprachmodelle symbolische Grafikprogramme verstehen?

大语言模型能理解符号图形程序吗? 2408.08313v4

Authors: Zeju Qiu, Weiyang Liu, Haiwen Feng, Zhen Liu, Tim Z. Xiao, Katherine M. Collins, Joshua B. Tenenbaum, Adrian Weller, Michael J. Black, Bernhard Schölkopf

Against the backdrop of enthusiasm for large language models (LLMs), there is a growing need to scientifically assess their capabilities and shortcomings. This is nontrivial in part because it is difficult to find tasks which the models have not encountered during training. Utilizing symbolic graphics programs, we propose a domain well-suited to test multiple spatial-semantic reasoning skills of LLMs. Popular in computer graphics, these programs procedurally generate visual data. While LLMs exhibit impressive skills in general program synthesis and analysis, symbolic graphics programs offer a new layer of evaluation: they allow us to test an LLM’s ability to answer semantic questions about the images or 3D geometries without a vision encoder. To semantically understand the symbolic programs, LLMs would need to possess the ability to “imagine” and reason about how the corresponding graphics content would look, given only the symbolic description of the local curvatures and strokes. We use this task to evaluate LLMs by creating a large benchmark for the semantic visual understanding of symbolic graphics programs, built procedurally with minimal human effort. Particular emphasis is placed on transformations of images that leave the image-level semantics invariant while introducing significant changes to the underlying program. We evaluate commercial and open-source LLMs on our benchmark to assess their ability to reason about the visual output of programs, finding that LLMs considered stronger at reasoning generally perform better. Lastly, we introduce a novel method to improve this ability, Symbolic Instruction Tuning (SIT), in which the LLM is finetuned with pre-collected instruction data on symbolic graphics programs. Interestingly, we find that SIT not only improves LLMs’ understanding of symbolic programs, but also improves general reasoning ability on various other benchmarks.


Article 875

Title@2025-05-27 (2): Optimizing Deep Learning for Skin Cancer Classification: A Computationally Efficient CNN with Minimal Accuracy Trade-Off

Title: Optimizing Deep Learning for Skin Cancer Classification: A Computationally Efficient CNN with Minimal Accuracy Trade-Off

Deep Learning für Hautkrebs-Klassifikation optimieren: Ein Computational Efficient CNN mit minimaler Genauigkeit Trade-Off

最优化皮肤癌症分类深层学习:计算效率高的有线电视新闻网与最低准确性交易 2505.21597v1

Authors: Abdullah Al Mamun, Pollob Chandra Ray, Md Rahat Ul Nasib, Akash Das, Jia Uddin, Md Nurul Absur

The rapid advancement of deep learning in medical image analysis has greatly enhanced the accuracy of skin cancer classification. However, current state-of-the-art models, especially those based on transfer learning like ResNet50, come with significant computational overhead, rendering them impractical for deployment in resource-constrained environments. This study proposes a custom CNN model that achieves a 96.7% reduction in parameters (from 23.9 million in ResNet50 to 692,000) while maintaining a classification accuracy deviation of less than 0.022%. Our empirical analysis of the HAM10000 dataset reveals that although transfer learning models provide a marginal accuracy improvement of approximately 0.022%, they result in a staggering 13,216.76% increase in FLOPs, considerably raising computational costs and inference latency. In contrast, our lightweight CNN architecture, which encompasses only 30.04 million FLOPs compared to ResNet50’s 4.00 billion, significantly reduces energy consumption, memory footprint, and inference time. These findings underscore the trade-off between the complexity of deep models and their real-world feasibility, positioning our optimized CNN as a practical solution for mobile and edge-based skin cancer diagnostics.


Article 876

Title@2025-05-27 (2): Learning optimal treatment strategies for intraoperative hypotension using deep reinforcement learning

Title: Learning optimal treatment strategies for intraoperative hypotension using deep reinforcement learning

Optimale Therapiestrategien für intraoperative Hypotonie mit Deep-Enforcement-Lernen

利用深强化学习学习,学习采用最佳治疗战略,以弥补职业内衰退 2505.21596v1

Authors: Esra Adiyeke, Tianqi Liu, Venkata Sai Dheeraj Naganaboina, Han Li, Tyler J. Loftus, Yuanfang Ren, Benjamin Shickel, Matthew M. Ruppert, Karandeep Singh, Ruogu Fang, Parisa Rashidi, Azra Bihorac, Tezcan Ozrazgat-Baslanti

Traditional methods of surgical decision making rely heavily on human experience and prompt actions, which are variable. A data-driven system generating treatment recommendations based on patient states can be a substantial asset in perioperative decision-making, as in cases of intraoperative hypotension, for which suboptimal management is associated with acute kidney injury (AKI), a common and morbid postoperative complication. We developed a Reinforcement Learning (RL) model to recommend the optimum dose of intravenous (IV) fluid and vasopressors during surgery to avoid intraoperative hypotension and postoperative AKI. We retrospectively analyzed 50,021 surgeries from 42,547 adult patients who underwent major surgery at a quaternary care hospital between June 2014 and September 2020. Of these, 34,186 surgeries were used for model training and 15,835 surgeries were reserved for testing. We developed a Deep Q-Networks based RL model using 16 variables, including intraoperative physiologic time series and the total dose of IV fluid and vasopressors, extracted for every 15-minute epoch. The model replicated 69% of physicians’ decisions for the dosage of vasopressors and proposed a higher or lower dosage of vasopressors than received in 10% and 21% of the treatments, respectively. In terms of IV fluids, the model’s recommendations were within 0.05 ml/kg/15 min of the actual dose in 41% of the cases, with higher or lower doses recommended for 27% and 32% of the treatments, respectively. The model resulted in a higher estimated policy value compared to the physicians’ actual treatments, as well as random and zero-drug policies. AKI prevalence was the lowest in patients receiving medication dosages that aligned with the model’s decisions. Our findings suggest that implementation of the model’s policy has the potential to reduce postoperative AKI and improve other outcomes driven by intraoperative hypotension.


Article 877

Title@2025-05-27 (2): Relevance-driven Input Dropout: an Explanation-guided Regularization Technique

Title: Relevance-driven Input Dropout: an Explanation-guided Regularization Technique

Relevanz-gesteuerter Input Dropout: eine Erklärungs-geführte Regularisierungstechnik

由相关性驱动的 “ 投入辍学:解释指导规范化技术 “ 2505.21595v1

Authors: Shreyas Gururaj, Lars Grüne, Wojciech Samek, Sebastian Lapuschkin, Leander Weber

Overfitting is a well-known issue extending even to state-of-the-art (SOTA) Machine Learning (ML) models, resulting in reduced generalization, and a significant train-test performance gap. Mitigation measures include a combination of dropout, data augmentation, weight decay, and other regularization techniques. Among the various data augmentation strategies, occlusion is a prominent technique that typically focuses on randomly masking regions of the input during training. Most of the existing literature emphasizes randomness in selecting and modifying the input features instead of regions that strongly influence model decisions. We propose Relevance-driven Input Dropout (RelDrop), a novel data augmentation method which selectively occludes the most relevant regions of the input, nudging the model to use other important features in the prediction process, thus improving model generalization through informed regularization. We further conduct qualitative and quantitative analyses to study how Relevance-driven Input Dropout (RelDrop) affects model decision-making. Through a series of experiments on benchmark datasets, we demonstrate that our approach improves robustness towards occlusion, results in models utilizing more features within the region of interest, and boosts inference time generalization performance. Our code is available at https://github.com/Shreyas-Gururaj/LRP_Relevance_Dropout.
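
The contrast with random occlusion can be sketched in a few lines: instead of masking random positions, mask the positions with the highest relevance scores. This is a simplified illustration, not the released implementation. The relevance scores are passed in as a plain list (in practice they would come from an attribution method), and the top-`k` rule, the function name, and the fill value are assumptions.

```python
def occlude_most_relevant(inputs, relevance, k, fill=0.0):
    """Replace the k input features with the highest relevance scores by `fill`."""
    ranked = sorted(range(len(inputs)), key=lambda i: relevance[i], reverse=True)
    masked = set(ranked[:k])
    return [fill if i in masked else x for i, x in enumerate(inputs)]

# Toy example: features 1 and 3 are the most relevant, so they are occluded.
augmented = occlude_most_relevant(
    inputs=[1.0, 2.0, 3.0, 4.0],
    relevance=[0.1, 0.9, 0.3, 0.8],
    k=2,
)
```

Training on such inputs removes the features the model currently leans on most, which is how the method nudges it toward other informative features.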

Article 878

Title@2025-05-27 (2): Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges

Title: Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges

Benchmarking Spatiotemporal Reasoning in LLMs und Reasoning Models: Fähigkeiten und Herausforderungen

确定LLM和理由模型的偏差理由基准:能力和挑战 2505.11618v2

Authors: Pengrui Quan, Brian Wang, Kang Yang, Liying Han, Mani Srivastava

Spatiotemporal reasoning plays a key role in Cyber-Physical Systems (CPS). Despite advances in Large Language Models (LLMs) and Large Reasoning Models (LRMs), their capacity to reason about complex spatiotemporal signals remains underexplored. This paper proposes a hierarchical SpatioTemporal reAsoning benchmaRK, STARK, to systematically evaluate LLMs across three levels of reasoning complexity: state estimation (e.g., predicting field variables, localizing and tracking events in space and time), spatiotemporal reasoning over states (e.g., inferring spatial-temporal relationships), and world-knowledge-aware reasoning that integrates contextual and domain knowledge (e.g., intent prediction, landmark-aware navigation). We curate 26 distinct spatiotemporal tasks with diverse sensor modalities, comprising 14,552 challenges where models answer directly or via a Python code interpreter. Evaluating 3 LRMs and 8 LLMs, we find LLMs achieve limited success in tasks requiring geometric reasoning (e.g., multilateration or triangulation), particularly as complexity increases. Surprisingly, LRMs show robust performance across tasks with various levels of difficulty, often competing with or surpassing traditional first-principle-based methods. Our results show that in reasoning tasks requiring world knowledge, the performance gap between LLMs and LRMs narrows, with some LLMs even surpassing LRMs. However, the LRM o3 model continues to achieve leading performance across all evaluated tasks, a result attributed primarily to the larger size of the reasoning models. STARK motivates future innovations in model architectures and reasoning paradigms for intelligent CPS by providing a structured framework to identify limitations in the spatiotemporal reasoning of LLMs and LRMs.

Article 879

Title@2025-05-27 (2): Conflicting Biases at the Edge of Stability: Norm versus Sharpness Regularization

Title: Conflicting Biases at the Edge of Stability: Norm versus Sharpness Regularization

Widersprüchliche Biasen am Rande der Stabilität: Norm versus Schärfe Regularisierung

稳定边缘的冲突两重冲突:规范与尖锐的规范化 2505.21423v1

Authors: Vit Fojtik, Maria Matveev, Hung-Hsu Chou, Gitta Kutyniok, Johannes Maly

A widely believed explanation for the remarkable generalization capacities of overparameterized neural networks is that the optimization algorithms used for training induce an implicit bias towards benign solutions. To grasp this theoretically, recent works examine gradient descent and its variants in simplified training settings, often assuming vanishing learning rates. These studies reveal various forms of implicit regularization, such as $\ell_1$-norm minimizing parameters in regression and max-margin solutions in classification. Concurrently, empirical findings show that moderate to large learning rates exceeding standard stability thresholds lead to faster, albeit oscillatory, convergence in the so-called Edge-of-Stability regime, and induce an implicit bias towards minima of low sharpness (norm of training loss Hessian). In this work, we argue that a comprehensive understanding of the generalization performance of gradient descent requires analyzing the interaction between these various forms of implicit regularization. We empirically demonstrate that the learning rate balances between low parameter norm and low sharpness of the trained model. We furthermore prove for diagonal linear networks trained on a simple regression task that neither implicit bias alone minimizes the generalization error. These findings demonstrate that focusing on a single implicit bias is insufficient to explain good generalization, and they motivate a broader view of implicit regularization that captures the dynamic trade-off between norm and sharpness induced by non-negligible learning rates.
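
The "standard stability threshold" referred to above can be illustrated on a one-dimensional quadratic. This is a textbook sketch, not code from the paper: for the loss `0.5 * sharpness * x**2`, each gradient descent step multiplies `x` by `1 - lr * sharpness`, so the iterates contract only when `sharpness < 2 / lr`. The function name and defaults are hypothetical.

```python
def gd_on_quadratic(sharpness, lr, steps=100, x0=1.0):
    """Run gradient descent on f(x) = 0.5 * sharpness * x**2 and return x."""
    x = x0
    for _ in range(steps):
        x -= lr * sharpness * x  # gradient of f at x is sharpness * x
    return x

below = gd_on_quadratic(sharpness=1.0, lr=1.9)  # 1.0 < 2 / 1.9: oscillates but shrinks
above = gd_on_quadratic(sharpness=1.0, lr=2.1)  # 1.0 > 2 / 2.1: oscillates and grows
```

Read the other way round, a fixed learning rate `lr` can only settle in regions whose sharpness is at most about `2 / lr`, which is one informal way to see why larger learning rates favor flatter minima.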

Article 880

Title@2025-05-27 (2): When Shift Happens - Confounding Is to Blame

Title: When Shift Happens - Confounding Is to Blame

Wenn es zu einer Verschiebung kommt - Verwirren ist die Schuld

发生变迁时 - 令人不安的是责怪 2505.21422v1

Authors: Abbavaram Gowtham Reddy, Celia Rubio-Madrigal, Rebekka Burkholz, Krikamol Muandet

Distribution shifts introduce uncertainty that undermines the robustness and generalization capabilities of machine learning models. While conventional wisdom suggests that learning causal-invariant representations enhances robustness to such shifts, recent empirical studies present a counterintuitive finding: (i) empirical risk minimization (ERM) can rival or even outperform state-of-the-art out-of-distribution (OOD) generalization methods, and (ii) its OOD generalization performance improves when all available covariates, not just causal ones, are utilized. Drawing on both empirical and theoretical evidence, we attribute this phenomenon to hidden confounding. Shifts in hidden confounding induce changes in data distributions that violate assumptions commonly made by existing OOD generalization approaches. Under such conditions, we prove that effective generalization requires learning environment-specific relationships, rather than relying solely on invariant ones. Furthermore, we show that models augmented with proxies for hidden confounders can mitigate the challenges posed by hidden confounding shifts. These findings offer new theoretical insights and practical guidance for designing robust OOD generalization algorithms and principled covariate selection strategies.

Article 881

Title@2025-05-27 (2): A Physics-Augmented GraphGPS Framework for the Reconstruction of 3D Riemann Problems from Sparse Data

Title: A Physics-Augmented GraphGPS Framework for the Reconstruction of 3D Riemann Problems from Sparse Data

Ein physikgestütztes GraphGPS-Framework für den Wiederaufbau von 3D Riemann-Problemen aus Sparse-Daten

物理辅助图形GPS框架,用于从简简数据中重建3D里伊曼问题 2505.21421v1

Authors: Rami Cassia, Rich Kerswell

In compressible fluid flow, reconstructing shocks, discontinuities, rarefactions, and their interactions from sparse measurements is an important inverse problem with practical applications. Moreover, physics-informed machine learning has recently become an increasingly popular approach for performing reconstruction tasks. In this work we explore a machine learning recipe, known as GraphGPS, for reconstructing canonical compressible flows known as 3D Riemann problems from sparse observations, in a physics-informed manner. The GraphGPS framework combines the benefits of positional encodings, local message-passing of graphs, and global contextual awareness, and we explore the latter two components through an ablation study. Furthermore, we modify the aggregation step of message-passing such that it is aware of shocks and discontinuities, resulting in sharper reconstructions of these features. Additionally, we modify message-passing such that information flows strictly from known nodes only, which results in computational savings, better training convergence, and no degradation of reconstruction accuracy. We also show that the GraphGPS framework outperforms numerous machine learning benchmarks.

Article 882

Title@2025-05-27 (2): From Continual Learning to SGD and Back: Better Rates for Continual Linear Models

Title: From Continual Learning to SGD and Back: Better Rates for Continual Linear Models

Vom kontinuierlichen Lernen bis hin zu SGD und Back: Bessere Preise für kontinuierliche lineare Modelle

从持续学习到SGD和后退:持续线性模型的更好比率 2504.04579v2

Authors: Itay Evron, Ran Levinstein, Matan Schliserman, Uri Sherman, Tomer Koren, Daniel Soudry, Nathan Srebro

We theoretically study the common continual learning setup where an overparameterized model is sequentially fitted to a set of jointly realizable tasks. We analyze the forgetting, i.e., loss on previously seen tasks, after $k$ iterations. For continual linear models, we prove that fitting a task is equivalent to a single stochastic gradient descent (SGD) step on a modified objective. We develop novel last-iterate SGD upper bounds in the realizable least squares setup, which we then leverage to derive new results for continual learning. Focusing on random orderings over $T$ tasks, we establish universal forgetting rates, whereas existing rates depend on the problem dimensionality or complexity. Specifically, in continual regression with replacement, we improve the best existing rate from $O((d-r)/k)$ to $O(\min(k^{-1/4}, \sqrt{d-r}/k, \sqrt{Tr}/k))$, where $d$ is the dimensionality and $r$ the average task rank. Furthermore, we establish the first rate for random task orderings without replacement. The obtained rate of $O(\min(T^{-1/4}, (d-r)/T))$ proves for the first time that randomization alone, with no task repetition, can prevent catastrophic forgetting in sufficiently long task sequences. Finally, we prove a matching $O(k^{-1/4})$ forgetting rate for continual linear classification on separable data. Our universal rates apply for broader projection methods, such as block Kaczmarz and POCS, illuminating their loss convergence under i.i.d. and one-pass orderings.
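
The setup can be sketched numerically under one stated assumption: fitting a linear task to convergence from the previous weights is modeled as a projection onto that task's solution set, which is the block Kaczmarz view the abstract alludes to. The helper names, problem sizes, and random data below are hypothetical, and the sketch does not reproduce the paper's bounds.

```python
import numpy as np

def fit_task(w, X, y):
    """Project w onto {v : X v = y}, i.e. the minimum-norm update that fits the task."""
    return w - np.linalg.pinv(X) @ (X @ w - y)

def forgetting(w, tasks):
    """Average squared loss of w over a list of (X, y) tasks."""
    return float(np.mean([np.mean((X @ w - y) ** 2) for X, y in tasks]))

rng = np.random.default_rng(0)
d, T, n = 20, 5, 3                 # dimension, number of tasks, samples per task
w_star = rng.normal(size=d)        # shared solution: tasks are jointly realizable
tasks = []
for _ in range(T):
    X = rng.normal(size=(n, d))
    tasks.append((X, X @ w_star))

w = np.zeros(d)
for k in range(50):                # random task ordering with replacement
    X, y = tasks[rng.integers(T)]
    w = fit_task(w, X, y)

loss_on_all_tasks = forgetting(w, tasks)
```

Because every solution set contains `w_star`, each projection can only move `w` closer to `w_star` (or leave the distance unchanged); the loss on earlier tasks, however, need not decrease monotonically, which is why forgetting rates require a separate analysis.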

Article 883

Title@2025-05-27 (2): Efficiently Scaling LLM Reasoning with Certaindex

Title: Efficiently Scaling LLM Reasoning with Certaindex

Effiziente Skalierung der LLM-Vernunft mit bestimmtem Dex

高效扩增 LLM 使用 emitedex 说明 2412.20993v2

Authors: Yichao Fu, Junda Chen, Siqi Zhu, Zheyu Fu, Zhongdongming Dai, Yonghao Zhuang, Yian Ma, Aurick Qiao, Tajana Rosing, Ion Stoica, Hao Zhang

Test-time reasoning algorithms such as chain-of-thought, self-consistency, and MCTS enhance LLM problem-solving but can wastefully generate many tokens without improving accuracy. At the same time, we observe that these algorithms exhibit answer stabilization: their intermediate solutions often cease to change after a certain point, and further investment of compute does not change their final answer. To quantify this phenomenon, we introduce Certaindex, an algorithm-agnostic metric measuring this evolving stability, signaling when further computation is unlikely to alter the final result. Certaindex is lightweight, can accelerate reasoning program inference via early exit, and further enables dynamic token allocation, gang scheduling, and many opportunities when integrated with real-world LLM serving systems. To quantify real-world benefits, we built Certaindex as a scheduler into Dynasor, our reasoning-aware LLM serving system, and demonstrate up to 50% compute savings and 3.3x higher throughput in real workloads with no accuracy drop. Our code is available at https://github.com/hao-ai-lab/Dynasor.git
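
The answer-stabilization idea can be illustrated with a toy early-exit loop. This is not the Certaindex metric itself, whose definition the abstract does not give: the `stability` proxy below (agreement among the last few intermediate answers), the `window` and `threshold` parameters, and the `step_fn` callback are all assumptions made for illustration.

```python
from collections import Counter

def stability(answers, window=4):
    """Fraction of the last `window` answers that agree with the most common one."""
    recent = answers[-window:]
    if len(recent) < window:
        return 0.0
    return Counter(recent).most_common(1)[0][1] / window

def run_with_early_exit(step_fn, max_steps, window=4, threshold=1.0):
    """Call step_fn(step) until the answers stabilize or max_steps is reached."""
    answers = []
    for step in range(max_steps):
        answers.append(step_fn(step))
        if stability(answers, window) >= threshold:
            break
    return answers[-1], len(answers)

# Toy reasoning process whose intermediate answer changes once and then settles.
answer, steps_used = run_with_early_exit(lambda s: "A" if s < 2 else "B", max_steps=20)
```

In this toy run the loop stops after 6 of the 20 allowed steps, once the last four answers all equal "B".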

Article 884

Title@2025-05-27 (2): A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment

Title: A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment

Ein Rahmen für die strittige Analyse von Entscheidungsunterstützungssystemen vor der Einführung

在部署之前对决定支助系统进行反对分析的框架 2505.21414v1

Authors: Brett Bissey, Kyle Gatesman, Walker Dimon, Mohammad Alam, Luis Robaina, Joseph Weissman

This paper introduces a comprehensive framework designed to analyze and secure decision-support systems trained with Deep Reinforcement Learning (DRL), prior to deployment, by providing insights into learned behavior patterns and vulnerabilities discovered through simulation. The introduced framework aids in the development of precisely timed and targeted observation perturbations, enabling researchers to assess adversarial attack outcomes within a strategic decision-making context. We validate our framework, visualize agent behavior, and evaluate adversarial outcomes within the context of a custom-built strategic game, CyberStrike. Utilizing the proposed framework, we introduce a method for systematically discovering and ranking the impact of attacks on various observation indices and time-steps, and we conduct experiments to evaluate the transferability of adversarial attacks across agent architectures and DRL training algorithms. The findings underscore the critical need for robust adversarial defense mechanisms to protect decision-making policies in high-stakes environments.

Article 885

Title@2025-05-27 (2): Comparison of the Cox proportional hazards model and Random Survival Forest algorithm for predicting patient-specific survival probabilities in clinical trial data

Title: Comparison of the Cox proportional hazards model and Random Survival Forest algorithm for predicting patient-specific survival probabilities in clinical trial data

Vergleich des Cox-Proportional-Hazards-Modells und des Random Survival Forest-Algorithmus zur Vorhersage patientenspezifischer Überlebenswahrscheinlichkeiten in klinischen Studiendaten

比较Cox按比例比例危害模型和随机生存森林算法,以预测临床试验数据中特定患者生存概率 2502.03119v2

Authors: Ricarda Graf, Susan Todd, M. Fazil Baksh

The Cox proportional hazards model is often used to analyze data from Randomized Controlled Trials (RCT) with time-to-event outcomes. Random survival forest (RSF) is a machine-learning algorithm known for its high predictive performance. We conduct a comprehensive neutral comparison study to compare the performance of Cox regression and RSF in various simulation scenarios based on two reference datasets from RCTs. The motivation is to identify settings in which one method is preferable over the other when comparing different aspects of performance using measures according to the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis) recommendations. Our results show that conclusions solely based on the C index, a performance measure that has been predominantly used in previous studies comparing predictive accuracy of the Cox-PH and RSF model based on real-world observational time-to-event data and that has been criticized by methodologists, may not be generalizable to other aspects of predictive performance. We found that measures of overall performance may generally give more reasonable results, and that the standard log-rank splitting rule used for the RSF may be outperformed by alternative splitting rules, in particular in nonproportional hazards settings. In our simulations, performance of the RSF suffers less in data with treatment-covariate interactions compared to data where these are absent. Performance of the Cox-PH model is affected by the violation of the proportional hazards assumption.

Article 886

Title@2025-05-27 (2): MRSD: Multi-Resolution Skill Discovery for HRL Agents

Title: MRSD: Multi-Resolution Skill Discovery for HRL Agents

MRSD: Multi-Resolution Skill Discovery für HRL-Agenten

MRSD: HRL代理机构多分辨率技能发现 2505.21410v1

Authors: Shashank Sharma, Janina Hoffmann, Vinay Namboodiri

Hierarchical reinforcement learning (HRL) relies on abstract skills to solve long-horizon tasks efficiently. While existing skill discovery methods learn these skills automatically, they are limited to a single skill per task. In contrast, humans learn and use both fine-grained and coarse motor skills simultaneously. Inspired by human motor control, we propose Multi-Resolution Skill Discovery (MRSD), an HRL framework that learns multiple skill encoders at different temporal resolutions in parallel. A high-level manager dynamically selects among these skills, enabling adaptive control strategies over time. We evaluate MRSD on tasks from the DeepMind Control Suite and show that it outperforms prior state-of-the-art skill discovery and HRL methods, achieving faster convergence and higher final performance. Our findings highlight the benefits of integrating multi-resolution skills in HRL, paving the way for more versatile and efficient agents.

Article 887

Title@2025-05-27 (2): Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks

Title: Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks

Dual Natural Gradient Descent für skalierbare Ausbildung von physikinformierten Neuronalen Netzwerken

物理内成形神经网络可缩放培训 2505.21404v1

Authors: Anas Jnini, Flavio Vella

Natural-gradient methods markedly accelerate the training of Physics-Informed Neural Networks (PINNs), yet their Gauss–Newton update must be solved in the parameter space, incurring a prohibitive $O(n^3)$ time complexity, where $n$ is the number of trainable network weights. We show that exactly the same step can instead be formulated in a generally smaller residual space of size $m = \sum_{\gamma} N_{\gamma} d_{\gamma}$, where each residual class $\gamma$ (e.g., PDE interior, boundary, initial data) contributes $N_{\gamma}$ collocation points of output dimension $d_{\gamma}$. Building on this insight, we introduce Dual Natural Gradient Descent (D-NGD). D-NGD computes the Gauss–Newton step in residual space, augments it with a geodesic-acceleration correction at negligible extra cost, and provides both a dense direct solver for modest $m$ and a Nyström-preconditioned conjugate-gradient solver for larger $m$. Experimentally, D-NGD scales second-order PINN optimization to networks with up to 12.8 million parameters, delivers a final $L^2$ error one to three orders of magnitude lower than first-order methods (Adam, SGD) and quasi-Newton methods, and – crucially – enables natural-gradient training of PINNs at this scale on a single GPU.
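
Why the same step can be computed in residual space can be checked numerically with a standard linear-algebra identity. The sketch below assumes a damped Gauss–Newton step with damping `lam > 0`, which is an assumption of this illustration and may differ from the paper's exact formulation; `J` stands for the `m x n` Jacobian of the residuals and `r` for the residual vector, and both function names are hypothetical.

```python
import numpy as np

def gn_step_primal(J, r, lam):
    """Solve the n x n system (J^T J + lam I) delta = J^T r in parameter space."""
    n = J.shape[1]
    return np.linalg.solve(J.T @ J + lam * np.eye(n), J.T @ r)

def gn_step_dual(J, r, lam):
    """Solve the m x m system (J J^T + lam I) z = r, then map back: delta = J^T z."""
    m = J.shape[0]
    return J.T @ np.linalg.solve(J @ J.T + lam * np.eye(m), r)

rng = np.random.default_rng(0)
m, n = 5, 50                       # few residuals, many parameters
J = rng.normal(size=(m, n))
r = rng.normal(size=m)
step_primal = gn_step_primal(J, r, lam=1e-2)
step_dual = gn_step_dual(J, r, lam=1e-2)
```

Both functions return the same vector up to floating-point error, but the dual form factorizes an `m x m` matrix instead of an `n x n` one, which is the source of the savings when `m` is much smaller than `n`.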

Article 888

Title@2025-05-27 (2): A Convergence Theory for Diffusion Language Models: An Information-Theoretic Perspective

Title: A Convergence Theory for Diffusion Language Models: An Information-Theoretic Perspective

Eine Konvergenztheorie für Diffusions-Sprachmodelle: Eine informationstheoretische Perspektive

传播语言模型集成理论:信息理论视角 2505.21400v1

Authors: Gen Li, Changxiao Cai

Diffusion models have emerged as a powerful paradigm for modern generative modeling, demonstrating strong potential for large language models (LLMs). Unlike conventional autoregressive (AR) models that generate tokens sequentially, diffusion models enable parallel token sampling, leading to faster generation and eliminating left-to-right generation constraints. Despite their empirical success, the theoretical understanding of diffusion model approaches remains underdeveloped. In this work, we develop convergence guarantees for diffusion language models from an information-theoretic perspective. Our analysis demonstrates that the sampling error, measured by the Kullback-Leibler (KL) divergence, decays inversely with the number of iterations $T$ and scales linearly with the mutual information between tokens in the target text sequence. In particular, we establish matching upper and lower bounds, up to some constant factor, to demonstrate the tightness of our convergence analysis. These results offer novel theoretical insights into the practical effectiveness of diffusion language models.

Article 889

Title@2025-05-27 (2): Factual Self-Awareness in Language Models: Representation, Robustness, and Scaling

Title: Factual Self-Awareness in Language Models: Representation, Robustness, and Scaling

Factual Self-Awareness in Sprachmodellen: Repräsentation, Robustheit und Skalierung

语言模式中的事实自觉意识:代表性、强力和比例 2505.21399v1

Authors: Hovhannes Tamoyan, Subhabrata Dutta, Iryna Gurevych

Factual incorrectness in generated content is one of the primary concerns in ubiquitous deployment of large language models (LLMs). Prior findings suggest LLMs can (sometimes) detect factual incorrectness in their generated content (i.e., fact-checking post-generation). In this work, we provide evidence supporting the presence of an internal compass in LLMs that dictates the correctness of factual recall at the time of generation. We demonstrate that for a given subject entity and a relation, LLMs internally encode linear features in the Transformer’s residual stream that dictate whether the model will be able to recall the correct attribute (that forms a valid entity-relation-attribute triplet). This self-awareness signal is robust to minor formatting variations. We investigate the effects of context perturbation via different example selection strategies. Scaling experiments across model sizes and training dynamics highlight that self-awareness emerges rapidly during training and peaks in intermediate layers. These findings uncover intrinsic self-monitoring capabilities within LLMs, contributing to their interpretability and reliability.

Article 890

Title@2025-05-27 (2): Square$χ$PO: Differentially Private and Robust $χ^2$-Preference Optimization in Offline Direct Alignment

Title: Square$χ$PO: Differentially Private and Robust $χ^2$-Preference Optimization in Offline Direct Alignment

Square$x$PO: Differential privat und robust $x^2$-Preference Optimierung in Offline Direct Alignment

平方美元=美元PO$:在离线直接调整中区别对待的私人和强势的美元=2美元-优惠优化 2505.21395v1

Authors: Xingyu Zhou, Yulian Wu, Wenqian Weng, Francesco Orabona

In this paper, we theoretically study the offline alignment of language models with human preference feedback, under both preference label corruption and privacy protections. To this end, we propose Square$\chi$PO, a simple one-line change to $\chi$PO where the standard log-loss is replaced by a new square loss over probability. Thanks to the inherent properties of this new loss, we have advanced the state-of-the-art of differentially private and robust offline direct alignment. Specifically, for the local model of label privacy, Square$\chi$PO is the first algorithm that attains an optimal rate based on single-policy concentrability even with general function approximations. It also gives the first result under the central model of privacy protection over both prompts (responses) and labels. On the robustness side against Huber label corruption, Square$\chi$PO is the first alignment method that has a meaningful theoretical guarantee under general function approximations. More importantly, Square$\chi$PO can address privacy protection and corruption simultaneously, where an interesting separation is observed, implying that the order of privacy and corruption matters. Furthermore, we show that Square$\chi$PO can also be easily extended to handle the scenario of the general preference model with state-of-the-art guarantees under corruption and privacy. Last but not least, all of our theoretical guarantees enjoy a unified analysis, building upon a new result on the generalization error bounds of least-square regression under corruption and privacy constraints, which we believe is of independent interest to the community.

Article 891

Title@2025-05-27 (2): Foundation Models on a Budget: Approximating Blocks in Large Vision Models

Title: Foundation Models on a Budget: Approximating Blocks in Large Vision Models

Basismodelle auf einem Budget: Annähernde Blöcke in großen Visionsmodellen

预算模式基础模式:大愿景模式中类似障碍 2410.04941v5

Authors: Irene Cannistraci, Simone Antonelli, Emanuele Palumbo, Thomas M. Sutter, Emanuele Rodolà, Bastian Rieck, Julia E. Vogt

Foundation Models have shown impressive performance in various tasks and domains, yet they require massive computational resources, raising concerns about accessibility and sustainability. Previous attempts to reduce foundation model size fall short of fully addressing the problem, as they end up increasing computational load through additional training steps. Recent works reveal that deep neural networks exhibit internal representation similarities. While inter-network similarities have enabled techniques such as model stitching and merging, intra-network similarities remain underexplored for improving efficiency. In this paper, we propose Transformer Blocks Approximation (TBA), a novel method that leverages intra-network similarities to identify and approximate transformer blocks in large vision models. TBA replaces these blocks using lightweight, closed-form transformations, without retraining or fine-tuning the rest of the model. The proposed method reduces the number of parameters while having minimal impact on the downstream task. We validate the effectiveness and generalizability of TBA through extensive experiments across multiple datasets (e.g., Imagenet-1k and CIFAR100) and state-of-the-art pretrained vision models (e.g., ViT, DiNO-v2, and DEiT).
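
One way a "closed-form transformation" could stand in for a block is a least-squares linear map between the representations entering and leaving it. The abstract does not say which transformation TBA uses, so the linear map, the function name `fit_linear_block`, and the synthetic data below are assumptions of this sketch rather than the paper's method.

```python
import numpy as np

def fit_linear_block(H_in, H_out):
    """Closed-form W minimizing ||H_in @ W - H_out||_F (ordinary least squares)."""
    W, *_ = np.linalg.lstsq(H_in, H_out, rcond=None)
    return W

# Synthetic stand-in for token representations before and after a block.
rng = np.random.default_rng(0)
H_in = rng.normal(size=(100, 8))          # 100 tokens, hidden size 8
W_true = rng.normal(size=(8, 8))
H_out = H_in @ W_true                     # an exactly linear "block" for the demo
W_hat = fit_linear_block(H_in, H_out)
approx_out = H_in @ W_hat                 # replaces the block's forward pass
```

No gradient steps are involved: the map is obtained from a single least-squares solve on collected activations, which is what makes this kind of replacement training-free.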

Article 892

Title@2025-05-27 (2): Leveraging the Power of Conversations: Optimal Key Term Selection in Conversational Contextual Bandits

Title: Leveraging the Power of Conversations: Optimal Key Term Selection in Conversational Contextual Bandits

Die Macht der Gespräche nutzen: Optimale Auswahl der Schlüsselbegriffe in konversatorischen Kontextbanditen

利用对话的力量:在对话背景强盗中最佳关键条件选择 2505.21393v1

Authors: Maoli Liu, Zhuohua Li, Xiangxiang Dai, John C. S. Lui

Conversational recommender systems proactively query users with relevant “key terms” and leverage the feedback to elicit users’ preferences for personalized recommendations. Conversational contextual bandits, a prevalent approach in this domain, aim to optimize preference learning by balancing exploitation and exploration. However, several limitations hinder their effectiveness in real-world scenarios. First, existing algorithms employ key term selection strategies with insufficient exploration, often failing to thoroughly probe users’ preferences and resulting in suboptimal preference estimation. Second, current algorithms typically rely on deterministic rules to initiate conversations, causing unnecessary interactions when preferences are well-understood and missed opportunities when preferences are uncertain. To address these limitations, we propose three novel algorithms: CLiSK, CLiME, and CLiSK-ME. CLiSK introduces smoothed key term contexts to enhance exploration in preference learning, CLiME adaptively initiates conversations based on preference uncertainty, and CLiSK-ME integrates both techniques. We theoretically prove that all three algorithms achieve a tighter regret upper bound of $O(\sqrt{dT\log{T}})$ with respect to the time horizon $T$, improving upon existing methods. Additionally, we provide a matching lower bound $\Omega(\sqrt{dT})$ for conversational bandits, demonstrating that our algorithms are nearly minimax optimal. Extensive evaluations on both synthetic and real-world datasets show that our approaches achieve at least a 14.6% improvement in cumulative regret.

Article 893

Title@2025-05-27 (2): Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features

Title: Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features

Finite-Probenanalyse von linearen zeitlichen Unterschieden Lernen mit willkürlichen Funktionen

具有任意地貌特征的线性时间上差异学习的简单抽样分析 2505.21391v1

Authors: Zixuan Xie, Xinyu Liu, Rohan Chandra, Shangtong Zhang

Linear TD($\lambda$) is one of the most fundamental reinforcement learning algorithms for policy evaluation. Previously, convergence rates have typically been established under the assumption of linearly independent features, which does not hold in many practical scenarios. This paper instead establishes the first $L^2$ convergence rates for linear TD($\lambda$) operating under arbitrary features, without making any algorithmic modification or additional assumptions. Our results apply to both the discounted and average-reward settings. To address the potential non-uniqueness of solutions resulting from arbitrary features, we develop a novel stochastic approximation result featuring convergence rates to the solution set instead of a single point.
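
For reference, the discounted linear TD($\lambda$) update with accumulating eligibility traces can be sketched as below. This is the standard textbook form of the algorithm, not code from the paper; the two-state chain, the step size, and the deliberately linearly dependent feature matrix (third column equals the sum of the first two) are hypothetical choices meant to mirror the "arbitrary features" setting.

```python
import numpy as np

def linear_td_lambda(P, r, Phi, gamma=0.9, lam=0.5, alpha=0.05, steps=5000, seed=0):
    """Policy evaluation on a Markov chain with transition matrix P and rewards r."""
    rng = np.random.default_rng(seed)
    n_states, d = Phi.shape
    w = np.zeros(d)        # weight vector
    e = np.zeros(d)        # eligibility trace
    s = 0
    for _ in range(steps):
        s_next = rng.choice(n_states, p=P[s])
        e = gamma * lam * e + Phi[s]
        delta = r[s] + gamma * Phi[s_next] @ w - Phi[s] @ w   # TD error
        w = w + alpha * delta * e
        s = s_next
    return w

P = np.array([[0.5, 0.5], [0.5, 0.5]])
r = np.array([1.0, 0.0])
Phi = np.array([[1.0, 0.0, 1.0],
                [0.0, 1.0, 1.0]])   # columns are linearly dependent
w = linear_td_lambda(P, r, Phi)
values = Phi @ w                    # estimated state values
```

With dependent columns, many weight vectors produce the same `Phi @ w`, so it is natural to talk about convergence to a set of solutions rather than to a single point.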

Article 894

Title@2025-05-27 (2): DeCAF: Decentralized Consensus-And-Factorization for Low-Rank Adaptation of Foundation Models

Title: DeCAF: Decentralized Consensus-And-Factorization for Low-Rank Adaptation of Foundation Models

DeCAF: Dezentrale Konsens-und-Factorisierung für Low-Rank-Anpassung von Stiftungsmodellen

DeCAF: 基金会模式的低成本改造的分散化共识和因素 2505.21382v1

Authors: Nastaran Saadati, Zhanhong Jiang, Joshua R. Waite, Shreyan Ganguly, Aditya Balu, Chinmay Hegde, Soumik Sarkar

Low-Rank Adaptation (LoRA) has emerged as one of the most effective, computationally tractable fine-tuning approaches for training Vision-Language Models (VLMs) and Large Language Models (LLMs). LoRA accomplishes this by freezing the pre-trained model weights and injecting trainable low-rank matrices, allowing for efficient learning of these foundation models even on edge devices. However, LoRA in decentralized settings remains underexplored, particularly its theoretical underpinnings, due to the lack of smoothness guarantee and model consensus interference (defined formally below). This work improves the convergence rate of decentralized LoRA (DLoRA) to match the rate of decentralized SGD by ensuring gradient smoothness. We also introduce DeCAF, a novel algorithm integrating DLoRA with truncated singular value decomposition (TSVD)-based matrix factorization to resolve consensus interference. Theoretical analysis shows TSVD’s approximation error is bounded and consensus differences between DLoRA and DeCAF vanish as rank increases, yielding DeCAF’s matching convergence rate. Extensive experiments across vision/language tasks demonstrate our algorithms outperform local training and rival federated learning under both IID and non-IID data distributions.
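
The TSVD-based factorization step can be sketched in isolation. The abstract does not spell out how DeCAF combines agents' updates, so the scenario below (averaging the agents' low-rank products and refactoring the average to rank `r`) is an assumption made for illustration, as are the function name `tsvd_factorize` and all sizes.

```python
import numpy as np

def tsvd_factorize(M, rank):
    """Best rank-`rank` factorization M ~ B @ A via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    root = np.sqrt(s[:rank])
    B = U[:, :rank] * root           # shape (rows, rank)
    A = root[:, None] * Vt[:rank]    # shape (rank, cols)
    return B, A

rng = np.random.default_rng(0)
rank, n_agents = 2, 3
Bs = [rng.normal(size=(8, rank)) for _ in range(n_agents)]
As = [rng.normal(size=(rank, 6)) for _ in range(n_agents)]

mean_of_products = np.mean([B @ A for B, A in zip(Bs, As)], axis=0)
product_of_means = np.mean(Bs, axis=0) @ np.mean(As, axis=0)
B_new, A_new = tsvd_factorize(mean_of_products, rank)
```

Averaging the factors separately generally gives a different matrix than averaging the products, so `product_of_means` and `mean_of_products` differ here; refactoring the averaged product with a truncated SVD yields the closest rank-`r` pair in Frobenius norm.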

Article 895

Title@2025-05-27 (2): Securing Federated Learning against Backdoor Threats with Foundation Model Integration

Title: Securing Federated Learning against Backdoor Threats with Foundation Model Integration

Sichern von Federated Learning gegen Hintertürbedrohungen durch die Integration von Foundation-Modellen

安全联邦学习应对后门威胁,采用基金会模式一体化模式 2410.17573v3

Authors: Xiaohuan Bi, Xi Li

Federated Learning (FL) enables decentralized model training while preserving privacy. Recently, the integration of Foundation Models (FMs) into FL has enhanced performance but introduced a novel backdoor attack mechanism. Attackers can exploit FM vulnerabilities to embed backdoors into synthetic data generated by FMs. During global model fusion, these backdoors are transferred to the global model through compromised synthetic data, subsequently infecting all client models. Existing FL backdoor defenses are ineffective against this novel attack due to its fundamentally different mechanism compared to classic ones. In this work, we propose a novel data-free defense strategy that addresses both classic and novel backdoor attacks in FL. The shared attack pattern lies in the abnormal activations within the hidden feature space during model aggregation. Hence, we propose to constrain internal activations to remain within reasonable ranges, effectively mitigating attacks while preserving model functionality. The activation constraints are optimized using synthetic data alongside FL training. Extensive experiments demonstrate its effectiveness against both novel and classic backdoor attacks, outperforming existing defenses.

Article 896

Title@2025-05-27 (2): Linear $Q$-Learning Does Not Diverge in $L^2$: Convergence Rates to a Bounded Set

Title: Linear $Q$-Learning Does Not Diverge in $L^2$: Convergence Rates to a Bounded Set

Lineares $Q$-Lernen unterscheidet sich nicht in $L^2$: Konvergenzraten zu einem begrenzten Satz

线性 $Q $ 美元学习的学习不以 $L $2 美元进行 : 汇合率与环形集的汇合率 2501.19254v4

Authors: Xinyu Liu, Zixuan Xie, Shangtong Zhang

$Q$-learning is one of the most fundamental reinforcement learning algorithms. It was widely believed that $Q$-learning with linear function approximation (i.e., linear $Q$-learning) may diverge, until the recent work of Meyn (2024), which establishes the ultimate almost sure boundedness of the iterates of linear $Q$-learning. Building on this success, this paper further establishes the first $L^2$ convergence rate of linear $Q$-learning iterates (to a bounded set). Similar to Meyn (2024), we do not make any modification to the original linear $Q$-learning algorithm, do not make any Bellman completeness assumption, and do not make any near-optimality assumption on the behavior policy. All we need is an $\epsilon$-softmax behavior policy with an adaptive temperature. The key to our analysis is the general result of stochastic approximations under Markovian noise with fast-changing transition functions. As a side product, we also use this general result to establish the $L^2$ convergence rate of tabular $Q$-learning with an $\epsilon$-softmax behavior policy, for which we rely on a novel pseudo-contraction property of the weighted Bellman optimality operator.

Article 897

Title@2025-05-27 (2): Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

Title: Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

Chain-of-Zoom: Extreme Super-Resolution über Scale Autoregression und Preference Alignment

缩放链缩放链 : 通过缩放自动递减和偏好对齐, 极超分辨率 2505.18600v2

Authors: Bryan Sangwoo Kim, Jeongsol Kim, Jong Chul Ye

Modern single-image super-resolution (SISR) models deliver photo-realistic results at the scale factors on which they are trained, but collapse when asked to magnify far beyond that regime. We address this scalability bottleneck with Chain-of-Zoom (CoZ), a model-agnostic framework that factorizes SISR into an autoregressive chain of intermediate scale-states with multi-scale-aware prompts. CoZ repeatedly re-uses a backbone SR model, decomposing the conditional probability into tractable sub-problems to achieve extreme resolutions without additional training. Because visual cues diminish at high magnifications, we augment each zoom step with multi-scale-aware text prompts generated by a vision-language model (VLM). The prompt extractor itself is fine-tuned using Generalized Reward Policy Optimization (GRPO) with a critic VLM, aligning text guidance towards human preference. Experiments show that a standard 4x diffusion SR model wrapped in CoZ attains beyond 256x enlargement with high perceptual quality and fidelity. Project Page: https://bryanswkim.github.io/chain-of-zoom/ .
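
The scale arithmetic behind the 256x figure can be sketched as follows: chaining a 4x model four times gives $4^4 = 256$. The callables `sr_step` and `make_prompt` are hypothetical stand-ins for the backbone SR model and the VLM prompt extractor; they are not the project's actual API.

```python
def cumulative_scales(base_factor, target_factor):
    """Cumulative magnification after each zoom step until the target is reached."""
    scales = []
    total = 1
    while total < target_factor:
        total *= base_factor
        scales.append(total)
    return scales


def chain_of_zoom(image, sr_step, make_prompt, num_steps):
    """Re-apply one SR model autoregressively, conditioning each step on earlier scale-states."""
    history = [image]
    for _ in range(num_steps):
        prompt = make_prompt(history)                  # stand-in for the prompt extractor
        history.append(sr_step(history[-1], prompt))   # stand-in for the backbone SR model
    return history[-1]
```

For example, `cumulative_scales(4, 256)` lists the intermediate magnifications a 4x backbone passes through on the way to 256x.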

Article 898

Title@2025-05-27 (2): Improving LLM-based Global Optimization with Search Space Partitioning

Title: Improving LLM-based Global Optimization with Search Space Partitioning

Verbesserung der globalen Optimierung auf LLM-Basis mit Search Space Partitioning

改进以LLM为基础的全球最佳利用搜索空间分割法 2505.21372v1

Authors: Andrej Schwanke, Lyubomir Ivanov, David Salinas, Fabio Ferreira, Aaron Klein, Frank Hutter, Arber Zela

Large Language Models (LLMs) have recently emerged as effective surrogate models and candidate generators within global optimization frameworks for expensive blackbox functions. Despite promising results, LLM-based methods often struggle in high-dimensional search spaces or when lacking domain-specific priors, leading to sparse or uninformative suggestions. To overcome these limitations, we propose HOLLM, a novel global optimization algorithm that enhances LLM-driven sampling by partitioning the search space into promising subregions. Each subregion acts as a “meta-arm” selected via a bandit-inspired scoring mechanism that effectively balances exploration and exploitation. Within each selected subregion, an LLM then proposes high-quality candidate points, without any explicit domain knowledge. Empirical evaluation on standard optimization benchmarks shows that HOLLM consistently matches or surpasses leading Bayesian optimization and trust-region methods, while substantially outperforming global LLM-based sampling strategies.
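
The partition-then-select idea can be illustrated with a small sketch. The abstract only says the scoring is "bandit-inspired", so the UCB-style score below is an assumption, not HOLLM's actual rule; the uniform sampler in `propose` is a placeholder for the LLM proposal, and the objective is assumed to be maximized.

```python
import math
import random


def ucb_score(values, total_evals, c=1.0):
    """Best observed value in a region plus an exploration bonus (assumed scoring rule)."""
    if not values:
        return float("inf")  # unvisited regions are tried first
    bonus = c * math.sqrt(math.log(total_evals + 1) / len(values))
    return max(values) + bonus


def select_region(regions, history, c=1.0):
    """Pick the index of the subregion ("meta-arm") with the highest score."""
    total = sum(len(v) for v in history)
    scores = [ucb_score(v, total, c) for v in history]
    return max(range(len(regions)), key=scores.__getitem__)


def propose(region, rng):
    """Placeholder for the LLM: draw a candidate inside the chosen 1-D subregion."""
    lo, hi = region
    return rng.uniform(lo, hi)
```

`history[i]` holds the objective values observed so far in `regions[i]`; after each evaluation the new value would be appended to the selected region's list.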

Article 899

Title@2025-05-27 (2): PLANETALIGN: A Comprehensive Python Library for Benchmarking Network Alignment

Title: PLANETALIGN: A Comprehensive Python Library for Benchmarking Network Alignment

PLANETALIGN: Eine umfassende Python-Bibliothek für die Ausrichtung von Benchmarking-Netzwerken

PlanETALIGN: 用于基准确定网络协调的综合性俾顿图书馆 2505.21366v1

Authors: Qi Yu, Zhichen Zeng, Yuchen Yan, Zhining Liu, Baoyu Jing, Ruizhong Qiu, Ariful Azad, Hanghang Tong

Network alignment (NA) aims to identify node correspondence across different networks and serves as a critical cornerstone behind various downstream multi-network learning tasks. Despite growing research in NA, a comprehensive library that facilitates the systematic development and benchmarking of NA methods is still lacking. In this work, we introduce PLANETALIGN, a comprehensive Python library for network alignment that features a rich collection of built-in datasets, methods, and evaluation pipelines with easy-to-use APIs. Specifically, PLANETALIGN integrates 18 datasets and 14 NA methods with extensible APIs for easy use and development of NA methods. Our standardized evaluation pipeline encompasses a wide range of metrics, enabling a systematic assessment of the effectiveness, scalability, and robustness of NA methods. Through extensive comparative studies, we reveal practical insights into the strengths and limitations of existing NA methods. We hope that PLANETALIGN can foster a deeper understanding of the NA problem and facilitate the development and benchmarking of more effective, scalable, and robust methods in the future. The source code of PLANETALIGN is available at https://github.com/yq-leo/PlanetAlign.

Article 900

Title@2025-05-27 (2): Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders

Title: Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders

Auf dem Weg zur Verdolmetschbarkeit ohne Opfer: treue Dense-Layer-Zersetzung mit Mischung aus Decodern

实现无牺牲的解释性:忠实的高密度层分解与代谢物混合 2505.21364v1

Authors: James Oldfield, Shawn Im, Yixuan Li, Mihalis A. Nicolaou, Ioannis Patras, Grigorios G Chrysos

Multilayer perceptrons (MLPs) are an integral part of large language models, yet their dense representations render them difficult to understand, edit, and steer. Recent methods learn interpretable approximations via neuron-level sparsity, yet fail to faithfully reconstruct the original mapping, significantly increasing the model’s next-token cross-entropy loss. In this paper, we advocate for moving to layer-level sparsity to overcome the accuracy trade-off in sparse layer approximation. Under this paradigm, we introduce Mixture of Decoders (MxDs). MxDs generalize MLPs and Gated Linear Units, expanding pre-trained dense layers into tens of thousands of specialized sublayers. Through a flexible form of tensor factorization, each sparsely activating MxD sublayer implements a linear transformation with full-rank weights, preserving the original decoders’ expressive capacity even under heavy sparsity. Experimentally, we show that MxDs significantly outperform state-of-the-art methods (e.g., Transcoders) on the sparsity-accuracy frontier in language models with up to 3B parameters. Further evaluations on sparse probing and feature steering demonstrate that MxDs learn similarly specialized features of natural language, opening up a promising new avenue for designing interpretable yet faithful decompositions. Our code is included at: https://github.com/james-oldfield/MxD/.

Article 901

Title@2025-05-27 (2): CRISP-NAM: Competing Risks Interpretable Survival Prediction with Neural Additive Models

Title: CRISP-NAM: Competing Risks Interpretable Survival Prediction with Neural Additive Models

CRISP-NAM: Konkurrenzfähige Risiken interpretierbare Überlebensvorhersage mit neuralen Additivenmodellen

CRIISP-NAM: 与神经添加模型相竞争的风险解释性生存预测 2505.21360v1

Authors: Dhanesh Ramachandram

Competing risks are crucial considerations in survival modelling, particularly in healthcare domains where patients may experience multiple distinct event types. We propose CRISP-NAM (Competing Risks Interpretable Survival Prediction with Neural Additive Models), an interpretable neural additive model for competing risks survival analysis which extends the neural additive architecture to model cause-specific hazards while preserving feature-level interpretability. Each feature contributes independently to risk estimation through dedicated neural networks, allowing for visualization of complex non-linear relationships between covariates and each competing risk. We demonstrate competitive performance on multiple datasets compared to existing approaches.

Article 902

Title@2025-05-27 (2): Learning with Selectively Labeled Data from Multiple Decision-makers

Title: Learning with Selectively Labeled Data from Multiple Decision-makers

Lernen mit selektiv beschrifteten Daten von mehreren Entscheidungsträgern

学习来自多个决策者的选择性标签数据 2306.07566v4

Authors: Jian Chen, Zhehao Li, Xiaojie Mao

We study the problem of classification with selectively labeled data, whose distribution may differ from the full population due to historical decision-making. We exploit the fact that in many applications historical decisions were made by multiple decision-makers, each with different decision rules. We analyze this setup under a principled instrumental variable (IV) framework and rigorously study the identification of classification risk. We establish conditions for the exact identification of classification risk and derive tight partial identification bounds when exact identification fails. We further propose a unified cost-sensitive learning (UCL) approach to learn classifiers robust to selection bias in both identification settings. Finally, we theoretically and numerically validate the efficacy of our proposed method.

Article 903

Title@2025-05-27 (2): Leveraging Large Language Models for Bengali Math Word Problem Solving with Chain of Thought Reasoning

Title: Leveraging Large Language Models for Bengali Math Word Problem Solving with Chain of Thought Reasoning

Nutzung von großen Sprachmodellen für Bengalische Mathematik-Wort-Probleme bei der Lösung der Kette der Gedankenveranlagung

利用大语言模型解决孟加拉语数学字词与思维链理性的解决问题 2505.21354v1

Authors: Bidyarthi Paul, Jalisha Jashim Era, Mirazur Rahman Zim, Tahmid Sattar Aothoi, Faisal Muhammad Shah

Solving Bengali Math Word Problems (MWPs) remains a major challenge in natural language processing (NLP) due to the language’s low-resource status and the multi-step reasoning required. Existing models struggle with complex Bengali MWPs, largely because no human-annotated Bengali dataset has previously addressed this task. This gap has limited progress in Bengali mathematical reasoning. To address this, we created SOMADHAN, a dataset of 8792 complex Bengali MWPs with manually written, step-by-step solutions. We designed this dataset to support reasoning-focused evaluation and model development in a linguistically underrepresented context. Using SOMADHAN, we evaluated a range of large language models (LLMs) - including GPT-4o, GPT-3.5 Turbo, LLaMA series models, Deepseek, and Qwen - through both zero-shot and few-shot prompting with and without Chain of Thought (CoT) reasoning. CoT prompting consistently improved performance over standard prompting, especially in tasks requiring multi-step logic. LLaMA-3.3 70B achieved the highest accuracy of 88% with few-shot CoT prompting. We also applied Low-Rank Adaptation (LoRA) to fine-tune models efficiently, enabling them to adapt to Bengali MWPs with minimal computational cost. Our work fills a critical gap in Bengali NLP by providing a high-quality reasoning dataset and a scalable framework for solving complex MWPs. We aim to advance equitable research in low-resource languages and enhance reasoning capabilities in educational and language technologies.

Article 904

Title@2025-05-27 (2): Diffusion Predictive Control with Constraints

Title: Diffusion Predictive Control with Constraints

Diffusion Predictive Control mit Einschränkungen

受限制的预测控制 2412.09342v2

Authors: Ralf Römer, Alexander von Rohr, Angela P. Schoellig

Diffusion models have become popular for policy learning in robotics due to their ability to capture high-dimensional and multimodal distributions. However, diffusion policies are stochastic and typically trained offline, limiting their ability to handle unseen and dynamic conditions where novel constraints not represented in the training data must be satisfied. To overcome this limitation, we propose diffusion predictive control with constraints (DPCC), an algorithm for diffusion-based control with explicit state and action constraints that can deviate from those in the training data. DPCC incorporates model-based projections into the denoising process of a trained trajectory diffusion model and uses constraint tightening to account for model mismatch. This allows us to generate constraint-satisfying, dynamically feasible, and goal-reaching trajectories for predictive control. We show through simulations of a robot manipulator that DPCC outperforms existing methods in satisfying novel test-time constraints while maintaining performance on the learned control task.
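
As a rough illustration of projecting inside the denoising loop, the sketch below clamps a one-dimensional trajectory onto a tightened box constraint after every denoising step. This is a deliberate simplification: DPCC's projections are model-based, whereas the box clamp here ignores dynamics, and `denoise_step` is a hypothetical stand-in for a trained trajectory diffusion model.

```python
def project_box(trajectory, lower, upper, tightening=0.0):
    """Clamp each state into [lower + tightening, upper - tightening]."""
    lo, hi = lower + tightening, upper - tightening
    return [min(max(x, lo), hi) for x in trajectory]


def constrained_denoise(trajectory, denoise_step, num_steps, lower, upper, tightening=0.0):
    """Alternate a denoising step with a projection onto the tightened constraint set."""
    for k in reversed(range(num_steps)):
        trajectory = denoise_step(trajectory, k)  # stand-in for the diffusion model
        trajectory = project_box(trajectory, lower, upper, tightening)
    return trajectory
```

The `tightening` margin shrinks the feasible set so that a small mismatch between the model and the real system is less likely to cause a violation of the original bounds.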

Article 905

Title@2025-05-27 (2): An Uncertainty-Aware ED-LSTM for Probabilistic Suffix Prediction

Title: An Uncertainty-Aware ED-LSTM for Probabilistic Suffix Prediction

Eine unsichere ED-LSTM für probabilistische Suffix-Vorhersage

用于概率后置物后置物预测的不确定性( ED-LSTM) 的不确定性警告 ED-LSTM 2505.21339v1

Authors: Henryk Mustroph, Michel Kunkler, Stefanie Rinderle-Ma

Suffix prediction of business processes forecasts the remaining sequence of events until process completion. Current approaches focus on predicting a single, most likely suffix. However, if the future course of a process is exposed to uncertainty or has high variability, the expressiveness of a single suffix prediction can be limited. To address this limitation, we propose probabilistic suffix prediction, a novel approach that approximates a probability distribution of suffixes. The proposed approach is based on an Uncertainty-Aware Encoder-Decoder LSTM (U-ED-LSTM) and a Monte Carlo (MC) suffix sampling algorithm. We capture epistemic uncertainties via MC dropout and aleatoric uncertainties as learned loss attenuation. This technical report provides a detailed evaluation of the U-ED-LSTM’s predictive performance and assesses its calibration on four real-life event logs with three different hyperparameter settings. The results show that i) the U-ED-LSTM has reasonable predictive performance across various datasets, ii) aggregating probabilistic suffix predictions into mean values can outperform most likely predictions, particularly for rare prefixes or longer suffixes, and iii) the approach effectively captures uncertainties present in event logs.

Article 906

Title@2025-05-27 (2): Controlling Participation in Federated Learning with Feedback

Title: Controlling Participation in Federated Learning with Feedback

Mit Feedback die Teilnahme am Föderierten Lernen kontrollieren

控制参加有反馈的联邦学习 2411.19242v2

Authors: Michael Cummins, Guner Dilsad Er, Michael Muehlebach

We address the problem of client participation in federated learning, where traditional methods typically rely on a random selection of a small subset of clients for each training round. In contrast, we propose FedBack, a deterministic approach that leverages control-theoretic principles to manage client participation in ADMM-based federated learning. FedBack models client participation as a discrete-time dynamical system and employs an integral feedback controller to adjust each client’s participation rate individually, based on the client’s optimization dynamics. We provide global convergence guarantees for our approach by building on recent federated learning research. Numerical experiments on federated image classification demonstrate that FedBack achieves up to 50% improvement in communication and computational efficiency over algorithms that rely on a random selection of clients.
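
One way to picture integral feedback on participation is a per-client threshold that accumulates the gap between realized and target participation. The sketch below is an assumed toy mechanism, not FedBack's published controller: the names `drift`, `gain`, and `target_rate`, and the rule "participate when the local drift exceeds the threshold", are illustrative.

```python
def update_threshold(threshold, participated, target_rate, gain):
    """Integral action: accumulate the error between realized and target participation."""
    error = (1.0 if participated else 0.0) - target_rate
    return threshold + gain * error


def simulate(drifts, target_rate, gain, threshold=0.0):
    """Deterministic participation decisions for one client over a sequence of rounds."""
    decisions = []
    for drift in drifts:
        participated = drift > threshold  # assumed trigger: local change is large enough
        decisions.append(participated)
        threshold = update_threshold(threshold, participated, target_rate, gain)
    return decisions, threshold
```

With a constant drift, the threshold settles near the drift value and the long-run participation frequency approaches `target_rate`, since the accumulated error must stay bounded.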

Article 907

Title@2025-05-27 (2): PeerGuard: Defending Multi-Agent Systems Against Backdoor Attacks Through Mutual Reasoning

Title: PeerGuard: Defending Multi-Agent Systems Against Backdoor Attacks Through Mutual Reasoning

PeerGuard: Verteidigen von Multi-Agenten-Systemen gegen Hintertürangriffe durch gegenseitige Vernunft

同伴保护:捍卫多机构系统,防止通过相互理由进行后门攻击 2505.11642v2

Authors: Falong Fan, Xi Li

Multi-agent systems leverage advanced AI models as autonomous agents that interact, cooperate, or compete to complete complex tasks across applications such as robotics and traffic management. Despite their growing importance, safety in multi-agent systems remains largely underexplored, with most research focusing on single AI models rather than interacting agents. This work investigates backdoor vulnerabilities in multi-agent systems and proposes a defense mechanism based on agent interactions. By leveraging reasoning abilities, each agent evaluates responses from others to detect illogical reasoning processes, which indicate poisoned agents. Experiments on LLM-based multi-agent systems, including ChatGPT series and Llama 3, demonstrate the effectiveness of the proposed method, achieving high accuracy in identifying poisoned agents while minimizing false positives on clean agents. We believe this work provides insights into multi-agent system safety and contributes to the development of robust, trustworthy AI interactions.

Article 908

Title: Adaptive Sample Sharing for Multi Agent Linear Bandits

Adaptive Probenfreigabe für Multi Agent Linear Bandits

多剂线性强盗的适应性样本共享 2309.08710v3

Authors: Hamza Cherkaoui, Merwan Barlier, Igor Colin

The multi-agent linear bandit setting is a well-known setting for which designing efficient collaboration between agents remains challenging. This paper studies the impact of data sharing among agents on regret minimization. Unlike most existing approaches, our contribution does not rely on any assumptions on the bandit parameters structure. Our main result formalizes the trade-off between the bias and uncertainty of the bandit parameter estimation for efficient collaboration. This result is the cornerstone of the Bandit Adaptive Sample Sharing (BASS) algorithm, whose efficiency over the current state-of-the-art is validated through both theoretical analysis and empirical evaluations on both synthetic and real-world datasets. Furthermore, we demonstrate that, when agents’ parameters display a cluster structure, our algorithm accurately recovers them.

Article 909

Title@2025-05-27 (2): Sign Operator for Coping with Heavy-Tailed Noise in Non-Convex Optimization: High Probability Bounds Under $(L_0, L_1)$-Smoothness

Title: Sign Operator for Coping with Heavy-Tailed Noise in Non-Convex Optimization: High Probability Bounds Under $(L_0, L_1)$-Smoothness

Sign-Operator für den Umgang mit schwerfälligen Geräuschen in Nicht-Konvex-Optimierung: Hohe Wahrscheinlichkeitsgrenzen unter $(L_0, L_1)$-Smoothness

在非Convex优化情况下处理重故障噪音的签名操作员: 高概率弹道低于$(L_0, L_1), 低于$(L_1) 2502.07923v2

Authors: Nikita Kornilov, Philip Zmushko, Andrei Semenov, Mark Ikonnikov, Alexander Gasnikov, Alexander Beznosikov

In recent years, non-convex optimization problems have more often been described by the generalized $(L_0, L_1)$-smoothness assumption rather than the standard one. Meanwhile, severely corrupted data used in these problems has increased the demand for methods capable of handling heavy-tailed noises, i.e., noises with bounded $\kappa$-th moment. Motivated by these real-world trends and challenges, we explore sign-based methods in this setup and demonstrate their effectiveness in comparison with other popular solutions like clipping or normalization. In theory, we prove the first-known high probability convergence bounds under $(L_0, L_1)$-smoothness and heavy-tailed noises with mild parameter dependencies. In the case of standard smoothness, these bounds are novel for sign-based methods as well. In particular, SignSGD with batching achieves sample complexity $\tilde{O}\left(\left(\frac{\Delta L_0d}{\varepsilon^2} + \frac{\Delta L_1d^\frac{3}{2}}{\varepsilon}\right)\left[1 + \left(\frac{\sigma}{\varepsilon}\right)^\frac{\kappa}{\kappa-1}\right]\right), \kappa \in (1,2]$. Under the assumption of symmetric noises, SignSGD with Majority Voting can robustly work on the whole range of $\kappa \in (0,2]$ with complexity $\tilde{O}\left(\left(\frac{\Delta L_0d}{\varepsilon^2} + \frac{\Delta L_1d^\frac{3}{2}}{\varepsilon}\right)\left[\frac{1}{\kappa^2} + \frac{\sigma^2}{\varepsilon^2}\right]\right)$. We also obtain results for parameter-agnostic setups, Polyak-Lojasiewicz functions and momentum-based methods (in expectation). Our theoretical findings are supported by the superior performance of sign-based methods in training Large Language Models compared to clipping and normalization.
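
For readers unfamiliar with the update rule, here is a minimal sketch of a SignSGD step with majority voting. Each worker contributes only the sign of its stochastic gradient per coordinate, and the step follows the sign of the vote total, which is why a single heavy-tailed outlier cannot dominate. Function and parameter names are illustrative, and step-size selection from the paper is not reproduced.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def majority_vote_step(params, worker_grads, lr):
    """One SignSGD step: move each coordinate against the majority gradient sign."""
    new_params = []
    for j, p in enumerate(params):
        votes = sum(sign(g[j]) for g in worker_grads)  # magnitudes are discarded
        new_params.append(p - lr * sign(votes))
    return new_params
```

In the usage below, the second worker's gradient of 100.0 counts for exactly one vote, the same as the 0.5 from the first worker.

```python
params = majority_vote_step([1.0, -1.0], [[0.5, -2.0], [100.0, -0.1], [-3.0, 0.2]], lr=0.1)
```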

Article 910

Title@2025-05-27 (2): Joint Learning in the Gaussian Single Index Model

Title: Joint Learning in the Gaussian Single Index Model

Gemeinsames Lernen im Gaussischen Einzelindexmodell

Gaussian单一指数模式联合学习 2505.21336v1

Authors: Loucas Pillaud-Vivien, Adrien Schertzer

We consider the problem of jointly learning a one-dimensional projection and a univariate function in high-dimensional Gaussian models. Specifically, we study predictors of the form $f(x)=\varphi^\star(\langle w^\star, x \rangle)$, where both the direction $w^\star \in \mathcal{S}_{d-1}$, the sphere of $\mathbb{R}^d$, and the function $\varphi^\star: \mathbb{R} \to \mathbb{R}$ are learned from Gaussian data. This setting captures a fundamental non-convex problem at the intersection of representation learning and nonlinear regression. We analyze the gradient flow dynamics of a natural alternating scheme and prove convergence, with a rate controlled by the information exponent reflecting the \textit{Gaussian regularity} of the function $\varphi^\star$. Strikingly, our analysis shows that convergence still occurs even when the initial direction is negatively correlated with the target. On the practical side, we demonstrate that such joint learning can be effectively implemented using a Reproducing Kernel Hilbert Space (RKHS) adapted to the structure of the problem, enabling efficient and flexible estimation of the univariate function. Our results offer both theoretical insight and practical methodology for learning low-dimensional structure in high-dimensional settings.

Article 911

Title@2025-05-27 (2): DHP: Discrete Hierarchical Planning for Hierarchical Reinforcement Learning Agents

Title: DHP: Discrete Hierarchical Planning for Hierarchical Reinforcement Learning Agents

DHP: Diskrete Hierarchische Planung für Hierarchische Verstärkungs-Learning Agents

DHP: 等级加强学习代理的分级分级规划 2502.01956v2

Authors: Shashank Sharma, Janina Hoffmann, Vinay Namboodiri

Hierarchical Reinforcement Learning (HRL) agents often struggle with long-horizon visual planning due to their reliance on error-prone distance metrics. We propose Discrete Hierarchical Planning (DHP), a method that replaces continuous distance estimates with discrete reachability checks to evaluate subgoal feasibility. DHP recursively constructs tree-structured plans by decomposing long-term goals into sequences of simpler subtasks, using a novel advantage estimation strategy that inherently rewards shorter plans and generalizes beyond training depths. In addition, to address the data efficiency challenge, we introduce an exploration strategy that generates targeted training examples for the planning modules without needing expert data. Experiments in 25-room navigation environments demonstrate $100\%$ success rate (vs $82\%$ baseline) and $73$-step average episode length (vs $158$-step baseline). The method also generalizes to momentum-based control tasks and requires only $\log N$ steps for replanning. Theoretical analysis and ablations validate our design choices.
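
The recursive, tree-structured decomposition with discrete reachability checks can be sketched in a few lines. The callables `reachable` and `propose_subgoal` are hypothetical placeholders for the learned modules; the toy usage assumes integer states on a line, where the tree depth grows like $\log_2 N$ for a goal $N$ steps away.

```python
def plan(start, goal, reachable, propose_subgoal):
    """Split (start, goal) at a subgoal until every consecutive pair is directly reachable."""
    if reachable(start, goal):          # discrete yes/no check, no distance estimate
        return [start, goal]
    mid = propose_subgoal(start, goal)
    left = plan(start, mid, reachable, propose_subgoal)
    right = plan(mid, goal, reachable, propose_subgoal)
    return left + right[1:]             # drop the duplicated subgoal at the join


# Toy example: states are integers, neighbours are reachable, subgoal is the midpoint.
path = plan(0, 8, lambda a, b: abs(a - b) <= 1, lambda a, b: (a + b) // 2)
```

Because each split halves the remaining gap in this toy setting, only the branch containing the agent's current position needs to be re-expanded when replanning.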

Article 912

Title@2025-05-27 (2): Structure from Collision

Title: Structure from Collision

Struktur aus Kollision

来自碰撞的结构 2505.21335v1

Authors: Takuhiro Kaneko

Recent advancements in neural 3D representations, such as neural radiance fields (NeRF) and 3D Gaussian splatting (3DGS), have enabled the accurate estimation of 3D structures from multiview images. However, this capability is limited to estimating the visible external structure, and identifying the invisible internal structure hidden behind the surface is difficult. To overcome this limitation, we address a new task called Structure from Collision (SfC), which aims to estimate the structure (including the invisible internal structure) of an object from appearance changes during collision. To solve this problem, we propose a novel model called SfC-NeRF that optimizes the invisible internal structure of an object through a video sequence under physical, appearance (i.e., visible external structure)-preserving, and keyframe constraints. In particular, to avoid falling into undesirable local optima owing to its ill-posed nature, we propose volume annealing; that is, searching for global optima by repeatedly reducing and expanding the volume. Extensive experiments on 115 objects involving diverse structures (i.e., various cavity shapes, locations, and sizes) and material properties revealed the properties of SfC and demonstrated the effectiveness of the proposed SfC-NeRF.

Article 913

Title@2025-05-27 (2): Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach

Title: Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach

Robustheit und Genauigkeit in der Mischung von Experten optimieren: Ein Dual-Model-Ansatz

优化专家混合中的力量和准确性:双模式办法 2502.06832v3

Authors: Xu Zhang, Kaidi Xu, Ziqing Hu, Ren Wang

Mixture of Experts (MoE) models have shown remarkable success in leveraging specialized expert networks for complex machine learning tasks. However, their susceptibility to adversarial attacks presents a critical challenge for deployment in robust applications. This paper addresses the critical question of how to incorporate robustness into MoEs while maintaining high natural accuracy. We begin by analyzing the vulnerability of MoE components, finding that expert networks are notably more susceptible to adversarial attacks than the router. Based on this insight, we propose a targeted robust training technique that integrates a novel loss function to enhance the adversarial robustness of MoE, requiring only the robustification of one additional expert without compromising training or inference efficiency. Building on this, we introduce a dual-model strategy that linearly combines a standard MoE model with our robustified MoE model using a smoothing parameter. This approach allows for flexible control over the robustness-accuracy trade-off. We further provide theoretical foundations by deriving certified robustness bounds for both the single MoE and the dual-model. To push the boundaries of robustness and accuracy, we propose a novel joint training strategy JTDMoE for the dual-model. This joint training enhances both robustness and accuracy beyond what is achievable with separate models. Experimental results on CIFAR-10 and TinyImageNet datasets using ResNet18 and Vision Transformer (ViT) architectures demonstrate the effectiveness of our proposed methods. The code is publicly available at https://github.com/TIML-Group/Robust-MoE-Dual-Model.
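
The linear combination at the heart of the dual-model strategy is simple enough to sketch directly. Which model the smoothing parameter weights, and whether the combination is applied to probabilities or logits, is not stated in the abstract; the convention below (`alpha` weights the robust model, applied to class probabilities) is an assumption.

```python
def dual_model_output(standard_probs, robust_probs, alpha):
    """Convex combination of two models' class probabilities; alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * s + alpha * r for s, r in zip(standard_probs, robust_probs)]
```

Setting `alpha = 0` recovers the standard model and `alpha = 1` the robustified one, so sweeping `alpha` traces out the robustness-accuracy trade-off.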

Article 914

Title@2025-05-27 (2): Wrapped Gaussian on the manifold of Symmetric Positive Definite Matrices

Title: Wrapped Gaussian on the manifold of Symmetric Positive Definite Matrices

Eingewickelt Gaussian auf der Mannigfaltigkeit der Symmetrischen Positiven Definiten Matrizen

以正负负负负下方矩阵的方块包装高森 2502.01512v3

Authors: Thibault de Surrel, Fabien Lotte, Sylvain Chevallier, Florian Yger

Circular and non-flat data distributions are prevalent across diverse domains of data science, yet their specific geometric structures often remain underutilized in machine learning frameworks. A principled approach to accounting for the underlying geometry of such data is pivotal, particularly when extending statistical models, like the pervasive Gaussian distribution. In this work, we tackle these issues by focusing on the manifold of symmetric positive definite (SPD) matrices, a key focus in information geometry. We introduce a non-isotropic wrapped Gaussian by leveraging the exponential map, derive theoretical properties of this distribution, and propose a maximum likelihood framework for parameter estimation. Furthermore, we reinterpret established classifiers on SPD matrices through a probabilistic lens and introduce new classifiers based on the wrapped Gaussian model. Experiments on synthetic and real-world datasets demonstrate the robustness and flexibility of this geometry-aware distribution, underscoring its potential to advance manifold-based data analysis. This work lays the groundwork for extending classical machine learning and statistical methods to more complex and structured data.

Article 915

Title@2025-05-27 (2): Scheduling with Uncertain Holding Costs and its Application to Content Moderation

Title: Scheduling with Uncertain Holding Costs and its Application to Content Moderation

Planung mit unsicheren Holdingkosten und deren Anwendung auf Content Moderation

与不确定的控股成本及其对内容调节应用的时间安排 2505.21331v1

Authors: Caner Gocmen, Thodoris Lykouris, Deeksha Sinha, Wentao Weng

In content moderation for social media platforms, the cost of delaying the review of a piece of content is proportional to its view trajectory, which fluctuates and is a priori unknown. Motivated by such uncertain holding costs, we consider a queueing model where job states evolve based on a Markov chain with state-dependent instantaneous holding costs. We demonstrate that in the presence of such uncertain holding costs, the two canonical algorithmic principles, instantaneous-cost ($c\mu$-rule) and expected-remaining-cost ($c\mu/\theta$-rule), are suboptimal. By viewing each job as a Markovian ski-rental problem, we develop a new index-based algorithm, Opportunity-adjusted Remaining Cost (OaRC), that adjusts to the opportunity of serving jobs in the future when uncertainty partly resolves. We show that the regret of OaRC scales as $\tilde{O}(L^{1.5}\sqrt{N})$, where $L$ is the maximum length of a job’s holding cost trajectory and $N$ is the system size. This regret bound shows that OaRC achieves asymptotic optimality when the system size $N$ scales to infinity. Moreover, its regret is independent of the state-space size, which is a desirable property when job states contain contextual information. We corroborate our results with an extensive simulation study based on two holding cost patterns (online ads and user-generated content) that arise in content moderation for social media platforms. Our simulations based on synthetic and real datasets demonstrate that OaRC consistently outperforms existing practice, which is based on the two canonical algorithmic principles.

Article 916

Title@2025-05-27 (2): UGCE: User-Guided Incremental Counterfactual Exploration

Title: UGCE: User-Guided Incremental Counterfactual Exploration

UGCE: User-Guided Incremental Counterfactual Exploration

UGCE: 用户指导的递增反事实探索 2505.21330v1

Authors: Christos Fragkathoulas, Evaggelia Pitoura

Counterfactual explanations (CFEs) are a popular approach for interpreting machine learning predictions by identifying minimal feature changes that alter model outputs. However, in real-world settings, users often refine feasibility constraints over time, requiring counterfactual generation to adapt dynamically. Existing methods fail to support such iterative updates, instead recomputing explanations from scratch with each change, an inefficient and rigid approach. We propose User-Guided Incremental Counterfactual Exploration (UGCE), a genetic algorithm-based framework that incrementally updates counterfactuals in response to evolving user constraints. Experimental results across five benchmark datasets demonstrate that UGCE significantly improves computational efficiency while maintaining high-quality solutions compared to a static, non-incremental approach. Our evaluation further shows that UGCE supports stable performance under varying constraint sequences, benefits from an efficient warm-start strategy, and reveals how different constraint types may affect search behavior.

Article 917

Title@2025-05-27 (2): Bencher: Simple and Reproducible Benchmarking for Black-Box Optimization

Title: Bencher: Simple and Reproducible Benchmarking for Black-Box Optimization

Bencher: Einfaches und reproduzierbares Benchmarking für Black-Box-Optimierung

座谈人: 简化和可复制的黑箱优化基准 2505.21321v1

Authors: Leonard Papenmeier, Luigi Nardi

We present Bencher, a modular benchmarking framework for black-box optimization that fundamentally decouples benchmark execution from optimization logic. Unlike prior suites that focus on combining many benchmarks in a single project, Bencher introduces a clean abstraction boundary: each benchmark is isolated in its own virtual Python environment and accessed via a unified, version-agnostic remote procedure call (RPC) interface. This design eliminates dependency conflicts and simplifies the integration of diverse, real-world benchmarks, which often have complex and conflicting software requirements. Bencher can be deployed locally or remotely via Docker or on high-performance computing (HPC) clusters via Singularity, providing a containerized, reproducible runtime for any benchmark. Its lightweight client requires minimal setup and supports drop-in evaluation of 80 benchmarks across continuous, categorical, and binary domains.

Article 918

Title: A Cross Modal Knowledge Distillation & Data Augmentation Recipe for Improving Transcriptomics Representations through Morphological Features

Ein Cross Modal Knowledge Destillation & Data Augmentation Rezept zur Verbesserung von Transkriptionsdarstellungen durch morphologische Merkmale

一种交叉模式知识蒸馏和数据增强休息室,以通过生理特征改进转基因医学的表现形式 2505.21317v1

Authors: Ihab Bendidi, Yassir El Mesbahi, Alisandra K. Denton, Karush Suri, Kian Kenyon-Dean, Auguste Genovesio, Emmanuel Noutahi

Understanding cellular responses to stimuli is crucial for biological discovery and drug development. Transcriptomics provides interpretable, gene-level insights, while microscopy imaging offers rich predictive features but is harder to interpret. Weakly paired datasets, where samples share biological states, enable multimodal learning but are scarce, limiting their utility for training and multimodal inference. We propose a framework to enhance transcriptomics by distilling knowledge from microscopy images. Using weakly paired data, our method aligns and binds modalities, enriching gene expression representations with morphological information. To address data scarcity, we introduce (1) Semi-Clipped, an adaptation of CLIP for cross-modal distillation using pretrained foundation models, achieving state-of-the-art results, and (2) PEA (Perturbation Embedding Augmentation), a novel augmentation technique that enhances transcriptomics data while preserving inherent biological information. These strategies improve the predictive power and retain the interpretability of transcriptomics, enabling rich unimodal representations for complex biological tasks.

Article 919

Title@2025-05-27 (2): It’s complicated. The relationship of algorithmic fairness and non-discrimination regulations for high-risk systems in the EU AI Act

Title: It’s complicated. The relationship of algorithmic fairness and non-discrimination regulations for high-risk systems in the EU AI Act

Es ist kompliziert. Das Verhältnis algorithmischer Fairness- und Nichtdiskriminierungsvorschriften für Hochrisikosysteme im EU-AI-Gesetz

这很复杂,在欧盟的AI法案中, 高风险系统的算法公正和不歧视规定之间的关系。 2501.12962v3

Authors: Kristof Meding

What constitutes a fair decision? This question is not only difficult for humans but becomes more challenging when Artificial Intelligence (AI) models are used. In light of discriminatory algorithmic behaviors, the EU has recently passed the AI Act, which mandates specific rules for high-risk systems, incorporating both traditional legal non-discrimination regulations and machine learning based algorithmic fairness concepts. This paper aims to bridge these two different concepts in the AI Act through: First, a necessary high-level introduction of both concepts targeting legal and computer science-oriented scholars, and second, an in-depth analysis of the AI Act’s relationship between legal non-discrimination regulations and algorithmic fairness. Our analysis reveals three key findings: (1.) Most non-discrimination regulations target only high-risk AI systems. (2.) The regulation of high-risk systems encompasses both data input requirements and output monitoring, though these regulations are partly inconsistent and raise questions of computational feasibility. (3.) Finally, we consider the possible (future) interaction of classical EU non-discrimination law and the AI Act regulations. We recommend developing more specific auditing and testing methodologies for AI systems. This paper aims to serve as a foundation for future interdisciplinary collaboration between legal scholars and computer science-oriented machine learning researchers studying discrimination in AI systems.

Article 920

Title@2025-05-27 (2): Item Cluster-aware Prompt Learning for Session-based Recommendation

Title: Item Cluster-aware Prompt Learning for Session-based Recommendation

Artikel Cluster-aware Prompt Learning für sitzungsbasierte Empfehlung

项目集群意识快速学习促进基于会议的建议 2410.04756v2

Authors: Wooseong Yang, Chen Wang, Zihe Song, Weizhi Zhang, Philip S. Yu

Session-based recommendation (SBR) aims to capture dynamic user preferences by analyzing item sequences within individual sessions. However, most existing approaches focus mainly on intra-session item relationships, neglecting the connections between items across different sessions (inter-session relationships), which limits their ability to fully capture complex item interactions. While some methods incorporate inter-session information, they often suffer from high computational costs, leading to longer training times and reduced efficiency. To address these challenges, we propose the CLIP-SBR (Cluster-aware Item Prompt learning for Session-Based Recommendation) framework. CLIP-SBR is composed of two modules: 1) an item relationship mining module that builds a global graph to effectively model both intra- and inter-session relationships, and 2) an item cluster-aware prompt learning module that uses soft prompts to integrate these relationships into SBR models efficiently. We evaluate CLIP-SBR across eight SBR models and three benchmark datasets, consistently demonstrating improved recommendation performance and establishing CLIP-SBR as a robust solution for session-based recommendation tasks.

Article 921

Title@2025-05-27 (2): Overcoming Spurious Solutions in Semi-Dual Neural Optimal Transport: A Smoothing Approach for Learning the Optimal Transport Plan

Title: Overcoming Spurious Solutions in Semi-Dual Neural Optimal Transport: A Smoothing Approach for Learning the Optimal Transport Plan

Überwinden von sauberen Lösungen im halbdualen Neural Optimalen Verkehr: Ein glättender Ansatz für das Lernen des optimalen Verkehrsplans

克服半双轨神经优化运输中的纯净解决方案:学习最佳运输计划的平滑方法 2502.04583v2

Authors: Jaemoo Choi, Jaewoong Choi, Dohyun Kwon

We address the convergence problem in learning the Optimal Transport (OT) map, where the OT Map refers to a map from one distribution to another while minimizing the transport cost. Semi-dual Neural OT, a widely used approach for learning OT Maps with neural networks, often generates spurious solutions that fail to transfer one distribution to another accurately. We identify a sufficient condition under which the max-min solution of Semi-dual Neural OT recovers the true OT Map. Moreover, to address cases when this sufficient condition is not satisfied, we propose a novel method, OTP, which learns both the OT Map and the Optimal Transport Plan, representing the optimal coupling between two distributions. Under sharp assumptions on the distributions, we prove that our model eliminates the spurious solution issue and correctly solves the OT problem. Our experiments show that the OTP model recovers the optimal transport map where existing methods fail and outperforms current OT-based models in image-to-image translation tasks. Notably, the OTP model can learn stochastic transport maps when deterministic OT Maps do not exist, such as one-to-many tasks like colorization.
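
For orientation, the semi-dual max-min objective that this line of work builds on is usually written as below. This is the standard form from the neural OT literature, not a formula quoted from the abstract; the symbols $\mu$ (source), $\nu$ (target), $c$ (cost), $V$ (potential) and $T$ (transport map) are notation assumed here.

```latex
\mathcal{L}(V, T) = \int_{\mathcal{X}} \big[ c(x, T(x)) - V(T(x)) \big] \, d\mu(x)
                  + \int_{\mathcal{Y}} V(y) \, d\nu(y),
\qquad
\sup_{V} \, \inf_{T} \, \mathcal{L}(V, T).
```

The spurious-solution issue described above is that a pair $(V, T)$ solving this max-min problem need not have $T$ push $\mu$ onto $\nu$, so extra conditions or a modified objective are needed.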

Article 922

Title@2025-05-27 (2): Interlocking-free Selective Rationalization Through Genetic-based Learning

Title: Interlocking-free Selective Rationalization Through Genetic-based Learning

Interlocking-free Selektive Rationalisierung durch gentechnisch-basiertes Lernen

通过基于遗传的学习实现互连、无互闭和无互换的选择性合理化 2412.10312v2

Authors: Federico Ruggeri, Gaetano Signorelli

A popular end-to-end architecture for selective rationalization is the select-then-predict pipeline, comprising a generator that extracts highlights and feeds them to a predictor. Such a cooperative system suffers from suboptimal equilibrium minima due to the dominance of one of the two modules, a phenomenon known as interlocking. While several contributions have aimed to address interlocking, they only mitigate its effect, often by introducing feature-based heuristics, sampling, and ad-hoc regularizations. We present GenSPP, the first interlocking-free architecture for selective rationalization that does not require the learning overhead introduced by the above-mentioned approaches. GenSPP avoids interlocking by performing disjoint training of the generator and predictor via genetic global search. Experiments on a synthetic and a real-world benchmark show that our model outperforms several state-of-the-art competitors.

Article 923

Title@2025-05-27 (2): Optimizing fMRI Data Acquisition for Decoding Natural Speech with Limited Participants

Title: Optimizing fMRI Data Acquisition for Decoding Natural Speech with Limited Participants

Optimierung der fMRI-Datenerfassung für die Dekodierung von Natural Speech mit begrenzten Teilnehmern

优化FMRI数据获取,以便与有限参加者进行自然演讲 2505.21304v1

Authors: Louis Jalouzot, Alexis Thual, Yair Lakretz, Christophe Pallier, Bertrand Thirion

We investigate optimal strategies for decoding perceived natural speech from fMRI data acquired from a limited number of participants. Leveraging Lebel et al. (2023)’s dataset of 8 participants, we first demonstrate the effectiveness of training deep neural networks to predict LLM-derived text representations from fMRI activity. Then, in this data regime, we observe that multi-subject training does not improve decoding accuracy compared to a single-subject approach. Furthermore, training on similar or different stimuli across subjects has a negligible effect on decoding accuracy. Finally, we find that our decoders better model syntactic than semantic features, and that stories containing sentences with complex syntax or rich semantic content are more challenging to decode. While our results demonstrate the benefits of having extensive data per participant (deep phenotyping), they suggest that leveraging multi-subject data for natural speech decoding likely requires deeper phenotyping or a substantially larger cohort.

Article 924

Title@2025-05-27 (2): Large Language Models Miss the Multi-Agent Mark

Title: Large Language Models Miss the Multi-Agent Mark

Große Sprachmodelle vermissen das Multi-Agent Mark

大语言模型 2505.21298v1

Authors: Emanuele La Malfa, Gabriele La Malfa, Samuele Marro, Jie M. Zhang, Elizabeth Black, Micheal Luck, Philip Torr, Michael Wooldridge

Recent interest in Multi-Agent Systems of Large Language Models (MAS LLMs) has led to an increase in frameworks leveraging multiple LLMs to tackle complex tasks. However, much of this literature appropriates the terminology of MAS without engaging with its foundational principles. In this position paper, we highlight critical discrepancies between MAS theory and current MAS LLMs implementations, focusing on four key areas: the social aspect of agency, environment design, coordination and communication protocols, and measuring emergent behaviours. Our position is that many MAS LLMs lack multi-agent characteristics such as autonomy, social interaction, and structured environments, and often rely on oversimplified, LLM-centric architectures. The field may slow down and lose traction by revisiting problems the MAS literature has already addressed. Therefore, we systematically analyse this issue and outline associated research opportunities; we advocate for better integrating established MAS concepts and more precise terminology to avoid mischaracterisation and missed opportunities.

Article 925

Title@2025-05-27 (2): Towards Adapting Open-Source Large Language Models for Expert-Level Clinical Note Generation

Title: Towards Adapting Open-Source Large Language Models for Expert-Level Clinical Note Generation

Auf dem Weg zur Anpassung von Open Source großen Sprachmodellen für die Erstellung klinischer Notizen auf Expertenebene

努力调整用于专家级临床笔记制作的开放源大语言模型 2405.00715v6

Authors: Hanyin Wang, Chufan Gao, Bolun Liu, Qiping Xu, Guleid Hussein, Mohamad El Labban, Kingsley Iheasirim, Hariprasad Korsapati, Chuck Outcalt, Jimeng Sun

Proprietary Large Language Models (LLMs) such as GPT-4 and Gemini have demonstrated promising capabilities in clinical text summarization tasks. However, due to patient data privacy concerns and computational costs, many healthcare providers prefer using small, locally-hosted models over external generic LLMs. This study presents a comprehensive domain- and task-specific adaptation process for the open-source LLaMA-2 13 billion parameter model, enabling it to generate high-quality clinical notes from outpatient patient-doctor dialogues. Our process incorporates continued pretraining, supervised fine-tuning, and reinforcement learning from both AI and human feedback. We introduced a new approach, DistillDirect, for performing on-policy reinforcement learning with Gemini 1.0 Pro as the teacher model. Our resulting model, LLaMA-Clinic, can generate clinical notes comparable in quality to those authored by physicians. In a blinded physician reader study, the majority (92.8%) of individual evaluations rated the notes generated by LLaMA-Clinic as “acceptable” or higher across three criteria: real-world readiness, completeness, and accuracy. In the more challenging “Assessment and Plan” section, LLaMA-Clinic matched physician-authored notes in real-world readiness score. We highlight key considerations for future clinical note-generation tasks, emphasizing the importance of pre-defining a “best practice” note format, rather than relying on LLMs to determine this for clinical practice.

Article 926

Title@2025-05-27 (2): LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning

Title: LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning

LoFT: Low-Rank-Anpassung, die sich wie Full-Fine-Tuning verhält

LOFT: 行为如完全精美调整的低朗适应 2505.21289v1

Authors: Nurbek Tastan, Stefanos Laskaridis, Martin Takac, Karthik Nandakumar, Samuel Horvath

Large pre-trained models are commonly adapted to downstream tasks using parameter-efficient fine-tuning methods such as Low-Rank Adaptation (LoRA), which injects small trainable low-rank matrices instead of updating all weights. While LoRA dramatically reduces trainable parameters with little overhead, it can still underperform full fine-tuning in accuracy and often converges more slowly. We introduce LoFT, a novel low-rank adaptation method that behaves like full fine-tuning by aligning the optimizer’s internal dynamics with those of updating all model weights. LoFT not only learns weight updates in a low-rank subspace (like LoRA) but also properly projects the optimizer’s first and second moments (Adam’s momentum and variance) into the same subspace, mirroring full-model updates. By aligning the low-rank update itself with the full update, LoFT eliminates the need for tuning extra hyperparameters, e.g., LoRA scaling factor $\alpha$. Empirically, this approach substantially narrows the performance gap between adapter-based tuning and full fine-tuning and consistently outperforms standard LoRA-style methods, all without increasing inference cost.
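
As background for the comparison above, here is a minimal sketch of the standard LoRA forward pass that LoFT is contrasted with. It does not implement LoFT itself (the moment-projection step is not specified in the abstract); the helper names, shapes, and the zero initialisation of `B` are illustrative assumptions.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    """Standard LoRA forward pass: y = W x + (alpha / r) * B (A x).

    W is the frozen d_out x d_in weight; A (r x d_in) and B (d_out x r)
    are the small trainable matrices.
    """
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    scale = alpha / r  # the scaling factor that LoFT claims to make unnecessary
    return [b + scale * l for b, l in zip(base, low_rank)]

def trainable_params(d_in, d_out, r):
    """Number of trainable entries in A and B."""
    return r * (d_in + d_out)

random.seed(0)
d_in, d_out, r, alpha = 8, 6, 2, 4.0
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # assumed zero init: adapter starts as a no-op
x = [random.gauss(0, 1) for _ in range(d_in)]
y0 = lora_forward(W, A, B, x, alpha, r)
```

With `B` at zero the adapted layer reproduces the frozen layer exactly, and only `r * (d_in + d_out)` entries are trained instead of `d_in * d_out`.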

Article 927

Title@2025-05-27 (2): GSAT: Graph Structure Attention Networks

Title: GSAT: Graph Structure Attention Networks

GSAT: Grafische Struktur

GSAT: 图表结构关注网络 2505.21288v1

Authors: Farshad Noravesh, Reza Haffari, Layki Soon, Arghya Pal

Graph Neural Networks (GNNs) have emerged as a powerful tool for processing data represented in graph structures, achieving remarkable success across a wide range of applications. However, to further improve performance on graph classification benchmarks, the structural representation of each node, which encodes rich local topological information about its neighbourhood, is an important type of feature that is often overlooked in modeling. Neglecting this structural information has led to models that need a high number of layers to pass messages between distant nodes, which in turn causes other problems such as oversmoothing. In the present paper, we leverage this structural information, modeled by anonymous random walks (ARWs), and introduce the graph structure attention network (GSAT), a generalization of the graph attention network (GAT) that integrates the original attributes with the structural representation so that the model automatically learns patterns for attending to different edges in the node neighbourhood, enriching the graph representation. Our experiments show that GSAT slightly improves on the state of the art (SOTA) on some graph classification benchmarks.
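
To make the anonymous random walk idea concrete, the sketch below relabels each node in a walk by the order in which it first appears, then estimates a node's distribution over the resulting patterns. The 0-based labelling, the toy graph, and the function names are assumptions for illustration; how GSAT feeds these features into attention is not shown.

```python
import random
from collections import Counter

def anonymize(walk):
    """Replace each node by the index of its first occurrence in the walk."""
    first_seen = {}
    pattern = []
    for node in walk:
        if node not in first_seen:
            first_seen[node] = len(first_seen)
        pattern.append(first_seen[node])
    return tuple(pattern)

def anonymous_walk_distribution(adj, start, length, n_walks, seed=0):
    """Empirical distribution of anonymous walk patterns starting at `start`."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_walks):
        walk = [start]
        for _ in range(length):
            walk.append(rng.choice(adj[walk[-1]]))  # uniform step to a neighbour
        counts[anonymize(walk)] += 1
    return {pattern: c / n_walks for pattern, c in counts.items()}

# Hypothetical toy graph: a triangle a-b-c with a tail node d attached to c.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
dist = anonymous_walk_distribution(adj, "a", length=3, n_walks=500)
```

Two walks over different node identities map to the same pattern when they revisit nodes in the same order, which is what makes the resulting distribution a purely structural feature.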

Article 928

Title@2025-05-27 (2): Learnable Kernel Density Estimation for Graphs

Title: Learnable Kernel Density Estimation for Graphs

Erlernbare Kerneldichteschätzung für Graphen

可学习的内核密度 2505.21285v1

Authors: Xudong Wang, Ziheng Sun, Chris Ding, Jicong Fan

This work proposes a framework LGKDE that learns kernel density estimation for graphs. The key challenge in graph density estimation lies in effectively capturing both structural patterns and semantic variations while maintaining theoretical guarantees. Combining graph kernels and kernel density estimation (KDE) is a standard approach to graph density estimation, but has unsatisfactory performance due to the handcrafted and fixed features of kernels. Our method LGKDE leverages graph neural networks to represent each graph as a discrete distribution and utilizes maximum mean discrepancy to learn the graph metric for multi-scale KDE, where all parameters are learned by maximizing the density of graphs relative to the density of their well-designed perturbed counterparts. The perturbations are conducted on both node features and graph spectra, which helps better characterize the boundary of normal density regions. Theoretically, we establish consistency and convergence guarantees for LGKDE, including bounds on the mean integrated squared error, robustness, and complexity. We validate LGKDE by demonstrating its effectiveness in recovering the underlying density of synthetic graph distributions and applying it to graph anomaly detection across diverse benchmark datasets. Extensive empirical evaluation shows that LGKDE demonstrates superior performance compared to state-of-the-art baselines on most benchmark datasets.
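
The two ingredients named above, maximum mean discrepancy between graphs viewed as sets of node embeddings and a kernel density estimate on top of that distance, can be sketched as follows. This is a simplified illustration, not LGKDE: the embeddings are hard-coded rather than produced by a graph neural network, nothing is learned, a single bandwidth is used instead of multi-scale KDE, and the density is left unnormalised.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two embedding vectors."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * bandwidth ** 2))

def mmd2(X, Y, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sets of vectors."""
    def mean_k(P, Q):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in P for q in Q)
        return total / (len(P) * len(Q))
    return mean_k(X, X) + mean_k(Y, Y) - 2.0 * mean_k(X, Y)

def graph_kde(query, references, h=0.5):
    """Unnormalised KDE score using the MMD as the distance between graphs."""
    scores = [math.exp(-max(mmd2(query, g), 0.0) / (2.0 * h ** 2)) for g in references]
    return sum(scores) / len(scores)

# Hypothetical node embeddings: each graph is a list of 2-d node vectors.
normal_graphs = [
    [[0.0, 0.1], [0.2, 0.0]],
    [[0.1, 0.0], [0.0, 0.2]],
    [[0.1, 0.1], [0.2, 0.1]],
]
typical = [[0.1, 0.1], [0.1, 0.0]]
outlier = [[3.0, 3.0], [2.5, 3.5]]
```

A graph whose embeddings resemble the reference set receives a higher score than one far from it, which is the signal an anomaly detector would threshold.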

Article 929

Title@2025-05-27 (2): Optimal Pricing for Data-Augmented AutoML Marketplaces

Title: Optimal Pricing for Data-Augmented AutoML Marketplaces

Optimale Preise für datengesteigerte AutoML-Märkte

数据增强自动自动ML 市场最佳定价 2310.17843v2

Authors: Minbiao Han, Jonathan Light, Steven Xia, Sainyam Galhotra, Raul Castro Fernandez, Haifeng Xu

Organizations often lack sufficient data to effectively train machine learning (ML) models, while others possess valuable data that remains underutilized. Data markets promise to unlock substantial value by matching data suppliers with demand from ML consumers. However, market design involves addressing intricate challenges, including data pricing, fairness, robustness, and strategic behavior. In this paper, we propose a pragmatic data-augmented AutoML market that seamlessly integrates with existing cloud-based AutoML platforms such as Google’s Vertex AI and Amazon’s SageMaker. Unlike standard AutoML solutions, our design automatically augments buyer-submitted training data with valuable external datasets, pricing the resulting models based on their measurable performance improvements rather than on computational costs, as is the status quo. Our key innovation is a pricing mechanism grounded in the instrumental value (the marginal model quality improvement) of externally sourced data. This approach bypasses direct dataset pricing complexities, mitigates strategic buyer behavior, and accommodates diverse buyer valuations through menu-based options. By integrating automated data and model discovery, our solution not only enhances ML outcomes but also establishes an economically sustainable framework for monetizing external data.

Article 930

Title@2025-05-27 (2): Accelerated Parallel Tempering via Neural Transports

Title: Accelerated Parallel Tempering via Neural Transports

Beschleunigung des parallelen Temperierens über neurale Transporte

通过神经运输加速平行探险 2502.10328v2

Authors: Leo Zhang, Peter Potaptchik, Jiajun He, Yuanqi Du, Arnaud Doucet, Francisco Vargas, Hai-Dang Dau, Saifuddin Syed

Markov Chain Monte Carlo (MCMC) algorithms are essential tools in computational statistics for sampling from unnormalised probability distributions, but can be fragile when targeting high-dimensional, multimodal, or complex target distributions. Parallel Tempering (PT) enhances MCMC’s sample efficiency through annealing and parallel computation, propagating samples from tractable reference distributions to intractable targets via state swapping across interpolating distributions. The effectiveness of PT is limited by the often minimal overlap between adjacent distributions in challenging problems, which requires increasing the computational resources to compensate. We introduce a framework that accelerates PT by leveraging neural samplers (including normalising flows, diffusion models, and controlled diffusions) to reduce the required overlap. Our approach utilises neural samplers in parallel, circumventing the computational burden of neural samplers while preserving the asymptotic consistency of classical PT. We demonstrate theoretically and empirically on a variety of multimodal sampling problems that our method improves sample quality, reduces the computational cost compared to classical PT, and enables efficient free energies/normalising constants estimation.
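
The state-swapping step of classical PT can be sketched as below, assuming a tempered family with densities proportional to exp(-beta * U(x)). This is the textbook swap rule, not the neural-transport variant proposed in the paper, and the function name is illustrative.

```python
import math

def swap_accept_prob(beta_i, beta_j, energy_i, energy_j):
    """Metropolis probability of swapping the states of two tempered chains.

    Chain i targets exp(-beta_i * U) and currently holds a state with
    energy U(x_i) = energy_i; likewise for chain j.
    """
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    if log_ratio >= 0.0:
        return 1.0  # also avoids overflow in exp for large positive values
    return math.exp(log_ratio)

# The colder chain (larger beta) holds the higher-energy state: always swap.
p_good = swap_accept_prob(1.0, 0.5, 2.0, 1.0)
# The colder chain already holds the lower-energy state: swap only sometimes.
p_bad = swap_accept_prob(1.0, 0.5, 1.0, 2.0)
```

When adjacent distributions overlap poorly, the energy gaps between their typical states are large and this probability collapses towards zero, which is the bottleneck the abstract refers to.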

Article 931

Title@2025-05-27 (2): Dual-Directed Algorithm Design for Efficient Pure Exploration

Title: Dual-Directed Algorithm Design for Efficient Pure Exploration

Dual-Directed-Algorithm-Design für effizientes Pure-Exploring

高效纯勘探的双重稀释算法设计 2310.19319v3

Authors: Chao Qin, Wei You

While experimental design often focuses on selecting the single best alternative from a finite set (e.g., in ranking and selection or best-arm identification), many pure-exploration problems pursue richer goals. Given a specific goal, adaptive experimentation aims to achieve it by strategically allocating sampling effort, with the underlying sample complexity characterized by a maximin optimization problem. By introducing dual variables, we derive necessary and sufficient conditions for an optimal allocation, yielding a unified algorithm design principle that extends the top-two approach beyond best-arm identification. This principle gives rise to Information-Directed Selection, a hyperparameter-free rule that dynamically evaluates and chooses among candidates based on their current informational value. We prove that, when combined with Information-Directed Selection, top-two Thompson sampling attains asymptotic optimality for Gaussian best-arm identification, resolving a notable open question in the pure-exploration literature. Furthermore, our framework produces asymptotically optimal algorithms for pure-exploration thresholding bandits and $\varepsilon$-best-arm identification (i.e., ranking and selection with probability-of-good-selection guarantees), and more generally establishes a recipe for adapting Thompson sampling across a broad class of pure-exploration problems. Extensive numerical experiments highlight the efficiency of our proposed algorithms compared to existing methods.
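
For readers unfamiliar with the top-two approach, here is a minimal sketch of top-two Thompson sampling for unit-variance Gaussian arms. It uses a fixed coin bias `beta = 0.5` to choose between leader and challenger, which is a stand-in: the paper's Information-Directed Selection rule replaces that fixed choice and is not reproduced here. The flat prior, the resampling cap, and all names are assumptions.

```python
import random

def top_two_thompson(true_means, horizon, beta=0.5, seed=0, max_resample=100):
    """Run top-two Thompson sampling and return (best empirical arm, pull counts)."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k

    def pull(arm):
        counts[arm] += 1
        sums[arm] += rng.gauss(true_means[arm], 1.0)

    def posterior_sample():
        # Flat prior with unit-variance rewards gives a N(mean, 1/n) posterior.
        return [rng.gauss(sums[i] / counts[i], counts[i] ** -0.5) for i in range(k)]

    def argmax(values):
        return max(range(k), key=lambda i: values[i])

    for arm in range(k):  # pull every arm once so posteriors are defined
        pull(arm)
    for _ in range(horizon - k):
        draw = posterior_sample()
        leader = argmax(draw)
        challenger = None
        for _ in range(max_resample):  # resample until another arm wins
            candidate = argmax(posterior_sample())
            if candidate != leader:
                challenger = candidate
                break
        if challenger is None:  # fallback: runner-up in the leader's draw
            challenger = max((i for i in range(k) if i != leader), key=lambda i: draw[i])
        pull(leader if rng.random() < beta else challenger)
    means = [sums[i] / counts[i] for i in range(k)]
    return argmax(means), counts

best_arm, pulls = top_two_thompson([0.0, 0.5, 1.0], horizon=1000)
```

The challenger step is what forces continued sampling of the closest competitors, rather than concentrating all effort on the current leader as plain Thompson sampling would.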

Article 932

Title: Taylor expansion-based Kolmogorov-Arnold network for blind image quality assessment

Taylor-expansionsbasiertes Kolmogorov-Arnold-Netzwerk für blinde Bildqualitätsbewertung

以泰勒为扩展基地的Kolmogorov-Arnold盲人图像质量评估网络 2505.21592v1

Authors: Ze Chen, Shaode Yu

Kolmogorov-Arnold Network (KAN) has attracted growing interest for its strong function approximation capability. In our previous work, KAN and its variants were explored in score regression for blind image quality assessment (BIQA). However, these models encounter challenges when processing high-dimensional features, leading to limited performance gains and increased computational cost. To address these issues, we propose TaylorKAN, which leverages Taylor expansions as learnable activation functions to enhance local approximation capability. To improve computational efficiency, network depth reduction and feature dimensionality compression are integrated into the TaylorKAN-based score regression pipeline. On five databases (BID, CLIVE, KonIQ, SPAQ, and FLIVE) with authentic distortions, extensive experiments demonstrate that TaylorKAN consistently outperforms the other KAN-related models, indicating that local approximation via Taylor expansions is more effective than global approximation using orthogonal functions. Its generalization capacity is validated through inter-database experiments. The findings highlight the potential of TaylorKAN as an efficient and robust model for high-dimensional score regression.
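
One plausible reading of "Taylor expansions as learnable activation functions" is a truncated power series around a centre, with the coefficients as trainable parameters. The sketch below evaluates such an activation; the exact parameterisation used by TaylorKAN is not given in the abstract, so the function name, the `center` argument, and the fixed coefficients are assumptions.

```python
import math

def taylor_activation(x, coeffs, center=0.0):
    """Evaluate sum_k coeffs[k] * (x - center)**k using Horner's rule."""
    t = x - center
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result

# With coefficients 1/k! the activation approximates exp(x) near the centre;
# in a trained network these values would be learned rather than fixed.
exp_coeffs = [1.0, 1.0, 0.5, 1.0 / 6.0]
approx = taylor_activation(0.1, exp_coeffs)
```

The approximation is accurate close to the centre and degrades away from it, which is the "local" behaviour the abstract contrasts with global orthogonal bases.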

Article 933

Title@2025-05-27 (2): Minimizing False-Positive Attributions in Explanations of Non-Linear Models

Title: Minimizing False-Positive Attributions in Explanations of Non-Linear Models

Minimierung falsch-positiver Attribute in Erklärungen nicht-linearer Modelle

尽量减少解释非碱模型中的虚假动机归属 2505.11210v2

Authors: Anders Gjølbye, Stefan Haufe, Lars Kai Hansen

Suppressor variables can influence model predictions without being dependent on the target outcome and they pose a significant challenge for Explainable AI (XAI) methods. These variables may cause false-positive feature attributions, undermining the utility of explanations. Although effective remedies exist for linear models, their extension to non-linear models and to instance-based explanations has remained limited. We introduce PatternLocal, a novel XAI technique that addresses this gap. PatternLocal begins with a locally linear surrogate, e.g. LIME, KernelSHAP, or gradient-based methods, and transforms the resulting discriminative model weights into a generative representation, thereby suppressing the influence of suppressor variables while preserving local fidelity. In extensive hyperparameter optimization on the XAI-TRIS benchmark, PatternLocal consistently outperformed other XAI methods and reduced false-positive attributions when explaining non-linear tasks, thereby enabling more reliable and actionable insights.
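
The discriminative-to-generative step can be illustrated with the well-known linear activation-pattern transformation, in which each feature's pattern value is its covariance with the model output divided by the output variance. This is a sketch of that linear building block on synthetic data, not of PatternLocal itself; the data-generating setup and names are assumptions.

```python
import random

def covariance(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def activation_pattern(X, w):
    """Pattern a_j = Cov(x_j, s_hat) / Var(s_hat), with s_hat = w^T x.

    Equivalent to Cov(X) w / Var(s_hat) for a single linear output.
    """
    s_hat = [sum(wj * xj for wj, xj in zip(w, row)) for row in X]
    var_s = covariance(s_hat, s_hat)
    return [covariance([row[j] for row in X], s_hat) / var_s for j in range(len(w))]

# Synthetic suppressor setup: feature 0 = signal + distractor, feature 1 = distractor.
rng = random.Random(0)
signal = [rng.gauss(0, 1) for _ in range(5000)]
distractor = [rng.gauss(0, 1) for _ in range(5000)]
X = [[s + d, d] for s, d in zip(signal, distractor)]
w = [1.0, -1.0]  # weights that recover the signal by subtracting the distractor
pattern = activation_pattern(X, w)
```

The weight on the second feature is large even though that feature carries no information about the signal, while its pattern value is close to zero, so attributing importance by pattern avoids the false positive.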

Article 934

Title@2025-05-27 (2): ResKoopNet: Learning Koopman Representations for Complex Dynamics with Spectral Residuals

Title: ResKoopNet: Learning Koopman Representations for Complex Dynamics with Spectral Residuals

ResKoopNet: Koopman-Repräsentanzen für komplexe Dynamiken mit Spektralresidualen lernen

ResKoopNet:学习 Koopman 代表器, 用于使用光谱残余物的复杂动态 2501.00701v4

Authors: Yuanchao Xu, Kaidi Shao, Nikos Logothetis, Zhongwei Shen

Analyzing the long-term behavior of high-dimensional nonlinear dynamical systems remains a significant challenge. While the Koopman operator framework provides a powerful global linearization tool, current methods for approximating its spectral components often face theoretical limitations and depend on predefined dictionaries. Residual Dynamic Mode Decomposition (ResDMD) advanced the field by introducing the spectral residual to assess Koopman operator approximation accuracy; however, its approach of only filtering precomputed spectra prevents the discovery of the operator’s complete spectral information, a limitation known as the 'spectral inclusion' problem. We introduce ResKoopNet (Residual-based Koopman-learning Network), a novel method that directly addresses this by explicitly minimizing the spectral residual to compute Koopman eigenpairs. This enables the identification of a more precise and complete Koopman operator spectrum. Using neural networks, our approach provides theoretical guarantees while maintaining computational adaptability. Experiments on a variety of physical and biological systems show that ResKoopNet achieves more accurate spectral approximations than existing methods, particularly for high-dimensional systems and those with continuous spectra, which demonstrates its effectiveness as a tool for analyzing complex dynamical systems.
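
The spectral residual can be sketched from snapshot data as follows, using the usual ResDMD construction with dictionary matrices evaluated at states and their successors. Uniform quadrature weights, the one-function dictionary in the example, and all names are assumptions; the neural dictionary learning that ResKoopNet adds is not shown.

```python
import math

def gram(P, Q):
    """Return P^H Q / M for M x N matrices given as lists of rows."""
    m = len(P)
    return [[sum(complex(P[k][i]).conjugate() * Q[k][j] for k in range(m)) / m
             for j in range(len(Q[0]))] for i in range(len(P[0]))]

def quad(g, M):
    """Quadratic form g^H M g."""
    n = len(g)
    return sum(complex(g[i]).conjugate() * M[i][j] * g[j]
               for i in range(n) for j in range(n))

def spectral_residual(psi_x, psi_y, lam, g):
    """Residual of the candidate eigenpair (lam, g) for the Koopman operator.

    psi_x[k] holds the dictionary evaluated at state x_k, psi_y[k] at its
    successor; the residual approximates ||(K - lam) g|| / ||g||.
    """
    G, A, L = gram(psi_x, psi_x), gram(psi_x, psi_y), gram(psi_y, psi_y)
    lam = complex(lam)
    gAg = quad(g, A)  # note that g^H A^H g is the conjugate of g^H A g
    num = (quad(g, L) - lam * gAg.conjugate() - lam.conjugate() * gAg
           + abs(lam) ** 2 * quad(g, G))
    return math.sqrt(max(num.real, 0.0) / quad(g, G).real)

# Toy system x -> 0.5 x with the single dictionary function psi(x) = x.
xs = [1.0, 2.0, 3.0, -1.5]
psi_x = [[x] for x in xs]
psi_y = [[0.5 * x] for x in xs]
res_true = spectral_residual(psi_x, psi_y, 0.5, [1.0])
res_wrong = spectral_residual(psi_x, psi_y, 0.9, [1.0])
```

For this toy system the residual vanishes at the true eigenvalue 0.5 and equals the distance 0.4 for the wrong guess 0.9, which is the quantity a residual-minimising method would drive down.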

Article 935

Title@2025-05-27 (2): Mitigating Molecular Aggregation in Drug Discovery with Predictive Insights from Explainable AI

Title: Mitigating Molecular Aggregation in Drug Discovery with Predictive Insights from Explainable AI

Mildernde molekulare Aggregation in der Drogenentdeckung mit vorausschauenden Erkenntnissen von erklärbarer KI

利用可解释的人工智能的预测洞察力减轻药物发现中的分子聚合 2306.02206v2

Authors: Hunter Sturm, Jonas Teufel, Kaitlin A. Isfeld, Pascal Friederich, Rebecca L. Davis

Herein, we present the application of MEGAN, our explainable AI (xAI) model, for the identification of small colloidally aggregating molecules (SCAMs). This work offers solutions to the long-standing problem of false positives caused by SCAMs in high throughput screening for drug discovery and demonstrates the power of xAI in the classification of molecular properties that are not chemically intuitive based on our current understanding. We leverage xAI insights and molecular counterfactuals to design alternatives to problematic compounds in drug screening libraries. Additionally, we experimentally validate the MEGAN prediction classification for one of the counterfactuals and demonstrate the utility of counterfactuals for altering the aggregation properties of a compound through minor structural modifications. The integration of this method in high-throughput screening approaches will help combat and circumvent false positives, providing better lead molecules more rapidly and thus accelerating drug discovery cycles.

Article 936

Title@2025-05-27 (2): BindEnergyCraft: Casting Protein Structure Predictors as Energy-Based Models for Binder Design

Title: BindEnergyCraft: Casting Protein Structure Predictors as Energy-Based Models for Binder Design

BindEnergyCraft: Proteinstrukturvorhersagen als energiebasierte Modelle für Binder-Design

Bind EnergyCraft: 将蛋白结构预测器作为Binder设计以能源为基础的模型 2505.21241v1

Authors: Divya Nori, Anisha Parsan, Caroline Uhler, Wengong Jin

Protein binder design has been transformed by hallucination-based methods that optimize structure prediction confidence metrics, such as the interface predicted TM-score (ipTM), via backpropagation. However, these metrics do not reflect the statistical likelihood of a binder-target complex under the learned distribution and yield sparse gradients for optimization. In this work, we propose a method to extract such likelihoods from structure predictors by reinterpreting their confidence outputs as an energy-based model (EBM). By leveraging the Joint Energy-based Modeling (JEM) framework, we introduce pTMEnergy, a statistical energy function derived from predicted inter-residue error distributions. We incorporate pTMEnergy into BindEnergyCraft (BECraft), a design pipeline that maintains the same optimization framework as BindCraft but replaces ipTM with our energy-based objective. BECraft outperforms BindCraft, RFDiffusion, and ESM3 across multiple challenging targets, achieving higher in silico binder success rates while reducing structural clashes. Furthermore, pTMEnergy establishes a new state-of-the-art in structure-based virtual screening tasks for miniprotein and RNA aptamer binders.

Article 937

Title@2025-05-27 (2): Breaking the Performance Ceiling in Complex Reinforcement Learning requires Inference Strategies

Title: Breaking the Performance Ceiling in Complex Reinforcement Learning requires Inference Strategies

Breaking the Performance Ceiling in komplexen Verstärkungs-Lernen erfordert Inferenz-Strategien

综合加强学习中业绩上限的打破需要推断战略 2505.21236v1

Authors: Felix Chalumeau, Daniel Rajaonarivonivelomanantsoa, Ruan de Kock, Claude Formanek, Sasha Abramowitz, Oumayma Mahjoub, Wiem Khlifi, Simon Du Toit, Louay Ben Nessir, Refiloe Shabe, Arnol Fokam, Siddarth Singh, Ulrich Mbou Sob, Arnu Pretorius

Reinforcement learning (RL) systems have countless applications, from energy-grid management to protein design. However, such real-world scenarios are often extremely difficult, combinatorial in nature, and require complex coordination between multiple agents. This level of complexity can cause even state-of-the-art RL systems, trained until convergence, to hit a performance ceiling which they are unable to break out of with zero-shot inference. Meanwhile, many digital or simulation-based applications allow for an inference phase that utilises a specific time and compute budget to explore multiple attempts before outputting a final solution. In this work, we show that such an inference phase employed at execution time, and the choice of a corresponding inference strategy, are key to breaking the performance ceiling observed in complex multi-agent RL problems. Our main result is striking: we can obtain up to a 126% and, on average, a 45% improvement over the previous state-of-the-art across 17 tasks, using only a couple seconds of extra wall-clock time during execution. We also demonstrate promising compute scaling properties, supported by over 60k experiments, making it the largest study on inference strategies for complex RL to date. Our experimental data and code are available at https://sites.google.com/view/inf-marl.
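
The simplest inference strategy of the kind described, spending a fixed budget on several attempts and returning the best one, can be sketched as below. The paper studies a range of strategies that this does not cover; the toy policy, the reward, and the function names are hypothetical.

```python
import random

def best_of_n(sample_attempt, score, n, rng):
    """Draw n attempts from a stochastic policy and keep the highest-scoring one."""
    best, best_score = None, float("-inf")
    for _ in range(n):
        attempt = sample_attempt(rng)
        s = score(attempt)
        if s > best_score:
            best, best_score = attempt, s
    return best, best_score

def toy_policy(rng):
    """Hypothetical stochastic policy: emit five random bits."""
    return [rng.randint(0, 1) for _ in range(5)]

# The reward is the number of ones; both runs share a seed, so the first attempt matches.
single = best_of_n(toy_policy, sum, 1, random.Random(0))
many = best_of_n(toy_policy, sum, 64, random.Random(0))
```

Because the single attempt is also the first of the 64, the larger budget can only match or improve on it, which mirrors the zero-shot versus inference-phase comparison in the abstract.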

Article 938

Title@2025-05-27 (2): STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization

Title: STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization

STRAP: Spatio-Temporal Pattern Retrieval für Out-of-Distribution-Verallgemeinerung

STRAP: 普遍分发的Spadio-Temporal 样板回收 2505.19547v2

Authors: Haoyu Zhang, Wentao Zhang, Hao Miao, Xinke Jiang, Yuchen Fang, Yifan Zhang

Spatio-Temporal Graph Neural Networks (STGNNs) have emerged as a powerful tool for modeling dynamic graph-structured data across diverse domains. However, they often fail to generalize in Spatio-Temporal Out-of-Distribution (STOOD) scenarios, where both temporal dynamics and spatial structures evolve beyond the training distribution. To address this problem, we propose an innovative Spatio-Temporal Retrieval-Augmented Pattern Learning framework, STRAP, which enhances model generalization by integrating retrieval-augmented learning into the STGNN continual learning pipeline. The core of STRAP is a compact and expressive pattern library that stores representative spatio-temporal patterns enriched with historical, structural, and semantic information, which is obtained and optimized during the training phase. During inference, STRAP retrieves relevant patterns from this library based on similarity to the current input and injects them into the model via a plug-and-play prompting mechanism. This not only strengthens spatio-temporal representations but also mitigates catastrophic forgetting. Moreover, STRAP introduces a knowledge-balancing objective to harmonize new information with retrieved knowledge. Extensive experiments across multiple real-world streaming graph datasets show that STRAP consistently outperforms state-of-the-art STGNN baselines on STOOD tasks, demonstrating its robustness, adaptability, and strong generalization capability without task-specific fine-tuning.
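
The retrieval step can be pictured as a nearest-neighbour lookup over stored pattern embeddings. The sketch below assumes cosine similarity and a plain dictionary as the pattern library; the abstract does not specify the similarity measure, and the names `cosine`, `retrieve` and the toy patterns are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(library, query, k=2):
    """Return the names of the k stored patterns most similar to the query."""
    ranked = sorted(library.items(), key=lambda kv: cosine(kv[1], query), reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical pattern library with 3-dimensional embeddings
library = {
    "rush_hour": [1.0, 0.0, 0.0],
    "night": [0.0, 1.0, 0.0],
    "weekend": [0.9, 0.1, 0.0],
}
print(retrieve(library, [1.0, 0.0, 0.0]))
```

The retrieved patterns would then be passed to the model as additional context; that injection step is specific to the method and is not sketched here.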

Article 939

Title@2025-05-27 (2): FRIREN: Beyond Trajectories – A Spectral Lens on Time

Title: FRIREN: Beyond Trajectories – A Spectral Lens on Time

FRIREN: Jenseits von Trajektorien – Eine Spektrallinse auf Zeit

在轨迹之外 – – 时光透镜 2505.17370v2

Authors: Qilin Wang

Long-term time-series forecasting (LTSF) models are often presented as general-purpose solutions that can be applied across domains, implicitly assuming that all data is pointwise predictable. Using chaotic systems such as Lorenz-63 as a case study, we argue that geometric structure - not pointwise prediction - is the right abstraction for a dynamic-agnostic foundational model. Minimizing the Wasserstein-2 distance (W2), which captures geometric changes, and providing a spectral view of dynamics are essential for long-horizon forecasting. Our model, FRIREN (Flow-inspired Representations via Interpretable Eigen-networks), implements an augmented normalizing-flow block that embeds data into a normally distributed latent representation. It then generates a W2-efficient optimal path that can be decomposed into rotation, scaling, inverse rotation, and translation. This architecture yields locally generated, geometry-preserving predictions that are independent of the underlying dynamics, and a global spectral representation that functions as a finite Koopman operator with a small modification. This enables practitioners to identify which modes grow, decay, or oscillate, both locally and system-wide. FRIREN achieves an MSE of 11.4, MAE of 1.6, and SWD of 0.96 on Lorenz-63 in a 336-in, 336-out, dt=0.01 setting, surpassing TimeMixer (MSE 27.3, MAE 2.8, SWD 2.1). The model maintains effective prediction for 274 out of 336 steps, approximately 2.5 Lyapunov times. On Rossler (96-in, 336-out), FRIREN achieves an MSE of 0.0349, MAE of 0.0953, and SWD of 0.0170, outperforming TimeMixer’s MSE of 4.3988, MAE of 0.886, and SWD of 3.2065. FRIREN is also competitive on standard LTSF datasets such as ETT and Weather. By connecting modern generative flows with classical spectral analysis, FRIREN makes long-term forecasting both accurate and interpretable, setting a new benchmark for LTSF model design.
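
To make the "which modes grow, decay, or oscillate" reading concrete: for a discrete-time linear operator, an eigenvalue with magnitude above 1 corresponds to a growing mode, below 1 to a decaying one, and a nonzero imaginary part to oscillation. The sketch below applies that standard rule to given eigenvalues; it assumes the eigenvalues are already available and says nothing about how FRIREN obtains them. The name `classify_mode` is hypothetical.

```python
def classify_mode(eigenvalue, tol=1e-9):
    """Classify one discrete-time mode as (trend, oscillates).

    trend is "grows", "decays" or "neutral" depending on |lambda| vs 1.
    oscillates is True when the eigenvalue has a nonzero imaginary part
    (sign-flipping negative real eigenvalues are not flagged here).
    """
    lam = complex(eigenvalue)
    magnitude = abs(lam)
    if magnitude > 1 + tol:
        trend = "grows"
    elif magnitude < 1 - tol:
        trend = "decays"
    else:
        trend = "neutral"
    return trend, abs(lam.imag) > tol

for lam in (1.1, 0.5 + 0.5j, 1j):
    print(lam, classify_mode(lam))
```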

Article 940

Title@2025-05-27 (2): Is Hyperbolic Space All You Need for Medical Anomaly Detection?

Title: Is Hyperbolic Space All You Need for Medical Anomaly Detection?

Ist hyperbolischer Raum alles, was Sie für medizinische Anomalie-Erkennung benötigen?

超双曲空间是否所有你需要的医疗异常检测? 2505.21228v1

Authors: Alvaro Gonzalez-Jimenez, Simone Lionetti, Ludovic Amruthalingam, Philippe Gottfrois, Fabian Gröger, Marc Pouly, Alexander A. Navarini

Medical anomaly detection has emerged as a promising solution to challenges in data availability and labeling constraints. Traditional methods extract features from different layers of pre-trained networks in Euclidean space; however, Euclidean representations fail to effectively capture the hierarchical relationships within these features, leading to suboptimal anomaly detection performance. We propose a novel yet simple approach that projects feature representations into hyperbolic space, aggregates them based on confidence levels, and classifies samples as healthy or anomalous. Our experiments demonstrate that hyperbolic space consistently outperforms Euclidean-based frameworks, achieving higher AUROC scores at both image and pixel levels across multiple medical benchmark datasets. Additionally, we show that hyperbolic space exhibits resilience to parameter variations and excels in few-shot scenarios, where healthy images are scarce. These findings underscore the potential of hyperbolic space as a powerful alternative for medical anomaly detection. The project website can be found at https://hyperbolic-anomalies.github.io
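
One common way to project Euclidean features into hyperbolic space is the exponential map at the origin of the Poincaré ball. The sketch below assumes curvature -1 and that model; the abstract does not say which hyperbolic model or curvature the authors use, and the function names are illustrative.

```python
import math

def expmap0(v, eps=1e-12):
    """Map a Euclidean (tangent) vector at the origin into the unit Poincare ball."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm < eps:
        return [0.0] * len(v)
    scale = math.tanh(norm) / norm
    return [scale * x for x in v]

def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit Poincare ball."""
    diff2 = sum((a - b) ** 2 for a, b in zip(x, y))
    nx2 = sum(a * a for a in x)
    ny2 = sum(b * b for b in y)
    return math.acosh(1 + 2 * diff2 / ((1 - nx2) * (1 - ny2)))

feature = [0.3, 0.4]                 # Euclidean norm 0.5
point = expmap0(feature)             # lands strictly inside the unit ball
print(point, poincare_distance([0.0, 0.0], point))
```

Under this convention the distance from the origin to the mapped point is twice the Euclidean norm of the input, so distances grow quickly toward the boundary, which is what makes the space suited to hierarchical structure.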

Article 941

Title@2025-05-27 (2): Why Do More Experts Fail? A Theoretical Analysis of Model Merging

Title: Why Do More Experts Fail? A Theoretical Analysis of Model Merging

Warum scheitern weitere Experten? Eine theoretische Analyse der Modellzusammenführung

为何有更多的专家失败?对模式合并的理论分析 2505.21226v1

Authors: Zijing Wang, Xingle Xu, Yongkang Liu, Yiqun Zhang, Peiqin Lin, Shi Feng, Xiaocui Yang, Daling Wang, Hinrich Schütze

Model merging dramatically reduces storage and computational resources by combining multiple expert models into a single multi-task model. Although recent model merging methods have shown promising results, they struggle to maintain performance gains as the number of merged models increases. In this paper, we investigate the key obstacles that limit the scalability of model merging when integrating a large number of expert models. First, we prove that there is an upper bound on model merging. Further theoretical analysis reveals that the limited effective parameter space imposes a strict constraint on the number of models that can be successfully merged. Gaussian Width shows that the marginal benefit of merging additional models diminishes according to a strictly concave function. This implies that the effective parameter space becomes rapidly saturated as the number of merged models increases. Furthermore, using Approximate Kinematics Theory, we prove the existence of a unique optimal threshold beyond which adding more models does not yield significant performance improvements. At the same time, we introduce a straightforward Reparameterized Heavy-Tailed method (RHT) to extend the coverage of the merged model, thereby enhancing its performance. Empirical results on 12 benchmarks, including both knowledge-intensive and general-purpose tasks, validate our theoretical analysis. We believe that these results will spark further research beyond the current scope of model merging. The source code is in the anonymous GitHub repository https://github.com/wzj1718/ModelMergingAnalysis.

Article 942

Title@2025-05-27 (2): The dark side of the forces: assessing non-conservative force models for atomistic machine learning

Title: The dark side of the forces: assessing non-conservative force models for atomistic machine learning

Die dunkle Seite der Kräfte: Bewertung nicht konservativer Kraftmodelle für atomistisches maschinelles Lernen

部队的黑暗面:评估非保守力量模型,以进行原子学机器学习 2412.11569v3

Authors: Filippo Bigi, Marcel Langer, Michele Ceriotti

The use of machine learning to estimate the energy of a group of atoms, and the forces that drive them to more stable configurations, has revolutionized the fields of computational chemistry and materials discovery. In this domain, rigorous enforcement of symmetry and conservation laws has traditionally been considered essential. For this reason, interatomic forces are usually computed as the derivatives of the potential energy, ensuring energy conservation. Several recent works have questioned this physically constrained approach, suggesting that directly predicting the forces yields a better trade-off between accuracy and computational efficiency – and that energy conservation can be learned during training. This work investigates the applicability of such non-conservative models in microscopic simulations. We identify and demonstrate several fundamental issues, from ill-defined convergence of geometry optimization to instability in various types of molecular dynamics. Contrary to the case of rotational symmetry, energy conservation is hard to learn, monitor, and correct for. The best approach to exploit the acceleration afforded by direct force prediction might be to use it in tandem with a conservative model, reducing – rather than eliminating – the additional cost of backpropagation, but avoiding the pathological behavior associated with non-conservative forces.
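
The difference between the two kinds of force model can be checked numerically: a force derived from a potential does zero net work around any closed path, while a directly predicted force need not. The sketch below uses a toy 2D harmonic potential and adds a small rotational term as a stand-in for a non-conservative prediction error; both force fields and the 0.1 coefficient are assumptions for illustration.

```python
import math

def conservative_force(x, y):
    """F = -grad E for the toy potential E = 0.5 * (x**2 + y**2)."""
    return (-x, -y)

def nonconservative_force(x, y):
    """Same force plus a small rotational part that is not the gradient of any potential."""
    return (-x - 0.1 * y, -y + 0.1 * x)

def loop_work(force, radius=1.0, steps=1000):
    """Work done by `force` around a circle, using midpoint-rule line integration."""
    work = 0.0
    for i in range(steps):
        t0 = 2 * math.pi * i / steps
        t1 = 2 * math.pi * (i + 1) / steps
        tm = 0.5 * (t0 + t1)
        fx, fy = force(radius * math.cos(tm), radius * math.sin(tm))
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        work += fx * dx + fy * dy
    return work

print(loop_work(conservative_force), loop_work(nonconservative_force))
```

The nonzero loop work of the second field is energy injected on every cycle, which is one way to picture the drift and instability that non-conservative forces can cause in molecular dynamics.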

Article 943

Title@2025-05-27 (2): Wavelet Flow For Extragalactic Foreground Simulations

Title: Wavelet Flow For Extragalactic Foreground Simulations

Wavelet Flow für extragalaktische Foreground Simulationen

用于外星际前景模拟的波浪流 2505.21220v1

Authors: M. Mebratu, W. L. K. Wu

Extragalactic foregrounds in cosmic microwave background (CMB) observations are both a source of cosmological and astrophysical information and a nuisance to the CMB. Effective field-level modeling that captures their non-Gaussian statistical distributions is increasingly important for optimal information extraction, particularly given the precise and low-noise observations from current and upcoming experiments. We explore the use of Wavelet Flow (WF) models to tackle the novel task of modeling the field-level probability distributions of multi-component CMB secondaries. Specifically, we jointly train correlated CMB lensing convergence ($\kappa$) and cosmic infrared background (CIB) maps with a WF model and obtain a network that statistically recovers the input to high accuracy – the trained network generates samples of $\kappa$ and CIB fields whose average power spectra are within a few percent of the inputs across all scales, and whose Minkowski functionals are similarly accurate compared to the inputs. Leveraging the multiscale architecture of these models, we fine-tune both the model parameters and the priors at each scale independently, optimizing performance across different resolutions. These results demonstrate that WF models can accurately simulate correlated components of CMB secondaries, supporting improved analysis of cosmological data. Our code and trained models can be found here (https://github.com/matiwosm/HybridPriorWavletFlow.git).

Article 944

Title@2025-05-27 (2): Addressing Data Quality Decompensation in Federated Learning via Dynamic Client Selection

Title: Addressing Data Quality Decompensation in Federated Learning via Dynamic Client Selection

Adressierung von Datenqualitätsentkompensation im Federated Learning über Dynamic Client Selection

通过动态客户选择解决联邦学习中的数据质量补偿问题 2505.21219v1

Authors: Qinjun Fei, Nuria Rodríguez-Barroso, María Victoria Luzón, Zhongliang Zhang, Francisco Herrera

In cross-silo Federated Learning (FL), client selection is critical to ensure high model performance, yet it remains challenging due to data quality decompensation, budget constraints, and incentive compatibility. As training progresses, these factors exacerbate client heterogeneity and degrade global performance. Most existing approaches treat these challenges in isolation, making jointly optimizing multiple factors difficult. To address this, we propose Shapley-Bid Reputation Optimized Federated Learning (SBRO-FL), a unified framework integrating dynamic bidding, reputation modeling, and cost-aware selection. Clients submit bids based on their perceived data quality, and their contributions are evaluated using Shapley values to quantify their marginal impact on the global model. A reputation system, inspired by prospect theory, captures historical performance while penalizing inconsistency. The client selection problem is formulated as a 0-1 integer program that maximizes reputation-weighted utility under budget constraints. Experiments on FashionMNIST, EMNIST, CIFAR-10, and SVHN datasets show that SBRO-FL improves accuracy, convergence speed, and robustness, even in adversarial and low-bid interference scenarios. Our results highlight the importance of balancing data reliability, incentive compatibility, and cost efficiency to enable scalable and trustworthy FL deployments.
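
The selection step described above is a 0-1 knapsack: pick the subset of clients with the highest total reputation-weighted utility whose bids fit the budget. The sketch below solves a tiny instance by exhaustive search; the client tuples, the product `reputation * utility` as the objective term, and the name `select_clients` are assumptions, and a real deployment would use an integer-programming solver instead of enumeration.

```python
from itertools import combinations

def select_clients(clients, budget):
    """Brute-force 0-1 selection.

    clients maps name -> (bid_cost, reputation, utility).
    Returns (selected_names, total_reputation_weighted_utility).
    """
    best_set, best_value = (), 0.0
    names = sorted(clients)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            cost = sum(clients[n][0] for n in subset)
            if cost > budget:
                continue  # violates the budget constraint
            value = sum(clients[n][1] * clients[n][2] for n in subset)
            if value > best_value:
                best_set, best_value = subset, value
    return set(best_set), best_value

# Hypothetical bids: (cost, reputation in [0, 1], estimated utility)
clients = {
    "a": (4, 0.9, 10),
    "b": (3, 0.5, 8),
    "c": (2, 0.8, 5),
    "d": (5, 0.2, 20),
}
print(select_clients(clients, budget=6))
```

Client "d" has the largest raw utility but a low reputation, so it loses out to the cheaper, more reliable pair "a" and "c".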

Article 945

Title@2025-05-27 (2): Transfer learning for multifidelity simulation-based inference in cosmology

Title: Transfer learning for multifidelity simulation-based inference in cosmology

Transfer-Lernen für Multifidelity-Simulationsbasierte Schlussfolgerungen in der Kosmologie

在宇宙学中进行多种不贞行为模拟推论的转让性学习 2505.21215v1

Authors: Alex A. Saoulis, Davide Piras, Niall Jeffrey, Alessio Spurio Mancini, Ana M. G. Ferreira, Benjamin Joachimi

Simulation-based inference (SBI) enables cosmological parameter estimation when closed-form likelihoods or models are unavailable. However, SBI relies on machine learning for neural compression and density estimation. This requires large training datasets which are prohibitively expensive for high-quality simulations. We overcome this limitation with multifidelity transfer learning, combining less expensive, lower-fidelity simulations with a limited number of high-fidelity simulations. We demonstrate our methodology on dark matter density maps from two separate simulation suites in the hydrodynamical CAMELS Multifield Dataset. Pre-training on dark-matter-only $N$-body simulations reduces the required number of high-fidelity hydrodynamical simulations by a factor between $8$ and $15$, depending on the model complexity, posterior dimensionality, and performance metrics used. By leveraging cheaper simulations, our approach enables performant and accurate inference on high-fidelity models while substantially reducing computational costs.

Article 946

Title@2025-05-27 (2): Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning

Title: Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning

Auf dem Weg zur Enthüllung der Wirksamkeit von Klein-Scale-Fine-Tuning im R1-Stil Verstärktes Lernen

提高R1型强化学习中小规模微调的效力 2505.17988v2

Authors: Yutong Chen, Jiandong Gao, Ji Wu

R1-style Reinforcement Learning (RL) significantly enhances Large Language Models’ reasoning capabilities, yet the mechanism behind rule-based RL remains unclear. We found that small-scale SFT has a significant influence on RL but shows poor efficiency. To explain our observations, we propose an analytical framework and compare the efficiency of SFT and RL by measuring sample effect. Hypothetical analysis shows that SFT efficiency is limited by training data. Guided by our analysis, we propose Re-distillation, a technique that fine-tunes the pretrained model through small-scale distillation from the RL-trained policy. Experiments on Knight & Knave and MATH datasets demonstrate re-distillation’s surprising efficiency: re-distilled models match RL performance with far fewer samples and less computation. Empirical verification shows that sample effect is a good indicator of performance improvements. As a result, on the K&K dataset, our re-distilled Qwen2.5-1.5B model surpasses DeepSeek-V3-0324 with only 1K SFT samples. On MATH, Qwen2.5-1.5B fine-tuned with 500 re-distilled samples matches its instruct-tuned variant without RL. Our work explains several interesting phenomena in R1-style RL, shedding light on the mechanisms behind its empirical success. Code is available at: https://github.com/on1262/deep-reasoning

Article 947

Title@2025-05-27 (2): Input Convex Kolmogorov Arnold Networks

Title: Input Convex Kolmogorov Arnold Networks

Input Convex Kolmogorov Arnold Networks

投入 Convex Kolmogorov Arnold 网络 2505.21208v1

Authors: Thomas Deschatre, Xavier Warin

This article presents an input convex neural network architecture using Kolmogorov-Arnold networks (ICKAN). Two specific networks are presented: the first is based on a low-order, piecewise-linear representation of functions, and a universal approximation theorem is provided. The second is based on cubic splines, for which only numerical results support convergence. We demonstrate on simple tests that these networks perform competitively with classical input convex neural networks (ICNNs). In a second part, we use the networks to solve some optimal transport problems needing a convex approximation of functions and demonstrate their effectiveness. Comparisons with ICNNs show that cubic ICKANs produce results similar to those of classical ICNNs.

Article 948

Title@2025-05-27 (2): Towards Identifiability of Interventional Stochastic Differential Equations

Title: Towards Identifiability of Interventional Stochastic Differential Equations

Zur Identifizierbarkeit interventioneller stochastischer Differentialgleichungen

实现干预性斯托卡差异等同的可识别性 2505.15987v2

Authors: Aaron Zweig, Zaikang Lin, Elham Azizi, David Knowles

We study identifiability of stochastic differential equation (SDE) models under multiple interventions. Our results give the first provable bounds for unique recovery of SDE parameters given samples from their stationary distributions. We give tight bounds on the number of necessary interventions for linear SDEs, and upper bounds for nonlinear SDEs in the small noise regime. We experimentally validate the recovery of true parameters in synthetic data, and motivated by our theoretical results, demonstrate the advantage of parameterizations with learnable activation functions.

Article 949

Title@2025-05-27 (2): Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs

Title: Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs

Universal Reasoner: Ein einfacher, komponierbarer Plug-and-Play-Reasoner für gefrorene LLMs

通用理由:冻结长效LMs的单一、可合成插管和布局理由 2505.19075v2

Authors: Jaemin Kim, Hangeol Chang, Hyunmin Hwang, Choonghan Kim, Jong Chul Ye

Large Language Models (LLMs) have demonstrated remarkable general capabilities, but enhancing skills such as reasoning often demands substantial computational resources and may compromise their generalization. While Parameter-Efficient Fine-Tuning (PEFT) methods offer a more resource-conscious alternative, they typically require retraining for each LLM backbone due to architectural dependencies. To address these challenges, here we propose Universal Reasoner (UniR) - a single, lightweight, composable, and plug-and-play reasoning module that can be used with any frozen LLM to endow it with specialized reasoning capabilities. Specifically, UniR decomposes the reward into a standalone reasoning module that is trained independently using predefined rewards, effectively translating trajectory-level signals into token-level guidance. Once trained, UniR can be combined with any frozen LLM at inference time by simply adding its output logits to those of the LLM backbone. This additive structure naturally enables modular composition: multiple UniR modules trained for different tasks can be jointly applied by summing their logits, enabling complex reasoning via composition. Experimental results on mathematical reasoning and machine translation tasks show that UniR significantly outperforms existing baseline fine-tuning methods using the Llama3.2 model. Furthermore, UniR demonstrates strong weak-to-strong generalization: reasoning modules trained on smaller models effectively guide much larger LLMs. This makes UniR a cost-efficient, adaptable, and robust solution for enhancing reasoning in LLMs without compromising their core capabilities. Code is open-sourced at https://github.com/hangeol/UniR
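
The additive combination described in the abstract can be sketched in a few lines: the module's logits are added to the frozen backbone's logits before the softmax, and several modules compose by summation. The per-module `weights` argument and all names below are assumptions added for illustration; the abstract only states plain addition over a shared vocabulary.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def guided_distribution(backbone_logits, module_logits_list, weights=None):
    """Add each reasoning module's logits to the frozen backbone's logits."""
    if weights is None:
        weights = [1.0] * len(module_logits_list)
    combined = list(backbone_logits)
    for w, module_logits in zip(weights, module_logits_list):
        combined = [c + w * z for c, z in zip(combined, module_logits)]
    return softmax(combined)

# Toy vocabulary of three tokens; the module pushes probability toward token 1
backbone = [2.0, 1.0, 0.0]
reasoner = [0.0, 3.0, 0.0]
print(guided_distribution(backbone, [reasoner]))
```

Because the backbone is only read, never updated, the same module can be reused with any model that shares the vocabulary.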

Article 950

Title@2025-05-27 (2): Developing hybrid mechanistic and data-driven personalized prediction models for platelet dynamics

Title: Developing hybrid mechanistic and data-driven personalized prediction models for platelet dynamics

Entwicklung hybrider mechanistischer und datengesteuerter personalisierter Vorhersagemodelle für Thrombozytendynamik

开发混合机械和数据驱动的小板板动力学混合机械和个人化预测模型 2505.21204v1

Authors: Marie Steinacker, Yuri Kheifetz, Markus Scholz

Hematotoxicity, drug-induced damage to the blood-forming system, is a frequent side effect of cytotoxic chemotherapy and poses a significant challenge in clinical practice due to its high inter-patient variability and limited predictability. Current mechanistic models often struggle to accurately forecast outcomes for patients with irregular or atypical trajectories. In this study, we develop and compare hybrid mechanistic and data-driven approaches for individualized time series modeling of platelet counts during chemotherapy. We consider hybrid models that combine mechanistic models with neural networks, known as universal differential equations. As a purely data-driven alternative, we utilize a nonlinear autoregressive exogenous model using gated recurrent units as the underlying architecture. These models are evaluated across a range of real patient scenarios, varying in data availability and sparsity, to assess predictive performance. Our findings demonstrate that data-driven methods, when provided with sufficient data, significantly improve prediction accuracy, particularly for high-risk patients with irregular platelet dynamics. This highlights the potential of data-driven approaches in enhancing clinical decision-making. In contrast, hybrid and mechanistic models are superior in scenarios with limited or sparse data. The proposed modeling and comparison framework is generalizable and could be extended to predict other treatment-related toxicities, offering broad applicability in personalized medicine.

Article 951

Title@2025-05-27 (2): Implicit Dynamical Flow Fusion (IDFF) for Generative Modeling

Title: Implicit Dynamical Flow Fusion (IDFF) for Generative Modeling

Implizite Dynamische Flussfusion (IDFF) für generative Modellierung

用于产生建模的隐含动态流动融合(IDFF) 2409.14599v4

Authors: Mohammad R. Rezaei, Milos R. Popovic, Milad Lankarany, Rahul G. Krishnan

Conditional Flow Matching (CFM) models can generate high-quality samples from a non-informative prior, but they can be slow, often needing hundreds of network function evaluations (NFEs). To address this, we propose Implicit Dynamical Flow Fusion (IDFF); IDFF learns a new vector field with an additional momentum term that enables taking longer steps during sample generation while maintaining the fidelity of the generated distribution. Consequently, IDFFs reduce the NFEs by a factor of ten (relative to CFMs) without sacrificing sample quality, enabling rapid sampling and efficient handling of image and time-series data generation tasks. We evaluate IDFF on standard benchmarks such as CIFAR-10 and CelebA for image generation, where we achieve likelihood and quality performance comparable to CFMs and diffusion-based models with fewer NFEs. IDFF also shows superior performance on time-series dataset modeling, including molecular simulation and sea surface temperature (SST) datasets, highlighting its versatility and effectiveness across different domains. GitHub repository: https://github.com/MrRezaeiUofT/IDFF

Article 952

Title@2025-05-27 (2): Crop recommendation with machine learning: leveraging environmental and economic factors for optimal crop selection

Title: Crop recommendation with machine learning: leveraging environmental and economic factors for optimal crop selection

Kulturempfehlung mit maschinellem Lernen: Nutzung ökologischer und wirtschaftlicher Faktoren für eine optimale Ernteauswahl

采用机械学习的作物建议:利用环境和经济因素优化作物选择 2505.21201v1

Authors: Steven Sam, Silima Marshal DAbreo

Agriculture constitutes a primary source of food production, economic growth and employment in India, but the sector is confronted with low farm productivity and yields aggravated by increased pressure on natural resources and adverse climate change variability. Efforts involving green revolution, land irrigations, improved seeds and organic farming have yielded suboptimal outcomes. The adoption of computational tools like crop recommendation systems offers a new way to provide insights and help farmers tackle low productivity. However, most agricultural recommendation systems in India focus narrowly on environmental factors and regions, limiting accurate predictions of high-yield, profitable crops. This study uses environmental and economic factors with 19 crops across 15 states to develop and evaluate Random Forest and SVM models using 10-fold Cross Validation, Time-series Split, and Lag Variables. The 10-fold cross validation showed high accuracy (RF: 99.96%, SVM: 94.71%) but raised overfitting concerns. Introducing temporal order, better reflecting real-world conditions, reduced performance (RF: 78.55%, SVM: 71.18%) in the Time-series Split. To further increase the model accuracy while maintaining the temporal order, the Lag Variables approach was employed, which resulted in improved performance (RF: 83.62%, SVM: 74.38%) compared to the Time-series Split approach. Overall, the models in the Time-series Split and Lag Variables approaches offer practical insights by handling temporal dependencies and enhancing their adaptability to changing agricultural conditions over time. Consequently, the study identifies the Random Forest model built on the Lag Variables as the preferred algorithm for optimal crop recommendation in the Indian context.
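
The gap between the cross-validation and time-ordered results comes from how the data are split: shuffled folds let the model see the future, while a time-ordered split trains only on earlier rows. The sketch below shows an expanding-window split and a one-step lag feature in plain Python; it is a simplified stand-in under assumed names, not the study's pipeline and not identical to any particular library's splitter.

```python
def add_lag(series, lag=1):
    """Pair each value with the value `lag` steps earlier; the first `lag` rows are dropped."""
    return [(series[i - lag], series[i]) for i in range(lag, len(series))]

def time_series_splits(n_samples, n_splits):
    """Expanding-window splits: train on all earlier rows, test on the next block."""
    fold = n_samples // (n_splits + 1)
    splits = []
    for k in range(1, n_splits + 1):
        train = list(range(0, k * fold))
        test = list(range(k * fold, min((k + 1) * fold, n_samples)))
        splits.append((train, test))
    return splits

yields = [10, 20, 30, 40]            # hypothetical yearly values
print(add_lag(yields))               # (previous year, current year) pairs
for train, test in time_series_splits(10, 4):
    print(train, test)
```

Every test index is later than every training index, so the evaluation never leaks future information into training.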

Article 953

Title@2025-05-27 (2): Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning

Title: Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning

Pioniere 4-Bit FP-Quantisierung für Diffusionsmodelle: Mixup-Sign-Quantisierung und Timestep-Aware Feintuning

推出4-Bit FP 扩散模型量化:混合- Sign 量度和时间步骤- 软件精美调试 2505.21591v1

Authors: Maosen Zhao, Pengtao Chen, Chong Yu, Yan Wen, Xudong Tan, Tao Chen

Model quantization reduces the bit-width of weights and activations, improving memory efficiency and inference speed in diffusion models. However, achieving 4-bit quantization remains challenging. Existing methods, primarily based on integer quantization and post-training quantization fine-tuning, struggle with inconsistent performance. Inspired by the success of floating-point (FP) quantization in large language models, we explore low-bit FP quantization for diffusion models and identify key challenges: the failure of signed FP quantization to handle asymmetric activation distributions, the insufficient consideration of temporal complexity in the denoising process during fine-tuning, and the misalignment between fine-tuning loss and quantization error. To address these challenges, we propose the mixup-sign floating-point quantization (MSFP) framework, first introducing unsigned FP quantization in model quantization, along with timestep-aware LoRA (TALoRA) and denoising-factor loss alignment (DFA), which ensure precise and stable fine-tuning. Extensive experiments show that we are the first to achieve superior performance in 4-bit FP quantization for diffusion models, outperforming existing PTQ fine-tuning methods in 4-bit INT quantization.
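
Why an unsigned format helps with one-sided activations can be seen by listing the values a 4-bit floating-point code can represent. The sketch below enumerates a signed layout with 2 exponent bits and 1 mantissa bit against an unsigned layout that spends the freed sign bit on mantissa; these bit layouts, the absence of reserved Inf/NaN codes, and the name `fp_grid` are illustrative assumptions, since the abstract does not give the exact formats used by MSFP.

```python
def fp_grid(exp_bits, man_bits, signed):
    """Enumerate representable values of a tiny floating-point format.

    Uses IEEE-style bias and subnormals, with no codes reserved for Inf/NaN.
    """
    bias = 2 ** (exp_bits - 1) - 1
    values = set()
    for e in range(2 ** exp_bits):
        for m in range(2 ** man_bits):
            if e == 0:   # subnormal range
                mag = 2.0 ** (1 - bias) * (m / 2 ** man_bits)
            else:        # normal range with implicit leading 1
                mag = 2.0 ** (e - bias) * (1 + m / 2 ** man_bits)
            values.add(mag)
            if signed:
                values.add(-mag)
    return sorted(values)

signed_4bit = fp_grid(exp_bits=2, man_bits=1, signed=True)
unsigned_4bit = fp_grid(exp_bits=2, man_bits=2, signed=False)
print([v for v in signed_4bit if v >= 0])
print(unsigned_4bit)
```

For a distribution that is mostly non-negative, the signed grid leaves about half of its codes unused, while the unsigned grid places twice as many levels on the non-negative side.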

Article 954

Title@2025-05-27 (2): Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM’s Instruction-Following Capabilities

Title: Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM’s Instruction-Following Capabilities

Enthüllen von instruction-spezifischen Neuronen & Experten: Ein analytischer Rahmen für die instruction-following Fähigkeiten von LLM

具体未完成的指示性具体神经和专家:LLM教学-执行能力分析框架 2505.21191v1

Authors: Junyan Zhang, Yubo Gao, Yibo Yan, Jungang Li, Zhaorui Hou, Sicheng Tao, Shuliang Liu, Song Dai, Yonghua Hei, Junzhuo Li, Xuming Hu

The fine-tuning of Large Language Models (LLMs) has significantly advanced their instruction-following capabilities, yet the underlying computational mechanisms driving these improvements remain poorly understood. This study systematically examines how fine-tuning reconfigures LLM computations by isolating and analyzing instruction-specific sparse components, i.e., neurons in dense models and both neurons and experts in Mixture-of-Experts (MoE) architectures. In particular, we introduce HexaInst, a carefully curated and balanced instructional dataset spanning six distinct categories, and propose SPARCOM, a novel analytical framework comprising three key contributions: (1) a method for identifying these sparse components, (2) an evaluation of their functional generality and uniqueness, and (3) a systematic comparison of their alterations. Through experiments, we demonstrate functional generality, uniqueness, and the critical role of these components in instruction execution. By elucidating the relationship between fine-tuning-induced adaptations and sparse computational substrates, this work provides deeper insights into how LLMs internalize instruction-following behavior for the trustworthy LLM community.

Article 955

Title@2025-05-27 (2): Exploring the Latent Capacity of LLMs for One-Step Text Generation

Title: Exploring the Latent Capacity of LLMs for One-Step Text Generation

Erforschung der Latent-Kapazität von LLMs für die einstufige Textgenerierung

探索单步制文本生成LLMs的原始能力 2505.21189v1

Authors: Gleb Mezentsev, Ivan Oseledets

A recent study showed that large language models (LLMs) can reconstruct surprisingly long texts - up to thousands of tokens - via autoregressive generation from just one specially trained input embedding. In this work, we explore whether such reconstruction is possible without autoregression. We show that frozen LLMs can generate hundreds of accurate tokens in just one forward pass, when provided with only two learned embeddings. This reveals a surprising and underexplored capability of LLMs - multi-token generation without iterative decoding. We investigate the behaviour of these embeddings and provide insight into the type of information they encode. We also empirically show that although these representations are not unique for a given text, they form connected and local regions in embedding space - a property that suggests the potential of learning a dedicated encoder into that space.

Article 956

Title@2025-05-27 (2): Equivariant Representation Learning for Symmetry-Aware Inference with Guarantees

Title: Equivariant Representation Learning for Symmetry-Aware Inference with Guarantees

Gleichwertiges Repräsentationslernen für Symmetrie-Bewusstschluss mit Garantien

关于有担保的对称-软件推断的等同代表制学习 2505.19809v2

Authors: Daniel Ordoñez-Apraez, Vladimir Kostić, Alek Fröhlich, Vivien Brandt, Karim Lounici, Massimiliano Pontil

In many real-world applications of regression, conditional probability estimation, and uncertainty quantification, exploiting symmetries rooted in physics or geometry can dramatically improve generalization and sample efficiency. While geometric deep learning has made significant empirical advances by incorporating group-theoretic structure, less attention has been given to statistical learning guarantees. In this paper, we introduce an equivariant representation learning framework that simultaneously addresses regression, conditional probability estimation, and uncertainty quantification while providing first-of-its-kind non-asymptotic statistical learning guarantees. Grounded in operator and group representation theory, our framework approximates the spectral decomposition of the conditional expectation operator, building representations that are both equivariant and disentangled along independent symmetry subgroups. Empirical evaluations on synthetic datasets and real-world robotics applications confirm the potential of our approach, matching or outperforming existing equivariant baselines in regression while additionally providing well-calibrated parametric uncertainty estimates.


Article 957

Title@2025-05-27 (2): PoisonSwarm: Universal Harmful Information Synthesis via Model Crowdsourcing

Title: PoisonSwarm: Universal Harmful Information Synthesis via Model Crowdsourcing

GiftSwarm: Universal Harmful Information Synthesis via Model Crowdsourcing

毒物群:通过示范众包普及有害信息合成 2505.21184v1

Authors: Yu Yan, Sheng Sun, Zhifei Zheng, Ziji Hao, Teli Liu, Min Liu

To construct responsible and secure AI applications, harmful information data is widely utilized for adversarial testing and the development of safeguards. Existing studies mainly leverage Large Language Models (LLMs) to synthesize data to obtain high-quality task datasets at scale, thereby avoiding costly human annotation. However, limited by the safety alignment mechanisms of LLMs, the synthesis of harmful data still faces challenges in generation reliability and content diversity. In this study, we propose a novel harmful information synthesis framework, PoisonSwarm, which applies the model crowdsourcing strategy to generate diverse harmful data while maintaining a high success rate. Specifically, we generate abundant benign data as base templates in a counterfactual manner. Subsequently, we decompose each base template into multiple semantic units and perform unit-by-unit toxification and final refinement through dynamic model switching, thus ensuring the success of synthesis. Experimental results demonstrate that PoisonSwarm achieves state-of-the-art performance in synthesizing different categories of harmful data with high scalability and diversity.


Article 958

Title@2025-05-27 (2): Learning What to Do and What Not To Do: Offline Imitation from Expert and Undesirable Demonstrations

Title: Learning What to Do and What Not To Do: Offline Imitation from Expert and Undesirable Demonstrations

Lernen, was zu tun ist und was nicht: Offline-Imitation von Experten und unerwünschten Demonstrationen

学会做什么做什么和不做什么:专家的脱线模仿和不受欢迎的示威 2505.21182v1

Authors: Huy Hoang, Tien Mai, Pradeep Varakantham, Tanvi Verma

Offline imitation learning typically learns from expert and unlabeled demonstrations, yet often overlooks the valuable signal in explicitly undesirable behaviors. In this work, we study offline imitation learning from contrasting behaviors, where the dataset contains both expert and undesirable demonstrations. We propose a novel formulation that optimizes a difference of KL divergences over the state-action visitation distributions of expert and undesirable (or bad) data. Although the resulting objective is a DC (Difference-of-Convex) program, we prove that it becomes convex when expert demonstrations outweigh undesirable demonstrations, enabling a practical and stable non-adversarial training objective. Our method avoids adversarial training and handles both positive and negative demonstrations in a unified framework. Extensive experiments on standard offline imitation learning benchmarks demonstrate that our approach consistently outperforms state-of-the-art baselines.


Article 959

Title@2025-05-27 (2): Latent label distribution grid representation for modeling uncertainty

Title: Latent label distribution grid representation for modeling uncertainty

Latent Label Distribution Grid Darstellung für Modellierung Unsicherheit

用于模拟不确定性模型的延迟标签分配网格代表 2505.21180v1

Authors: ShuNing Sun, YinSong Xiong, Yu Zhang, Zhuoran Zheng

Although Label Distribution Learning (LDL) has promising representation capabilities for characterizing the polysemy of an instance, the complexity and high cost of label distribution annotation lead to inexactness in the construction of the label space. The existence of a large number of inexact labels generates a label space with uncertainty, which misleads the LDL algorithm into making incorrect decisions. To alleviate this problem, we model the uncertainty of label distributions by constructing a Latent Label Distribution Grid (LLDG) to form a low-noise representation space. Specifically, we first construct a label correlation matrix based on the differences between labels, and then expand each value of the matrix into a vector that obeys a Gaussian distribution, thus building an LLDG to model the uncertainty of the label space. Finally, the LLDG is reconstructed by the LLDG-Mixer to generate an accurate label distribution. Note that we enforce a customized low-rank scheme on this grid, which assumes that the label relations may be noisy and therefore performs noise reduction with the help of a Tucker reconstruction technique. Furthermore, we attempt to evaluate the effectiveness of the LLDG by considering its generation as an upstream task to achieve the classification of the objects. Extensive experimental results show that our approach performs competitively on several benchmarks.


Article 960

Title@2025-05-27 (2): Improved Online Confidence Bounds for Multinomial Logistic Bandits

Title: Improved Online Confidence Bounds for Multinomial Logistic Bandits

Verbesserte Online-Konfidenzgrenzen für multinomiale Logistische Banditen

提高多军后勤大盗的在线信任度 2502.10020v4

Authors: Joongkyu Lee, Min-hwan Oh

In this paper, we propose an improved online confidence bound for multinomial logistic (MNL) models and apply this result to MNL bandits, achieving variance-dependent optimal regret. Recently, Lee & Oh (2024) established an online confidence bound for MNL models and achieved nearly minimax-optimal regret in MNL bandits. However, their results still depend on the norm-boundedness of the unknown parameter $B$ and the maximum size of possible outcomes $K$. To address this, we first derive an online confidence bound of $O\left(\sqrt{d \log t} + B \right)$, which is a significant improvement over the previous bound of $O (B \sqrt{d} \log t \log K )$ (Lee & Oh, 2024). This is mainly achieved by establishing tighter self-concordant properties of the MNL loss and introducing a novel intermediary term to bound the estimation error. Using this new online confidence bound, we propose a constant-time algorithm, OFU-MNL++, which achieves a variance-dependent regret bound of $O \Big( d \log T \sqrt{ \sum_{t=1}^T \sigma_t^2 } \Big) $ for sufficiently large $T$, where $\sigma_t^2$ denotes the variance of the rewards at round $t$, $d$ is the dimension of the contexts, and $T$ is the total number of rounds. Furthermore, we introduce a Maximum Likelihood Estimation (MLE)-based algorithm, OFU-MN$^2$L, which achieves an anytime poly(B)-free regret of $O \Big( d \log (BT) \sqrt{ \sum_{t=1}^T \sigma_t^2 } \Big) $.


Article 961

Title@2025-05-27 (2): Topological Deep Learning for Speech Data

Title: Topological Deep Learning for Speech Data

Topologisches Deep Learning für Sprachdaten

为语音数据进行地形深层学习 2505.21173v1

Authors: Zhiwang Yu

Topological data analysis (TDA) offers novel mathematical tools for deep learning. Inspired by Carlsson et al., this study designs topology-aware convolutional kernels that significantly improve speech recognition networks. Theoretically, by investigating orthogonal group actions on kernels, we establish a fiber-bundle decomposition of matrix spaces, enabling new filter generation methods. Practically, our proposed Orthogonal Feature (OF) layer achieves superior performance in phoneme recognition, particularly in low-noise scenarios, while demonstrating cross-domain adaptability. This work reveals TDA’s potential in neural network optimization, opening new avenues for mathematics-deep learning interdisciplinary studies.


Article 962

Title@2025-05-27 (2): Parameter Efficient Continual Learning with Dynamic Low-Rank Adaptation

Title: Parameter Efficient Continual Learning with Dynamic Low-Rank Adaptation

Parameter Effizientes kontinuierliches Lernen mit dynamischer Low-Rank-Anpassung

具有动态低Rank适应性的持续学习 2505.11998v2

Authors: Prashant Shivaram Bhat, Shakib Yazdani, Elahe Arani, Bahram Zonooz

Catastrophic forgetting has remained a critical challenge for deep neural networks in Continual Learning (CL) as it undermines consolidated knowledge when learning new tasks. Parameter-efficient fine-tuning CL techniques are gaining traction for their effectiveness in addressing catastrophic forgetting with a lightweight training schedule while avoiding degradation of consolidated knowledge in pre-trained models. However, low-rank adapters (LoRA) in these approaches are highly sensitive to rank selection, which can lead to sub-optimal resource allocation and performance. To this end, we introduce PEARL, a rehearsal-free CL framework that entails dynamic rank allocation for LoRA components during CL training. Specifically, PEARL leverages reference task weights and adaptively determines the rank of task-specific LoRA components based on the current tasks’ proximity to reference task weights in parameter space. To demonstrate the versatility of PEARL, we evaluate it across three vision architectures (ResNet, Separable Convolutional Network and Vision Transformer) and a multitude of CL scenarios, and show that PEARL outperforms all considered baselines by a large margin.
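
To make the role of the rank concrete, here is a minimal pure-Python sketch of a generic LoRA layer: the frozen weight W is corrected by a low-rank product B·A, and the number of trainable parameters grows linearly with the rank. The names `lora_forward`, `lora_param_count`, and `scale` are illustrative assumptions. The abstract does not specify PEARL's rank-allocation rule, so it is not reproduced here.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """Compute y = W x + scale * B (A x).

    W is the frozen (d_out x d_in) weight; A is (r x d_in) and B is (d_out x r),
    so B A is a rank-at-most-r correction. Only A and B would be trained.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters of one adapter: r * d_in for A plus d_out * r for B."""
    return rank * (d_in + d_out)


# Toy example (hypothetical values): identity base weight with a rank-1 adapter.
W = [[1, 0], [0, 1]]
A = [[1, 1]]        # 1 x 2
B = [[1], [0]]      # 2 x 1
y = lora_forward(W, A, B, [2, 3])
```

Because the parameter count is `rank * (d_in + d_out)`, doubling the rank doubles the adapter's cost. This is why a fixed rank can over- or under-allocate capacity across tasks.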


Article 963

Title@2025-05-27 (2): STEB: In Search of the Best Evaluation Approach for Synthetic Time Series

Title: STEB: In Search of the Best Evaluation Approach for Synthetic Time Series

STEB: Auf der Suche nach dem besten Bewertungsansatz für die Synthetische Zeitreihe

STEB:寻求合成时间系列的最佳评价方法 2505.21160v1

Authors: Michael Stenger, Robert Leppich, André Bauer, Samuel Kounev

The growing need for synthetic time series, due to data augmentation or privacy regulations, has led to numerous generative models, frameworks, and evaluation measures alike. Objectively comparing these measures on a large scale remains an open challenge. We propose the Synthetic Time series Evaluation Benchmark (STEB) – the first benchmark framework that enables comprehensive and interpretable automated comparisons of synthetic time series evaluation measures. Using 10 diverse datasets, randomness injection, and 13 configurable data transformations, STEB computes indicators for measure reliability and score consistency. It tracks running time, test errors, and features sequential and parallel modes of operation. In our experiments, we determine a ranking of 41 measures from literature and confirm that the choice of upstream time series embedding heavily impacts the final score.


Article 964

Title@2025-05-27 (2): Model as Loss: A Self-Consistent Training Paradigm

Title: Model as Loss: A Self-Consistent Training Paradigm

Modell als Verlust: Ein selbstkonsistentes Trainingsparadigma

损失模型:自我协调培训模型 2505.21156v1

Authors: Saisamarth Rajesh Phaye, Milos Cernak, Andrew Harper

Conventional methods for speech enhancement rely on handcrafted loss functions (e.g., time or frequency domain losses) or deep feature losses (e.g., using WavLM or wav2vec), which often fail to capture subtle signal properties essential for optimal performance. To address this, we propose Model as Loss, a novel training paradigm that utilizes the encoder from the same model as a loss function to guide the training. The Model as Loss paradigm leverages the encoder’s task-specific feature space, optimizing the decoder to produce output consistent with perceptual and task-relevant characteristics of the clean signal. By using the encoder’s learned features as a loss function, this framework enforces self-consistency between the clean reference speech and the enhanced model output. Our approach outperforms pre-trained deep feature losses on standard speech enhancement benchmarks, offering better perceptual quality and robust generalization to both in-domain and out-of-domain datasets.


Article 965

Title@2025-05-27 (2): FlexiReg: Flexible Urban Region Representation Learning

Title: FlexiReg: Flexible Urban Region Representation Learning

FlexiReg: Flexibles Stadtraum-Repräsentanz-Lernen

灵活的城市地区代表性学习:灵活的城市地区代表性学习 2503.09128v2

Authors: Fengze Sun, Yanchuan Chang, Egemen Tanin, Shanika Karunasekera, Jianzhong Qi

The increasing availability of urban data offers new opportunities for learning region representations, which can be used as input to machine learning models for downstream tasks such as check-in or crime prediction. While existing solutions have produced promising results, an issue is their fixed formation of regions and fixed input region features, which may not suit the needs of different downstream tasks. To address this limitation, we propose a model named FlexiReg for urban region representation learning that is flexible with both the formation of urban regions and the input region features. FlexiReg is based on a spatial grid partitioning over the spatial area of interest. It learns representations for the grid cells, leveraging publicly accessible data, including POI, land use, satellite imagery, and street view imagery. We propose adaptive aggregation to fuse the cell representations and prompt learning techniques to tailor the representations towards different tasks, addressing the needs of varying formations of urban regions and downstream tasks. Extensive experiments on five real-world datasets demonstrate that FlexiReg outperforms state-of-the-art models by up to 202% in terms of the accuracy of four diverse downstream tasks using the produced urban region representations.


Article 966

Title@2025-05-27 (2): Predicate Invention for Bilevel Planning

Title: Predicate Invention for Bilevel Planning

Prädikat Erfindung für Bilevel-Planung

双级规划预发明 2203.09634v3

Authors: Tom Silver, Rohan Chitnis, Nishanth Kumar, Willie McClinton, Tomas Lozano-Perez, Leslie Pack Kaelbling, Joshua Tenenbaum

Efficient planning in continuous state and action spaces is fundamentally hard, even when the transition model is deterministic and known. One way to alleviate this challenge is to perform bilevel planning with abstractions, where a high-level search for abstract plans is used to guide planning in the original transition space. Previous work has shown that when state abstractions in the form of symbolic predicates are hand-designed, operators and samplers for bilevel planning can be learned from demonstrations. In this work, we propose an algorithm for learning predicates from demonstrations, eliminating the need for manually specified state abstractions. Our key idea is to learn predicates by optimizing a surrogate objective that is tractable but faithful to our real efficient-planning objective. We use this surrogate objective in a hill-climbing search over predicate sets drawn from a grammar. Experimentally, we show across four robotic planning environments that our learned abstractions are able to quickly solve held-out tasks, outperforming six baselines. Code: https://tinyurl.com/predicators-release


Article 967

Title@2025-05-27 (2): Semi-Supervised Conformal Prediction With Unlabeled Nonconformity Score

Title: Semi-Supervised Conformal Prediction With Unlabeled Nonconformity Score

Halbüberwachte konforme Vorhersage mit nicht markiertem Nonkonformity Score

带有未贴标签的不合规分数的半超半常规预测 2505.21147v1

Authors: Xuanning Zhou, Hao Zeng, Xiaobo Xia, Bingyi Jing, Hongxin Wei

Conformal prediction (CP) is a powerful framework for uncertainty quantification, providing prediction sets with coverage guarantees when calibrated on sufficient labeled data. However, in real-world applications where labeled data is often limited, standard CP can lead to coverage deviation and output overly large prediction sets. In this paper, we extend CP to the semi-supervised setting and propose SemiCP, leveraging both labeled data and unlabeled data for calibration. Specifically, we introduce a novel nonconformity score function, NNM, designed for unlabeled data. This function selects labeled data with similar pseudo-label scores to estimate nonconformity scores, integrating them into the calibration process to overcome sample size limitations. We theoretically demonstrate that, under mild assumptions, SemiCP provides an asymptotic coverage guarantee for prediction sets. Extensive experiments further validate that our approach effectively reduces instability and inefficiency under limited calibration data, can be adapted to conditional coverage settings, and integrates seamlessly with existing CP methods.
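
For context, the standard (fully labeled) split conformal calibration step that SemiCP extends can be sketched as follows. This is the textbook procedure, not the NNM score or the SemiCP algorithm, whose details the abstract does not give. The score `1 - p(label)` and the names `conformal_quantile` and `prediction_set` are illustrative assumptions.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: every label must be kept.
        return math.inf
    return sorted(scores)[rank - 1]


def prediction_set(probs, threshold):
    """Keep every label whose nonconformity score 1 - p(label) is within the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]


# Hypothetical calibration scores: 1 - (model probability of the true label).
calibration_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80]
q_hat = conformal_quantile(calibration_scores, alpha=0.2)
labels = prediction_set([0.7, 0.25, 0.05], q_hat)
```

With only a handful of calibration scores, the threshold jumps between widely spaced order statistics, or becomes infinite. This is the small-sample instability that motivates drawing on unlabeled data as well.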


Article 968

Title@2025-05-27 (2): A Distributional Treatment of Real2Sim2Real for Object-Centric Agent Adaptation in Vision-Driven Deformable Linear Object Manipulation

Title: A Distributional Treatment of Real2Sim2Real for Object-Centric Agent Adaptation in Vision-Driven Deformable Linear Object Manipulation

Eine distributive Behandlung von Real2Sim2Real für die Anpassung an Objekt-Zentrische Agenten in visionsgetriebener, deformierbarer linearer Objektmanipulation

在视觉-驱动式可变线性物体操纵中用于物体中心剂适应的Real2Sim2Real的分布式处理法 2502.18615v2

Authors: Georgios Kamaras, Subramanian Ramamoorthy

We present an integrated (or end-to-end) framework for the Real2Sim2Real problem of manipulating deformable linear objects (DLOs) based on visual perception. Working with a parameterised set of DLOs, we use likelihood-free inference (LFI) to compute posterior distributions over the physical parameters, with which we can approximately simulate the behaviour of each specific DLO. We use these posteriors for domain randomisation while training, in simulation, object-specific visuomotor policies (i.e. assuming only visual and proprioceptive sensing) for a DLO reaching task, using model-free reinforcement learning. We demonstrate the utility of this approach by deploying sim-trained DLO manipulation policies in the real world in a zero-shot manner, i.e. without any further fine-tuning. In this context, we evaluate the capacity of a prominent LFI method to perform fine classification over the parametric set of DLOs, using only visual and proprioceptive data obtained in a dynamic manipulation trajectory. We then study the implications of the resulting domain distributions in sim-based policy learning and real-world performance.


Article 969

Title@2025-05-27 (2): Hallucinations are inevitable but can be made statistically negligible. The “innate” inevitability of hallucinations cannot explain practical LLM issues

Title: Hallucinations are inevitable but can be made statistically negligible. The “innate” inevitability of hallucinations cannot explain practical LLM issues

Halluzinationen sind unvermeidlich, können aber statistisch vernachlässigbar gemacht werden. Die “angeborene” Unvermeidbarkeit von Halluzinationen kann praktische LLM-Probleme nicht erklären

幻觉的“内在”不可避免性无法解释实际的LLM问题。 2502.12187v2

Authors: Atsushi Suzuki, Yulan He, Feng Tian, Zhongyuan Wang

Hallucinations, a phenomenon where a language model (LM) generates nonfactual content, pose a significant challenge to the practical deployment of LMs. While many empirical methods have been proposed to mitigate hallucinations, recent studies established a computability-theoretic result showing that any LM will inevitably generate hallucinations on an infinite set of inputs, regardless of the quality and quantity of training datasets and the choice of the language model architecture and training and inference algorithms. Although the computability-theoretic result may seem pessimistic, its significance from a practical viewpoint has remained unclear. This paper claims that these “innate” inevitability results, which stem from computability theory and diagonal arguments, cannot in principle explain practical issues of LLMs. We demonstrate this claim by presenting a positive theoretical result from a probabilistic perspective. Specifically, we prove that hallucinations can be made statistically negligible, provided that the quality and quantity of the training data are sufficient. Interestingly, our positive result coexists with the computability-theoretic result, implying that while hallucinations on an infinite set of inputs cannot be entirely eliminated, their probability can always be reduced by improving algorithms and training data. By evaluating the two seemingly contradictory results through the lens of information theory, we argue that our probability-theoretic positive result better reflects practical considerations than the computability-theoretic negative result.


Article 970

Title@2025-05-27 (2): A Predicting Phishing Websites Using Support Vector Machine and MultiClass Classification Based on Association Rule Techniques

Title: A Predicting Phishing Websites Using Support Vector Machine and MultiClass Classification Based on Association Rule Techniques

Eine Vorhersage Phishing-Websites mit Unterstützung Vektor-Maschine und Multi-Klasse Klassifizierung basierend auf Assoziation Regel Techniken

基于协会规则技术的利用辅助病媒机和多类分类的预测钓鱼网站 2505.21141v1

Authors: Nancy C. Woods, Virtue Ene Agada, Adebola K. Ojo

Phishing is a semantic attack that targets the user rather than the computer. It is a relatively new Internet crime compared with other forms such as viruses and hacking. Considering the damage phishing websites have caused to various economies by collapsing organizations, stealing information, and diverting funds, researchers have explored different ways of detecting phishing websites, but there has been no agreement on the best algorithm for prediction. This study integrates the strengths of two algorithms, Support Vector Machines (SVM) and Multi-Class Classification Rules based on Association Rules (MCAR), to establish a stronger means of predicting phishing websites. A total of 11,056 websites from PhishTank and the Yahoo directory were used to verify the effectiveness of this approach. Feature extraction and rule generation were done by the MCAR technique; classification and prediction were done by the SVM technique. The technique achieved 98.30% classification accuracy with a computation time of 2205.33s and a minimum error rate. It achieved an Area under the Curve (AUC) of 98%, which indicates how accurately phishing websites are classified. Based on the coefficient of determination, the model explained 82.84% of the variance in the prediction of phishing websites. Using the two techniques together produced a more accurate result, as it combined the strengths of both. This research work centered on this advantage by building a hybrid of the two techniques.


Article 971

Title@2025-05-27 (2): HeteroBA: A Structure-Manipulating Backdoor Attack on Heterogeneous Graphs

Title: HeteroBA: A Structure-Manipulating Backdoor Attack on Heterogeneous Graphs

HeteroBA: Ein strukturmanipulierender Backdoor-Angriff auf Heterogene Graphen

异型BA:结构调节式后门对异种图的后门攻击 2505.21140v1

Authors: Honglin Gao, Xiang Li, Lan Zhao, Gaoxi Xiao

Heterogeneous graph neural networks (HGNNs) have recently drawn increasing attention for modeling complex multi-relational data in domains such as recommendation, finance, and social networks. While existing research has been largely focused on enhancing HGNNs’ predictive performance, their robustness and security, especially under backdoor attacks, remain underexplored. In this paper, we propose a novel Heterogeneous Backdoor Attack (HeteroBA) framework for node classification tasks on heterogeneous graphs. HeteroBA inserts carefully crafted trigger nodes with realistic features and targeted structural connections, leveraging attention-based and clustering-based strategies to select influential auxiliary nodes for effective trigger propagation, thereby causing the model to misclassify specific nodes into a target label while maintaining accuracy on clean data. Experimental results on three datasets and various HGNN architectures demonstrate that HeteroBA achieves high attack success rates with minimal impact on the clean accuracy. Our method sheds light on potential vulnerabilities in HGNNs and calls for more robust defenses against backdoor threats in multi-relational graph scenarios.


Article 972

Title@2025-05-27 (2): Identifying Heart Attack Risk in Vulnerable Population: A Machine Learning Approach

Title: Identifying Heart Attack Risk in Vulnerable Population: A Machine Learning Approach

Identifikation von Herzinfarktrisiko in gefährdeter Bevölkerung: Ein Ansatz zum maschinellen Lernen

查明弱势人口中的心脏攻击风险:机械学习方法 2505.21139v1

Authors: Subhagata Chattopadhyay, Amit K Chattopadhyay

The COVID-19 pandemic has significantly increased the incidence of post-infection cardiovascular events, particularly myocardial infarction, in individuals over 40. While the underlying mechanisms remain elusive, this study employs a hybrid machine learning approach to analyze epidemiological data in assessing 13 key heart attack risk factors and their susceptibility. Based on a unique dataset that combines demographic, biochemical, ECG, and thallium stress-test data, this study categorizes distinct subpopulations against varying risk profiles and then divides the population into ‘at-risk’ (AR) and ‘not-at-risk’ (NAR) groups using clustering algorithms. The study reveals a strong association between the likelihood of experiencing a heart attack and the 13 risk factors studied. The aggravated risk for postmenopausal patients indicates compromised individual risk factors due to estrogen depletion that may be further compromised by extraneous stress impacts, like anxiety and fear, aspects that have traditionally eluded data modeling predictions.


Article 973

Title@2025-05-27 (2): Learning Single Index Models with Diffusion Priors

Title: Learning Single Index Models with Diffusion Priors

Einzelindexmodelle mit Diffusion Priors lernen

具有传播前版本的学习单一指数模式 2505.21135v1

Authors: Anqi Tang, Youming Chen, Shuchen Xue, Zhaoqiang Liu

Diffusion models (DMs) have demonstrated remarkable ability to generate diverse and high-quality images by efficiently modeling complex data distributions. They have also been explored as powerful generative priors for signal recovery, resulting in a substantial improvement in the quality of reconstructed signals. However, existing research on signal recovery with diffusion models either focuses on specific reconstruction problems or is unable to handle nonlinear measurement models with discontinuous or unknown link functions. In this work, we focus on using DMs to achieve accurate recovery from semi-parametric single index models, which encompass a variety of popular nonlinear models that may have discontinuous and unknown link functions. We propose an efficient reconstruction method that only requires one round of unconditional sampling and (partial) inversion of DMs. Theoretical analysis on the effectiveness of the proposed methods has been established under appropriate conditions. We perform numerical experiments on image datasets for different nonlinear measurement models. We observe that compared to competing methods, our approach can yield more accurate reconstructions while utilizing significantly fewer neural function evaluations.


Article 974

Title@2025-05-27 (2): Robust and Computation-Aware Gaussian Processes

Title: Robust and Computation-Aware Gaussian Processes

Robuste und rechnergestützte Gaußsche Prozesse

强力和计算- 软件软件高斯进程 2505.21133v1

Authors: Marshal Arijona Sinaga, Julien Martinelli, Samuel Kaski

Gaussian processes (GPs) are widely used for regression and optimization tasks such as Bayesian optimization (BO) due to their expressiveness and principled uncertainty estimates. However, in settings with large datasets corrupted by outliers, standard GPs and their sparse approximations struggle with computational tractability and robustness. We introduce Robust Computation-aware Gaussian Process (RCaGP), a novel GP model that jointly addresses these challenges by combining a principled treatment of approximation-induced uncertainty with robust generalized Bayesian updating. The key insight is that robustness and approximation-awareness are not orthogonal but intertwined: approximations can exacerbate the impact of outliers, and mitigating one without the other is insufficient. Unlike previous work that focuses narrowly on either robustness or approximation quality, RCaGP combines both in a principled and scalable framework, thus effectively managing both outliers and computational uncertainties introduced by approximations such as low-rank matrix multiplications. Our model ensures more conservative and reliable uncertainty estimates, a property we rigorously demonstrate. Additionally, we establish a robustness property and show that the mean function is key to preserving it, motivating a tailored model selection scheme for robust mean functions. Empirical results confirm that solving these challenges jointly leads to superior performance across both clean and outlier-contaminated settings, both on regression and high-throughput Bayesian optimization benchmarks.


Article 975

Title@2025-05-27 (2): Backpropagation-free Spiking Neural Networks with the Forward-Forward Algorithm

Title: Backpropagation-free Spiking Neural Networks with the Forward-Forward Algorithm

Rückpropagierungsfreie Spiking-Neural-Netzwerke mit dem vorwärts-vorwärts-Algorithmus

带有前向前向演算法的无后向反向反向光谱反向神经网络 2502.20411v2

Authors: Mohammadnavid Ghader, Saeed Reza Kheradpisheh, Bahar Farahani, Mahmood Fazlali

Spiking Neural Networks (SNNs) offer a biologically inspired computational paradigm that emulates neuronal activity through discrete spike-based processing. Despite their advantages, training SNNs with traditional backpropagation (BP) remains challenging due to computational inefficiencies and a lack of biological plausibility. This study explores the Forward-Forward (FF) algorithm as an alternative learning framework for SNNs. Unlike backpropagation, which relies on forward and backward passes, the FF algorithm employs two forward passes, enabling layer-wise localized learning, enhanced computational efficiency, and improved compatibility with neuromorphic hardware. We introduce an FF-based SNN training framework and evaluate its performance across both non-spiking (MNIST, Fashion-MNIST, Kuzushiji-MNIST) and spiking (Neuro-MNIST, SHD) datasets. Experimental results demonstrate that our model surpasses existing FF-based SNNs on evaluated static datasets with a much lighter architecture while achieving accuracy comparable to state-of-the-art backpropagation-trained SNNs. On more complex spiking tasks such as SHD, our approach outperforms other SNN models and remains competitive with leading backpropagation-trained SNNs. These findings highlight the FF algorithm’s potential to advance SNN training methodologies by addressing some key limitations of backpropagation.
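
The layer-local objective behind the two forward passes can be illustrated with the generic Forward-Forward formulation: each layer's "goodness" (sum of squared activations) should exceed a threshold on positive data and fall below it on negative data. The sketch below is non-spiking and uses hypothetical names (`goodness`, `ff_layer_loss`, `theta`). How the paper defines goodness for spiking activity is not stated in the abstract, so treat this as an assumption-laden illustration rather than the authors' method.

```python
import math


def goodness(activations):
    """Goodness of a layer: the sum of its squared activations."""
    return sum(a * a for a in activations)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def ff_layer_loss(pos_activations, neg_activations, theta):
    """Layer-local loss from one positive and one negative forward pass.

    The loss is small when positive goodness is above theta and
    negative goodness is below theta; no backward pass through
    other layers is needed to evaluate it.
    """
    pos_term = softplus(theta - goodness(pos_activations))
    neg_term = softplus(goodness(neg_activations) - theta)
    return pos_term + neg_term


# Toy activations (hypothetical): a well-separated layer versus the swapped case.
well_separated = ff_layer_loss([2.0, 2.0], [0.1, 0.1], theta=2.0)
swapped = ff_layer_loss([0.1, 0.1], [2.0, 2.0], theta=2.0)
```

Since the loss depends only on a single layer's activations, each layer can be updated independently. This is the property the abstract ties to neuromorphic hardware compatibility.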


Article 976

Title@2025-05-27 (2): MetaGS: A Meta-Learned Gaussian-Phong Model for Out-of-Distribution 3D Scene Relighting

Title: MetaGS: A Meta-Learned Gaussian-Phong Model for Out-of-Distribution 3D Scene Relighting

MetaGS: Ein meta-erlerntes Gaussian-Phong-Modell für 3D-Szenen-Erhellung im Out-of-Distribution-Bereich

MetaGS: 3D号场景光化模型 2405.20791v2

Authors: Yumeng He, Yunbo Wang, Xiaokang Yang

Out-of-distribution (OOD) 3D relighting requires novel view synthesis under unseen lighting conditions that differ significantly from the observed images. Existing relighting methods, which assume consistent light source distributions between training and testing, often degrade in OOD scenarios. We introduce MetaGS to tackle this challenge from two perspectives. First, we propose a meta-learning approach to train 3D Gaussian splatting, which explicitly promotes learning generalizable Gaussian geometries and appearance attributes across diverse lighting conditions, even with biased training data. Second, we embed fundamental physical priors from the Blinn-Phong reflection model into Gaussian splatting, which enhances the decoupling of shading components and leads to more accurate 3D scene reconstruction. Results on both synthetic and real-world datasets demonstrate the effectiveness of MetaGS in challenging OOD relighting tasks, supporting efficient point-light relighting and generalizing well to unseen environment lighting maps.
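
For readers unfamiliar with the Blinn-Phong reflection model, the textbook version sums an ambient term, a diffuse term proportional to n·l, and a specular term driven by the half vector between the light and view directions. The scalar sketch below shows only that classical formula. The function and parameter names are illustrative assumptions, and the way MetaGS embeds these priors into Gaussian splatting is not described in the abstract.

```python
import math


def normalize(v):
    """Return v scaled to unit length (the zero vector is returned unchanged)."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        return tuple(v)
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def blinn_phong(normal, light_dir, view_dir,
                k_ambient, k_diffuse, k_specular, shininess):
    """Scalar Blinn-Phong intensity for one point light of unit strength."""
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    # Half vector between the directions to the light and to the viewer.
    h = normalize(tuple(a + b for a, b in zip(l, v)))
    n_dot_l = dot(n, l)
    diffuse = k_diffuse * max(n_dot_l, 0.0)
    # Common convention: no highlight when the light is behind the surface.
    specular = k_specular * max(dot(n, h), 0.0) ** shininess if n_dot_l > 0 else 0.0
    return k_ambient + diffuse + specular


# Light and viewer both directly above the surface: all three terms contribute fully.
head_on = blinn_phong((0, 0, 1), (0, 0, 1), (0, 0, 1), 0.1, 0.6, 0.3, 32)
# Light behind the surface: only the ambient term remains.
back_lit = blinn_phong((0, 0, 1), (0, 0, -1), (0, 0, 1), 0.1, 0.6, 0.3, 32)
```

Keeping the diffuse and specular terms separate is what allows the shading components to be decoupled when the light position changes.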

Article 977

Title@2025-05-27 (2): Universal Value-Function Uncertainties

Title: Universal Value-Function Uncertainties

Universelle Wert-Funktions-Unsicherheiten

通用价值-功能不确定性 2505.21119v1

Authors: Moritz A. Zanger, Max Weltevrede, Yaniv Oren, Pascal R. Van der Vaart, Caroline Horsch, Wendelin Böhmer, Matthijs T. J. Spaan

Estimating epistemic uncertainty in value functions is a crucial challenge for many aspects of reinforcement learning (RL), including efficient exploration, safe decision-making, and offline RL. While deep ensembles provide a robust method for quantifying value uncertainty, they come with significant computational overhead. Single-model methods, while computationally favorable, often rely on heuristics and typically require additional propagation mechanisms for myopic uncertainty estimates. In this work we introduce universal value-function uncertainties (UVU), which, similar in spirit to random network distillation (RND), quantify uncertainty as squared prediction errors between an online learner and a fixed, randomly initialized target network. Unlike RND, UVU errors reflect policy-conditional value uncertainty, incorporating the future uncertainties any given policy may encounter. This is due to the training procedure employed in UVU: the online network is trained using temporal difference learning with a synthetic reward derived from the fixed, randomly initialized target network. We provide an extensive theoretical analysis of our approach using neural tangent kernel (NTK) theory and show that in the limit of infinite network width, UVU errors are exactly equivalent to the variance of an ensemble of independent universal value functions. Empirically, we show that UVU achieves equal performance to large ensembles on challenging multi-task offline RL settings, while offering simplicity and substantial computational savings.

Article 978

Title@2025-05-27 (2): A Lightweight Multi-Expert Generative Language Model System for Engineering Information and Knowledge Extraction

Title: A Lightweight Multi-Expert Generative Language Model System for Engineering Information and Knowledge Extraction

Ein leichtes Multi-Expert Generatives Sprachmodellsystem für Engineering Information and Knowledge Extraction

工程信息和知识采掘轻量多专家生成语言示范系统 2505.21109v1

Authors: Bogdan Bogachov, Yaoyao Fiona Zhao

Despite recent advancements in domain adaptation techniques for large language models, these methods remain computationally intensive, and the resulting models can still exhibit hallucination issues. Most existing adaptation methods do not prioritize reducing the computational resources required for fine-tuning and inference of language models. Hallucination issues have gradually decreased with each new model release. However, they remain prevalent in engineering contexts, where generating well-structured text with minimal errors and inconsistencies is critical. This work introduces a novel approach called the Small Language Graph (SLG), which is a lightweight adaptation solution designed to address the two key challenges outlined above. The system is structured in the form of a graph, where each node represents a lightweight expert - a small language model fine-tuned on specific and concise texts. The results of this study have shown that SLG was able to surpass conventional fine-tuning methods on the Exact Match metric by 3 times. Additionally, the fine-tuning process was 1.7 times faster compared to that of a larger stand-alone language model. These findings introduce a potential for small to medium-sized engineering companies to confidently use generative AI technologies, such as LLMs, without the necessity to invest in expensive computational resources. Also, the graph architecture and the small size of expert nodes offer a possible opportunity for distributed AI systems, thus potentially diverting the global need for expensive centralized compute clusters.

Article 979

Title@2025-05-27 (2): Conditional Diffusion Models with Classifier-Free Gibbs-like Guidance

Title: Conditional Diffusion Models with Classifier-Free Gibbs-like Guidance

Bedingte Diffusionsmodelle mit klassifikatorfreier Gibbs-ähnlicher Anleitung

有条件传播模式,附有无分类者免费吉布布斯类指南 2505.21101v1

Authors: Badr Moufad, Yazid Janati, Alain Durmus, Ahmed Ghorbel, Eric Moulines, Jimmy Olsson

Classifier-Free Guidance (CFG) is a widely used technique for improving conditional diffusion models by linearly combining the outputs of conditional and unconditional denoisers. While CFG enhances visual quality and improves alignment with prompts, it often reduces sample diversity, leading to a challenging trade-off between quality and diversity. To address this issue, we make two key contributions. First, CFG generally does not correspond to a well-defined denoising diffusion model (DDM). In particular, contrary to common intuition, CFG does not yield samples from the target distribution associated with the limiting CFG score as the noise level approaches zero – where the data distribution is tilted by a power $w > 1$ of the conditional distribution. We identify the missing component: a Rényi divergence term that acts as a repulsive force and is required to correct CFG and render it consistent with a proper DDM. Our analysis shows that this correction term vanishes in the low-noise limit. Second, motivated by this insight, we propose a Gibbs-like sampling procedure to draw samples from the desired tilted distribution. This method starts with an initial sample from the conditional diffusion model without CFG and iteratively refines it, preserving diversity while progressively enhancing sample quality. We evaluate our approach on both image and text-to-audio generation tasks, demonstrating substantial improvements over CFG across all considered metrics. The code is available at https://github.com/yazidjanati/cfgig
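
The linear combination of conditional and unconditional denoiser outputs can be sketched as below. This shows only standard CFG under one common convention (scale 1 recovers the conditional output), not the Gibbs-like procedure the paper proposes; the function name and the toy vectors are assumptions.

```python
def cfg_combine(eps_cond, eps_uncond, w):
    """Standard classifier-free guidance: extrapolate from the unconditional
    denoiser output towards the conditional one with guidance scale w."""
    return [u + w * (c - u) for c, u in zip(eps_cond, eps_uncond)]

eps_cond = [1.0, 2.0]     # hypothetical conditional denoiser output
eps_uncond = [0.0, 1.0]   # hypothetical unconditional denoiser output

plain = cfg_combine(eps_cond, eps_uncond, 1.0)    # w = 1: conditional output
guided = cfg_combine(eps_cond, eps_uncond, 2.0)   # w > 1: pushed past it
```

With `w > 1` the result moves beyond the conditional prediction, which is the regime where the abstract reports reduced sample diversity.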

Article 980

Title@2025-05-27 (2): Random Walk Diffusion for Efficient Large-Scale Graph Generation

Title: Random Walk Diffusion for Efficient Large-Scale Graph Generation

Random Walk Diffusion für effiziente großformatige Graphengeneration

高效大型图表生成的随机漫步扩散 2408.04461v2

Authors: Tobias Bernecker, Ghalia Rehawi, Francesco Paolo Casale, Janine Knauer-Arloth, Annalisa Marsico

Graph generation addresses the problem of generating new graphs that have a data distribution similar to real-world graphs. While previous diffusion-based graph generation methods have shown promising results, they often struggle to scale to large graphs. In this work, we propose ARROW-Diff (AutoRegressive RandOm Walk Diffusion), a novel random walk-based diffusion approach for efficient large-scale graph generation. Our method encompasses two components in an iterative process of random walk sampling and graph pruning. We demonstrate that ARROW-Diff can scale to large graphs efficiently, surpassing other baseline methods in terms of both generation time and multiple graph statistics, reflecting the high quality of the generated graphs.

Article 981

Title@2025-05-27 (2): Do you see what I see? An Ambiguous Optical Illusion Dataset exposing limitations of Explainable AI

Title: Do you see what I see? An Ambiguous Optical Illusion Dataset exposing limitations of Explainable AI

Sehen Sie, was ich sehe? Ein Ambiguous Optical Illusion Dataset, das Beschränkungen der erklärbaren KI aufdeckt

你看到我所看到的吗?一个模糊的光学幻影数据集暴露了可解释的人工智能的局限性。 2505.21589v1

Authors: Carina Newen, Luca Hinkamp, Maria Ntonti, Emmanuel Müller

From uncertainty quantification to real-world object detection, we recognize the importance of machine learning algorithms, particularly in safety-critical domains such as autonomous driving or medical diagnostics. Ambiguous data plays an important role in various machine learning domains. Optical illusions present a compelling area of study in this context, as they offer insight into the limitations of both human and machine perception. Despite this relevance, optical illusion datasets remain scarce. In this work, we introduce a novel dataset of optical illusions featuring intermingled animal pairs designed to evoke perceptual ambiguity. We identify generalizable visual concepts, particularly gaze direction and eye cues, as subtle yet impactful features that significantly influence model accuracy. By confronting models with perceptual ambiguity, our findings underscore the importance of concepts in visual learning and provide a foundation for studying bias and alignment between human and machine vision. To make this dataset useful for general purposes, we generate optical illusions systematically with different concepts discussed in our bias mitigation section. The dataset is accessible on Kaggle via https://kaggle.com/datasets/693bf7c6dd2cb45c8a863f9177350c8f9849a9508e9d50526e2ffcc5559a8333. Our source code can be found at https://github.com/KDD-OpenSource/Ambivision.git.

Article 982

Title@2025-05-27 (2): Sequential Function-Space Variational Inference via Gaussian Mixture Approximation

Title: Sequential Function-Space Variational Inference via Gaussian Mixture Approximation

Sequentielle Funktions-Raum Variationelle Schlussfolgerung über Gaußsche Mischungsannäherung

通过高森混ixture近似加速发生序列函数-空间空间变动推断 2503.07114v2

Authors: Menghao Waiyan William Zhu, Pengcheng Hao, Ercan Engin Kuruoğlu

Continual learning in neural networks aims to learn new tasks without forgetting old tasks. Sequential function-space variational inference (SFSVI) uses a Gaussian variational distribution to approximate the distribution of the outputs of the neural network corresponding to a finite number of selected inducing points. Since the posterior distribution of a neural network is multi-modal, a Gaussian distribution could only match one mode of the posterior distribution, and a Gaussian mixture distribution could be used to better approximate the posterior distribution. We propose an SFSVI method based on a Gaussian mixture variational distribution. We also compare different types of variational inference methods with a fixed pre-trained feature extractor (where continual learning is performed on the final layer) and without a fixed pre-trained feature extractor (where continual learning is performed on all layers). We find that in terms of final average accuracy, likelihood-focused Gaussian mixture SFSVI outperforms other sequential variational inference methods, especially in the latter case.

Article 983

Title@2025-05-27 (2): Thinker: Learning to Think Fast and Slow

Title: Thinker: Learning to Think Fast and Slow

Denker: Schnell und langsam denken lernen

思考者:学会快速和缓慢思考 2505.21097v1

Authors: Stephen Chung, Wenyu Du, Jie Fu

Recent studies show that the reasoning capabilities of Large Language Models (LLMs) can be improved by applying Reinforcement Learning (RL) to question-answering (QA) tasks in areas such as math and coding. With a long context length, LLMs may learn to perform search, as indicated by the self-correction behavior observed in DeepSeek R1. However, this search behavior is often imprecise and lacks confidence, resulting in long, redundant responses and highlighting deficiencies in intuition and verification. Inspired by the Dual Process Theory in psychology, we introduce a simple modification to the QA task that includes four stages: Fast Thinking, where the LLM must answer within a strict token budget; Verification, where the model evaluates its initial response; Slow Thinking, where it refines the initial response with more deliberation; and Summarization, where it distills the refinement from the previous stage into precise steps. Our proposed task improves average accuracy from 24.9% to 27.9% for Qwen2.5-1.5B, and from 45.9% to 49.8% for DeepSeek-R1-Qwen-1.5B. Notably, for Qwen2.5-1.5B, the Fast Thinking mode alone achieves 26.8% accuracy using fewer than 1000 tokens, demonstrating substantial inference efficiency gains. These findings suggest that intuition and deliberative reasoning are distinct, complementary systems benefiting from targeted training.

Article 984

Title@2025-05-27 (2): Improved Impossible Tuning and Lipschitz-Adaptive Universal Online Learning with Gradient Variations

Title: Improved Impossible Tuning and Lipschitz-Adaptive Universal Online Learning with Gradient Variations

Verbessertes Unmögliches Tuning und Lipschitz-Adaptives Universal Online-Lernen mit gradienten Variationen

改进不可能的图金和利普施维茨-适应性通用在线学习,有渐进变异 2505.21095v1

Authors: Kei Takemura, Ryuta Matsuno, Keita Sakuma

A central goal in online learning is to achieve adaptivity to unknown problem characteristics, such as environmental changes captured by gradient variation (GV), function curvature (universal online learning, UOL), and gradient scales (Lipschitz adaptivity, LA). Simultaneously achieving these with optimal performance is a major challenge, partly due to limitations in algorithms for prediction with expert advice. These algorithms often serve as meta-algorithms in online ensemble frameworks, and their sub-optimality hinders overall UOL performance. Specifically, existing algorithms addressing the “impossible tuning” issue incur an excess $\sqrt{\log T}$ factor in their regret bound compared to the lower bound. To solve this problem, we propose a novel optimistic online mirror descent algorithm with an auxiliary initial round using large learning rates. This design enables a refined analysis where a generated negative term cancels the gap-related factor, resolving the impossible tuning issue up to $\log\log T$ factors. Leveraging our improved algorithm as a meta-algorithm, we develop the first UOL algorithm that simultaneously achieves state-of-the-art GV bounds and LA under standard assumptions. Our UOL result overcomes key limitations of prior works, notably resolving the conflict between LA mechanisms and regret analysis for GV bounds – an open problem highlighted by Xie et al.

Article 985

Title@2025-05-27 (2): Recurrent Memory for Online Interdomain Gaussian Processes

Title: Recurrent Memory for Online Interdomain Gaussian Processes

Recurrent Speicher für Online-Interdomain Gaussian Prozesse

Gaussian 在线内部进程经常性内存 2502.08736v3

Authors: Wenlong Chen, Naoki Kiyohara, Harrison Bo Hua Zhu, Jacob Curran-Sebastian, Samir Bhatt, Yingzhen Li

We propose a novel online Gaussian process (GP) model that is capable of capturing long-term memory in sequential data in an online learning setting. Our model, Online HiPPO Sparse Variational Gaussian Process (OHSVGP), leverages the HiPPO (High-order Polynomial Projection Operators) framework, which was popularized in the RNN domain due to its long-range memory modeling capabilities. We interpret the HiPPO time-varying orthogonal projections as inducing variables with time-dependent orthogonal polynomial basis functions, which allows the SVGP inducing variables to memorize the process history. We show that the HiPPO framework fits naturally into the interdomain GP framework and demonstrate that the kernel matrices can also be updated online in a recurrence form based on the ODE evolution of HiPPO. We evaluate OHSVGP on online prediction for 1D time series, continual learning in a discriminative GP model for data with multidimensional inputs, and deep generative modeling with a sparse Gaussian process variational autoencoder, showing that it outperforms existing online GP methods in terms of predictive performance, long-term memory preservation, and computational efficiency.

Article 986

Title@2025-05-27 (2): Out of the Shadows: Exploring a Latent Space for Neural Network Verification

Title: Out of the Shadows: Exploring a Latent Space for Neural Network Verification

Out of the Shadows: Erforschen eines latenten Raumes für neurale Netzwerkverifizierung

暗影外:探索神经网络的原始空间核查 2505.17854v2

Authors: Lukas Koller, Tobias Ladner, Matthias Althoff

Neural networks are ubiquitous. However, they are often sensitive to small input changes. Hence, to prevent unexpected behavior in safety-critical applications, their formal verification – a notoriously hard problem – is necessary. Many state-of-the-art verification algorithms use reachability analysis or abstract interpretation to enclose the set of possible outputs of a neural network. Often, the verification is inconclusive due to the conservatism of the enclosure. To address this problem, we design a novel latent space for formal verification that enables the transfer of output specifications to the input space for an iterative specification-driven input refinement, i.e., we iteratively reduce the set of possible inputs to only enclose the unsafe ones. The latent space is constructed from a novel view of projection-based set representations, e.g., zonotopes, which are commonly used in reachability analysis of neural networks. A projection-based set representation is a “shadow” of a higher-dimensional set – a latent space – that does not change during a set propagation through a neural network. Hence, the input set and the output enclosure are “shadows” of the same latent space that we can use to transfer constraints. We present an efficient verification tool for neural networks that uses our iterative refinement to significantly reduce the number of subproblems in a branch-and-bound procedure. Using zonotopes as a set representation, unlike many other state-of-the-art approaches, our approach can be realized by only using matrix operations, which enables a significant speed-up through efficient GPU acceleration. We demonstrate that our tool achieves competitive performance, which would place it among the top-ranking tools of the last neural network verification competition (VNN-COMP’24).

Article 987

Title@2025-05-27 (2): Efficient Large Language Model Inference with Neural Block Linearization

Title: Efficient Large Language Model Inference with Neural Block Linearization

Effiziente großsprachige Modellinferenz mit neuraler Blocklinearisierung

高效大语言模型与神经区块线性线性结合的推断 2505.21077v1

Authors: Mete Erdogan, Francesco Tonin, Volkan Cevher

The high inference demands of transformer-based Large Language Models (LLMs) pose substantial challenges in their deployment. To this end, we introduce Neural Block Linearization (NBL), a novel framework for accelerating transformer model inference by replacing self-attention layers with linear approximations derived from Linear Minimum Mean Squared Error estimators. NBL leverages Canonical Correlation Analysis to compute a theoretical upper bound on the approximation error. Then, we use this bound as a criterion for substitution, selecting the LLM layers with the lowest linearization error. NBL can be efficiently applied to pre-trained LLMs without the need for fine-tuning. In experiments, NBL achieves notable computational speed-ups while preserving competitive accuracy on multiple reasoning benchmarks. For instance, applying NBL to 12 self-attention layers in DeepSeek-R1-Distill-Llama-8B increases the inference speed by 32% with less than 1% accuracy trade-off, making it a flexible and promising solution to improve the inference efficiency of LLMs.
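
For reference, the generic Linear Minimum Mean Squared Error estimator that such a linearization builds on can be written as below. The notation is an assumption made here for illustration: $X$ is the input to a block, $Y$ its output, $\mu$ denotes means, and $\Sigma$ denotes (cross-)covariances estimated from calibration data. How NBL applies this to individual attention layers is specified in the paper, not here.

```latex
\hat{Y} = W X + b, \qquad
W = \Sigma_{YX}\,\Sigma_{XX}^{-1}, \qquad
b = \mu_Y - W \mu_X
```

Among all affine maps of $X$, this choice minimizes $\mathbb{E}\,\lVert Y - \hat{Y} \rVert^2$, which is why it is a natural drop-in replacement for a block that behaves almost linearly.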

Article 988

Title@2025-05-27 (2): Red-Teaming Text-to-Image Systems by Rule-based Preference Modeling

Title: Red-Teaming Text-to-Image Systems by Rule-based Preference Modeling

Red-Teaming Text-to-Image-Systeme durch regelbasiertes Preference-Modelling

通过基于规则的首选模式建立红色团队式文本到图像系统 2505.21074v1

Authors: Yichuan Cao, Yibo Miao, Xiao-Shan Gao, Yinpeng Dong

Text-to-image (T2I) models raise ethical and safety concerns due to their potential to generate inappropriate or harmful images. Evaluating these models’ security through red-teaming is vital, yet white-box approaches are limited by their need for internal access, complicating their use with closed-source models. Moreover, existing black-box methods often assume knowledge about the model’s specific defense mechanisms, limiting their utility in real-world commercial API scenarios. A significant challenge is how to evade unknown and diverse defense mechanisms. To overcome this difficulty, we propose a novel Rule-based Preference modeling Guided Red-Teaming (RPG-RT), which iteratively employs an LLM to modify the prompts used to query T2I systems and leverages the systems’ feedback to fine-tune the LLM. RPG-RT treats the feedback from each iteration as a prior, enabling the LLM to dynamically adapt to unknown defense mechanisms. Given that the feedback is often labeled and coarse-grained, making it difficult to utilize directly, we further propose rule-based preference modeling, which employs a set of rules to evaluate desired or undesired feedback, facilitating finer-grained control over the LLM’s dynamic adaptation process. Extensive experiments on nineteen T2I systems with varied safety mechanisms, three online commercial API services, and T2V models verify the superiority and practicality of our approach.

Article 989

Title@2025-05-27 (2): A domain adaptation neural network for digital twin-supported fault diagnosis

Title: A domain adaptation neural network for digital twin-supported fault diagnosis

Ein neuronales Netzwerk für die Domänenanpassung für die digitale Doppel-unterstützte Fehlerdiagnose

数字双支持缺陷诊断领域适应性神经神经网络 2505.21046v1

Authors: Zhenling Chen, Haiwei Fu, Zhiguo Zeng

Digital twins offer a promising solution to the lack of sufficient labeled data in deep learning-based fault diagnosis by generating simulated data for model training. However, discrepancies between simulation and real-world systems can lead to a significant drop in performance when models are applied in real scenarios. To address this issue, we propose a fault diagnosis framework based on Domain-Adversarial Neural Networks (DANN), which enables knowledge transfer from simulated (source domain) to real-world (target domain) data. We evaluate the proposed framework using a publicly available robotics fault diagnosis dataset, which includes 3,600 sequences generated by a digital twin model and 90 real sequences collected from physical systems. The DANN method is compared with commonly used lightweight deep learning models such as CNN, TCN, Transformer, and LSTM. Experimental results show that incorporating domain adaptation significantly improves the diagnostic performance. For example, applying DANN to a baseline CNN model improves its accuracy from 70.00% to 80.22% on real-world test data, demonstrating the effectiveness of domain adaptation in bridging the sim-to-real gap.

Article 990

Title@2025-05-27 (2): Scalable and adaptive prediction bands with kernel sum-of-squares

Title: Scalable and adaptive prediction bands with kernel sum-of-squares

Skalierbare und adaptive Vorhersagebänder mit Kernel-Summe von Quadraten

可缩放和适应性预测带带内核和平方总和的可缩放和适应性预测波段 2505.21039v1

Authors: Louis Allain, Sébastien da Veiga, Brian Staber

Conformal Prediction (CP) is a popular framework for constructing prediction bands with valid coverage in finite samples, while being free of any distributional assumption. A well-known limitation of conformal prediction is the lack of adaptivity, although several works introduced practically efficient alternate procedures. In this work, we build upon recent ideas that rely on recasting the CP problem as a statistical learning problem, directly targeting coverage and adaptivity. This statistical learning problem is based on reproducing kernel Hilbert spaces (RKHS) and kernel sum-of-squares (SoS) methods. First, we extend previous results with a general representer theorem and exhibit the dual formulation of the learning problem. Crucially, such dual formulation can be solved efficiently by accelerated gradient methods with several hundreds or thousands of samples, unlike previous strategies based on off-the-shelf semidefinite programming algorithms. Second, we introduce a new hyperparameter tuning strategy tailored specifically to target adaptivity through bounds on test-conditional coverage. This strategy, based on the Hilbert-Schmidt Independence Criterion (HSIC), is introduced here to tune kernel lengthscales in our framework, but has broader applicability since it could be used in any CP algorithm where the score function is learned. Finally, extensive experiments are conducted to show how our method compares to related work. All figures can be reproduced with the accompanying code.
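
To make the adaptivity limitation concrete, the sketch below implements plain split conformal prediction with absolute-residual scores, which is the standard baseline rather than the kernel SoS method of this paper. The calibration residuals and function names are illustrative assumptions.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample quantile of the calibration scores: the
    ceil((n + 1) * (1 - alpha))-th smallest value, or infinity
    when the calibration set is too small for the requested level."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]

def prediction_band(point_prediction, q):
    """Symmetric band of half-width q around a point prediction."""
    return (point_prediction - q, point_prediction + q)

# Hypothetical absolute residuals |y - f(x)| on a held-out calibration set.
calibration_scores = [0.2, 0.5, 0.1, 0.9, 0.4, 0.7, 0.3]
q = conformal_quantile(calibration_scores, alpha=0.25)
low, high = prediction_band(2.0, q)
```

The half-width `q` is the same for every test input, so the band cannot widen where the model is less reliable; that constant width is the lack of adaptivity the abstract refers to.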

Article 991

Title@2025-05-27 (2): Unraveling Indirect In-Context Learning Using Influence Functions

Title: Unraveling Indirect In-Context Learning Using Influence Functions

Indirektes In-Context-Lernen mit Einflussfunktionen entschlüsseln

利用影响功能进行分散的间接间接内文学习 2501.01473v2

Authors: Hadi Askari, Shivanshu Gupta, Terry Tong, Fei Wang, Anshuman Chhabra, Muhao Chen

In this work, we introduce a novel paradigm for generalized In-Context Learning (ICL), termed Indirect In-Context Learning. In Indirect ICL, we explore demonstration selection strategies tailored for two distinct real-world scenarios: Mixture of Tasks and Noisy ICL. We systematically evaluate the effectiveness of Influence Functions (IFs) as a selection tool for these settings, highlighting the potential of IFs to better capture the informativeness of examples within the demonstration pool. For the Mixture of Tasks setting, demonstrations are drawn from 28 diverse tasks, including MMLU, BigBench, StrategyQA, and CommonsenseQA. We demonstrate that combining BertScore-Recall (BSR) with an IF surrogate model can further improve performance, leading to average absolute accuracy gains of 0.37% and 1.45% for 3-shot and 5-shot setups when compared to traditional ICL metrics. In the Noisy ICL setting, we examine scenarios where demonstrations might be mislabeled or have adversarial noise. Our experiments show that reweighting traditional ICL selectors (BSR and Cosine Similarity) with IF-based selectors boosts accuracy by an average of 2.90% for Cosine Similarity and 2.94% for BSR on noisy GLUE benchmarks. For the adversarial sub-setting, we show the utility of using IFs for task-agnostic demonstration selection for backdoor attack mitigation, showing a 32.89% reduction in Attack Success Rate compared to task-aware methods. In sum, we propose a robust framework for demonstration selection that generalizes beyond traditional ICL, offering valuable insights into the role of IFs for Indirect ICL.

Article 992

Title@2025-05-27 (2): CellCLAT: Preserving Topology and Trimming Redundancy in Self-Supervised Cellular Contrastive Learning

Title: CellCLAT: Preserving Topology and Trimming Redundancy in Self-Supervised Cellular Contrastive Learning

CellCLAT: Topologie und Trimming Redundanz im selbstüberwachten zellulären Kontrastiven Lernen erhalten

CellCLAT: 在自我维持的细胞抵触学习中保留地形学和三角再利用 2505.21587v1

Authors: Bin Qin, Qirui Ji, Jiangmeng Li, Yupeng Wang, Xuesong Wu, Jianwen Cao, Fanjiang Xu

Self-supervised topological deep learning (TDL) represents a nascent but underexplored area with significant potential for modeling higher-order interactions in simplicial complexes and cellular complexes to derive representations of unlabeled graphs. Compared to simplicial complexes, cellular complexes exhibit greater expressive power. However, the advancement in self-supervised learning for cellular TDL is largely hindered by two core challenges: extrinsic structural constraints inherent to cellular complexes, and intrinsic semantic redundancy in cellular representations. The first challenge highlights that traditional graph augmentation techniques may compromise the integrity of higher-order cellular interactions, while the second underscores that topological redundancy in cellular complexes potentially diminishes task-relevant information. To address these issues, we introduce Cellular Complex Contrastive Learning with Adaptive Trimming (CellCLAT), a twofold framework designed to adhere to the combinatorial constraints of cellular complexes while mitigating informational redundancy. Specifically, we propose a parameter perturbation-based augmentation method that injects controlled noise into cellular interactions without altering the underlying cellular structures, thereby preserving cellular topology during contrastive learning. Additionally, a cellular trimming scheduler is employed to mask gradient contributions from task-irrelevant cells through a bi-level meta-learning approach, effectively removing redundant topological elements while maintaining critical higher-order semantics. We provide theoretical justification and empirical validation to demonstrate that CellCLAT achieves substantial improvements over existing self-supervised graph learning methods, marking a significant attempt in this domain.

Article 993

Title@2025-05-27 (2): Directed Semi-Simplicial Learning with Applications to Brain Activity Decoding

Title: Directed Semi-Simplicial Learning with Applications to Brain Activity Decoding

Direktes Semi-Simplizielles Lernen mit Anwendungen zur Entschlüsselung der Gehirnaktivität

定向半简化学习,应用脑活动解码 2505.17939v2

Authors: Manuel Lecha, Andrea Cavallo, Francesca Dominici, Ran Levi, Alessio Del Bue, Elvin Isufi, Pietro Morerio, Claudio Battiloro

Graph Neural Networks (GNNs) excel at learning from pairwise interactions but often overlook multi-way and hierarchical relationships. Topological Deep Learning (TDL) addresses this limitation by leveraging combinatorial topological spaces. However, existing TDL models are restricted to undirected settings and fail to capture the higher-order directed patterns prevalent in many complex systems, e.g., brain networks, where such interactions are both abundant and functionally significant. To fill this gap, we introduce Semi-Simplicial Neural Networks (SSNs), a principled class of TDL models that operate on semi-simplicial sets – combinatorial structures that encode directed higher-order motifs and their directional relationships. To enhance scalability, we propose Routing-SSNs, which dynamically select the most informative relations in a learnable manner. We prove that SSNs are strictly more expressive than standard graph and TDL models. We then introduce a new principled framework for brain dynamics representation learning, grounded in the ability of SSNs to provably recover topological descriptors shown to successfully characterize brain activity. Empirically, SSNs achieve state-of-the-art performance on brain dynamics classification tasks, outperforming the second-best model by up to 27%, and message passing GNNs by up to 50% in accuracy. Our results highlight the potential of principled topological models for learning from structured brain data, establishing a unique real-world case study for TDL. We also test SSNs on standard node classification and edge regression tasks, showing competitive performance. We will make the code and data publicly available.

Article 994

Title@2025-05-27 (2): LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms

Title: LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms

LLaMEA-BO: Ein evolutionärer Algorithmus für die automatische Generierung Bayesischer Optimierungsalgorithmen

LLAMEA-BO:用于自动生成贝耶斯优化优化生成的大型语言模型进化演化算法 2505.21034v1

Authors: Wenhu Li, Niki van Stein, Thomas Bäck, Elena Raponi

Bayesian optimization (BO) is a powerful class of algorithms for optimizing expensive black-box functions, but designing effective BO algorithms remains a manual, expertise-driven task. Recent advancements in Large Language Models (LLMs) have opened new avenues for automating scientific discovery, including the automatic design of optimization algorithms. While prior work has used LLMs within optimization loops or to generate non-BO algorithms, we tackle a new challenge: Using LLMs to automatically generate full BO algorithm code. Our framework uses an evolution strategy to guide an LLM in generating Python code that preserves the key components of BO algorithms: An initial design, a surrogate model, and an acquisition function. The LLM is prompted to produce multiple candidate algorithms, which are evaluated on the established Black-Box Optimization Benchmarking (BBOB) test suite from the COmparing Continuous Optimizers (COCO) platform. Based on their performance, top candidates are selected, combined, and mutated via controlled prompt variations, enabling iterative refinement. Despite no additional fine-tuning, the LLM-generated algorithms outperform state-of-the-art BO baselines in 19 (out of 24) BBOB functions in dimension 5 and generalize well to higher dimensions, and different tasks (from the Bayesmark framework). This work demonstrates that LLMs can serve as algorithmic co-designers, offering a new paradigm for automating BO development and accelerating the discovery of novel algorithmic combinations. The source code is provided at https://github.com/Ewendawi/LLaMEA-BO.

Article 995

Title@2025-05-27 (2): Optimizing Case-Based Reasoning System for Functional Test Script Generation with Large Language Models

Title: Optimizing Case-Based Reasoning System for Functional Test Script Generation with Large Language Models

Optimierung des Case-Based-Reasoning-Systems für die Generierung funktionaler Testskripte mit großen Sprachmodellen

为具有大语言模型的功能测试脚本生成优化基于个案的理由说明系统 2503.20576v3

Authors: Siyuan Guo, Huiwu Liu, Xiaolong Chen, Yuming Xie, Liang Zhang, Tao Han, Hechang Chen, Yi Chang, Jun Wang

In this work, we explore the potential of large language models (LLMs) for generating functional test scripts, which necessitates understanding the dynamically evolving code structure of the target software. To achieve this, we propose a case-based reasoning (CBR) system utilizing a 4R cycle (i.e., retrieve, reuse, revise, and retain), which maintains and leverages a case bank of test intent descriptions and corresponding test scripts to facilitate LLMs for test script generation. To improve user experience further, we introduce Re4, an optimization method for the CBR system, comprising reranking-based retrieval finetuning and reinforced reuse finetuning. Specifically, we first identify positive examples with high semantic and script similarity, providing reliable pseudo-labels for finetuning the retriever model without costly labeling. Then, we apply supervised finetuning, followed by a reinforcement learning finetuning stage, to align LLMs with our production scenarios, ensuring the faithful reuse of retrieved cases. Extensive experimental results on two product development units from Huawei Datacom demonstrate the superiority of the proposed CBR+Re4. Notably, we also show that the proposed Re4 method can help alleviate the repetitive generation issues with LLMs.

Article 996

Title@2025-05-27 (2): Generalizable and Robust Spectral Method for Multi-view Representation Learning

Title: Generalizable and Robust Spectral Method for Multi-view Representation Learning

Verallgemeinerbare und robuste Spektralmethode für Multi-View Representative Learning

多视角代表制学习通用和强力光谱方法 2411.02138v3

Authors: Amitai Yacobi, Ofir Lindenbaum, Uri Shaham

Multi-view representation learning (MvRL) has garnered substantial attention in recent years, driven by the increasing demand for applications that can effectively process and analyze data from multiple sources. In this context, graph Laplacian-based MvRL methods have demonstrated remarkable success in representing multi-view data. However, these methods often struggle with generalization to new data and face challenges with scalability. Moreover, in many practical scenarios, multi-view data is contaminated by noise or outliers. In such cases, modern deep-learning-based MvRL approaches that rely on alignment or contrastive objectives show degraded performance in downstream tasks, as they may impose incorrect consistency between clean and corrupted data sources. We introduce $\textit{SpecRaGE}$, a novel fusion-based framework that integrates the strengths of graph Laplacian methods with the power of deep learning to overcome these challenges. SpecRaGE uses neural networks to learn a parametric mapping that approximates a joint diagonalization of graph Laplacians. This solution bypasses the need for alignment while enabling generalizable and scalable learning of informative and meaningful representations. Moreover, it incorporates a meta-learning fusion module that dynamically adapts to data quality, ensuring robustness against outliers and noisy views. Our extensive experiments demonstrate that SpecRaGE outperforms state-of-the-art methods, particularly in scenarios with data contamination, paving the way for more reliable and efficient multi-view learning.

Article 997

Title@2025-05-27 (2): FeatInv: Spatially resolved mapping from feature space to input space using conditional diffusion models

Title: FeatInv: Spatially resolved mapping from feature space to input space using conditional diffusion models

FeatInv: Räumlich aufgelöstes Mapping vom Feature Space zum Input Space mit bedingten Diffusionsmodellen

FeatInv:使用有条件扩散模型从地物空间到输入空间的空间空间的空间分辨率绘图 2505.21032v1

Authors: Nils Neukirch, Johanna Vielhaben, Nils Strodthoff

Internal representations are crucial for understanding deep neural networks, such as their properties and reasoning patterns, but remain difficult to interpret. While mapping from feature space to input space aids in interpreting the former, existing approaches often rely on crude approximations. We propose using a conditional diffusion model - a pretrained high-fidelity diffusion model conditioned on spatially resolved feature maps - to learn such a mapping in a probabilistic manner. We demonstrate the feasibility of this approach across various pretrained image classifiers from CNNs to ViTs, showing excellent reconstruction capabilities. Through qualitative comparisons and robustness analysis, we validate our method and showcase possible applications, such as the visualization of concept steering in input space or investigations of the composite nature of the feature space. This approach has broad potential for improving feature space understanding in computer vision models.

Article 998

Title@2025-05-27 (2): TabAttackBench: A Benchmark for Adversarial Attacks on Tabular Data

Title: TabAttackBench: A Benchmark for Adversarial Attacks on Tabular Data

TabAttackBench: Ein Benchmark für feindliche Angriffe auf Tabellendaten

TabAttack Bench: 表格数据对抗性攻击基准 2505.21027v1

Authors: Zhipeng He, Chun Ouyang, Lijie Wen, Cong Liu, Catarina Moreira

Adversarial attacks pose a significant threat to machine learning models by inducing incorrect predictions through imperceptible perturbations to input data. While these attacks have been extensively studied in unstructured data like images, their application to tabular data presents new challenges. These challenges arise from the inherent heterogeneity and complex feature interdependencies in tabular data, which differ significantly from those in image data. To address these differences, it is crucial to consider imperceptibility as a key criterion specific to tabular data. Most current research focuses primarily on achieving effective adversarial attacks, often overlooking the importance of maintaining imperceptibility. To address this gap, we propose a new benchmark for adversarial attacks on tabular data that evaluates both effectiveness and imperceptibility. In this study, we assess the effectiveness and imperceptibility of five adversarial attacks across four models using eleven tabular datasets, including both mixed and numerical-only datasets. Our analysis explores how these factors interact and influence the overall performance of the attacks. We also compare the results across different dataset types to understand the broader implications of these findings. The findings from this benchmark provide valuable insights for improving the design of adversarial attack algorithms, thereby advancing the field of adversarial machine learning on tabular data.
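
To make the two evaluation axes concrete, the sketch below computes one effectiveness measure and two imperceptibility proxies for a batch of adversarial examples. The specific metrics (attack success rate, mean L2 distance, mean number of changed features) and the function names are assumptions chosen for illustration; the abstract does not state which metrics the benchmark actually uses.

```python
import numpy as np

def attack_success_rate(y_true, pred_clean, pred_adv):
    # Fraction of correctly classified inputs whose prediction becomes wrong after the attack.
    y_true, pred_clean, pred_adv = map(np.asarray, (y_true, pred_clean, pred_adv))
    correct = pred_clean == y_true
    if not correct.any():
        return 0.0
    return float((pred_adv[correct] != y_true[correct]).mean())

def perturbation_stats(x_clean, x_adv, tol=1e-9):
    # Imperceptibility proxies (assumed): perturbation size and number of altered features.
    delta = np.asarray(x_adv, float) - np.asarray(x_clean, float)
    l2 = np.linalg.norm(delta, axis=1)
    changed = (np.abs(delta) > tol).sum(axis=1)
    return float(l2.mean()), float(changed.mean())
```

For mixed datasets, categorical features would need a distance other than L2 (for example, a count of changed categories); that is left out of this sketch.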

Article 999

Title@2025-05-27 (2): PaSa: An LLM Agent for Comprehensive Academic Paper Search

Title: PaSa: An LLM Agent for Comprehensive Academic Paper Search

PaSa: Ein LLM-Agent für umfassende wissenschaftliche Papiersuche

Pasa: 法学硕士全面学术论文搜索代理 2501.10120v2

Authors: Yichen He, Guanhua Huang, Peiyuan Feng, Yuan Lin, Yuchen Zhang, Hang Li, Weinan E

We introduce PaSa, an advanced Paper Search agent powered by large language models. PaSa can autonomously make a series of decisions, including invoking search tools, reading papers, and selecting relevant references, to ultimately obtain comprehensive and accurate results for complex scholar queries. We optimize PaSa using reinforcement learning with a synthetic dataset, AutoScholarQuery, which includes 35k fine-grained academic queries and corresponding papers sourced from top-tier AI conference publications. Additionally, we develop RealScholarQuery, a benchmark collecting real-world academic queries to assess PaSa performance in more realistic scenarios. Despite being trained on synthetic data, PaSa significantly outperforms existing baselines on RealScholarQuery, including Google, Google Scholar, Google with GPT-4o for paraphrased queries, ChatGPT (search-enabled GPT-4o), GPT-o1, and PaSa-GPT-4o (PaSa implemented by prompting GPT-4o). Notably, PaSa-7B surpasses the best Google-based baseline, Google with GPT-4o, by 37.78% in recall@20 and 39.90% in recall@50, and exceeds PaSa-GPT-4o by 30.36% in recall and 4.25% in precision. Model, datasets, and code are available at https://github.com/bytedance/pasa.
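
The recall@20, recall@50, and precision figures quoted above follow the usual retrieval definitions. A minimal sketch of those definitions is given below; the function names and the use of paper identifiers as plain strings are illustrative assumptions, and the paper's exact evaluation protocol may differ.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    # Share of the relevant papers that appear among the top-k returned results.
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)

def precision(returned_ids, relevant_ids):
    # Share of the returned papers that are relevant.
    returned = set(returned_ids)
    if not returned:
        return 0.0
    return len(returned & set(relevant_ids)) / len(returned)
```

For example, if two of four relevant papers appear in the top three results, `recall_at_k(..., k=3)` is 0.5.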

Article 1000

Title@2025-05-27 (2): Multi-Mode Process Control Using Multi-Task Inverse Reinforcement Learning

Title: Multi-Mode Process Control Using Multi-Task Inverse Reinforcement Learning

Multi-Mode-Prozesssteuerung mit Multi-Task Inverse Verstärkungslernen

利用多任务反向强化学习进行多模式程序控制 2505.21026v1

Authors: Runze Lin, Junghui Chen, Biao Huang, Lei Xie, Hongye Su

In the era of Industry 4.0 and smart manufacturing, process systems engineering must adapt to digital transformation. While reinforcement learning offers a model-free approach to process control, its applications are limited by the dependence on accurate digital twins and well-designed reward functions. To address these limitations, this paper introduces a novel framework that integrates inverse reinforcement learning (IRL) with multi-task learning for data-driven, multi-mode control design. Using historical closed-loop data as expert demonstrations, IRL extracts optimal reward functions and control policies. A latent-context variable is incorporated to distinguish modes, enabling the training of mode-specific controllers. Case studies on a continuous stirred tank reactor and a fed-batch bioreactor validate the effectiveness of this framework in handling multi-mode data and training adaptable controllers.

Article 1001

Title@2025-05-27 (2): Text-Queried Audio Source Separation via Hierarchical Modeling

Title: Text-Queried Audio Source Separation via Hierarchical Modeling

Textbefragte Audioquelle Trennung über Hierarchische Modellierung

通过等级制建模模式对文本查询的音频源分离 2505.21025v1

Authors: Xinlei Yin, Xiulian Peng, Xue Jiang, Zhiwei Xiong, Yan Lu

Target audio source separation with natural language queries presents a promising paradigm for extracting arbitrary audio events through arbitrary text descriptions. Existing methods mainly face two challenges: the difficulty in jointly modeling acoustic-textual alignment and semantic-aware separation within a blindly-learned single-stage architecture, and the reliance on large-scale accurately-labeled training data to compensate for inefficient cross-modal learning and separation. To address these challenges, we propose a hierarchical decomposition framework, HSM-TSS, that decouples the task into global-local semantic-guided feature separation and structure-preserving acoustic reconstruction. Our approach introduces a dual-stage mechanism for semantic separation, operating on distinct global and local semantic feature spaces. We first perform global-semantic separation through a global semantic feature space aligned with text queries. A Q-Audio architecture is employed to align audio and text modalities, serving as pretrained global-semantic encoders. Conditioned on the predicted global feature, we then perform the second-stage local-semantic separation on AudioMAE features that preserve time-frequency structures, followed by acoustic reconstruction. We also propose an instruction processing pipeline to parse arbitrary text queries into structured operations (extraction or removal) coupled with audio descriptions, enabling flexible sound manipulation. Our method achieves state-of-the-art separation performance with data-efficient training while maintaining superior semantic consistency with queries in complex auditory scenes.

Article 1002

Title@2025-05-27 (2): Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers

Title: Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers

Pause Tokens erhöhen streng die Expressivität der konstant-tiefen Transformer

严格提高常数面变换器的表达性 2505.21024v1

Authors: Charles London, Varun Kanade

Pause tokens, simple filler symbols such as “…”, consistently improve Transformer performance on both language and mathematical tasks, yet their theoretical effect remains unexplained. We provide the first formal separation result, proving that adding pause tokens to constant-depth, logarithmic-width Transformers strictly increases their computational expressivity. With bounded-precision activations, Transformers without pause tokens compute only a strict subset of $\mathsf{AC}^0$ functions, while adding a polynomial number of pause tokens allows them to express the entire class. For logarithmic-precision Transformers, we show that adding pause tokens achieves expressivity equivalent to $\mathsf{TC}^0$, matching known upper bounds. Empirically, we demonstrate that two-layer causally masked Transformers can learn parity when supplied with pause tokens, a function that they appear unable to learn without them. Our results provide a rigorous theoretical explanation for prior empirical findings, clarify how pause tokens interact with width, depth, and numeric precision, and position them as a distinct mechanism, complementary to chain-of-thought prompting, for enhancing Transformer reasoning.
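
The parity experiment can be pictured with a small data-construction sketch: the label is the parity of the input bits, and filler tokens are added to the sequence before the answer is read. Placing the pause tokens after the input bits, the token strings, and the function names are all assumptions made for illustration; the abstract does not specify the input layout.

```python
PAUSE = "..."  # filler symbol; the exact token used in the paper is not specified here

def parity(bits):
    # Parity is 1 when the number of ones is odd, otherwise 0.
    return sum(bits) % 2

def with_pause_tokens(bits, n_pause):
    # Append n_pause filler tokens after the input bits (assumed placement).
    return [str(b) for b in bits] + [PAUSE] * n_pause

def make_example(bits, n_pause):
    # Returns (token sequence, parity label) for one training example.
    return with_pause_tokens(bits, n_pause), parity(bits)
```

The pause tokens carry no information about the label; they only give the model extra positions at which to compute.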

Article 1003

Title@2025-05-27 (2): NeuralOM: Neural Ocean Model for Subseasonal-to-Seasonal Simulation

Title: NeuralOM: Neural Ocean Model for Subseasonal-to-Seasonal Simulation

NeuralOM: Neurales Ozeanmodell für die Simulation von Subsaisonal-zu-Seasonal

神经力OM:次季节到季节模拟神经海洋模型 2505.21020v1

Authors: Yuan Gao, Ruiqi Shu, Hao Wu, Fan Xu, Yanfei Xiang, Ruijian Gou, Qingsong Wen, Xian Wu, Xiaomeng Huang

Accurate Subseasonal-to-Seasonal (S2S) ocean simulation is critically important for marine research, yet remains challenging due to the ocean's substantial thermal inertia and extended time delay. Machine learning (ML)-based models have demonstrated significant advancements in simulation accuracy and computational efficiency compared to traditional numerical methods. Nevertheless, a significant limitation of current ML models for S2S ocean simulation is their inadequate incorporation of physical consistency and the slow-changing properties of the ocean system. In this work, we propose a neural ocean model (NeuralOM) for S2S ocean simulation with a multi-scale interactive graph neural network to emulate diverse physical phenomena associated with ocean systems effectively. Specifically, we propose a multi-stage framework tailored to model the ocean’s slowly changing nature. Additionally, we introduce a multi-scale interactive messaging module to capture complex dynamical behaviors, such as gradient changes and multiplicative coupling relationships inherent in ocean dynamics. Extensive experimental evaluations confirm that our proposed NeuralOM outperforms state-of-the-art models in S2S and extreme event simulation. The code is available at https://github.com/YuanGao-YG/NeuralOM.

Article 1004

Title@2025-05-27 (2): Cardiac Digital Twins at Scale from MRI: Open Tools and Representative Models from ~55000 UK Biobank Participants

Title: Cardiac Digital Twins at Scale from MRI: Open Tools and Representative Models from ~55000 UK Biobank Participants

Cardiac Digital Twins auf Scale von MRI: Offene Werkzeuge und repräsentative Modelle von ~55000 britischen Biobank-Teilnehmern

来自MRI的大规模心脏病数字双对:来自~55000英国生物库参与者的开放工具和代表模型 2505.21019v1

Authors: Devran Ugurlu, Shuang Qian, Elliot Fairweather, Charlene Mauger, Bram Ruijsink, Laura Dal Toso, Yu Deng, Marina Strocchi, Reza Razavi, Alistair Young, Pablo Lamata, Steven Niederer, Martin Bishop

A cardiac digital twin is a virtual replica of a patient’s heart for screening, diagnosis, prognosis, risk assessment, and treatment planning of cardiovascular diseases. This requires an anatomically accurate patient-specific 3D structural representation of the heart, suitable for electro-mechanical simulations or study of disease mechanisms. However, generation of cardiac digital twins at scale is demanding and there are no public repositories of models across demographic groups. We describe an automatic open-source pipeline for creating patient-specific left and right ventricular meshes from cardiovascular magnetic resonance images, its application to a large cohort of ~55000 participants from UK Biobank, and the construction of the most comprehensive cohort of adult heart models to date, comprising 1423 representative meshes across sex (male, female), body mass index (range: 16 - 42 kg/m$^2$) and age (range: 49 - 80 years). Our code is available at https://github.com/cdttk/biv-volumetric-meshing/tree/plos2025 , and pre-trained networks, representative volumetric meshes with fibers and UVCs will be made available soon.

Article 1005

Title@2025-05-27 (2): Federated Instrumental Variable Analysis via Federated Generalized Method of Moments

Title: Federated Instrumental Variable Analysis via Federated Generalized Method of Moments

Federated Instrumental Variable Analysis via Federated Generalized Method of Moments

通过联邦通用时数方法进行的联邦仪器变量分析 2505.21012v1

Authors: Geetika, Somya Tyagi, Bapi Chatterjee

Instrumental variables (IV) analysis is an important applied tool for areas such as healthcare and consumer economics. For IV analysis in high-dimensional settings, the Generalized Method of Moments (GMM) using deep neural networks offers an efficient approach. With non-i.i.d. data sourced from scattered decentralized clients, federated learning is a popular paradigm for training the models while promising data privacy. However, to our knowledge, no federated algorithm for either GMM or IV analysis exists to date. In this work, we introduce federated instrumental variables analysis (FedIV) via the federated generalized method of moments (FedGMM). We formulate FedGMM as a federated zero-sum game defined by a federated non-convex non-concave minimax optimization problem, which is solved using the federated gradient descent ascent (FedGDA) algorithm. One key challenge arises in theoretically characterizing the federated local optimality. To address this, we present properties and existence results of clients’ local equilibria via FedGDA limit points. Thereby, we show that the federated solution consistently estimates the local moment conditions of every participating client. The proposed algorithm is backed by extensive experiments to demonstrate the efficacy of our approach.
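
For readers unfamiliar with the moment-condition view of IV analysis, the sketch below shows the classical, centralized, scalar linear case: with instrument z, treatment x, and outcome y, the estimate solves the empirical moment condition mean(z * (y - theta * x)) = 0. This is a deliberately simplified stand-in, not FedGMM: it has no neural networks, no minimax game, and no federation, and the simulated data-generating process and function names are assumptions.

```python
import numpy as np

def moment(theta, z, x, y):
    # Sample analogue of E[z * (y - theta * x)], which is zero at the true theta.
    return float(np.mean(z * (y - theta * x)))

def iv_estimate(z, x, y):
    # Just-identified case: solve the empirical moment condition exactly.
    return float(np.sum(z * y) / np.sum(z * x))

def simulate(n=20000, theta=2.0, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)            # instrument
    u = rng.standard_normal(n)            # unobserved confounder
    x = z + u + rng.standard_normal(n)    # treatment depends on instrument and confounder
    y = theta * x + u + rng.standard_normal(n)
    return z, x, y
```

On the simulated data, a plain least-squares fit of y on x is biased by the confounder u, while `iv_estimate` recovers a value close to the true theta of 2.0.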

Article 1006

Title@2025-05-27 (2): Unified Alignment Protocol: Making Sense of the Unlabeled Data in New Domains

Title: Unified Alignment Protocol: Making Sense of the Unlabeled Data in New Domains

Unified Alignment Protocol: Sense der unmarkierten Daten in neuen Domains

统一对齐协议: 在新域域中感知无标签数据 2505.21010v1

Authors: Sabbir Ahmed, Mamshad Nayeem Rizve, Abdullah Al Arafat, Jacqueline Liu, Rahim Hossain, Mohaiminul Al Nahian, Adnan Siraj Rakin

Semi-Supervised Federated Learning (SSFL) is gaining popularity over conventional Federated Learning in many real-world applications. Because labeled data is scarce on the client side, SSFL assumes that participating clients train with unlabeled data and that only the central server has the resources to access limited labeled data, making it an ideal fit for real-world applications (e.g., healthcare). However, traditional SSFL assumes that the data distributions in the training phase and testing phase are the same. In practice, domain shifts frequently occur, making it essential for SSFL to incorporate generalization capabilities and enhance its practicality. The core challenge is improving model generalization to new, unseen domains while the clients participate in SSFL. However, the decentralized setup of SSFL and unsupervised client training necessitate innovation to achieve improved generalization across domains. To achieve this, we propose a novel framework called the Unified Alignment Protocol (UAP), which consists of an alternating two-stage training process. The first stage involves training the server model to learn and align the features with a parametric distribution, which is subsequently communicated to clients without additional communication overhead. The second stage proposes a novel training algorithm that utilizes the server feature distribution to align client features accordingly. Our extensive experiments on standard domain generalization benchmark datasets across multiple model architectures reveal that the proposed UAP successfully achieves SOTA generalization performance in the SSFL setting.

Article 1007

Title@2025-05-27 (2): Transformers in Protein: A Survey

Title: Transformers in Protein: A Survey

Transformer in Protein: Eine Umfrage

蛋白质变换器:调查 2505.20098v2

Authors: Xiaowen Ling, Zhiqiang Li, Yanbin Wang, Zhuhong You

As protein informatics advances rapidly, the demand for enhanced predictive accuracy, structural analysis, and functional understanding has intensified. Transformer models, as powerful deep learning architectures, have demonstrated unprecedented potential in addressing diverse challenges across protein research. However, a comprehensive review of Transformer applications in this field remains lacking. This paper bridges this gap by surveying over 100 studies, offering an in-depth analysis of practical implementations and research progress of Transformers in protein-related tasks. Our review systematically covers critical domains, including protein structure prediction, function prediction, protein-protein interaction analysis, functional annotation, and drug discovery/target identification. To contextualize these advancements across various protein domains, we adopt a domain-oriented classification system. We first introduce foundational concepts: the Transformer architecture and attention mechanisms, categorize Transformer variants tailored for protein science, and summarize essential protein knowledge. For each research domain, we outline its objectives and background, critically evaluate prior methods and their limitations, and highlight transformative contributions enabled by Transformer models. We also curate and summarize pivotal datasets and open-source code resources to facilitate reproducibility and benchmarking. Finally, we discuss persistent challenges in applying Transformers to protein informatics and propose future research directions. This review aims to provide a consolidated foundation for the synergistic integration of Transformer and protein informatics, fostering further innovation and expanded applications in the field.

Article 1008

Title@2025-05-27 (2): Fairness in Federated Learning: Fairness for Whom?

Title: Fairness in Federated Learning: Fairness for Whom?

Fairness im Federated Learning: Fairness für wen?

联邦学习中的公平性:谁的公平性? 2505.21584v1

Authors: Afaf Taik, Khaoula Chehbouni, Golnoosh Farnadi

Fairness in federated learning has emerged as a rapidly growing area of research, with numerous works proposing formal definitions and algorithmic interventions. Yet, despite this technical progress, fairness in FL is often defined and evaluated in ways that abstract away from the sociotechnical contexts in which these systems are deployed. In this paper, we argue that existing approaches tend to optimize narrow system-level metrics, such as performance parity or contribution-based rewards, while overlooking how harms arise throughout the FL lifecycle and how they impact diverse stakeholders. We support this claim through a critical analysis of the literature, based on a systematic annotation of papers for their fairness definitions, design decisions, evaluation practices, and motivating use cases. Our analysis reveals five recurring pitfalls: 1) fairness framed solely through the lens of server-client architecture, 2) a mismatch between simulations and motivating use cases and contexts, 3) definitions that conflate protecting the system with protecting its users, 4) interventions that target isolated stages of the lifecycle while neglecting upstream and downstream effects, and 5) a lack of multi-stakeholder alignment where multiple fairness definitions can be relevant at once. Building on these insights, we propose a harm-centered framework that links fairness definitions to concrete risks and stakeholder vulnerabilities. We conclude with recommendations for more holistic, context-aware, and accountable fairness research in FL.

Article 1009

Title@2025-05-27 (2): Efficient and Unbiased Sampling from Boltzmann Distributions via Variance-Tuned Diffusion Models

Title: Efficient and Unbiased Sampling from Boltzmann Distributions via Variance-Tuned Diffusion Models

Effiziente und unvoreingenommene Probenahme von Boltzmann Distributionen über Variance-Tuned Diffusion Modelle

Boltzmann分销公司通过差异传播模型进行高效和无偏见的抽样 2505.21005v1

Authors: Fengzhe Zhang, Laurence I. Midgley, José Miguel Hernández-Lobato

Score-based diffusion models (SBDMs) are powerful amortized samplers for Boltzmann distributions; however, imperfect score estimates bias downstream Monte Carlo estimates. Classical importance sampling (IS) can correct this bias, but computing exact likelihoods requires solving the probability-flow ordinary differential equation (PF-ODE), a procedure that is prohibitively costly and scales poorly with dimensionality. We introduce Variance-Tuned Diffusion Importance Sampling (VT-DIS), a lightweight post-training method that adapts the per-step noise covariance of a pretrained SBDM by minimizing the $\alpha$-divergence ($\alpha=2$) between its forward diffusion and reverse denoising trajectories. VT-DIS assigns a single trajectory-wise importance weight to the joint forward-reverse process, yielding unbiased expectation estimates at test time with negligible overhead compared to standard sampling. On the DW-4, LJ-13, and alanine-dipeptide benchmarks, VT-DIS achieves effective sample sizes of approximately 80 %, 35 %, and 3.5 %, respectively, while using only a fraction of the computational budget required by vanilla diffusion + IS or PF-ODE-based IS.
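
The effective sample sizes reported above are fractions of the nominal sample count. A common way to compute such a fraction from importance weights is the Kish estimate, (sum of w)^2 / (n * sum of w^2), sketched below together with a self-normalized weighted mean. Using this particular ESS definition and self-normalization is an assumption; the abstract does not state which estimators the paper uses.

```python
import math

def ess_fraction(log_weights):
    # Kish effective sample size divided by the number of samples (value in (0, 1]).
    m = max(log_weights)  # subtract the max before exponentiating for numerical stability
    w = [math.exp(lw - m) for lw in log_weights]
    return sum(w) ** 2 / (sum(x * x for x in w) * len(w))

def weighted_mean(values, log_weights):
    # Self-normalized importance-sampling estimate of an expectation.
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    return sum(wi * vi for wi, vi in zip(w, values)) / sum(w)
```

Equal weights give an ESS fraction of 1.0, while a single dominant weight among n samples drives it toward 1/n.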

Article 1010

Title@2025-05-27 (2): BIPNN: Learning to Solve Binary Integer Programming via Hypergraph Neural Networks

Title: BIPNN: Learning to Solve Binary Integer Programming via Hypergraph Neural Networks

BIPNN: Lernen, Binäre Integer-Programmierung über Hypergraph Neuronale Netzwerke zu lösen

BIPNN: 学习通过超光速神经网络解决二元整数编程 2505.20997v1

Authors: Sen Bai, Chunqi Yang, Xin Bai, Xin Zhang, Zhengang Jiang

Binary (0-1) integer programming (BIP) is pivotal in scientific domains requiring discrete decision-making. With the advance of AI computing, recent works explore neural network-based solvers for integer linear programming (ILP) problems. Yet, they lack scalability for tackling nonlinear challenges. To handle nonlinearities, state-of-the-art Branch-and-Cut solvers employ linear relaxations, leading to exponential growth in auxiliary variables and severe computation limitations. To overcome these limitations, we propose BIPNN (Binary Integer Programming Neural Network), an unsupervised learning framework to solve nonlinear BIP problems via hypergraph neural networks (HyperGNN). Specifically, BIPNN reformulates BIPs, which are constrained, discrete, and nonlinear (sin, log, exp) optimization problems, into unconstrained, differentiable, and polynomial loss functions. The reformulation stems from the observation of a precise one-to-one mapping between polynomial BIP objectives and hypergraph structures, enabling the unsupervised training of HyperGNN to optimize BIP problems in an end-to-end manner. On this basis, we propose a GPU-accelerated and continuous-annealing-enhanced training pipeline for BIPNN. The pipeline enables BIPNN to optimize large-scale nonlinear terms in BIPs fully in parallel via straightforward gradient descent, thus significantly reducing the training cost while ensuring the generation of discrete, high-quality solutions. Extensive experiments on synthetic and real-world datasets highlight the superiority of our approach.
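
One way to picture the mapping between a polynomial objective and a hypergraph is to treat every monomial as a hyperedge over the variables it contains. The sketch below stores a polynomial as a dictionary from variable-index tuples to coefficients; this data layout and the function names are assumptions for illustration and are not taken from the BIPNN code.

```python
from math import prod

def hyperedges(poly):
    # One hyperedge per non-constant monomial: the set of variable indices in that term.
    return [set(term) for term in poly if term]

def poly_value(poly, x):
    # Evaluate sum_t c_t * prod_{i in t} x_i; x may be binary or relaxed to [0, 1].
    return sum(c * prod(x[i] for i in term) for term, c in poly.items())
```

Because `poly_value` also accepts fractional values in [0, 1], the same expression can serve as a differentiable loss once the binary variables are relaxed.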

Article 1011

Title@2025-05-27 (2): Efficient Identity and Position Graph Embedding via Spectral-Based Random Feature Aggregation

Title: Efficient Identity and Position Graph Embedding via Spectral-Based Random Feature Aggregation

Effiziente Einbettung von Identitäts- und Positionsdiagrammen über spektralbasierte Random Feature Aggregation

通过光谱-基于随机地物聚合的高效身份和位置图嵌入 2505.20992v1

Authors: Meng Qin, Jiahong Liu, Irwin King

Graph neural networks (GNNs), which capture graph structures via a feature aggregation mechanism following the graph embedding framework, have demonstrated a powerful ability to support various tasks. According to the topology properties (e.g., structural roles or community memberships of nodes) to be preserved, graph embedding can be categorized into identity and position embedding. However, it is unclear for most GNN-based methods which property they can capture. Some of them may also suffer from low efficiency and scalability caused by several time- and space-consuming procedures (e.g., feature extraction and training). From a perspective of graph signal processing, we find that high- and low-frequency information in the graph spectral domain may characterize node identities and positions, respectively. Based on this investigation, we propose random feature aggregation (RFA) for efficient identity and position embedding, serving as an extreme ablation study regarding GNN feature aggregation. RFA (i) adopts a spectral-based GNN without learnable parameters as its backbone, (ii) only uses random noises as inputs, and (iii) derives embeddings via just one feed-forward propagation (FFP). Inspired by degree-corrected spectral clustering, we further introduce a degree correction mechanism to the GNN backbone. Surprisingly, our experiments demonstrate that two variants of RFA with high- and low-pass filters can respectively derive informative identity and position embeddings via just one FFP (i.e., without any training). As a result, RFA can achieve a better trade-off between quality and efficiency for both identity and position embedding over various baselines.
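
The three ingredients listed for RFA (a parameter-free spectral backbone, random noise inputs, a single forward propagation) can be sketched as follows. The filter forms used here, 0.5 * (I + A_norm) for low-pass and 0.5 * (I - A_norm) for high-pass with the symmetrically normalized adjacency A_norm, are common textbook choices assumed for illustration; the paper's actual filters and its degree correction mechanism are not reproduced.

```python
import numpy as np

def random_feature_aggregation(adj, dim=8, hops=2, low_pass=True, seed=0):
    adj = np.asarray(adj, float)
    deg = adj.sum(axis=1)
    # Symmetric normalization D^{-1/2} A D^{-1/2}; isolated nodes get a zero row.
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    a_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eye = np.eye(len(adj))
    # Assumed filters: low-pass smooths over neighbors, high-pass keeps local differences.
    filt = 0.5 * (eye + a_norm) if low_pass else 0.5 * (eye - a_norm)
    # Random noise is the only input; there are no learnable parameters.
    feats = np.random.default_rng(seed).standard_normal((len(adj), dim))
    for _ in range(hops):
        feats = filt @ feats
    return feats
```

With `low_pass=True` the output would play the role of a position embedding, and with `low_pass=False` that of an identity embedding, following the abstract's pairing of low and high frequencies with positions and identities.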

Article 1012

Title@2025-05-27 (2): Identifying Super Spreaders in Multilayer Networks

Title: Identifying Super Spreaders in Multilayer Networks

Identifizieren von Superspreizern in Multilayer-Netzwerken

识别多层网络中的超级传播器 2505.20980v1

Authors: Michał Czuba, Mateusz Stolarski, Adam Piróg, Piotr Bielak, Piotr Bródka

Identifying super-spreaders can be framed as a subtask of the influence maximisation problem. It seeks to pinpoint agents within a network that, if selected as single diffusion seeds, disseminate information most effectively. Multilayer networks, a specific class of heterogeneous graphs, can capture diverse types of interactions (e.g., physical-virtual or professional-social), and thus offer a more accurate representation of complex relational structures. In this work, we introduce a novel approach to identifying super-spreaders in such networks by leveraging graph neural networks. To this end, we construct a dataset by simulating information diffusion across hundreds of networks - to the best of our knowledge, the first of its kind tailored specifically to multilayer networks. We further formulate the task as a variation of the ranking prediction problem based on a four-dimensional vector that quantifies each agent’s spreading potential: (i) the number of activations; (ii) the duration of the diffusion process; (iii) the peak number of activations; and (iv) the simulation step at which this peak occurs. Our model, TopSpreadersNetwork, comprises a relationship-agnostic encoder and a custom aggregation layer. This design enables generalisation to previously unseen data and adapts to varying graph sizes. In an extensive evaluation, we compare our model against classic centrality-based heuristics and competitive deep learning methods. The results, obtained across a broad spectrum of real-world and synthetic multilayer networks, demonstrate that TopSpreadersNetwork achieves superior performance in identifying high-impact nodes, while also offering improved interpretability through its structured output.
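
The four-dimensional spreading-potential vector can be computed from the trace of a single diffusion simulation. The sketch below assumes the trace is a list with the number of newly activated agents at each step, and that the peak refers to new activations per step; both are interpretive assumptions, as is the function name.

```python
def spreading_potential(new_activations_per_step):
    counts = list(new_activations_per_step)
    total = sum(counts)                              # (i) number of activations
    duration = len(counts)                           # (ii) duration of the diffusion process
    peak = max(counts, default=0)                    # (iii) peak number of activations
    peak_step = counts.index(peak) if counts else 0  # (iv) step at which the peak occurs
    return (total, duration, peak, peak_step)
```

For a trace of `[1, 3, 5, 2]`, the vector is `(11, 4, 5, 2)`, with steps counted from zero.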

Article 1013

Title@2025-05-27 (2): Deep k-grouping: An Unsupervised Learning Framework for Combinatorial Optimization on Graphs and Hypergraphs

Title: Deep k-grouping: An Unsupervised Learning Framework for Combinatorial Optimization on Graphs and Hypergraphs

Deep k-grouping: Ein unüberwachter Lernrahmen für die kombinatorische Optimierung von Graphen und Hypergraphen

深 k 组: 图形和高光谱组合优化的无人监督的学习框架 2505.20972v1

Authors: Sen Bai, Chunqi Yang, Xin Bai, Xin Zhang, Zhengang Jiang

As AI computing gains prominence in scientific discovery, its potential in combinatorial optimization (CO) has also emerged in recent years. Yet existing unsupervised neural network solvers struggle to solve $k$-grouping problems (e.g., coloring, partitioning) on large-scale graphs and hypergraphs, due to limited computational frameworks. In this work, we propose Deep $k$-grouping, an unsupervised learning-based CO framework. Specifically, we contribute: (i) a novel one-hot encoded polynomial unconstrained binary optimization (OH-PUBO) formulation for modeling $k$-grouping problems on graphs and hypergraphs (e.g., graph/hypergraph coloring and partitioning); (ii) GPU-accelerated algorithms for large-scale $k$-grouping CO problems: Deep $k$-grouping employs the relaxation of large-scale OH-PUBO objectives as differentiable loss functions and trains to optimize them in an unsupervised manner, and to ensure scalability it leverages GPU-accelerated algorithms to unify the training pipeline; and (iii) a Gini coefficient-based continuous relaxation annealing strategy to enforce discreteness of solutions while preventing convergence to local optima. Experimental results demonstrate that Deep $k$-grouping outperforms existing neural network solvers and classical heuristics such as SCIP and Tabu.
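
To make the idea of a relaxed one-hot objective concrete, here is a minimal sketch for graph coloring. It assumes the common relaxation in which each node holds a softmax distribution over $k$ colors and the loss is the expected number of monochromatic edges; the paper's exact OH-PUBO objective may differ, and the function names are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one node's color logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def relaxed_coloring_loss(logits, edges):
    """Expected number of monochromatic edges under relaxed one-hot assignments.

    logits: one list of k color logits per node.
    edges:  iterable of (u, v) node-index pairs.
    The loss is 0 exactly when the (discrete) assignment is a proper coloring.
    """
    probs = [softmax(r) for r in logits]
    return sum(
        sum(a * b for a, b in zip(probs[u], probs[v]))
        for u, v in edges
    )
```

In a trainable setting the logits would come from a neural network and the loss would be minimized by gradient descent; that part is omitted here.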

Article 1014

Title@2025-05-27 (2): Semantic Communication meets System 2 ML: How Abstraction, Compositionality and Emergent Languages Shape Intelligence

Title: Semantic Communication meets System 2 ML: How Abstraction, Compositionality and Emergent Languages Shape Intelligence

Semantische Kommunikation trifft System 2 ML: Wie Abstraktion, Kompositionalität und Emergente Sprachen Formintelligenz

语义通信满足系统2 ML:如何抽象、组成和新兴语言形式情报 2505.20964v1

Authors: Mehdi Bennis, Salem Lahlou

The trajectories of 6G and AI are set for a creative collision. However, current visions for 6G remain largely incremental evolutions of 5G, while progress in AI is hampered by brittle, data-hungry models that lack robust reasoning capabilities. This paper argues for a foundational paradigm shift, moving beyond the purely technical level of communication toward systems capable of semantic understanding and effective, goal-oriented interaction. We propose a unified research vision rooted in the principles of System-2 cognition, built upon three pillars: Abstraction, enabling agents to learn meaningful world models from raw sensorimotor data; Compositionality, providing the algebraic tools to combine learned concepts and subsystems; and Emergent Communication, allowing intelligent agents to create their own adaptive and grounded languages. By integrating these principles, we lay the groundwork for truly intelligent systems that can reason, adapt, and collaborate, unifying advances in wireless communications, machine learning, and robotics under a single coherent framework.

Article 1015

Title@2025-05-27 (2): Resampling Filter Design for Multirate Neural Audio Effect Processing

Title: Resampling Filter Design for Multirate Neural Audio Effect Processing

Resampling Filter Design für Multirate Neural Audio Effect Processing

多立体神经音频效果处理的抽取过滤器设计 2501.18470v2

Authors: Alistair Carson, Vesa Välimäki, Alec Wright, Stefan Bilbao

Neural networks have become ubiquitous in audio effects modelling, especially for guitar amplifiers and distortion pedals. One limitation of such models is that the sample rate of the training data is implicitly encoded in the model weights and therefore not readily adjustable at inference. Recent work explored modifications to recurrent neural network architecture to approximate a sample rate independent system, enabling audio processing at a rate that differs from the original training rate. This method works well for integer oversampling and can reduce aliasing caused by nonlinear activation functions. For small fractional changes in sample rate, fractional delay filters can be used to approximate sample rate independence, but in some cases this method fails entirely. Here, we explore the use of real-time signal resampling at the input and output of the neural network as an alternative solution. We investigate several resampling filter designs and show that a two-stage design consisting of a half-band IIR filter cascaded with a Kaiser window FIR filter can give results similar to or better than those of the previously proposed model adjustment method, with many fewer filtering operations per sample and less than one millisecond of latency at typical audio rates. Furthermore, we investigate interpolation and decimation filters for the task of integer oversampling and show that cascaded half-band IIR and FIR designs can be used in conjunction with the model adjustment method to reduce aliasing in a range of distortion effect models.

Article 1016

Title@2025-05-27 (2): Efficient and Microphone-Fault-Tolerant 3D Sound Source Localization

Title: Efficient and Microphone-Fault-Tolerant 3D Sound Source Localization

Effiziente und Mikrofon-Fehler-Tolerante 3D-Soundquelle Lokalisierung

高效的麦克风和麦克风-默认的 3D 声音源源本地化 2505.20961v1

Authors: Yiyuan Yang, Shitong Xu, Niki Trigoni, Andrew Markham

Sound source localization (SSL) is a critical technology for determining the position of sound sources in complex environments. However, existing methods face challenges such as high computational costs and precise calibration requirements, limiting their deployment in dynamic or resource-constrained environments. This paper introduces a novel 3D SSL framework, which uses sparse cross-attention, pretraining, and adaptive signal coherence metrics, to achieve accurate and computationally efficient localization with fewer input microphones. The framework is also fault-tolerant to unreliable or even unknown microphone position inputs, ensuring its applicability in real-world scenarios. Preliminary experiments demonstrate its scalability for multi-source localization without requiring additional hardware. This work advances SSL by balancing the model’s performance and efficiency and improving its robustness for real-world scenarios.

Article 1017

Title@2025-05-27 (2): Personalized Clustering via Targeted Representation Learning

Title: Personalized Clustering via Targeted Representation Learning

Personalisiertes Clustering über gezieltes Repräsentationslernen

通过有针对性的代表学习进行个性化集群组合 2412.13690v3

Authors: Xiwen Geng, Suyun Zhao, Yixin Yu, Borui Peng, Pan Du, Hong Chen, Cuiping Li, Mengdie Wang

Clustering traditionally aims to reveal a natural grouping structure within unlabeled data. However, this structure may not always align with users’ preferences. In this paper, we propose a personalized clustering method that explicitly performs targeted representation learning by interacting with users via a modicum of task information (e.g., must-link or cannot-link pairs) to guide the clustering direction. We query users with the most informative pairs, i.e., those hardest to cluster and those easiest to miscluster, to facilitate representation learning in terms of the clustering preference. Moreover, by exploiting an attention mechanism, the targeted representation is learned and augmented. Personalized clustering is then obtained by leveraging the targeted representation together with a constrained contrastive loss. Theoretically, we verify that the risk of personalized clustering is tightly bounded, guaranteeing that active queries to users do mitigate the clustering risk. Experimentally, extensive results show that our method performs well across different clustering tasks and datasets, even when only a limited number of queries are available.

Article 1018

Title@2025-05-27 (2): Unveiling Impact of Frequency Components on Membership Inference Attacks for Diffusion Models

Title: Unveiling Impact of Frequency Components on Membership Inference Attacks for Diffusion Models

Auswirkungen von Frequenzkomponenten auf Mitgliedschafts-Inferenzangriffe für Diffusionsmodelle enthüllen

频率组成部分对传播模型的传播成员推断攻击的不懈影响 2505.20955v1

Authors: Puwei Lian, Yujun Cai, Songze Li

Diffusion models have achieved tremendous success in image generation, but they also raise significant concerns regarding privacy and copyright issues. Membership Inference Attacks (MIAs) are designed to ascertain whether specific data were utilized during a model’s training phase. As current MIAs for diffusion models typically exploit the model’s image prediction ability, we formalize them into a unified general paradigm which computes the membership score for membership identification. Under this paradigm, we empirically find that existing attacks overlook the inherent deficiency in how diffusion models process high-frequency information. Consequently, this deficiency leads to member data with more high-frequency content being misclassified as hold-out data, and hold-out data with less high-frequency content being misclassified as member data. Moreover, we theoretically demonstrate that this deficiency reduces the membership advantage of attacks, thereby interfering with the effective discrimination of member data and hold-out data. Based on this insight, we propose a plug-and-play high-frequency filter module to mitigate the adverse effects of the deficiency, which can be seamlessly integrated into any attack within this general paradigm without additional time costs. Extensive experiments corroborate that this module significantly improves the performance of baseline attacks across different datasets and models.

Article 1019

Title@2025-05-27 (2): More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives

Title: More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives

Mehr ist nicht immer besser? Viel-Shot-In-Context-Lernen mit differenzierten und neugewichtigen Zielen verbessern

越多越好,越多越好?用差异化和再加权目标,加强多热化的内流学习 2501.04070v3

Authors: Xiaoqing Zhang, Ang Lv, Yuhan Liu, Flood Sung, Wei Liu, Jian Luan, Shuo Shang, Xiuying Chen, Rui Yan

Large language models (LLMs) excel at few-shot in-context learning (ICL) without requiring parameter updates. However, as ICL demonstrations increase from a few to many, performance tends to plateau and eventually decline. We identify two primary causes for this trend: the suboptimal negative log-likelihood (NLL) optimization objective and the incremental data noise. To address these issues, we introduce DrICL, a novel optimization method that enhances model performance through Differentiated and Reweighting objectives. Globally, DrICL utilizes differentiated learning to optimize the NLL objective, ensuring that many-shot performance surpasses zero-shot levels. Locally, it dynamically adjusts the weighting of many-shot demonstrations by leveraging cumulative advantages inspired by reinforcement learning, thereby mitigating the impact of noisy data. Recognizing the lack of multi-task datasets with diverse many-shot distributions, we develop the Many-Shot ICL Benchmark (ICL-50), a large-scale benchmark of 50 tasks that cover shot numbers from 1 to 350 within sequences of up to 8,000 tokens, for both fine-tuning and evaluation purposes. Experimental results demonstrate that LLMs enhanced with DrICL achieve significant improvements in many-shot setups across various tasks, including both in-domain and out-of-domain scenarios. We release the code and dataset, hoping to facilitate further research in many-shot ICL (https://github.com/xiaoqzhwhu/DrICL).

Article 1020

Title@2025-05-27 (2): Double Descent Meets Out-of-Distribution Detection: Theoretical Insights and Empirical Analysis on the role of model complexity

Title: Double Descent Meets Out-of-Distribution Detection: Theoretical Insights and Empirical Analysis on the role of model complexity

Doppelter Abstieg trifft auf Out-of-Distribution Detection: Theoretische Erkenntnisse und empirische Analyse zur Rolle der Modellkomplexität

双重人种与分配外探测:关于模型复杂性作用的理论洞察和经验分析 2411.02184v2

Authors: Mouïn Ben Ammar, David Brellmann, Arturo Mendoza, Antoine Manzanera, Gianni Franchi

Out-of-distribution (OOD) detection is essential for ensuring the reliability and safety of machine learning systems. In recent years, it has received increasing attention, particularly through post-hoc detection and training-based methods. In this paper, we focus on post-hoc OOD detection, which enables identifying OOD samples without altering the model’s training procedure or objective. Our primary goal is to investigate the relationship between model capacity and its OOD detection performance. Specifically, we aim to answer the following question: Does the Double Descent phenomenon manifest in post-hoc OOD detection? This question is crucial, as it can reveal whether overparameterization, which is already known to benefit generalization, can also enhance OOD detection. Despite the growing interest in these topics by the classic supervised machine learning community, this intersection remains unexplored for OOD detection. We empirically demonstrate that the Double Descent effect does indeed appear in post-hoc OOD detection. Furthermore, we provide theoretical insights to explain why this phenomenon emerges in such a setting. Finally, we show that the overparameterized regime does not yield superior results consistently, and we propose a method to identify the optimal regime for OOD detection based on our observations.

Article 1021

Title@2025-05-27 (2): Recovering Fairness Directly from Modularity: a New Way for Fair Community Partitioning

Title: Recovering Fairness Directly from Modularity: a New Way for Fair Community Partitioning

Fairness direkt aus Modularität zu gewinnen: ein neuer Weg für faire Gemeinschaftspartitionierung

直接从模式中恢复公平:公平社区分割的新途径 2505.22684v1

Authors: Yufeng Wang, Yiguang Bai, Tianqing Zhu, Ismail Ben Ayed, Jing Yuan

Community partitioning is crucial in network analysis, with modularity optimization being the prevailing technique. However, traditional modularity-based methods often overlook fairness, a critical aspect in real-world applications. To address this, we introduce protected group networks and propose a novel fairness-modularity metric. This metric extends traditional modularity by explicitly incorporating fairness, and we prove that minimizing it yields naturally fair partitions for protected groups while maintaining theoretical soundness. We develop a general optimization framework for fairness partitioning and design the efficient Fair Fast Newman (FairFN) algorithm, enhancing the Fast Newman (FN) method to optimize both modularity and fairness. Experiments show FairFN achieves significantly improved fairness and high-quality partitions compared to state-of-the-art methods, especially on unbalanced datasets.
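
For reference, the traditional modularity that the proposed metric extends can be computed as follows. This is a minimal sketch of standard Newman modularity for an undirected, unweighted graph; it does not implement the paper's fairness-modularity metric or FairFN, and the function name and input layout are assumptions.

```python
def modularity(edges, community):
    """Standard modularity Q = sum_c [ L_c / m - (d_c / (2m))^2 ].

    edges:     list of undirected (u, v) pairs, each edge listed once.
    community: dict mapping node -> community label.
    L_c is the number of edges inside community c, d_c is the total degree
    of its nodes, and m is the number of edges in the graph.
    """
    m = len(edges)
    intra = {}
    degree = {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(
        intra.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree.items()
    )
```

For two triangles joined by a single bridge edge, splitting at the bridge gives Q = 5/14, while placing every node in one community gives Q = 0.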

Article 1022

Title@2025-05-27 (2): Scattering Networks on Noncommutative Finite Groups

Title: Scattering Networks on Noncommutative Finite Groups

Streunetze für nichtkommutative Finite-Gruppen

关于非调解性有限集团的散射网络 2505.20950v1

Authors: Maria Teresa Arias, Davide Barbieri, Eugenio Hernández

Scattering Networks were initially designed to elucidate the behavior of early layers in Convolutional Neural Networks (CNNs) over Euclidean spaces and are grounded in wavelets. In this work, we introduce a scattering transform on an arbitrary finite group (not necessarily abelian) within the context of group-equivariant convolutional neural networks (G-CNNs). We present wavelets on finite groups and analyze their similarity to classical wavelets. We demonstrate that, under certain conditions on the wavelet coefficients, the scattering transform is non-expansive, stable under deformations, energy-preserving, and equivariant with respect to left and right group translations, and that, as depth increases, the scattering coefficients become less sensitive to group translations of the signal, all desirable properties of convolutional neural networks. Furthermore, we provide examples illustrating the application of the scattering transform to classify data with domains involving abelian and nonabelian groups.

Article 1023

Title@2025-05-27 (2): shapr: Explaining Machine Learning Models with Conditional Shapley Values in R and Python

Title: shapr: Explaining Machine Learning Models with Conditional Shapley Values in R and Python

shapr: Erklären von Machine Learning-Modellen mit bedingten Shapley-Werten in R und Python

Shapr:解释R和Python中带有有条件阴影值的机器学习模型 2504.01842v2

Authors: Martin Jullum, Lars Henry Berge Olsen, Jon Lachmann, Annabelle Redelmeier

This paper introduces the shapr R package, a versatile tool for generating Shapley value based prediction explanations for machine learning and statistical regression models. Moreover, the shaprpy Python library brings the core capabilities of shapr to the Python ecosystem. Shapley values originate from cooperative game theory in the 1950s, but have over the past few years become a widely used method for quantifying how a model’s features/covariates contribute to specific prediction outcomes. The shapr package emphasizes conditional Shapley value estimates, providing a comprehensive range of approaches for accurately capturing feature dependencies – a crucial aspect for correct model explanation, typically lacking in similar software. In addition to regular tabular data, the shapr R package includes specialized functionality for explaining time series forecasts. The package offers a minimal set of user functions with sensible default values for most use cases while providing extensive flexibility for advanced users to fine-tune computations. Additional features include parallelized computations, iterative estimation with convergence detection, and rich visualization tools. shapr also extends its functionality to compute causal and asymmetric Shapley values when causal information is available. Overall, the shapr and shaprpy packages aim to enhance the interpretability of predictive models within a powerful and user-friendly framework.
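
As background on the cooperative-game definition mentioned above, exact Shapley values for a small game can be computed by brute force. This is a minimal sketch that does not use the shapr or shaprpy API; `shapley_values` and the `value` callback are hypothetical names. In model explanation, the players would be features and `value(S)` would be an estimate of the expected prediction given the features in S, which is the quantity conditional methods estimate.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    players: list of hashable player identifiers.
    value:   function mapping a frozenset of players to a number.
    Cost grows factorially with len(players), so this suits small games only.
    """
    phi = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            phi[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in phi.items()}
```

The values always sum to `value(all players) - value(empty set)`, and an interaction term between two players is split equally between them.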

Article 1024

Title@2025-05-27 (2): Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

Title: Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

Zwei Experten sind alles, was Sie zum Lenken Denken brauchen: Kognitive Bemühungen in MoE-Reasoning-Modellen ohne zusätzliches Training verstärken

两位专家是指导思考所需要的两个专家:在没有额外培训的情况下加强教育部理由说明模式中的认知努力 2505.14681v2

Authors: Mengru Wang, Xingyu Chen, Yue Wang, Zhiwei He, Jiahao Xu, Tian Liang, Qiuzhi Liu, Yunzhi Yao, Wenxuan Wang, Ruotian Ma, Haitao Mi, Ningyu Zhang, Zhaopeng Tu, Xiaolong Li, Dong Yu

Mixture-of-Experts (MoE) architectures within Large Reasoning Models (LRMs) have achieved impressive reasoning capabilities by selectively activating experts to facilitate structured cognitive processes. Despite notable advances, existing reasoning models often suffer from cognitive inefficiencies like overthinking and underthinking. To address these limitations, we introduce a novel inference-time steering methodology called Reinforcing Cognitive Experts (RICE), designed to improve reasoning performance without additional training or complex heuristics. Leveraging normalized Pointwise Mutual Information (nPMI), we systematically identify specialized experts, termed “cognitive experts”, that orchestrate meta-level reasoning operations characterized by tokens like ‘‘''. Empirical evaluations with leading MoE-based LRMs (DeepSeek-R1 and Qwen3-235B) on rigorous quantitative and scientific reasoning benchmarks demonstrate noticeable and consistent improvements in reasoning accuracy, cognitive efficiency, and cross-domain generalization. Crucially, our lightweight approach substantially outperforms prevalent reasoning-steering techniques, such as prompt design and decoding constraints, while preserving the model's general instruction-following skills. These results highlight reinforcing cognitive experts as a promising, practical, and interpretable direction to enhance cognitive efficiency within advanced reasoning models.
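
The nPMI score used for expert identification follows a standard definition, nPMI(x, y) = PMI(x, y) / (-log p(x, y)). The sketch below computes it from co-occurrence counts. Treating x as "expert is activated" and y as "token is a reasoning marker" is an assumption made for illustration; the function name and count-based interface are hypothetical.

```python
import math

def npmi(n_xy, n_x, n_y, n_total):
    """Normalized pointwise mutual information in [-1, 1].

    n_xy:    number of positions where both events occur
    n_x:     number of positions where event x occurs
    n_y:     number of positions where event y occurs
    n_total: total number of positions observed
    Returns -1 when the events never co-occur, 0 under independence,
    and 1 when they always occur together.
    """
    if n_xy == 0:
        return -1.0
    p_xy = n_xy / n_total
    if p_xy == 1.0:
        return 1.0  # both events occur everywhere; avoid dividing by log(1) = 0
    p_x = n_x / n_total
    p_y = n_y / n_total
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)
```

Ranking experts by this score against the marker tokens and keeping the top few would be one way to select candidates; the selection rule the paper uses is not given in the abstract.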

Article 1025

Title@2025-05-27 (2): Efficient Spectral Control of Partially Observed Linear Dynamical Systems

Title: Efficient Spectral Control of Partially Observed Linear Dynamical Systems

Effiziente Spektralsteuerung teilweise beobachteter linearer dynamischer Systeme

局部观察线性动态系统的有效光谱控制 2505.20943v1

Authors: Anand Brahmbhatt, Gon Buzaglo, Sofiia Druchyna, Elad Hazan

We propose a new method for the problem of controlling linear dynamical systems under partial observation and adversarial disturbances. Our new algorithm, Double Spectral Control (DSC), matches the best known regret guarantees while exponentially improving runtime complexity over previous approaches in its dependence on the system’s stability margin. Our key innovation is a two-level spectral approximation strategy, leveraging double convolution with a universal basis of spectral filters, enabling efficient and accurate learning of the best linear dynamical controllers.

Article 1026

Title@2025-05-27 (2): Towards Training One-Step Diffusion Models Without Distillation

Title: Towards Training One-Step Diffusion Models Without Distillation

Auf dem Weg zum Training von Ein-Schritt-Diffusionsmodellen ohne Destillation

培训不蒸馏的单级传播模型 2502.08005v3

Authors: Mingtian Zhang, Wenlin Chen, Jiajun He, Zijing Ou, José Miguel Hernández-Lobato, Bernhard Schölkopf, David Barber

Recent advances in training one-step diffusion models typically follow a two-stage pipeline: first training a teacher diffusion model and then distilling it into a one-step student model. This process often depends on both the teacher’s score function for supervision and its weights for initializing the student model. In this paper, we explore whether one-step diffusion models can be trained directly without this distillation procedure. We introduce a family of new training methods that entirely forgo teacher score supervision yet outperform most teacher-guided distillation approaches. This suggests that score supervision is not essential for effective training of one-step diffusion models. However, we find that initializing the student model with the teacher’s weights remains critical. Surprisingly, the key advantage of teacher initialization is not due to better latent-to-output mappings, but rather the rich set of feature representations across different noise levels that the teacher diffusion model provides. These insights take us one step closer towards training one-step diffusion models without distillation and provide a better understanding of the roles of teacher supervision and initialization in the distillation process.

Article 1027

Title@2025-05-27 (2): Revisiting Sparsity Constraint Under High-Rank Property in Partial Multi-Label Learning

Title: Revisiting Sparsity Constraint Under High-Rank Property in Partial Multi-Label Learning

Überprüfung der Sparsamkeitsbeschränkungen unter Hochrangigem Eigentum im Teil-Multi-Label-Lernen

重新审视部分多标签学习中高等级属性下的平等限制 2505.20938v1

Authors: Chongjie Si, Yidan Cui, Fuchao Yang, Xiaokang Yang, Wei Shen

Partial Multi-Label Learning (PML) extends the multi-label learning paradigm to scenarios where each sample is associated with a candidate label set containing both ground-truth labels and noisy labels. Existing PML methods commonly rely on two assumptions: sparsity of the noise label matrix and low-rankness of the ground-truth label matrix. However, these assumptions are inherently conflicting and impractical for real-world scenarios, where the true label matrix is typically full-rank or close to full-rank. To address these limitations, we demonstrate that the sparsity constraint contributes to the high-rank property of the predicted label matrix. Based on this, we propose a novel method Schirn, which introduces a sparsity constraint on the noise label matrix while enforcing a high-rank property on the predicted label matrix. Extensive experiments demonstrate the superior performance of Schirn compared to state-of-the-art methods, validating its effectiveness in tackling real-world PML challenges.

Article 1028

Title@2025-05-27 (2): EPIC: Efficient Position-Independent Caching for Serving Large Language Models

Title: EPIC: Efficient Position-Independent Caching for Serving Large Language Models

EPIC: Effizientes positionsunabhängiges Caching für das Servieren großer Sprachmodelle

EPIC: 高效的、独立定位的为大语言模式服务的工作 2410.15332v3

Authors: Junhao Hu, Wenrui Huang, Weidong Wang, Haoyi Wang, Tiancheng Hu, Qin Zhang, Hao Feng, Xusheng Chen, Yizhou Shan, Tao Xie

Large Language Models (LLMs) show great capabilities in a wide range of applications, but serving them efficiently becomes increasingly challenging as requests (prompts) become more complex. Context caching improves serving performance by reusing Key-Value (KV) vectors, the intermediate representations of tokens that are repeated across requests. However, existing context caching requires exact prefix matches across requests, limiting reuse cases in settings such as few-shot learning and retrieval-augmented generation, where immutable content (e.g., documents) remains unchanged across requests but is preceded by varying prefixes. Position-Independent Caching (PIC) addresses this issue by enabling modular reuse of the KV vectors regardless of prefixes. We formalize PIC and advance prior work by introducing EPIC, a serving system incorporating our new LegoLink algorithm, which mitigates the inappropriate “attention sink” effect at every document beginning, to maintain accuracy with minimal computation. Experiments show that EPIC achieves up to 8x improvements in Time-To-First-Token (TTFT) and 7x throughput gains over existing systems, with negligible or no accuracy loss.

Article 1029

Title@2025-05-27 (2): Linear Bandits with Non-i.i.d. Noise

Title: Linear Bandits with Non-i.i.d. Noise

Lineare Banditen mit Non-i.i.d. Lärm

带有非i.i.d. 噪音的线形强盗 2505.20017v2

Authors: Baptiste Abélès, Eugenio Clerico, Hamish Flynn, Gergely Neu

We study the linear stochastic bandit problem, relaxing the standard i.i.d. assumption on the observation noise. As an alternative to this restrictive assumption, we allow the noise terms across rounds to be sub-Gaussian but interdependent, with dependencies that decay over time. To address this setting, we develop new confidence sequences using a recently introduced reduction scheme to sequential probability assignment, and use these to derive a bandit algorithm based on the principle of optimism in the face of uncertainty. We provide regret bounds for the resulting algorithm, expressed in terms of the decay rate of the strength of dependence between observations. Among other results, we show that our bounds recover the standard rates up to a factor of the mixing time for geometrically mixing observation noise.

Article 1030

Title@2025-05-27 (2): NatADiff: Adversarial Boundary Guidance for Natural Adversarial Diffusion

Title: NatADiff: Adversarial Boundary Guidance for Natural Adversarial Diffusion

NatADiff: Adversariale Grenzführung für natürliche Adversariale Diffusion

NatadADiff: 自然反向扩散反向边界指南 2505.20934v1

Authors: Max Collins, Jordan Vice, Tim French, Ajmal Mian

Adversarial samples exploit irregularities in the manifold “learned” by deep learning models to cause misclassifications. The study of these adversarial samples provides insight into the features a model uses to classify inputs, which can be leveraged to improve robustness against future attacks. However, much of the existing literature focuses on constrained adversarial samples, which do not accurately reflect test-time errors encountered in real-world settings. To address this, we propose “NatADiff”, an adversarial sampling scheme that leverages denoising diffusion to generate natural adversarial samples. Our approach is based on the observation that natural adversarial samples frequently contain structural elements from the adversarial class. Deep learning models can exploit these structural elements to shortcut the classification process, rather than learning to genuinely distinguish between classes. To leverage this behavior, we guide the diffusion trajectory towards the intersection of the true and adversarial classes, combining time-travel sampling with augmented classifier guidance to enhance attack transferability while preserving image fidelity. Our method achieves comparable attack success rates to current state-of-the-art techniques, while exhibiting significantly higher transferability across model architectures and better alignment with natural test-time errors as measured by FID. These results demonstrate that NatADiff produces adversarial samples that not only transfer more effectively across models, but more faithfully resemble naturally occurring test-time errors.

Article 1031

Title@2025-05-27 (2): MLMC-based Resource Adequacy Assessment with Active Learning Trained Surrogate Models

Title: MLMC-based Resource Adequacy Assessment with Active Learning Trained Surrogate Models

MLMC-basierte Ressourcenadäquatitätsbewertung mit aktiven Learning-Trained-Surrogate-Modellen

以MLMC为基础的基于MLMC的资源充足性评估,与积极学习、经过培训的代用模型进行资源充足性评估 2505.20930v1

Authors: Ruiqi Zhang, Simon H. Tindemans

Multilevel Monte Carlo (MLMC) is a flexible and effective variance reduction technique for accelerating reliability assessments of complex power systems. Recently, data-driven surrogate models have been proposed as lower-level models in the MLMC framework due to their high correlation and negligible execution time once trained. However, in resource adequacy assessments, pre-labeled datasets are typically unavailable. For large-scale systems, the efficiency gains from surrogate models are often offset by the substantial time required for labeling training data. Therefore, this paper introduces a speed metric that accounts for training time in evaluating MLMC efficiency. Considering the total time budget is limited, a vote-by-committee active learning approach is proposed to reduce the required labeling calls. A case study demonstrates that, within practical variance thresholds, active learning enables significantly improved MLMC efficiency with reduced training effort, compared to regular surrogate modelling approaches.
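
The role of a surrogate as the lower level can be seen in a two-level estimator, the simplest MLMC case. This is a minimal sketch under assumed names (`two_level_mc`, `f_high`, `f_low`, `sample_input`); it omits the paper's speed metric, training-time accounting, and active learning.

```python
import random

def two_level_mc(sample_input, f_high, f_low, n_low, n_pair, rng):
    """Two-level Monte Carlo estimate of E[f_high(X)].

    Uses the telescoping identity E[f_high] = E[f_low] + E[f_high - f_low].
    The cheap surrogate f_low is averaged over many samples (n_low); the
    correction term needs only a few paired samples (n_pair) because its
    variance is small when f_low is highly correlated with f_high.
    """
    low_mean = sum(f_low(sample_input(rng)) for _ in range(n_low)) / n_low
    correction = 0.0
    for _ in range(n_pair):
        x = sample_input(rng)  # the same input feeds both models
        correction += f_high(x) - f_low(x)
    return low_mean + correction / n_pair
```

The estimator stays unbiased for any surrogate; a poor surrogate only raises the variance of the correction term and therefore the number of expensive `f_high` calls needed.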

Article 1032

Title@2025-05-27 (2): Label Leakage in Federated Inertial-based Human Activity Recognition

Title: Label Leakage in Federated Inertial-based Human Activity Recognition

Label-Leakage in Föderated Inertial-based Human Activity Recognition

以联邦为本的人类活动确认中联邦内地人类活动确认中的Label渗漏 2505.20924v1

Authors: Marius Bock, Maximilian Hopp, Kristof Van Laerhoven, Michael Moeller

While prior work has shown that Federated Learning updates can leak sensitive information, label reconstruction attacks, which aim to recover input labels from shared gradients, have not yet been examined in the context of Human Activity Recognition (HAR). Given the sensitive nature of activity labels, this study evaluates the effectiveness of state-of-the-art gradient-based label leakage attacks on HAR benchmark datasets. Our findings show that the number of activity classes, sampling strategy, and class imbalance are critical factors influencing the extent of label leakage, with reconstruction accuracies reaching up to 90% on two benchmark datasets, even for trained models. Moreover, we find that Local Differential Privacy techniques such as gradient noise and clipping offer only limited protection, as certain attacks still reliably infer both majority and minority class labels. We conclude by offering practical recommendations for the privacy-aware deployment of federated HAR systems and identify open challenges for future research. Code to reproduce our experiments is publicly available via github.com/mariusbock/leakage_har.
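
Why shared gradients can reveal labels is easiest to see in the single-sample case. For softmax cross-entropy, the gradient with respect to the output-layer bias equals softmax(z) minus the one-hot label, so its only negative entry sits at the true class. The sketch below illustrates that known property; it is not one of the attacks evaluated in the paper, which target batches and trained models, and the function names are hypothetical.

```python
import math

def bias_gradient(logits, label):
    """Gradient of softmax cross-entropy w.r.t. the output bias for one sample.

    Equals softmax(logits) - one_hot(label).
    """
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s - (1.0 if i == label else 0.0) for i, e in enumerate(exps)]

def infer_label(grad):
    """Recover the label as the index of the smallest (negative) entry."""
    return min(range(len(grad)), key=grad.__getitem__)
```

With batches, the per-sample gradients are averaged, which is why the number of classes, the sampling strategy, and class imbalance affect how much can be recovered.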

Article 1033

Title@2025-05-27 (2): Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective

Title: Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective

Multi-Agenten-Weltmodellierung aus einer diffusionsinspirierten Perspektive Revue passieren

从传播启发的视角重新审视多股权世界建模 2505.20922v1

Authors: Yang Zhang, Xinran Li, Jianing Ye, Delin Qu, Shuang Qiu, Chongjie Zhang, Xiu Li, Chenjia Bai

World models have recently attracted growing interest in Multi-Agent Reinforcement Learning (MARL) due to their ability to improve sample efficiency for policy learning. However, accurately modeling environments in MARL is challenging due to the exponentially large joint action space and highly uncertain dynamics inherent in multi-agent systems. To address this, we reduce modeling complexity by shifting from jointly modeling the entire state-action transition dynamics to focusing on the state space alone at each timestep through sequential agent modeling. Specifically, our approach enables the model to progressively resolve uncertainty while capturing the structured dependencies among agents, providing a more accurate representation of how agents influence the state. Interestingly, this sequential revelation of agents’ actions in a multi-agent system aligns with the reverse process in diffusion models, a class of powerful generative models known for their expressiveness and training stability compared to autoregressive or latent variable models. Leveraging this insight, we develop a flexible and robust world model for MARL using diffusion models. Our method, Diffusion-Inspired Multi-Agent world model (DIMA), achieves state-of-the-art performance across multiple multi-agent control benchmarks, including MAMuJoCo and Bi-DexHands, significantly outperforming prior world models in terms of final return and sample efficiency. DIMA establishes a new paradigm for constructing multi-agent world models, advancing the frontier of MARL research.

Article 1034

Title@2025-05-27 (2): Humble AI in the real-world: the case of algorithmic hiring

Title: Humble AI in the real-world: the case of algorithmic hiring

Humble KI in der realen Welt: der Fall der algorithmischen Einstellung

现实世界中的黄土人工智能:算法雇用案例 2505.20918v1

Authors: Rahul Nair, Inge Vejsbjerg, Elizabeth Daly, Christos Varytimidis, Bran Knowles

Humble AI (Knowles et al., 2023) argues for cautiousness in AI development and deployments through scepticism (accounting for limitations of statistical learning), curiosity (accounting for unexpected outcomes), and commitment (accounting for multifaceted values beyond performance). We present a real-world case study for humble AI in the domain of algorithmic hiring. Specifically, we evaluate virtual screening algorithms in a widely used hiring platform that matches candidates to job openings. There are several challenges in misrecognition and stereotyping in such contexts that are difficult to assess through standard fairness and trust frameworks; e.g., someone with a non-traditional background is less likely to rank highly. We demonstrate technical feasibility of how humble AI principles can be translated to practice through uncertainty quantification of ranks, entropy estimates, and a user experience that highlights algorithmic unknowns. We describe preliminary discussions with focus groups made up of recruiters. Future user studies seek to evaluate whether the higher cognitive load of a humble AI system fosters a climate of trust in its outcomes.

Article 1035

Title@2025-05-27 (2): A Kernelised Stein Discrepancy for Assessing the Fit of Inhomogeneous Random Graph Models

Title: A Kernelised Stein Discrepancy for Assessing the Fit of Inhomogeneous Random Graph Models

Eine zerkleinerte Stein-Diskrepanz für die Beurteilung der Passform von inhomogenen Zufallsgraphenmodellen

用于评估不相容随机图模型是否适合的内核化石 Stein 差异性评估 2505.21580v1

Authors: Anum Fatima, Gesine Reinert

Complex data are often represented as a graph, which in turn can often be viewed as a realisation of a random graph model, such as an inhomogeneous random graph (IRG) model. For general fast goodness-of-fit tests in high dimensions, kernelised Stein discrepancy (KSD) tests are a powerful tool. Here, we develop, test, and analyse a KSD-type goodness-of-fit test for IRG models that can be carried out with a single observation of the network. The test is applicable to a network of any size and does not depend on the asymptotic distribution of the test statistic. We also provide theoretical guarantees.

Article 1036

Title@2025-05-27 (2): Exploring the Boundary of Diffusion-based Methods for Solving Constrained Optimization

Title: Exploring the Boundary of Diffusion-based Methods for Solving Constrained Optimization

Erforschung der Grenzen von diffusionsbasierten Methoden zur Lösung eingeschränkter Optimierung

探索以传播为基础的解决受限制的优化的解决方法的界限 2502.10330v3

Authors: Shutong Ding, Yimiao Zhou, Ke Hu, Xi Yao, Junchi Yan, Xiaoying Tang, Ye Shi

Diffusion models have achieved remarkable success in generative tasks such as image and video synthesis, and in control domains like robotics, owing to their strong generalization capabilities and proficiency in fitting complex multimodal distributions. However, their full potential in solving Continuous Constrained Optimization problems remains largely underexplored. Our work begins by investigating a two-dimensional constrained quadratic optimization problem as an illustrative example, exploring the inherent challenges of applying diffusion models to such optimization tasks and providing theoretical analyses of these observations. To address the identified gaps and harness diffusion models for Continuous Constrained Optimization, we build upon this analysis to propose a novel diffusion-based framework for optimization problems called DiOpt. This framework operates in two distinct phases: an initial warm-start phase, implemented via supervised learning, followed by a bootstrapping phase. This dual-phase architecture is designed to iteratively refine solutions, thereby improving the objective function while rigorously satisfying problem constraints. Finally, multiple candidate solutions are sampled, and the optimal one is selected through a screening process. We present extensive experiments detailing the training dynamics of DiOpt, its performance across a diverse set of Continuous Constrained Optimization problems, and an analysis of the impact of DiOpt’s various hyperparameters.

Article 1037

Title@2025-05-27 (2): A data augmentation strategy for deep neural networks with application to epidemic modelling

Title: A data augmentation strategy for deep neural networks with application to epidemic modelling

Eine Datenvergrößerungsstrategie für tiefe neuronale Netzwerke mit Anwendung in der Epidemiemodellierung

用于流行病建模的深层神经网络数据增强战略 2502.21033v2

Authors: Muhammad Awais, Abu Safyan Ali, Giacomo Dimarco, Federica Ferrarese, Lorenzo Pareschi

In this work, we integrate the predictive capabilities of compartmental disease dynamics models with the ability of machine learning to analyze complex, high-dimensional data and uncover patterns that conventional models may overlook. Specifically, we present a proof of concept demonstrating the application of data-driven methods and deep neural networks to a recently introduced Susceptible-Infected-Recovered type model with social features, including a saturated incidence rate, to improve epidemic prediction and forecasting. Our results show that a robust data augmentation strategy through suitable data-driven models can improve the reliability of Feed-Forward Neural Networks and Nonlinear Autoregressive Networks, providing a complementary strategy to Physics-Informed Neural Networks, particularly in settings where data augmentation from mechanistic models can enhance learning. This approach enhances the ability to handle nonlinear dynamics and offers scalable, data-driven solutions for epidemic forecasting, prioritizing predictive accuracy over the constraints of physics-based models. Numerical simulations of the lockdown and post-lockdown phase of the COVID-19 epidemic in Italy and Spain validate our methodology.

Article 1038

Title@2025-05-27 (2): “Oh LLM, I’m Asking Thee, Please Give Me a Decision Tree”: Zero-Shot Decision Tree Induction and Embedding with Large Language Models

Title: “Oh LLM, I’m Asking Thee, Please Give Me a Decision Tree”: Zero-Shot Decision Tree Induction and Embedding with Large Language Models

“Oh LLM, ich frage dich, bitte gib mir einen Entscheidungsbaum”: Nullschnelle Entscheidungsbauminduktion und Einbettung mit großen Sprachmodellen

“哦,LLM,我问你,请给我一棵决定树”: “零热决定树上演和嵌入大语言模型” 2409.18594v2

Authors: Ricardo Knauer, Mario Koddenbrock, Raphael Wallsberger, Nicholas M. Brisson, Georg N. Duda, Deborah Falla, David W. Evans, Erik Rodner

Large language models (LLMs) provide powerful means to leverage prior knowledge for predictive modeling when data is limited. In this work, we demonstrate how LLMs can use their compressed world knowledge to generate intrinsically interpretable machine learning models, i.e., decision trees, without any training data. We find that these zero-shot decision trees can even surpass data-driven trees on some small-sized tabular datasets and that embeddings derived from these trees perform better than data-driven tree-based embeddings on average. Our decision tree induction and embedding approaches can therefore serve as new knowledge-driven baselines for data-driven machine learning methods in the low-data regime. Furthermore, they offer ways to harness the rich world knowledge within LLMs for tabular machine learning tasks. Our code and results are available at https://github.com/ml-lab-htw/llm-trees.

Article 1039

Title@2025-05-27 (2): Music Foundation Model as Generic Booster for Music Downstream Tasks

Title: Music Foundation Model as Generic Booster for Music Downstream Tasks

Music Foundation Modell als Generic Booster für Downstream-Aufgaben

音乐基金会模式,作为音乐下流任务通用推进器 2411.01135v3

Authors: WeiHsiang Liao, Yuhta Takida, Yukara Ikemiya, Zhi Zhong, Chieh-Hsin Lai, Giorgio Fabbro, Kazuki Shimada, Keisuke Toyama, Kinwai Cheuk, Marco A. Martínez-Ramírez, Shusuke Takahashi, Stefan Uhlich, Taketo Akama, Woosung Choi, Yuichiro Koyama, Yuki Mitsufuji

We demonstrate the efficacy of using intermediate representations from a single foundation model to enhance various music downstream tasks. We introduce SoniDo, a music foundation model (MFM) designed to extract hierarchical features from target music samples. By leveraging hierarchical intermediate features, SoniDo constrains the information granularity, leading to improved performance across various downstream tasks including both understanding and generative tasks. We specifically evaluated this approach on representative tasks such as music tagging, music transcription, music source separation, and music mixing. Our results reveal that the features extracted from foundation models provide valuable enhancements in training downstream task models. This highlights the capability of using features extracted from music foundation models as a booster for downstream tasks. Our approach not only benefits existing task-specific models but also supports music downstream tasks constrained by data scarcity. This paves the way for more effective and accessible music processing solutions.

Article 1040

Title@2025-05-27 (2): Simple Relative Deviation Bounds for Covariance and Gram Matrices

Title: Simple Relative Deviation Bounds for Covariance and Gram Matrices

Einfache relative Abweichungen für Kovarianz und Gram Matrices

常数和小数母体的简单相对偏差宽度 2410.05754v3

Authors: Daniel Barzilai, Ohad Shamir

We provide non-asymptotic, relative deviation bounds for the eigenvalues of empirical covariance and Gram matrices in general settings. Unlike typical uniform bounds, which may fail to capture the behavior of smaller eigenvalues, our results provide sharper control across the spectrum. Our analysis is based on a general-purpose theorem that allows one to convert existing uniform bounds into relative ones. The theorems and techniques emphasize simplicity and should be applicable across various settings.

Article 1041

Title: Enhancing Performance of Explainable AI Models with Constrained Concept Refinement

Leistungssteigerung erklärbarer KI-Modelle mit eingeschränkter Konzeptverfeinerung

增强可解释的AI 概念改进模型的绩效 2502.06775v2

Authors: Geyu Liang, Senne Michielssen, Salar Fattahi

The trade-off between accuracy and interpretability has long been a challenge in machine learning (ML). This tension is particularly significant for emerging interpretable-by-design methods, which aim to redesign ML algorithms for trustworthy interpretability but often sacrifice accuracy in the process. In this paper, we address this gap by investigating the impact of deviations in concept representations (an essential component of interpretable models) on prediction performance and propose a novel framework to mitigate these effects. The framework builds on the principle of optimizing concept embeddings under constraints that preserve interpretability. Using a generative model as a test-bed, we rigorously prove that our algorithm achieves zero loss while progressively enhancing the interpretability of the resulting model. Additionally, we evaluate the practical performance of our proposed framework in generating explainable predictions for image classification tasks across various benchmarks. Compared to existing explainable methods, our approach not only improves prediction accuracy while preserving model interpretability across various large-scale benchmarks but also achieves this with significantly lower computational cost.

Article 1042

Title@2025-05-27 (2): Achieving binary weight and activation for LLMs using Post-Training Quantization

Title: Achieving binary weight and activation for LLMs using Post-Training Quantization

Erreichen des binären Gewichts und Aktivierung für LLMs mit Post-Training Quantization

利用培训后量化办法使LLMMs实现二进制加权和激活 2504.05352v2

Authors: Siqing Song, Chuang Wang, Ruiqi Wang, Yi Yang, Xuyao Zhang

Quantizing large language models (LLMs) to 1-bit precision significantly reduces computational costs, but existing quantization techniques suffer from noticeable performance degradation when using weight and activation precisions below 4 bits (W4A4). In this paper, we propose a post-training quantization framework with W(1+1)A(1*4) configuration, where weights are quantized to 1 bit with an additional 1 bit for fine-grained grouping and activations are quantized to 1 bit with a 4-fold increase in the number of channels. For weight quantization, we propose utilizing Hessian-aware fine-grained grouping along with an EM-based quantization scheme. For activation quantization, we equivalently decompose INT4-quantized activations into a 4 * INT1 format and simultaneously smooth the scaling factors based on quantization errors, which further reduces the quantization errors in activations. Our method surpasses state-of-the-art (SOTA) LLM quantization baselines on W2A4 across multiple tasks, pushing the boundaries of existing LLM quantization methods toward fully binarized models. Code is available at https://github.com/JimmyCrave/LLM-PTQ-binarization.
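
The equivalence between one INT4 activation and four INT1 activations can be illustrated with bit planes. This sketch assumes unsigned 4-bit activations and ±1 weights, and it omits the scaling-factor smoothing described above; the function names are hypothetical and not taken from the linked repository.

```python
import numpy as np

def int4_to_bitplanes(x):
    """Split unsigned 4-bit integers into 4 binary planes (least significant first)."""
    x = np.asarray(x, dtype=np.uint8)
    return np.stack([(x >> k) & 1 for k in range(4)], axis=0)

def matvec_via_bitplanes(W, x):
    """Compute W @ x using only binary activations, rescaled by powers of two."""
    planes = int4_to_bitplanes(x)   # shape (4, n): four times as many binary channels
    return sum((1 << k) * (W @ planes[k]) for k in range(4))

rng = np.random.default_rng(0)
W = rng.choice([-1, 1], size=(3, 8))   # hypothetical binarized weights
x = rng.integers(0, 16, size=8)        # INT4 activations in [0, 15]
direct = W @ x
binary = matvec_via_bitplanes(W, x)
```

Because x = Σ 2^k b_k with each b_k in {0, 1}, the two results agree exactly, which is the sense in which the decomposition is lossless.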

Article 1043

Title@2025-05-27 (2): Frequency-Aware Masked Autoencoders for Human Activity Recognition using Accelerometers

Title: Frequency-Aware Masked Autoencoders for Human Activity Recognition using Accelerometers

Frequency-Aware Maskierte Autoencoder für die Erkennung menschlicher Aktivität mit Beschleunigungsmessern

使用加速计识别人类活动的频率软件 2502.17477v2

Authors: Niels R. Lorenzen, Poul J. Jennum, Emmanuel Mignot, Andreas Brink-Kjaer

Wearable accelerometers are widely used for continuous monitoring of physical activity. Supervised machine learning and deep learning algorithms have long been used to extract meaningful activity information from raw accelerometry data, but progress has been hampered by the limited amount of labeled data that is publicly available. Exploiting large unlabeled datasets using self-supervised pretraining is a relatively new and underexplored approach in the field of human activity recognition (HAR). We used a time-series transformer masked autoencoder (MAE) approach to self-supervised pretraining and propose two novel spectrogram-based loss functions: the log-scale mean magnitude (LMM) and log-scale magnitude variance (LMV) losses. We compared these losses with the mean squared error (MSE) loss for MAE training. We leveraged the large unlabeled UK Biobank accelerometry dataset (n = 109k) for pretraining and evaluated downstream HAR performance using a linear classifier in a smaller labeled dataset. We found that pretraining with the LMM loss improved performance compared to an MAE pretrained with the MSE loss, with a 12.7% increase in subject-wise F1 score when using linear probing. Compared with a state-of-the-art ResNet-based HAR model, our LMM-pretrained transformer models performed better (+9.8% F1) with linear probing and comparably when fine-tuned using an LSTM classifier. The addition of the LMV to the LMM loss decreased performance compared to the LMM loss alone. These findings establish the LMM loss as a robust and effective method for pretraining MAE models on accelerometer data for HAR and show the potential of pretraining sequence-based models for free-living HAR.

Article 1044

Title@2025-05-27 (2): How Do Transformers Learn Variable Binding in Symbolic Programs?

Title: How Do Transformers Learn Variable Binding in Symbolic Programs?

Wie lernen Transformer variable Bindungen in Symbolischen Programmen?

变换者如何在符号程序中学习变数绑定 ? 2505.20896v1

Authors: Yiwei Wu, Atticus Geiger, Raphaël Millière

Variable binding – the ability to associate variables with values – is fundamental to symbolic computation and cognition. Although classical architectures typically implement variable binding via addressable memory, it is not well understood how modern neural networks lacking built-in binding operations may acquire this capacity. We investigate this by training a Transformer to dereference queried variables in symbolic programs where variables are assigned either numerical constants or other variables. Each program requires following chains of variable assignments up to four steps deep to find the queried value, and also contains irrelevant chains of assignments acting as distractors. Our analysis reveals a developmental trajectory with three distinct phases during training: (1) random prediction of numerical constants, (2) a shallow heuristic prioritizing early variable assignments, and (3) the emergence of a systematic mechanism for dereferencing assignment chains. Using causal interventions, we find that the model learns to exploit the residual stream as an addressable memory space, with specialized attention heads routing information across token positions. This mechanism allows the model to dynamically track variable bindings across layers, resulting in accurate dereferencing. Our results show how Transformer models can learn to implement systematic variable binding without explicit architectural support, bridging connectionist and symbolic approaches.
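
The dereferencing task itself is simple to state in code. The sketch below is a reference solver for an assumed `name=value` program format (the paper's actual token format is not given here); it shows the symbolic behaviour the Transformer has to learn, not the learned mechanism.

```python
def dereference(program, query, max_depth=4):
    """Follow a chain of assignments until a numerical constant is reached."""
    env = {}
    for line in program:
        name, value = line.split("=")
        env[name] = value
    current = query
    for _ in range(max_depth):
        current = env[current]
        if current.isdigit():
            return int(current)
    raise ValueError("chain deeper than max_depth")

# Hypothetical program: the chain d -> c -> b -> a -> 7 is four steps deep,
# while x and y form an irrelevant distractor chain.
program = ["a=7", "b=a", "x=3", "c=b", "y=x", "d=c"]
answer = dereference(program, "d")
```

A shallow heuristic such as "return the first constant in the program" would also answer 7 here, which illustrates why distractor chains and varied orderings are needed to distinguish it from systematic dereferencing.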

Article 1045

Title@2025-05-27 (2): DeepConvContext: A Multi-Scale Approach to Timeseries Classification in Human Activity Recognition

Title: DeepConvContext: A Multi-Scale Approach to Timeseries Classification in Human Activity Recognition

DeepConvContext: Ein mehrstufiger Ansatz zur Zeitreihenklassifizierung in der Anerkennung menschlicher Aktivität

深刻信念:人类活动确认中的时间序列分类的多比额表办法 2505.20894v1

Authors: Marius Bock, Michael Moeller, Kristof Van Laerhoven

Despite recognized limitations in modeling long-range temporal dependencies, Human Activity Recognition (HAR) has traditionally relied on a sliding window approach to segment labeled datasets. Deep learning models like the DeepConvLSTM typically classify each window independently, thereby restricting learnable temporal context to within-window information. To address this constraint, we propose DeepConvContext, a multi-scale time series classification framework for HAR. Drawing inspiration from the vision-based Temporal Action Localization community, DeepConvContext models both intra- and inter-window temporal patterns by processing sequences of time-ordered windows. Unlike recent HAR models that incorporate attention mechanisms, DeepConvContext relies solely on LSTMs – with ablation studies demonstrating the superior performance of LSTMs over attention-based variants for modeling inertial sensor data. Across six widely-used HAR benchmarks, DeepConvContext achieves an average 10% improvement in F1-score over the classic DeepConvLSTM, with gains of up to 21%. Code to reproduce our experiments is publicly available via github.com/mariusbock/context_har.

Article 1046

Title@2025-05-27 (2): One-Time Soft Alignment Enables Resilient Learning without Weight Transport

Title: One-Time Soft Alignment Enables Resilient Learning without Weight Transport

One-Time Soft Alignment ermöglicht resilientes Lernen ohne Gewicht Transport

一次性软对齐使有弹性的学习无需体力运输 2505.20892v1

Authors: Jeonghwan Cheon, Jaehyuk Bae, Se-Bum Paik

Backpropagation is the cornerstone of deep learning, but its reliance on symmetric weight transport and global synchronization makes it computationally expensive and biologically implausible. Feedback alignment offers a promising alternative by approximating error gradients through fixed random feedback, thereby avoiding symmetric weight transport. However, this approach often struggles with poor learning performance and instability, especially in deep networks. Here, we show that a one-time soft alignment between forward and feedback weights at initialization enables deep networks to achieve performance comparable to backpropagation, without requiring weight transport during learning. This simple initialization condition guides stable error minimization in the loss landscape, improving network trainability. Spectral analyses further reveal that initial alignment promotes smoother gradient flow and convergence to flatter minima, resulting in better generalization and robustness. Notably, we also find that allowing moderate deviations from exact weight symmetry can improve adversarial robustness compared to standard backpropagation. These findings demonstrate that a simple initialization strategy can enable effective learning in deep networks in a biologically plausible and resource-efficient manner.

Article 1047

Title@2025-05-27 (2): ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention

Title: ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention

ComplexEhemaliger: Disruptived Advance Transformer Inferenz-Fähigkeit über Head-Specific Complex Vector Achtung

复杂形式:通过头部特定复杂矢量的注意,干扰推进变压器推断能力 2505.10222v2

Authors: Jintian Shao, Hongyi Huang, Jiayi Wu, Beiwen Zhang, ZhiYu Wu, You Shan, MingKai Zheng

Transformer models rely on self-attention to capture token dependencies but face challenges in effectively integrating positional information while allowing multi-head attention (MHA) flexibility. Prior methods often model semantic and positional differences disparately or apply uniform positional adjustments across heads, potentially limiting representational capacity. This paper introduces ComplexFormer, featuring Complex Multi-Head Attention (CMHA). CMHA empowers each head to independently model semantic and positional differences unified within the complex plane, representing interactions as rotations and scaling. ComplexFormer incorporates two key improvements: (1) a per-head Euler transformation, converting real-valued query/key projections into polar-form complex vectors for head-specific complex subspace operation; and (2) a per-head adaptive differential rotation mechanism, $\exp[i(\mathrm{Adapt}(AS_{mn,i}) + \Delta(P_{mn})_{,i})]$, allowing each head to learn distinct strategies for integrating semantic angle differences ($AS_{mn,i}$) with relative positional encodings ($\Delta(P_{mn})_{,i}$). Extensive experiments on language modeling, text generation, code generation, and mathematical reasoning show ComplexFormer achieves superior performance, significantly lower generation perplexity, and improved long-context coherence compared to strong baselines like RoPE-Transformers. ComplexFormer demonstrates strong parameter efficiency, offering a more expressive, adaptable attention mechanism.

Article 1048

Title@2025-05-27 (2): Power-Law Decay Loss for Large Language Model Finetuning: Focusing on Information Sparsity to Enhance Generation Quality

Title: Power-Law Decay Loss for Large Language Model Finetuning: Focusing on Information Sparsity to Enhance Generation Quality

Macht-Rechts-Dekay-Verlust für große Sprachmodell Finetuning: Fokussierung auf Informationssparsität zur Verbesserung der Generationsqualität

大语言模型调整的功率法减退损失:侧重于信息平等以提高世代质量 2505.16900v3

Authors: Jintian Shao, Yiming Cheng, Hongyi Huang, Jiayi Wu, Beiwen Zhang, Zhiyu Wu, You Shan, Mingkai Zheng

During the finetuning stage of text generation tasks, standard cross-entropy loss treats all tokens equally. This can lead models to overemphasize high-frequency, low-information tokens, neglecting lower-frequency tokens crucial for specificity and informativeness in generated content. This paper introduces a novel loss function, Power-Law Decay Loss (PDL), specifically designed to optimize the finetuning process for text generation. The core motivation for PDL stems from observations in information theory and linguistics: the informativeness of a token is often inversely proportional to its frequency of occurrence. PDL re-weights the contribution of each token in the standard cross-entropy loss based on its frequency in the training corpus, following a power-law decay. Specifically, the weights for high-frequency tokens are reduced, while low-frequency, information-dense tokens are assigned higher weights. This mechanism guides the model during finetuning to focus more on learning and generating tokens that convey specific and unique information, thereby enhancing the quality, diversity, and informativeness of the generated text. We theoretically elaborate on the motivation and construction of PDL and discuss its potential applications and advantages across various text generation finetuning tasks, such as abstractive summarization, dialogue systems, and style transfer.
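
A frequency-based re-weighting of cross-entropy can be sketched as follows. The abstract does not give the exact formula, so the weight form `1 / (freq + eps) ** alpha`, the normalization by total weight, and the values of `alpha` and `eps` are assumptions made for illustration only.

```python
import numpy as np

def power_law_weights(token_freq, alpha=0.5, eps=1e-8):
    """Per-token weights that decay as a power law of corpus frequency (assumed form)."""
    return 1.0 / (np.asarray(token_freq, dtype=float) + eps) ** alpha

def weighted_cross_entropy(log_probs, targets, weights):
    """Frequency-weighted negative log-likelihood, normalized by the total weight."""
    nll = -log_probs[np.arange(len(targets)), targets]
    w = weights[targets]
    return float((w * nll).sum() / w.sum())

token_freq = np.array([0.60, 0.30, 0.09, 0.01])   # hypothetical 4-token vocabulary
weights = power_law_weights(token_freq)
log_probs = np.log(np.full((3, 4), 0.25))         # uniform predictions at 3 positions
targets = np.array([0, 3, 3])
loss = weighted_cross_entropy(log_probs, targets, weights)
```

With `alpha = 0` every weight is 1 and the loss reduces to standard cross-entropy; larger `alpha` shifts more of the gradient toward rare tokens.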

Article 1049

Title@2025-05-27 (2): Towards Analyzing and Understanding the Limitations of VAPO: A Theoretical Perspective

Title: Towards Analyzing and Understanding the Limitations of VAPO: A Theoretical Perspective

Auf dem Weg zur Analyse und dem Verständnis der Grenzen von VAPO: Eine theoretische Perspektive

分析和理解VAPO的局限性:理论视角 2505.17997v2

Authors: Jintian Shao, Yiming Cheng, Hongyi Huang, Beiwen Zhang, Zhiyu Wu, You Shan, Mingkai Zheng

The VAPO framework has demonstrated significant empirical success in enhancing the efficiency and reliability of reinforcement learning for long chain-of-thought (CoT) reasoning tasks with large language models (LLMs). By systematically addressing challenges such as value model bias, heterogeneous sequence lengths, and sparse reward signals, VAPO achieves state-of-the-art performance. While its practical benefits are evident, a deeper theoretical understanding of its underlying mechanisms and potential limitations is crucial for guiding future advancements. This paper aims to initiate such a discussion by exploring VAPO from a theoretical perspective, highlighting areas where its assumptions might be challenged and where further investigation could yield more robust and generalizable reasoning agents. We delve into the intricacies of value function approximation in complex reasoning spaces, the optimality of adaptive advantage estimation, the impact of token-level optimization, and the enduring challenges of exploration and generalization.

Article 1050

Title: Fedivertex: a Graph Dataset based on Decentralized Social Networks for Trustworthy Machine Learning

Fedivertex: ein Graph Dataset auf Basis dezentralisierter sozialer Netzwerke für vertrauenswürdiges maschinelles Lernen

Fedivertex:基于分散社会网络的图表数据集,用于可信赖的机器学习 2505.20882v1

Authors: Marc Damie, Edwige Cyffers

Decentralized machine learning (where each client keeps its own data locally and uses its own computational resources to collaboratively train a model by exchanging peer-to-peer messages) is increasingly popular, as it enables better scalability and control over the data. A major challenge in this setting is that learning dynamics depend on the topology of the communication graph, which motivates the use of real graph datasets for benchmarking decentralized algorithms. Unfortunately, existing graph datasets are largely limited to for-profit social networks crawled at a fixed point in time and often collected at the user scale, where links are heavily influenced by the platform and its recommendation algorithms. The Fediverse, which includes several free and open-source decentralized social media platforms such as Mastodon, Misskey, and Lemmy, offers an interesting real-world alternative. We introduce Fedivertex, a new dataset of 182 graphs, covering seven social networks from the Fediverse, crawled weekly over 14 weeks. We release the dataset along with a Python package to facilitate its use, and illustrate its utility on several tasks, including a new defederation task, which captures a process of link deletion observed on these networks.

Article 1051

Title@2025-05-27 (2): Generalizable Heuristic Generation Through Large Language Models with Meta-Optimization

Title: Generalizable Heuristic Generation Through Large Language Models with Meta-Optimization

Generalisierbare Heuristische Generation durch große Sprachmodelle mit Meta-Optimierung

通过配有元-优化的大型语言模型实现可普遍实现的超营养代 2505.20881v1

Authors: Yiding Shi, Jianan Zhou, Wen Song, Jieyi Bi, Yaoxin Wu, Jie Zhang

Heuristic design with large language models (LLMs) has emerged as a promising approach for tackling combinatorial optimization problems (COPs). However, existing approaches often rely on manually predefined evolutionary computation (EC) optimizers and single-task training schemes, which may constrain the exploration of diverse heuristic algorithms and hinder the generalization of the resulting heuristics. To address these issues, we propose Meta-Optimization of Heuristics (MoH), a novel framework that operates at the optimizer level, discovering effective optimizers through the principle of meta-learning. Specifically, MoH leverages LLMs to iteratively refine a meta-optimizer that autonomously constructs diverse optimizers through (self-)invocation, thereby eliminating the reliance on a predefined EC optimizer. These constructed optimizers subsequently evolve heuristics for downstream tasks, enabling broader heuristic exploration. Moreover, MoH employs a multi-task training scheme to promote its generalization capability. Experiments on classic COPs demonstrate that MoH constructs an effective and interpretable meta-optimizer, achieving state-of-the-art performance across various downstream tasks, particularly in cross-size settings.

Article 1052

Title@2025-05-27 (2): Conditional Distribution Compression via the Kernel Conditional Mean Embedding

Title: Conditional Distribution Compression via the Kernel Conditional Mean Embedding

Conditional Distribution Compression über den Kernel Conditional Mean Embedding

通过内核有条件平均嵌入式压缩有条件分发 2504.10139v2

Authors: Dominic Broadbent, Nick Whiteley, Robert Allison, Tom Lovett

Existing distribution compression methods, like Kernel Herding (KH), were originally developed for unlabelled data. However, no existing approach directly compresses the conditional distribution of labelled data. To address this gap, we first introduce the Average Maximum Conditional Mean Discrepancy (AMCMD), a natural metric for comparing conditional distributions. We then derive a consistent estimator for the AMCMD and establish its rate of convergence. Next, we make a key observation: in the context of distribution compression, the cost of constructing a compressed set targeting the AMCMD can be reduced from $\mathcal{O}(n^3)$ to $\mathcal{O}(n)$. Building on this, we extend the idea of KH to develop Average Conditional Kernel Herding (ACKH), a linear-time greedy algorithm that constructs a compressed set targeting the AMCMD. To better understand the advantages of directly compressing the conditional distribution rather than doing so via the joint distribution, we introduce Joint Kernel Herding (JKH), a straightforward adaptation of KH designed to compress the joint distribution of labelled data. While herding methods provide a simple and interpretable selection process, they rely on a greedy heuristic. To explore alternative optimisation strategies, we propose Joint Kernel Inducing Points (JKIP) and Average Conditional Kernel Inducing Points (ACKIP), which jointly optimise the compressed set while maintaining linear complexity. Experiments show that directly preserving conditional distributions with ACKIP outperforms both joint distribution compression (via JKH and JKIP) and the greedy selection used in ACKH. Moreover, we see that JKIP consistently outperforms JKH.
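
For reference, standard Kernel Herding on unlabelled data can be sketched as below. This is the baseline the abstract starts from, not ACKH or any of the proposed conditional methods; the RBF kernel, lengthscale, and data are hypothetical, and this naive version builds the full kernel matrix, so it costs O(n^2) memory.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * lengthscale**2))

def kernel_herding(X, m):
    """Greedily pick m points of X whose kernel mean embedding tracks that of X."""
    K = rbf_kernel(X, X)
    mean_embedding = K.mean(axis=1)   # (1/n) * sum_j k(x, x_j) for each candidate x
    selected = []
    running = np.zeros(len(X))        # sum of k(x, s) over already selected s
    for t in range(m):
        # Favour points close to the data mean embedding and far from prior picks.
        scores = mean_embedding - running / (t + 1)
        idx = int(np.argmax(scores))
        selected.append(idx)
        running += K[:, idx]
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
subset = kernel_herding(X, m=10)
```

Each step commits to one point and never revisits earlier choices, which is the greedy behaviour that the jointly optimised inducing-point variants (JKIP, ACKIP) are designed to avoid.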

Article 1053

Title@2025-05-27 (2): Machine Learning - Driven Materials Discovery: Unlocking Next-Generation Functional Materials – A minireview

Title: Machine Learning - Driven Materials Discovery: Unlocking Next-Generation Functional Materials – A minireview

Machine Learning - Driven Materials Discovery: Locking Next-Generation Functional Materials – Eine Minireview

机器学习 – – 驱动材料发现:解锁下一轮启动功能材料 – – 小型审查 2503.18975v2

Authors: Dilshod Nematov, Mirabbos Hojamberdiev

The rapid advancement of machine learning and artificial intelligence (AI)-driven techniques is revolutionizing materials discovery, property prediction, and material design by minimizing human intervention and accelerating scientific progress. This review provides a comprehensive overview of smart, machine learning (ML)-driven approaches, emphasizing their role in predicting material properties, discovering novel compounds, and optimizing material structures. Key methodologies, ranging from deep learning, graph neural networks, and Bayesian optimization to automated generative models such as generative adversarial networks (GANs) and variational autoencoders (VAEs), enable the autonomous design of materials with tailored functionalities. By leveraging AutoML frameworks (e.g., AutoGluon, TPOT, and H2O.ai), researchers can automate model selection, hyperparameter tuning, and feature engineering, significantly improving the efficiency of materials informatics. Furthermore, the integration of AI-driven robotic laboratories and high-throughput computing has established a fully automated pipeline for rapid synthesis and experimental validation, drastically reducing the time and cost of material discovery. This review highlights real-world applications of automated ML-driven approaches in predicting mechanical, thermal, electrical, and optical properties of materials, demonstrating successful cases in superconductors, catalysts, photovoltaics, and energy storage systems. We also address key challenges, such as data quality, interpretability, and the integration of AutoML with quantum computing, which are essential for future advancements. Ultimately, the synergy between AI, automated experimentation, and computational modeling is transforming the way materials are discovered, optimized, and designed, paving the way for next-generation innovations in energy, electronics, and nanotechnology.

Article 1054

Title@2025-05-27 (2): In Context Learning with Vision Transformers: Case Study

Title: In Context Learning with Vision Transformers: Case Study

Im Kontext Lernen mit Vision Transformers: Fallstudie

与愿景变异者进行背景学习:案例研究 2505.20872v1

Authors: Antony Zhao, Alex Proshkin, Fergal Hennessy, Francesco Crivelli

Large transformer models have been shown to be capable of performing in-context learning. By using examples in a prompt as well as a query, they are capable of performing tasks such as few-shot, one-shot, or zero-shot learning to output the corresponding answer to this query. One area of interest to us is that these transformer models have been shown to be capable of learning the general class of certain functions, such as linear functions and small 2-layer neural networks, on random data (Garg et al., 2023). We aim to extend this to the image space to analyze their capability to learn more complex functions in context on the image space, such as convolutional neural networks and other methods.

Article 1055

Title@2025-05-27 (2): RL-SPH: Learning to Achieve Feasible Solutions for Integer Linear Programs

Title: RL-SPH: Learning to Achieve Feasible Solutions for Integer Linear Programs

RL-SPH: Lernen, um durchführbare Lösungen für Integer-Lineare-Programme zu erreichen

RL-SPH:学习为整数线性方案找到可行的解决办法 2411.19517v5

Authors: Tae-Hoon Lee, Min-Soo Kim

Integer linear programming (ILP) is widely utilized for various combinatorial optimization problems. Primal heuristics play a crucial role in quickly finding feasible solutions for NP-hard ILP. Although end-to-end learning-based primal heuristics (E2EPH) have recently been proposed, they are typically unable to independently generate feasible solutions and mainly focus on binary variables. Ensuring feasibility is critical, especially when handling non-binary integer variables. To address this challenge, we propose RL-SPH, a novel reinforcement learning-based start primal heuristic capable of independently generating feasible solutions, even for ILP involving non-binary integers. Experimental results demonstrate that RL-SPH rapidly obtains high-quality feasible solutions, achieving on average a 44x lower primal gap and a 2.3x lower primal integral compared to existing primal heuristics.

Article 1056

Title@2025-05-27 (2): Leveraging Diffusion Models for Parameterized Quantum Circuit Generation

Title: Leveraging Diffusion Models for Parameterized Quantum Circuit Generation

Nutzung von Diffusionsmodellen für die parameterisierte Quantum Circuit Generation

利用可计量量子电路生成的传播模型 2505.20863v1

Authors: Daniel Barta, Darya Martyniuk, Johannes Jung, Adrian Paschke

Quantum computing holds immense potential, yet its practical success depends on multiple factors, including advances in quantum circuit design. In this paper, we introduce a generative approach based on denoising diffusion models (DMs) to synthesize parameterized quantum circuits (PQCs). Extending the recent diffusion model pipeline of Fürrutter et al. [1], our model effectively conditions the synthesis process, enabling the simultaneous generation of circuit architectures and their continuous gate parameters. We demonstrate our approach in synthesizing PQCs optimized for generating high-fidelity Greenberger-Horne-Zeilinger (GHZ) states and achieving high accuracy in quantum machine learning (QML) classification tasks. Our results indicate a strong generalization across varying gate sets and scaling qubit counts, highlighting the versatility and computational efficiency of diffusion-based methods. This work illustrates the potential of generative models as a powerful tool for accelerating and optimizing the design of PQCs, supporting the development of more practical and scalable quantum applications.

Article 1057

Title@2025-05-27 (2): Model Agnostic Differentially Private Causal Inference

Title: Model Agnostic Differentially Private Causal Inference

Modell Agnostisch unterschiedliche private Kausalableitung

示范性Agnistic 区分法私人原因推断 2505.19589v2

Authors: Christian Lebeda, Mathieu Even, Aurélien Bellet, Julie Josse

Estimating causal effects from observational data is essential in fields such as medicine, economics and social sciences, where privacy concerns are paramount. We propose a general, model-agnostic framework for differentially private estimation of average treatment effects (ATE) that avoids strong structural assumptions on the data-generating process or the models used to estimate propensity scores and conditional outcomes. In contrast to prior work, which enforces differential privacy by directly privatizing these nuisance components and results in a privacy cost that scales with model complexity, our approach decouples nuisance estimation from privacy protection. This separation allows the use of flexible, state-of-the-art black-box models, while differential privacy is achieved by perturbing only predictions and aggregation steps within a fold-splitting scheme with ensemble techniques. We instantiate the framework for three classical estimators – the G-formula, inverse propensity weighting (IPW), and augmented IPW (AIPW) – and provide formal utility and privacy guarantees. Empirical results show that our methods maintain competitive performance under realistic privacy budgets. We further extend our framework to support meta-analysis of multiple private ATE estimates. Our results bridge a critical gap between causal inference and privacy-preserving data analysis.
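
For readers unfamiliar with the three estimators named above, the following sketch gives their textbook, non-private forms. It deliberately omits the differential-privacy perturbation and the fold-splitting scheme described in the abstract, so it is not the authors' method; the argument names (`y`, `t`, `e`, `mu1`, `mu0`) are hypothetical and stand for observed outcomes, binary treatments, estimated propensity scores, and predicted outcomes under treatment and control.

```python
def g_formula(mu1, mu0):
    """G-formula (plug-in) ATE estimate: average difference of predicted outcomes."""
    n = len(mu1)
    return sum(m1 - m0 for m1, m0 in zip(mu1, mu0)) / n


def ipw(y, t, e):
    """Inverse propensity weighting ATE estimate; assumes every e[i] is strictly in (0, 1)."""
    n = len(y)
    return sum(ti * yi / ei - (1 - ti) * yi / (1 - ei)
               for yi, ti, ei in zip(y, t, e)) / n


def aipw(y, t, e, mu1, mu0):
    """Augmented IPW: the plug-in estimate plus an inverse-weighted residual correction."""
    n = len(y)
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, e, mu1, mu0):
        total += (m1 - m0
                  + ti * (yi - m1) / ei
                  - (1 - ti) * (yi - m0) / (1 - ei))
    return total / n
```

When the outcome models reproduce the observed outcomes exactly, the correction term in `aipw` vanishes and it coincides with `g_formula`.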

Article 1058

Title@2025-05-27 (2): UOD: Unseen Object Detection in 3D Point Cloud

Title: UOD: Unseen Object Detection in 3D Point Cloud

UOD: Unsichtbare Objekterkennung in 3D-Punkt-Cloud

UOD: 3D点云中未见物体探测 2401.03846v2

Authors: Hyunjun Choi, Daeho Um, Hawook Jeong

Existing 3D object detectors encounter extreme challenges in localizing unseen 3D objects and recognizing them as unseen, which is a crucial technology in autonomous driving in the wild. To address these challenges, we propose practical methods to enhance the performance of 3D detection and Out-Of-Distribution (OOD) classification for unseen objects. The proposed methods include anomaly sample augmentation, learning of universal objectness, learning of detecting unseen objects, and learning of distinguishing unseen objects. To demonstrate the effectiveness of our approach, we propose the KITTI Misc benchmark and two additional synthetic OOD benchmarks: the Nuscenes OOD benchmark and the SUN-RGBD OOD benchmark. The proposed methods consistently enhance performance by a large margin across all existing methods, giving insight for future work on unseen 3D object detection in the wild.

Article 1059

Title@2025-05-27 (2): Aggregation Buffer: Revisiting DropEdge with a New Parameter Block

Title: Aggregation Buffer: Revisiting DropEdge with a New Parameter Block

Aggregation Buffer: DropEdge mit einem neuen Parameterblock erneut aufrufen

聚合缓冲:用新参数块重新检查下坡面 2505.20840v1

Authors: Dooho Lee, Myeong Kong, Sagad Hamid, Cheonwoo Lee, Jaemin Yoo

We revisit DropEdge, a data augmentation technique for GNNs which randomly removes edges to expose diverse graph structures during training. While it is a promising approach to effectively reduce overfitting on specific connections in the graph, we observe that its potential performance gain in supervised learning tasks is significantly limited. To understand why, we provide a theoretical analysis showing that the limited performance of DropEdge comes from a fundamental limitation that exists in many GNN architectures. Based on this analysis, we propose Aggregation Buffer, a parameter block specifically designed to improve the robustness of GNNs by addressing the limitation of DropEdge. Our method is compatible with any GNN model, and shows consistent performance improvements on multiple datasets. Moreover, our method effectively addresses well-known problems such as degree bias or structural disparity as a unifying solution. Code and datasets are available at https://github.com/dooho00/agg-buffer.
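
The random edge removal that DropEdge performs can be sketched in a few lines. This is only the augmentation step as described in the abstract, not the proposed Aggregation Buffer; the edge-list representation, the `drop_prob` parameter, and the function name are assumptions made for illustration.

```python
import random


def drop_edge(edges, drop_prob, rng=None):
    """Return a copy of `edges` with each edge removed independently with probability `drop_prob`."""
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must lie in [0, 1]")
    if rng is None:
        rng = random.Random()
    # rng.random() is in [0, 1), so drop_prob=0 keeps every edge and drop_prob=1 drops all.
    return [edge for edge in edges if rng.random() >= drop_prob]


if __name__ == "__main__":
    graph = [(0, 1), (1, 2), (2, 3), (3, 0)]
    # A fresh subgraph would typically be sampled at every training step.
    print(drop_edge(graph, 0.5, rng=random.Random(0)))
```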

Article 1060

Title@2025-05-27 (2): Tuning LLM Judge Design Decisions for 1/1000 of the Cost

Title: Tuning LLM Judge Design Decisions for 1/1000 of the Cost

Tuning LLM Richter Design Entscheidungen für 1/1000 der Kosten

1 000美元费用1 000美元法官设计决定 2501.17178v4

Authors: David Salinas, Omar Swelam, Frank Hutter

Evaluating Large Language Models (LLMs) often requires costly human annotations. To address this, LLM-based judges have been proposed, which compare the outputs of two LLMs, enabling the ranking of models without human intervention. While several approaches have been proposed, many confounding factors are present between different papers. For instance, the model, the prompt, and other hyperparameters are typically changed at the same time, making apples-to-apples comparisons challenging. In this paper, we propose to systematically analyze and tune the hyperparameters of LLM judges. To alleviate the high cost of evaluating a judge, we propose to leverage multi-objective multi-fidelity, which allows us to find judges that trade accuracy for cost and also significantly reduces the cost of the search. Our method identifies judges that not only outperform existing benchmarks in accuracy and cost-efficiency but also utilize open-weight models, ensuring greater accessibility and reproducibility. The code to reproduce our experiments is available at this repository: https://github.com/geoalgo/judgetuning .

Article 1061

Title@2025-05-27 (2): HAD: Hybrid Architecture Distillation Outperforms Teacher in Genomic Sequence Modeling

Title: HAD: Hybrid Architecture Distillation Outperforms Teacher in Genomic Sequence Modeling

HAD: Hybride Architektur Destillation übertrifft Lehrer in genomischer Sequenzmodellierung

HAD:混合结构蒸馏(混合结构蒸馏) 2505.20836v1

Authors: Hexiong Yang, Mingrui Chen, Huaibo Huang, Junxian Duan, Jie Cao, Zhen Zhou, Ran He

Inspired by the great success of Masked Language Modeling (MLM) in the natural language domain, the paradigm of self-supervised pre-training and fine-tuning has also achieved remarkable progress in the field of DNA sequence modeling. However, previous methods often relied on massive pre-training data or large-scale base models with huge parameter counts, imposing a significant computational burden. To address this, many works attempted to use more compact models to achieve similar outcomes but still fell short by a considerable margin. In this work, we propose a Hybrid Architecture Distillation (HAD) approach, leveraging both distillation and reconstruction tasks for more efficient and effective pre-training. Specifically, we employ NTv2-500M as the teacher model and devise a grouping masking strategy to align the feature embeddings of visible tokens while concurrently reconstructing the invisible tokens during MLM pre-training. To validate the effectiveness of our proposed method, we conducted comprehensive experiments on the Nucleotide Transformer Benchmark and Genomic Benchmark. Compared to models with similar parameter counts, our model achieved excellent performance. More surprisingly, on some sub-tasks it even surpassed the teacher model, nominally the distillation ceiling, which is more than 500$\times$ larger. Lastly, we utilize t-SNE for more intuitive visualization, which shows that our model can gain a sophisticated understanding of the intrinsic representation pattern in genomic sequences.

Article 1062

Title@2025-05-27 (2): Beyond Semantics: The Unreasonable Effectiveness of Reasonless Intermediate Tokens

Title: Beyond Semantics: The Unreasonable Effectiveness of Reasonless Intermediate Tokens

Jenseits von Semantik: Die unvernünftige Wirksamkeit von vernünftigen Zwischenmarken

超越语义:无理性中肯的不合理效力 2505.13775v2

Authors: Kaya Stechly, Karthik Valmeekam, Atharva Gundawar, Vardhan Palod, Subbarao Kambhampati

Recent impressive results from large reasoning models have been interpreted as a triumph of Chain of Thought (CoT), and especially of the process of training on CoTs sampled from base LLMs in order to help find new reasoning patterns. In this paper, we critically examine that interpretation by investigating how the semantics of intermediate tokens (often anthropomorphized as “thoughts” or reasoning traces, and claimed to display behaviors like backtracking, self-verification, etc.) actually influence model performance. We train transformer models on formally verifiable reasoning traces and solutions, constraining both intermediate steps and final outputs to align with those of a formal solver (in our case, A* search). By constructing a formal interpreter of the semantics of our problems and intended algorithm, we systematically evaluate not only solution accuracy but also the correctness of intermediate traces, thus allowing us to evaluate whether the latter causally influences the former. We notice that, despite significant improvements on the solution-only baseline, models trained on entirely correct traces still produce invalid reasoning traces when arriving at correct solutions. To further show that trace accuracy is only loosely connected to solution accuracy, we then train models on noisy, corrupted traces which have no relation to the specific problem each is paired with, and find that not only does performance remain largely consistent with models trained on correct data, but in some cases it can improve upon it and generalize more robustly on out-of-distribution tasks. These results challenge the assumption that intermediate tokens or “Chains of Thought” induce predictable reasoning behaviors and caution against anthropomorphizing such outputs or over-interpreting them (despite their mostly correct forms) as evidence of human-like or algorithmic behaviors in language models.

Article 1063

Title@2025-05-27 (2): Concentration Distribution Learning from Label Distributions

Title: Concentration Distribution Learning from Label Distributions

Konzentrationsverteilung Lernen von Etikettenverteilungen

从标签分发中学习 2505.21576v1

Authors: Jiawei Tang, Yuheng Jia

Label distribution learning (LDL) is an effective method to predict the relative label description degree (a.k.a. label distribution) of a sample. However, the label distribution is not a complete representation of an instance because it overlooks the absolute intensity of each label. Specifically, it is impossible to obtain the total description degree of hidden labels that are not in the label space, which leads to loss of information and confusion between instances. To solve this problem, we introduce a new concept, background concentration, to serve as the absolute description degree term of the label distribution, and incorporate it into the LDL process, forming the improved paradigm of concentration distribution learning. Moreover, we propose a novel model based on probabilistic methods and neural networks to learn label distributions and background concentrations from existing LDL datasets. Extensive experiments show that the proposed approach is able to extract background concentrations from label distributions while producing more accurate prediction results than state-of-the-art LDL methods. The code is available at https://github.com/seutjw/CDL-LD.

Article 1064

Title@2025-05-27 (2): The Third Pillar of Causal Analysis? A Measurement Perspective on Causal Representations

Title: The Third Pillar of Causal Analysis? A Measurement Perspective on Causal Representations

Die dritte Säule der Kausalanalyse? Eine Messperspektive auf Kausaldarstellungen

Causal 分析的第三个支柱? Causal 代表比例的衡量观点 2505.17708v2

Authors: Dingling Yao, Shimeng Huang, Riccardo Cadei, Kun Zhang, Francesco Locatello

Causal reasoning and discovery, two fundamental tasks of causal analysis, often face challenges in applications due to the complexity, noisiness, and high-dimensionality of real-world data. Despite recent progress in identifying latent causal structures using causal representation learning (CRL), what makes learned representations useful for causal downstream tasks and how to evaluate them are still not well understood. In this paper, we reinterpret CRL using a measurement model framework, where the learned representations are viewed as proxy measurements of the latent causal variables. Our approach clarifies the conditions under which learned representations support downstream causal reasoning and provides a principled basis for quantitatively assessing the quality of representations using a new Test-based Measurement EXclusivity (T-MEX) score. We validate T-MEX across diverse causal inference scenarios, including numerical simulations and real-world ecological video analysis, demonstrating that the proposed framework and corresponding score effectively assess the identification of learned representations and their usefulness for causal downstream tasks.

Article 1065

Title@2025-05-27 (2): HybridLinker: Topology-Guided Posterior Sampling for Enhanced Diversity and Validity in 3D Molecular Linker Generation

Title: HybridLinker: Topology-Guided Posterior Sampling for Enhanced Diversity and Validity in 3D Molecular Linker Generation

HybridLinker: Topologie-geführte hintere Probenahme für verbesserte Diversität und Validität in der 3D-Molekularlinker-Generation

GlubLinker: 3D 分子联系器生成中加强多样性和有效性的地形学-指导外表抽样 2502.17349v3

Authors: Minyeong Hwang, Ziseok Lee, Kwang-Soo Kim, Kyungsu Kim, Eunho Yang

Linker generation is critical in drug discovery applications such as lead optimization and PROTAC design, where molecular fragments are assembled into diverse drug candidates via a molecular linker. Existing methods fall into point cloud-free and point cloud-aware categories based on their use of fragments’ 3D poses alongside their topologies in sampling the linker’s topology. Point cloud-free models prioritize sample diversity but suffer from lower validity due to overlooking fragments’ spatial constraints, while point cloud-aware models ensure higher validity but restrict diversity by enforcing strict spatial constraints. To overcome these trade-offs without additional training, we propose HybridLinker, a framework that enhances point cloud-aware inference by providing diverse bonding topologies from a pretrained point cloud-free model as guidance. At its core, we propose LinkerDPS, the first diffusion posterior sampling (DPS) method operating across point cloud-free and point cloud-aware spaces, bridging molecular topology with 3D point clouds via an energy-inspired function. By transferring the diverse sampling distribution of point cloud-free models into the point cloud-aware distribution, HybridLinker significantly surpasses baselines, improving both validity and diversity in foundational molecular design and applied drug optimization tasks, and establishing a new DPS framework in molecular domains beyond imaging.

Article 1066

Title@2025-05-27 (2): Do We Need All the Synthetic Data? Towards Targeted Synthetic Image Augmentation via Diffusion Models

Title: Do We Need All the Synthetic Data? Towards Targeted Synthetic Image Augmentation via Diffusion Models

Brauchen wir alle synthetischen Daten? Auf dem Weg zu einer gezielten Synthetischen Bildvergrößerung über Diffusionsmodelle

我们需要所有合成数据吗?通过扩散模型实现有针对性的合成图像增强 2505.21574v1

Authors: Dang Nguyen, Jiping Li, Jinghao Zheng, Baharan Mirzasoleiman

Synthetically augmenting training datasets with diffusion models has been an effective strategy for improving generalization of image classifiers. However, existing techniques struggle to ensure the diversity of generation and increase the size of the data by up to 10-30x to improve the in-distribution performance. In this work, we show that synthetically augmenting part of the data that is not learned early in training outperforms augmenting the entire dataset. By analyzing a two-layer CNN, we prove that this strategy improves generalization by promoting homogeneity in feature learning speed without amplifying noise. Our extensive experiments show that by augmenting only 30%-40% of the data, our method boosts the performance by up to 2.8% in a variety of scenarios, including training ResNet, ViT and DenseNet on CIFAR-10, CIFAR-100, and TinyImageNet, with a range of optimizers including SGD and SAM. Notably, our method applied with SGD outperforms the SOTA optimizer, SAM, on CIFAR-100 and TinyImageNet. It can also easily stack with existing weak and strong augmentation strategies to further boost the performance.

Article 1067

Title@2025-05-27 (2): Spectral-inspired Neural Operator for Data-efficient PDE Simulation in Physics-agnostic Regimes

Title: Spectral-inspired Neural Operator for Data-efficient PDE Simulation in Physics-agnostic Regimes

Spektral-inspirierter Neuraloperator für dateneffiziente PDE-Simulation in physik-agnostischen Regimes

物理 – – 不可知系统数据高效PDE模拟光导神经操作器 2505.21573v1

Authors: Han Wan, Rui Zhang, Hao Sun

Partial differential equations (PDEs) govern the spatiotemporal evolution of various physical systems. Classical numerical solvers, while accurate, require fine discretization and full knowledge of the governing PDEs, limiting their applicability when the physics is unknown or fast inference is required. Data-driven neural PDE solvers alleviate these constraints by learning from data but demand large training datasets and perform poorly in data-scarce regimes. Physics-aware methods mitigate data requirements by incorporating physical knowledge yet rely on known PDE terms or local numerical schemes, restricting their ability to handle unknown or globally coupled systems. In this work, we propose the Spectral-inspired Neural Operator (SINO), a novel framework that learns PDE operators from limited trajectories (as few as 2-5), without any known PDE terms. SINO operates in the frequency domain and introduces a Frequency-to-Vector module to learn spectral representations analogous to derivative multipliers. To model nonlinear physical interactions, we design a nonlinear operator block that includes a $\Pi$-Block with low-pass filtering to prevent aliasing. Finally, we introduce an operator distillation technique to distill the trained model for efficient inference. SINO achieves state-of-the-art results across multiple PDE benchmarks, demonstrating strong discretization invariance and robust generalization to out-of-distribution initial conditions. To our knowledge, SINO is the first physics-aware method capable of accurately simulating globally coupled systems (e.g., the Navier-Stokes equations) from limited data without any explicit PDE terms.
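
The "derivative multipliers" that the abstract says SINO's learned spectral representations are analogous to are the classical factors $i k$: in the frequency domain, differentiating a periodic signal amounts to multiplying its $k$-th Fourier coefficient by $i k$. The sketch below illustrates only that classical fact with a naive DFT. It is not SINO or its Frequency-to-Vector module, and the function names and the default `length` of $2\pi$ are assumptions.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]


def idft(coeffs):
    """Naive O(N^2) inverse discrete Fourier transform."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]


def spectral_derivative(samples, length=2.0 * math.pi):
    """Differentiate uniformly spaced samples of a periodic function via the multiplier i*k."""
    n = len(samples)
    coeffs = dft(samples)
    scaled = []
    for k in range(n):
        # Map the DFT index to a signed wavenumber; zero the Nyquist mode for even n.
        wavenumber = k if k < n / 2 else k - n
        if n % 2 == 0 and k == n // 2:
            wavenumber = 0
        scaled.append(1j * (2.0 * math.pi / length) * wavenumber * coeffs[k])
    return [value.real for value in idft(scaled)]
```

Applied to samples of $\sin(x)$ on a uniform grid over $[0, 2\pi)$, this recovers $\cos(x)$ up to floating-point error.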

Article 1068

Title@2025-05-27 (2): Convergence of Clipped-SGD for Convex $(L_0,L_1)$-Smooth Optimization with Heavy-Tailed Noise

Title: Convergence of Clipped-SGD for Convex $(L_0,L_1)$-Smooth Optimization with Heavy-Tailed Noise

Konvergenz von Clipped-SGD für Convex $(L_0,L_1)$-Smooth-Optimierung mit schwerfälligem Lärm

使用 Cllipped-SGD 组合(L_0,L_1) $- 与重故障噪音平滑优化 2505.20817v1

Authors: Savelii Chezhegov, Aleksandr Beznosikov, Samuel Horváth, Eduard Gorbunov

Gradient clipping is a widely used technique in Machine Learning and Deep Learning (DL), known for its effectiveness in mitigating the impact of heavy-tailed noise, which frequently arises in the training of large language models. Additionally, first-order methods with clipping, such as Clip-SGD, exhibit stronger convergence guarantees than SGD under the $(L_0,L_1)$-smoothness assumption, a property observed in many DL tasks. However, the high-probability convergence of Clip-SGD under both assumptions – heavy-tailed noise and $(L_0,L_1)$-smoothness – has not been fully addressed in the literature. In this paper, we bridge this critical gap by establishing the first high-probability convergence bounds for Clip-SGD applied to convex $(L_0,L_1)$-smooth optimization with heavy-tailed noise. Our analysis extends prior results by recovering known bounds for the deterministic case and the stochastic setting with $L_1 = 0$ as special cases. Notably, our rates avoid exponentially large factors and do not rely on restrictive sub-Gaussian noise assumptions, significantly broadening the applicability of gradient clipping.
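
The clipping step at the heart of Clip-SGD can be sketched as follows. This shows the standard norm-clipping operator, which rescales a stochastic gradient $g$ to $\min(1, \lambda / \|g\|)\, g$ before the usual SGD update. The names `threshold` and `lr` are assumptions, and the specific step-size and clipping-level choices analysed in the paper are not reproduced here.

```python
import math


def clip(grad, threshold):
    """Rescale `grad` so that its Euclidean norm does not exceed `threshold`."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= threshold:
        return list(grad)
    scale = threshold / norm
    return [scale * g for g in grad]


def clip_sgd_step(params, grad, lr, threshold):
    """One SGD update using the clipped stochastic gradient."""
    clipped = clip(grad, threshold)
    return [p - lr * g for p, g in zip(params, clipped)]
```

Because the update length is bounded by `lr * threshold`, a single heavy-tailed gradient sample cannot move the iterate arbitrarily far.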

Article 1069

Title: Mixture of Low Rank Adaptation with Partial Parameter Sharing for Time Series Forecasting

Mischung aus Low-Rank-Anpassung mit Teilparameter-Sharing für Zeitreihen-Prognose

低级别适应与时间序列预测部分参数共享混合 2505.17872v2

Authors: Licheng Pan, Zhichao Chen, Haoxuan Li, Guangyi Liu, Zhijian Xu, Zhaoran Liu, Hao Wang, Ying Wei

Multi-task forecasting has become the standard approach for time-series forecasting (TSF). However, we show that it suffers from an Expressiveness Bottleneck, where predictions at different time steps share the same representation, leading to unavoidable errors even with optimal representations. To address this issue, we propose a two-stage framework: first, pre-train a foundation model for one-step-ahead prediction; then, adapt it using step-specific LoRA modules. This design enables the foundation model to handle any number of forecast steps while avoiding the expressiveness bottleneck. We further introduce the Mixture-of-LoRA (MoLA) model, which employs adaptively weighted LoRA experts to achieve partial parameter sharing across steps. This approach enhances both efficiency and forecasting performance by exploiting interdependencies between forecast steps. Experiments show that MoLA significantly improves model expressiveness and outperforms state-of-the-art time-series forecasting methods. Code is available at https://anonymous.4open.science/r/MoLA-BC92.

Article 1070

Title@2025-05-27 (2): Interpretable Credit Default Prediction with Ensemble Learning and SHAP

Title: Interpretable Credit Default Prediction with Ensemble Learning and SHAP

Interpretierbare Credit Default Vorhersage mit Ensemble Learning und SHAP

组合学习和SHAP的可解释信用默认预测 2505.20815v1

Authors: Shiqi Yang, Ziyi Huang, Wengran Xiao, Xinyu Shen

This study focuses on the problem of credit default prediction, builds a modeling framework based on machine learning, and conducts comparative experiments on a variety of mainstream classification algorithms. Through preprocessing, feature engineering, and model training of the Home Credit dataset, the performance of multiple models including logistic regression, random forest, XGBoost, LightGBM, etc. in terms of accuracy, precision, and recall is evaluated. The results show that the ensemble learning method has obvious advantages in predictive performance, especially in dealing with complex nonlinear relationships between features and data imbalance problems. It shows strong robustness. At the same time, the SHAP method is used to analyze the importance and dependency of features, and it is found that the external credit score variable plays a dominant role in model decision making, which helps to improve the model’s interpretability and practical application value. The research results provide effective reference and technical support for the intelligent development of credit risk control systems.

Article 1071

Title@2025-05-27 (2): Geometry Aware Operator Transformer as an Efficient and Accurate Neural Surrogate for PDEs on Arbitrary Domains

Title: Geometry Aware Operator Transformer as an Efficient and Accurate Neural Surrogate for PDEs on Arbitrary Domains

Geometry Aware Operator Transformer als effizientes und präzises Neural Surrogate für PDEs auf willkürlichen Domains

操作者变异器作为任意域中PDEs的高效和准确神经外壳 2505.18781v2

Authors: Shizheng Wen, Arsh Kumbhat, Levi Lingsch, Sepehr Mousavi, Yizhou Zhao, Praveen Chandrashekar, Siddhartha Mishra

The very challenging task of learning solution operators of PDEs on arbitrary domains accurately and efficiently is of vital importance to engineering and industrial simulations. Despite the existence of many operator learning algorithms to approximate such PDEs, we find that accurate models are not necessarily computationally efficient and vice versa. We address this issue by proposing a geometry aware operator transformer (GAOT) for learning PDEs on arbitrary domains. GAOT combines novel multiscale attentional graph neural operator encoders and decoders, together with geometry embeddings and (vision) transformer processors, to accurately map information about the domain and the inputs into a robust approximation of the PDE solution. Multiple innovations in the implementation of GAOT also ensure computational efficiency and scalability. We demonstrate this significant gain in both accuracy and efficiency of GAOT over several baselines on a large number of learning tasks from a diverse set of PDEs, including achieving state-of-the-art performance on a large-scale three-dimensional industrial CFD dataset.

Article 1072

Title@2025-05-27 (2): Thickness-aware E(3)-Equivariant 3D Mesh Neural Networks

Title: Thickness-aware E(3)-Equivariant 3D Mesh Neural Networks

Dicke bewusst E(3)-Equivariante 3D-Mesh-Neurale Netze

E(3)-等离 3D 3D 气象神经网络 2505.21572v1

Authors: Sungwon Kim, Namkyeong Lee, Yunyoung Doh, Seungmin Shin, Guimok Cho, Seung-Won Jeon, Sangkook Kim, Chanyoung Park

Mesh-based 3D static analysis methods have recently emerged as efficient alternatives to traditional computational numerical solvers, significantly reducing computational costs and runtime for various physics-based analyses. However, these methods primarily focus on surface topology and geometry, often overlooking the inherent thickness of real-world 3D objects, which exhibits high correlations and similar behavior between opposing surfaces. This limitation arises from the disconnected nature of these surfaces and the absence of internal edge connections within the mesh. In this work, we propose a novel framework, the Thickness-aware E(3)-Equivariant 3D Mesh Neural Network (T-EMNN), that effectively integrates the thickness of 3D objects while maintaining the computational efficiency of surface meshes. Additionally, we introduce data-driven coordinates that encode spatial information while preserving E(3)-equivariance or invariance properties, ensuring consistent and robust analysis. Evaluations on a real-world industrial dataset demonstrate the superior performance of T-EMNN in accurately predicting node-level 3D deformations, effectively capturing thickness effects while maintaining computational efficiency.

Article 1073

Title@2025-05-27 (2): Step-wise Adaptive Integration of Supervised Fine-tuning and Reinforcement Learning for Task-Specific LLMs

Title: Step-wise Adaptive Integration of Supervised Fine-tuning and Reinforcement Learning for Task-Specific LLMs

Schrittweise adaptive Integration von überwachtem Feinabstimmungs- und Verstärkungslernen für aufgabenspezifische LLMs

监督特定任务专责性微调和强化学习的渐进式适应性整合 2505.13026v2

Authors: Jack Chen, Fazhong Liu, Naruto Liu, Yuhan Luo, Erqu Qin, Harry Zheng, Tian Dong, Haojin Zhu, Yan Meng, Xiao Wang

Large language models (LLMs) excel at mathematical reasoning and logical problem-solving. The current popular training paradigms primarily use supervised fine-tuning (SFT) and reinforcement learning (RL) to enhance the models’ reasoning abilities. Used alone, however, each has its own weakness: SFT may suffer from overfitting, while RL is prone to mode collapse. State-of-the-art methods have therefore proposed hybrid training schemes, but static switching between the two generalizes poorly across tasks and depends heavily on data quality. In response to these challenges, and inspired by the curriculum learning-quiz mechanism in human reasoning cultivation, we propose SASR, a step-wise adaptive hybrid training framework that theoretically unifies SFT and RL and dynamically balances the two throughout optimization. SASR uses SFT for initial warm-up to establish basic reasoning skills, and then uses an adaptive dynamic adjustment algorithm based on gradient norm and divergence relative to the original distribution to seamlessly integrate SFT with the online RL method GRPO. By monitoring the training status of LLMs and adjusting the training process in sequence, SASR ensures a smooth transition between training schemes, maintaining core reasoning abilities while exploring different paths. Experimental results demonstrate that SASR outperforms SFT, RL, and static hybrid training methods.

Article 1074

Title@2025-05-27 (2): Simple yet Effective Graph Distillation via Clustering

Title: Simple yet Effective Graph Distillation via Clustering

Einfache und dennoch effektive Graphendestillation über Clustering

通过集群进行简单而有效的图形蒸馏 2505.20807v1

Authors: Yurui Lai, Taiyan Zhang, Renchi Yang

Despite plentiful successes achieved by graph representation learning in various domains, the training of graph neural networks (GNNs) remains challenging due to the tremendous computational overhead needed for sizable graphs in practice. Recently, graph data distillation (GDD), which seeks to distill large graphs into compact and informative ones, has emerged as a promising technique to enable efficient GNN training. However, most existing GDD works rely on heuristics that align model gradients or representation distributions on condensed and original graphs, leading to compromised result quality, expensive training for distilling large graphs, or both. Motivated by this, this paper presents an efficient and effective GDD approach, ClustGDD. Under the hood, ClustGDD synthesizes the condensed graph and node attributes through fast and theoretically grounded clustering that minimizes the within-cluster sum of squares and maximizes the homophily on the original graph. The fundamental idea is inspired by our empirical and theoretical findings unveiling the connection between clustering and empirical condensation quality using Fréchet Inception Distance, a well-known quality metric for synthetic images. Furthermore, to mitigate the adverse effects caused by the homophily-based clustering, ClustGDD refines the nodal attributes of the condensed graph with a small augmentation learned via class-aware graph sampling and consistency loss. Our extensive experiments show that GNNs trained over condensed graphs output by ClustGDD consistently achieve superior or comparable performance to state-of-the-art GDD methods in terms of node classification on five benchmark datasets, while being orders of magnitude faster.
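
The clustering idea in this abstract can be illustrated with a minimal sketch. It is not ClustGDD itself: it uses plain k-means (Lloyd's algorithm), which only targets the within-cluster sum of squares and ignores the homophily term and the attribute refinement step. The helper names (`kmeans`, `condense_edges`) and the toy graph are assumptions made for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; each step does not increase the
    within-cluster sum of squares."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach every node to its nearest centroid.
        for i, p in enumerate(points):
            assign[i] = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign

def condense_edges(edges, assign):
    """Count original edges inside and between clusters (hypothetical
    way to weight the edges of the condensed graph)."""
    weights = {}
    for u, v in edges:
        key = tuple(sorted((assign[u], assign[v])))
        weights[key] = weights.get(key, 0) + 1
    return weights

# Toy graph: six nodes with 2-D attributes forming two obvious groups.
features = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
            [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
centroids, assign = kmeans(features, k=2)
condensed = condense_edges(edges, assign)
```

Here the centroids play the role of condensed node attributes and `condensed` holds the edge counts of a two-node summary graph.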

Article 1075

Title@2025-05-27 (2): FCOS: A Two-Stage Recoverable Model Pruning Framework for Automatic Modulation Recognition

Title: FCOS: A Two-Stage Recoverable Model Pruning Framework for Automatic Modulation Recognition

FCOS: Ein zweistufiges, wiederherstellbares Modell-Beschneidungs-Framework für die automatische Modulationserkennung

FCOS: 自动调整识别的双层可回收模型保护框架 2505.21571v1

Authors: Yao Lu, Tengfei Ma, Zeyu Wang, Zhuangzhi Chen, Dongwei Xu, Yun Lin, Qi Xuan, Guan Gui

With the rapid development of wireless communications and the growing complexity of digital modulation schemes, traditional manual modulation recognition methods struggle to extract reliable signal features and meet real-time requirements in modern scenarios. Recently, deep-learning-based Automatic Modulation Recognition (AMR) approaches have greatly improved classification accuracy. However, their large model sizes and high computational demands hinder deployment on resource-constrained devices. Model pruning provides a general approach to reduce model complexity, but existing weight, channel, and layer pruning techniques each present a trade-off between compression rate, hardware acceleration, and accuracy preservation. To this end, we introduce FCOS, a novel Fine-to-COarse two-Stage pruning framework that combines channel-level pruning with layer-level collapse diagnosis to achieve extreme compression, high performance and efficient inference. In the first stage of FCOS, hierarchical clustering and parameter fusion are applied to channel weights to achieve channel-level pruning. Then a Layer Collapse Diagnosis (LaCD) module uses linear probing to identify layers that have collapsed because of the high channel compression ratio and removes them. Experiments on multiple AMR benchmarks demonstrate that FCOS outperforms existing channel and layer pruning methods. Specifically, FCOS achieves 95.51% FLOPs reduction and 95.31% parameter reduction while still maintaining performance close to the original ResNet56, with only a 0.46% drop in accuracy on Sig2019-12. Code is available at https://github.com/yaolu-zjut/FCOS.

Article 1076

Title@2025-05-27 (2): Quantum Machine Learning in Healthcare: Evaluating QNN and QSVM Models

Title: Quantum Machine Learning in Healthcare: Evaluating QNN and QSVM Models

Quantum Machine Learning in Healthcare: Bewertung von QNN- und QSVM-Modellen

QNN和QSVM模型评估 QNN和QSVM模型 2505.20804v1

Authors: Antonio Tudisco, Deborah Volpe, Giovanna Turvani

Effective and accurate diagnosis of diseases such as cancer, diabetes, and heart failure is crucial for timely medical intervention and improving patient survival rates. Machine learning has revolutionized diagnostic methods in recent years by developing classification models that detect diseases based on selected features. However, these classification tasks are often highly imbalanced, limiting the performance of classical models. Quantum models offer a promising alternative, exploiting their ability to express complex patterns by operating in a higher-dimensional computational space through superposition and entanglement. These unique properties make quantum models potentially more effective in addressing the challenges of imbalanced datasets. This work evaluates the potential of quantum classifiers in healthcare, focusing on Quantum Neural Networks (QNNs) and Quantum Support Vector Machines (QSVMs), comparing them with popular classical models. The study is based on three well-known healthcare datasets: Prostate Cancer, Heart Failure, and Diabetes. The results indicate that QSVMs outperform QNNs across all datasets because QNNs are susceptible to overfitting. Furthermore, quantum models demonstrate the ability to outperform classical models in scenarios with high dataset imbalance. Although preliminary, these findings highlight the potential of quantum models in healthcare classification tasks and pave the way for further research in this domain.

Article 1077

Title@2025-05-27 (2): Sentiment Reasoning for Healthcare

Title: Sentiment Reasoning for Healthcare

Sentiment Reasoning für die Gesundheitsversorgung

保健的情感理由 2407.21054v4

Authors: Khai-Nguyen Nguyen, Khai Le-Duc, Bach Phan Tat, Duy Le, Long Vo-Dang, Truong-Son Hy

Transparency in AI healthcare decision-making is crucial. By incorporating rationales that explain the reason for each predicted label, users can understand the reasoning of Large Language Models (LLMs) and make better decisions. In this work, we introduce a new task, Sentiment Reasoning, for both speech and text modalities, together with our proposed multimodal multitask framework and the world’s largest multimodal sentiment analysis dataset. Sentiment Reasoning is an auxiliary task in sentiment analysis in which the model both predicts the sentiment label and generates the rationale behind it based on the input transcript. Our study, conducted on both human transcripts and Automatic Speech Recognition (ASR) transcripts, shows that Sentiment Reasoning helps improve model transparency by providing rationales for model predictions with quality semantically comparable to that of humans, while also improving the model’s classification performance (+2% in both accuracy and macro-F1) via rationale-augmented fine-tuning. We also find no significant difference in the semantic quality of generated rationales between human and ASR transcripts. All code, data (five languages: Vietnamese, English, Chinese, German, and French) and models are published online: https://github.com/leduckhai/Sentiment-Reasoning

Article 1078

Title@2025-05-27 (2): Leaner Transformers: More Heads, Less Depth

Title: Leaner Transformers: More Heads, Less Depth

Leaner Transformer: Mehr Köpfe, weniger Tiefe

皮质变形器: 更多的头, 更少深度 2505.20802v1

Authors: Hemanth Saratchandran, Damien Teney, Simon Lucey

Transformers have reshaped machine learning by utilizing attention mechanisms to capture complex patterns in large datasets, leading to significant improvements in performance. This success has contributed to the belief that “bigger means better”, leading to ever-increasing model sizes. This paper challenges this ideology by showing that many existing transformers might be unnecessarily oversized. We discover a theoretical principle that redefines the role of multi-head attention: an important benefit of multiple heads is that they improve the conditioning of the attention block. We exploit this theoretical insight and redesign popular architectures with an increased number of heads. The improvement in conditioning proves so significant in practice that model depth can be decreased, reducing the parameter count by up to 30-50% while maintaining accuracy. We obtain consistent benefits across a variety of transformer-based architectures of various scales, on tasks in computer vision (ImageNet-1k) as well as language and sequence modeling (GLUE benchmark, TinyStories, and the Long-Range Arena benchmark).

Article 1079

Title@2025-05-27 (2): Multi-VQC: A Novel QML Approach for Enhancing Healthcare Classification

Title: Multi-VQC: A Novel QML Approach for Enhancing Healthcare Classification

Multi-VQC: Ein neuartiger QML-Ansatz zur Verbesserung der Gesundheitsklassifikation

多VQC:加强保健分类的新QML方法 2505.20797v1

Authors: Antonio Tudisco, Deborah Volpe, Giovanna Turvani

Accurate and reliable diagnosis of diseases is crucial in enabling timely medical treatment and enhancing patient survival rates. In recent years, Machine Learning has revolutionized diagnostic practices by creating classification models capable of identifying diseases. However, these classification problems often suffer from significant class imbalances, which can inhibit the effectiveness of traditional models. Therefore, the interest in Quantum models has arisen, driven by the captivating promise of overcoming the limitations of the classical counterpart thanks to their ability to express complex patterns by mapping data in a higher-dimensional computational space.

Article 1080

Title@2025-05-27 (2): A Graph Perspective to Probe Structural Patterns of Knowledge in Large Language Models

Title: A Graph Perspective to Probe Structural Patterns of Knowledge in Large Language Models

Eine Graphenperspektive zur Untersuchung struktureller Wissensmuster in großen Sprachmodellen

《大语言模式知识结构模式研究图示展望》 2505.19286v2

Authors: Utkarsh Sahu, Zhisheng Qi, Yongjia Lei, Ryan A. Rossi, Franck Dernoncourt, Nesreen K. Ahmed, Mahantesh M Halappanavar, Yao Ma, Yu Wang

Large language models have been extensively studied as neural knowledge bases for their knowledge access, editability, reasoning, and explainability. However, few works focus on the structural patterns of their knowledge. Motivated by this gap, we investigate these structural patterns from a graph perspective. We quantify the knowledge of LLMs at both the triplet and entity levels, and analyze how it relates to graph structural properties such as node degree. Furthermore, we uncover knowledge homophily, where topologically close entities exhibit similar levels of knowledgeability, which motivates us to develop graph machine learning models that estimate an entity's knowledge from its local neighbors. These models in turn enable valuable knowledge checking by selecting triplets less known to LLMs. Empirical results show that using the selected triplets for fine-tuning leads to superior performance.

Article 1081

Title@2025-05-27 (2): Amortized Bayesian Workflow

Title: Amortized Bayesian Workflow

Amortisierter Bayesischer Workflow

摊还的贝耶斯人工作流量 2409.04332v2

Authors: Chengkun Li, Aki Vehtari, Paul-Christian Bürkner, Stefan T. Radev, Luigi Acerbi, Marvin Schmitt

Bayesian inference often faces a trade-off between computational speed and sampling accuracy. We propose an adaptive workflow that integrates rapid amortized inference with gold-standard MCMC techniques to achieve a favorable combination of both speed and accuracy when performing inference on many observed datasets. Our approach uses principled diagnostics to guide the choice of inference method for each dataset, moving along the Pareto front from fast amortized sampling via generative neural networks to slower but guaranteed-accurate MCMC when needed. By reusing computations across steps, our workflow synergizes amortized and MCMC-based inference. We demonstrate the effectiveness of this integrated approach on several synthetic and real-world problems with tens of thousands of datasets, showing efficiency gains while maintaining high posterior quality.

Article 1082

Title@2025-05-27 (2): Where You Place the Norm Matters: From Prejudiced to Neutral Initializations

Title: Where You Place the Norm Matters: From Prejudiced to Neutral Initializations

Wo Sie die Norm-Materien platzieren: Von voreingenommenen zu neutralen Initialisierungen

将规范问题放在哪里: 从偏见到中立初始化 2505.11312v3

Authors: Emanuele Francazi, Francesco Pinto, Aurelien Lucchi, Marco Baity-Jesi

Normalization layers, such as Batch Normalization and Layer Normalization, are central components in modern neural networks, widely adopted to improve training stability and generalization. While their practical effectiveness is well documented, a detailed theoretical understanding of how normalization affects model behavior, starting from initialization, remains an important open question. In this work, we investigate how both the presence and placement of normalization within hidden layers influence the statistical properties of network predictions before training begins. In particular, we study how these choices shape the distribution of class predictions at initialization, which can range from unbiased (Neutral) to highly concentrated (Prejudiced) toward a subset of classes. Our analysis shows that normalization placement induces systematic differences in the initial prediction behavior of neural networks, which in turn shape the dynamics of learning. By linking architectural choices to prediction statistics at initialization, our work provides a principled understanding of how normalization can influence early training behavior and offers guidance for more controlled and interpretable network design.

Article 1083

Title@2025-05-27 (2): Enhancing Wearable Tap Water Audio Detection through Subclass Annotation in the HD-Epic Dataset

Title: Enhancing Wearable Tap Water Audio Detection through Subclass Annotation in the HD-Epic Dataset

Verbesserung der tragbaren Wasserhahn-Audioerkennung durch Unterklasse-Annotation im HD-Epic-Datensatz

通过在HD-Epic数据集中分级注解,加强穿戴式塔普水音频探测 2505.20788v1

Authors: Robin Burchard, Kristof Van Laerhoven

Wearable human activity recognition has been shown to benefit from the inclusion of acoustic data, as the sounds around a person often contain valuable context. However, due to privacy concerns, it is usually not ethically feasible to record and save microphone data from the device, since the audio could, for instance, also contain private conversations. Rather, the data should be processed locally, which in turn requires processing power and consumes energy on the wearable device. One special use case of contextual information that can be utilized to augment special tasks in human activity recognition is water flow detection, which can, e.g., be used to aid wearable hand washing detection. We created a new label called tap water for the recently released HD-Epic data set, creating 717 hand-labeled annotations of tap water flow, based on existing annotations of the water class. We analyzed the relation of tap water and water in the dataset and additionally trained and evaluated two lightweight classifiers to evaluate the newly added label class, showing that the new class can be learned more easily.

Article 1084

Title@2025-05-27 (2): LIB-KD: Learning Inductive Bias, Not Just Parameters A New Perspective on Knowledge Distillations

Title: LIB-KD: Learning Inductive Bias, Not Just Parameters A New Perspective on Knowledge Distillations

LIB-KD: Induktive Bias lernen, nicht nur Parameter Eine neue Perspektive auf Wissensdestillationen

LIB-KD:学习感性偏见,而不仅仅是知识蒸馏的新视角参数 2310.00369v3

Authors: Gousia Habib, Tausifa Jan Saleem, Ishfaq Ahmad Malik, Brejesh Lall

With the rapid development of computer vision, Vision Transformers (ViTs) offer the tantalizing prospect of unified information processing across visual and textual domains. However, because ViTs lack inherent inductive biases, they require an enormous amount of data for training. To make their applications practical, we introduce an innovative ensemble-based distillation approach that distills inductive bias from complementary lightweight teacher models. Prior systems relied solely on convolution-based teaching. In contrast, our method incorporates an ensemble of light teachers with different architectural tendencies, such as convolution and involution, to instruct the student transformer jointly. Because of these distinct inductive biases, the teachers can accumulate a wide range of knowledge, even from readily identifiable stored datasets, which leads to enhanced student performance. Our proposed framework also precomputes and stores the teachers' logits, essentially the unnormalized predictions of the model. This optimization accelerates the distillation process by eliminating the need for repeated forward passes during knowledge distillation, significantly reducing the computational burden and enhancing efficiency.
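
The logit-caching step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LIB-KD implementation: the teachers are stand-in Python callables, their logits are combined by simple averaging, and the loss is the standard temperature-softened KL divergence. The names `precompute_teacher_logits` and `distillation_loss` are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def precompute_teacher_logits(teachers, dataset):
    """Run every teacher exactly once per example and cache the averaged
    logits, so training never repeats a teacher forward pass.
    Averaging the ensemble is an assumption of this sketch."""
    cache = {}
    for idx, x in enumerate(dataset):
        per_teacher = [teacher(x) for teacher in teachers]
        cache[idx] = [sum(col) / len(teachers) for col in zip(*per_teacher)]
    return cache

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Stand-in "teachers" with different behaviour (purely illustrative).
teachers = [lambda x: [x, -x], lambda x: [2 * x, 0.0]]
dataset = [1.0, 2.0]
cache = precompute_teacher_logits(teachers, dataset)

# During training the student only reads from the cache.
loss_far = distillation_loss([0.0, 0.0], cache[0])
loss_same = distillation_loss(cache[0], cache[0])
```

The cache trades storage for compute: each teacher runs once per example regardless of how many epochs the student trains.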

Article 1085

Title@2025-05-27 (2): Low-Rank Adapting Models for Sparse Autoencoders

Title: Low-Rank Adapting Models for Sparse Autoencoders

Low-Rank Anpassungsmodelle für Sparse Autoencoder

普通自动解析器低 Rank 适应模型 2501.19406v2

Authors: Matthew Chen, Joshua Engels, Max Tegmark

Sparse autoencoders (SAEs) decompose language model representations into a sparse set of linear latent vectors. Recent works have improved SAEs using language model gradients, but these techniques require many expensive backward passes during training and still cause a significant increase in cross entropy loss when SAE reconstructions are inserted into the model. In this work, we improve on these limitations by taking a fundamentally different approach: we use low-rank adaptation (LoRA) to finetune the language model itself around a previously trained SAE. We analyze our method across SAE sparsity, SAE width, language model size, LoRA rank, and model layer on the Gemma Scope family of SAEs. In these settings, our method reduces the cross entropy loss gap by 30% to 55% when SAEs are inserted during the forward pass. We also find that compared to end-to-end (e2e) SAEs, our approach achieves the same downstream cross entropy loss 3× to 20× faster on Gemma and 2× to 10× faster on Llama. We further show that our technique improves downstream metrics and can adapt multiple SAEs at once without harming general language model capabilities. Our results demonstrate that improving model interpretability is not limited to post-hoc SAE training; Pareto improvements can also be achieved by directly optimizing the model itself.

Article 1086

Title@2025-05-27 (2): STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation

Title: STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation

STITCH-OPE: Trajektorienstiche mit geführter Diffusion für Off-Policy-Bewertung

STITCH-OPE: 非政策评价的引导传播的轨迹 2505.20781v1

Authors: Hossein Goli, Michael Gimelfarb, Nathan Samuel de Lara, Haruki Nishimura, Masha Itkina, Florian Shkurti

Off-policy evaluation (OPE) estimates the performance of a target policy using offline data collected from a behavior policy, and is crucial in domains such as robotics or healthcare where direct interaction with the environment is costly or unsafe. Existing OPE methods are ineffective for high-dimensional, long-horizon problems, due to exponential blow-ups in variance from importance weighting or compounding errors from learned dynamics models. To address these challenges, we propose STITCH-OPE, a model-based generative framework that leverages denoising diffusion for long-horizon OPE in high-dimensional state and action spaces. Starting with a diffusion model pre-trained on the behavior data, STITCH-OPE generates synthetic trajectories from the target policy by guiding the denoising process using the score function of the target policy. STITCH-OPE introduces two technical innovations that make it advantageous for OPE: (1) it prevents over-regularization by subtracting the score of the behavior policy during guidance, and (2) it generates long-horizon trajectories by stitching partial trajectories together end-to-end. We provide a theoretical guarantee that under mild assumptions, these modifications result in an exponential reduction in variance versus long-horizon trajectory diffusion. Experiments on the D4RL and OpenAI Gym benchmarks show substantial improvement in mean squared error, correlation, and regret metrics compared to state-of-the-art OPE methods.

Article 1087

Title@2025-05-27 (2): SpecExtend: A Drop-in Enhancement for Speculative Decoding of Long Sequences

Title: SpecExtend: A Drop-in Enhancement for Speculative Decoding of Long Sequences

SpecExtend: Ein Drop-in-Enhancement für spekulative Decoding von langen Sequenzen

外观:对长期序列的投机性代谢的减少增强 2505.20776v1

Authors: Jungyoub Cha, Hyunjong Kim, Sungzoon Cho

Speculative decoding is a widely adopted technique for accelerating inference in large language models (LLMs), but its performance degrades on long inputs due to increased attention cost and reduced draft accuracy. We introduce SpecExtend, a drop-in enhancement that improves the performance of speculative decoding on long sequences without any additional training. SpecExtend integrates efficient attention mechanisms such as FlashAttention and Hybrid Tree Attention into both the draft and target models, reducing latency across all stages. To improve draft accuracy and speed, we propose Cross-model Retrieval, a novel KV cache update strategy that uses the target model’s attention scores to dynamically select relevant context for the draft model. Extensive evaluations on three long-context understanding datasets show that SpecExtend accelerates standard tree-based speculative decoding by up to 2.22x for inputs up to 16K tokens, providing an effective solution for speculative decoding of long sequences. The code is available at https://github.com/jycha98/SpecExtend .

Article 1088

Title@2025-05-27 (2): T-REX: Mixture-of-Rank-One-Experts with Semantic-aware Intuition for Multi-task Large Language Model Finetuning

Title: T-REX: Mixture-of-Rank-One-Experts with Semantic-aware Intuition for Multi-task Large Language Model Finetuning

T-REX: Mixture-of-Rank-One-Experts mit semantischer Intuition für Multi-Task Large Language Model Finetuning

T-REX:多任务大语言模型微调中具有语义认知度的多任务大语言模型微调混合型兰克单方专家 2404.08985v2

Authors: Rongyu Zhang, Yijiang Liu, Huanrui Yang, Shenli Zheng, Dan Wang, Yuan Du, Li Du, Shanghang Zhang

Large language models (LLMs) encounter significant adaptation challenges in diverse multitask finetuning. Mixture-of-experts (MoE) provides a promising solution with a dynamic architecture, enabling effective task decoupling. However, scaling up the number of MoE experts incurs substantial parameter and computational overheads and suffers from limited performance gain due to naive routing mechanisms. In this paper, we design a novel framework, mixTure-of-Rank-onE-eXperts (T-REX), which leverages the combination of ultra-low rank experts to construct LoRA weights on pretrained LLMs. The rank-1 experts enable a mix-and-match mechanism to quadratically expand the vector subspace of experts with linear parameter overheads, achieving approximate error reduction with optimal efficiency. In addition, T-REX offers implicit guidance to the router, leveraging the inherent semantic clustering of training embeddings as prior knowledge, enabling optimized feature allocation across experts for a smoother convergence. Extensive theoretical and empirical results demonstrate that T-REX achieves superior efficiency and generalizability across diverse tasks. Compared with other LoRA-based methods, T-REX achieves up to 1.78% mean accuracy improvement with around 30%-40% fewer trainable parameters across 14 public datasets. Code is available at https://github.com/RoyZry98/T-REX-Pytorch.
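
The "quadratic expansion with linear parameter overhead" claim can be made concrete with a small sketch. It assumes a pool of n left vectors and n right vectors whose pairwise outer products act as rank-1 experts; the gate matrix, the function name `mix_rank_one`, and the toy sizes are hypothetical and do not reproduce T-REX's router.

```python
def mix_rank_one(us, vs, gate):
    """Weighted sum of rank-1 matrices u_i v_j^T, with gate[i][j] as weights."""
    d_out, d_in = len(us[0]), len(vs[0])
    delta = [[0.0] * d_in for _ in range(d_out)]
    for i, u in enumerate(us):
        for j, v in enumerate(vs):
            weight = gate[i][j]
            if weight == 0.0:
                continue  # this (i, j) expert is not selected
            for r in range(d_out):
                for c in range(d_in):
                    delta[r][c] += weight * u[r] * v[c]
    return delta

# Two left vectors (d_out = 2) and two right vectors (d_in = 3).
us = [[1.0, 0.0], [0.0, 1.0]]
vs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# Stored parameters grow linearly in the number of vectors ...
n_params = sum(len(u) for u in us) + sum(len(v) for v in vs)
# ... while the number of selectable rank-1 experts grows quadratically.
n_experts = len(us) * len(vs)

# Hypothetical gate that picks experts (0, 0) and (1, 1).
gate = [[1.0, 0.0], [0.0, 0.5]]
delta_w = mix_rank_one(us, vs, gate)
```

With n vectors on each side, the pool stores n(d_out + d_in) numbers but exposes n² distinct rank-1 updates, which is the mix-and-match effect the abstract refers to.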

Article 1089

Title@2025-05-27 (2): Non-invasive maturity assessment of iPSC-CMs based on optical maturity characteristics using interpretable AI

Title: Non-invasive maturity assessment of iPSC-CMs based on optical maturity characteristics using interpretable AI

Nicht-invasive Bewertung der Laufzeit von iPSC-CMs auf der Grundlage optischer Reifemerkmale unter Verwendung interpretierbarer KI

使用可解释的AI根据光学成熟度特性对iPSC-CMMs进行非侵入性成熟度评估 2505.20775v1

Authors: Fabian Scheurer, Alexander Hammer, Mario Schubert, Robert-Patrick Steiner, Oliver Gamm, Kaomei Guan, Frank Sonntag, Hagen Malberg, Martin Schmidt

Human induced pluripotent stem cell-derived cardiomyocytes (iPSC-CMs) are an important resource for the identification of new therapeutic targets and cardioprotective drugs. After differentiation, iPSC-CMs show an immature, fetal-like phenotype. Cultivation of iPSC-CMs in lipid-supplemented maturation medium (MM) strongly enhances their structural, metabolic and functional phenotype. Nevertheless, assessing the iPSC-CM maturation state remains challenging, as most methods are time-consuming and entail cell damage or loss of the sample. To address this issue, we developed a non-invasive approach for automated classification of iPSC-CM maturity through interpretable artificial intelligence (AI)-based analysis of beat characteristics derived from video-based motion analysis. In a prospective study, we evaluated 230 video recordings of early-state, immature iPSC-CMs on day 21 after differentiation (d21) and more mature iPSC-CMs cultured in MM (d42, MM). For each recording, 10 features were extracted using Maia motion analysis software and entered into a support vector machine (SVM). The hyperparameters of the SVM were optimized in a grid search on 80 % of the data using 5-fold cross-validation. The optimized model achieved an accuracy of 99.5 ± 1.1 % on a hold-out test set. Shapley Additive Explanations (SHAP) identified displacement, relaxation-rise time and beating duration as the most relevant features for assessing maturity level. Our results suggest that non-invasive, optical motion analysis combined with AI-based methods can serve as a tool to assess iPSC-CM maturity and could be applied before performing functional readouts or drug testing. This may reduce the variability and improve the reproducibility of experimental studies.

Article 1090

Title@2025-05-27 (2): TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state

Title: TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state

TimePro: Effiziente Multivariate Langzeit-Zeitreihen-Prognose mit variabler und zeitversetzter Hyperstate

具有可变和时间warware超状态预测的高效多变长期时间序列 2505.20774v1

Authors: Xiaowen Ma, Zhenliang Ni, Shuai Xiao, Xinghao Chen

In long-term time series forecasting, different variables often influence the target variable over distinct time intervals, a challenge known as the multi-delay issue. Traditional models typically process all variables or time points uniformly, which limits their ability to capture complex variable relationships and obtain non-trivial time representations. To address this issue, we propose TimePro, an innovative Mamba-based model that constructs variate- and time-aware hyper-states. Unlike conventional approaches that merely transfer plain states across variable or time dimensions, TimePro preserves the fine-grained temporal features of each variate token and adaptively selects the focused time points to tune the plain state. The reconstructed hyper-state can perceive both variable relationships and salient temporal information, which helps the model make accurate forecasts. In experiments, TimePro performs competitively on eight real-world long-term forecasting benchmarks with satisfactory linear complexity. Code is available at https://github.com/xwmaxwma/TimePro.

Article 1091

Title@2025-05-27 (2): MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning

Title: MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning

MetaSlot: Durchbruch durch die feste Anzahl von Slots im Objekt-Zentrischen Lernen

MetaSlot: 打破对象中心学习中的固定空格数 2505.20772v1

Authors: Hongjia Liu, Rongzhen Zhao, Haohan Chen, Joni Pajarinen

Learning object-level, structured representations is widely regarded as a key to better generalization in vision and underpins the design of next-generation Pre-trained Vision Models (PVMs). Mainstream Object-Centric Learning (OCL) methods adopt Slot Attention or its variants to iteratively aggregate objects’ super-pixels into a fixed set of query feature vectors, termed slots. However, their reliance on a static slot count leads to an object being represented as multiple parts when the number of objects varies. We introduce MetaSlot, a plug-and-play Slot Attention variant that adapts to variable object counts. MetaSlot (i) maintains a codebook that holds prototypes of objects in a dataset by vector-quantizing the resulting slot representations; (ii) removes duplicate slots from the traditionally aggregated slots by quantizing them with the codebook; and (iii) injects progressively weaker noise into the Slot Attention iterations to accelerate and stabilize the aggregation. MetaSlot is a general Slot Attention variant that can be seamlessly integrated into existing OCL architectures. Across multiple public datasets and tasks, including object discovery and recognition, models equipped with MetaSlot achieve significant performance gains and markedly more interpretable slot representations than existing Slot Attention variants.
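
Step (ii) above, removing duplicate slots by quantizing them against a codebook, can be sketched in a few lines. This is an illustrative reading of the abstract only: the rule for which duplicate to keep (the first one seen) and the function names are assumptions, and the toy vectors stand in for learned slot features.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest_prototype(slot, codebook):
    """Index of the codebook prototype closest to the slot."""
    return min(range(len(codebook)), key=lambda k: sq_dist(slot, codebook[k]))

def deduplicate_slots(slots, codebook):
    """Keep one slot per matched prototype (assumption: the first seen)."""
    kept = {}
    for slot in slots:
        index = nearest_prototype(slot, codebook)
        kept.setdefault(index, slot)
    return [kept[index] for index in sorted(kept)]

# Two object prototypes and four slots, two of which are near-duplicates.
codebook = [[0.0, 0.0], [10.0, 10.0]]
slots = [[0.1, 0.0], [9.9, 10.0], [0.0, 0.2], [10.0, 10.1]]
unique_slots = deduplicate_slots(slots, codebook)
```

Because the number of surviving slots equals the number of distinct prototypes that were matched, the effective slot count follows the scene instead of being fixed in advance.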

Article 1092

Title@2025-05-27 (2): ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools

Title: ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools

ChemHAS: Hierarchische Agenzien-Stacking zur Verbesserung von Chemiewerkzeugen

ChemHAS:加强化学工具的等级代理人 2505.21569v1

Authors: Zhucong Li, Bowei Zhang, Jin Xiao, Zhijian Zhou, Fenglei Cao, Jiaqing Liang, Yuan Qi

Large Language Model (LLM)-based agents have demonstrated the ability to improve performance in chemistry-related tasks by selecting appropriate tools. However, their effectiveness remains limited by the inherent prediction errors of chemistry tools. In this paper, we take a step further by exploring how LLM-based agents can, in turn, be leveraged to reduce prediction errors of the tools. To this end, we propose ChemHAS (Chemical Hierarchical Agent Stacking), a simple yet effective method that enhances chemistry tools through optimizing agent-stacking structures from limited data. ChemHAS achieves state-of-the-art performance across four fundamental chemistry tasks, demonstrating that our method can effectively compensate for prediction errors of the tools. Furthermore, we identify and characterize four distinct agent-stacking behaviors, potentially improving interpretability and revealing new possibilities for AI agent applications in scientific research. Our code and dataset are publicly available at https://anonymous.4open.science/r/ChemHAS-01E4/README.md.

Article 1093

Title@2025-05-27 (2): Divide-Fuse-Conquer: Eliciting “Aha Moments” in Multi-Scenario Games

Title: Divide-Fuse-Conquer: Eliciting “Aha Moments” in Multi-Scenario Games

Divide-Fuse-Conquer: Eliciting “Aha Momente” in Multi-Szenario-Spiele

分裂-裂变:在多种场景运动会中激发“哈动力” 2505.16401v2

Authors: Xiaoqing Zhang, Huabin Zheng, Ang Lv, Yuhan Liu, Zirui Song, Flood Sung, Xiuying Chen, Rui Yan

Large language models (LLMs) have been observed to suddenly exhibit advanced reasoning abilities during reinforcement learning (RL), resembling an “aha moment” triggered by simple outcome-based rewards. While RL has proven effective in eliciting such breakthroughs in tasks involving mathematics, coding, and vision, it faces significant challenges in multi-scenario games. The diversity of game rules, interaction modes, and environmental complexities often leads to policies that perform well in one scenario but fail to generalize to others. Simply combining multiple scenarios during training introduces additional challenges, such as training instability and poor performance. To overcome these challenges, we propose Divide-Fuse-Conquer, a framework designed to enhance generalization in multi-scenario RL. This approach starts by heuristically grouping games based on characteristics such as rules and difficulties. Specialized models are then trained for each group to excel at the games in that group; this is what we refer to as the divide step. Next, we fuse model parameters from different groups into a new model and continue training it on multiple groups until the scenarios in all groups are conquered. Experiments across 18 TextArena games show that Qwen2.5-32B-Align trained with the Divide-Fuse-Conquer strategy reaches a performance level comparable to Claude3.5, achieving 7 wins and 4 draws. We hope our approach can inspire future research on using reinforcement learning to improve the generalization of LLMs.
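
The fuse step can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so this example assumes plain (optionally weighted) parameter averaging across the group-specialist models; `fuse_parameters` and the toy parameter dictionaries are hypothetical.

```python
def fuse_parameters(models, weights=None):
    """Weighted average of parameter dicts that share the same keys and shapes.

    Assumption: fusion is simple parameter averaging; the paper may use
    a different rule.
    """
    if weights is None:
        weights = [1.0 / len(models)] * len(models)
    fused = {}
    for name in models[0]:
        vectors = [m[name] for m in models]
        fused[name] = [
            sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(len(vectors[0]))
        ]
    return fused


# Two hypothetical specialists, one per game group, with flattened parameters.
group_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
group_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
fused = fuse_parameters([group_a, group_b])
print(fused)
```

In the framework described above, the fused model would then be trained further on the union of groups (the conquer step), which this sketch does not cover.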

Article 1094

Title@2025-05-27 (2): Robust and Explainable Detector of Time Series Anomaly via Augmenting Multiclass Pseudo-Anomalies

Title: Robust and Explainable Detector of Time Series Anomaly via Augmenting Multiclass Pseudo-Anomalies

Robuster und erklärbarer Detektor der Zeitreihenanomalie durch Augmenting-Multiclass-Pseudoanomalien

通过增强多级优度反射器反射反射器,对时间序列时间序列进行强力和可解释的探测器 2505.20765v1

Authors: Kohei Obata, Yasuko Matsubara, Yasushi Sakurai

Unsupervised anomaly detection in time series has been a pivotal research area for decades. Current mainstream approaches focus on learning normality, on the assumption that all or most of the samples in the training set are normal. However, anomalies in the training set (i.e., anomaly contamination) can be misleading. Recent studies employ data augmentation to generate pseudo-anomalies and learn the boundary separating the training samples from the augmented samples. Although this approach mitigates anomaly contamination if augmented samples mimic unseen real anomalies, it suffers from several limitations. (1) Covering a wide range of time series anomalies is challenging. (2) It disregards augmented samples that resemble normal samples (i.e., false anomalies). (3) It places too much trust in the labels of training and augmented samples. In response, we propose RedLamp, which employs diverse data augmentations to generate multiclass pseudo-anomalies and learns the multiclass boundary. Such multiclass pseudo-anomalies cover a wide variety of time series anomalies. We conduct multiclass classification using soft labels, which prevents the model from being overconfident and ensures its robustness against contaminated/false anomalies. The learned latent space is inherently explainable as it is trained to separate pseudo-anomalies into multiclasses. Extensive experiments demonstrate the effectiveness of RedLamp in anomaly detection and its robustness against anomaly contamination.
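
To make the soft-label idea concrete, here is a minimal sketch of multiclass classification with soft targets. The abstract does not say how RedLamp builds its soft labels; this example assumes label-smoothing-style targets (probability `1 - eps` on the labeled class, the rest spread evenly), and all function names are hypothetical.

```python
import math


def smooth_labels(class_index, num_classes, eps=0.1):
    """Soft target: 1 - eps on the labeled class, eps spread over the others."""
    off = eps / (num_classes - 1)
    return [1.0 - eps if k == class_index else off for k in range(num_classes)]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(logits, soft_target):
    """Cross-entropy between a soft target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(soft_target, probs))


# Class 0 = normal, classes 1..3 = three hypothetical pseudo-anomaly types.
target = smooth_labels(class_index=2, num_classes=4, eps=0.1)
print(soft_cross_entropy([0.2, -1.0, 3.0, 0.1], target))
```

Because the target never puts all mass on one class, a mislabeled sample (a contaminated "normal" or a false anomaly) is penalized less harshly than it would be with hard labels.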

Article 1095

Title@2025-05-27 (2): ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval

Title: ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval

ConText-CIR: Von Konzepten lernen im Text für das komponierte Bild-Retrieval

ConText-CIR:从合成图像检索文本中的概念学习 2505.20764v1

Authors: Eric Xing, Pranavi Kolouju, Robert Pless, Abby Stylianou, Nathan Jacobs

Composed image retrieval (CIR) is the task of retrieving a target image specified by a query image and a relative text that describes a semantic modification to the query image. Existing methods in CIR struggle to accurately represent the image and the text modification, resulting in subpar performance. To address this limitation, we introduce a CIR framework, ConText-CIR, trained with a Text Concept-Consistency loss that encourages the representations of noun phrases in the text modification to better attend to the relevant parts of the query image. To support training with this loss function, we also propose a synthetic data generation pipeline that creates training data from existing CIR datasets or unlabeled images. We show that these components together enable stronger performance on CIR tasks, setting a new state-of-the-art in composed image retrieval in both the supervised and zero-shot settings on multiple benchmark datasets, including CIRR and CIRCO. Source code, model checkpoints, and our new datasets are available at https://github.com/mvrl/ConText-CIR.

Article 1096

Title@2025-05-27 (2): Learning to Explain Air Traffic Situation

Title: Learning to Explain Air Traffic Situation

Erklären der Lage im Luftverkehr

学习解释空中交通状况 2502.10764v2

Authors: Hong-ah Chai, Seokbin Yoon, Keumjin Lee

Understanding how air traffic controllers construct a mental ‘picture’ of complex air traffic situations is crucial but remains a challenge due to the inherently intricate, high-dimensional interactions between aircraft, pilots, and controllers. Previous work on modeling the strategies of air traffic controllers and their mental image of traffic situations often centers on specific air traffic control tasks or pairwise interactions between aircraft, neglecting to capture the comprehensive dynamics of an air traffic situation. To address this issue, we propose a machine learning-based framework for explaining air traffic situations. Specifically, we employ a Transformer-based multi-agent trajectory model that encapsulates both the spatio-temporal movement of aircraft and social interaction between them. By deriving attention scores from the model, we can quantify the influence of individual aircraft on overall traffic dynamics. This provides explainable insights into how air traffic controllers perceive and understand the traffic situation. Trained on real-world air traffic surveillance data collected from the terminal airspace around Incheon International Airport in South Korea, our framework effectively explicates air traffic situations. This could potentially support and enhance the decision-making and situational awareness of air traffic controllers.

Article 1097

Title@2025-05-27 (2): Practical estimation of the optimal classification error with soft labels and calibration

Title: Practical estimation of the optimal classification error with soft labels and calibration

Praktische Schätzung des optimalen Klassifizierungsfehlers mit Softlabels und Kalibrierung

用软标签和校准校准对最佳分类错误的实际估计 2505.20761v1

Authors: Ryota Ushio, Takashi Ishida, Masashi Sugiyama

While the performance of machine learning systems has experienced significant improvement in recent years, relatively little attention has been paid to the fundamental question: to what extent can we improve our models? This paper provides a means of answering this question in the setting of binary classification, which is practical and theoretically supported. We extend previous work that utilizes soft labels for estimating the Bayes error, the optimal error rate, in two important ways. First, we theoretically investigate the properties of the bias of the hard-label-based estimator discussed in the original work. We reveal that the decay rate of the bias is adaptive to how well the two class-conditional distributions are separated, and it can decay significantly faster than the previous result suggested as the number of hard labels per instance grows. Second, we tackle a more challenging problem setting: estimation with corrupted soft labels. One might be tempted to use calibrated soft labels instead of clean ones. However, we reveal that a calibration guarantee is not enough; that is, even perfectly calibrated soft labels can result in a substantially inaccurate estimate. Then, we show that isotonic calibration can provide a statistically consistent estimator under an assumption weaker than that of the previous work. Our method is instance-free, i.e., we do not assume access to any input instances. This feature allows it to be adopted in practical scenarios where the instances are not available due to privacy issues. Experiments with synthetic and real-world datasets show the validity of our methods and theory.
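
A short sketch may help show why such an estimator needs no input instances. It assumes the usual plug-in form for binary classification, the average of min(c, 1 - c) over soft labels c (the per-instance probability of the positive class), and approximates soft labels by the fraction of positive hard labels per instance. The function names and toy numbers are hypothetical, and the calibration step discussed in the abstract is not covered.

```python
def bayes_error_estimate(soft_labels):
    """Plug-in Bayes error estimate: mean of min(c, 1 - c) over soft labels."""
    return sum(min(c, 1.0 - c) for c in soft_labels) / len(soft_labels)


def soft_from_hard(hard_labels_per_instance):
    """Approximate each soft label by the fraction of positive hard labels."""
    return [sum(votes) / len(votes) for votes in hard_labels_per_instance]


# Hypothetical annotations: each inner list holds the 0/1 labels for one instance.
votes = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 1, 0], [1, 1, 1, 0]]
soft = soft_from_hard(votes)
print(soft, bayes_error_estimate(soft))
```

With few hard labels per instance the vote fractions are coarse, which is the source of the bias the abstract analyzes.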

Article 1098

Title@2025-05-27 (2): Multi-Stage Speaker Diarization for Noisy Classrooms

Title: Multi-Stage Speaker Diarization for Noisy Classrooms

Mehrstufige Speaker-Diarisierung für Lärmklassenräume

多级发言人多级发言人吵闹教室的响声 2505.10879v2

Authors: Ali Sartaz Khan, Tolulope Ogunremi, Ahmed Adel Attia, Dorottya Demszky

Speaker diarization, the process of identifying “who spoke when” in audio recordings, is essential for understanding classroom dynamics. However, classroom settings present distinct challenges, including poor recording quality, high levels of background noise, overlapping speech, and the difficulty of accurately capturing children’s voices. This study investigates the effectiveness of multi-stage diarization models using Nvidia’s NeMo diarization pipeline. We assess the impact of denoising on diarization accuracy and compare various voice activity detection (VAD) models, including self-supervised transformer-based frame-wise VAD models. We also explore a hybrid VAD approach that integrates Automatic Speech Recognition (ASR) word-level timestamps with frame-level VAD predictions. We conduct experiments using two datasets from English speaking classrooms to separate teacher vs. student speech and to separate all speakers. Our results show that denoising significantly improves the Diarization Error Rate (DER) by reducing the rate of missed speech. Additionally, training on both denoised and noisy datasets leads to substantial performance gains in noisy conditions. The hybrid VAD model leads to further improvements in speech detection, achieving a DER as low as 17% in teacher-student experiments and 45% in all-speaker experiments. However, we also identified trade-offs between voice activity detection and speaker confusion. Overall, our study highlights the effectiveness of multi-stage diarization models and integrating ASR-based information for enhancing speaker diarization in noisy classroom environments.
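
For readers unfamiliar with the metric, DER is conventionally defined as the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The sketch below computes it from those durations; the function name and the durations are hypothetical and are not taken from the paper.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """DER = (missed + false alarm + speaker confusion) / total reference speech.

    All arguments are durations in the same unit (for example, seconds).
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


# Hypothetical durations in seconds for one recording.
der = diarization_error_rate(missed=20.0, false_alarm=10.0, confusion=10.0,
                             total_speech=200.0)
print(f"DER = {der:.1%}")
```

The decomposition also shows the trade-off mentioned above: a more permissive VAD lowers the missed-speech term but can raise the false-alarm and confusion terms.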

Article 1099

Title@2025-05-27 (2): Pairwise Optimal Transports for Training All-to-All Flow-Based Condition Transfer Model

Title: Pairwise Optimal Transports for Training All-to-All Flow-Based Condition Transfer Model

Paarweise Optimale Transporte für Training All-to-All Flow-Based Condition Transfer Modell

以对等方式最佳运输培训全到所有流动条件转让模式 2504.03188v2

Authors: Kotaro Ikeda, Masanori Koyama, Jinzhe Zhang, Kohei Hayashi, Kenji Fukumizu

In this paper, we propose a flow-based method for learning all-to-all transfer maps among conditional distributions that approximates pairwise optimal transport. The proposed method addresses the challenge of handling the case of continuous conditions, which often involve a large set of conditions with sparse empirical observations per condition. We introduce a novel cost function that enables simultaneous learning of optimal transports for all pairs of conditional distributions. Our method is supported by a theoretical guarantee that, in the limit, it converges to the pairwise optimal transports among infinite pairs of conditional distributions. The learned transport maps are subsequently used to couple data points in conditional flow matching. We demonstrate the effectiveness of this method on synthetic and benchmark datasets, as well as on chemical datasets in which continuous physical properties are defined as conditions.

Article 1100

Title@2025-05-27 (2): Scalable Model Merging with Progressive Layer-wise Distillation

Title: Scalable Model Merging with Progressive Layer-wise Distillation

Skalierbares Modell Zusammenführen mit progressiver schichtweiser Destillation

可缩放模型与递进图层蒸馏法合并 2502.12706v2

Authors: Jing Xu, Jiazheng Li, Jingzhao Zhang

Model merging offers an effective way to integrate the capabilities of multiple fine-tuned models. However, the performance degradation of the merged model remains a challenge, particularly when little or no data is available. This paper first highlights the necessity of domain-specific data for model merging by proving that data-agnostic algorithms can have arbitrarily bad worst-case performance. Building on this theoretical insight, we explore the relationship between model merging and distillation, introducing a novel few-shot merging algorithm, ProDistill (Progressive Layer-wise Distillation). Contrary to the common belief that layer-wise training hurts performance, we show that layer-wise teacher-student distillation not only enhances scalability but also improves model merging performance. We conduct extensive experiments to show that, compared to existing few-shot merging methods, ProDistill achieves state-of-the-art performance, with up to 6.14% and 6.61% improvements in vision and NLU tasks. Furthermore, we extend the experiments to models with over 10B parameters, showcasing the exceptional scalability of ProDistill.

Article 1101

Title@2025-05-27 (2): Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction

Title: Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction

Uni-Instruct: Einstufiges Diffusionsmodell durch Unified Diffusion Divergence Instruction

Uni- Instruct: 通过统一扩散分散指令单步扩散模型 2505.20755v1

Authors: Yifei Wang, Weimin Bai, Colin Zhang, Debing Zhang, Weijian Luo, He Sun

In this paper, we unify more than 10 existing one-step diffusion distillation approaches, such as Diff-Instruct, DMD, SIM, SiD, $f$-distill, etc., inside a theory-driven framework which we name Uni-Instruct. Uni-Instruct is motivated by our proposed diffusion expansion theory of the $f$-divergence family. Then we introduce key theories that overcome the intractability issue of the original expanded $f$-divergence, resulting in an equivalent yet tractable loss that effectively trains one-step diffusion models by minimizing the expanded $f$-divergence family. The novel unification introduced by Uni-Instruct not only offers new theoretical contributions that help understand existing approaches from a high-level perspective but also leads to state-of-the-art one-step diffusion generation performance. On the CIFAR10 generation benchmark, Uni-Instruct achieves record-breaking Fréchet Inception Distance (FID) values of 1.46 for unconditional generation and 1.38 for conditional generation. On the ImageNet-$64\times 64$ generation benchmark, Uni-Instruct achieves a new SoTA one-step generation FID of 1.02, which outperforms its 79-step teacher diffusion with a significant improvement margin of 1.33 (1.02 vs 2.35). We also apply Uni-Instruct to broader tasks like text-to-3D generation. For text-to-3D generation, Uni-Instruct gives decent results, which slightly outperform previous methods, such as SDS and VSD, in terms of both generation quality and diversity. Both the solid theoretical and empirical contributions of Uni-Instruct will potentially help future studies on one-step diffusion distillation and knowledge transfer of diffusion models.

Article 1102

Title@2025-05-27 (2): Stationary MMD Points for Cubature

Title: Stationary MMD Points for Cubature

Stationäre MMD-Punkte für Kubature

Cubature 固定的 MMMD点 2505.20754v1

Authors: Zonghao Chen, Toni Karvonen, Heishiro Kanagawa, François-Xavier Briol, Chris. J. Oates

Approximation of a target probability distribution using a finite set of points is a problem of fundamental importance, arising in cubature, data compression, and optimisation. Several authors have proposed to select points by minimising a maximum mean discrepancy (MMD), but the non-convexity of this objective precludes global minimisation in general. Instead, we consider stationary points of the MMD which, in contrast to points globally minimising the MMD, can be accurately computed. Our main theoretical contribution is the (perhaps surprising) result that, for integrands in the associated reproducing kernel Hilbert space, the cubature error of stationary MMD points vanishes faster than the MMD. Motivated by this super-convergence property, we consider discretised gradient flows as a practical strategy for computing stationary points of the MMD, presenting a refined convergence analysis that establishes a novel non-asymptotic finite-particle error bound, which may be of independent interest.
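
As a concrete reference for the objective being discussed, the sketch below evaluates the squared MMD between a candidate point set and a target. It makes two assumptions not stated in the abstract: the kernel is Gaussian with a fixed lengthscale, and the target distribution is represented by a finite sample, so the kernel means become plain averages. Function names are hypothetical, and the gradient-flow procedure itself is not shown.

```python
import math


def gaussian_kernel(x, y, lengthscale=1.0):
    """Gaussian (RBF) kernel between two points given as lists of floats."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * lengthscale ** 2))


def mean_kernel(xs, ys, kernel):
    """Average kernel value over all pairs (x, y)."""
    return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd_squared(points, target_samples, kernel=gaussian_kernel):
    """Squared MMD between two empirical measures with uniform weights."""
    return (
        mean_kernel(points, points, kernel)
        - 2.0 * mean_kernel(points, target_samples, kernel)
        + mean_kernel(target_samples, target_samples, kernel)
    )


target = [[-1.0], [-0.5], [0.0], [0.5], [1.0]]
good_points = [[-0.75], [0.0], [0.75]]
bad_points = [[3.0], [3.5], [4.0]]
print(mmd_squared(good_points, target), mmd_squared(bad_points, target))
```

A point-selection method would move `points` to reduce this quantity; a stationary point is one where its gradient with respect to the point locations vanishes.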

Article 1103

Title@2025-05-27 (2): EaqVLA: Encoding-aligned Quantization for Vision-Language-Action Models

Title: EaqVLA: Encoding-aligned Quantization for Vision-Language-Action Models

EaqVLA: Kodierungsorientierte Quantisierung für Vision-Language-Action-Modelle

EaqVLA: 愿景-语言-行动模式的编码和一致的量化 2505.21567v1

Authors: Feng Jiang, Zihao Zheng, Xiuping Cui, Maoliang Li, JIayu Chen, Xiang Chen

With the development of embodied artificial intelligence, end-to-end control policies such as Vision-Language-Action (VLA) models have become mainstream. Existing VLA models face expensive computing and storage costs, which need to be optimized. Quantization is considered the most effective method, as it can not only reduce memory cost but also accelerate computation. However, we find that the token alignment of VLA models hinders the application of existing quantization methods. To address this, we propose an optimized framework called EaqVLA, which applies encoding-aligned quantization to VLA models. Specifically, we propose a complete analysis method to find the misalignment at various granularities. Based on the analysis results, we propose a mixed-precision quantization scheme that is aware of encoding alignment. Experiments show that the proposed EaqVLA achieves better quantization performance (with minimal quantization loss for end-to-end action control and xxx times acceleration) than existing quantization methods.
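
The notion of mixed-precision quantization can be illustrated generically. The sketch below is not EaqVLA: it assumes plain symmetric uniform quantization with one scale per tensor and simply compares the rounding error at two bit widths, which is the trade-off a mixed-precision scheme exploits when it gives sensitive parts more bits. All names and values are hypothetical.

```python
def fake_quantize(values, bits):
    """Symmetric uniform quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return list(values)
    scale = peak / qmax
    # Round to the nearest integer level, clamp, and map back to floats.
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def max_error(values, bits):
    """Largest absolute error introduced by quantizing `values` to `bits` bits."""
    return max(abs(v - q) for v, q in zip(values, fake_quantize(values, bits)))


weights = [0.03, -0.5, 0.27, 1.0]
for bits in (4, 8):
    print(bits, "bits -> max error", max_error(weights, bits))
```

A mixed-precision policy would choose `bits` per layer or per token group; how EaqVLA makes that choice from its alignment analysis is not specified in the abstract.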

Article 1104

Title@2025-05-27 (2): Map Space Belief Prediction for Manipulation-Enhanced Mapping

Title: Map Space Belief Prediction for Manipulation-Enhanced Mapping

Karte Raum Glaube Vorhersage für manipulations-verbesserte Mapping

人工-增强绘图的地图空间信仰预测 2502.20606v2

Authors: Joao Marcos Correia Marques, Nils Dengler, Tobias Zaenker, Jesper Mucke, Shenlong Wang, Maren Bennewitz, Kris Hauser

Searching for objects in cluttered environments requires selecting efficient viewpoints and manipulation actions to remove occlusions and reduce uncertainty in object locations, shapes, and categories. In this work, we address the problem of manipulation-enhanced semantic mapping, where a robot has to efficiently identify all objects in a cluttered shelf. Although Partially Observable Markov Decision Processes (POMDPs) are standard for decision-making under uncertainty, representing unstructured interactive worlds remains challenging in this formalism. To tackle this, we define a POMDP whose belief is summarized by a metric-semantic grid map and propose a novel framework that uses neural networks to perform map-space belief updates to reason efficiently and simultaneously about object geometries, locations, categories, occlusions, and manipulation physics. Further, to enable accurate information gain analysis, the learned belief updates should maintain calibrated estimates of uncertainty. Therefore, we propose Calibrated Neural-Accelerated Belief Updates (CNABUs) to learn a belief propagation model that generalizes to novel scenarios and provides confidence-calibrated predictions for unknown areas. Our experiments show that our novel POMDP planner improves map completeness and accuracy over existing methods in challenging simulations and successfully transfers to real-world cluttered shelves in zero-shot fashion.

Article 1105

Title@2025-05-27 (2): MOLLM: Multi-Objective Large Language Model for Molecular Design – Optimizing with Experts

Title: MOLLM: Multi-Objective Large Language Model for Molecular Design – Optimizing with Experts

MOLLM: Multi-Objective Large Language Model for Molecular Design – Optimierung mit Experten

MOLLM: 分子设计多目标大语言模型 – – 与专家优化 2502.12845v2

Authors: Nian Ran, Yue Wang, Richard Allmendinger

Molecular design plays a critical role in advancing fields such as drug discovery, materials science, and chemical engineering. This work introduces the Multi-Objective Large Language Model for Molecular Design (MOLLM), a novel framework that combines domain-specific knowledge with the adaptability of large language models to optimize molecular properties across multiple objectives. Leveraging in-context learning and multi-objective optimization, MOLLM achieves superior performance and innovation, consistently surpassing state-of-the-art (SOTA) methods. We significantly improve the efficiency of our framework, making it 14 times faster and substantially more cost-effective without compromising performance compared to the latest similar work. Our results demonstrate that MOLLM consistently outperforms SOTA models across experiments and excels on the PMO benchmark. In addition, we provide extensive ablation studies and analysis to evaluate the effectiveness of each component and the quality of the output molecules.

Article 1106

Title@2025-05-27 (2): ‘Hello, World!’: Making GNNs Talk with LLMs

Title: ‘Hello, World!’: Making GNNs Talk with LLMs

“Hallo, Welt!”: GNNs mit LLMs sprechen zu lassen

“你好,世界!” “让GNNs和LLMs说话” 2505.20742v1

Authors: Sunwoo Kim, Soo Yong Lee, Jaemin Yoo, Kijung Shin

While graph neural networks (GNNs) have shown remarkable performance across diverse graph-related tasks, their high-dimensional hidden representations render them black boxes. In this work, we propose Graph Lingual Network (GLN), a GNN built on large language models (LLMs), with hidden representations in the form of human-readable text. Through careful prompt design, GLN incorporates not only the message passing module of GNNs but also advanced GNN techniques, including graph attention and initial residual connection. The comprehensibility of GLN’s hidden representations enables an intuitive analysis of how node representations change (1) across layers and (2) under advanced GNN techniques, shedding light on the inner workings of GNNs. Furthermore, we demonstrate that GLN achieves strong zero-shot performance on node classification and link prediction, outperforming existing LLM-based baseline methods.

Article 1107

Title@2025-05-27 (2): Can Small Language Models Learn, Unlearn, and Retain Noise Patterns?

Title: Can Small Language Models Learn, Unlearn, and Retain Noise Patterns?

Können kleine Sprachmodelle Geräuschmuster lernen, nicht lernen und erhalten?

小语言模型能够学习、不学习和保留噪音模式吗? 2407.00996v3

Authors: Nicy Scaria, Silvester John Joseph Kennedy, Deepak Subramani

With the growing need for efficient language models in resource-constrained environments, Small Language Models (SLMs) have emerged as compact and practical alternatives to Large Language Models (LLMs). While studies have explored noise handling in LLMs, little is known about how SLMs handle noise, a critical factor for their reliable real-world deployment. This study investigates the ability of SLMs with parameters between 1 and 3 billion to learn, retain, and subsequently eliminate different types of noise (word flip, character flip, transliteration, irrelevant content, and contradictory information). Four pretrained SLMs (Olmo 1B, Qwen1.5 1.8B, Gemma1.1 2B, and Phi2 2.7B) were instruction-tuned on noise-free data and tested with in-context examples to assess noise learning. Subsequently, noise patterns were introduced in instruction tuning to assess their adaptability. The results revealed differences in how models handle noise, with smaller models like Olmo quickly adapting to noise patterns. Phi2’s carefully curated, structured, and high-quality pretraining data enabled resistance to character level, transliteration, and counterfactual noise, while Gemma adapted successfully to transliteration noise through its multilingual pretraining. Subsequent clean data training effectively mitigated noise effects. These findings provide practical strategies for developing robust SLMs for real-world applications.

Article 1108

Title@2025-05-27 (2): Detecting Informative Channels: ActionFormer

Title: Detecting Informative Channels: ActionFormer

Informative Kanäle erkennen: ActionFormer

检测信息渠道:行动前 2505.20739v1

Authors: Kunpeng Zhao, Asahi Miyazaki, Tsuyoshi Okita

Human Activity Recognition (HAR) has recently advanced with Transformer-based models. In particular, ActionFormer offers a new perspective for HAR because it outputs the boundaries of activities in addition to the activity labels. ActionFormer was originally proposed for image/video input, but it has since been adapted to sensor signals as well. We analyze this adaptation extensively in terms of deep learning architectures. Prior reports point to high temporal dynamics, which limit the model’s ability to capture subtle changes effectively, and to interdependencies between spatial and temporal features. Based on these reports, we propose a modified ActionFormer that mitigates these defects for sensor signals. The key to our approach is to follow the Sequence-and-Excitation strategy to minimize the increase in additional parameters and to opt for the swish activation function to retain information about direction in the negative range. Experiments on the WEAR dataset show that our method achieves a substantial improvement of 16.01% in average mAP for inertial data.

Article 1109

Title@2025-05-27 (2): Adversarial bandit optimization for approximately linear functions

Title: Adversarial bandit optimization for approximately linear functions

Adversariale Bandit-Optimierung für etwa lineare Funktionen

大约直线功能的对面土匪优化 2505.20734v1

Authors: Zhuoyu Cheng, Kohei Hatano, Eiji Takimoto

We consider a bandit optimization problem for nonconvex and non-smooth functions, where in each trial the loss function is the sum of a linear function and a small but arbitrary perturbation chosen after observing the player’s choice. We give both expected and high-probability regret bounds for the problem. Our result also implies an improved high-probability regret bound for bandit linear optimization, a special case with no perturbation. We also give a lower bound on the expected regret.

Article 1110

Title@2025-05-27 (2): SPA-RL: Reinforcing LLM Agents via Stepwise Progress Attribution

Title: SPA-RL: Reinforcing LLM Agents via Stepwise Progress Attribution

SPA-RL: Verstärkung der LLM-Agenten durch schrittweise Fortschrittszuweisung

SPA-RL:通过逐步推进加强LLM代理 2505.20732v1

Authors: Hanlin Wang, Chak Tou Leong, Jiashuo Wang, Jian Wang, Wenjie Li

Reinforcement learning (RL) holds significant promise for training LLM agents to handle complex, goal-oriented tasks that require multi-step interactions with external environments. However, a critical challenge when applying RL to these agentic tasks arises from delayed rewards: feedback signals are typically available only after the entire task is completed. This makes it non-trivial to assign delayed rewards to earlier actions, providing insufficient guidance regarding environmental constraints and hindering agent training. In this work, we draw on the insight that the ultimate completion of a task emerges from the cumulative progress an agent makes across individual steps. We propose Stepwise Progress Attribution (SPA), a general reward redistribution framework that decomposes the final reward into stepwise contributions, each reflecting its incremental progress toward overall task completion. To achieve this, we train a progress estimator that accumulates stepwise contributions over a trajectory to match the task completion. During policy optimization, we combine the estimated per-step contribution with a grounding signal for actions executed in the environment as the fine-grained, intermediate reward for effective agent training. Extensive experiments on common agent benchmarks (including Webshop, ALFWorld, and VirtualHome) demonstrate that SPA consistently outperforms the state-of-the-art method in both success rate (+2.5% on average) and grounding accuracy (+1.9% on average). Further analyses demonstrate that our method remarkably provides more effective intermediate rewards for RL training. Our code is available at https://github.com/WangHanLinHenry/SPA-RL-Agent.

Article 1111

Title@2025-05-27 (2): Semi-supervised Clustering Through Representation Learning of Large-scale EHR Data

Title: Semi-supervised Clustering Through Representation Learning of Large-scale EHR Data

Halbüberwachtes Clustering durch Repräsentationslernen von EHR-Großdaten

通过代表学习大规模电子人力资源数据,进行半监督的集群组合 2505.20731v1

Authors: Linshanshan Wang, Mengyan Li, Zongqi Xia, Molei Liu, Tianxi Cai

Electronic Health Records (EHR) offer rich real-world data for personalized medicine, providing insights into disease progression, treatment responses, and patient outcomes. However, their sparsity, heterogeneity, and high dimensionality make them difficult to model, while the lack of standardized ground truth further complicates predictive modeling. To address these challenges, we propose SCORE, a semi-supervised representation learning framework that captures multi-domain disease profiles through patient embeddings. SCORE employs a Poisson-Adapted Latent factor Mixture (PALM) Model with pre-trained code embeddings to characterize codified features and extract meaningful patient phenotypes and embeddings. To handle the computational challenges of large-scale data, it introduces a hybrid Expectation-Maximization (EM) and Gaussian Variational Approximation (GVA) algorithm, leveraging limited labeled data to refine estimates on a vast pool of unlabeled samples. We theoretically establish the convergence of this hybrid approach, quantify GVA errors, and derive SCORE’s error rate under diverging embedding dimensions. Our analysis shows that incorporating unlabeled data enhances accuracy and reduces sensitivity to label scarcity. Extensive simulations confirm SCORE’s superior finite-sample performance over existing methods. Finally, we apply SCORE to predict disability status for patients with multiple sclerosis (MS) using partially labeled EHR data, demonstrating that it produces more informative and predictive patient embeddings for multiple MS-related conditions compared to existing approaches.


Article 1112

Title@2025-05-27 (2): What LLMs Miss in Recommendations: Bridging the Gap with Retrieval-Augmented Collaborative Signals

Title: What LLMs Miss in Recommendations: Bridging the Gap with Retrieval-Augmented Collaborative Signals

Was LLMs in Empfehlungen vermissen: Die Lücke mit retrieval-Augmented Collaborative Signals überbrücken

在建议中错过了什么的LLM女士:用检索增强的合作信号弥合差距 2505.20730v1

Authors: Shahrooz Pouryousef

User-item interactions contain rich collaborative signals that form the backbone of many successful recommender systems. While recent work has explored the use of large language models (LLMs) for recommendation, it remains unclear whether LLMs can effectively reason over this type of collaborative information. In this paper, we conduct a systematic comparison between LLMs and classical matrix factorization (MF) models to assess LLMs’ ability to leverage user-item interaction data. We further introduce a simple retrieval-augmented generation (RAG) method that enhances LLMs by grounding their predictions in structured interaction data. Our experiments reveal that current LLMs often fall short in capturing collaborative patterns inherent to MF models, but that our RAG-based approach substantially improves recommendation quality, highlighting a promising direction for future LLM-based recommenders.


Article 1113

Title@2025-05-27 (2): Energy-based generator matching: A neural sampler for general state space

Title: Energy-based generator matching: A neural sampler for general state space

Energiebasierte Generator-Matching: Ein neuronaler Sampler für den allgemeinen Zustandsraum

基于能源的发电机匹配:一般状态空间的神经取样器 2505.19646v2

Authors: Dongyeop Woo, Minsu Kim, Minkyu Kim, Kiyoung Seong, Sungsoo Ahn

We propose Energy-based generator matching (EGM), a modality-agnostic approach to train generative models from energy functions in the absence of data. Extending the recently proposed generator matching, EGM enables training of arbitrary continuous-time Markov processes, e.g., diffusion, flow, and jump, and can generate data from continuous, discrete, and a mixture of two modalities. To this end, we propose estimating the generator matching loss using self-normalized importance sampling with an additional bootstrapping trick to reduce variance in the importance weight. We validate EGM on both discrete and multimodal tasks up to 100 and 20 dimensions, respectively.
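As a rough illustration of the self-normalized importance sampling (SNIS) that the abstract mentions, the sketch below estimates an expectation under a density known only through its energy. It is a generic SNIS estimator, not the EGM loss or its bootstrapping trick; the toy energy, the Gaussian proposal, and all function names are assumptions made for this example.

```python
import math
import random


def snis_expectation(f, energy, sample_q, log_q, n, rng):
    """Estimate E_p[f(x)] for p(x) proportional to exp(-energy(x)).

    Samples come from a proposal q; weights are normalized by their own sum,
    so neither p nor q needs a known normalizing constant.
    """
    xs = [sample_q(rng) for _ in range(n)]
    log_w = [-energy(x) - log_q(x) for x in xs]
    shift = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return sum(wi * f(x) for wi, x in zip(w, xs)) / total


# Toy target (assumption): unit-variance Gaussian centred at 1.
def energy(x):
    return 0.5 * (x - 1.0) ** 2


# Proposal (assumption): zero-mean Gaussian with standard deviation 2.
def sample_q(rng):
    return rng.gauss(0.0, 2.0)


def log_q(x):
    return -x * x / 8.0  # additive constant dropped; it cancels in SNIS


estimate = snis_expectation(lambda x: x, energy, sample_q, log_q,
                            20000, random.Random(0))
print(estimate)  # should land close to the target mean of 1
```

The estimator is biased for finite sample sizes, and its variance depends on how well the proposal covers the target, which is the kind of weight variance the abstract says the bootstrapping trick is meant to reduce.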


Article 1114

Title@2025-05-27 (2): A reinforcement learning agent for maintenance of deteriorating systems with increasingly imperfect repairs

Title: A reinforcement learning agent for maintenance of deteriorating systems with increasingly imperfect repairs

Ein Verstärkungs-Lernmittel für die Instandhaltung von verschlechternden Systemen mit zunehmend unvollkommenen Reparaturen

强化学习代理,用于维护修理越来越不完善的恶化系统 2505.20725v1

Authors: Alberto Pliego Marugán, Jesús M. Pinar-Pérez, Fausto Pedro García Márquez

Efficient maintenance has always been essential for the successful application of engineering systems. However, the challenges to be overcome in the implementation of Industry 4.0 necessitate new paradigms of maintenance optimization. Machine learning techniques are becoming increasingly used in engineering and maintenance, with reinforcement learning being one of the most promising. In this paper, we propose a gamma degradation process together with a novel maintenance model in which repairs are increasingly imperfect, i.e., the beneficial effect of system repairs decreases as more repairs are performed, reflecting the degradational behavior of real-world systems. To generate maintenance policies for this system, we developed a reinforcement-learning-based agent using a Double Deep Q-Network architecture. This agent presents two important advantages: it works without a predefined preventive threshold, and it can operate in a continuous degradation state space. Our agent learns to behave in different scenarios, showing great flexibility. In addition, we performed an analysis of how changes in the main parameters of the environment affect the maintenance policy proposed by the agent. The proposed approach is demonstrated to be appropriate and to significantly improve long-run cost as compared with other common maintenance strategies.
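To make the environment described above more concrete, here is a minimal simulation sketch of a gamma degradation process with increasingly imperfect repairs. It uses a fixed repair threshold purely as a simple baseline policy; it is not the Double Deep Q-Network agent, which the abstract says works without such a threshold. The geometric decay of repair effectiveness and every parameter value are assumptions for illustration.

```python
import random


def simulate_threshold_policy(steps, threshold, shape=0.5, scale=1.0,
                              efficiency=0.8, seed=0):
    """Simulate degradation under a fixed-threshold repair policy.

    Degradation grows by independent Gamma(shape, scale) increments.
    The k-th repair removes a fraction efficiency**k of the current
    degradation (assumed form), so each repair helps less than the last.
    """
    rng = random.Random(seed)
    level, repairs, history = 0.0, 0, []
    for _ in range(steps):
        level += rng.gammavariate(shape, scale)  # non-negative increment
        if level >= threshold:
            repairs += 1
            level *= 1.0 - efficiency ** repairs
        history.append(level)
    return history, repairs


history, repairs = simulate_threshold_policy(steps=200, threshold=5.0)
print(repairs, round(history[-1], 2))
```

Under this assumed form, later repairs barely reduce the degradation level, so a fixed threshold ends up triggering a repair at almost every step, which hints at why a learned policy may be preferable.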


Article 1115

Title@2025-05-27 (2): LeDiFlow: Learned Distribution-guided Flow Matching to Accelerate Image Generation

Title: LeDiFlow: Learned Distribution-guided Flow Matching to Accelerate Image Generation

LeDiFlow: Erlernter, verteilungsgeführter Fluss passend zur beschleunigten Bildgenerierung

LediFlow:为加速图像生成而实现的派发指导流动匹配 2505.20723v1

Authors: Pascal Zwick, Nils Friederich, Maximilian Beichter, Lennart Hilbert, Ralf Mikut, Oliver Bringmann

Enhancing the efficiency of high-quality image generation using Diffusion Models (DMs) is a significant challenge due to the iterative nature of the process. Flow Matching (FM) is emerging as a powerful generative modeling paradigm based on a simulation-free training objective instead of a score-based one used in DMs. Typical FM approaches rely on a Gaussian distribution prior, which induces curved, conditional probability paths between the prior and target data distribution. These curved paths pose a challenge for the Ordinary Differential Equation (ODE) solver, requiring a large number of inference calls to the flow prediction network. To address this issue, we present Learned Distribution-guided Flow Matching (LeDiFlow), a novel scalable method for training FM-based image generation models using a better-suited prior distribution learned via a regression-based auxiliary model. By initializing the ODE solver with a prior closer to the target data distribution, LeDiFlow enables the learning of more computationally tractable probability paths. These paths directly translate to fewer solver steps needed for high-quality image generation at inference time. Our method utilizes a State-Of-The-Art (SOTA) transformer architecture combined with latent space sampling and can be trained on a consumer workstation. We empirically demonstrate that LeDiFlow remarkably outperforms the respective FM baselines. For instance, when operating directly on pixels, our model accelerates inference by up to 3.75x compared to the corresponding pixel-space baseline. Simultaneously, our latent FM model enhances image quality on average by 1.32x in CLIP Maximum Mean Discrepancy (CMMD) metric against its respective baseline.


Article 1116

Title@2025-05-27 (2): Diffusion Model-based Activity Completion for AI Motion Capture from Videos

Title: Diffusion Model-based Activity Completion for AI Motion Capture from Videos

Diffusion Modellbasierte Aktivitätsvervollständigung für AI Motion Capture aus Videos

AI 从视频中抓取 AI 运动的传播示范活动完成 2505.21566v1

Authors: Gao Huayu, Huang Tengjiu, Ye Xiaolong, Tsuyoshi Okita

AI-based motion capture is an emerging technology that offers a cost-effective alternative to traditional motion capture systems. However, current AI motion capture methods rely entirely on observed video sequences, similar to conventional motion capture. This means that all human actions must be predefined, and movements outside the observed sequences are not possible. To address this limitation, we aim to apply AI motion capture to virtual humans, where flexible actions beyond the observed sequences are required. We assume that while many action fragments exist in the training data, the transitions between them may be missing. To bridge these gaps, we propose a diffusion-model-based action completion technique that generates complementary human motion sequences, ensuring smooth and continuous movements. By introducing a gate module and a position-time embedding module, our approach achieves competitive results on the Human3.6M dataset. Our experimental results show that (1) MDC-Net outperforms existing methods in ADE, FDE, and MMADE but is slightly less accurate in MMFDE, (2) MDC-Net has a smaller model size (16.84M) compared to HumanMAC (28.40M), and (3) MDC-Net generates more natural and coherent motion sequences. Additionally, we propose a method for extracting sensor data, including acceleration and angular velocity, from human motion sequences.


Article 1117

Title@2025-05-27 (2): Recurrent Neural Operators: Stable Long-Term PDE Prediction

Title: Recurrent Neural Operators: Stable Long-Term PDE Prediction

Recurrent Neural Operators: Stabile Langzeit-PDE-Vorhersage

经常性神经操作员:稳定的长期PDE预测 2505.20721v1

Authors: Zaijun Ye, Chen-Song Zhang, Wansheng Wang

Neural operators have emerged as powerful tools for learning solution operators of partial differential equations. However, in time-dependent problems, standard training strategies such as teacher forcing introduce a mismatch between training and inference, leading to compounding errors in long-term autoregressive predictions. To address this issue, we propose Recurrent Neural Operators (RNOs), a novel framework that integrates recurrent training into neural operator architectures. Instead of conditioning each training step on ground-truth inputs, RNOs recursively apply the operator to their own predictions over a temporal window, effectively simulating inference-time dynamics during training. This alignment mitigates exposure bias and enhances robustness to error accumulation. Theoretically, we show that recurrent training can reduce the worst-case exponential error growth typical of teacher forcing to linear growth. Empirically, we demonstrate that recurrently trained Multigrid Neural Operators significantly outperform their teacher-forced counterparts in long-term accuracy and stability on standard benchmarks. Our results underscore the importance of aligning training with inference dynamics for robust temporal generalization in neural operator learning.
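The difference between teacher forcing and recurrent training can be sketched with a scalar toy problem. The code below only contrasts the two loss definitions for a one-step model; the linear dynamics, the slightly wrong model, and the function names are assumptions, and nothing here reproduces the RNO architecture or its theoretical bounds.

```python
def teacher_forced_loss(step, traj):
    """Mean squared one-step error, each step fed the ground-truth state."""
    pairs = list(zip(traj, traj[1:]))
    return sum((step(u) - u_next) ** 2 for u, u_next in pairs) / len(pairs)


def rollout_loss(step, traj):
    """Mean squared error when the model is fed its own predictions."""
    pred, total = traj[0], 0.0
    for target in traj[1:]:
        pred = step(pred)  # recursive application, as at inference time
        total += (pred - target) ** 2
    return total / (len(traj) - 1)


# Toy ground truth (assumption): u_{t+1} = 0.9 * u_t, starting from 1.
traj = [0.9 ** t for t in range(11)]


# Slightly wrong learned operator (assumption).
def model(u):
    return 0.95 * u


print(teacher_forced_loss(model, traj))
print(rollout_loss(model, traj))
```

In this toy case the rollout loss exceeds the teacher-forced loss because the per-step error compounds over the window, so a model judged only by the teacher-forced loss looks better than it will behave autoregressively.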


Article 1118

Title@2025-05-27 (2): ProgCo: Program Helps Self-Correction of Large Language Models

Title: ProgCo: Program Helps Self-Correction of Large Language Models

ProgCo: Programm hilft bei der Selbstkorrektur großer Sprachmodelle

ProgC:帮助大语言模式自我校正方案 2501.01264v2

Authors: Xiaoshuai Song, Yanan Wu, Weixun Wang, Jiaheng Liu, Wenbo Su, Bo Zheng

Self-Correction aims to enable large language models (LLMs) to self-verify and self-refine their initial responses without external feedback. However, LLMs often fail to effectively self-verify and generate correct feedback, further misleading refinement and leading to the failure of self-correction, especially in complex reasoning tasks. In this paper, we propose Program-driven Self-Correction (ProgCo). First, program-driven verification (ProgVe) achieves complex verification logic and extensive validation through self-generated, self-executing verification pseudo-programs. Then, program-driven refinement (ProgRe) receives feedback from ProgVe and conducts dual reflection and refinement on both responses and verification programs to mitigate the misleading effect of incorrect feedback in complex reasoning tasks. Experiments on three instruction-following and mathematical benchmarks indicate that ProgCo achieves effective self-correction, and can further enhance performance when combined with real program tools. We release our code at https://github.com/songxiaoshuai/progco.


Article 1119

Title@2025-05-27 (2): LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multimodal Large Language Models

Title: LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multimodal Large Language Models

LatentExplainer: Erklären von latenten Darstellungen in tiefgenerativen Modellen mit multimodalen großen Sprachmodellen

前任Explainer:在多模式大语言模型的深创模型中解释前述表述 2406.14862v6

Authors: Mengdan Zhu, Raasikh Kanjiani, Jiahui Lu, Andrew Choi, Qirui Ye, Liang Zhao

Deep generative models like VAEs and diffusion models have advanced various generation tasks by leveraging latent variables to learn data distributions and generate high-quality samples. Despite the field of explainable AI making strides in interpreting machine learning models, understanding latent variables in generative models remains challenging. This paper introduces LatentExplainer, a framework for automatically generating semantically meaningful explanations of latent variables in deep generative models. LatentExplainer tackles three main challenges: inferring the meaning of latent variables, aligning explanations with inductive biases, and handling varying degrees of explainability. Our approach perturbs latent variables, interpreting changes in generated data, and uses multimodal large language models (MLLMs) to produce human-understandable explanations. We evaluate our proposed method on several real-world and synthetic datasets, and the results demonstrate superior performance in generating high-quality explanations for latent variables. The results highlight the effectiveness of incorporating inductive biases and uncertainty quantification, significantly enhancing model interpretability.


Article 1120

Title@2025-05-27 (2): PCDCNet: A Surrogate Model for Air Quality Forecasting with Physical-Chemical Dynamics and Constraints

Title: PCDCNet: A Surrogate Model for Air Quality Forecasting with Physical-Chemical Dynamics and Constraints

PCDCNet: Ein Surrogate-Modell für die Luftqualitätsprognose mit physikalisch-chemischer Dynamik und Einschränkungen

PCDCNet:利用物理化学动态和制约因素进行空气质量预测的替代模型 2505.19842v2

Authors: Shuo Wang, Yun Cheng, Qingye Meng, Olga Saukh, Jiang Zhang, Jingfang Fan, Yuanting Zhang, Xingyuan Yuan, Lothar Thiele

Air quality forecasting (AQF) is critical for public health and environmental management, yet remains challenging due to the complex interplay of emissions, meteorology, and chemical transformations. Traditional numerical models, such as CMAQ and WRF-Chem, provide physically grounded simulations but are computationally expensive and rely on uncertain emission inventories. Deep learning models, while computationally efficient, often struggle with generalization due to their lack of physical constraints. To bridge this gap, we propose PCDCNet, a surrogate model that integrates numerical modeling principles with deep learning. PCDCNet explicitly incorporates emissions, meteorological influences, and domain-informed constraints to model pollutant formation, transport, and dissipation. By combining graph-based spatial transport modeling, recurrent structures for temporal accumulation, and representation enhancement for local interactions, PCDCNet achieves state-of-the-art (SOTA) performance in 72-hour station-level PM2.5 and O3 forecasting while significantly reducing computational costs. Furthermore, our model is deployed in an online platform, providing free, real-time air quality forecasts, demonstrating its scalability and societal impact. By aligning deep learning with physical consistency, PCDCNet offers a practical and interpretable solution for AQF, enabling informed decision-making for both personal and regulatory applications.


Article 1121

Title@2025-05-27 (2): What is Fair? Defining Fairness in Machine Learning for Health

Title: What is Fair? Defining Fairness in Machine Learning for Health

Was ist fair? Fairness im maschinellen Lernen für die Gesundheit definieren

什么是公平?界定机器保健学习的公平性 2406.09307v5

Authors: Jianhui Gao, Benson Chou, Zachary R. McCaw, Hilary Thurston, Paul Varghese, Chuan Hong, Jessica Gronsbell

Ensuring that machine learning (ML) models are safe, effective, and equitable across all patients is critical for clinical decision-making and for preventing the amplification of existing health disparities. In this work, we examine how fairness is conceptualized in ML for health, including why ML models may lead to unfair decisions and how fairness has been measured in diverse real-world applications. We review commonly used fairness notions within group, individual, and causal-based frameworks. We also discuss the outlook for future research and highlight opportunities and challenges in operationalizing fairness in health-focused applications.


Article 1122

Title@2025-05-27 (2): Are Data Embeddings effective in time series forecasting?

Title: Are Data Embeddings effective in time series forecasting?

Sind Daten-Embeddings in der Zeitreihenvorhersage wirksam?

数据嵌入在时间序列预测中是否有效? 2505.20716v1

Authors: Reza Nematirad, Anil Pahwa, Balasubramaniam Natarajan

Time series forecasting plays a crucial role in many real-world applications, and numerous complex forecasting models have been proposed in recent years. Despite their architectural innovations, most state-of-the-art models report only marginal improvements – typically just a few thousandths in standard error metrics. These models often incorporate complex data embedding layers to transform raw inputs into higher-dimensional representations to enhance accuracy. But are data embedding techniques actually effective in time series forecasting? Through extensive ablation studies across fifteen state-of-the-art models and four benchmark datasets, we find that removing data embedding layers from many state-of-the-art models does not degrade forecasting performance. In many cases, it improves both accuracy and computational efficiency. The gains from removing embedding layers often exceed the performance differences typically reported between competing models. Code available at: https://github.com/neuripsdataembedidng/DataEmbedding


Article 1123

Title@2025-05-27 (2): Wideband RF Radiance Field Modeling Using Frequency-embedded 3D Gaussian Splatting

Title: Wideband RF Radiance Field Modeling Using Frequency-embedded 3D Gaussian Splatting

Wideband RF Radiance Field Modellierung mit Frequenz eingebettet 3D Gaussian Splatting

使用频率组合的 3D 高斯平面 2505.20714v1

Authors: Zechen Li, Lanqing Yang, Yiheng Bian, Hao Pan, Yongjian Fu, Yezhou Wang, Yi-Chao Chen, Guangtao Xue, Ju Ren

This paper presents an innovative frequency-embedded 3D Gaussian splatting (3DGS) algorithm for wideband radio-frequency (RF) radiance field modeling, offering an advancement over the existing works limited to single-frequency modeling. Grounded in fundamental physics, we uncover the complex relationship between EM wave propagation behaviors and RF frequencies. Inspired by this, we design an EM feature network with attenuation and radiance modules to learn the complex relationships between RF frequencies and the key properties of each 3D Gaussian, specifically the attenuation factor and RF signal intensity. By training the frequency-embedded 3DGS model, we can efficiently reconstruct RF radiance fields at arbitrary unknown frequencies within a given 3D environment. Finally, we propose a large-scale power angular spectrum (PAS) dataset containing 50000 samples ranging from 1 to 100 GHz in 6 indoor environments, and conduct extensive experiments to verify the effectiveness of our method. Our approach achieves an average Structural Similarity Index Measure (SSIM) up to 0.72, and a significant improvement up to 17.8% compared to the current state-of-the-art (SOTA) methods trained on individual test frequencies. Additionally, our method achieves an SSIM of 0.70 without prior training on these frequencies, which represents only a 2.8% performance drop compared to models trained with full PAS data. This demonstrates our model’s capability to estimate PAS at unknown frequencies. For related code and datasets, please refer to https://github.com/sim-2-real/Wideband3DGS.


Article 1124

Title@2025-05-27 (2): Does Graph Prompt Work? A Data Operation Perspective with Theoretical Analysis

Title: Does Graph Prompt Work? A Data Operation Perspective with Theoretical Analysis

Funktioniert Graph Prompt? Eine Datenbetriebsperspektive mit theoretischer Analyse

《图表迅速工作吗? 带有理论分析的数据操作视角》 2410.01635v2

Authors: Qunzhong Wang, Xiangguo Sun, Hong Cheng

In recent years, graph prompting has emerged as a promising research direction, enabling the learning of additional tokens or subgraphs appended to the original graphs without requiring retraining of pre-trained graph models across various applications. This novel paradigm, shifting from the traditional pretraining and finetuning to pretraining and prompting, has shown significant empirical success in simulating graph data operations, with applications ranging from recommendation systems to biological networks and graph transferring. However, despite its potential, the theoretical underpinnings of graph prompting remain underexplored, raising critical questions about its fundamental effectiveness. The lack of rigorous theoretical proof of why and how well it works hangs like a dark cloud over the graph prompt area and hinders further progress. To fill this gap, this paper introduces a theoretical framework that rigorously analyzes graph prompting from a data operation perspective. Our contributions are threefold: First, we provide a formal guarantee theorem, demonstrating graph prompts' capacity to approximate graph transformation operators, effectively linking upstream and downstream tasks. Second, we derive upper bounds on the error of these data operations by graph prompts for a single graph and extend this discussion to batches of graphs, which are common in graph model training. Third, we analyze the distribution of data operation errors, extending our theoretical findings from linear graph models (e.g., GCN) to non-linear graph models (e.g., GAT). Extensive experiments support our theoretical results and confirm the practical implications of these guarantees.


Article 1125

Title@2025-05-27 (2): Time-Series Learning for Proactive Fault Prediction in Distributed Systems with Deep Neural Structures

Title: Time-Series Learning for Proactive Fault Prediction in Distributed Systems with Deep Neural Structures

Time-Series Learning für proaktive Fehlervorhersage in verteilten Systemen mit tiefen neuralen Strukturen

深心神经结构分布系统预发性故障预测时间序列学习 2505.20705v1

Authors: Yang Wang, Wenxuan Zhu, Xuehui Quan, Heyi Wang, Chang Liu, Qiyuan Wu

This paper addresses the challenges of fault prediction and delayed response in distributed systems by proposing an intelligent prediction method based on temporal feature learning. The method takes multi-dimensional performance metric sequences as input. We use a Gated Recurrent Unit (GRU) to model the evolution of system states over time. An attention mechanism is then applied to enhance key temporal segments, improving the model’s ability to identify potential faults. On this basis, a feedforward neural network is designed to perform the final classification, enabling early warning of system failures. To validate the effectiveness of the proposed approach, comparative experiments and ablation analyses were conducted using data from a large-scale real-world cloud system. The experimental results show that the model outperforms various mainstream time-series models in terms of Accuracy, F1-Score, and AUC. This demonstrates strong prediction capability and stability. Furthermore, the loss function curve confirms the convergence and reliability of the training process. It indicates that the proposed method effectively learns system behavior patterns and achieves efficient fault detection.


Article 1126

Title@2025-05-27 (2): NeUQI: Near-Optimal Uniform Quantization Parameter Initialization

Title: NeUQI: Near-Optimal Uniform Quantization Parameter Initialization

NeUQI: Beinahe-optimale einheitliche Quantisierung Parameter Initialisierung

NeUQI: 近最佳统一量化参数初始化 2505.17595v2

Authors: Li Lin, Xinyu Hu, Xiaojun Wan

Large language models (LLMs) achieve impressive performance across domains but face significant challenges when deployed on consumer-grade GPUs or personal devices such as laptops, due to high memory consumption and inference costs. Post-training quantization (PTQ) of LLMs offers a promising solution that reduces their memory footprint and decoding latency. In practice, PTQ with uniform quantization representation is favored for its efficiency and ease of deployment since uniform quantization is widely supported by mainstream hardware and software libraries. Recent studies on $\geq 2$-bit uniform quantization have led to noticeable improvements in post-quantization model performance; however, they primarily focus on quantization methodologies, while the initialization of quantization parameters is underexplored and still relies on the suboptimal Min-Max strategies. In this work, we propose NeUQI, a method devoted to efficiently determining near-optimal initial parameters for uniform quantization. NeUQI is orthogonal to prior quantization methodologies and can seamlessly integrate with them. The experiments with the LLaMA and Qwen families on various tasks demonstrate that our NeUQI consistently outperforms existing methods. Furthermore, when combined with a lightweight distillation strategy, NeUQI can achieve superior performance to PV-tuning, a much more resource-intensive approach.
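For context, the Min-Max initialization that the abstract calls suboptimal can be sketched as follows. This shows only the conventional baseline, not NeUQI; the integer zero-point formulation is one common convention, and the function names are assumptions for this example.

```python
def minmax_params(weights, bits):
    """Min-Max initialization: map [min, max] onto the integer grid."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    zero_point = round(-lo / scale)  # integer offset so lo maps near 0
    return scale, zero_point


def quantize(weights, scale, zero_point, bits):
    """Round to the nearest grid point and clamp to the valid code range."""
    levels = 2 ** bits - 1
    return [min(levels, max(0, round(w / scale) + zero_point))
            for w in weights]


def dequantize(codes, scale, zero_point):
    """Map integer codes back to approximate real values."""
    return [(q - zero_point) * scale for q in codes]


weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
scale, zp = minmax_params(weights, bits=2)
codes = quantize(weights, scale, zp, bits=2)
restored = dequantize(codes, scale, zp)
print(codes, [round(r, 3) for r in restored])
```

Because the scale and zero point are fixed by the two extreme values alone, a single outlier stretches the grid for every other weight, which is one intuition for why a better initialization of these parameters could help at low bit widths.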


Article 1127

Title@2025-05-27 (2): Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases

Title: Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases

Zwischen Circuits und Chomsky: Pre-Pretraining auf Formal Languages Imparts Linguistic Biases

巡回巡回和乔姆斯基之间:正式语言语言语言预科培训 2502.19249v2

Authors: Michael Y. Hu, Jackson Petty, Chuan Shi, William Merrill, Tal Linzen

Pretraining language models on formal language can improve their acquisition of natural language. Which features of the formal language impart an inductive bias that leads to effective transfer? Drawing on insights from linguistics and complexity theory, we hypothesize that effective transfer occurs when two conditions are met: the formal language should capture the dependency structures present in natural language, and it should remain within the computational limitations of the model architecture. We experiment with pre-pretraining (training on formal language before natural languages) on transformers and find that formal languages capturing hierarchical dependencies indeed enable language models to achieve lower loss on natural language and better linguistic generalization compared to other formal languages. We also find modest support for the hypothesis that the formal language should fall within the computational limitations of the architecture. Strikingly, pre-pretraining reduces loss more efficiently than training on a matched amount of natural language. For a 1B-parameter language model trained on roughly 1.6B tokens of natural language, pre-pretraining achieves the same loss and better linguistic generalization with a 33% smaller token budget. Finally, we also give mechanistic evidence of transfer from formal to natural language: attention heads acquired during pre-pretraining remain crucial for the model’s performance on syntactic evaluations.


Article 1128

Title@2025-05-27 (2): vCache: Verified Semantic Prompt Caching

Title: vCache: Verified Semantic Prompt Caching

vCache: Verifizierter semantischer Prompt-Caching

vCache: 校验语义快速缓冲 2502.03771v3

Authors: Luis Gaspar Schroeder, Aditya Desai, Alejandro Cuadron, Kyle Chu, Shu Liu, Mark Zhao, Stephan Krusche, Alfons Kemper, Matei Zaharia, Joseph E. Gonzalez

Semantic caches return cached LLM-generated responses for semantically similar prompts to reduce inference latency and cost. They embed cached prompts and store them alongside their response in a vector database. Embedding similarity metrics assign a numerical score to quantify the similarity between a request and its nearest neighbor prompt from the cache. Existing systems use the same static similarity threshold across all requests to determine whether two prompts can share similar responses. However, we observe that static thresholds do not give formal correctness guarantees, can result in unexpected error rates, and lead to suboptimal cache hit rates. This paper proposes vCache, the first verified semantic cache with user-defined error rate guarantees. It employs an online learning algorithm to estimate an optimal threshold for each cached prompt, enabling reliable cache responses without additional training. Our experiments show that vCache consistently meets the specified error bounds while outperforming state-of-the-art static-threshold and fine-tuned embedding baselines. We release the vCache implementation and benchmarks to support future research.
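The static-threshold design that the abstract argues against can be sketched in a few lines. This is the baseline behaviour only, not vCache's per-prompt online threshold estimation; the class name, the linear scan in place of a vector database, and the toy two-dimensional embeddings are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class StaticThresholdCache:
    """Semantic cache that applies one global similarity threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.entries = []  # list of (prompt embedding, cached response)

    def add(self, embedding, response):
        self.entries.append((embedding, response))

    def lookup(self, embedding):
        """Return the nearest neighbour's response on a hit, else None."""
        best_score, best_response = -1.0, None
        for cached, response in self.entries:
            score = cosine(embedding, cached)
            if score > best_score:
                best_score, best_response = score, response
        if best_response is not None and best_score >= self.threshold:
            return best_response
        return None  # miss: the caller would query the LLM and add()


cache = StaticThresholdCache(threshold=0.9)
cache.add([1.0, 0.0], "cached answer A")
print(cache.lookup([0.99, 0.05]))  # similar prompt: hit
print(cache.lookup([0.0, 1.0]))    # dissimilar prompt: miss
```

A single global threshold treats every cached prompt alike, so prompts whose neighbours need different answers at high similarity produce errors that this design cannot bound.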


Article 1129

Title@2025-05-27 (2): Multi-instance Learning as Downstream Task of Self-Supervised Learning-based Pre-trained Model

Title: Multi-instance Learning as Downstream Task of Self-Supervised Learning-based Pre-trained Model

Multi-Instance-Lernen als Downstream-Aufgabe des selbstüberwachten Learning-basierten vortrainierten Modells

将多机构学习作为自监督学习模式培训前模式的下游任务 2505.21564v1

Authors: Koki Matsuishi, Tsuyoshi Okita

In deep multi-instance learning, the number of applicable instances depends on the data set. In histopathology images, deep learning multi-instance learners usually assume there are hundreds to thousands of instances in a bag. However, when the number of instances in a bag increases to 256 in brain hematoma CT, learning becomes extremely difficult. In this paper, we address this drawback. To overcome this problem, we propose using a pre-trained model with self-supervised learning for the multi-instance learner as a downstream task. With this method, even when the original target task suffers from the spurious correlation problem, we show improvements of 5% to 13% in accuracy and 40% to 55% in the F1 measure for the hypodensity marker classification of brain hematoma CT.


Article 1130

Title@2025-05-27 (2): Sparsified State-Space Models are Efficient Highway Networks

Title: Sparsified State-Space Models are Efficient Highway Networks

Sparsifizierte State-Space-Modelle sind effiziente Highway-Netzwerke

国家空间模型是高效公路网 2505.20698v1

Authors: Woomin Song, Jihoon Tack, Sangwoo Mo, Seunghyuk Oh, Jinwoo Shin

State-space models (SSMs) offer a promising architecture for sequence modeling, providing an alternative to Transformers by replacing expensive self-attention with linear recurrences. In this paper, we propose a simple yet effective trick to enhance SSMs within given computational budgets by sparsifying them. Our intuition is that tokens in SSMs are highly redundant due to gradual recurrent updates, and dense recurrence operations block the delivery of past information. In particular, we observe that upper layers of SSMs tend to be more redundant as they encode global information, while lower layers encode local information. Motivated by this, we introduce Simba, a hierarchical sparsification method for SSMs based on token pruning. Simba sparsifies upper layers more than lower layers, encouraging the upper layers to behave like highways. To achieve this, we propose a novel token pruning criterion for SSMs, measuring the global impact of tokens on the final output by accumulating local recurrences. We demonstrate that Simba outperforms the baseline model, Mamba, with the same FLOPS in various natural language tasks. Moreover, we illustrate the effect of highways, showing that Simba not only enhances efficiency but also improves the information flow across long sequences. Code is available at https://github.com/woominsong/Simba.


Article 1131

Title@2025-05-27 (2): Token-level Accept or Reject: A Micro Alignment Approach for Large Language Models

Title: Token-level Accept or Reject: A Micro Alignment Approach for Large Language Models

Token-Level Akzeptieren oder ablehnen: Ein Micro Alignment-Ansatz für große Sprachmodelle

接受或拒绝时肯级别:大语言模式微调整方法 2505.19743v2

Authors: Yang Zhang, Yu Yu, Bo Tang, Yu Zhu, Chuxiong Sun, Wenqiang Wei, Jie Hu, Zipeng Xie, Zhiyu Li, Feiyu Xiong, Edward Chung

With the rapid development of Large Language Models (LLMs), aligning these models with human preferences and values is critical to ensuring ethical and safe applications. However, existing alignment techniques such as RLHF or DPO often require direct fine-tuning on LLMs with billions of parameters, resulting in substantial computational costs and inefficiencies. To address this, we propose the Micro token-level Accept-Reject Aligning (MARA) approach, designed to operate independently of the language models. MARA simplifies the alignment process by decomposing sentence-level preference learning into token-level binary classification, where a compact three-layer fully-connected network determines whether candidate tokens are “Accepted” or “Rejected” as part of the response. Extensive experiments across seven different LLMs and three open-source datasets show that MARA achieves significant improvements in alignment performance while reducing computational costs. The source code and implementation details are publicly available at https://github.com/IAAR-Shanghai/MARA, and the trained models are released at https://huggingface.co/IAAR-Shanghai/MARA_AGENTS.
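One way to picture token-level accept/reject decoding is the loop below, in which an external classifier vets candidate tokens proposed by a frozen language model. The abstract does not specify the procedure, so the candidate ordering, the fallback rule, the threshold, and the `accept_prob` callable (standing in for the compact classifier) are all assumptions rather than MARA's actual algorithm.

```python
def select_token(candidates, accept_prob, threshold=0.5):
    """Pick the most likely candidate that the classifier accepts.

    candidates: list of (token, lm_probability) from the frozen LLM.
    accept_prob: hypothetical callable mapping a token to the
        classifier's probability that the token should be accepted.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    for token, _ in ranked:
        if accept_prob(token) >= threshold:
            return token
    return ranked[0][0]  # assumed fallback: keep the LLM's top choice


# Toy stand-in for the classifier (assumption).
def toy_accept_prob(token):
    return 0.1 if token == "bad" else 0.9


candidates = [("bad", 0.6), ("good", 0.3), ("ok", 0.1)]
print(select_token(candidates, toy_accept_prob))
```

Because the classifier only ranks or filters tokens the language model already proposes, the language model's weights stay untouched, which matches the abstract's point about operating independently of the model.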


Article 1132

Title@2025-05-27 (2): Generating Hypotheses of Dynamic Causal Graphs in Neuroscience: Leveraging Generative Factor Models of Observed Time Series

Title: Generating Hypotheses of Dynamic Causal Graphs in Neuroscience: Leveraging Generative Factor Models of Observed Time Series

Generieren von Hypothesen dynamischer Kausalgraphen in der Neurowissenschaft: Nutzung generativer Faktorenmodelle beobachteter Zeitreihen

在神经科学中生成动态因果图的假设:利用观测时间序列的生成因数模型 2505.20697v1

Authors: Zachary C. Brown, David Carlson

The field of hypothesis generation promises to reduce costs in neuroscience by narrowing the range of interventional studies needed to study various phenomena. Existing machine learning methods can generate scientific hypotheses from complex datasets, but many approaches assume causal relationships are static over time, limiting their applicability to systems with dynamic, state-dependent behavior, such as the brain. While some techniques attempt dynamic causal discovery through factor models, they often restrict relationships to linear patterns or impose other simplifying assumptions. We propose a novel method that models dynamic graphs as a conditionally weighted superposition of static graphs, where each static graph can capture nonlinear relationships. This approach enables the detection of complex, time-varying interactions between variables beyond linear limitations. Our method improves F1 scores of predicted dynamic causal patterns by roughly 22-28% on average over baselines in some of our experiments, with some improvements reaching well over 60%. A case study on real brain data demonstrates our method’s ability to uncover relationships linked to specific behavioral states, offering valuable insights into neural dynamics.
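The "conditionally weighted superposition of static graphs" can be pictured with a small sketch. It covers only the mixing step and assumes softmax weights computed from hypothetical state-dependent scores; how the paper conditions the weights, and how each static graph captures nonlinear relationships, is not specified in the abstract.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_graph(static_graphs, state_scores):
    """Mix K static adjacency matrices with state-dependent weights.

    `state_scores` is a hypothetical per-time-step score for each static
    graph; the mixing weights are its softmax, so they sum to one.
    """
    weights = softmax(state_scores)
    n = len(static_graphs[0])
    return [[sum(w * g[i][j] for w, g in zip(weights, static_graphs))
             for j in range(n)] for i in range(n)]
```

Recomputing the scores at every time step yields a graph that changes over time while each component graph stays fixed.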


Article 1133

Title@2025-05-27 (2): Navigate the Unknown: Enhancing LLM Reasoning with Intrinsic Motivation Guided Exploration

Title: Navigate the Unknown: Enhancing LLM Reasoning with Intrinsic Motivation Guided Exploration

Navigieren Sie das Unbekannte: Verbesserung der LLM-Vernunft mit intrinsischer Motivation geführte Exploration

导航未知:利用内在动力性引导探索加强LLM 2505.17621v2

Authors: Jingtong Gao, Ling Pan, Yejing Wang, Rui Zhong, Chi Lu, Qingpeng Cai, Peng Jiang, Xiangyu Zhao

Reinforcement learning (RL) has emerged as a pivotal method for improving the reasoning capabilities of Large Language Models (LLMs). However, prevalent RL approaches such as Proximal Policy Optimization (PPO) and Group Relative Policy Optimization (GRPO) face critical limitations due to their reliance on sparse outcome-based rewards and inadequate mechanisms for incentivizing exploration. These limitations result in inefficient guidance for multi-step reasoning processes. Specifically, sparse reward signals fail to deliver effective or sufficient feedback, particularly for challenging problems. Furthermore, such reward structures induce systematic biases that prioritize exploitation of familiar trajectories over novel solution discovery. These shortcomings critically hinder performance in complex reasoning tasks, which inherently demand iterative refinement across intermediate steps. To address these challenges, we propose an Intrinsic Motivation guidEd exploratioN meThOd foR LLM Reasoning (i-MENTOR), a novel method designed to both deliver dense rewards and amplify explorations in the RL-based training paradigm. i-MENTOR introduces three key innovations: trajectory-aware exploration rewards that mitigate bias in token-level strategies while maintaining computational efficiency; dynamic reward scaling to stabilize exploration and exploitation in large action spaces; and advantage-preserving reward implementation that maintains advantage distribution integrity while incorporating exploratory guidance. Experiments across three public datasets demonstrate i-MENTOR’s effectiveness with a 22.39% improvement on the difficult dataset Countdown-4.


Article 1134

Title@2025-05-27 (2): Temporal Saliency-Guided Distillation: A Scalable Framework for Distilling Video Datasets

Title: Temporal Saliency-Guided Distillation: A Scalable Framework for Distilling Video Datasets

Temporale Saliency-geführte Destillation: Ein skalierbares Framework für die Destillierung von Videodatensätzen

时间性盐度-指导蒸馏:用于蒸馏视频数据集的可缩放框架 2505.20694v1

Authors: Xulin Gu, Xinhao Zhong, Zhixing Wei, Yimin Zhou, Shuoyang Sun, Bin Chen, Hongpeng Wang, Yuan Luo

Dataset distillation (DD) has emerged as a powerful paradigm for dataset compression, enabling the synthesis of compact surrogate datasets that approximate the training utility of large-scale ones. While significant progress has been achieved in distilling image datasets, extending DD to the video domain remains challenging due to the high dimensionality and temporal complexity inherent in video data. Existing video distillation (VD) methods often suffer from excessive computational costs and struggle to preserve temporal dynamics, as naive extensions of image-based approaches typically lead to degraded performance. In this paper, we propose a novel uni-level video dataset distillation framework that directly optimizes synthetic videos with respect to a pre-trained model. To address temporal redundancy and enhance motion preservation, we introduce a temporal saliency-guided filtering mechanism that leverages inter-frame differences to guide the distillation process, encouraging the retention of informative temporal cues while suppressing frame-level redundancy. Extensive experiments on standard video benchmarks demonstrate that our method achieves state-of-the-art performance, bridging the gap between real and distilled video data and offering a scalable solution for video dataset compression.
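As a rough illustration of how inter-frame differences could act as a saliency signal, the sketch below scores each frame by its mean absolute change from the previous frame and keeps the most-changed ones. The scoring rule, the hard top-k selection, and the flat-list frame representation are assumptions; the abstract does not describe the paper's actual filtering mechanism in this detail.

```python
def frame_saliency(frames):
    """Mean absolute difference between each frame and its predecessor.

    Each frame is a flat list of pixel values of equal length.
    """
    scores = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(cur, prev)) / len(cur)
        scores.append(diff)
    return scores

def keep_salient(frames, k):
    """Indices of the first frame plus the k-1 most-changed frames, in time order."""
    scores = frame_saliency(frames)
    ranked = sorted(range(1, len(frames)), key=lambda i: scores[i], reverse=True)
    return sorted([0] + ranked[:max(0, k - 1)])
```

Frames that barely differ from their predecessor receive a score near zero, which is one simple way to express frame-level redundancy.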


Article 1135

Title@2025-05-27 (2): Phir Hera Fairy: An English Fairytaler is a Strong Faker of Fluent Speech in Low-Resource Indian Languages

Title: Phir Hera Fairy: An English Fairytaler is a Strong Faker of Fluent Speech in Low-Resource Indian Languages

Phir Hera Fairy: Ein englisches Märchen ist ein starker Faker der fließenden Rede in Low-Resource indischen Sprachen

Phir Hera Fairy:英国仙女是印度低资源语言流利流利的有力名人 2505.20693v1

Authors: Praveen Srinivasa Varadhan, Srija Anand, Soma Siddhartha, Mitesh M. Khapra

What happens when an English Fairytaler is fine-tuned on Indian languages? We evaluate how the English F5-TTS model adapts to 11 Indian languages, measuring polyglot fluency, voice-cloning, style-cloning, and code-mixing. We compare: (i) training from scratch, (ii) fine-tuning English F5 on Indian data, and (iii) fine-tuning on both Indian and English data to prevent forgetting. Fine-tuning with only Indian data proves most effective, and the resultant IN-F5 is a near-human polyglot that enables speakers of one language (e.g., Odia) to fluently speak in another (e.g., Hindi). Our results show English pretraining aids low-resource TTS in reaching human parity. To aid progress in other low-resource languages, we study data-constrained setups and arrive at a compute-optimal strategy. Finally, we show IN-F5 can synthesize unseen languages like Bhojpuri and Tulu using a human-in-the-loop approach for zero-resource TTS via synthetic data generation.


Article 1136

Title@2025-05-27 (2): Evidential Deep Active Learning for Semi-Supervised Classification

Title: Evidential Deep Active Learning for Semi-Supervised Classification

Evidentielles tiefes aktives Lernen für semi-überwachte Klassifikation

半监督分类的证明深层积极学习 2505.20691v1

Authors: Shenkai Zhao, Xinao Zhang, Lipeng Pan, Xiaobin Xu, Danilo Pelusi

Semi-supervised classification based on active learning has made significant progress, but the existing methods often ignore the uncertainty estimation (or reliability) of the prediction results during the learning process, which makes it questionable whether the selected samples can effectively update the model. Hence, this paper proposes an evidential deep active learning approach for semi-supervised classification (EDALSSC). EDALSSC builds a semi-supervised learning framework to simultaneously quantify the uncertainty estimation of labeled and unlabeled data during the learning process. The uncertainty estimation of the former is associated with evidential deep learning, while that of the latter is modeled by combining ignorance information and conflict information of the evidence from the perspective of the T-conorm operator. Furthermore, this article constructs a heuristic method to dynamically balance the influence of evidence and the number of classes on uncertainty estimation to ensure that it does not produce counter-intuitive results in EDALSSC. For the sample selection strategy, EDALSSC selects the sample with the greatest uncertainty estimation that is calculated in the form of a sum when the training loss increases in the latter half of the learning process. Experimental results demonstrate that EDALSSC outperforms existing semi-supervised and supervised active learning approaches on image classification datasets.


Article 1137

Title@2025-05-27 (2): Accelerating RL for LLM Reasoning with Optimal Advantage Regression

Title: Accelerating RL for LLM Reasoning with Optimal Advantage Regression

Beschleunigung der RL für LLM-Vernunft mit optimaler Regression

以最优优势回归加速 LLL 来计算LLM 加速RL 原因 2505.20686v1

Authors: Kianté Brantley, Mingyu Chen, Zhaolin Gao, Jason D. Lee, Wen Sun, Wenhao Zhan, Xuezhou Zhang

Reinforcement learning (RL) has emerged as a powerful tool for fine-tuning large language models (LLMs) to improve complex reasoning abilities. However, state-of-the-art policy optimization methods often suffer from high computational overhead and memory consumption, primarily due to the need for multiple generations per prompt and the reliance on critic networks or advantage estimates of the current policy. In this paper, we propose $A^*$-PO, a novel two-stage policy optimization framework that directly approximates the optimal advantage function and enables efficient training of LLMs for reasoning tasks. In the first stage, we leverage offline sampling from a reference policy to estimate the optimal value function $V^*$, eliminating the need for costly online value estimation. In the second stage, we perform on-policy updates using a simple least-squares regression loss with only a single generation per prompt. Theoretically, we establish performance guarantees and prove that the KL-regularized RL objective can be optimized without requiring complex exploration strategies. Empirically, $A^*$-PO achieves competitive performance across a wide range of mathematical reasoning benchmarks, while reducing training time by up to 2$\times$ and peak memory usage by over 30% compared to PPO, GRPO, and REBEL. Implementation of $A^*$-PO can be found at https://github.com/ZhaolinGao/A-PO.
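The two stages can be sketched under one explicit assumption: the standard closed form for KL-regularized RL, in which the optimal policy satisfies beta * log(pi / pi_ref) = r - V and the optimal value is V = beta * log E_ref[exp(r / beta)]. The abstract does not state these formulas, so the function names, the coefficient `beta`, and the exact loss below are illustrative rather than a reproduction of the paper's objective.

```python
import math

def estimate_v_star(ref_rewards, beta):
    """Stage 1 (assumed form): soft value beta * log mean exp(r / beta).

    `ref_rewards` are rewards of responses sampled offline from the
    reference policy for a single prompt.
    """
    scaled = [r / beta for r in ref_rewards]
    m = max(scaled)  # subtract the max for numerical stability
    return beta * (m + math.log(sum(math.exp(s - m) for s in scaled) / len(scaled)))

def advantage_regression_loss(logp, logp_ref, reward, v_star, beta):
    """Stage 2 (assumed form): squared error between the scaled log-ratio
    and the estimated advantage r - V, for one sampled response."""
    return (beta * (logp - logp_ref) - (reward - v_star)) ** 2
```

Because the value estimate is computed once from offline samples, a loss of this shape needs only one on-policy generation per prompt and no critic network, which matches the efficiency argument in the abstract.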


Article 1138

Title@2025-05-27 (2): A Survey of LLM $\times$ DATA

Title: A Survey of LLM $\times$ DATA

Eine Umfrage über LLM $\times$ DATEN

对LLLM 美元-美元-美元-美元-数据数据的调查 2505.18458v2

Authors: Xuanhe Zhou, Junxuan He, Wei Zhou, Haodong Chen, Zirui Tang, Haoyu Zhao, Xin Tong, Guoliang Li, Youmin Chen, Jun Zhou, Zhaojun Sun, Binyuan Hui, Shuo Wang, Conghui He, Zhiyuan Liu, Jingren Zhou, Fan Wu

The integration of large language models (LLMs) and data management (DATA) is rapidly redefining both domains. In this survey, we comprehensively review the bidirectional relationships between them. On the one hand, DATA4LLM, spanning large-scale data processing, storage, and serving, supplies LLMs with the high-quality, diverse, and timely data required for stages like pre-training, post-training, retrieval-augmented generation, and agentic workflows: (i) Data processing for LLMs includes scalable acquisition, deduplication, filtering, selection, domain mixing, and synthetic augmentation; (ii) Data storage for LLMs focuses on efficient data and model formats, distributed and heterogeneous storage hierarchies, KV-cache management, and fault-tolerant checkpointing; (iii) Data serving for LLMs tackles challenges in RAG (e.g., knowledge post-processing), LLM inference (e.g., prompt compression, data provenance), and training strategies (e.g., data packing and shuffling). On the other hand, in LLM4DATA, LLMs are emerging as general-purpose engines for data management. We review recent advances in (i) data manipulation, including automatic data cleaning, integration, and discovery; (ii) data analysis, covering reasoning over structured, semi-structured, and unstructured data; and (iii) system optimization (e.g., configuration tuning, query rewriting, anomaly diagnosis), powered by LLM techniques like retrieval-augmented prompting, task-specialized fine-tuning, and multi-agent collaboration.


Article 1139

Title@2025-05-27 (2): MODULI: Unlocking Preference Generalization via Diffusion Models for Offline Multi-Objective Reinforcement Learning

Title: MODULI: Unlocking Preference Generalization via Diffusion Models for Offline Multi-Objective Reinforcement Learning

MODULI: Unlocking Preference Generalization via Diffusion Models for Offline Multi-Objective Reinforcement Learning

MODULI:通过离线多目标强化学习扩散模型解锁普及 2408.15501v2

Authors: Yifu Yuan, Zhenrui Zheng, Zibin Dong, Jianye Hao

Multi-objective Reinforcement Learning (MORL) seeks to develop policies that simultaneously optimize multiple conflicting objectives, but it requires extensive online interactions. Offline MORL provides a promising solution by training on pre-collected datasets to generalize to any preference upon deployment. However, real-world offline datasets are often conservatively and narrowly distributed, failing to comprehensively cover preferences, leading to the emergence of out-of-distribution (OOD) preference areas. Existing offline MORL algorithms exhibit poor generalization to OOD preferences, resulting in policies that do not align with preferences. Leveraging the excellent expressive and generalization capabilities of diffusion models, we propose MODULI (Multi-objective Diffusion Planner with Sliding Guidance), which employs a preference-conditioned diffusion model as a planner to generate trajectories that align with various preferences and derive actions for decision-making. To achieve accurate generation, MODULI introduces two return normalization methods under diverse preferences for refining guidance. To further enhance generalization to OOD preferences, MODULI proposes a novel sliding guidance mechanism, which involves training an additional slider adapter to capture the direction of preference changes. Incorporating the slider, it transitions from in-distribution (ID) preferences to generating OOD preferences, patching and extending the incomplete Pareto front. Extensive experiments on the D4MORL benchmark demonstrate that our algorithm outperforms state-of-the-art Offline MORL baselines, exhibiting excellent generalization to OOD preferences.


Article 1140

Title@2025-05-27 (2): SELF-PERCEPT: Introspection Improves Large Language Models’ Detection of Multi-Person Mental Manipulation in Conversations

Title: SELF-PERCEPT: Introspection Improves Large Language Models’ Detection of Multi-Person Mental Manipulation in Conversations

SELF-PERCEPT: Introspection verbessert die Erkennung von Multi-Person-Gedankenmanipulation in Gesprächen durch große Sprachmodelle

SELF-PERCEPT: 调查改进大语言模型在对话中探测多人心理操纵 2505.20679v1

Authors: Danush Khanna, Pratinav Seth, Sidhaarth Sredharan Murali, Aditya Kumar Guru, Siddharth Shukla, Tanuj Tyagi, Sandeep Chaurasia, Kripabandhu Ghosh

Mental manipulation is a subtle yet pervasive form of abuse in interpersonal communication, making its detection critical for safeguarding potential victims. However, due to manipulation’s nuanced and context-specific nature, identifying manipulative language in complex, multi-turn, and multi-person conversations remains a significant challenge for large language models (LLMs). To address this gap, we introduce the MultiManip dataset, comprising 220 multi-turn, multi-person dialogues balanced between manipulative and non-manipulative interactions, all drawn from reality shows that mimic real-world scenarios. For manipulative interactions, it includes 11 distinct manipulations depicting real-life scenarios. We conduct extensive evaluations of state-of-the-art LLMs, such as GPT-4o and Llama-3.1-8B, employing various prompting strategies. Despite their capabilities, these models often struggle to detect manipulation effectively. To overcome this limitation, we propose SELF-PERCEPT, a novel, two-stage prompting framework inspired by Self-Perception Theory, demonstrating strong performance in detecting multi-person, multi-turn mental manipulation. Our code and data are publicly available at https://github.com/danushkhanna/self-percept.


Article 1141

Title@2025-05-27 (2): Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System

Title: Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System

Viele Köpfe sind besser als eins: Verbesserte wissenschaftliche Idee-Generation durch ein LLM-basiertes Multi-Agent-System

许多领导人比一个领导人好得多:由以LLM为基础的多种机构系统改进科学思想的一代 2410.09403v4

Authors: Haoyang Su, Renqi Chen, Shixiang Tang, Zhenfei Yin, Xinzhe Zheng, Jinzhe Li, Biqing Qi, Qi Wu, Hui Li, Wanli Ouyang, Philip Torr, Bowen Zhou, Nanqing Dong

Rapid scientific progress requires innovative tools that can accelerate knowledge discovery. Although recent AI methods, particularly large language models (LLMs), have shown promise in tasks such as hypothesis generation and experimental design, they fall short of replicating the collaborative nature of real-world scientific practices, where diverse experts work together in teams to tackle complex problems. To address these limitations, we propose an LLM-based multi-agent system, i.e., Virtual Scientists (VirSci), designed to mimic the teamwork inherent in scientific research. VirSci organizes a team of agents to collaboratively generate, evaluate, and refine research ideas. Through comprehensive experiments, we demonstrate that this multi-agent approach outperforms the state-of-the-art method in producing novel scientific ideas. We further investigate the collaboration mechanisms that contribute to its tendency to produce ideas with higher novelty, offering valuable insights to guide future research and illuminating pathways toward building a robust system for autonomous scientific discovery. The code is available at https://github.com/open-sciencelab/Virtual-Scientists.


Article 1142

Title@2025-05-27 (2): LLM-Guided Reinforcement Learning: Addressing Training Bottlenecks through Policy Modulation

Title: LLM-Guided Reinforcement Learning: Addressing Training Bottlenecks through Policy Modulation

LLM-geführtes Stärkungslernen: Bewältigung von Ausbildungsengpässen durch politische Modulation

LLM-LLM-指导强化学习:通过政策调整解决培训瓶颈问题 2505.20671v1

Authors: Heng Tan, Hua Yan, Yu Yang

While reinforcement learning (RL) has achieved notable success in various domains, training effective policies for complex tasks remains challenging. Agents often converge to local optima and fail to maximize long-term rewards. Existing approaches to mitigate training bottlenecks typically fall into two categories: (i) Automated policy refinement, which identifies critical states from past trajectories to guide policy updates, but suffers from costly and uncertain model training; and (ii) Human-in-the-loop refinement, where human feedback is used to correct agent behavior, but this does not scale well to environments with large or continuous action spaces. In this work, we design a large language model-guided policy modulation framework that leverages LLMs to improve RL training without additional model training or human intervention. We first prompt an LLM to identify critical states from a sub-optimal agent’s trajectories. Based on these states, the LLM then provides action suggestions and assigns implicit rewards to guide policy refinement. Experiments across standard RL benchmarks demonstrate that our method outperforms state-of-the-art baselines, highlighting the effectiveness of LLM-based explanations in addressing RL training bottlenecks.


Article 1143

Title@2025-05-27 (2): From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation

Title: From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation

Vom Sehen zum Tun: Überbrücken von Vernunft und Entscheidung für die Robotermanipulation

从看到做:机器人操纵的搭桥理由和决定 2505.08548v2

Authors: Yifu Yuan, Haiqin Cui, Yibin Chen, Zibin Dong, Fei Ni, Longxin Kou, Jinyi Liu, Pengyi Li, Yan Zheng, Jianye Hao

Achieving generalization in robotic manipulation remains a critical challenge, particularly for unseen scenarios and novel tasks. Current Vision-Language-Action (VLA) models, while building on top of general Vision-Language Models (VLMs), still fall short of achieving robust zero-shot performance due to the scarcity and heterogeneity prevalent in embodied datasets. To address these limitations, we propose FSD (From Seeing to Doing), a novel vision-language model that generates intermediate representations through spatial relationship reasoning, providing fine-grained guidance for robotic manipulation. Our approach combines a hierarchical data pipeline for training with a self-consistency mechanism that aligns spatial coordinates with visual signals. Through extensive experiments, we comprehensively validated FSD’s capabilities in both “seeing” and “doing,” achieving outstanding performance across 8 benchmarks for general spatial reasoning and embodied reference abilities, as well as on our proposed, more challenging benchmark VABench. We also verified zero-shot capabilities in robot manipulation, demonstrating significant performance improvements over baseline methods in both SimplerEnv and real robot settings. Experimental results show that FSD achieves a 40.6% success rate in SimplerEnv and a 72% success rate across 8 real-world tasks, outperforming the strongest baseline by 30%.


Article 1144

Title@2025-05-27 (2): RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts

Title: RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts

RE-Bench: Bewertung der KI-FuE-Fähigkeiten von Sprachmodellagenten gegen menschliche Experten

RE-BENCH: 对照人类专家评估语言模范代理商的AI研究与开发的前沿能力 2411.15114v2

Authors: Hjalmar Wijk, Tao Lin, Joel Becker, Sami Jawhar, Neev Parikh, Thomas Broadley, Lawrence Chan, Michael Chen, Josh Clymer, Jai Dhyani, Elena Ericheva, Katharyn Garcia, Brian Goodrich, Nikola Jurkovic, Holden Karnofsky, Megan Kinniment, Aron Lajko, Seraphina Nix, Lucas Sato, William Saunders, Maksym Taran, Ben West, Elizabeth Barnes

Frontier AI safety policies highlight automation of AI research and development (R&D) by AI agents as an important capability to anticipate. However, there exist few evaluations for AI R&D capabilities, and none that are highly realistic and have a direct comparison to human performance. We introduce RE-Bench (Research Engineering Benchmark, v1), which consists of 7 challenging, open-ended ML research engineering environments and data from 71 8-hour attempts by 61 distinct human experts. We confirm that our experts make progress in the environments given 8 hours, with 82% of expert attempts achieving a non-zero score and 24% matching or exceeding our strong reference solutions. We compare humans to several public frontier models through best-of-k with varying time budgets and agent designs, and find that the best AI agents achieve a score 4x higher than human experts when both are given a total time budget of 2 hours per environment. However, humans currently display better returns to increasing time budgets, narrowly exceeding the top AI agent scores given an 8-hour budget, and achieving 2x the score of the top AI agent when both are given 32 total hours (across different attempts). Qualitatively, we find that modern AI agents possess significant expertise in many ML topics – e.g. an agent wrote a faster custom Triton kernel than any of our human experts’ – and can generate and test solutions over ten times faster than humans, at much lower cost. We open-source the evaluation environments, human expert data, analysis code and agent trajectories to facilitate future research.


Article 1145

Title@2025-05-27 (2): Predicting and Understanding College Student Mental Health with Interpretable Machine Learning

Title: Predicting and Understanding College Student Mental Health with Interpretable Machine Learning

Vorhersagen und Verständnis College Student Mental Health mit Interpretable Machine Learning

预测和理解学院学生心理健康与可解释机器学习 2503.08002v2

Authors: Meghna Roy Chowdhury, Wei Xuan, Shreyas Sen, Yixue Zhao, Yi Ding

Mental health issues among college students have reached critical levels, significantly impacting academic performance and overall wellbeing. Predicting and understanding mental health status among college students is challenging due to three main factors: the necessity for large-scale longitudinal datasets, the prevalence of black-box machine learning models lacking transparency, and the tendency of existing approaches to provide aggregated insights at the population level rather than individualized understanding. To tackle these challenges, this paper presents I-HOPE, the first Interpretable Hierarchical mOdel for Personalized mEntal health prediction. I-HOPE is a two-stage hierarchical model that connects raw behavioral features to mental health status through five defined behavioral categories as interaction labels. We evaluate I-HOPE on the College Experience Study, the longest longitudinal mobile sensing dataset. This dataset spans five years and captures data from both pre-pandemic periods and the COVID-19 pandemic. I-HOPE achieves a prediction accuracy of 91%, significantly surpassing the 60-70% accuracy of baseline methods. In addition, I-HOPE distills complex patterns into interpretable and individualized insights, enabling the future development of tailored interventions and improving mental health support. The code is available at https://github.com/roycmeghna/I-HOPE.


Article 1146

Title@2025-05-27 (2): Continuous-Time Attention: PDE-Guided Mechanisms for Long-Sequence Transformers

Title: Continuous-Time Attention: PDE-Guided Mechanisms for Long-Sequence Transformers

Continuous-Time-Achtung: PDE-geführte Mechanismen für lange Sequenztransformatoren

持续关注:长序列变换者PDE-指导机制 2505.20666v1

Authors: Yukun Zhang, Xueqing Zhou

We propose a novel framework, Continuous-Time Attention, which infuses partial differential equations (PDEs) into the Transformer’s attention mechanism to address the challenges of extremely long input sequences. Instead of relying solely on a static attention matrix, we allow attention weights to evolve over a pseudo-time dimension via diffusion, wave, or reaction-diffusion dynamics. This mechanism systematically smooths local noise, enhances long-range dependencies, and stabilizes gradient flow. Theoretically, our analysis shows that PDE-based attention leads to better optimization landscapes and polynomial rather than exponential decay of distant interactions. Empirically, we benchmark our method on diverse experiments, demonstrating consistent gains over both standard and specialized long sequence Transformer variants. Our findings highlight the potential of PDE-based formulations to enrich attention mechanisms with continuous-time dynamics and global coherence.
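To make the diffusion case concrete, the sketch below evolves a single row of attention weights with explicit-Euler steps of the 1D heat equation. The discretization, the reflecting boundary treatment, and the parameters `steps` and `dt` are assumptions chosen for illustration; the abstract does not say how the authors discretize or parameterize their PDEs.

```python
def diffuse_attention(row, steps=10, dt=0.2):
    """Smooth one attention row with explicit-Euler heat-equation steps.

    Reflecting ends (the missing neighbour is replaced by the value itself)
    keep the row sum unchanged; dt <= 0.5 keeps the scheme stable and the
    weights non-negative.
    """
    a = list(row)
    n = len(a)
    for _ in range(steps):
        nxt = []
        for i in range(n):
            left = a[i - 1] if i > 0 else a[i]
            right = a[i + 1] if i < n - 1 else a[i]
            nxt.append(a[i] + dt * (left - 2 * a[i] + right))
        a = nxt
    return a
```

Starting from a sharply peaked row, repeated steps spread weight to neighbouring positions, which gives an intuition for the smoothing of local noise mentioned in the abstract.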


Article 1147

Title@2025-05-27 (2): Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond

Title: Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond

Auf dem Weg zu LLM Unlearning Resilient to Relearning Attacks: Eine scharfsinnige Minimierungsperspektive und darüber hinaus

走向LLM 学会学会学会学会重新学习攻击的不学习能力:锐化-尽量减少知识的视角及展望 2502.05374v4

Authors: Chongyu Fan, Jinghan Jia, Yihua Zhang, Anil Ramakrishna, Mingyi Hong, Sijia Liu

The LLM unlearning technique has recently been introduced to comply with data regulations and address the safety and ethical concerns of LLMs by removing the undesired data-model influence. However, state-of-the-art unlearning methods face a critical vulnerability: they are susceptible to “relearning” the removed information from a small number of forget data points, known as relearning attacks. In this paper, we systematically investigate how to make unlearned models robust against such attacks. For the first time, we establish a connection between robust unlearning and sharpness-aware minimization (SAM) through a unified robust optimization framework, in an analogy to adversarial training designed to defend against adversarial attacks. Our analysis for SAM reveals that smoothness optimization plays a pivotal role in mitigating relearning attacks. Thus, we further explore diverse smoothing strategies to enhance unlearning robustness. Extensive experiments on benchmark datasets, including WMDP and MUSE, demonstrate that SAM and other smoothness optimization approaches consistently improve the resistance of LLM unlearning to relearning attacks. Notably, smoothness-enhanced unlearning also helps defend against (input-level) jailbreaking attacks, broadening our proposal’s impact in robustifying LLM unlearning. Codes are available at https://github.com/OPTML-Group/Unlearn-Smooth.
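For readers unfamiliar with SAM, a single update can be sketched as follows. This is the generic two-gradient SAM step on a plain list of weights, not the paper's unlearning objective; the gradient function `grad_fn`, the learning rate `lr`, and the perturbation radius `rho` are placeholders.

```python
import math

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware minimization update on a list of weights.

    First move a distance rho along the normalized gradient to a nearby
    high-loss point, then apply the gradient measured there to the
    original weights.
    """
    g = grad_fn(w)
    norm = math.sqrt(sum(x * x for x in g)) or 1.0  # avoid division by zero
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_adv = grad_fn(w_adv)  # gradient at the perturbed weights
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]
```

Because the descent direction comes from the perturbed point, the update favours flat regions of the loss, which is the smoothness property the abstract connects to resistance against relearning.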


Article 1148

Title@2025-05-27 (2): BLAST: Balanced Sampling Time Series Corpus for Universal Forecasting Models

Title: BLAST: Balanced Sampling Time Series Corpus for Universal Forecasting Models

BLAST: Ausgewogene Zeitreihen für universelle Vorhersagemodelle

BLAST: 通用预测模型平衡抽样时间序列 2505.17871v2

Authors: Zezhi Shao, Yujie Li, Fei Wang, Chengqing Yu, Yisong Fu, Tangwen Qian, Bin Xu, Boyu Diao, Yongjun Xu, Xueqi Cheng

The advent of universal time series forecasting models has revolutionized zero-shot forecasting across diverse domains, yet the critical role of data diversity in training these models remains underexplored. Existing large-scale time series datasets often suffer from inherent biases and imbalanced distributions, leading to suboptimal model performance and generalization. To address this gap, we introduce BLAST, a novel pre-training corpus designed to enhance data diversity through a balanced sampling strategy. First, BLAST incorporates 321 billion observations from publicly available datasets and employs a comprehensive suite of statistical metrics to characterize time series patterns. Then, to facilitate pattern-oriented sampling, the data is implicitly clustered using grid-based partitioning. Furthermore, by integrating grid sampling and grid mixup techniques, BLAST ensures a balanced and representative coverage of diverse patterns. Experimental results demonstrate that models pre-trained on BLAST achieve state-of-the-art performance with a fraction of the computational resources and training tokens required by existing methods. Our findings highlight the pivotal role of data diversity in improving both training efficiency and model performance for the universal forecasting task.
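The grid-based balanced sampling step might look roughly like the sketch below. It assumes each series has already been summarized as a feature vector of statistical metrics, uses a hypothetical fixed `cell_size`, and samples cells uniformly; the paper's actual metrics, grid construction, and grid mixup are not reproduced here.

```python
import random
from collections import defaultdict

def build_grid(features, cell_size):
    """Group series indices by the grid cell their feature vector falls in."""
    grid = defaultdict(list)
    for idx, feat in enumerate(features):
        cell = tuple(int(v // cell_size) for v in feat)
        grid[cell].append(idx)
    return grid

def balanced_sample(grid, n, rng):
    """Draw n series indices: pick a cell uniformly, then a series inside it."""
    cells = sorted(grid)  # sorted for reproducibility with a seeded rng
    return [rng.choice(grid[rng.choice(cells)]) for _ in range(n)]
```

Sampling cells before series means that a rare pattern occupying its own cell is drawn about as often as a common one, rather than in proportion to its share of the raw data.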


Article 1149

Title@2025-05-27 (2): Generalized and Personalized Federated Learning with Foundation Models via Orthogonal Transformations

Title: Generalized and Personalized Federated Learning with Foundation Models via Orthogonal Transformations

Generalisiertes und personalisiertes Federated Learning mit Gründungsmodellen über Orthogonale Transformationen

通过矫形转变形成基础模型的通用和个性化联邦学习 2505.19888v2

Authors: Eun Gyung Kong, Je Won Yeom, Yonghoon Jeon, Taesup Kim

Federated Learning (FL) aims to train models across decentralized clients or devices holding local data without the need for centralized data collection, thus enhancing data privacy and security. However, achieving both generalization and personalization in heterogeneous settings remains a significant challenge. To address this, we introduce FedOT, a novel approach that leverages black-box foundation models. FedOT shares only a global task-dependent classifier across clients while locally adapting features through orthogonal transformations. By enforcing orthogonality, FedOT mitigates gradient conflicts across diverse clients, preserves semantic integrity, and achieves robust performance even in the presence of substantial data heterogeneity. The strategy of combining global and local parameters enables a more balanced approach for both generalization and personalization, outperforming baseline FL methods across multiple benchmarks. Furthermore, our extensive analysis confirms that joint optimization of global classifiers and local orthogonal transformations yields superior performance and suggests broader applicability.
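One property that may explain why orthogonal transformations are attractive for adapting features is that they preserve inner products, and therefore norms and angles. The sketch below checks this for a 2D rotation, the simplest orthogonal matrix. It is a generic linear-algebra illustration, not FedOT's parameterization, which the abstract does not describe.

```python
import math

def rotation(theta):
    """2x2 rotation matrix, a simple example of an orthogonal transformation."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def apply(Q, x):
    """Matrix-vector product Q @ x for list-based matrices."""
    return [sum(q * xi for q, xi in zip(row, x)) for row in Q]

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))
```

For any two feature vectors `x` and `y`, `dot(apply(Q, x), apply(Q, y))` equals `dot(x, y)` up to floating-point error, so the relative geometry of the features is unchanged by the local transformation.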

Article 1150

Title@2025-05-27 (2): ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning

Title: ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning

ReMA: Meta-Denken lernen für LLMs mit Multi-Agenten-Verstärkungs-Lernen

ReMA:学习多机构强化学习的LLMLM的元思维 2503.09501v3

Authors: Ziyu Wan, Yunxiang Li, Xiaoyu Wen, Yan Song, Hanjing Wang, Linyi Yang, Mark Schmidt, Jun Wang, Weinan Zhang, Shuyue Hu, Ying Wen

Recent research on Reasoning of Large Language Models (LLMs) has sought to further enhance their performance by integrating meta-thinking – enabling models to monitor, evaluate, and control their reasoning processes for more adaptive and effective problem-solving. However, current single-agent work lacks a specialized design for acquiring meta-thinking, resulting in low efficacy. To address this challenge, we introduce Reinforced Meta-thinking Agents (ReMA), a novel framework that leverages Multi-Agent Reinforcement Learning (MARL) to elicit meta-thinking behaviors, encouraging LLMs to think about thinking. ReMA decouples the reasoning process into two hierarchical agents: a high-level meta-thinking agent responsible for generating strategic oversight and plans, and a low-level reasoning agent for detailed executions. Through iterative reinforcement learning with aligned objectives, these agents explore and learn collaboration, leading to improved generalization and robustness. Empirical results from single-turn experiments demonstrate that ReMA outperforms single-agent RL baselines on complex reasoning tasks, including competitive-level mathematical benchmarks and LLM-as-a-Judge benchmarks. Additionally, we further extend ReMA to multi-turn interaction settings, leveraging turn-level ratio and parameter sharing to improve efficiency. Comprehensive ablation studies further illustrate the evolving dynamics of each distinct agent, providing valuable insights into how the meta-thinking reasoning process enhances the reasoning capabilities of LLMs. Our code can be found in https://github.com/ziyuwan/ReMA-public

Article 1151

Title@2025-05-27 (2): How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines

Title: How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines

Wie können neurale Netzwerke mit Skalierungsgesetzen ausgebaut werden? Eine Umfrage und praktische Leitlinien

如何提升具有扩展法的神经网络? 2502.12051v3

Authors: Ayan Sengupta, Yash Goel, Tanmoy Chakraborty

Neural scaling laws have revolutionized the design and optimization of large-scale AI models by revealing predictable relationships between model size, dataset volume, and computational resources. Early research established power-law relationships in model performance, leading to compute-optimal scaling strategies. However, recent studies highlighted their limitations across architectures, modalities, and deployment contexts. Sparse models, mixture-of-experts, retrieval-augmented learning, and multimodal models often deviate from traditional scaling patterns. Moreover, scaling behaviors vary across domains such as vision, reinforcement learning, and fine-tuning, underscoring the need for more nuanced approaches. In this survey, we synthesize insights from over 50 studies, examining the theoretical foundations, empirical findings, and practical implications of scaling laws. We also explore key challenges, including data efficiency, inference scaling, and architecture-specific constraints, advocating for adaptive scaling strategies tailored to real-world applications. We suggest that while scaling laws provide a useful guide, they do not always generalize across all architectures and training strategies.
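
The power-law relationships mentioned above can be made concrete with a small fitting sketch. It assumes the simplest form, loss = a * N^(-b), with no irreducible-loss term, and uses synthetic data; the function name `fit_power_law` and all numbers are illustrative and are not taken from the survey.

```python
import math


def fit_power_law(sizes, losses):
    """Least-squares fit of log(loss) = log(a) - b * log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


# Synthetic losses generated from a = 5.0, b = 0.1 (illustrative values).
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [5.0 * n ** -0.1 for n in sizes]
a, b = fit_power_law(sizes, losses)
predicted_loss_at_1e10 = a * 1e10 ** -b
```

A straight line in log-log space is exactly what a pure power law predicts. The deviations the survey describes for sparse, mixture-of-experts, or multimodal models would show up as curvature or breaks in such a plot.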

Article 1152

Title@2025-05-27 (2): Enhancing Time Series Forecasting via a Parallel Hybridization of ARIMA and Polynomial Classifiers

Title: Enhancing Time Series Forecasting via a Parallel Hybridization of ARIMA and Polynomial Classifiers

Verbesserung der Zeitreihenprognose über eine parallele Hybridisierung von ARIMA und Polynom-Klassifikatoren

通过ARIMA和多边分类的平行混合预测增强时间序列 2505.06874v2

Authors: Thanh Son Nguyen, Van Thanh Nguyen, Dang Minh Duc Nguyen

Time series forecasting has attracted significant attention, leading to the development of a wide range of approaches, from traditional statistical methods to advanced deep learning models. Among them, the Auto-Regressive Integrated Moving Average (ARIMA) model remains a widely adopted linear technique due to its effectiveness in modeling temporal dependencies in economic, industrial, and social data. On the other hand, polynomial classifiers offer a robust framework for capturing non-linear relationships and have demonstrated competitive performance in domains such as stock price prediction. In this study, we propose a hybrid forecasting approach that integrates the ARIMA model with a polynomial classifier to leverage the complementary strengths of both models. The hybrid method is evaluated on multiple real-world time series datasets spanning diverse domains. Performance is assessed based on forecasting accuracy and computational efficiency. Experimental results reveal that the proposed hybrid model consistently outperforms the individual models in terms of prediction accuracy, albeit with a modest increase in execution time.

Article 1153

Title@2025-05-27 (2): An Optimisation Framework for Unsupervised Environment Design

Title: An Optimisation Framework for Unsupervised Environment Design

Ein Rahmen für die Optimierung des unbeaufsichtigten Umweltdesigns

无人监督环境设计优化框架 2505.20659v1

Authors: Nathan Monette, Alistair Letcher, Michael Beukman, Matthew T. Jackson, Alexander Rutherford, Alexander D. Goldie, Jakob N. Foerster

For reinforcement learning agents to be deployed in high-risk settings, they must achieve a high level of robustness to unfamiliar scenarios. One method for improving robustness is unsupervised environment design (UED), a suite of methods aiming to maximise an agent’s generalisability across configurations of an environment. In this work, we study UED from an optimisation perspective, providing stronger theoretical guarantees for practical settings than prior work. Whereas previous methods relied on guarantees if they reach convergence, our framework employs a nonconvex-strongly-concave objective for which we provide a provably convergent algorithm in the zero-sum setting. We empirically verify the efficacy of our method, outperforming prior methods in a number of environments with varying difficulties.

Article 1154

Title@2025-05-27 (2): When More is Less: Understanding Chain-of-Thought Length in LLMs

Title: When More is Less: Understanding Chain-of-Thought Length in LLMs

Wenn mehr weniger ist: Verstehst du die Kettenlänge in LLMs?

越少越多: 了解LLM 中所寻求的链条长度 2502.07266v3

Authors: Yuyang Wu, Yifei Wang, Ziyu Ye, Tianqi Du, Stefanie Jegelka, Yisen Wang

Large Language Models (LLMs) employ Chain-of-Thought (CoT) reasoning to deconstruct complex problems. While longer CoTs are often presumed superior, this paper challenges that notion, arguing that longer is not always better. Drawing on combined evidence from real-world observations, controlled experiments, and theoretical analysis, we demonstrate that task accuracy typically follows an inverted U-shaped curve with CoT length, where performance initially improves but eventually decreases as the number of CoT steps increases. With controlled experiments, we further uncover the scaling behaviors of the optimal CoT length: it increases with task difficulty but decreases with model capability, exposing an inherent simplicity bias where more capable models favor shorter, more efficient CoT reasoning. This bias is also evident in Reinforcement Learning (RL) training, where models gravitate towards shorter CoTs as their accuracy improves. To have a deep understanding of these dynamics, we establish a simple theoretical model that formally proves these phenomena, including the optimal length’s scaling laws and the emergence of simplicity bias during RL. Guided by this framework, we demonstrate significant practical benefits from training with optimally-lengthed CoTs and employing length-aware filtering at inference. These findings offer both a principled understanding of the “overthinking” phenomenon and multiple practical guidelines for CoT calibration, enabling LLMs to achieve optimal reasoning performance with adaptive CoTs tailored to task complexity and model capability.

Article 1155

Title@2025-05-27 (2): Prompting Decision Transformers for Zero-Shot Reach-Avoid Policies

Title: Prompting Decision Transformers for Zero-Shot Reach-Avoid Policies

Prompting Decision Transformers für Zero-Shot-Reach-Avoid-Politiken

推动零热切无损政策决策变革者 2505.19337v2

Authors: Kevin Li, Marinka Zitnik

Offline goal-conditioned reinforcement learning methods have shown promise for reach-avoid tasks, where an agent must reach a target state while avoiding undesirable regions of the state space. Existing approaches typically encode avoid-region information into an augmented state space and cost function, which prevents flexible, dynamic specification of novel avoid-region information at evaluation time. They also rely heavily on well-designed reward and cost functions, limiting scalability to complex or poorly structured environments. We introduce RADT, a decision transformer model for offline, reward-free, goal-conditioned, avoid region-conditioned RL. RADT encodes goals and avoid regions directly as prompt tokens, allowing any number of avoid regions of arbitrary size to be specified at evaluation time. Using only suboptimal offline trajectories from a random policy, RADT learns reach-avoid behavior through a novel combination of goal and avoid-region hindsight relabeling. We benchmark RADT against 3 existing offline goal-conditioned RL models across 11 tasks, environments, and experimental settings. RADT generalizes in a zero-shot manner to out-of-distribution avoid region sizes and counts, outperforming baselines that require retraining. In one such zero-shot setting, RADT achieves 35.7% improvement in normalized cost over the best retrained baseline while maintaining high goal-reaching success. We apply RADT to cell reprogramming in biology, where it reduces visits to undesirable intermediate gene expression states during trajectories to desired target states, despite stochastic transitions and discrete, structured state dynamics.

Article 1156

Title@2025-05-27 (2): New Paradigm of Adversarial Training: Releasing Accuracy-Robustness Trade-Off via Dummy Class

Title: New Paradigm of Adversarial Training: Releasing Accuracy-Robustness Trade-Off via Dummy Class

Neuer Paradigma der Adversarial Training: Freigabe von Genauigkeit-Robustheit-Trade-Off über Dummy-Klasse

反向培训新范例:通过Dummi类实现释放准确性-交战交易 2410.12671v2

Authors: Yanyun Wang, Li Liu, Zi Liang, Yi R. Fung, Qingqing Ye, Haibo Hu

Adversarial Training (AT) is one of the most effective methods to enhance the robustness of Deep Neural Networks (DNNs). However, existing AT methods suffer from an inherent accuracy-robustness trade-off. Previous works have studied this issue under the current AT paradigm, but still face over 10% accuracy reduction without significant robustness improvement over simple baselines such as PGD-AT. This inherent trade-off raises a question: Whether the current AT paradigm, which assumes to learn corresponding benign and adversarial samples as the same class, inappropriately mixes clean and robust objectives that may be essentially inconsistent. In fact, our empirical results show that up to 40% of CIFAR-10 adversarial samples always fail to satisfy such an assumption across various AT methods and robust models, explicitly indicating the room for improvement of the current AT paradigm. To relax from this overstrict assumption and the tension between clean and robust learning, in this work, we propose a new AT paradigm by introducing an additional dummy class for each original class, aiming to accommodate hard adversarial samples with shifted distribution after perturbation. The robustness w.r.t. these adversarial samples can be achieved by runtime recovery from the predicted dummy classes to the corresponding original ones, without conflicting with the clean objective on accuracy of benign samples. Finally, based on our new paradigm, we propose a novel DUmmy Classes-based Adversarial Training (DUCAT) method that concurrently improves accuracy and robustness in a plug-and-play manner only relevant to logits, loss, and a proposed two-hot soft label-based supervised signal. Our method outperforms state-of-the-art (SOTA) benchmarks, effectively releasing the current trade-off. The code is available at https://github.com/FlaAI/DUCAT.

Article 1157

Title@2025-05-27 (2): FRABench and GenEval: Scaling Fine-Grained Aspect Evaluation across Tasks, Modalities

Title: FRABench and GenEval: Scaling Fine-Grained Aspect Evaluation across Tasks, Modalities

FRABench und GenEval: Skalierung feinkörniger Aspekte Bewertung über Aufgaben, Modalitäten hinweg

FRA Bench和GenEval:扩大对各任务、方式、方式和方式的精细评价 2505.12795v2

Authors: Shibo Hong, Jiahao Ying, Haiyuan Liang, Mengdi Zhang, Jun Kuang, Jiazheng Zhang, Yixin Cao

Evaluating the open-ended outputs of large language models (LLMs) has become a bottleneck as model capabilities, task diversity, and modality coverage rapidly expand. Existing “LLM-as-a-Judge” evaluators are typically narrow in a few tasks, aspects, or modalities, and easily suffer from low consistency. In this paper, we argue that explicit, fine-grained aspect specification is the key to both generalizability and objectivity in automated evaluation. To this end, we propose a hierarchical aspect taxonomy encompassing 112 distinct aspects that unifies evaluation across four representative settings – Natural Language Generation, Image Understanding, Image Generation, and Interleaved Text-and-Image Generation. Building upon this taxonomy, we create FRABench, a benchmark comprising 60.4k pairwise samples with 325k evaluation labels obtained from a combination of human and LLM annotations. FRABench provides the first large-scale, multi-modal resource for training and meta-evaluating fine-grained LMM judges. Leveraging FRABench, we develop GenEval, a fine-grained evaluator generalizable across tasks and modalities. Experiments show that GenEval (i) attains high agreement with GPT-4o and expert annotators, (ii) transfers robustly to unseen tasks and modalities, and (iii) reveals systematic weaknesses of current LMMs on evaluation.

Article 1158

Title@2025-05-27 (2): Voronoi-grid-based Pareto Front Learning and Its Application to Collaborative Federated Learning

Title: Voronoi-grid-based Pareto Front Learning and Its Application to Collaborative Federated Learning

Voronoi-Grid-basiertes Pareto-Front-Lernen und seine Anwendung auf kollaboratives Federated Learning

以Voronoi-Grid为基础的Pareto阵线学习及其在联邦学习合作组织中的应用 2505.20648v1

Authors: Mengmeng Chen, Xiaohu Wu, Qiqi Liu, Tiantian He, Yew-Soon Ong, Yaochu Jin, Qicheng Lao, Han Yu

Multi-objective optimization (MOO) arises widely in machine learning and aims to find a set of Pareto-optimal solutions, called the Pareto front; for example, it is fundamental to multiple avenues of research in federated learning (FL). Pareto-Front Learning (PFL) is a powerful method implemented using Hypernetworks (PHNs) to approximate the Pareto front. This method enables the acquisition of a mapping function from a given preference vector to the solutions on the Pareto front. However, most existing PFL approaches still face two challenges: (a) sampling rays in high-dimensional spaces; (b) failing to cover the entire Pareto front, which has a convex shape. Here, we introduce a novel PFL framework, called PHN-HVVS, which decomposes the design space into Voronoi grids and deploys a genetic algorithm (GA) for Voronoi grid partitioning within high-dimensional space. We put forward a new loss function, which effectively contributes to more extensive coverage of the resultant Pareto front and maximizes the HV Indicator. Experimental results on multiple MOO machine learning tasks demonstrate that PHN-HVVS significantly outperforms the baselines in generating the Pareto front. We also illustrate that PHN-HVVS advances the methodologies of several recent problems in the FL field. The code is available at https://github.com/buptcmm/phnhvvs.

Article 1159

Title@2025-05-27 (2): Moment Expansions of the Energy Distance

Title: Moment Expansions of the Energy Distance

Momenterweiterungen der Energieentfernung

扩大能源距离时间 2505.20647v1

Authors: Ian Langmore

The energy distance is used to test distributional equality, and as a loss function in machine learning. While $D^2(X, Y)=0$ only when $X\sim Y$, the sensitivity to different moments is of practical importance. This work considers $D^2(X, Y)$ in the case where the distributions are close. In this regime, $D^2(X, Y)$ is more sensitive to differences in the means $\bar{X}-\bar{Y}$, than differences in the covariances $\Delta$. This is due to the structure of the energy distance and is independent of dimension. The sensitivity to on versus off diagonal components of $\Delta$ is examined when $X$ and $Y$ are close to isotropic. Here a dimension dependent averaging occurs and, in many cases, off diagonal correlations contribute significantly less. Numerical results verify these relationships hold even when distributional assumptions are not strictly met.
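
For readers unfamiliar with the quantity, the following sketch estimates the squared energy distance from samples. It assumes the standard definition $D^2(X, Y) = 2\,\mathbb{E}\|X - Y\| - \mathbb{E}\|X - X'\| - \mathbb{E}\|Y - Y'\|$, where $X'$ and $Y'$ are independent copies, and uses the plain plug-in (V-statistic) estimator. The abstract does not spell out its exact normalization, so treat this as an illustration rather than the paper's formulation.

```python
import math


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (p, q) with p in a, q in b."""
    total = sum(math.dist(p, q) for p in a for q in b)
    return total / (len(a) * len(b))


def energy_distance_sq(x, y):
    """Plug-in estimate of 2 E|X-Y| - E|X-X'| - E|Y-Y'|."""
    return (
        2.0 * mean_pairwise_distance(x, y)
        - mean_pairwise_distance(x, x)
        - mean_pairwise_distance(y, y)
    )


# Toy one-dimensional samples: y is x shifted by 2 (a pure mean difference).
x = [(0.0,), (1.0,)]
y = [(2.0,), (3.0,)]
d2_shifted = energy_distance_sq(x, y)
d2_same = energy_distance_sq(x, x)
```

Identical samples give zero, while a shift in the mean gives a strictly positive value.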

Article 1160

Title@2025-05-27 (2): Evaluating Training in Binarized Neural Networks Through the Lens of Algorithmic Information Theory

Title: Evaluating Training in Binarized Neural Networks Through the Lens of Algorithmic Information Theory

Bewertung der Ausbildung in Binarized Neural Networks durch die Linse der algorithmischen Informationstheorie

通过分析信息理论的透镜评估神经网络的觉测培训 2505.20646v1

Authors: Eduardo Y. Sakabe, Felipe S. Abrahão, Alexandre Simões, Esther Colombini, Paula Costa, Ricardo Gudwin, Hector Zenil

Understanding and controlling the informational complexity of neural networks is a central challenge in machine learning, with implications for generalization, optimization, and model capacity. While most approaches rely on entropy-based loss functions and statistical metrics, these measures often fail to capture deeper, causally relevant algorithmic regularities embedded in network structure. We propose a shift toward algorithmic information theory, using Binarized Neural Networks (BNNs) as a first proxy. Grounded in algorithmic probability (AP) and the universal distribution it defines, our approach characterizes learning dynamics through a formal, causally grounded lens. We apply the Block Decomposition Method (BDM) – a scalable approximation of algorithmic complexity based on AP – and demonstrate that it more closely tracks structural changes during training than entropy, consistently exhibiting stronger correlations with training loss across varying model sizes and randomized training runs. These results support the view of training as a process of algorithmic compression, where learning corresponds to the progressive internalization of structured regularities. In doing so, our work offers a principled estimate of learning progression and suggests a framework for complexity-aware learning and regularization, grounded in first principles from information theory, complexity, and computability.

Article 1161

Title@2025-05-27 (2): Task-Optimized Convolutional Recurrent Networks Align with Tactile Processing in the Rodent Brain

Title: Task-Optimized Convolutional Recurrent Networks Align with Tactile Processing in the Rodent Brain

Aufgabenoptimierte konvolutionäre recurrente Netzwerke richten sich an taktile Verarbeitung im Nagetierhirn

与鼠脑中触摸处理相适应的任务优化的革命经常网络 2505.18361v2

Authors: Trinity Chung, Yuchen Shen, Nathan C. L. Kong, Aran Nayebi

Tactile sensing remains far less understood in neuroscience and less effective in artificial systems compared to more mature modalities such as vision and language. We bridge these gaps by introducing a novel Encoder-Attender-Decoder (EAD) framework to systematically explore the space of task-optimized temporal neural networks trained on realistic tactile input sequences from a customized rodent whisker-array simulator. We identify convolutional recurrent neural networks (ConvRNNs) as superior encoders to purely feedforward and state-space architectures for tactile categorization. Crucially, these ConvRNN-encoder-based EAD models achieve neural representations closely matching rodent somatosensory cortex, saturating the explainable neural variability and revealing a clear linear relationship between supervised categorization performance and neural alignment. Furthermore, contrastive self-supervised ConvRNN-encoder-based EADs, trained with tactile-specific augmentations, match supervised neural fits, serving as an ethologically-relevant, label-free proxy. For neuroscience, our findings highlight nonlinear recurrent processing as important for general-purpose tactile representations in somatosensory cortex, providing the first quantitative characterization of the underlying inductive biases in this system. For embodied AI, our results emphasize the importance of recurrent EAD architectures to handle realistic tactile inputs, along with tailored self-supervised learning methods for achieving robust tactile perception with the same type of sensors animals use to sense in unstructured environments.

Article 1162

Title@2025-05-27 (2): Can Past Experience Accelerate LLM Reasoning?

Title: Can Past Experience Accelerate LLM Reasoning?

Kann vergangene Erfahrung LLM Reasoning beschleunigen?

以往经验能否加快LLM理由解释? 2505.20643v1

Authors: Bo Pan, Liang Zhao

Allocating more compute to large language model (LLM) reasoning has generally been shown to improve effectiveness, but it also increases inference time. In contrast, humans can perform tasks faster and better with increased experience and exposure. Hence, this paper investigates the question: Can LLMs also become faster at reasoning through recurrent exposure to relevant tasks, and if so, how can this be achieved? To address these questions, we first systematically formalize the problem setting of LLM reasoning speedup along the dimensions of task relevancy and compute budget calculation. We then propose SpeedupLLM, a theoretically guaranteed framework to implement and benchmark such reasoning speedup behaviour based on adaptive compute allocation and memory mechanisms. We further conduct comprehensive experiments to benchmark such behaviour across different question similarity levels, memory methods, and reasoning methods. Results show that LLMs can generally reason faster with past experience, achieving up to a 56% reduction in compute cost when equipped with appropriate memory and reasoning methods.

Article 1163

Title@2025-05-27 (2): PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation

Title: PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation

PosterO: Strukturierung von Layout-Strukturen zur Aktivierung von Sprachmodellen in der Generierung von generalisierten Content-Aware-Layouts

PosterO: 构建布局树以在通用内容软件布局生成中启用语言模型 2505.07843v2

Authors: HsiaoYuan Hsu, Yuxin Peng

In poster design, content-aware layout generation is crucial for automatically arranging visual-textual elements on the given image. With limited training data, existing work focused on image-centric enhancement. However, this neglects the diversity of layouts and fails to cope with shape-variant elements or diverse design intents in generalized settings. To this end, we proposed a layout-centric approach that leverages layout knowledge implicit in large language models (LLMs) to create posters for omnifarious purposes, hence the name PosterO. Specifically, it structures layouts from datasets as trees in SVG language by universal shape, design intent vectorization, and hierarchical node representation. Then, it applies LLMs during inference to predict new layout trees by in-context learning with intent-aligned example selection. After layout trees are generated, we can seamlessly realize them into poster designs by editing the chat with LLMs. Extensive experimental results have demonstrated that PosterO can generate visually appealing layouts for given images, achieving new state-of-the-art performance across various benchmarks. To further explore PosterO’s abilities under the generalized settings, we built PStylish7, the first dataset with multi-purpose posters and various-shaped elements, further offering a challenging test for advanced research.

Article 1164

Title@2025-05-27 (2): Rethinking MUSHRA: Addressing Modern Challenges in Text-to-Speech Evaluation

Title: Rethinking MUSHRA: Addressing Modern Challenges in Text-to-Speech Evaluation

Rethinking MUSHRA: Bewältigung moderner Herausforderungen in der Text-zu-Speech-Bewertung

重新思考MUSHRA:应对文本到语音评价中的现代挑战 2411.12719v3

Authors: Praveen Srinivasa Varadhan, Amogh Gulati, Ashwin Sankar, Srija Anand, Anirudh Gupta, Anirudh Mukherjee, Shiva Kumar Marepally, Ankur Bhatia, Saloni Jaju, Suvrat Bhooshan, Mitesh M. Khapra

Despite rapid advancements in TTS models, a consistent and robust human evaluation framework is still lacking. For example, MOS tests fail to differentiate between similar models, and CMOS’s pairwise comparisons are time-intensive. The MUSHRA test is a promising alternative for evaluating multiple TTS systems simultaneously, but in this work we show that its reliance on matching human reference speech unduly penalises the scores of modern TTS systems that can exceed human speech quality. More specifically, we conduct a comprehensive assessment of the MUSHRA test, focusing on its sensitivity to factors such as rater variability, listener fatigue, and reference bias. Based on our extensive evaluation involving 492 human listeners across Hindi and Tamil we identify two primary shortcomings: (i) reference-matching bias, where raters are unduly influenced by the human reference, and (ii) judgement ambiguity, arising from a lack of clear fine-grained guidelines. To address these issues, we propose two refined variants of the MUSHRA test. The first variant enables fairer ratings for synthesized samples that surpass human reference quality. The second variant reduces ambiguity, as indicated by the relatively lower variance across raters. By combining these approaches, we achieve both more reliable and more fine-grained assessments. We also release MANGO, a massive dataset of 246,000 human ratings, the first-of-its-kind collection for Indian languages, aiding in analyzing human preferences and developing automatic metrics for evaluating TTS systems.

Article 1165

Title@2025-05-27 (2): Pointing the Way: Refining Radar-Lidar Localization Using Learned ICP Weights

Title: Pointing the Way: Refining Radar-Lidar Localization Using Learned ICP Weights

Den Weg weisen: Verfeinerung der Radar-Lidar-Lokalisierung mit erfahrenen ICP-Gewichten

指向方向:利用比较方案所积累的重量改进雷达-里达尔的本地化 2309.08731v4

Authors: Daniil Lisus, Johann Laconte, Keenan Burnett, Ziyu Zhang, Timothy D. Barfoot

This paper presents a novel deep-learning-based approach to improve localizing radar measurements against lidar maps. This radar-lidar localization leverages the benefits of both sensors; radar is resilient against adverse weather, while lidar produces high-quality maps in clear conditions. However, owing in part to the unique artefacts present in radar measurements, radar-lidar localization has struggled to achieve comparable performance to lidar-lidar systems, preventing it from being viable for autonomous driving. This work builds on ICP-based radar-lidar localization by including a learned preprocessing step that weights radar points based on high-level scan information. To train the weight-generating network, we present a novel, stand-alone, open-source differentiable ICP library. The learned weights facilitate ICP by filtering out harmful radar points related to artefacts, noise, and even vehicles on the road. Combining an analytical approach with a learned weight reduces overall localization errors and improves convergence in radar-lidar ICP results run on real-world autonomous driving data. Our code base is publicly available to facilitate reproducibility and extensions.
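
The role of per-point weights in an ICP alignment step can be sketched in two dimensions. The example below solves the weighted least-squares rigid alignment in closed form for correspondences that are assumed to be already matched. It does not use the authors' differentiable ICP library, and the weights here are set by hand rather than produced by a network.

```python
import math


def weighted_rigid_align_2d(src, dst, weights):
    """Return (theta, (tx, ty)) minimizing sum_i w_i * |R(theta) p_i + t - q_i|^2."""
    w_sum = sum(weights)
    # Weighted centroids of both point sets.
    cx_s = sum(w * p[0] for w, p in zip(weights, src)) / w_sum
    cy_s = sum(w * p[1] for w, p in zip(weights, src)) / w_sum
    cx_d = sum(w * q[0] for w, q in zip(weights, dst)) / w_sum
    cy_d = sum(w * q[1] for w, q in zip(weights, dst)) / w_sum
    c_dot = 0.0
    s_cross = 0.0
    for w, (px, py), (qx, qy) in zip(weights, src, dst):
        ax, ay = px - cx_s, py - cy_s
        bx, by = qx - cx_d, qy - cy_d
        c_dot += w * (ax * bx + ay * by)
        s_cross += w * (ax * by - ay * bx)
    theta = math.atan2(s_cross, c_dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, (tx, ty)


# Toy scan: three clean points plus one corrupted match (e.g. an artefact).
true_theta, true_t = math.pi / 6, (1.0, -2.0)
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
ct, st = math.cos(true_theta), math.sin(true_theta)
dst = [(ct * x - st * y + true_t[0], st * x + ct * y + true_t[1]) for x, y in src]
dst[3] = (10.0, 10.0)  # corrupt the last correspondence

theta_w, t_w = weighted_rigid_align_2d(src, dst, [1.0, 1.0, 1.0, 0.0])
theta_u, t_u = weighted_rigid_align_2d(src, dst, [1.0, 1.0, 1.0, 1.0])
```

With the corrupted point down-weighted to zero, the true rotation and translation are recovered. With uniform weights, the outlier pulls the estimate away from the truth, which is the effect the learned weights are meant to suppress.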

Article 1166

Title@2025-05-27 (2): GMoE: Empowering LLMs Fine-Tuning via MoE Graph Collaboration

Title: GMoE: Empowering LLMs Fine-Tuning via MoE Graph Collaboration

GMoE: Stärkung von LLMs Feinsteuerung über MoE Graph Collaboration

GMOE:通过教育部图表合作,赋予LLMs Fine-Turning女士权力 2412.16216v3

Authors: Ting Bai, Yue Yu, Le Huang, Zenan Xu, Zhe Zhao, Chuan Shi

The sparse Mixture-of-Experts (MoE) architecture of large language models (LLMs) confronts an inherent issue of load imbalance arising from the simplistic linear router strategy, which ultimately causes the instability and inefficient learning of LLMs. To address this challenge, we introduce a novel MoE graph-based framework $\textbf{GMoE}$, aimed at enhancing the collaboration among multiple experts. In GMoE, a graph router function is designed to capture the collaboration signals among experts. This enables all experts to dynamically allocate information derived from input data by sharing information with their neighboring experts. Moreover, we put forward two coordination strategies in GMoE: the $\textit{Poisson distribution-based distinction strategy}$ and the $\textit{Normal distribution-based balance strategy}$, to further release the capacity of each expert and increase the model stability in the fine-tuning of LLMs. Specifically, we leverage a parameter-efficient fine-tuning technique, i.e., Low-Rank Adaptation (LoRA), to implement the graph MoE architecture. Extensive experiments on four real-world benchmark datasets demonstrate the effectiveness of GMoE, showing the benefits of facilitating collaborations of multiple experts in LLM fine-tuning. The code of experimental implementation is available at https://github.com/BAI-LAB/GMoE

Article 1167

Title@2025-05-27 (2): Non-identifiability distinguishes Neural Networks among Parametric Models

Title: Non-identifiability distinguishes Neural Networks among Parametric Models

Nicht-Identifizierbarkeit unterscheidet neurale Netzwerke zwischen parametrischen Modellen

不可识别性将神经网络区分为参数模型 2504.18017v2

Authors: Sourav Chatterjee, Timothy Sudijono

One of the enduring problems surrounding neural networks is to identify the factors that differentiate them from traditional statistical models. We prove a pair of results which distinguish feedforward neural networks among parametric models at the population level, for regression tasks. Firstly, we prove that for any pair of random variables $(X,Y)$, neural networks always learn a nontrivial relationship between $X$ and $Y$, if one exists. Secondly, we prove that for reasonable smooth parametric models, under local and global identifiability conditions, there exists a nontrivial $(X,Y)$ pair for which the parametric model learns the constant predictor $\mathbb{E}[Y]$. Together, our results suggest that a lack of identifiability distinguishes neural networks among the class of smooth parametric models.

Article 1168

Title@2025-05-27 (2): Scintillation pulse characterization with spectrum-inspired temporal neural networks: case studies on particle detector signals

Title: Scintillation pulse characterization with spectrum-inspired temporal neural networks: case studies on particle detector signals

Scintillation-Pulscharakterisierung mit spektruminspirierten zeitlichen neuronalen Netzwerken: Fallstudien zu Partikeldetektor-Signalen

与受频谱启发的时时神经网络的闪烁脉冲定性:粒子探测器信号案例研究 2410.07267v3

Authors: Pengcheng Ai, Xiangming Sun, Zhi Deng, Xinchi Ran

Particle detectors based on scintillators are widely used in high-energy physics and astroparticle physics experiments, nuclear medicine imaging, industrial and environmental detection, etc. Precisely extracting scintillation signal characteristics at the event level is important for these applications, both for understanding the scintillator itself and for determining the types and physical properties of incident particles. Recent research demonstrates that data-driven neural networks surpass traditional statistical methods, especially when the analytical form of the signals is hard to obtain or the noise is significant. However, most densely connected or convolution-based networks fail to fully exploit the spectral and temporal structure of scintillation signals, leaving considerable room for performance improvement. In this paper, we propose a network architecture specially tailored for scintillation pulse characterization, building on previous work on time series analysis. The core insight is that, by directly applying the Fast Fourier Transform to the original signals and utilizing different frequency components, the proposed network architecture can serve as a lightweight and enhanced representation learning backbone. We validate our idea in two case studies: (a) simulation data generated with the setting of the LUX dark matter detector, and (b) experimental electrical signals with fast electronics to emulate scintillation variations for the NICA/MPD calorimeter. The proposed model achieves significantly better results than the reference model in the literature and densely connected models, and demonstrates higher cost-efficiency than conventional machine learning methods.

Article 1169

Title@2025-05-27 (2): Policy Design for Two-sided Platforms with Participation Dynamics

Title: Policy Design for Two-sided Platforms with Participation Dynamics

Politikgestaltung für zweiseitige Plattformen mit Partizipationsdynamik

具有参与动态的双面平台政策设计 2502.01792v2

Authors: Haruka Kiyohara, Fan Yao, Sarah Dean

In two-sided platforms (e.g., video streaming or e-commerce), viewers and providers engage in interactive dynamics: viewers benefit from increases in provider populations, while providers benefit from increases in viewer population. Despite the importance of such “population effects” on long-term platform health, recommendation policies do not generally take the participation dynamics into account. This paper thus studies the dynamics and recommender policy design on two-sided platforms under the population effects for the first time. Our control- and game-theoretic findings warn against the use of the standard “myopic-greedy” policy and shed light on the importance of provider-side considerations (i.e., effectively distributing exposure among provider groups) to improve social welfare via population growth. We also present a simple algorithm to optimize long-term social welfare by taking the population effects into account, and demonstrate its effectiveness in synthetic and real-data experiments. Our experiment code is available at https://github.com/sdean-group/dynamics-two-sided-market.

Article 1170

Title@2025-05-27 (2): Explaining Concept Shift with Interpretable Feature Attribution

Title: Explaining Concept Shift with Interpretable Feature Attribution

Erklären von Konzeptverschiebungen mit interpretierbarer Eigenschaftszuweisung

解释解释概念转变与可解释性地物归属 2505.20634v1

Authors: Ruiqi Lyu, Alistair Turcan, Bryan Wilder

Regardless of the amount of data a machine learning (ML) model is trained on, there will inevitably be data that differs from its training set, lowering model performance. Concept shift occurs when the distribution of labels conditioned on the features changes, so that even a well-tuned ML model has learned a fundamentally incorrect representation. Identifying these shifted features provides unique insight into how one dataset differs from another, since the difference may lie along a scientifically relevant dimension, such as time, disease status, population, etc. In this paper, we propose SGShift, a model for detecting concept shift in tabular data and attributing reduced model performance to a sparse set of shifted features. SGShift models concept shift with a Generalized Additive Model (GAM) and performs subsequent feature selection to identify shifted features. We propose further extensions of SGShift by incorporating knockoffs to control false discoveries and an absorption term to account for models with poor fit to the data. We conduct extensive experiments on synthetic and real data across various ML models and find that SGShift can identify shifted features with AUC $>0.9$ and recall $>90\%$, often 2 or 3 times as high as baseline methods.

Article 1171

Title@2025-05-27 (2): Adaptive Backtracking Line Search

Title: Adaptive Backtracking Line Search

Adaptive Rückverfolgungszeilensuche

适应性后回跟踪线搜索 2408.13150v2

Authors: Joao V. Cavalcanti, Laurent Lessard, Ashia C. Wilson

Backtracking line search is foundational in numerical optimization. The basic idea is to adjust the step-size of an algorithm by a constant factor until some chosen criterion (e.g. Armijo, Descent Lemma) is satisfied. We propose a novel way to adjust step-sizes, replacing the constant factor used in regular backtracking with one that takes into account the degree to which the chosen criterion is violated, with no additional computational burden. This lightweight adjustment leads to significantly faster optimization, which we confirm by performing a variety of experiments on over fifteen real-world datasets. For convex problems, we prove adaptive backtracking requires no more adjustments to produce a feasible step-size than regular backtracking does. For nonconvex smooth problems, we prove adaptive backtracking enjoys the same guarantees as regular backtracking. Furthermore, we prove adaptive backtracking preserves the convergence rates of gradient descent and its accelerated variant.
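
The contrast between a constant and a violation-aware shrink factor can be sketched as follows, assuming the Armijo criterion on a one-dimensional problem. The violation measure `v` and the factor `rho * (1 - c) / (1 - c * v)` are illustrative choices made for this sketch, not a formula stated in the abstract, and all function names are hypothetical.

```python
def armijo_ok(f, x, d, slope, alpha, c):
    """Armijo condition: f(x + alpha*d) <= f(x) + c*alpha*slope, where slope = f'(x)*d < 0."""
    return f(x + alpha * d) <= f(x) + c * alpha * slope


def regular_backtracking(f, x, d, slope, alpha=1.0, c=0.1, rho=0.5):
    """Shrink alpha by the constant factor rho until the Armijo condition holds."""
    shrinks = 0
    while not armijo_ok(f, x, d, slope, alpha, c):
        alpha *= rho
        shrinks += 1
    return alpha, shrinks


def adaptive_backtracking(f, x, d, slope, alpha=1.0, c=0.1, rho=0.5, floor=1e-3):
    """Shrink alpha by a factor that depends on how badly Armijo is violated.

    v >= 1 is equivalent to the Armijo condition (slope is negative);
    the smaller v is, the stronger the violation and the smaller the factor.
    The factor formula is an assumption of this sketch.
    """
    fx = f(x)
    shrinks = 0
    while True:
        v = (f(x + alpha * d) - fx) / (c * alpha * slope)
        if v >= 1.0:
            return alpha, shrinks
        factor = max(floor, rho * (1.0 - c) / (1.0 - c * v))
        alpha *= factor
        shrinks += 1
```

For example, with `f = lambda x: 50 * x * x`, `x = 1.0`, `d = -100.0` and `slope = -10000.0`, both routines return a step that satisfies the Armijo condition, and the adaptive variant needs fewer shrink steps because a large violation triggers a much smaller factor. Both variants use one function evaluation per trial step, which is consistent with the claim of no additional computational burden.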

Article 1172

Title@2025-05-27 (2): Test-Time Learning for Large Language Models

Title: Test-Time Learning for Large Language Models

Test-Time Learning für große Sprachmodelle

大语言模型试验时间学习 2505.20633v1

Authors: Jinwu Hu, Zhitian Zhang, Guohao Chen, Xutao Wen, Chao Shuai, Wei Luo, Bin Xiao, Yuanqing Li, Mingkui Tan

While Large Language Models (LLMs) have exhibited remarkable emergent capabilities through extensive pre-training, they still face critical limitations in generalizing to specialized domains and handling diverse linguistic variations, known as distribution shifts. In this paper, we propose a Test-Time Learning (TTL) paradigm for LLMs, namely TLM, which dynamically adapts LLMs to target domains using only unlabeled test data during testing. Specifically, we first provide empirical evidence and theoretical insights to reveal that more accurate predictions from LLMs can be achieved by minimizing the input perplexity of the unlabeled test data. Based on this insight, we formulate the Test-Time Learning process of LLMs as input perplexity minimization, enabling self-supervised enhancement of LLM performance. Furthermore, we observe that high-perplexity samples tend to be more informative for model optimization. Accordingly, we introduce a Sample Efficient Learning Strategy that actively selects and emphasizes these high-perplexity samples for test-time updates. Lastly, to mitigate catastrophic forgetting and ensure adaptation stability, we adopt Low-Rank Adaptation (LoRA) instead of full-parameter optimization, which allows lightweight model updates while preserving more original knowledge from the model. We introduce the AdaptEval benchmark for TTL and demonstrate through experiments that TLM improves performance by at least 20% compared to original LLMs on domain knowledge adaptation.
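
The quantity being minimized can be made concrete with a small sketch. It computes input perplexity from per-token log-probabilities and keeps only high-perplexity samples for an update step. The threshold rule and the function names are hypothetical; the abstract does not specify how samples are selected or weighted.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood over the tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def select_high_perplexity(samples, threshold):
    """Return (sample_id, perplexity) pairs whose perplexity exceeds the threshold.

    `samples` maps a sample id to its list of token log-probabilities.
    The fixed threshold is an assumption of this sketch.
    """
    scored = [(sid, perplexity(logps)) for sid, logps in samples.items()]
    return [(sid, ppl) for sid, ppl in scored if ppl > threshold]
```

A sequence whose tokens each receive probability 0.5 has perplexity 2, so lower perplexity means the model finds the input less surprising. In a test-time learning loop, the selected samples would supply the language-modeling loss for the LoRA update.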

Article 1173

Title@2025-05-27 (2): Incorporating Flexible Image Conditioning into Text-to-Video Diffusion Models without Training

Title: Incorporating Flexible Image Conditioning into Text-to-Video Diffusion Models without Training

Einschließlich flexibler Bildkonditionierung in Text-zu-Video-Diffusionsmodelle ohne Training

将灵活的图像条件纳入无培训的文本到视频传播模型 2505.20629v1

Authors: Bolin Lai, Sangmin Lee, Xu Cao, Xiang Li, James M. Rehg

Text-image-to-video (TI2V) generation is a critical problem for controllable video generation using both semantic and visual conditions. Most existing methods add visual conditions to text-to-video (T2V) foundation models by finetuning, which is costly in resources and limited to a few predefined conditioning settings. To tackle this issue, we introduce a unified formulation for TI2V generation with flexible visual conditioning. Furthermore, we propose an innovative training-free approach, dubbed FlexTI2V, that can condition T2V foundation models on an arbitrary number of images at arbitrary positions. Specifically, we first invert the condition images to a noisy representation in a latent space. Then, in the denoising process of T2V models, our method uses a novel random patch swapping strategy to incorporate visual features into video representations through local image patches. To balance creativity and fidelity, we use a dynamic control mechanism to adjust the strength of visual conditioning for each video frame. Extensive experiments validate that our method surpasses previous training-free image conditioning methods by a notable margin. We also provide further insights into our method through a detailed ablation study and analysis.

Article 1174

Title@2025-05-27 (2): Position: Adopt Constraints Over Penalties in Deep Learning

Title: Position: Adopt Constraints Over Penalties in Deep Learning

Position: Überstrapazierte Strafen im Deep Learning adoptieren

职位:在深深学习中采用约束措施以凌驾刑罚 2505.20628v1

Authors: Juan Ramirez, Meraj Hashemizadeh, Simon Lacoste-Julien

Recent efforts toward developing trustworthy AI systems with accountability guarantees have led to a growing reliance on machine learning formulations that incorporate external requirements, or constraints. These requirements are often enforced through penalization, that is, by adding fixed-weight terms to the task loss. We argue that this approach is ill-suited, and that tailored constrained optimization methods should be adopted instead. In particular, there may be no penalty coefficient that yields a solution which both satisfies the constraints and achieves good performance, i.e., one that solves the constrained problem. Moreover, tuning these coefficients is costly, incurring significant time and computational overhead. In contrast, tailored constrained methods, such as the Lagrangian approach, which optimizes the penalization “coefficients” (the Lagrange multipliers) alongside the model, (i) truly solve the constrained problem and add accountability, (ii) eliminate the need for extensive penalty tuning, and (iii) integrate seamlessly with modern deep learning pipelines.
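
The Lagrangian approach can be illustrated on a toy problem: minimize $x^2$ subject to $x \geq 1$, written as $g(x) = 1 - x \leq 0$. The sketch below runs simultaneous gradient descent on $x$ and projected gradient ascent on the multiplier. The problem, step size, and function name are assumptions chosen for illustration, not taken from the paper.

```python
def solve_with_lagrangian(steps=2000, lr=0.05):
    """Gradient descent-ascent on L(x, lam) = x**2 + lam * (1 - x), with lam >= 0."""
    x, lam = 0.0, 0.0
    for _ in range(steps):
        grad_x = 2.0 * x - lam      # dL/dx
        violation = 1.0 - x         # g(x); positive means the constraint is violated
        x -= lr * grad_x            # descent step on the primal variable
        lam = max(0.0, lam + lr * violation)  # ascent step, projected onto lam >= 0
    return x, lam
```

The iterates approach the constrained solution $x = 1$ with multiplier $\lambda = 2$. By comparison, a fixed-weight quadratic penalty $x^2 + w \max(0, 1 - x)^2$ has its minimizer at $x = w/(1+w)$, which is infeasible for every finite weight $w$; this illustrates the abstract's point that no penalty coefficient may recover the constrained solution.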

Article 1175

Title@2025-05-27 (2): JaxRobotarium: Training and Deploying Multi-Robot Policies in 10 Minutes

Title: JaxRobotarium: Training and Deploying Multi-Robot Policies in 10 Minutes

JaxRobotarium: Schulung und Einsatz von Multi-Roboter-Politik in 10 Minuten

JaxRobotior:10分钟内培训和部署多机器人政策 2505.06771v2

Authors: Shalin Anand Jain, Jiazhen Liu, Siva Kailas, Harish Ravichandar

Multi-agent reinforcement learning (MARL) has emerged as a promising solution for learning complex and scalable coordination behaviors in multi-robot systems. However, established MARL platforms (e.g., SMAC and MPE) lack robotics relevance and hardware deployment, leaving multi-robot learning researchers to develop bespoke environments and hardware testbeds dedicated to the development and evaluation of their individual contributions. The Multi-Agent RL Benchmark and Learning Environment for the Robotarium (MARBLER) is an exciting recent step in providing a standardized robotics-relevant platform for MARL, by bridging the Robotarium testbed with existing MARL software infrastructure. However, MARBLER lacks support for parallelization and GPU/TPU execution, making the platform prohibitively slow compared to modern MARL environments and hindering adoption. We contribute JaxRobotarium, a Jax-powered end-to-end simulation, learning, deployment, and benchmarking platform for the Robotarium. JaxRobotarium enables rapid training and deployment of multi-robot RL (MRRL) policies with realistic robot dynamics and safety constraints, supporting parallelization and hardware acceleration. Our generalizable learning interface integrates easily with SOTA MARL libraries (e.g., JaxMARL). In addition, JaxRobotarium includes eight standardized coordination scenarios, including four novel scenarios that bring established MARL benchmark tasks (e.g., RWARE and Level-Based Foraging) to a robotics setting. We demonstrate that JaxRobotarium retains high simulation fidelity while achieving dramatic speedups over baseline (20x in training and 150x in simulation), and provides an open-access sim-to-real evaluation pipeline through the Robotarium testbed, accelerating and democratizing access to multi-robot learning research and evaluation. Our code is available at https://github.com/GT-STAR-Lab/JaxRobotarium.

Article 1176

Title@2025-05-27 (2): Knowledge Distillation Approach for SOS Fusion Staging: Towards Fully Automated Skeletal Maturity Assessment

Title: Knowledge Distillation Approach for SOS Fusion Staging: Towards Fully Automated Skeletal Maturity Assessment

Wissensdestillationsansatz für SOS-Fusionsstaging: Auf dem Weg zu einer vollautomatischen Skeletalreifebewertung

利用知识蒸馏方法解决求求求融合问题:全面自动化骨骼成熟期评估 2505.21561v1

Authors: Omid Halimi Milani, Amanda Nikho, Marouane Tliba, Lauren Mills, Ahmet Enis Cetin, Mohammed H Elnagar

We introduce a novel deep learning framework for the automated staging of spheno-occipital synchondrosis (SOS) fusion, a critical diagnostic marker in both orthodontics and forensic anthropology. Our approach leverages a dual-model architecture wherein a teacher model, trained on manually cropped images, transfers its precise spatial understanding to a student model that operates on full, uncropped images. This knowledge distillation is facilitated by a newly formulated loss function that aligns spatial logits as well as incorporates gradient-based attention spatial mapping, ensuring that the student model internalizes the anatomically relevant features without relying on external cropping or YOLO-based segmentation. By leveraging expert-curated data and feedback at each step, our framework attains robust diagnostic accuracy, culminating in a clinically viable end-to-end pipeline. This streamlined approach obviates the need for additional pre-processing tools and accelerates deployment, thereby enhancing both the efficiency and consistency of skeletal maturation assessment in diverse clinical settings.

Article 1177

Title@2025-05-27 (2): SeqPO-SiMT: Sequential Policy Optimization for Simultaneous Machine Translation

Title: SeqPO-SiMT: Sequential Policy Optimization for Simultaneous Machine Translation

SeqPO-SiMT: Sequentielle Politikoptimierung für die gleichzeitige maschinelle Übersetzung

SeqPO-SIMT:同步机器翻译的序列政策优化 2505.20622v1

Authors: Ting Xu, Zhichao Huang, Jiankai Sun, Shanbo Cheng, Wai Lam

We present Sequential Policy Optimization for Simultaneous Machine Translation (SeqPO-SiMT), a new policy optimization framework that defines the simultaneous machine translation (SiMT) task as a sequential decision making problem, incorporating a tailored reward to enhance translation quality while reducing latency. In contrast to popular Reinforcement Learning from Human Feedback (RLHF) methods, such as PPO and DPO, which are typically applied in single-step tasks, SeqPO-SiMT effectively tackles the multi-step SiMT task. This intuitive framework allows the SiMT LLMs to simulate and refine the SiMT process using a tailored reward. We conduct experiments on six datasets from diverse domains for En to Zh and Zh to En SiMT tasks, demonstrating that SeqPO-SiMT consistently achieves significantly higher translation quality with lower latency. In particular, SeqPO-SiMT outperforms the supervised fine-tuning (SFT) model by 1.13 points in COMET, while reducing the Average Lagging by 6.17 in the NEWSTEST2021 En to Zh dataset. While SiMT operates with far less context than offline translation, the SiMT results of SeqPO-SiMT on 7B LLM surprisingly rival the offline translation of high-performing LLMs, including Qwen-2.5-7B-Instruct and LLaMA-3-8B-Instruct.

Article 1178

Title@2025-05-27 (2): Multi-level Certified Defense Against Poisoning Attacks in Offline Reinforcement Learning

Title: Multi-level Certified Defense Against Poisoning Attacks in Offline Reinforcement Learning

Mehrstufige Zertifizierte Verteidigung gegen vergiftende Angriffe im Offline-Verstärkungslernen

多级认证防卫,防止在离线强化学习中进行毒物攻击 2505.20621v1

Authors: Shijie Liu, Andrew C. Cullen, Paul Montague, Sarah Erfani, Benjamin I. P. Rubinstein

Similar to other machine learning frameworks, Offline Reinforcement Learning (RL) has been shown to be vulnerable to poisoning attacks due to its reliance on externally sourced datasets, a vulnerability that is exacerbated by its sequential nature. To mitigate the risks posed by RL poisoning, we extend certified defenses to provide larger guarantees against adversarial manipulation, ensuring robustness for both per-state actions and the overall expected cumulative reward. Our approach leverages properties of Differential Privacy in a manner that allows this work to span both continuous and discrete spaces, as well as stochastic and deterministic environments, significantly expanding the scope and applicability of achievable guarantees. Empirical evaluations demonstrate that our approach ensures the performance drops to no more than $50\%$ with up to $7\%$ of the training data poisoned, significantly improving over the $0.008\%$ in prior work (Wu et al., 2022), while producing certified radii that are $5$ times larger as well. This highlights the potential of our framework to enhance safety and reliability in offline RL.

Article 1179

Title@2025-05-27 (2): An Inexact Halpern Iteration with Application to Distributionally Robust Optimization

Title: An Inexact Halpern Iteration with Application to Distributionally Robust Optimization

Eine ungenaue Halpern-Iteration mit Anwendung zur distributiv robusten Optimierung

用于分布强力优化优化的不精确 Halpern 迭代 2402.06033v3

Authors: Ling Liang, Zusen Xu, Kim-Chuan Toh, Jia-Jie Zhu

The Halpern iteration for solving monotone inclusion problems has gained increasing interest in recent years due to its simple form and appealing convergence properties. In this paper, we investigate the inexact variants of the scheme in both deterministic and stochastic settings. We conduct extensive convergence analysis and show that by choosing the inexactness tolerances appropriately, the inexact schemes admit an $O(k^{-1})$ convergence rate in terms of the (expected) residue norm. Our results relax the state-of-the-art inexactness conditions employed in the literature while sharing the same competitive convergence properties. We then demonstrate how the proposed methods can be applied to solve two classes of data-driven Wasserstein distributionally robust optimization problems that admit convex-concave min-max optimization reformulations. We highlight the capability of the proposed methods to perform inexact computations for distributionally robust learning with stochastic first-order methods and for general nonlinear convex-concave loss functions, which are competitive in the literature.
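
For orientation, the exact Halpern iteration with the standard anchoring weights $\lambda_k = 1/(k+2)$ can be sketched as below. This is the classical scheme, not the inexact variants analyzed in the paper; an inexact version would replace `T(x)` with an approximate evaluation under a tolerance schedule that the abstract does not give. The 90-degree rotation used as the nonexpansive operator is a hypothetical test case.

```python
import math


def halpern(T, x0, iterations):
    """Exact Halpern iteration: x_{k+1} = x0 / (k + 2) + (k + 1) / (k + 2) * T(x_k)."""
    x = list(x0)
    for k in range(iterations):
        lam = 1.0 / (k + 2)
        tx = T(x)
        # Convex combination of the fixed anchor x0 and the operator output T(x_k).
        x = [lam * a + (1.0 - lam) * b for a, b in zip(x0, tx)]
    return x


def rotate90(p):
    """Rotation by 90 degrees in the plane: nonexpansive, with the origin as its only fixed point."""
    return [-p[1], p[0]]


def residual(T, x):
    """Fixed-point residual norm ||x - T(x)||."""
    return math.dist(x, T(x))
```

Starting from `x0 = [1.0, 0.0]`, the residual `residual(rotate90, halpern(rotate90, x0, k))` stays below the known bound $2\lVert x_0 - x^\ast \rVert / (k+1)$ for the exact scheme, which is the $O(k^{-1})$ behavior the abstract refers to. A plain fixed-point iteration `x = T(x)` on the same rotation would cycle forever without reducing the residual.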

Article 1180

Title@2025-05-27 (2): SoftPQ: Robust Instance Segmentation Evaluation via Soft Matching and Tunable Thresholds

Title: SoftPQ: Robust Instance Segmentation Evaluation via Soft Matching and Tunable Thresholds

SoftPQ: Robuste Instance Segmentierungsbewertung über Soft Matching und Tunable Thresholds

软PQ:通过软匹配和金枪鱼分量阈值进行强力实例分化评价 2505.12155v2

Authors: Ranit Karmakar, Simon F. Nørrelykke

Segmentation evaluation metrics traditionally rely on binary decision logic: predictions are either correct or incorrect, based on rigid IoU thresholds. Detection-based metrics such as F1 and mAP determine correctness at the object level using fixed overlap cutoffs, while overlap-based metrics like Intersection over Union (IoU) and Dice operate at the pixel level, often overlooking instance-level structure. Panoptic Quality (PQ) attempts to unify detection and segmentation assessment, but it remains dependent on hard-threshold matching, treating predictions below the threshold as entirely incorrect. This binary framing obscures important distinctions between qualitatively different errors and fails to reward gradual model improvements. We propose SoftPQ, a flexible and interpretable instance segmentation metric that redefines evaluation as a graded continuum rather than a binary classification. SoftPQ introduces tunable upper and lower IoU thresholds to define a partial matching region and applies a sublinear penalty function to ambiguous or fragmented predictions. These extensions allow SoftPQ to exhibit smoother score behavior, greater robustness to structural segmentation errors, and more informative feedback for model development and evaluation. Through controlled perturbation experiments, we show that SoftPQ captures meaningful differences in segmentation quality that existing metrics overlook, making it a practical and principled alternative for both benchmarking and iterative model refinement.
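
The hard-threshold behavior criticized above can be seen in a small sketch of standard Panoptic Quality, where instances are represented as sets of pixel indices. This is the baseline metric, not SoftPQ itself: the abstract does not give SoftPQ's penalty function, so it is not reproduced here. The function names and the set-based representation are assumptions of the sketch.

```python
def iou(a, b):
    """Intersection over Union of two pixel-index sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(preds, gts, threshold=0.5):
    """Standard PQ: sum of matched IoUs divided by (TP + 0.5 * FP + 0.5 * FN).

    A prediction matches a ground-truth instance only if IoU > threshold.
    Matching is greedy; with threshold >= 0.5 each instance has at most one match.
    """
    matched_gt = set()
    iou_sum = 0.0
    tp = 0
    for pred in preds:
        for j, gt in enumerate(gts):
            if j in matched_gt:
                continue
            score = iou(pred, gt)
            if score > threshold:
                matched_gt.add(j)
                iou_sum += score
                tp += 1
                break
    fp = len(preds) - tp
    fn = len(gts) - tp
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0
```

With a ten-pixel ground-truth instance, a prediction covering six of its pixels scores 0.6, while one covering five pixels has IoU exactly 0.5, fails the strict threshold, and scores 0. That cliff between two nearly identical predictions is the binary framing that a graded partial-matching region is meant to smooth out.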

Article 1181

Title@2025-05-27 (2): Real-Time Stress Monitoring, Detection, and Management in College Students: A Wearable Technology and Machine-Learning Approach

Title: Real-Time Stress Monitoring, Detection, and Management in College Students: A Wearable Technology and Machine-Learning Approach

Echtzeit-Stress-Monitoring, Detection und Management in College-Studenten: Ein Wearable-Technologie- und Machine-Learning-Ansatz

大学生实时应力监测、检测和管理:穿戴技术和机械学习方法 2505.15974v2

Authors: Alan Ta, Nilsu Salgin, Mustafa Demir, Kala Phillips Reindel, Ranjana K. Mehta, Anthony McDonald, Carly McCord, Farzan Sasangohar

College students are increasingly affected by stress, anxiety, and depression, yet face barriers to traditional mental health care. This study evaluated the efficacy of a mobile health (mHealth) intervention, Mental Health Evaluation and Lookout Program (mHELP), which integrates a smartwatch sensor and machine learning (ML) algorithms for real-time stress detection and self-management. In a 12-week randomized controlled trial (n = 117), participants were assigned to a treatment group using mHELP’s full suite of interventions or a control group using the app solely for real-time stress logging and weekly psychological assessments. The primary outcome, “Moments of Stress” (MS), was assessed via physiological and self-reported indicators and analyzed using Generalized Linear Mixed Models (GLMM) approaches. Similarly, secondary outcomes of psychological assessments, including the Generalized Anxiety Disorder-7 (GAD-7) for anxiety, the Patient Health Questionnaire (PHQ-8) for depression, and the Perceived Stress Scale (PSS), were also analyzed via GLMM. The finding of the objective measure, MS, indicates a substantial decrease in MS among the treatment group compared to the control group, while no notable between-group differences were observed in subjective scores of anxiety (GAD-7), depression (PHQ-8), or stress (PSS). However, the treatment group exhibited a clinically meaningful decline in GAD-7 and PSS scores. These findings underscore the potential of wearable-enabled mHealth tools to reduce acute stress in college populations and highlight the need for extended interventions and tailored features to address chronic symptoms like depression.

Article 1182

Title@2025-05-27 (2): LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers

Title: LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers

LLM-FE: Automatisiertes Feature Engineering für Tabellendaten mit LLMs als Evolutionsoptimierer

LLM-FE: 制表数据的自动地貌工程,LLMM作为进化优化器 2503.14434v2

Authors: Nikhil Abhyankar, Parshin Shojaee, Chandan K. Reddy

Automated feature engineering plays a critical role in improving predictive model performance for tabular learning tasks. Traditional automated feature engineering methods are limited by their reliance on pre-defined transformations within fixed, manually designed search spaces, often neglecting domain knowledge. Recent advances using Large Language Models (LLMs) have enabled the integration of domain knowledge into the feature engineering process. However, existing LLM-based approaches use direct prompting or rely solely on validation scores for feature selection, failing to leverage insights from prior feature discovery experiments or establish meaningful reasoning between feature generation and data-driven performance. To address these challenges, we propose LLM-FE, a novel framework that combines evolutionary search with the domain knowledge and reasoning capabilities of LLMs to automatically discover effective features for tabular learning tasks. LLM-FE formulates feature engineering as a program search problem, where LLMs propose new feature transformation programs iteratively, and data-driven feedback guides the search process. Our results demonstrate that LLM-FE consistently outperforms state-of-the-art baselines, significantly enhancing the performance of tabular prediction models across diverse classification and regression benchmarks.

Article 1183

Title@2025-05-27 (2): PhySense: Sensor Placement Optimization for Accurate Physics Sensing

Title: PhySense: Sensor Placement Optimization for Accurate Physics Sensing

PhySense: Sensor-Platzierungs-Optimierung für präzise Physik Sensing

感应:精确物理遥感传感器定位优化 2505.18190v2

Authors: Yuezhou Ma, Haixu Wu, Hang Zhou, Huikun Weng, Jianmin Wang, Mingsheng Long

Physics sensing plays a central role in many scientific and engineering domains, which inherently involves two coupled tasks: reconstructing dense physical fields from sparse observations and optimizing scattered sensor placements to observe maximum information. While deep learning has made rapid advances in sparse-data reconstruction, existing methods generally omit optimization of sensor placements, leaving the mutual enhancement between reconstruction and placement on the shelf. To change this suboptimal practice, we propose PhySense, a synergistic two-stage framework that learns to jointly reconstruct physical fields and to optimize sensor placements, both aiming for accurate physics sensing. The first stage involves a flow-based generative model enhanced by cross-attention to adaptively fuse sparse observations. Leveraging the reconstruction feedback, the second stage performs sensor placement via projected gradient descent to satisfy spatial constraints. We further prove that the learning objectives of the two stages are consistent with classical variance-minimization principles, providing theoretical guarantees. Extensive experiments across three challenging benchmarks, especially a 3D geometry dataset, indicate PhySense achieves state-of-the-art physics sensing accuracy and discovers informative sensor placements previously unconsidered.
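
The second-stage mechanism, projected gradient descent under spatial constraints, can be sketched generically. Here the feasible region is a box and the objective is a toy quadratic pulling a sensor toward a hypothetical target location. PhySense's actual objective comes from reconstruction feedback and is not reproduced; all names below are assumptions of this sketch.

```python
def clip(value, lo, hi):
    """Project a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, value))


def projected_gradient_descent(grad, p0, lo, hi, lr=0.1, steps=200):
    """Take a gradient step, then project each coordinate back onto the box [lo, hi]."""
    p = list(p0)
    for _ in range(steps):
        g = grad(p)
        p = [clip(pi - lr * gi, lo, hi) for pi, gi in zip(p, g)]
    return p
```

For example, with `grad = lambda p: [2 * (p[0] - 1.5), 2 * (p[1] - 0.3)]` and the unit box, the iterate settles at `[1.0, 0.3]`: the first coordinate is held at the boundary because the unconstrained optimum lies outside the feasible region, while the second converges freely.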

Article 1184

Title@2025-05-27 (2): Intelligent Incident Hypertension Prediction in Obstructive Sleep Apnea

Title: Intelligent Incident Hypertension Prediction in Obstructive Sleep Apnea

Intelligente Hypertonie-Vorhersage bei obstruktiver Schlafapnoe

阻碍睡眠的智能性事件超强度预测 2505.20615v1

Authors: Omid Halimi Milani, Ahmet Enis Cetin, Bharati Prasad

Obstructive sleep apnea (OSA) is a significant risk factor for hypertension, primarily due to intermittent hypoxia and sleep fragmentation. Predicting whether individuals with OSA will develop hypertension within five years remains a complex challenge. This study introduces a novel deep learning approach that integrates Discrete Cosine Transform (DCT)-based transfer learning to enhance prediction accuracy. We are the first to incorporate all polysomnography signals together for hypertension prediction, leveraging their collective information to improve model performance. Features were extracted from these signals and transformed into a 2D representation to utilize pre-trained 2D neural networks such as MobileNet, EfficientNet, and ResNet variants. To further improve feature learning, we introduced a DCT layer, which transforms input features into a frequency-based representation, preserving essential spectral information, decorrelating features, and enhancing robustness to noise. This frequency-domain approach, coupled with transfer learning, is especially beneficial for limited medical datasets, as it leverages rich representations from pre-trained networks to improve generalization. By strategically placing the DCT layer at deeper truncation depths within EfficientNet, our model achieved a best area under the curve (AUC) of 72.88%, demonstrating the effectiveness of frequency-domain feature extraction and transfer learning in predicting hypertension risk in OSA patients over a five-year period.
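
The frequency-domain transform behind a DCT layer can be illustrated with the unnormalized one-dimensional DCT-II. This sketch shows only the transform; where the layer sits inside EfficientNet and how it is normalized or trained are not described in the abstract and are not modeled here. The function name is hypothetical.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 0.5) * k / N)."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (n + 0.5) * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]
```

A constant input places all of its energy in the first (DC) coefficient and leaves the others at zero up to rounding, which shows how the transform compacts smooth feature vectors into a few low-frequency coefficients.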

Article 1185

Title@2025-05-27 (2): A Concentration Bound for TD(0) with Function Approximation

Title: A Concentration Bound for TD(0) with Function Approximation

Ein Konzentrationsbund für TD(0) mit Funktionsannäherung

具有函数接近度的 TD(0) 的浓度界值 2312.10424v3

Authors: Siddharth Chandak, Vivek S. Borkar

We derive a concentration bound of the type "for all $n \geq n_0$ for some $n_0$" for TD(0) with linear function approximation. We work with online TD learning with samples from a single sample path of the underlying Markov chain. This makes our analysis significantly different from offline TD learning or TD learning with access to independent samples from the stationary distribution of the Markov chain. We treat TD(0) as a contractive stochastic approximation algorithm with both martingale and Markov noise. Markov noise is handled using the Poisson equation, and the lack of almost sure guarantees on boundedness of iterates is handled using the concept of relaxed concentration inequalities.
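
The algorithm under analysis, online TD(0) with linear function approximation along a single sample path, can be sketched as follows. The interface (a `next_state` sampler and a `step_size` schedule) is a hypothetical choice made for this sketch; the concentration bound itself is not reproduced.

```python
def td0_linear(features, rewards, next_state, gamma, steps, step_size, start=0):
    """Online TD(0) with the linear value estimate V(s) = theta . features[s].

    Each update uses one transition from a single trajectory of the chain.
    """
    theta = [0.0] * len(features[0])
    s = start
    for n in range(steps):
        s_next = next_state(s)  # one transition along the sample path
        v_s = sum(t * f for t, f in zip(theta, features[s]))
        v_next = sum(t * f for t, f in zip(theta, features[s_next]))
        td_error = rewards[s] + gamma * v_next - v_s
        a_n = step_size(n)
        # Semi-gradient update: move theta along the feature vector of the current state.
        theta = [t + a_n * td_error * f for t, f in zip(theta, features[s])]
        s = s_next
    return theta
```

On a two-state chain that deterministically alternates between its states, with one-hot features, rewards `[1.0, 0.0]`, discount 0.9, and a constant step size of 0.1, the iterates approach the true values $1/0.19$ and $0.9/0.19$. With a random `next_state`, the same loop exhibits the martingale and Markov noise that the analysis has to control.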

Article 1186

Title@2025-05-27 (2): REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning

Title: REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning

REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning

实际检索: 数学理由的回收增量精液预言 2505.20613v1

Authors: Ziju Shen, Naohao Huang, Fanyi Yang, Yutong Wang, Guoxiong Gao, Tianyi Xu, Jiedong Jiang, Wanyi He, Pu Yang, Mengzhou Sun, Haocheng Ju, Peihao Wu, Bryan Dai, Bin Dong

Formal theorem provers have recently made monumental progress on high-school and competition-level mathematics, but few of them generalize to more advanced mathematics. In this paper, we present REAL-Prover, a new open-source stepwise theorem prover for Lean 4 to push this boundary. This prover, based on our fine-tuned large language model (REAL-Prover-v1) and integrated with a retrieval system (Leansearch-PS), notably boosts performance on solving college-level mathematics problems. To train REAL-Prover-v1, we developed HERALD-AF, a data extraction pipeline that converts natural language math problems into formal statements, and a new open-source Lean 4 interactive environment (Jixia-interactive) to facilitate synthesis data collection. In our experiments, our prover, using only supervised fine-tuning, achieves competitive results with a 23.7% success rate (Pass@64) on the ProofNet dataset, comparable to state-of-the-art (SOTA) models. To further evaluate our approach, we introduce FATE-M, a new benchmark focused on algebraic problems, where our prover achieves a SOTA success rate of 56.7% (Pass@64).

Article 1187

Title@2025-05-27 (2): Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models

Title: Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models

Roboflow100-VL: Ein Multi-Domain-Objekterkennungs-Benchmark für Vision-Language-Modelle

机器人流100-VL:愿景-语言模型多功能物体探测基准 2505.20612v1

Authors: Peter Robicheaux, Matvei Popov, Anish Madan, Isaac Robinson, Joseph Nelson, Deva Ramanan, Neehar Peri

Vision-language models (VLMs) trained on internet-scale data achieve remarkable zero-shot detection performance on common objects like car, truck, and pedestrian. However, state-of-the-art models still struggle to generalize to out-of-distribution classes, tasks and imaging modalities not typically found in their pre-training. Rather than simply re-training VLMs on more visual data, we argue that one should align VLMs to new concepts with annotation instructions containing a few visual examples and rich textual descriptions. To this end, we introduce Roboflow100-VL, a large-scale collection of 100 multi-modal object detection datasets with diverse concepts not commonly found in VLM pre-training. We evaluate state-of-the-art models on our benchmark in zero-shot, few-shot, semi-supervised, and fully-supervised settings, allowing for comparison across data regimes. Notably, we find that VLMs like GroundingDINO and Qwen2.5-VL achieve less than 2% zero-shot accuracy on challenging medical imaging datasets within Roboflow100-VL, demonstrating the need for few-shot concept alignment. Our code and dataset are available at https://github.com/roboflow/rf100-vl/ and https://universe.roboflow.com/rf100-vl/

Article 1188

Title@2025-05-27 (2): Hierarchical Mamba Meets Hyperbolic Geometry: A New Paradigm for Structured Language Embeddings

Title: Hierarchical Mamba Meets Hyperbolic Geometry: A New Paradigm for Structured Language Embeddings

Hierarchische Mamba trifft auf Hyperbolische Geometrie: Ein neues Paradigma für strukturierte Spracheinbettungen

等级式 Mamba 相遇超双曲几何: 结构化语言嵌入的新范式 2505.18973v2

Authors: Sarang Patil, Ashish Parmanand Pandey, Ioannis Koutis, Mengjia Xu

Selective state-space models have achieved great success in long-sequence modeling. However, their capacity for language representation, especially in complex hierarchical reasoning tasks, remains underexplored. Most large language models rely on flat Euclidean embeddings, limiting their ability to capture latent hierarchies. To address this limitation, we propose Hierarchical Mamba (HiM), which integrates the efficient Mamba2 with the exponential growth and curved nature of hyperbolic geometry to learn hierarchy-aware language embeddings for deeper linguistic understanding. Mamba2-processed sequences are projected to the Poincare ball (via tangent-based mapping) or the Lorentzian manifold (via cosine and sine-based mapping) with learnable curvature, optimized with a combined hyperbolic loss. Our HiM model facilitates the capture of relational distances across varying hierarchical levels, enabling effective long-range reasoning. This makes it well-suited for tasks like mixed-hop prediction and multi-hop inference in hierarchical classification. We evaluated HiM on four linguistic and medical datasets for mixed-hop prediction and multi-hop inference tasks. Experimental results demonstrated that: 1) Both HiM models effectively capture hierarchical relationships for four ontological datasets, surpassing Euclidean baselines. 2) HiM-Poincare captures fine-grained semantic distinctions with higher h-norms, while HiM-Lorentz provides more stable, compact, and hierarchy-preserving embeddings favoring robustness over detail.
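
To make the "tangent-based mapping" onto the Poincare ball concrete, here is a minimal sketch of the standard exponential map at the origin and the induced geodesic distance for curvature -c. It is a generic textbook construction, not the HiM code; the function names and the fixed curvature value are assumptions of this example (the abstract describes the curvature as learnable).

```python
import math

def expmap0(v, c=1.0):
    """Map a Euclidean (tangent) vector v at the origin onto the Poincare
    ball of curvature -c."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]

def poincare_dist(x, y, c=1.0):
    """Geodesic distance between two points inside the ball of curvature -c."""
    sq = lambda u: sum(a * a for a in u)
    diff = [a - b for a, b in zip(x, y)]
    arg = 1.0 + 2.0 * c * sq(diff) / ((1.0 - c * sq(x)) * (1.0 - c * sq(y)))
    return math.acosh(arg) / math.sqrt(c)

p = expmap0([0.3, 0.4])          # tangent vector of Euclidean norm 0.5
d = poincare_dist([0.0, 0.0], p)
```

Distances grow without bound as points approach the boundary of the ball, which is the property that lets a low-dimensional hyperbolic space hold tree-like hierarchies with little distortion.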

Article 1189

Title@2025-05-27 (2): Integral Imprecise Probability Metrics

Title: Integral Imprecise Probability Metrics

Integral Ungenaue Wahrscheinlichkeits-Metriken

综合综合不全性障碍概率概率度量 2505.16156v2

Authors: Siu Lun Chau, Michele Caprio, Krikamol Muandet

Quantifying differences between probability distributions is fundamental to statistics and machine learning, primarily for comparing statistical uncertainty. In contrast, epistemic uncertainty (EU) – due to incomplete knowledge – requires richer representations than those offered by classical probability. Imprecise probability (IP) theory offers such models, capturing ambiguity and partial belief. This has driven growing interest in imprecise probabilistic machine learning (IPML), where inference and decision-making rely on broader uncertainty models – highlighting the need for metrics beyond classical probability. This work introduces the Integral Imprecise Probability Metric (IIPM) framework, a Choquet integral-based generalisation of the classical Integral Probability Metric (IPM) to the setting of capacities – a broad class of IP models encompassing many existing ones, including lower probabilities, probability intervals, belief functions, and more. Theoretically, we establish conditions under which IIPM serves as a valid metric and metrises a form of weak convergence of capacities. Practically, IIPM not only enables comparison across different IP models but also supports the quantification of epistemic uncertainty within a single IP model. In particular, by comparing an IP model with its conjugate, IIPM gives rise to a new class of EU measures – Maximum Mean Imprecision (MMI) – which satisfy key axiomatic properties proposed in the Uncertainty Quantification literature. We validate MMI through selective classification experiments, demonstrating strong empirical performance against established EU measures, and outperforming them when classical methods struggle to scale to a large number of classes. Our work advances both theory and practice in IPML, offering a principled framework for comparing and quantifying epistemic uncertainty under imprecision.
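
The Choquet integral that underlies the framework can be illustrated on a finite set. The sketch below integrates a non-negative function against a capacity, builds a lower probability from a small hypothetical credal set, and compares it with its conjugate (upper probability) for one test function. It only illustrates the ingredients named in the abstract; the helper names and numbers are assumptions of this example, and the paper's IIPM and MMI definitions involve a supremum over a function class that is not reproduced here.

```python
def choquet(f, capacity):
    """Choquet integral of a non-negative function f (dict: element -> value)
    with respect to a capacity (callable on frozensets)."""
    order = sorted(f, key=f.get, reverse=True)
    total, upper = 0.0, set()
    for i, x in enumerate(order):
        upper.add(x)  # upper level set {y : f(y) >= f(x)}
        nxt = f[order[i + 1]] if i + 1 < len(order) else 0.0
        total += (f[x] - nxt) * capacity(frozenset(upper))
    return total

def lower_prob(measures):
    """Lower envelope of a finite set of probability measures (dicts)."""
    return lambda A: min(sum(p[x] for x in A) for p in measures)

def conjugate(capacity, universe):
    """Conjugate capacity: A -> 1 - capacity(complement of A)."""
    return lambda A: 1.0 - capacity(frozenset(universe) - A)

# Hypothetical credal set on the two-element universe {"a", "b"}.
P1 = {"a": 0.5, "b": 0.5}
P2 = {"a": 0.8, "b": 0.2}
low = lower_prob([P1, P2])
up = conjugate(low, {"a", "b"})
f = {"a": 1.0, "b": 0.0}
gap = choquet(f, up) - choquet(f, low)
```

For an additive measure the Choquet integral reduces to the ordinary expectation, so the gap between the conjugate pair is zero exactly when the model is a single precise probability.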

Article 1190

Title@2025-05-27 (2): Improving Generative Inverse Design of Rectangular Patch Antennas with Test Time Optimization

Title: Improving Generative Inverse Design of Rectangular Patch Antennas with Test Time Optimization

Verbesserung des generativen Inversen Designs von rechteckigen Patchantennen mit Testzeitoptimierung

改进带测试时间优化的矩形补边天线的生成反向设计 2505.18188v2

Authors: Beck LaBash, Shahriar Khushrushahi, Fabian Ruehle

We propose a two-stage deep learning framework for the inverse design of rectangular patch antennas. Our approach leverages generative modeling to learn a latent representation of antenna frequency response curves and conditions a subsequent generative model on these responses to produce feasible antenna geometries. We further demonstrate that leveraging search and optimization techniques at test-time improves the accuracy of the generated designs and enables consideration of auxiliary objectives such as manufacturability. Our approach generalizes naturally to different design criteria, and can be easily adapted to more complex geometric design spaces.

Article 1191

Title@2025-05-27 (2): InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling

Title: InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling

InstGenIE: Generative Bildbearbeitung mit Mask-aware Caching und Scheduling effizient gemacht

InstGenie: 生成图像编辑, 高效使用防面具图像缓冲和排程 2505.20600v1

Authors: Xiaoxiao Jiang, Suyi Li, Lingyun Yang, Tianyu Feng, Zhipeng Di, Weiyi Lu, Guoxuan Zhu, Xiu Lin, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang

Generative image editing using diffusion models has become a prevalent application in today’s AI cloud services. In production environments, image editing typically involves a mask that specifies the regions of an image template to be edited. The use of masks provides direct control over the editing process and introduces sparsity in the model inference. In this paper, we present InstGenIE, a system that efficiently serves image editing requests. The key insight behind InstGenIE is that image editing only modifies the masked regions of image templates while preserving the original content in the unmasked areas. Driven by this insight, InstGenIE judiciously skips redundant computations associated with the unmasked areas by reusing cached intermediate activations from previous inferences. To mitigate the high cache loading overhead, InstGenIE employs a bubble-free pipeline scheme that overlaps computation with cache loading. Additionally, to reduce queuing latency in online serving while improving the GPU utilization, InstGenIE proposes a novel continuous batching strategy for diffusion model serving, allowing newly arrived requests to join the running batch in just one step of denoising computation, without waiting for the entire batch to complete. As heterogeneous masks induce imbalanced loads, InstGenIE also develops a load balancing strategy that takes into account the loads of both computation and cache loading. Collectively, InstGenIE outperforms state-of-the-art diffusion serving systems for image editing, achieving up to 3x higher throughput and reducing average request latency by up to 14.7x while ensuring image quality.

Article 1192

Title@2025-05-27 (2): Randomly Sampled Language Reasoning Problems Explain Limits of LLMs

Title: Randomly Sampled Language Reasoning Problems Explain Limits of LLMs

Zufällig gemusterte Sprachbegründungsprobleme erklären Grenzen von LLMs

随机抽样语言原因问题解释LLMM限制 2501.02825v5

Authors: Kavi Gupta, Kate Sanders, Armando Solar-Lezama

While LLMs have revolutionized the field of machine learning due to their high performance across a range of tasks, they are known to perform poorly in planning, hallucinate false answers, have degraded performance on less canonical versions of the same task, and answer incorrectly on a variety of specific prompts. There are several emerging theories of LLM performance with some predictive power, among them that LLMs lack world modeling ability, that they have an undesirable bias towards an autoregressive prior, and that they perform less well on more novel problems. The existing literature on novelty has focused on tasks of relatively high complexity, studying perturbations of canonical but complex problems. In this paper, we attempt to isolate novelty as a factor in LLM underperformance. To this end, we consider an extremely simple domain: next token prediction on simple language tasks. The twist is that these language tasks are unseen, as they are randomly drawn from a large, parsimoniously defined set of languages arising from simple grammar rules. This allows us to isolate the effect of task novelty and see if it is sufficient to explain low performance. We find that LLMs uniformly underperform n-gram models (which do not have the capacity for world modeling) on these tasks, both when used as next token predictors and as reasoners.
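
For readers unfamiliar with the baseline mentioned above, a count-based n-gram next-token predictor can be sketched in a few lines. This is a generic illustration, not the paper's evaluation code; the toy sequence and function names are assumptions of this example.

```python
from collections import Counter, defaultdict

def train_ngram(tokens, n=2):
    """Count how often each token follows each (n-1)-token context."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts

def predict_next(counts, context):
    """Return the most frequent continuation of the context, or None if the
    context was never seen in training."""
    options = counts.get(tuple(context))
    return options.most_common(1)[0][0] if options else None

# Toy sample from a hypothetical language that alternates "a" and "b".
model = train_ngram(list("abababab"), n=2)
guess = predict_next(model, ["a"])
```

Such a model has no internal state beyond its count table, which is why the abstract describes n-gram models as lacking any capacity for world modeling.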

Article 1193

Title@2025-05-26 (1): GenMol: A Drug Discovery Generalist with Discrete Diffusion

Title: GenMol: A Drug Discovery Generalist with Discrete Diffusion

GenMol: Ein Drug Discovery Generalist mit diskreter Diffusion

GenMol: 具有分辨扩散作用的药物发现通俗主义者 2501.06158v2

Authors: Seul Lee, Karsten Kreis, Srimukh Prasad Veccham, Meng Liu, Danny Reidenbach, Yuxing Peng, Saee Paliwal, Weili Nie, Arash Vahdat

Drug discovery is a complex process that involves multiple stages and tasks. However, existing molecular generative models can only tackle some of these tasks. We present Generalist Molecular generative model (GenMol), a versatile framework that uses only a single discrete diffusion model to handle diverse drug discovery scenarios. GenMol generates Sequential Attachment-based Fragment Embedding (SAFE) sequences through non-autoregressive bidirectional parallel decoding, thereby allowing the utilization of a molecular context that does not rely on the specific token ordering while having better sampling efficiency. GenMol uses fragments as basic building blocks for molecules and introduces fragment remasking, a strategy that optimizes molecules by regenerating masked fragments, enabling effective exploration of chemical space. We further propose molecular context guidance (MCG), a guidance method tailored for masked discrete diffusion of GenMol. GenMol significantly outperforms the previous GPT-based model in de novo generation and fragment-constrained generation, and achieves state-of-the-art performance in goal-directed hit generation and lead optimization. These results demonstrate that GenMol can tackle a wide range of drug discovery tasks, providing a unified and versatile approach for molecular design.

Article 1194

Title@2025-05-26 (1): Prot2Token: A Unified Framework for Protein Modeling via Next-Token Prediction

Title: Prot2Token: A Unified Framework for Protein Modeling via Next-Token Prediction

Prot2Token: Ein einheitliches Framework für Proteinmodellierung über Next-Token-Vorhersage

Prot2Token:通过次声预测建立蛋白模型的统一框架 2505.20589v1

Authors: Mahdi Pourmirzaei, Farzaneh Esmaili, Salhuldin Alqarghuli, Mohammadreza Pourmirzaei, Ye Han, Kai Chen, Mohsen Rezaei, Duolin Wang, Dong Xu

The diverse nature of protein prediction tasks has traditionally necessitated specialized models, hindering the development of broadly applicable and computationally efficient Protein Language Models (PLMs). In this work, we introduce Prot2Token, a unified framework that overcomes these challenges by converting a wide spectrum of protein-related predictions, from sequence-level properties and residue-specific attributes to complex inter-protein interactions, into a standardized next-token prediction format. At its core, Prot2Token employs an autoregressive decoder, conditioned on embeddings from pre-trained protein encoders and guided by learnable task tokens, to perform diverse predictions. This architecture uniquely facilitates multi-task learning, enabling a single model to master numerous tasks with improved efficiency. We present extensive experimental validation across a variety of benchmarks, demonstrating Prot2Token's strong predictive power in different types of protein-prediction tasks. Key results include significant speedups (e.g., nearly 1000x over AlphaFold2 with MSA) and performance often matching or exceeding specialized approaches. Beyond that, we introduce an auxiliary self-supervised decoder pre-training approach to improve spatially sensitive task performance. Prot2Token thus offers a significant step towards a versatile, high-throughput paradigm for protein modeling, promising to accelerate biological discovery and the development of novel therapeutics. The code is available at https://github.com/mahdip72/prot2token .

Article 1195

Title@2025-05-26 (1): Bidirectional Variational Autoencoders

Title: Bidirectional Variational Autoencoders

Bidirektionale Variationale Autoencoder

双向多向自动自动编码器 2505.16074v2

Authors: Bart Kosko, Olaoluwa Adigun

We present the new bidirectional variational autoencoder (BVAE) network architecture. The BVAE uses a single neural network both to encode and decode instead of an encoder-decoder network pair. The network encodes in the forward direction and decodes in the backward direction through the same synaptic web. Simulations compared BVAEs and ordinary VAEs on the four image tasks of image reconstruction, classification, interpolation, and generation. The image datasets included MNIST handwritten digits, Fashion-MNIST, CIFAR-10, and CelebA-64 face images. The bidirectional structure of BVAEs cut the parameter count by almost 50% and still slightly outperformed the unidirectional VAEs.

Article 1196

Title@2025-05-26 (1): Balancing Performance and Costs in Best Arm Identification

Title: Balancing Performance and Costs in Best Arm Identification

Ausgewogene Leistung und Kosten bei der Ermittlung der besten Waffen

平衡最佳武器识别的性能和费用 2505.20583v1

Authors: Michael O. Harding, Kirthevasan Kandasamy

We consider the problem of identifying the best arm in a multi-armed bandit model. Despite a wealth of literature in the traditional fixed budget and fixed confidence regimes of the best arm identification problem, it still remains a mystery to most practitioners as to how to choose an approach and corresponding budget or confidence parameter. We propose a new formalism to avoid this dilemma altogether by minimizing a risk functional which explicitly balances the performance of the recommended arm and the cost incurred by learning this arm. In this framework, a cost is incurred for each observation during the sampling phase, and upon recommending an arm, a performance penalty is incurred for identifying a suboptimal arm. The learner’s goal is to minimize the sum of the penalty and cost. This new regime mirrors the priorities of many practitioners, e.g. maximizing profit in an A/B testing framework, better than classical fixed budget or confidence settings. We derive theoretical lower bounds for the risk of each of two choices for the performance penalty, the probability of misidentification and the simple regret, and propose an algorithm called DBCARE to match these lower bounds up to polylog factors on nearly all problem instances. We then demonstrate the performance of DBCARE on a number of simulated models, comparing to fixed budget and confidence algorithms to show the shortfalls of existing BAI paradigms on this problem.
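
One way to write down the risk described above, as a sketch only: the notation is introduced here for illustration (per-observation cost $c > 0$, random number of samples $\tau$, recommended arm $\hat{a}$, best arm $a^\star$, arm means $\mu_a$) and the paper's exact definitions and scaling may differ.

```latex
% Illustrative notation, not taken from the paper:
% c > 0 is the cost per observation, \tau the number of samples drawn,
% \hat{a} the recommended arm, a^\star the best arm, \mu_a the mean of arm a.
\begin{align*}
  \mathcal{R}_{\mathrm{misid}}  &= \Pr\bigl(\hat{a} \neq a^\star\bigr) + c\,\mathbb{E}[\tau], \\
  \mathcal{R}_{\mathrm{regret}} &= \mathbb{E}\bigl[\mu_{a^\star} - \mu_{\hat{a}}\bigr] + c\,\mathbb{E}[\tau].
\end{align*}
```

The first line uses the probability of misidentification as the performance penalty and the second uses the simple regret; in both, sampling longer lowers the penalty term but raises the cost term, which is the trade-off the learner must balance.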

Article 1197

Title@2025-05-26 (1): Training a Generally Curious Agent

Title: Training a Generally Curious Agent

Ein allgemein neugieriger Agent ausbilden

训练一般好奇剂 2502.17543v3

Authors: Fahim Tajwar, Yiding Jiang, Abitha Thankaraj, Sumaita Sadia Rahman, J Zico Kolter, Jeff Schneider, Ruslan Salakhutdinov

Efficient exploration is essential for intelligent systems interacting with their environment, but existing language models often fall short in scenarios that require strategic information gathering. In this paper, we present Paprika, a fine-tuning approach that enables language models to develop general decision-making capabilities that are not confined to particular environments. By training on synthetic interaction data from different tasks that require diverse strategies, Paprika teaches models to explore and adapt their behavior on a new task based on environment feedback in-context without more gradient updates. Experimental results show that models fine-tuned with Paprika can effectively transfer their learned decision-making capabilities to entirely unseen tasks without additional training. Unlike traditional training, our approach’s primary bottleneck lies in sampling useful interaction data instead of model updates. To improve sample efficiency, we propose a curriculum learning strategy that prioritizes sampling trajectories from tasks with high learning potential. These results suggest a promising path towards AI systems that can autonomously solve novel sequential decision-making problems that require interactions with the external world.

Article 1198

Title@2025-05-26 (1): Ctrl-DNA: Controllable Cell-Type-Specific Regulatory DNA Design via Constrained RL

Title: Ctrl-DNA: Controllable Cell-Type-Specific Regulatory DNA Design via Constrained RL

Strg-DNA: Kontrollierbare Zell-Typ-spezifische Regulatorische DNA-Design über eingeschränkte RL

Ctrl-DNA:通过受控RL设计可控细胞-Type-具体监管DNA 2505.20578v1

Authors: Xingyu Chen, Shihao Ma, Runsheng Lin, Jiecong Lin, Bo Wang

Designing regulatory DNA sequences that achieve precise cell-type-specific gene expression is crucial for advancements in synthetic biology, gene therapy and precision medicine. Although transformer-based language models (LMs) can effectively capture patterns in regulatory DNA, their generative approaches often struggle to produce novel sequences with reliable cell-specific activity. Here, we introduce Ctrl-DNA, a novel constrained reinforcement learning (RL) framework tailored for designing regulatory DNA sequences with controllable cell-type specificity. By formulating regulatory sequence design as a biologically informed constrained optimization problem, we apply RL to autoregressive genomic LMs, enabling the models to iteratively refine sequences that maximize regulatory activity in targeted cell types while constraining off-target effects. Our evaluation on human promoters and enhancers demonstrates that Ctrl-DNA consistently outperforms existing generative and RL-based approaches, generating high-fitness regulatory sequences and achieving state-of-the-art cell-type specificity. Moreover, Ctrl-DNA-generated sequences capture key cell-type-specific transcription factor binding sites (TFBS), short DNA motifs recognized by regulatory proteins that control gene expression, demonstrating the biological plausibility of the generated sequences.

Article 1199

Title@2025-05-26 (1): Emotion Classification In-Context in Spanish

Title: Emotion Classification In-Context in Spanish

Emotion Classification In-Context auf Spanisch

西班牙文《情感分类西班牙文内引文》 2505.20571v1

Authors: Bipul Thapa, Gabriel Cofre

Classifying customer feedback into distinct emotion categories is essential for understanding sentiment and improving customer experience. In this paper, we classify customer feedback in Spanish into three emotion categories (positive, neutral, and negative) using advanced NLP and ML techniques. Traditional methods translate feedback from widely spoken languages to less common ones, resulting in a loss of semantic integrity and contextual nuances inherent to the original language. To address this limitation, we propose a hybrid approach that combines TF-IDF with BERT embeddings, effectively transforming Spanish text into rich numerical representations that preserve the semantic depth of the original language by using a Custom Stacking Ensemble (CSE) approach. To evaluate emotion classification, we utilize a range of models, including Logistic Regression, KNN, Bagging classifier with LGBM, and AdaBoost. The CSE model combines these classifiers as base models and uses a one-vs-all Logistic Regression as the meta-model. Our experimental results demonstrate that CSE significantly outperforms the individual classifiers and the BERT model, achieving a test accuracy of 93.3% on the native Spanish dataset, higher than the accuracy obtained from the translated version. These findings underscore the challenges of emotion classification in Spanish and highlight the advantages of combining vectorization techniques like TF-IDF with BERT for improved accuracy. Our results provide valuable insights for businesses seeking to leverage emotion classification to enhance customer feedback analysis and service improvements.

Article 1200

Title@2025-05-26 (1): Bi-Level Unsupervised Feature Selection

Title: Bi-Level Unsupervised Feature Selection

Bi-Level-Unüberwachte Feature-Auswahl

双级不受监督的地物选择 2505.20563v1

Authors: Jingjing Liu, Xiansen Ju, Xianchao Xiu, Wanquan Liu

Unsupervised feature selection (UFS) is an important task in data engineering. However, most UFS methods construct models from a single perspective and often fail to simultaneously evaluate feature importance and preserve their inherent data structure, thus limiting their performance. To address this challenge, we propose a novel bi-level unsupervised feature selection (BLUFS) method, including a clustering level and a feature level. Specifically, at the clustering level, spectral clustering is used to generate pseudo-labels for representing the data structure, while a continuous linear regression model is developed to learn the projection matrix. At the feature level, the $\ell_{2,0}$-norm constraint is imposed on the projection matrix for more effectively selecting features. To the best of our knowledge, this is the first work to combine a bi-level framework with the $\ell_{2,0}$-norm. To solve the proposed bi-level model, we design an efficient proximal alternating minimization (PAM) algorithm, whose subproblems either have explicit solutions or can be computed by fast solvers. Furthermore, we establish the convergence result and computational complexity. Finally, extensive experiments on two synthetic datasets and eight real datasets demonstrate the superiority of BLUFS in clustering and classification tasks.
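
The $\ell_{2,0}$-norm of a projection matrix counts its non-zero rows, so constraining it to at most k keeps at most k features. The sketch below computes the norm and applies the usual row-wise hard-thresholding projection; whether the paper's PAM subproblem uses exactly this projection is not stated in the abstract, so treat it as an assumption, along with the function names and toy matrix.

```python
import math

def row_norms(W):
    """Euclidean norm of each row of W (list of lists)."""
    return [math.sqrt(sum(x * x for x in row)) for row in W]

def l20_norm(W, tol=1e-12):
    """Number of rows of W whose Euclidean norm is non-zero."""
    return sum(1 for nrm in row_norms(W) if nrm > tol)

def project_l20(W, k):
    """Keep the k rows with the largest norms and zero out the rest."""
    norms = row_norms(W)
    keep = set(sorted(range(len(W)), key=lambda i: norms[i], reverse=True)[:k])
    return [list(row) if i in keep else [0.0] * len(row)
            for i, row in enumerate(W)]

# Hypothetical 3-feature projection matrix; each row corresponds to a feature.
W = [[3.0, 4.0], [0.1, 0.0], [0.0, 2.0]]
W_sparse = project_l20(W, k=2)
selected = [i for i, nrm in enumerate(row_norms(W_sparse)) if nrm > 0.0]
```

Because whole rows are zeroed, the surviving row indices directly give the selected features, which is what makes this constraint a natural fit for feature selection.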

Article 1201

Title@2025-05-26 (1): Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning

Title: Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning

Jenseits von Markovian: Reflektierende Exploration über Bayes-Adaptive RL für LLM-Reasoning

马尔科维安之后:通过Bayes-Adapative RL进行反射勘探,用于LLM 理由分析 2505.20561v1

Authors: Shenao Zhang, Yaqing Wang, Yinxiao Liu, Tianqi Liu, Peter Grabowski, Eugene Ie, Zhaoran Wang, Yunxuan Li

Large Language Models (LLMs) trained via Reinforcement Learning (RL) have exhibited strong reasoning capabilities and emergent reflective behaviors, such as backtracking and error correction. However, conventional Markovian RL confines exploration to the training phase to learn an optimal deterministic policy and depends on the history contexts only through the current state. Therefore, it remains unclear whether reflective reasoning will emerge during Markovian RL training, or why it is beneficial at test time. To remedy this, we recast reflective exploration within the Bayes-Adaptive RL framework, which explicitly optimizes the expected return under a posterior distribution over Markov decision processes. This Bayesian formulation inherently incentivizes both reward-maximizing exploitation and information-gathering exploration via belief updates. Our resulting algorithm, BARL, instructs the LLM to stitch and switch strategies based on the observed outcomes, offering principled guidance on when and how the model should reflectively explore. Empirical results on both synthetic and mathematical reasoning tasks demonstrate that BARL outperforms standard Markovian RL approaches at test time, achieving superior token efficiency with improved exploration effectiveness. Our code is available at https://github.com/shenao-zhang/BARL.
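
The belief update at the heart of a Bayes-adaptive formulation can be shown with a finite set of candidate hypotheses. The sketch below is a generic Bayes-rule update followed by a posterior-weighted expected return; it is not the BARL algorithm, and the hypothesis labels, likelihoods, and returns are made-up values for illustration.

```python
def update_belief(belief, likelihoods):
    """Bayes rule over a finite hypothesis set.
    belief: dict hypothesis -> prior probability
    likelihoods: dict hypothesis -> P(observation | hypothesis)"""
    unnorm = {h: belief[h] * likelihoods[h] for h in belief}
    z = sum(unnorm.values())
    if z == 0.0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {h: w / z for h, w in unnorm.items()}

def expected_return(belief, returns):
    """Expected return of a fixed strategy under the current belief."""
    return sum(belief[h] * returns[h] for h in belief)

# Two hypothetical candidate MDPs, initially equally likely.
prior = {"mdp_A": 0.5, "mdp_B": 0.5}
posterior = update_belief(prior, {"mdp_A": 0.9, "mdp_B": 0.1})
value = expected_return(posterior, {"mdp_A": 1.0, "mdp_B": 0.0})
```

An observation that is much more likely under one hypothesis shifts the posterior toward it, which in turn changes which strategy has the highest expected return; this is the sense in which outcomes can trigger a switch of strategy.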

Article 1202

Title@2025-05-26 (1): Advancing Molecular Machine Learning Representations with Stereoelectronics-Infused Molecular Graphs

Title: Advancing Molecular Machine Learning Representations with Stereoelectronics-Infused Molecular Graphs

Advancing Molecular Machine Learning Representations mit stereoelectronics-infused Molecular Graphs

具有立体电子成份式分子图的分子机学习演示 2408.04520v2

Authors: Daniil A. Boiko, Thiago Reschützegger, Benjamin Sanchez-Lengeling, Samuel M. Blau, Gabe Gomes

Molecular representation is a critical element in our understanding of the physical world and the foundation for modern molecular machine learning. Previous molecular machine learning models have employed strings, fingerprints, global features, and simple molecular graphs that are inherently information-sparse representations. However, as the complexity of prediction tasks increases, the molecular representation needs to encode higher fidelity information. This work introduces a novel approach to infusing quantum-chemical-rich information into molecular graphs via stereoelectronic effects, enhancing expressivity and interpretability. Learning to predict the stereoelectronics-infused representation with a tailored double graph neural network workflow enables its application to any downstream molecular machine learning task without expensive quantum chemical calculations. We show that the explicit addition of stereoelectronic information significantly improves the performance of message-passing 2D machine learning models for molecular property prediction. We show that the learned representations trained on small molecules can accurately extrapolate to much larger molecular structures, yielding chemical insight into orbital interactions for previously intractable systems, such as entire proteins, opening new avenues of molecular design. Finally, we have developed a web application (simg.cheme.cmu.edu) where users can rapidly explore stereoelectronic information for their own molecular systems.

Article 1203

Title@2025-05-26 (1): Causal Composition Diffusion Model for Closed-loop Traffic Generation

Title: Causal Composition Diffusion Model for Closed-loop Traffic Generation

Causal Composition Diffusion Modell für die Closed-Loop-Verkehrserzeugung

闭闭环交通流量生成原因构成传播模式 2412.17920v3

Authors: Haohong Lin, Xin Huang, Tung Phan-Minh, David S. Hayden, Huan Zhang, Ding Zhao, Siddhartha Srinivasa, Eric M. Wolff, Hongge Chen

Simulation is critical for safety evaluation in autonomous driving, particularly in capturing complex interactive behaviors. However, generating realistic and controllable traffic scenarios in long-tail situations remains a significant challenge. Existing generative models suffer from the conflicting objective between user-defined controllability and realism constraints, which is amplified in safety-critical contexts. In this work, we introduce the Causal Compositional Diffusion Model (CCDiff), a structure-guided diffusion framework to address these challenges. We first formulate the learning of controllable and realistic closed-loop simulation as a constrained optimization problem. Then, CCDiff maximizes controllability while adhering to realism by automatically identifying and injecting causal structures directly into the diffusion process, providing structured guidance to enhance both realism and controllability. Through rigorous evaluations on benchmark datasets and in a closed-loop simulator, CCDiff demonstrates substantial gains over state-of-the-art approaches in generating realistic and user-preferred trajectories. Our results show CCDiff’s effectiveness in extracting and leveraging causal structures, showing improved closed-loop performance based on key metrics such as collision rate, off-road rate, FDE, and comfort.

Article 1204

Title@2025-05-26 (1): Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text

Title: Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text

Task-informierte Anti-Kurriculum durch Masken verbessert Downstream-Performance auf Text

通过遮罩改进文字下流业绩,以任务化的反文体 2502.12953v2

Authors: Andrei Jarca, Florinel Alin Croitoru, Radu Tudor Ionescu

Masked language modeling has become a widely adopted unsupervised technique to pre-train large language models (LLMs). However, the process of selecting tokens for masking is random, and the percentage of masked tokens is typically fixed for the entire training process. In this paper, we propose to adjust the masking ratio and to decide which tokens to mask based on a novel task-informed anti-curriculum learning scheme. First, we harness task-specific knowledge about useful and harmful tokens in order to determine which tokens to mask. Second, we propose a cyclic decaying masking ratio, which corresponds to an anti-curriculum schedule (from hard to easy). We exemplify our novel task-informed anti-curriculum by masking (TIACBM) approach across three diverse downstream tasks: sentiment analysis, text classification by topic, and authorship attribution. Our findings suggest that TIACBM enhances the ability of the model to focus on key task-relevant features, contributing to statistically significant performance gains across tasks. We release our code at https://github.com/JarcaAndrei/TIACBM.
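
A cyclic decaying masking ratio can take many concrete forms; the abstract does not give the formula, so the schedule below is a hypothetical one chosen only to show the idea. Within each cycle the ratio falls linearly from a peak to a floor (hard to easy), and the peak itself shrinks from one cycle to the next. All parameter names and default values are assumptions of this example.

```python
def masking_ratio(step, cycle_len, r_max=0.3, r_min=0.15, decay=0.5):
    """Hypothetical cyclic decaying schedule for the fraction of masked tokens."""
    cycle, pos = divmod(step, cycle_len)
    # The peak of each cycle decays geometrically toward the floor r_min.
    peak = r_min + (r_max - r_min) * decay ** cycle
    # Inside a cycle the ratio decreases linearly from the peak toward r_min.
    return peak - (peak - r_min) * pos / cycle_len

schedule = [masking_ratio(s, cycle_len=10) for s in range(30)]
```

Starting each cycle with heavy masking and relaxing it afterwards is what makes the schedule an anti-curriculum: the model sees the hardest version of the task first.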

Article 1205

Title@2025-05-26 (1): Learning a Pessimistic Reward Model in RLHF

Title: Learning a Pessimistic Reward Model in RLHF

Ein pessimistisches Belohnungsmodell in RLHF lernen

在RLHF学习悲观奖励模式 2505.20556v1

Authors: Yinglun Xu, Hangoo Kang, Tarun Suresh, Yuxuan Wan, Gagandeep Singh

This work proposes "PET", a novel pessimistic reward fine-tuning method, to learn a pessimistic reward model robust against reward hacking in offline reinforcement learning from human feedback (RLHF). Traditional reward modeling techniques in RLHF train an imperfect reward model, on which KL regularization plays a pivotal role in mitigating reward hacking when optimizing a policy. Such an intuition-based method still suffers from reward hacking, and policies with large KL divergence from the dataset distribution are excluded during learning. In contrast, we show that when optimizing a policy on a pessimistic reward model fine-tuned through PET, reward hacking can be prevented without relying on any regularization. We test our methods on the standard TL;DR summarization dataset. We find that one can learn a high-quality policy on our pessimistic reward without using any regularization. Such a policy has a high KL divergence from the dataset distribution while having high performance in practice. In summary, our work shows the feasibility of learning a pessimistic reward model against reward hacking. The agent can greedily search for the policy with a high pessimistic reward without suffering from reward hacking.

Article 1206

Title@2025-05-26 (1): A ZeNN architecture to avoid the Gaussian trap

Title: A ZeNN architecture to avoid the Gaussian trap

Eine ZeNN-Architektur, um die Gaussische Falle zu vermeiden

避免高斯陷阱的 ZeNN 建筑 2505.20553v1

Authors: Luís Carvalho, João L. Costa, José Mourão, Gonçalo Oliveira

We propose a new simple architecture, Zeta Neural Networks (ZeNNs), in order to overcome several shortcomings of standard multi-layer perceptrons (MLPs). Namely, in the large width limit, MLPs are non-parametric, they do not have a well-defined pointwise limit, they lose non-Gaussian attributes and become unable to perform feature learning; moreover, finite width MLPs perform poorly in learning high frequencies. The new ZeNN architecture is inspired by three simple principles from harmonic analysis: i) Enumerate the perceptrons and introduce a non-learnable weight to enforce convergence; ii) Introduce a scaling (or frequency) factor; iii) Choose activation functions that lead to near orthogonal systems. We show that these ideas allow us to fix the aforementioned shortcomings of MLPs. In fact, in the infinite width limit, ZeNNs converge pointwise, they exhibit a rich asymptotic structure beyond Gaussianity, and perform feature learning. Moreover, when appropriate activation functions are chosen, (finite width) ZeNNs excel at learning high-frequency features of functions with low dimensional domains.

Article 1207

Title@2025-05-26 (1): Estimating Motor Symptom Presence and Severity in Parkinson’s Disease from Wrist Accelerometer Time Series using ROCKET and InceptionTime

Title: Estimating Motor Symptom Presence and Severity in Parkinson’s Disease from Wrist Accelerometer Time Series using ROCKET and InceptionTime

Abschätzung von Motorsymptome und Schweregrad bei Parkinson-Krankheit aus der Wrist Accelerometer Time Serie mit ROCKET und InceptionTime

利用 ROCKET 和受孕时间从风速计时间序列中估计帕金森氏病的机动症状存在和严重性 2304.11265v3

Authors: Cedric Donié, Neha Das, Satoshi Endo, Sandra Hirche

Parkinson’s disease (PD) is a neurodegenerative condition characterized by frequently changing motor symptoms, necessitating continuous symptom monitoring for more targeted treatment. Classical time series classification and deep learning techniques have demonstrated limited efficacy in monitoring PD symptoms using wearable accelerometer data due to complex PD movement patterns and the small size of available datasets. We investigate InceptionTime and RandOm Convolutional KErnel Transform (ROCKET) as they are promising for PD symptom monitoring. InceptionTime’s high learning capacity is well-suited to modeling complex movement patterns, while ROCKET is suited to small datasets. With random search methodology, we identify the highest-scoring InceptionTime architecture and compare its performance to ROCKET with a ridge classifier and a multi-layer perceptron (MLP) on wrist motion data from PD patients. Our findings indicate that all approaches can learn to estimate tremor severity and bradykinesia presence with moderate performance but encounter challenges in detecting dyskinesia. Among the presented approaches, ROCKET demonstrates higher scores in identifying dyskinesia, whereas InceptionTime exhibits slightly better performance in tremor and bradykinesia estimation. Notably, both methods outperform the multi-layer perceptron. In conclusion, InceptionTime can classify complex wrist motion time series and holds potential for continuous symptom monitoring in PD with further development.
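The ROCKET idea mentioned above (many random convolutional kernels, each summarized by two numbers) can be sketched in a few lines. This is a simplified illustration, not the authors' pipeline: the function names are hypothetical, padding is omitted, and the input is assumed to be a univariate series with at least 12 samples. The resulting features would typically be passed to a linear model such as a ridge classifier.

```python
import math
import random


def random_kernel(rng, series_len):
    """Draw one random kernel: centered Gaussian weights, a bias, and a dilation."""
    length = rng.choice([7, 9, 11])
    weights = [rng.gauss(0.0, 1.0) for _ in range(length)]
    mean = sum(weights) / length
    weights = [w - mean for w in weights]
    bias = rng.uniform(-1.0, 1.0)
    # Dilation is sampled on an exponential scale so the kernel still fits the series.
    max_exponent = math.log2((series_len - 1) / (length - 1))
    dilation = int(2 ** rng.uniform(0.0, max_exponent))
    return weights, bias, dilation


def rocket_features(series, kernels):
    """Return [max, proportion of positive values] for every kernel (no padding)."""
    features = []
    for weights, bias, dilation in kernels:
        span = (len(weights) - 1) * dilation
        outputs = [
            bias + sum(w * series[start + j * dilation] for j, w in enumerate(weights))
            for start in range(len(series) - span)
        ]
        features.append(max(outputs))
        features.append(sum(o > 0 for o in outputs) / len(outputs))
    return features
```

Because the kernels are never trained, only the downstream linear classifier has to be fitted, which is one reason the method is considered suitable for small datasets.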

Article 1208

Title@2025-05-26 (1): TAPIP3D: Tracking Any Point in Persistent 3D Geometry

Title: TAPIP3D: Tracking Any Point in Persistent 3D Geometry

TAPIP3D: Verfolgung eines beliebigen Punktes in persistenter 3D-Geometrie

TAPIP3D:跟踪持久性三维几何中的任何点 2504.14717v2

Authors: Bowei Zhang, Lei Ke, Adam W. Harley, Katerina Fragkiadaki

We introduce TAPIP3D, a novel approach for long-term 3D point tracking in monocular RGB and RGB-D videos. TAPIP3D represents videos as camera-stabilized spatio-temporal feature clouds, leveraging depth and camera motion information to lift 2D video features into a 3D world space where camera movement is effectively canceled out. Within this stabilized 3D representation, TAPIP3D iteratively refines multi-frame motion estimates, enabling robust point tracking over long time horizons. To handle the irregular structure of 3D point distributions, we propose a 3D Neighborhood-to-Neighborhood (N2N) attention mechanism - a 3D-aware contextualization strategy that builds informative, spatially coherent feature neighborhoods to support precise trajectory estimation. Our 3D-centric formulation significantly improves performance over existing 3D point tracking methods and even surpasses state-of-the-art 2D pixel trackers in accuracy when reliable depth is available. The model supports inference in both camera-centric (unstabilized) and world-centric (stabilized) coordinates, with experiments showing that compensating for camera motion leads to substantial gains in tracking robustness. By replacing the conventional 2D square correlation windows used in prior 2D and 3D trackers with a spatially grounded 3D attention mechanism, TAPIP3D achieves strong and consistent results across multiple 3D point tracking benchmarks. Project Page: https://tapip3d.github.io

Article 1209

Title@2025-05-26 (1): Achieving adaptivity and optimality for multi-armed bandits using Exponential-Kullback Leibler Maillard Sampling

Title: Achieving adaptivity and optimality for multi-armed bandits using Exponential-Kullback Leibler Maillard Sampling

Erreichen von Anpassungsfähigkeit und Optimität für mehrarmige Banditen mit Expenential-Kullback Leibler Maillard Sampling

利用Expernitial-Kullback Leiber Leiber Maillard抽样,实现多武装强盗的适应性和最佳性 2502.14379v2

Authors: Hao Qin, Kwang-Sung Jun, Chicheng Zhang

We study the problem of $K$-armed bandits with reward distributions belonging to a one-parameter exponential distribution family. In the literature, several criteria have been proposed to evaluate the performance of such algorithms, including Asymptotic Optimality, Minimax Optimality, Sub-UCB, and variance-adaptive worst-case regret bound. Thompson Sampling-based and Upper Confidence Bound-based algorithms have been employed to achieve some of these criteria. However, none of these algorithms simultaneously satisfy all the aforementioned criteria. In this paper, we design an algorithm, Exponential Kullback-Leibler Maillard Sampling (abbrev. Exp-KL-MS), that can achieve multiple optimality criteria simultaneously, including Asymptotic Optimality, Minimax Optimality with a $\sqrt{\ln (K)}$ factor, Sub-UCB, and variance-adaptive worst-case regret bound.
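To make the sampling rule concrete, here is a minimal sketch of Maillard-style sampling for the Bernoulli case, where each arm is drawn with probability proportional to exp(-N_a * KL(mean_a, best mean)). The Bernoulli specialization, the clipping constant, and all function names are assumptions made for illustration; the exact rule analyzed in the paper may differ in its details.

```python
import math
import random


def bernoulli_kl(p, q, eps=1e-12):
    """KL(Bernoulli(p) || Bernoulli(q)), with clipping to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def maillard_probs(counts, means):
    """Sampling probabilities proportional to exp(-N_a * KL(mean_a, best mean))."""
    best = max(means)
    weights = [math.exp(-n * bernoulli_kl(m, best)) for n, m in zip(counts, means)]
    total = sum(weights)
    return [w / total for w in weights]


def run_bandit(true_means, horizon, seed=0):
    """Play a Bernoulli bandit for `horizon` rounds and return the pull counts."""
    rng = random.Random(seed)
    k = len(true_means)
    counts, sums = [0] * k, [0.0] * k

    def pull(arm):
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    for arm in range(k):  # initialization: pull every arm once
        pull(arm)
    for _ in range(horizon - k):
        means = [s / n for s, n in zip(sums, counts)]
        probs = maillard_probs(counts, means)
        pull(rng.choices(range(k), weights=probs)[0])
    return counts
```

The empirically best arm always has weight 1, while an arm's weight shrinks exponentially in both its pull count and its KL gap to the best arm, so exploration fades as evidence accumulates.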

Article 1210

Title@2025-05-26 (1): Quantum Speedups in Regret Analysis of Infinite Horizon Average-Reward Markov Decision Processes

Title: Quantum Speedups in Regret Analysis of Infinite Horizon Average-Reward Markov Decision Processes

Quantum Speedups bei der Bedauernsanalyse von Unendlichen Horizon durchschnittlichen Markov-Entscheidungsprozessen

对无限地平地平平线平均回报Markov决定程序进行遗憾分析时的量量加速 2310.11684v4

Authors: Bhargav Ganguly, Yang Xu, Vaneet Aggarwal

This paper investigates the potential of quantum acceleration in addressing infinite horizon Markov Decision Processes (MDPs) to enhance average reward outcomes. We introduce an innovative quantum framework for the agent’s engagement with an unknown MDP, extending the conventional interaction paradigm. Our approach involves the design of an optimism-driven tabular Reinforcement Learning algorithm that harnesses quantum signals acquired by the agent through efficient quantum mean estimation techniques. Through thorough theoretical analysis, we demonstrate that the quantum advantage in mean estimation leads to exponential advancements in regret guarantees for infinite horizon Reinforcement Learning. Specifically, the proposed Quantum algorithm achieves a regret bound of $\tilde{\mathcal{O}}(1)$, a significant improvement over the $\tilde{\mathcal{O}}(\sqrt{T})$ bound exhibited by classical counterparts.

Article 1211

Title@2025-05-26 (1): RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs

Title: RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs

RL nur im Namen? Analyse der strukturellen Annahmen im RL-Post-Training für LLMs

仅限名称的RL?分析在RL为LLMs提供的培训后培训中的结构假设 2505.13697v2

Authors: Soumya Rani Samineni, Durgesh Kalwar, Karthik Valmeekam, Kaya Stechly, Subbarao Kambhampati

Reinforcement learning-based post-training of large language models (LLMs) has recently gained attention, particularly following the release of DeepSeek R1, which applied GRPO for fine-tuning. Amid the growing hype around improved reasoning abilities attributed to RL post-training, we critically examine the formulation and assumptions underlying these methods. We start by highlighting the popular structural assumptions made in modeling LLM training as a Markov Decision Process (MDP), and show how they lead to a degenerate MDP that doesn’t quite need the RL/GRPO apparatus. The two critical structural assumptions are (1) making the MDP states just a concatenation of the actions, with states becoming the context window and actions becoming the tokens in LLMs, and (2) splitting the reward of a state-action trajectory uniformly across the trajectory. Through a comprehensive analysis, we demonstrate that these simplifying assumptions make the approach effectively equivalent to outcome-driven supervised learning. Our experiments on benchmarks including GSM8K and Countdown using Qwen-2.5 base models show that iterative supervised fine-tuning, incorporating both positive and negative samples, achieves performance comparable to GRPO-based training. We also argue that the structural assumptions indirectly incentivize the RL to generate longer sequences of intermediate tokens, which in turn feeds into the narrative of “RL generating longer thinking traces.” While RL may well be a very useful technique for improving the reasoning abilities of LLMs, our analysis shows that the simplistic structural assumptions made in modeling the underlying MDP render the popular LLM RL frameworks and their interpretations questionable.
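The second structural assumption can be illustrated with a small sketch. Under one common reading of outcome-based GRPO, each sampled response gets a single scalar reward, rewards are normalized within the sampled group, and every token of a response receives the same advantage. The function names and the epsilon constant below are illustrative assumptions, not code from the paper.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize outcome rewards within one group of sampled responses."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def token_level_advantages(rewards, lengths):
    """Spread each response's advantage uniformly over all of its tokens."""
    advantages = group_relative_advantages(rewards)
    return [[a] * n for a, n in zip(advantages, lengths)]
```

Since every token in a response shares one signed weight, the update resembles weighted supervised fine-tuning on positive and negative samples, which is the equivalence the abstract argues for.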

Article 1212

Title@2025-05-26 (1): Covariate-Adjusted Deep Causal Learning for Heterogeneous Panel Data Models

Title: Covariate-Adjusted Deep Causal Learning for Heterogeneous Panel Data Models

Kovariate-adjusted Deep Causal Learning für heterogene Panel-Datenmodelle

异质小组数据模型的共变调整深因学习 2505.20536v1

Authors: Guanhao Zhou, Yuefeng Han, Xiufan Yu

This paper studies the task of estimating heterogeneous treatment effects in causal panel data models, in the presence of covariate effects. We propose a novel Covariate-Adjusted Deep Causal Learning (CoDEAL) for panel data models, that employs flexible model structures and powerful neural network architectures to cohesively deal with the underlying heterogeneity and nonlinearity of both panel units and covariate effects. The proposed CoDEAL integrates nonlinear covariate effect components (parameterized by a feed-forward neural network) with nonlinear factor structures (modeled by a multi-output autoencoder) to form a heterogeneous causal panel model. The nonlinear covariate component offers a flexible framework for capturing the complex influences of covariates on outcomes. The nonlinear factor analysis enables CoDEAL to effectively capture both cross-sectional and temporal dependencies inherent in the data panel. This latent structural information is subsequently integrated into a customized matrix completion algorithm, thereby facilitating more accurate imputation of missing counterfactual outcomes. Moreover, the use of a multi-output autoencoder explicitly accounts for heterogeneity across units and enhances the model interpretability of the latent factors. We establish theoretical guarantees on the convergence of the estimated counterfactuals, and demonstrate the compelling performance of the proposed method using extensive simulation studies and a real data application.

Article 1213

Title@2025-05-26 (1): Rotary Masked Autoencoders are Versatile Learners

Title: Rotary Masked Autoencoders are Versatile Learners

Rotary Masked Autoencoder sind vielseitige Lerner

扶轮式遮罩自动算术员是多功能学习者 2505.20535v1

Authors: Uros Zivanovic, Serafina Di Gioia, Andre Scaffidi, Martín de los Rios, Gabriella Contardo, Roberto Trotta

Applying Transformers to irregular time-series typically requires specializations to their baseline architecture, which can result in additional computational overhead and increased method complexity. We present the Rotary Masked Autoencoder (RoMAE), which utilizes the popular Rotary Positional Embedding (RoPE) method for continuous positions. RoMAE is an extension to the Masked Autoencoder (MAE) that enables representation learning with multidimensional continuous positional information while avoiding any time-series-specific architectural specializations. We showcase RoMAE’s performance on a variety of modalities including irregular and multivariate time-series, images, and audio, demonstrating that RoMAE surpasses specialized time-series architectures on difficult datasets such as the DESC ELAsTiCC Challenge while maintaining MAE’s usual performance across other modalities. In addition, we investigate RoMAE’s ability to reconstruct the embedded continuous positions, demonstrating that including learned embeddings in the input sequence breaks RoPE’s relative position property.
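As a rough illustration of why RoPE extends naturally to continuous positions, the sketch below applies the standard rotary rotation with a real-valued position (for example an irregular timestamp) and can be used to check that the query-key dot product depends only on the position difference. The function names and the base of 10000 are conventional choices assumed here, not details taken from the paper.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles proportional to a real-valued position."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)  # frequency decreases with the pair index
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))
```

For instance, rotating a query at position 1.3 and a key at 0.4 gives the same dot product as positions 5.3 and 4.4, because both pairs are 0.9 apart. This is the relative position property that, according to the abstract, is broken once learned embeddings are included in the input sequence.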

Article 1214

Title@2025-05-26 (1): HiPoNet: A Multi-View Simplicial Complex Network for High Dimensional Point-Cloud and Single-Cell Data

Title: HiPoNet: A Multi-View Simplicial Complex Network for High Dimensional Point-Cloud and Single-Cell Data

HiPoNet: Ein Multi-View-Komplexnetzwerk für hochdimensionale Point-Cloud- und Single-Cell-Daten

HipoNet:高多面点和单细胞数据多视图简易复杂的网络 2502.07746v2

Authors: Siddharth Viswanath, Hiren Madhu, Dhananjay Bhaskar, Jake Kovalic, David R Johnson, Christopher Tape, Ian Adelstein, Rex Ying, Michael Perlmutter, Smita Krishnaswamy

In this paper, we propose HiPoNet, an end-to-end differentiable neural network for regression, classification, and representation learning on high-dimensional point clouds. Our work is motivated by single-cell data, which can have very high dimensionality, exceeding the capabilities of existing point-cloud methods, which are mostly tailored for 3D data. Moreover, modern single-cell and spatial experiments now yield entire cohorts of datasets (i.e., one data set for every patient), necessitating models that can process large, high-dimensional point clouds at scale. Most current approaches build a single nearest-neighbor graph, discarding important geometric and topological information. In contrast, HiPoNet models the point cloud as a set of higher-order simplicial complexes, with each particular complex being created using a reweighting of features. This method thus generates multiple constructs corresponding to different views of high-dimensional data, which in biology offers the possibility of disentangling distinct cellular processes. It then employs simplicial wavelet transforms to extract multiscale features, capturing both local and global topology from each view. We show that geometric and topological information is preserved in this framework both theoretically and empirically. We showcase the utility of HiPoNet on point-cloud level tasks, involving classification and regression of entire point clouds in data cohorts. Experimentally, we find that HiPoNet outperforms other point-cloud and graph-based models on single-cell data. We also apply HiPoNet to spatial transcriptomics datasets using spatial coordinates as one of the views. Overall, HiPoNet offers a robust and scalable solution for high-dimensional data analysis.

Article 1215

Title@2025-05-26 (1): One-shot Robust Federated Learning of Independent Component Analysis

Title: One-shot Robust Federated Learning of Independent Component Analysis

One-shot Robust Federated Learning of Independent Component Analysis

强力学习独立构成部分分析 2505.20532v1

Authors: Dian Jin, Xin Bing, Yuqian Zhang

This paper investigates a general robust one-shot aggregation framework for the distributed and federated Independent Component Analysis (ICA) problem. We propose a geometric median-based aggregation algorithm that leverages $k$-means clustering to resolve the permutation ambiguity in local client estimations. Our method first performs $k$-means to partition client-provided estimators into clusters and then aggregates estimators within each cluster using the geometric median. This approach provably remains effective even in highly heterogeneous scenarios where at most half of the clients can observe only a minimal number of samples. The key theoretical contribution lies in the combined analysis of the geometric median’s error bound, aided by sample quantiles, and the maximum misclustering rates of the aforementioned solution of $k$-means. The effectiveness of the proposed approach is further supported by simulation studies conducted under various heterogeneous settings.
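The robustness of the aggregation step comes from the geometric median, which can be approximated with Weiszfeld's iteration. The sketch below covers only that step for one cluster of client estimates; the preceding $k$-means clustering is omitted, and the function name, iteration count, and tolerance are illustrative assumptions rather than the authors' implementation.

```python
import math


def geometric_median(points, iters=200, eps=1e-9):
    """Approximate the geometric median of `points` with Weiszfeld's iteration."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    est = [sum(p[j] for p in points) / len(points) for j in range(dim)]
    for _ in range(iters):
        numerator = [0.0] * dim
        denominator = 0.0
        for p in points:
            # Inverse-distance weights; eps guards against division by zero.
            w = 1.0 / max(math.dist(p, est), eps)
            denominator += w
            for j in range(dim):
                numerator[j] += w * p[j]
        est = [n / denominator for n in numerator]
    return est
```

With five estimates near (1, 1) and two corrupted ones far away, the mean is dragged toward the outliers while the geometric median stays close to the majority, which is the behavior the aggregation relies on.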

Article 1216

Title@2025-05-26 (1): Prediction-Enhanced Monte Carlo: A Machine Learning View on Control Variate

Title: Prediction-Enhanced Monte Carlo: A Machine Learning View on Control Variate

Vorhersage-erweitert Monte Carlo: Eine Machine-Learning-Ansicht auf Steuerungsvariate

预测增强的蒙特卡洛:关于控制Variatte的机械学习观点 2412.11257v2

Authors: Fengpei Li, Haoxian Chen, Jiahe Lin, Arkin Gupta, Xiaowei Tan, Honglei Zhao, Gang Xu, Yuriy Nevmyvaka, Agostino Capponi, Henry Lam

For many complex simulation tasks spanning areas such as healthcare, engineering, and finance, Monte Carlo (MC) methods are invaluable due to their unbiased estimates and precise error quantification. Nevertheless, Monte Carlo simulations often become computationally prohibitive, especially for nested, multi-level, or path-dependent evaluations lacking effective variance reduction techniques. While machine learning (ML) surrogates appear as natural alternatives, naive replacements typically introduce unquantifiable biases. We address this challenge by introducing Prediction-Enhanced Monte Carlo (PEMC), a framework that leverages modern ML models as learned predictors, using cheap and parallelizable simulation as features, to output unbiased evaluation with reduced variance and runtime. PEMC can also be viewed as a “modernized” view of control variates, where we consider the overall computation-cost-aware variance reduction instead of per-replication reduction, while bypassing the closed-form mean function requirement and maintaining the advantageous unbiasedness and uncertainty quantifiability of Monte Carlo. We illustrate PEMC’s broader efficacy and versatility through three examples: first, equity derivatives such as variance swaps under stochastic local volatility models; second, interest rate derivatives such as swaption pricing under the Heath-Jarrow-Morton (HJM) interest-rate model. Finally, we showcase PEMC in a socially significant context - ambulance dispatch and hospital load balancing - where accurate mortality rate estimates are key for ethically sensitive decision-making. Across these diverse scenarios, PEMC consistently reduces variance while preserving unbiasedness, highlighting its potential as a powerful enhancement to standard Monte Carlo baselines.
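The control-variate structure described above can be sketched on a toy problem. The estimator averages the residual between the expensive output and a predictor on a small paired sample, then adds the predictor's mean estimated from many cheap samples, so it stays unbiased for any predictor. Everything here is a stand-in chosen for illustration: the "expensive" payoff is exp(z) for a standard normal z, and the "learned" predictor is a fixed quadratic rather than an ML model.

```python
import math
import random
from statistics import fmean


def expensive_payoff(z):
    """Stand-in for a costly simulation output; here simply exp(z)."""
    return math.exp(z)


def predictor(z):
    """Stand-in for a learned predictor: a crude polynomial approximation of exp(z)."""
    return 1.0 + z + 0.5 * z * z


def pemc_style_estimate(n_paired, n_cheap, seed=0):
    """Mean residual on paired samples plus the predictor mean on cheap samples."""
    rng = random.Random(seed)
    paired = [rng.gauss(0.0, 1.0) for _ in range(n_paired)]
    residual = fmean(expensive_payoff(z) - predictor(z) for z in paired)
    cheap_mean = fmean(predictor(rng.gauss(0.0, 1.0)) for _ in range(n_cheap))
    return residual + cheap_mean
```

In this toy setup the true value is exp(0.5), and the residual has a smaller variance than the raw payoff, so fewer expensive samples are needed for a given accuracy. No closed-form mean of the predictor is required, since it is itself estimated from cheap samples.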

Article 1217

Title@2025-05-26 (1): Fast Calculation of Feature Contributions in Boosting Trees

Title: Fast Calculation of Feature Contributions in Boosting Trees

Schnelle Berechnung von Feature-Beiträgen bei der Förderung von Bäumen

快速计算推动树的特性贡献 2407.03515v2

Authors: Zhongli Jiang, Min Zhang, Dabao Zhang

Recently, several fast algorithms have been proposed to decompose predicted value into Shapley values, enabling individualized feature contribution analysis in tree models. While such local decomposition offers valuable insights, it underscores the need for a global evaluation of feature contributions. Although coefficients of determination ($R^2$) allow for comparative assessment of individual features, individualizing $R^2$ is challenged by the underlying quadratic losses. To address this, we propose Q-SHAP, an efficient algorithm that reduces the computational complexity of calculating Shapley values for quadratic losses to polynomial time. Our simulations show that Q-SHAP not only improves computational efficiency but also enhances the accuracy of feature-specific $R^2$ estimates.

Article 1218

Title@2025-05-26 (1): Training Articulatory Inversion Models for Inter-Speaker Consistency

Title: Training Articulatory Inversion Models for Inter-Speaker Consistency

Training Artikulatorische Inversionsmodelle für die Konsistenz zwischen den Lautsprechern

供发言者间和谐使用的培训用人工转换模型 2505.20529v1

Authors: Charles McGhee, Mark J. F. Gales, Kate M. Knill

Acoustic-to-Articulatory Inversion (AAI) attempts to model the inverse mapping from speech to articulation. Exact articulatory prediction from speech alone may be impossible, as speakers can choose different forms of articulation seemingly without reference to their vocal tract structure. However, once a speaker has selected an articulatory form, their productions vary minimally. Recent works in AAI have proposed adapting Self-Supervised Learning (SSL) models to single-speaker datasets, claiming that these single-speaker models provide a universal articulatory template. In this paper, we investigate whether SSL-adapted models trained on single and multi-speaker data produce articulatory targets which are consistent across speaker identities for English and Russian. We do this through the use of a novel evaluation method which extracts articulatory targets using minimal pair sets. We also present a training method which can improve inter-speaker consistency using only speech data.

Article 1219

Title@2025-05-26 (1): DYMAG: Rethinking Message Passing Using Dynamical-systems-based Waveforms

Title: DYMAG: Rethinking Message Passing Using Dynamical-systems-based Waveforms

DYMAG: Nachricht neu denken Passieren mit Dynamisch-Systeme-basierten Wellenformen

DYMAG: 利用动态系统波形重新思考信息传递方式 2309.09924v5

Authors: Dhananjay Bhaskar, Xingzhi Sun, Yanlei Zhang, Charles Xu, Arman Afrasiyabi, Siddharth Viswanath, Oluwadamilola Fasina, Maximilian Nickel, Guy Wolf, Michael Perlmutter, Smita Krishnaswamy

We present DYMAG, a graph neural network based on a novel form of message aggregation. Standard message-passing neural networks, which often aggregate local neighbors via mean-aggregation, can be regarded as convolving with a simple rectangular waveform which is non-zero only on 1-hop neighbors of every vertex. Here, we go beyond such local averaging. We will convolve the node features with more sophisticated waveforms generated using dynamics such as the heat equation, wave equation, and the Sprott model (an example of chaotic dynamics). Furthermore, we use snapshots of these dynamics at different time points to create waveforms at many effective scales. Theoretically, we show that these dynamic waveforms can capture salient information about the graph including connected components, connectivity, and cycle structures even with no features. Empirically, we test DYMAG on both real and synthetic benchmarks to establish that DYMAG outperforms baseline models on recovery of graph persistence, generating parameters of random graphs, as well as property prediction for proteins, molecules and materials. Our code is available at https://github.com/KrishnaswamyLab/DYMAG.
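To give a feel for a heat-equation waveform, the following sketch evolves a node signal under dx/dt = -Lx, where L is the graph Laplacian, and records snapshots at several times. It uses explicit Euler steps on an adjacency-list graph purely for illustration; the function name, step size, and solver are assumptions and not the DYMAG implementation, which is available in the authors' repository.

```python
def heat_snapshots(adjacency, signal, times, dt=0.01):
    """Evolve dx/dt = -L x with explicit Euler and return the state at each time."""
    x = list(signal)
    snapshots = []
    steps_done = 0
    for target in sorted(times):
        steps_needed = round(target / dt)
        for _ in range(steps_needed - steps_done):
            # (L x)[v] = deg(v) * x[v] - sum of x over the neighbors of v
            lap = [
                len(adjacency[v]) * x[v] - sum(x[u] for u in adjacency[v])
                for v in range(len(x))
            ]
            x = [xv - dt * lv for xv, lv in zip(x, lap)]
        steps_done = steps_needed
        snapshots.append(list(x))
    return snapshots
```

Starting from a unit impulse on one node, heat spreads only within that node's connected component and flattens toward the component average over time. This is a simple example of how such waveforms, even without node features, can expose structure like connected components at different scales.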

Article 1220

Title@2025-05-26 (1): Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks

Title: Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks

Lernpolitische Ausschüsse für effektive Personalisierung in MDPs mit unterschiedlichen Aufgaben

在有不同任务的多边发展方案中促进有效个性化的学习政策委员会 2503.01885v2

Authors: Luise Ge, Michael Lanier, Anindya Sarkar, Bengisu Guresti, Chongjie Zhang, Yevgeniy Vorobeychik

Many dynamic decision problems, such as robotic control, involve a series of tasks, many of which are unknown at training time. Typical approaches for these problems, such as multi-task and meta reinforcement learning, do not generalize well when the tasks are diverse. On the other hand, approaches that aim to tackle task diversity, such as using task embedding as policy context and task clustering, typically lack performance guarantees and require a large number of training tasks. To address these challenges, we propose a novel approach for learning a policy committee that includes at least one near-optimal policy with high probability for tasks encountered during execution. While we show that this problem is in general inapproximable, we present two practical algorithmic solutions. The first yields provable approximation and task sample complexity guarantees when tasks are low-dimensional (the best we can do due to inapproximability), whereas the second is a general and practical gradient-based approach. In addition, we provide a provable sample complexity bound for few-shot learning. Our experiments on MuJoCo and Meta-World show that the proposed approach outperforms state-of-the-art multi-task, meta-, and task clustering baselines in training, generalization, and few-shot learning, often by a large margin. Our code is available at https://github.com/CERL-WUSTL/PACMAN.

Article 1221

Title@2025-05-26 (1): Towards Fully FP8 GEMM LLM Training at Scale

Title: Towards Fully FP8 GEMM LLM Training at Scale

Auf dem Weg zum vollständigen RP8 GEMM LLM Training auf Scale

GEMM GEMM LLM 大规模培训 2505.20524v1

Authors: Alejandro Hernández-Cano, Dhia Garbaya, Imanol Schlag, Martin Jaggi

Despite the significant potential of FP8 data formats for large language model (LLM) pre-training, their adoption has been limited due to challenges in maintaining stability at scale. Existing approaches often rely on suboptimal fine-grained FP8 kernels or fall back to higher-precision matrix multiplications (GEMMs) in sensitive components, such as attention projections, compromising potential throughput gains. We introduce a new class of LLM architectures that, for the first time, support FP8 computation for all GEMMs within transformer blocks during both forward and backward passes. This enables unprecedented throughput gains, particularly at scale, while matching the downstream performance of standard BF16 training. Our architecture design reduces large outlier activations, promoting stable long-term FP8 training. In addition, we identify key metrics to monitor low-precision training and predict potential future divergences.

Article 1222

Title@2025-05-26 (1): Scaling over Scaling: Exploring Test-Time Scaling Pareto in Large Reasoning Models

Title: Scaling over Scaling: Exploring Test-Time Scaling Pareto in Large Reasoning Models

Skalierung über Skalierung: Untersuchung von Test-Zeit-Skalierung Pareto in großen vernünftigen Modellen

缩放过缩放: 探索大型理由模型中的测试时间缩放派 2505.20522v1

Authors: Jian Wang, Boyan Zhu, Chak Tou Leong, Yongqi Li, Wenjie Li

Large reasoning models (LRMs) have exhibited the capacity of enhancing reasoning performance via internal test-time scaling. Building upon this, a promising direction is to further scale test-time compute to unlock even greater reasoning capabilities. However, as we push these scaling boundaries, systematically understanding the practical limits and achieving optimal resource allocation becomes a critical challenge. In this paper, we investigate the scaling Pareto of test-time scaling and introduce the Test-Time Scaling Performance Model (TTSPM). We theoretically analyze two fundamental paradigms for such extended scaling, parallel scaling and sequential scaling, from a probabilistic modeling perspective. Our primary contribution is the derivation of the saturation point on the scaling budget for both strategies, identifying thresholds beyond which additional computation yields diminishing returns. Remarkably, despite their distinct mechanisms, both paradigms converge to a unified mathematical structure in their upper bounds. We empirically validate our theoretical findings on challenging reasoning benchmarks, including AIME, MATH-500, and GPQA, demonstrating the practical utility of these bounds for test-time resource allocation. We hope that this work provides insights into the cost-benefit trade-offs of test-time scaling, guiding the development of more resource-efficient inference strategies for large reasoning models.
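The notion of a saturation point can be illustrated with a deliberately simple model of parallel scaling. Assuming each sampled attempt succeeds independently with probability p, the chance that at least one of N attempts succeeds is 1 - (1 - p)^N, and the marginal gain of one more attempt shrinks geometrically. This independence model and the threshold rule below are assumptions for illustration only; they are not the TTSPM derived in the paper.

```python
def success_probability(p_single, n_samples):
    """Probability that at least one of n independent attempts succeeds."""
    return 1.0 - (1.0 - p_single) ** n_samples


def saturation_budget(p_single, min_gain):
    """Smallest N at which one additional sample improves success by less than min_gain."""
    n = 1
    while success_probability(p_single, n + 1) - success_probability(p_single, n) >= min_gain:
        n += 1
    return n
```

For example, with p = 0.3 and a minimum useful gain of 0.01, the budget saturates at N = 10: going from 9 to 10 samples still adds about 0.012, while going from 10 to 11 adds only about 0.008.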

Article 1223

Title@2025-05-26 (1): Semi-Explicit Neural DAEs: Learning Long-Horizon Dynamical Systems with Algebraic Constraints

Title: Semi-Explicit Neural DAEs: Learning Long-Horizon Dynamical Systems with Algebraic Constraints

Halbexplizite neurale DAEs: Lernen von langhorizontigen dynamischen Systemen mit algebraischen Einschränkungen

半显性神经DAEs:学习具有代数限制的长毛利区动态系统 2505.20515v1

Authors: Avik Pal, Alan Edelman, Christopher Rackauckas

Despite the promise of scientific machine learning (SciML) in combining data-driven techniques with mechanistic modeling, existing approaches for incorporating hard constraints in neural differential equations (NDEs) face significant limitations. Scalability issues and poor numerical properties prevent these neural models from being used for modeling physical systems with complicated conservation laws. We propose Manifold-Projected Neural ODEs (PNODEs), a method that explicitly enforces algebraic constraints by projecting each ODE step onto the constraint manifold. This framework arises naturally from semi-explicit differential-algebraic equations (DAEs), and includes both a robust iterative variant and a fast approximation requiring a single Jacobian factorization. We further demonstrate that prior works on relaxation methods are special cases of our approach. PNODEs consistently outperform baselines across six benchmark problems achieving a mean constraint violation error below $10^{-10}$. Additionally, PNODEs consistently achieve lower runtime compared to other methods for a given level of error tolerance. These results show that constraint projection offers a simple strategy for learning physically consistent long-horizon dynamics.
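The core idea of projecting each ODE step onto the constraint manifold can be shown on a toy system. The sketch below integrates a rotation with explicit Euler, which drifts off the unit circle, and optionally pulls each step back onto the constraint x^2 + y^2 = 1 with a few Newton-type corrections along the constraint gradient. The fixed vector field, the choice of constraint, and all function names are illustrative assumptions; the paper works with learned dynamics and more refined projection variants.

```python
def constraint(z):
    """Algebraic constraint g(z) = x^2 + y^2 - 1, which should stay at zero."""
    x, y = z
    return x * x + y * y - 1.0


def constraint_grad(z):
    """Gradient of the constraint with respect to (x, y)."""
    x, y = z
    return (2.0 * x, 2.0 * y)


def project(z, newton_iters=3):
    """Move z back toward g(z) = 0 along the constraint gradient."""
    for _ in range(newton_iters):
        g = constraint(z)
        gx, gy = constraint_grad(z)
        scale = g / (gx * gx + gy * gy)
        z = (z[0] - scale * gx, z[1] - scale * gy)
    return z


def euler_step(z, dt):
    """One explicit Euler step of the rotation dx/dt = -y, dy/dt = x."""
    x, y = z
    return (x - dt * y, y + dt * x)


def integrate(z0, dt, steps, use_projection):
    """Integrate from z0, optionally projecting after every step."""
    z = z0
    for _ in range(steps):
        z = euler_step(z, dt)
        if use_projection:
            z = project(z)
    return z
```

Over 1000 steps with dt = 0.01, the unprojected trajectory accumulates a visible constraint violation, whereas the projected one stays on the circle to near machine precision.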

Article 1224

Title@2025-05-26 (1): On a Neural Implementation of Brenier’s Polar Factorization

Title: On a Neural Implementation of Brenier’s Polar Factorization

Über eine neurale Umsetzung von Breniers Polarfaktorisierung

布赖尼尔极地化的神经实施 2403.03071v4

Authors: Nina Vesseron, Marco Cuturi

In 1991, Brenier proved a theorem that generalizes the polar decomposition for square matrices (factored as PSD $\times$ unitary) to any vector field $F:\mathbb{R}^d\rightarrow \mathbb{R}^d$. The theorem, known as the polar factorization theorem, states that any field $F$ can be recovered as the composition of the gradient of a convex function $u$ with a measure-preserving map $M$, namely $F=\nabla u \circ M$. We propose a practical implementation of this far-reaching theoretical result, and explore possible uses within machine learning. The theorem is closely related to optimal transport (OT) theory, and we borrow from recent advances in the field of neural optimal transport to parameterize the potential $u$ as an input convex neural network. The map $M$ can be either evaluated pointwise using $u^*$, the convex conjugate of $u$, through the identity $M=\nabla u^* \circ F$, or learned as an auxiliary network. Because $M$ is, in general, not injective, we consider the additional task of estimating the ill-posed inverse map that can approximate the pre-image measure $M^{-1}$ using a stochastic generator. We illustrate possible applications of Brenier’s polar factorization to non-convex optimization problems, as well as sampling of densities that are not log-concave.
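As a sanity check, the linear case recovers the familiar matrix polar decomposition. The worked example below assumes $F(x)=Ax$ with $A$ invertible and a reference measure that is invariant under orthogonal maps (for instance a standard Gaussian); the symbols $A$, $P$, and $U$ are introduced here for illustration only.

```latex
\begin{align*}
F(x) &= A x, \qquad A = P U, \qquad P = (A A^\top)^{1/2} \succ 0, \qquad U^\top U = I, \\
u(x) &= \tfrac{1}{2}\, x^\top P x \;\Rightarrow\; \nabla u(x) = P x, \qquad M(x) = U x, \\
(\nabla u \circ M)(x) &= P U x = A x = F(x), \\
u^*(y) &= \tfrac{1}{2}\, y^\top P^{-1} y \;\Rightarrow\; (\nabla u^* \circ F)(x) = P^{-1} P U x = U x = M(x).
\end{align*}
```

Here $u$ is convex because $P$ is positive definite, and $M$ preserves the reference measure because $U$ is orthogonal, so both the factorization $F=\nabla u \circ M$ and the identity $M=\nabla u^* \circ F$ hold in this special case.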

Article 1225

Title@2025-05-26 (1): A Novel Convolutional Neural Network-Based Framework for Complex Multiclass Brassica Seed Classification

Title: A Novel Convolutional Neural Network-Based Framework for Complex Multiclass Brassica Seed Classification

Ein neuartiges konvolutionäres neurales Netzwerk-basiertes Framework für die komplexe Klassifizierung von mehrstufigen Brassica-Samen

复杂多级巴西种子种子分类新革命神经网络框架 2505.21558v1

Authors: Elhoucine Elfatimi, Recep Eryigit, Lahcen Elfatimi

Agricultural research has accelerated in recent years, yet farmers often lack the time and resources for on-farm research due to the demands of crop production and farm operations. Seed classification offers valuable insights into quality control, production efficiency, and impurity detection. Early identification of seed types is critical to reducing the cost and risk associated with field emergence, which can lead to yield losses or disruptions in downstream processes like harvesting. Seed sampling supports growers in monitoring and managing seed quality, improving precision in determining seed purity levels, guiding management adjustments, and enhancing yield estimations. This study proposes a novel convolutional neural network (CNN)-based framework for the efficient classification of ten common Brassica seed types. The approach addresses the inherent challenge of texture similarity in seed images using a custom-designed CNN architecture. The model’s performance was evaluated against several pre-trained state-of-the-art architectures, with adjustments to layer configurations for optimized classification. Experimental results using our collected Brassica seed dataset demonstrate that the proposed model achieved a high accuracy rate of 93 percent.

Article 1226

Title@2025-05-26 (1): Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures

Title: Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures

Beispiel und Karte aus einem einzigen Convex-Potential: Erzeugung mit konjugierenden Momenten

单一汇合潜能的样本和地图:使用协同时间措施生成 2503.10576v2

Authors: Nina Vesseron, Louis Béthune, Marco Cuturi

The canonical approach in generative modeling is to split model fitting into two blocks: define first how to sample noise (e.g. Gaussian) and choose next what to do with it (e.g. using a single map or flows). We explore in this work an alternative route that ties sampling and mapping. We find inspiration in moment measures, a result that states that for any measure $\rho$, there exists a unique convex potential $u$ such that $\rho=\nabla u \sharp e^{-u}$. While this does seem to tie effectively sampling (from log-concave distribution $e^{-u}$) and action (pushing particles through $\nabla u$), we observe on simple examples (e.g., Gaussians or 1D distributions) that this choice is ill-suited for practical tasks. We study an alternative factorization, where $\rho$ is factorized as $\nabla w^* \sharp e^{-w}$, where $w^*$ is the convex conjugate of a convex potential $w$. We call this approach conjugate moment measures, and show far more intuitive results on these examples. Because $\nabla w^*$ is the Monge map between the log-concave distribution $e^{-w}$ and $\rho$, we rely on optimal transport solvers to propose an algorithm to recover $w$ from samples of $\rho$, and parameterize $w$ as an input-convex neural network. We also address the common sampling scenario in which the density of $\rho$ is known only up to a normalizing constant, and propose an algorithm to learn $w$ in this setting.
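A one-dimensional Gaussian makes the contrast between the two factorizations concrete. The following is a worked sketch under the assumption of a centered target $\rho=\mathcal{N}(0,\sigma^2)$ and quadratic potentials with normalizing constant $c_s=\tfrac{1}{2}\log(2\pi s)$; the symbol $s$ is introduced here for illustration only.

```latex
% Moment measure: \rho = \nabla u \sharp e^{-u}
u(x) = \frac{x^2}{2s} + c_s
  \;\Rightarrow\; e^{-u} = \mathcal{N}(0, s), \quad \nabla u(x) = \frac{x}{s},
\qquad
\nabla u \sharp \mathcal{N}(0, s) = \mathcal{N}\!\left(0, \tfrac{1}{s}\right)
  \;\Rightarrow\; s = \sigma^{-2}.

% Conjugate moment measure: \rho = \nabla w^* \sharp e^{-w}
w(x) = \frac{x^2}{2s} + c_s
  \;\Rightarrow\; e^{-w} = \mathcal{N}(0, s), \quad \nabla w^*(y) = s\,y,
\qquad
\nabla w^* \sharp \mathcal{N}(0, s) = \mathcal{N}(0, s^3)
  \;\Rightarrow\; s = \sigma^{2/3}.
```

Under the plain moment-measure factorization the sampling distribution has variance $\sigma^{-2}$, so it shrinks as the target spreads out, whereas under the conjugate factorization its variance $\sigma^{2/3}$ grows with that of the target.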

Article 1227

Title@2025-05-26 (1): Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review

Title: Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review

Verkörperte KI mit Basismodellen für mobile Serviceroboter: Ein Systematischer Test

与 “ 移动服务机器人:系统审查 “ 基金会模型 2505.20503v1

Authors: Matthew Lisondra, Beno Benhabib, Goldie Nejat

Rapid advancements in foundation models, including Large Language Models, Vision-Language Models, Multimodal Large Language Models, and Vision-Language-Action Models, have opened new avenues for embodied AI in mobile service robotics. By combining foundation models with the principles of embodied AI, where intelligent systems perceive, reason, and act through physical interactions, robots can better understand, adapt to, and execute complex tasks in dynamic real-world environments. However, embodied AI in mobile service robots continues to face key challenges, including multimodal sensor fusion, real-time decision-making under uncertainty, task generalization, and effective human-robot interactions (HRI). In this paper, we present the first systematic review of the integration of foundation models in mobile service robotics, identifying key open challenges in embodied AI and examining how foundation models can address them. Namely, we explore the role of such models in enabling real-time sensor fusion, language-conditioned control, and adaptive task execution. Furthermore, we discuss real-world applications in the domestic assistance, healthcare, and service automation sectors, demonstrating the transformative impact of foundation models on service robotics. We also include potential future research directions, emphasizing the need for predictive scaling laws, autonomous long-term adaptation, and cross-embodiment generalization to enable scalable, efficient, and robust deployment of foundation models in human-centric robotic systems.

Article 1228

Title@2025-05-26 (1): Retrieve to Explain: Evidence-driven Predictions for Explainable Drug Target Identification

Title: Retrieve to Explain: Evidence-driven Predictions for Explainable Drug Target Identification

Erklären Sie: Evidenz-getriebene Vorhersagen für erklärbare Drogenziel-Identifikation

寻求解释:对可解释药物目标识别的由证据驱动的预测 2402.04068v4

Authors: Ravi Patel, Angus Brayne, Rogier Hintzen, Daniel Jaroslawicz, Georgiana Neculae, Dane Corneil

Language models hold incredible promise for enabling scientific discovery by synthesizing massive research corpora. Many complex scientific research questions have multiple plausible answers, each supported by evidence of varying strength. However, existing language models lack the capability to quantitatively and faithfully compare answer plausibility in terms of supporting evidence. To address this, we introduce Retrieve to Explain (R2E), a retrieval-based model that scores and ranks all possible answers to a research question based on evidence retrieved from a document corpus. The architecture represents each answer only in terms of its supporting evidence, with the answer itself masked. This allows us to extend feature attribution methods such as Shapley values, to transparently attribute answer scores to supporting evidence at inference time. The architecture also allows incorporation of new evidence without retraining, including non-textual data modalities templated into natural language. We developed R2E for the challenging scientific discovery task of drug target identification, a human-in-the-loop process where failures are extremely costly and explainability paramount. When predicting whether drug targets will subsequently be confirmed as efficacious in clinical trials, R2E not only matches non-explainable literature-based models but also surpasses a genetics-based target identification approach used throughout the pharmaceutical industry.

Article 1229

Title@2025-05-26 (1): CLEVRER-Humans: Describing Physical and Causal Events the Human Way

Title: CLEVRER-Humans: Describing Physical and Causal Events the Human Way

CLEVRER-Mensch: Physikalische und kausale Ereignisse auf menschliche Weise beschreiben

CLEVRER-人类:将自然和因果事件描述为人类道路 2310.03635v2

Authors: Jiayuan Mao, Xuelin Yang, Xikun Zhang, Noah D. Goodman, Jiajun Wu

Building machines that can reason about physical events and their causal relationships is crucial for flexible interaction with the physical world. However, most existing physical and causal reasoning benchmarks are exclusively based on synthetically generated events and synthetic natural language descriptions of causal relationships. This design brings up two issues. First, there is a lack of diversity in both event types and natural language descriptions; second, causal relationships based on manually-defined heuristics are different from human judgments. To address both shortcomings, we present the CLEVRER-Humans benchmark, a video reasoning dataset for causal judgment of physical events with human labels. We employ two techniques to improve data collection efficiency: first, a novel iterative event cloze task to elicit a new representation of events in videos, which we term Causal Event Graphs (CEGs); second, a data augmentation technique based on neural language generative models. We convert the collected CEGs into questions and answers to be consistent with prior work. Finally, we study a collection of baseline approaches for CLEVRER-Humans question-answering, highlighting the great challenges set forth by our benchmark.

Article 1230

Title@2025-05-26 (1): Distributionally Robust Optimization

Title: Distributionally Robust Optimization

Verteilungsstarke Optimierung

分布强力优化 2411.02549v3

Authors: Daniel Kuhn, Soroosh Shafiee, Wolfram Wiesemann

Distributionally robust optimization (DRO) studies decision problems under uncertainty where the probability distribution governing the uncertain problem parameters is itself uncertain. A key component of any DRO model is its ambiguity set, that is, a family of probability distributions consistent with any available structural or statistical information. DRO seeks decisions that perform best under the worst distribution in the ambiguity set. This worst-case criterion is supported by findings in psychology and neuroscience, which indicate that many decision-makers have a low tolerance for distributional ambiguity. DRO is rooted in statistics, operations research and control theory, and recent research has uncovered its deep connections to regularization techniques and adversarial training in machine learning. This survey presents the key findings of the field in a unified and self-contained manner.
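The min-max structure described above can be sketched on a toy problem with finitely many decisions, scenarios, and candidate distributions. The decision names, loss values, and the two-element ambiguity set are all made up for illustration; real ambiguity sets (for example Wasserstein balls) are usually infinite and need dedicated solvers.

```python
def expected_loss(losses, distribution):
    # Expected loss of one decision under one scenario distribution.
    return sum(p * l for p, l in zip(distribution, losses))


def worst_case_loss(losses, ambiguity_set):
    # Largest expected loss over all distributions in the ambiguity set.
    return max(expected_loss(losses, dist) for dist in ambiguity_set)


def dro_decision(loss_table, ambiguity_set):
    # Decision minimizing the worst-case expected loss (min over max).
    return min(loss_table,
               key=lambda name: worst_case_loss(loss_table[name], ambiguity_set))


# Hypothetical losses per scenario (scenario 0, scenario 1).
loss_table = {
    "aggressive": [0.0, 4.0],
    "cautious": [1.5, 1.5],
}
nominal = (0.8, 0.2)                 # single estimated distribution
ambiguity_set = [nominal, (0.5, 0.5)]  # estimates deemed plausible

nominal_choice = min(loss_table,
                     key=lambda name: expected_loss(loss_table[name], nominal))
robust_choice = dro_decision(loss_table, ambiguity_set)
print("nominal choice:", nominal_choice)
print("robust choice: ", robust_choice)
```

Trusting the nominal distribution alone selects the aggressive decision, while hedging against the second plausible distribution selects the cautious one.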

Article 1231

Title@2025-05-26 (1): Avoid Forgetting by Preserving Global Knowledge Gradients in Federated Learning with Non-IID Data

Title: Avoid Forgetting by Preserving Global Knowledge Gradients in Federated Learning with Non-IID Data

Vermeiden Sie das Vergessen, indem Sie globale Wissensgradienten im Föderierten Lernen mit nicht-ID-Daten bewahren

避免在使用非二二二维数据进行联邦学习时因保留全球知识进步而被遗忘 2505.20485v1

Authors: Abhijit Chunduru, Majid Morafah, Mahdi Morafah, Vishnu Pandi Chellapandi, Ang Li

The inevitable presence of data heterogeneity has made federated learning very challenging. There are numerous methods to deal with this issue, such as local regularization, better model fusion techniques, and data sharing. Though effective, they lack a deep understanding of how data heterogeneity can affect the global decision boundary. In this paper, we bridge this gap by performing an experimental analysis of the learned decision boundary using a toy example. Our observations are surprising: (1) existing methods suffer from forgetting, in that clients forget the global decision boundary and learn only the perfect local one, and (2) this happens regardless of the initial weights: clients forget the global decision boundary even when starting from pre-trained optimal weights. We present FedProj, a federated learning framework that robustly learns the global decision boundary and avoids forgetting it during local training. To achieve better ensemble knowledge fusion, we design a novel server-side ensemble knowledge transfer loss to further calibrate the learned global decision boundary. To alleviate forgetting of the learned global decision boundary, we further propose leveraging an episodic memory of average ensemble logits on a public unlabeled dataset to regulate the gradient updates at each step of local training. Experimental results demonstrate that FedProj outperforms state-of-the-art methods by a large margin.

Article 1232

Title@2025-05-26 (1): Towards Efficient Training of Graph Neural Networks: A Multiscale Approach

Title: Towards Efficient Training of Graph Neural Networks: A Multiscale Approach

Auf dem Weg zu einer effizienten Ausbildung von Graphen-Neuralen Netzwerken: Ein multiskaliger Ansatz

争取对图形神经网络进行有效培训:一种多部门办法 2503.19666v3

Authors: Eshed Gal, Moshe Eliasof, Carola-Bibiane Schönlieb, Ivan I. Kyrchei, Eldad Haber, Eran Treister

Graph Neural Networks (GNNs) have become powerful tools for learning from graph-structured data, finding applications across diverse domains. However, as graph sizes and connectivity increase, standard GNN training methods face significant computational and memory challenges, limiting their scalability and efficiency. In this paper, we present a novel framework for efficient multiscale training of GNNs. Our approach leverages hierarchical graph representations and subgraphs, enabling the integration of information across multiple scales and resolutions. By utilizing coarser graph abstractions and subgraphs, each with fewer nodes and edges, we significantly reduce computational overhead during training. Building on this framework, we propose a suite of scalable training strategies, including coarse-to-fine learning, subgraph-to-full-graph transfer, and multiscale gradient computation. We also provide some theoretical analysis of our methods and demonstrate their effectiveness across various datasets and learning tasks. Our results show that multiscale training can substantially accelerate GNN training for large scale problems while maintaining, or even improving, predictive performance.

Article 1233

Title@2025-05-26 (1): CardioPatternFormer: Pattern-Guided Attention for Interpretable ECG Classification with Transformer Architecture

Title: CardioPatternFormer: Pattern-Guided Attention for Interpretable ECG Classification with Transformer Architecture

CardioPatternFormer: Mustergeführte Aufmerksamkeit für die Interpretierbare EKG-Klassifikation mit Transformer-Architektur

卡尔迪·皮德·皮德罗·弗德:对具有变形结构的可解释的ECG分类的典型引导关注 2505.20481v1

Authors: Berat Kutay Uğraş, Ömer Nezih Gerek, İbrahim Talha Saygı

Accurate ECG interpretation is vital, yet complex cardiac data and “black-box” AI models limit clinical utility. Inspired by Transformer architectures’ success in NLP for understanding sequential data, we frame ECG as the heart’s unique “language” of temporal patterns. We present CardioPatternFormer, a novel Transformer-based model for interpretable ECG classification. It employs a sophisticated attention mechanism to precisely identify and classify diverse cardiac patterns, excelling at discerning subtle anomalies and distinguishing multiple co-occurring conditions. This pattern-guided attention provides clear insights by highlighting influential signal regions, effectively allowing the “heart to talk” through transparent interpretations. CardioPatternFormer demonstrates robust performance on challenging ECGs, including complex multi-pathology cases. Its interpretability via attention maps enables clinicians to understand the model’s rationale, fostering trust and aiding informed diagnostic decisions. This work offers a powerful, transparent solution for advanced ECG analysis, paving the way for more reliable and clinically actionable AI in cardiology.

Article 1234

Title@2025-05-26 (1): Leveraging Sparsity for Sample-Efficient Preference Learning: A Theoretical Perspective

Title: Leveraging Sparsity for Sample-Efficient Preference Learning: A Theoretical Perspective

Sparsamkeit für stichprobeneffizientes Preference-Lernen: Eine theoretische Perspektive

利用差距促进抽样有效优先学习:理论视角 2501.18282v3

Authors: Yunzhen Yao, Lie He, Michael Gastpar

This paper considers the sample-efficiency of preference learning, which models and predicts human choices based on comparative judgments. The minimax optimal estimation error rate $\Theta(d/n)$ in classical estimation theory requires that the number of samples $n$ scales linearly with the dimensionality of the feature space $d$. However, the high dimensionality of the feature space and the high cost of collecting human-annotated data challenge the efficiency of traditional estimation methods. To remedy this, we leverage sparsity in the preference model and establish sharp error rates. We show that under the sparse random utility model, where the parameter of the reward function is $k$-sparse, the minimax optimal rate can be reduced to $\Theta(k/n \log(d/k))$. Furthermore, we analyze the $\ell_{1}$-regularized estimator and show that it achieves near-optimal rate under mild assumptions on the Gram matrix. Experiments on synthetic data and LLM alignment data validate our theoretical findings, showing that sparsity-aware methods significantly reduce sample complexity and improve prediction accuracy.
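To get a feel for the gap between the two rates, the short sketch below evaluates $d/n$ and $(k/n)\log(d/k)$ for one hypothetical problem size. It reads the sparse rate as $(k/n)\log(d/k)$, ignores the constants hidden by $\Theta(\cdot)$, and uses made-up values of $d$, $k$, and $n$, so the numbers are only indicative of scaling.

```python
import math


def dense_rate(d, n):
    # Classical minimax rate d / n, up to constants.
    return d / n


def sparse_rate(d, k, n):
    # Sparse minimax rate (k / n) * log(d / k), up to constants.
    return (k / n) * math.log(d / k)


# Hypothetical problem size: 10,000 features, 10 active, 1,000 comparisons.
d, k, n = 10_000, 10, 1_000
dense = dense_rate(d, n)
sparse = sparse_rate(d, k, n)
print(f"dense rate  ~ {dense:.4f}")
print(f"sparse rate ~ {sparse:.4f}")
print(f"ratio       ~ {dense / sparse:.1f}x")
```

With these values the dense rate exceeds 1, meaning $n$ is too small for the classical bound to be informative, while the sparse rate is already well below 1.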

Article 1235

Title@2025-05-26 (1): From learnable objects to learnable random objects

Title: From learnable objects to learnable random objects

Von lernbaren Objekten zu lernbaren zufälligen Objekten

从可学习对象到可学习随机对象 2504.00847v2

Authors: Aaron Anderson, Michael Benedikt

We consider the relationship between learnability of a “base class” of functions on a set $X$, and learnability of a class of statistical functions derived from the base class. For example, we refine results showing that learnability of a family $h_p: p \in Y$ of functions implies learnability of the family of functions $h_\mu=\lambda p: Y. E_\mu(h_p)$, where $E_\mu$ is the expectation with respect to $\mu$, and $\mu$ ranges over probability distributions on $X$. We will look at both Probably Approximately Correct (PAC) learning, where example inputs and outputs are chosen at random, and online learning, where the examples are chosen adversarially. For agnostic learning, we establish improved bounds on the sample complexity of learning for statistical classes, stated in terms of combinatorial dimensions of the base class. We connect these problems to techniques introduced in model theory for “randomizing a structure”. We also provide counterexamples for realizable learning, in both the PAC and online settings.

Article 1236

Title@2025-05-26 (1): Stochastic Preconditioning for Neural Field Optimization

Title: Stochastic Preconditioning for Neural Field Optimization

Stochastische Vorkonditionierung für die Neuralfeldoptimierung

神经场优化的斯托克预设设备 2505.20473v1

Authors: Selena Ling, Merlin Nimier-David, Alec Jacobson, Nicholas Sharp

Neural fields are a highly effective representation across visual computing. This work observes that fitting these fields is greatly improved by incorporating spatial stochasticity during training, and that this simple technique can replace or even outperform custom-designed hierarchies and frequency space constructions. The approach is formalized as implicitly operating on a blurred version of the field, evaluated in-expectation by sampling with Gaussian-distributed offsets. Querying the blurred field during optimization greatly improves convergence and robustness, akin to the role of preconditioners in numerical linear algebra. This implicit, sampling-based perspective fits naturally into the neural field paradigm, comes at no additional cost, and is extremely simple to implement. We describe the basic theory of this technique, including details such as handling boundary conditions, and extending to a spatially-varying blur. Experiments demonstrate this approach on representations including coordinate MLPs, neural hashgrids, triplanes, and more, across tasks including surface reconstruction and radiance fields. In settings where custom-designed hierarchies have already been developed, stochastic preconditioning nearly matches or improves their performance with a simple and unified approach; in settings without existing hierarchies it provides an immediate boost to quality and robustness.
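The claim that sampling with Gaussian-distributed offsets evaluates a blurred field in expectation can be checked on a one-dimensional toy field. In the sketch below the neural field is replaced by $f(x)=\sin(\omega x)$ (an assumption, chosen because its Gaussian blur has the closed form $\sin(\omega x)\,e^{-\omega^2\sigma^2/2}$); the function names and parameter values are hypothetical.

```python
import math
import random


def field(x, omega=5.0):
    # Stand-in for a neural field (assumption): a single sinusoid.
    return math.sin(omega * x)


def blurred_field_mc(x, sigma, n_samples, rng):
    # Monte Carlo estimate of E[field(x + eps)], eps ~ N(0, sigma^2).
    total = 0.0
    for _ in range(n_samples):
        total += field(x + rng.gauss(0.0, sigma))
    return total / n_samples


def blurred_field_exact(x, sigma, omega=5.0):
    # Closed-form Gaussian blur of sin(omega * x).
    return math.sin(omega * x) * math.exp(-0.5 * (omega * sigma) ** 2)


rng = random.Random(0)
estimate = blurred_field_mc(0.2, 0.3, 20_000, rng)
exact = blurred_field_exact(0.2, 0.3)
print("Monte Carlo estimate:", estimate)
print("closed-form blur:    ", exact)
```

During optimization one would typically draw a single offset per query and let the averaging happen across training iterations; the large sample count here is only for verifying the expectation.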

Article 1237

Title@2025-05-26 (1): WeatherEdit: Controllable Weather Editing with 4D Gaussian Field

Title: WeatherEdit: Controllable Weather Editing with 4D Gaussian Field

WeatherEdit: Kontrollierbare Wetterbearbeitung mit 4D Gaussian Field

气象编辑: 4D Gaussian 字段的可控天气编辑 2505.20471v1

Authors: Chenghao Qian, Wenjing Li, Yuhu Guo, Gustav Markkula

In this work, we present WeatherEdit, a novel weather editing pipeline for generating realistic weather effects with controllable types and severity in 3D scenes. Our approach is structured into two key components: weather background editing and weather particle construction. For weather background editing, we introduce an all-in-one adapter that integrates multiple weather styles into a single pretrained diffusion model, enabling the generation of diverse weather effects in 2D image backgrounds. During inference, we design a Temporal-View (TV-) attention mechanism that follows a specific order to aggregate temporal and spatial information, ensuring consistent editing across multi-frame and multi-view images. To construct the weather particles, we first reconstruct a 3D scene using the edited images and then introduce a dynamic 4D Gaussian field to generate snowflakes, raindrops and fog in the scene. The attributes and dynamics of these particles are precisely controlled through physics-based modelling and simulation, ensuring realistic weather representation and flexible severity adjustments. Finally, we integrate the 4D Gaussian field with the 3D scene to render consistent and highly realistic weather effects. Experiments on multiple driving datasets demonstrate that WeatherEdit can generate diverse weather effects with controllable condition severity, highlighting its potential for autonomous driving simulation in adverse weather. See project page: https://jumponthemoon.github.io/w-edit

Article 1238

Title@2025-05-26 (1): Recursive Deep Inverse Reinforcement Learning

Title: Recursive Deep Inverse Reinforcement Learning

Rekursives tiefes Inverse-Verstärkung-Lernen

递归深反向强化学习 2504.13241v4

Authors: Paul Ghanem, Owen Howell, Michael Potter, Pau Closas, Alireza Ramezani, Deniz Erdogmus, Tales Imbiriba

Inferring an adversary’s goals from exhibited behavior is crucial for counterplanning and non-cooperative multi-agent systems in domains like cybersecurity, military, and strategy games. Deep Inverse Reinforcement Learning (IRL) methods based on maximum entropy principles show promise in recovering adversaries’ goals but are typically offline, require large batch sizes with gradient descent, and rely on first-order updates, limiting their applicability in real-time scenarios. We propose an online Recursive Deep Inverse Reinforcement Learning (RDIRL) approach to recover the cost function governing the adversary actions and goals. Specifically, we minimize an upper bound on the standard Guided Cost Learning (GCL) objective using sequential second-order Newton updates, akin to the Extended Kalman Filter (EKF), leading to a fast (in terms of convergence) learning algorithm. We demonstrate that RDIRL is able to recover cost and reward functions of expert agents in standard and adversarial benchmark tasks. Experiments on benchmark tasks show that our proposed approach outperforms several leading IRL algorithms.

Article 1239

Title@2025-05-26 (1): Learning with Expected Signatures: Theory and Applications

Title: Learning with Expected Signatures: Theory and Applications

Lernen mit erwarteten Signaturen: Theorie und Anwendungen

学习与预期签名:理论和应用 2505.20465v1

Authors: Lorenzo Lucchese, Mikko S. Pakkanen, Almut E. D. Veraart

The expected signature maps a collection of data streams to a lower dimensional representation, with a remarkable property: the resulting feature tensor can fully characterize the data generating distribution. This “model-free” embedding has been successfully leveraged to build multiple domain-agnostic machine learning (ML) algorithms for time series and sequential data. The convergence results proved in this paper bridge the gap between the expected signature’s empirical discrete-time estimator and its theoretical continuous-time value, allowing for a more complete probabilistic interpretation of expected signature-based ML methods. Moreover, when the data generating process is a martingale, we suggest a simple modification of the expected signature estimator with significantly lower mean squared error and empirically demonstrate how it can be effectively applied to improve predictive performance.
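A discrete-time estimator of this kind can be sketched as follows: compute the signature of each observed piecewise-linear path, truncated here at level 2, and average over paths. This is a generic illustration, not the paper's estimator or its martingale-corrected variant; the truncation level, function names, and the two example paths are assumptions. The level-2 update follows Chen's identity for concatenating linear segments.

```python
def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    `path` is a list of points (tuples of equal length d).
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for prev, cur in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, cur)]
        # Chen's identity: cross term with the signature so far, plus the
        # level-2 signature of the new linear segment (delta x delta / 2).
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            s1[i] += delta[i]
    return s1, s2


def expected_signature(paths):
    # Empirical mean of the truncated signatures over sample paths.
    n = len(paths)
    d = len(paths[0][0])
    mean1 = [0.0] * d
    mean2 = [[0.0] * d for _ in range(d)]
    for path in paths:
        s1, s2 = signature_level2(path)
        for i in range(d):
            mean1[i] += s1[i] / n
            for j in range(d):
                mean2[i][j] += s2[i][j] / n
    return mean1, mean2


# Two hypothetical paths with the same endpoints but opposite orientation.
right_then_up = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
up_then_right = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
mean1, mean2 = expected_signature([right_then_up, up_then_right])
print("level 1:", mean1)
print("level 2:", mean2)
```

The two paths share the same level-1 terms but differ in the antisymmetric part of level 2 (the signed area), which cancels in the average.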

Article 1240

Title@2025-05-26 (1): Federated Learning-Distillation Alternation for Resource-Constrained IoT

Title: Federated Learning-Distillation Alternation for Resource-Constrained IoT

Federated Learning-Destillation Alternative für ressourcengebundenes IoT

资源培训型IOT 资源培训型IOT替代物 2505.20456v1

Authors: Rafael Valente da Silva, Onel L. Alcaraz López, Richard Demo Souza

Federated learning (FL) faces significant challenges in Internet of Things (IoT) networks due to device limitations in energy and communication resources, especially when considering the large size of FL models. From an energy perspective, the challenge is aggravated if devices rely on energy harvesting (EH), as energy availability can vary significantly over time, influencing the average number of participating users in each iteration. Additionally, the transmission of large model updates is more susceptible to interference from uncorrelated background traffic in shared wireless environments. As an alternative, federated distillation (FD) reduces communication overhead and energy consumption by transmitting local model outputs, which are typically much smaller than the entire model used in FL. However, this comes at the cost of reduced model accuracy. Therefore, in this paper, we propose FL-distillation alternation (FLDA). In FLDA, devices alternate between FD and FL phases, balancing model information with lower communication overhead and energy consumption per iteration. We consider a multichannel slotted-ALOHA EH-IoT network subject to background traffic/interference. In such a scenario, FLDA demonstrates higher model accuracy than both FL and FD, and achieves faster convergence than FL. Moreover, FLDA achieves target accuracies saving up to 98% in energy consumption, while also being less sensitive to interference, both relative to FL.

Article 1241

Title@2025-05-26 (1): Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection

Title: Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection

Skalierungsgesetze für das Vergessen beim Finetuning mit Vorschulungs-Dateninjektion

调整前数据输入时遗忘法律的扩大范围 2502.06042v2

Authors: Louis Bethune, David Grangier, Dan Busbridge, Eleonora Gualdoni, Marco Cuturi, Pierre Ablin

A widespread strategy to obtain a language model that performs well on a target domain is to finetune a pretrained model to perform unsupervised next-token prediction on data from that target domain. Finetuning presents two challenges: (i) if the amount of target data is limited, as in most practical applications, the model will quickly overfit, and (ii) the model will drift away from the original model, forgetting the pretraining data and the generic knowledge that comes with it. We aim to derive scaling laws that quantify these two phenomena for various target domains, amounts of available target data, and model scales. We measure the efficiency of injecting pretraining data into the finetuning data mixture to avoid forgetting and mitigate overfitting. A key practical takeaway from our study is that injecting as little as 1% of pretraining data in the finetuning data mixture prevents the model from forgetting the pretraining set.
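Injecting a small share of pretraining data into the finetuning mixture amounts to a per-example choice of data source. The sketch below shows one simple way to do this, by sampling each example from the pretraining pool with probability 0.01; the sampler, its name, and the placeholder datasets are assumptions and are not taken from the paper's training setup.

```python
import random


def mixed_stream(target_data, pretrain_data, pretrain_fraction, n_examples, rng):
    # Draw each example from the pretraining pool with probability
    # `pretrain_fraction`, otherwise from the target-domain pool.
    batch = []
    for _ in range(n_examples):
        source = pretrain_data if rng.random() < pretrain_fraction else target_data
        batch.append(rng.choice(source))
    return batch


# Placeholder datasets (hypothetical): tagged tuples instead of token sequences.
target_data = [("target", i) for i in range(500)]
pretrain_data = [("pretrain", i) for i in range(5000)]

rng = random.Random(0)
stream = mixed_stream(target_data, pretrain_data, 0.01, 100_000, rng)
observed = sum(1 for tag, _ in stream if tag == "pretrain") / len(stream)
print(f"observed pretraining share: {observed:.4f}")
```

Sampling per example keeps the mixture ratio correct in expectation for any batch size; a deterministic interleaving schedule would be an equally valid design.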

Article 1242

Title@2025-05-26 (1): BlastOFormer: Attention and Neural Operator Deep Learning Methods for Explosive Blast Prediction

Title: BlastOFormer: Attention and Neural Operator Deep Learning Methods for Explosive Blast Prediction

BlastOFormer: Aufmerksamkeit und neuraler Operator Deep Learning Methoden zur explosiven Blast-Vorhersage

BLastO Former: 爆炸性爆炸预测的注意和神经操作员深学习方法 2505.20454v1

Authors: Reid Graves, Anthony Zhou, Amir Barati Farimani

Accurate prediction of blast pressure fields is essential for applications in structural safety, defense planning, and hazard mitigation. Traditional methods such as empirical models and computational fluid dynamics (CFD) simulations offer limited trade-offs between speed and accuracy; empirical models fail to capture complex interactions in cluttered environments, while CFD simulations are computationally expensive and time-consuming. In this work, we introduce BlastOFormer, a novel Transformer-based surrogate model for full-field maximum pressure prediction from arbitrary obstacle and charge configurations. BlastOFormer leverages a signed distance function (SDF) encoding and a grid-to-grid attention-based architecture inspired by OFormer and Vision Transformer (ViT) frameworks. Trained on a dataset generated using the open-source blastFoam CFD solver, our model outperforms convolutional neural networks (CNNs) and Fourier Neural Operators (FNOs) across both log-transformed and unscaled domains. Quantitatively, BlastOFormer achieves the highest $R^2$ score (0.9516) and lowest error metrics, while requiring only 6.4 milliseconds for inference, more than 600,000 times faster than CFD simulations. Qualitative visualizations and error analyses further confirm BlastOFormer’s superior spatial coherence and generalization capabilities. These results highlight its potential as a real-time alternative to conventional CFD approaches for blast pressure estimation in complex environments.
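A signed distance function encoding turns an obstacle layout into a dense grid that a network can consume. The sketch below rasterizes the SDF of a single axis-aligned rectangular obstacle; the rectangular shape, the sign convention (negative inside, positive outside), the cell-center sampling, and all names are assumptions for illustration and do not describe the BlastOFormer preprocessing.

```python
import math


def box_sdf(px, py, cx, cy, hx, hy):
    # Signed distance from point (px, py) to an axis-aligned box centered at
    # (cx, cy) with half-extents (hx, hy); negative inside (assumed convention).
    qx = abs(px - cx) - hx
    qy = abs(py - cy) - hy
    outside = math.hypot(max(qx, 0.0), max(qy, 0.0))
    inside = min(max(qx, qy), 0.0)
    return outside + inside


def sdf_grid(nx, ny, cx, cy, hx, hy):
    # Evaluate the SDF at the center of every cell of an nx-by-ny unit grid.
    return [[box_sdf(i + 0.5, j + 0.5, cx, cy, hx, hy) for i in range(nx)]
            for j in range(ny)]


# Hypothetical 8x8 domain with a 2x2 obstacle centered at (4, 4).
grid = sdf_grid(8, 8, 4.0, 4.0, 1.0, 1.0)
for row in grid:
    print(" ".join(f"{value:5.2f}" for value in row))
```

Several obstacles could be combined by taking the pointwise minimum of their individual SDF grids.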

Article 1243

Title@2025-05-26 (1): Active Learning for Multiple Change Point Detection in Non-stationary Time Series with Deep Gaussian Processes

Title: Active Learning for Multiple Change Point Detection in Non-stationary Time Series with Deep Gaussian Processes

Aktives Lernen für Multiple Change Point Detection in nicht-stationären Zeitreihen mit tiefen Gauß-Prozessen

与深高斯进程一起在非静止时间序列中进行多变点探测活动学习 2505.20452v1

Authors: Hao Zhao, Rong Pan

Multiple change point (MCP) detection in non-stationary time series is challenging due to the variety of underlying patterns. To address these challenges, we propose a novel algorithm that integrates Active Learning (AL) with Deep Gaussian Processes (DGPs) for robust MCP detection. Our method leverages spectral analysis to identify potential changes and employs AL to strategically select new sampling points for improved efficiency. By incorporating the modeling flexibility of DGPs with the change-identification capabilities of spectral methods, our approach adapts to diverse spectral change behaviors and effectively localizes multiple change points. Experiments on both simulated and real-world data demonstrate that our method outperforms existing techniques in terms of detection accuracy and sampling efficiency for non-stationary time series.

Article 1244

Title@2025-05-26 (1): Symmetry constrained neural networks for detection and localization of damage in metal plates

Title: Symmetry constrained neural networks for detection and localization of damage in metal plates

Symmetrie eingeschränkte neuronale Netze zur Erkennung und Lokalisierung von Schäden in Metallplatten

用于金属板块损害探测和定位的对称约束神经网络 2409.06084v3

Authors: James Amarel, Christopher Rudolf, Athanasios Iliopoulos, John Michopoulos, Leslie N. Smith

The present paper is concerned with deep learning techniques applied to detection and localization of damage in a thin aluminum plate. We used data collected on a tabletop apparatus in which four piezoelectric transducers were mounted to the plate; each took a turn generating a Lamb wave that traversed the region of interest before being received by the remaining three sensors. By training a neural network to analyze time-series data of the material response, which displayed damage-reflective features whenever the plate-guided waves interacted with a contact load, we obtained a model that detected damage with greater than $99\%$ accuracy and a model that localized it with $2.58 \pm 0.12$ mm mean distance error. For each task, the best-performing model was designed according to the inductive bias that our transducers were both similar and arranged in a square pattern on a nearly uniform plate.

Article 1245

Title@2025-05-26 (1): Time Series Generation Under Data Scarcity: A Unified Generative Modeling Approach

Title: Time Series Generation Under Data Scarcity: A Unified Generative Modeling Approach

Zeitreihenerstellung unter Datenknappheit: Ein einheitlicher generativer Modellierungsansatz

数据缺乏情况下的时间序列生成:统一生成模式方法 2505.20446v1

Authors: Tal Gonen, Itai Pemper, Ilan Naiman, Nimrod Berman, Omri Azencot

Generative modeling of time series is a central challenge in time series analysis, particularly under data-scarce conditions. Despite recent advances in generative modeling, a comprehensive understanding of how state-of-the-art generative models perform under limited supervision remains lacking. In this work, we conduct the first large-scale study evaluating leading generative models in data-scarce settings, revealing a substantial performance gap between full-data and data-scarce regimes. To close this gap, we propose a unified diffusion-based generative framework that can synthesize high-fidelity time series across diverse domains using just a few examples. Our model is pre-trained on a large, heterogeneous collection of time series datasets, enabling it to learn generalizable temporal representations. It further incorporates architectural innovations such as dynamic convolutional layers for flexible channel adaptation and dataset token conditioning for domain-aware generation. Without requiring abundant supervision, our unified model achieves state-of-the-art performance in few-shot settings, outperforming domain-specific baselines across a wide range of subset sizes. Remarkably, it also surpasses all baselines even when tested on full-dataset benchmarks, highlighting the strength of pre-training and cross-domain generalization. We hope this work encourages the community to revisit few-shot generative modeling as a key problem in time series research and pursue unified solutions that scale efficiently across domains. Code is available at https://github.com/azencot-group/ImagenFew.

Article 1246

Title@2025-05-26 (1): HoPE: Hybrid of Position Embedding for Length Generalization in Vision-Language Models

Title: HoPE: Hybrid of Position Embedding for Length Generalization in Vision-Language Models

HoPE: Hybrid der Positionseinbettung für die Längenverallgemeinerung in Vision-Language-Modelle

HoPE:愿景-语言模型中长期通用化所嵌入的立场组合 2505.20444v1

Authors: Haoran Li, Yingjie Qin, Baoyuan Ou, Lai Xu, Ruiwen Xu

Vision-Language Models (VLMs) have made significant progress in multimodal tasks. However, their performance often deteriorates in long-context scenarios, particularly long videos. While Rotary Position Embedding (RoPE) has been widely adopted for length generalization in Large Language Models (LLMs), extending vanilla RoPE to capture the intricate spatial-temporal dependencies in videos remains an unsolved challenge. Existing methods typically allocate different frequencies within RoPE to encode 3D positional information. However, these allocation strategies mainly rely on heuristics, lacking in-depth theoretical analysis. In this paper, we first study how different allocation strategies impact the long-context capabilities of VLMs. Our analysis reveals that current multimodal RoPEs fail to reliably capture semantic similarities over extended contexts. To address this issue, we propose HoPE, a Hybrid of Position Embedding designed to improve the long-context capabilities of VLMs. HoPE introduces a hybrid frequency allocation strategy for reliable semantic modeling over arbitrarily long context, and a dynamic temporal scaling mechanism to facilitate robust learning and flexible inference across diverse context lengths. Extensive experiments across four video benchmarks on long video understanding and retrieval tasks demonstrate that HoPE consistently outperforms existing methods, confirming its effectiveness. Code is available at https://github.com/hrlics/HoPE.
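
The abstract builds on vanilla RoPE, so a small sketch of that baseline may help. The code below is standard one-dimensional RoPE, not HoPE itself; the helper names `rope` and `dot` and the toy vectors are assumptions made for illustration. It shows the property that makes RoPE attractive for length generalization: the attention score depends only on the relative offset between positions.

```python
import math

def rope(vec, pos, base=10000.0):
    """Apply vanilla rotary position embedding to an even-length vector."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)   # rotation angle for this 2-D pair
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query/key vectors.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Same offset (2) at different absolute positions gives the same score.
near = dot(rope(q, 3), rope(k, 1))
far = dot(rope(q, 103), rope(k, 101))
```

Multimodal variants of the kind the abstract discusses split these frequency pairs among temporal and spatial axes; how HoPE allocates them is described in the paper and is not reproduced here.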

Article 1247

Title@2025-05-26 (1): AI Learning Algorithms: Deep Learning, Hybrid Models, and Large-Scale Model Integration

Title: AI Learning Algorithms: Deep Learning, Hybrid Models, and Large-Scale Model Integration

KI-Learning-Algorithmen: Deep Learning, hybride Modelle und großformatige Modellintegration

AI 学习等级:深学习、混合模型和大型模型整合 2410.09186v3

Authors: Noorbakhsh Amiri Golilarz, Elias Hossain, Abdoljalil Addeh, Keyan Alexander Rahimi

In this paper, we discuss learning algorithms and their importance in different types of applications, which include training to identify important patterns and features, in a straightforward, easy-to-understand manner. We review the main concepts of artificial intelligence (AI), machine learning (ML), deep learning (DL), and hybrid models. Some important subsets of machine learning algorithms, such as supervised, unsupervised, and reinforcement learning, are also discussed. These techniques can be used for important tasks like prediction, classification, and segmentation. Convolutional Neural Networks (CNNs) are used for image and video processing and many other applications. We dive into the architecture of CNNs and how to integrate CNNs with ML algorithms to build hybrid models. This paper explores the vulnerability of learning algorithms to noise, which leads to misclassification. We further discuss the integration of learning algorithms with Large Language Models (LLMs) to generate coherent responses applicable to many domains such as healthcare, marketing, and finance by learning important patterns from large volumes of data. Furthermore, we discuss the next generation of learning algorithms and how we may have a unified Adaptive and Dynamic Network to perform important tasks. Overall, this article provides a brief overview of learning algorithms, exploring their current state, applications, and future direction.

Article 1248

Title@2025-05-26 (1): Holes in Latent Space: Topological Signatures Under Adversarial Influence

Title: Holes in Latent Space: Topological Signatures Under Adversarial Influence

Löcher im latenten Raum: Topologische Signaturen unter dem Einfluss von Adversarien

低空空洞:在对立影响下的地形签名 2505.20435v1

Authors: Aideen Fay, Inés García-Redondo, Qiquan Wang, Haim Dubossarsky, Anthea Monod

Understanding how adversarial conditions affect language models requires techniques that capture both global structure and local detail within high-dimensional activation spaces. We propose persistent homology (PH), a tool from topological data analysis, to systematically characterize multiscale latent space dynamics in LLMs under two distinct attack modes – backdoor fine-tuning and indirect prompt injection. By analyzing six state-of-the-art LLMs, we show that adversarial conditions consistently compress latent topologies, reducing structural diversity at smaller scales while amplifying dominant features at coarser ones. These topological signatures are statistically robust across layers, architectures, and model sizes, and align with the emergence of adversarial effects deeper in the network. To capture finer-grained mechanisms underlying these shifts, we introduce a neuron-level PH framework that quantifies how information flows and transforms within and across layers. Together, our findings demonstrate that PH offers a principled and unifying approach to interpreting representational dynamics in LLMs, particularly under distributional shift.

Article 1249

Title@2025-05-26 (1): Kernel Quantile Embeddings and Associated Probability Metrics

Title: Kernel Quantile Embeddings and Associated Probability Metrics

Kernel-Quantile-Embeddings und zugehörige Wahrscheinlichkeits-Metriken

内核量量嵌入器及相关概率 2505.20433v1

Authors: Masha Naslidnyk, Siu Lun Chau, François-Xavier Briol, Krikamol Muandet

Embedding probability distributions into reproducing kernel Hilbert spaces (RKHS) has enabled powerful nonparametric methods such as the maximum mean discrepancy (MMD), a statistical distance with strong theoretical and computational properties. At its core, the MMD relies on kernel mean embeddings to represent distributions as mean functions in RKHS. However, it remains unclear if the mean function is the only meaningful RKHS representation. Inspired by generalised quantiles, we introduce the notion of kernel quantile embeddings (KQEs). We then use KQEs to construct a family of distances that: (i) are probability metrics under weaker kernel conditions than MMD; (ii) recover a kernelised form of the sliced Wasserstein distance; and (iii) can be efficiently estimated with near-linear cost. Through hypothesis testing, we show that these distances offer a competitive alternative to MMD and its fast approximations.
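
For readers unfamiliar with the baseline, the MMD that the abstract compares against can be sketched in a few lines. This is the textbook biased (V-statistic) estimator with a Gaussian kernel on one-dimensional samples, not the kernel quantile embedding method; the function names, bandwidth, and toy samples are assumptions. Note that this estimator costs quadratic time in the sample size, which is the cost the abstract's near-linear estimators aim to avoid.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Biased (V-statistic) estimate of squared MMD between two 1-D samples."""
    def mean_k(a, b):
        return sum(kernel(u, v) for u in a for v in b) / (len(a) * len(b))
    # ||mu_P - mu_Q||^2 in the RKHS, expanded into three kernel averages.
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)

# Hypothetical samples: identical sets versus a shifted set.
same = mmd_squared([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
shifted = mmd_squared([0.0, 0.5, 1.0], [3.0, 3.5, 4.0])
```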

Article 1250

Title@2025-05-26 (1): Differentiable Quadratic Optimization For The Maximum Independent Set Problem

Title: Differentiable Quadratic Optimization For The Maximum Independent Set Problem

Unterschiedliche quadratische Optimierung für das maximale unabhängige Set-Problem

最大独立集集问题可区别的二次二次曲线优化 2406.19532v6

Authors: Ismail Alkhouri, Cedric Le Denmat, Yingjie Li, Cunxi Yu, Jia Liu, Rongrong Wang, Alvaro Velasquez

Combinatorial Optimization (CO) addresses many important problems, including the challenging Maximum Independent Set (MIS) problem. Alongside exact and heuristic solvers, differentiable approaches have emerged, often using continuous relaxations of ReLU-based or quadratic objectives. Noting that an MIS in a graph is a Maximum Clique (MC) in its complement, we propose a new quadratic formulation for MIS by incorporating an MC term, improving convergence and exploration. We show that every maximal independent set corresponds to a local minimizer, derive conditions with respect to the MIS size, and characterize stationary points. To tackle the non-convexity of the objective, we propose optimizing several initializations in parallel using momentum-based gradient descent, complemented by an efficient MIS checking criterion derived from our theory. We dub our method parallelized Clique-Informed Quadratic Optimization for MIS (pCQO-MIS). Our experimental results demonstrate the effectiveness of the proposed method compared to exact, heuristic, sampling, and data-centric approaches. Notably, our method avoids the out-of-distribution tuning and reliance on (un)labeled data required by data-centric methods, while achieving superior MIS sizes and competitive runtime relative to their inference time. Additionally, a key advantage of pCQO-MIS is that, unlike exact and heuristic solvers, the runtime scales only with the number of nodes in the graph, not the number of edges. Our code is available at the GitHub repository: https://github.com/ledenmat/pCQO-mis-benchmark/tree/refactor.
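
The graph fact the formulation rests on, that an independent set in a graph is a clique in its complement, can be checked directly. The sketch below is only that combinatorial check plus a brute-force maximality test; it is not the pCQO-MIS objective or its checking criterion, and all helper names and the example graph are assumptions.

```python
from itertools import combinations

def complement(n, edges):
    """Edges of the complement of a simple graph on nodes 0..n-1."""
    present = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(range(n), 2)
            if frozenset((u, v)) not in present]

def is_independent(nodes, edges):
    s = set(nodes)
    return not any(u in s and v in s for u, v in edges)

def is_clique(nodes, edges):
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(nodes, 2))

def is_maximal_independent(nodes, n, edges):
    """Independent, and no outside node can be added without breaking that."""
    s = set(nodes)
    return is_independent(s, edges) and all(
        not is_independent(s | {v}, edges) for v in range(n) if v not in s)

# Hypothetical example: the 5-cycle 0-1-2-3-4-0.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
candidate = [0, 2]
```

On the 5-cycle, `candidate` is independent in the graph, a clique in `complement(5, edges)`, and maximal, since every remaining node is adjacent to node 0 or node 2.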

Article 1251

Title@2025-05-26 (1): Self-reflective Uncertainties: Do LLMs Know Their Internal Answer Distribution?

Title: Self-reflective Uncertainties: Do LLMs Know Their Internal Answer Distribution?

Selbstreflektierende Unsicherheiten: Kennen LLMs ihre interne Antwortverteilung?

自我反感的不确定性:LLMs知道他们的内部答案分布吗? 2505.20295v1

Authors: Michael Kirchhof, Luca Füger, Adam Goliński, Eeshan Gunesh Dhekane, Arno Blaas, Sinead Williamson

To reveal when a large language model (LLM) is uncertain about a response, uncertainty quantification commonly produces percentage numbers along with the output. But is this all we can do? We argue that in the output space of LLMs, the space of strings, there exist strings expressive enough to summarize the distribution over output strings the LLM deems possible. We lay a foundation for this new avenue of uncertainty explication and present SelfReflect, a theoretically-motivated metric to assess how faithfully a string summarizes an LLM’s internal answer distribution. We show that SelfReflect is able to discriminate even subtle differences between candidate summary strings and that it aligns with human judgement, outperforming alternative metrics such as LLM judges and embedding comparisons. With SelfReflect, we investigate a number of self-summarization methods and find that even state-of-the-art reasoning models struggle to explicate their internal uncertainty. But we find that faithful summarizations can be generated by sampling and summarizing. Our metric enables future work towards this universal form of LLM uncertainties.

Article 1252

Title@2025-05-26 (1): Reasoning LLMs are Wandering Solution Explorers

Title: Reasoning LLMs are Wandering Solution Explorers

Grundlegende LLMs sind wandernde Lösungs-Explorer

理据LLMs是游荡的解决方案探索者 2505.20296v1

Authors: Jiahao Lu, Ziwei Xu, Mohan Kankanhalli

Large Language Models (LLMs) have demonstrated impressive reasoning abilities through test-time computation (TTC) techniques such as chain-of-thought prompting and tree-based reasoning. However, we argue that current reasoning LLMs (RLLMs) lack the ability to systematically explore the solution space. This paper formalizes what constitutes systematic problem solving and identifies common failure modes that reveal reasoning LLMs to be wanderers rather than systematic explorers. Through qualitative and quantitative analysis across multiple state-of-the-art LLMs, we uncover persistent issues: invalid reasoning steps, redundant explorations, hallucinated or unfaithful conclusions, and so on. Our findings suggest that current models’ performance can appear to be competent on simple tasks yet degrade sharply as complexity increases. Based on the findings, we advocate for new metrics and tools that evaluate not just final outputs but the structure of the reasoning process itself.

Article 1253

Title@2025-05-26 (1): Lorentz Local Canonicalization: How to Make Any Network Lorentz-Equivariant

Title: Lorentz Local Canonicalization: How to Make Any Network Lorentz-Equivariant

Lorentz lokale Canonicalization: Wie man jedes Netzwerk Lorentz-Equivariant

Lorentz 本地 Canonicalization : 如何制造任何网络 Lorentz- Equivalication 2505.20280v1

Authors: Jonas Spinner, Luigi Favaro, Peter Lippmann, Sebastian Pitz, Gerrit Gerhartz, Tilman Plehn, Fred A. Hamprecht

Lorentz-equivariant neural networks are becoming the leading architectures for high-energy physics. Current implementations rely on specialized layers, limiting architectural choices. We introduce Lorentz Local Canonicalization (LLoCa), a general framework that renders any backbone network exactly Lorentz-equivariant. Using equivariantly predicted local reference frames, we construct LLoCa-transformers and graph networks. We adapt a recent approach to geometric message passing to the non-compact Lorentz group, allowing propagation of space-time tensorial features. Data augmentation emerges from LLoCa as a special choice of reference frame. Our models surpass state-of-the-art accuracy on relevant particle physics tasks, while being $4\times$ faster and using $5$-$100\times$ fewer FLOPs.
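
As background for what Lorentz symmetry demands of a network, the sketch below boosts two four-vectors and confirms that their Minkowski inner product is unchanged. It illustrates the invariance only and is not the LLoCa construction; the function names and the example momenta are assumptions, with units chosen so that c = 1.

```python
import math

def minkowski(p, q):
    """Minkowski inner product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def boost_x(p, beta):
    """Boost a four-vector (E, px, py, pz) along x with velocity beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    e, px, py, pz = p
    return (gamma * (e - beta * px), gamma * (px - beta * e), py, pz)

# Hypothetical four-momenta.
p = (5.0, 1.0, 2.0, 3.0)
q = (4.0, -1.0, 0.5, 2.0)

before = minkowski(p, q)
after = minkowski(boost_x(p, 0.6), boost_x(q, 0.6))
```

Features built from such invariants are unaffected by the choice of frame, which is the kind of property an exactly Lorentz-equivariant model has to respect.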

Article 1254

Title@2025-05-26 (1): Solving Hidden Monotone Variational Inequalities with Surrogate Losses

Title: Solving Hidden Monotone Variational Inequalities with Surrogate Losses

Lösen versteckter monotoner Variationsungleichheiten mit Surrogatverlusten

解决与代谢损失的隐藏单式单体差异性不平等 2411.05228v3

Authors: Ryan D’Orazio, Danilo Vucetic, Zichu Liu, Junhyung Lyle Kim, Ioannis Mitliagkas, Gauthier Gidel

Deep learning has proven to be effective in a wide variety of loss minimization problems. However, many applications of interest, like minimizing projected Bellman error and min-max optimization, cannot be modelled as minimizing a scalar loss function but instead correspond to solving a variational inequality (VI) problem. This difference in setting has caused many practical challenges, as naive gradient-based approaches from supervised learning tend to diverge and cycle in the VI case. In this work, we propose a principled surrogate-based approach compatible with deep learning to solve VIs. We show that our surrogate-based approach has three main benefits: (1) under assumptions that are realistic in practice (when hidden monotone structure is present, interpolation, and sufficient optimization of the surrogates), it guarantees convergence, (2) it provides a unifying perspective of existing methods, and (3) it is amenable to existing deep learning optimizers like ADAM. Experimentally, we demonstrate that our surrogate-based approach is effective in min-max optimization and minimizing projected Bellman error. Furthermore, in the deep reinforcement learning case, we propose a novel variant of TD(0) which is more compute and sample efficient.
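
The claim that naive gradient methods diverge and cycle in the VI setting has a standard textbook illustration: simultaneous gradient descent-ascent on the bilinear game f(x, y) = xy. The sketch below shows only that failure mode, not the surrogate-based method; the function name, step size, and starting point are assumptions.

```python
import math

def simultaneous_gda(x, y, lr, steps):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y."""
    norms = [math.hypot(x, y)]
    for _ in range(steps):
        gx, gy = y, x                      # df/dx = y, df/dy = x
        x, y = x - lr * gx, y + lr * gy    # x minimizes, y maximizes
        norms.append(math.hypot(x, y))
    return norms

norms = simultaneous_gda(1.0, 1.0, lr=0.1, steps=100)
```

Each step multiplies the squared distance to the equilibrium (0, 0) by 1 + lr^2, so the iterates spiral outward rather than converging, however small the step size.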

Article 1255

Title@2025-05-26 (1): The Coverage Principle: A Framework for Understanding Compositional Generalization

Title: The Coverage Principle: A Framework for Understanding Compositional Generalization

Das Coverage-Prinzip: Ein Rahmen für das Verständnis der kompositorischen Verallgemeinerung

覆盖范围原则:理解普遍组成框架 2505.20278v1

Authors: Hoyeon Chang, Jinho Park, Hanseul Cho, Sohee Yang, Miyoung Ko, Hyeonbin Hwang, Seungpil Won, Dohaeng Lee, Youbin Ahn, Minjoon Seo

Large language models excel at pattern matching, yet often fall short in systematic compositional generalization. We propose the coverage principle: a data-centric framework showing that models relying primarily on pattern matching for compositional tasks cannot reliably generalize beyond substituting fragments that yield identical results when used in the same contexts. We demonstrate that this framework has strong predictive power for the generalization capabilities of Transformers. First, we derive and empirically confirm that the training data required for two-hop generalization grows at least quadratically with the token set size, and the training data efficiency does not improve with 20x parameter scaling. Second, for compositional tasks with path ambiguity where one variable affects the output through multiple computational paths, we show that Transformers learn context-dependent state representations that undermine both performance and interoperability. Third, Chain-of-Thought supervision improves training data efficiency for multi-hop tasks but still struggles with path ambiguity. Finally, we outline a mechanism-based taxonomy that distinguishes three ways neural networks can generalize: structure-based (bounded by coverage), property-based (leveraging algebraic invariances), and shared-operator (through function reuse). This conceptual lens contextualizes our results and highlights where new architectural ideas are needed to achieve systematic compositionality. Overall, the coverage principle provides a unified lens for understanding compositional reasoning, and underscores the need for fundamental architectural or training innovations to achieve truly systematic compositionality.

Article 1256

Title@2025-05-26 (1): Probabilistic Kernel Function for Fast Angle Testing

Title: Probabilistic Kernel Function for Fast Angle Testing

Probabilistische Kernel-Funktion für schnelle Winkelprüfung

用于快速角测试的概率内核函数 2505.20274v1

Authors: Kejing Lu, Chuan Xiao, Yoshiharu Ishikawa

In this paper, we study the angle testing problem in high-dimensional Euclidean spaces and propose two projection-based probabilistic kernel functions, one designed for angle comparison and the other for angle thresholding. Unlike existing approaches that rely on random projection vectors drawn from Gaussian distributions, our approach leverages reference angles and employs a deterministic structure for the projection vectors. Notably, our kernel functions do not require asymptotic assumptions, such as the number of projection vectors tending to infinity, and can be both theoretically and experimentally shown to outperform Gaussian-distribution-based kernel functions. We further apply the proposed kernel function to Approximate Nearest Neighbor Search (ANNS) and demonstrate that our approach achieves 2.5x to 3x higher queries-per-second (QPS) throughput compared to the state-of-the-art graph-based search algorithm HNSW.

Article 1257

Title@2025-05-26 (1): Comparing Neural Network Encodings for Logic-based Explainability

Title: Comparing Neural Network Encodings for Logic-based Explainability

Vergleich von Neural Network Encodings für Logic-basierte Erklärbarkeit

比较基于逻辑的解释性神经网络编码 2505.20269v1

Authors: Levi Cordeiro Carvalho, Saulo A. F. Oliveira, Thiago Alves Rocha

Providing explanations for the outputs of artificial neural networks (ANNs) is crucial in many contexts, such as critical systems, data protection laws, and handling adversarial examples. Logic-based methods can offer explanations with correctness guarantees, but face scalability challenges. Due to these issues, it is necessary to compare different encodings of ANNs into logical constraints, which are used in logic-based explainability. This work compares two encodings of ANNs: one has been used in the literature to provide explanations, while the other is adapted here for our context of explainability. Additionally, the second encoding uses fewer variables and constraints, thus potentially enhancing efficiency. Experiments showed similar running times for computing explanations, but the adapted encoding performed up to 18% better in building logical constraints and up to 16% better in overall time.

Article 1258

Title@2025-05-26 (1): Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits

Title: Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits

Ergebnisbasiertes Online-Verstärkungslernen: Algorithmen und grundlegende Grenzen

基于成果的在线强化学习:等级和基本限制 2505.20268v1

Authors: Fan Chen, Zeyu Jia, Alexander Rakhlin, Tengyang Xie

Reinforcement learning with outcome-based feedback faces a fundamental challenge: when rewards are only observed at trajectory endpoints, how do we assign credit to the right actions? This paper provides the first comprehensive analysis of this problem in online RL with general function approximation. We develop a provably sample-efficient algorithm achieving $\widetilde{O}({C_{\rm cov} H^3}/{\epsilon^2})$ sample complexity, where $C_{\rm cov}$ is the coverability coefficient of the underlying MDP. By leveraging general function approximation, our approach works effectively in large or infinite state spaces where tabular methods fail, requiring only that value functions and reward functions can be represented by appropriate function classes. Our results also characterize when outcome-based feedback is statistically separated from per-step rewards, revealing an unavoidable exponential separation for certain MDPs. For deterministic MDPs, we show how to eliminate the completeness assumption, dramatically simplifying the algorithm. We further extend our approach to preference-based feedback settings, proving that equivalent statistical efficiency can be achieved even under more limited information. Together, these results constitute a theoretical foundation for understanding the statistical properties of outcome-based reinforcement learning.

Article 1259

Title@2025-05-26 (1): syftr: Pareto-Optimal Generative AI

Title: syftr: Pareto-Optimal Generative AI

syftr: Pareto-Optimal Generative KI

Syftr: Pareto- Opmatimal 生成 AI 2505.20266v1

Authors: Alexander Conway, Debadeepta Dey, Stefan Hackmann, Matthew Hausknecht, Michael Schmidt, Mark Steadman, Nick Volynets

Retrieval-Augmented Generation (RAG) pipelines are central to applying large language models (LLMs) to proprietary or dynamic data. However, building effective RAG flows is complex, requiring careful selection among vector databases, embedding models, text splitters, retrievers, and synthesizing LLMs. The challenge deepens with the rise of agentic paradigms. Modules like verifiers, rewriters, and rerankers, each with intricate hyperparameter dependencies, have to be carefully tuned. Balancing tradeoffs between latency, accuracy, and cost becomes increasingly difficult in performance-sensitive applications. We introduce syftr, a framework that performs efficient multi-objective search over a broad space of agentic and non-agentic RAG configurations. Using Bayesian Optimization, syftr discovers Pareto-optimal flows that jointly optimize task accuracy and cost. A novel early-stopping mechanism further improves efficiency by pruning clearly suboptimal candidates. Across multiple RAG benchmarks, syftr finds flows which are on average approximately 9 times cheaper while preserving most of the accuracy of the most accurate flows on the Pareto frontier. Furthermore, syftr’s ability to design and optimize allows integrating new modules, making it even easier and faster to realize high-performing generative AI pipelines.
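
To make "Pareto-optimal flows" concrete, the sketch below filters a handful of candidate configurations down to those not dominated on accuracy and cost. It does not use syftr's API or its Bayesian search; the function name, flow names, and all numbers are made up for illustration.

```python
def pareto_front(flows):
    """Return flows not dominated in (higher accuracy, lower cost)."""
    def dominates(a, b):
        # a is at least as good on both objectives and strictly better on one.
        return (a["accuracy"] >= b["accuracy"] and a["cost"] <= b["cost"]
                and (a["accuracy"] > b["accuracy"] or a["cost"] < b["cost"]))
    return [f for f in flows if not any(dominates(g, f) for g in flows)]

# Hypothetical candidate flows.
flows = [
    {"name": "big-llm-rerank", "accuracy": 0.82, "cost": 9.0},
    {"name": "small-llm", "accuracy": 0.78, "cost": 1.0},
    {"name": "small-llm-bad-split", "accuracy": 0.70, "cost": 1.5},
    {"name": "mid-llm", "accuracy": 0.80, "cost": 3.0},
]
front = [f["name"] for f in pareto_front(flows)]
```

Here `small-llm-bad-split` drops out because `small-llm` is both cheaper and more accurate; the other three each represent a different accuracy/cost trade-off and stay on the front.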

Article 1260

Title@2025-05-26 (1): Lifelong Safety Alignment for Language Models

Title: Lifelong Safety Alignment for Language Models

Lebenslange Sicherheitsausrichtung für Sprachmodelle

语言模型终身安全比对 2505.20259v1

Authors: Haoyu Wang, Zeyu Qin, Yifei Zhao, Chao Du, Min Lin, Xueqian Wang, Tianyu Pang

LLMs have made impressive progress, but their growing capabilities also expose them to highly flexible jailbreaking attacks designed to bypass safety alignment. While many existing defenses focus on known types of attacks, it is more critical to prepare LLMs for unseen attacks that may arise during deployment. To address this, we propose a lifelong safety alignment framework that enables LLMs to continuously adapt to new and evolving jailbreaking strategies. Our framework introduces a competitive setup between two components: a Meta-Attacker, trained to actively discover novel jailbreaking strategies, and a Defender, trained to resist them. To effectively warm up the Meta-Attacker, we first leverage the GPT-4o API to extract key insights from a large collection of jailbreak-related research papers. Through iterative training, the first-iteration Meta-Attacker achieves a 73% attack success rate (ASR) on RR and a 57% transfer ASR on LAT using only single-turn attacks. Meanwhile, the Defender progressively improves its robustness and ultimately reduces the Meta-Attacker’s success rate to just 7%, enabling safer and more reliable deployment of LLMs in open-ended environments. The code is available at https://github.com/sail-sg/LifelongSafetyAlignment.

Article 1261

Title@2025-05-26 (1): GRAPE: Optimize Data Mixture for Group Robust Multi-target Adaptive Pretraining

Title: GRAPE: Optimize Data Mixture for Group Robust Multi-target Adaptive Pretraining

GRAPE: Optimierung der Datenmischung für ein robustes Multi-Target-Adaptives Vortraining

GRAPE: 优化集体强力多目标适应性预备培训的数据混合 2505.20380v1

Authors: Simin Fan, Maria Ios Glarou, Martin Jaggi

The performance of large language models (LLMs) across diverse downstream applications is fundamentally governed by the quality and composition of their pretraining corpora. Existing domain reweighting algorithms primarily optimize data mixtures for a single target task, thereby resulting in models that overfit to specialized objectives while exhibiting substantial performance degradation on other benchmarks. This paper introduces Group Robust Multi-target Adaptive PrEtraining (GRAPE), a novel multi-source-multi-target domain reweighting framework designed to calibrate pretraining data mixtures for robust performance across multiple target tasks simultaneously. GRAPE dynamically adjusts sampling weights across source domains (domain weights) while concurrently modulating task weights that quantify the relative importance of each individual target task. This adaptive process prioritizes tasks based on their learning difficulty throughout training. We formulate this interleaved reweighting mechanism as a minimax optimization problem: the inner maximization adjusts task weights using group distributionally robust optimization (DRO), where the tasks demonstrating the least improvement under the current data mixture are prioritized with higher weights; the outer minimization then optimizes domain weights to maximize loss reduction on the prioritized tasks. Experiments on ClimbLab and SlimPajama datasets demonstrate that GRAPE consistently outperforms baseline methods in terms of reasoning performance across 6 benchmarks. Furthermore, when applied to multilingual targets, GRAPE effectively identifies optimal training mixtures from mainstream languages, achieving superior language modeling capabilities across 8 low-resource target languages.
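
The inner step described above, giving more weight to the tasks that improved least, resembles the multiplicative-weights update commonly used in group DRO. The sketch below is that generic update under stated assumptions, not GRAPE's actual rule: the function name, the step size, and the per-task improvement numbers are all hypothetical.

```python
import math

def update_task_weights(weights, improvements, step=1.0):
    """Multiplicative-weights update: less improvement -> larger weight."""
    raw = [w * math.exp(-step * imp) for w, imp in zip(weights, improvements)]
    total = sum(raw)
    return [r / total for r in raw]        # renormalize onto the simplex

weights = [1 / 3, 1 / 3, 1 / 3]
improvements = [0.30, 0.05, 0.15]          # hypothetical per-task loss reductions
new_weights = update_task_weights(weights, improvements)
```

After one update the second task, which improved least, carries the largest weight, so an outer step that optimizes the data mixture would focus on it next.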

Article 1262

Title@2025-05-26 (1): Position: Mechanistic Interpretability Should Prioritize Feature Consistency in SAEs

Title: Position: Mechanistic Interpretability Should Prioritize Feature Consistency in SAEs

Position: Mechanische Dolmetschbarkeit sollte Feature-Konsistenz in SAEs priorisieren

位置: 机械可解释性:应优先考虑高级专业环境评估中的地物一致性 2505.20254v1

Authors: Xiangchen Song, Aashiq Muhamed, Yujia Zheng, Lingjing Kong, Zeyu Tang, Mona T. Diab, Virginia Smith, Kun Zhang

Sparse Autoencoders (SAEs) are a prominent tool in mechanistic interpretability (MI) for decomposing neural network activations into interpretable features. However, the aspiration to identify a canonical set of features is challenged by the observed inconsistency of learned SAE features across different training runs, undermining the reliability and efficiency of MI research. This position paper argues that mechanistic interpretability should prioritize feature consistency in SAEs – the reliable convergence to equivalent feature sets across independent runs. We propose using the Pairwise Dictionary Mean Correlation Coefficient (PW-MCC) as a practical metric to operationalize consistency and demonstrate that high levels are achievable (0.80 for TopK SAEs on LLM activations) with appropriate architectural choices. Our contributions include detailing the benefits of prioritizing consistency; providing theoretical grounding and synthetic validation using a model organism, which verifies PW-MCC as a reliable proxy for ground-truth recovery; and extending these findings to real-world LLM data, where high feature consistency strongly correlates with the semantic similarity of learned feature explanations. We call for a community-wide shift towards systematically measuring feature consistency to foster robust cumulative progress in MI.
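
One way to picture a consistency score between two learned dictionaries is to match features one-to-one and average their similarity. The sketch below does this with absolute cosine similarity and a brute-force search over matchings, which is feasible only for tiny dictionaries. It is an assumption-laden illustration and may differ from the paper's exact PW-MCC definition; the function names and toy dictionaries are hypothetical.

```python
import math
from itertools import permutations

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

def matched_mean_similarity(dict_a, dict_b):
    """Mean |cosine| under the best one-to-one feature matching (brute force)."""
    n = len(dict_a)
    best = 0.0
    for perm in permutations(range(n)):
        score = sum(abs(cosine(dict_a[i], dict_b[j]))
                    for i, j in enumerate(perm)) / n
        best = max(best, score)
    return best

# Hypothetical dictionaries from two training runs.
run_1 = [[1.0, 0.0], [0.0, 1.0]]
run_2 = [[0.0, 2.0], [3.0, 0.0]]    # same directions, permuted and rescaled
run_3 = [[1.0, 1.0], [1.0, -1.0]]   # rotated by 45 degrees

consistent = matched_mean_similarity(run_1, run_2)
inconsistent = matched_mean_similarity(run_1, run_3)
```

The score ignores feature order and scale, so `run_2` counts as fully consistent with `run_1`, while the rotated `run_3` scores lower. For realistic dictionary sizes, an assignment solver would replace the permutation search.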

Article 1263

Title@2025-05-26 (1): Unveiling AI’s Blind Spots: An Oracle for In-Domain, Out-of-Domain, and Adversarial Errors

Title: Unveiling AI’s Blind Spots: An Oracle for In-Domain, Out-of-Domain, and Adversarial Errors

Enthüllen der Blind-Spots von KI: Ein Oracle für In-Domain-, Out-of-Domain- und Adversarial-Fehler

大赦国际不懈的《盲人点:内地、外地和反向错误的甲骨文》 2410.02384v3

Authors: Shuangpeng Han, Mengmi Zhang

AI models make mistakes when recognizing images, whether in-domain, out-of-domain, or adversarial. Predicting these errors is critical for improving system reliability, reducing costly mistakes, and enabling proactive corrections in real-world applications such as healthcare, finance, and autonomous systems. However, understanding what mistakes AI models make, why they occur, and how to predict them remains an open challenge. Here, we conduct comprehensive empirical evaluations using a “mentor” model, a deep neural network designed to predict another “mentee” model’s errors. Our findings show that the mentor excels at learning from a mentee’s mistakes on adversarial images with small perturbations and generalizes effectively to predict in-domain and out-of-domain errors of the mentee. Additionally, transformer-based mentor models excel at predicting errors across various mentee architectures. Subsequently, we draw insights from these observations and develop an “oracle” mentor model, dubbed SuperMentor, that can outperform baseline mentors in predicting errors across different error types from the ImageNet-1K dataset. Our framework paves the way for future research on anticipating and correcting AI model behaviors, ultimately increasing trust in AI systems.

Article 1264

Title@2025-05-26 (1): Learning Extrapolative Sequence Transformations from Markov Chains

Title: Learning Extrapolative Sequence Transformations from Markov Chains

Extrapolative Sequenztransformationen von Markov-Ketten lernen

来自Markov 链条的学习外推序列变换 2505.20251v1

Authors: Sophia Hager, Aleem Khan, Andrew Wang, Nicholas Andrews

Most successful applications of deep learning involve similar training and test conditions. However, tasks such as biological sequence design involve searching for sequences that improve desirable properties beyond previously known values, which requires novel hypotheses that extrapolate beyond training data. In these settings, extrapolation may be achieved by using random search methods such as Markov chain Monte Carlo (MCMC), which, given an initial state, sample local transformations to approximate a target density that rewards states with the desired properties. However, even with a well-designed proposal, MCMC may struggle to explore large structured state spaces efficiently. Rather than relying on stochastic search, it would be desirable to have a model that greedily optimizes the properties of interest, successfully extrapolating in as few steps as possible. We propose to learn such a model from the Markov chains resulting from MCMC search. Specifically, our approach uses selected states from Markov chains as a source of training data for an autoregressive model, which is then able to efficiently generate novel sequences that extrapolate along the sequence-level properties of interest. The proposed approach is validated on three problems: protein sequence design, text sentiment control, and text anonymization. We find that the autoregressive model can extrapolate as well as or better than MCMC, but with the additional benefits of scalability and significantly higher sample efficiency.

Article 1265

Title@2025-05-26 (1): On the Guidance of Flow Matching

Title: On the Guidance of Flow Matching

Über die Anleitung von Flow Matching

流动配对指南 2502.02150v3

Authors: Ruiqi Feng, Chenglei Yu, Wenhao Deng, Peiyan Hu, Tailin Wu

Flow matching has shown state-of-the-art performance in various generative tasks, ranging from image generation to decision-making, where generation under energy guidance (abbreviated as guidance in the following) is pivotal. However, the guidance of flow matching is more general than and thus substantially different from that of its predecessor, diffusion models. Therefore, the challenge in guidance for general flow matching remains largely underexplored. In this paper, we propose the first framework of general guidance for flow matching. From this framework, we derive a family of guidance techniques that can be applied to general flow matching. These include a new training-free asymptotically exact guidance, novel training losses for training-based guidance, and two classes of approximate guidance that cover classical gradient guidance methods as special cases. We theoretically investigate these different methods to give a practical guideline for choosing suitable methods in different scenarios. Experiments on synthetic datasets, image inverse problems, and offline reinforcement learning demonstrate the effectiveness of our proposed guidance methods and verify the correctness of our flow matching guidance framework. Code to reproduce the experiments can be found at https://github.com/AI4Science-WestlakeU/flow_guidance.

Article 1266

Title@2025-05-26 (1): TACO: Training-free Sound Prompted Segmentation via Semantically Constrained Audio-visual CO-factorization

Title: TACO: Training-free Sound Prompted Segmentation via Semantically Constrained Audio-visual CO-factorization

TACO: Schulungsfreie Klang-Prompt-Segmentierung über semantisch eingeschränkte Audio-visuelle CO-Fabrizierung

TACO:通过模拟压缩培训的视听共同推动因素,进行无培训、无培训的音频快速分割 2412.01488v3

Authors: Hugo Malard, Michel Olvera, Stephane Lathuiliere, Slim Essid

Large-scale pre-trained audio and image models demonstrate an unprecedented degree of generalization, making them suitable for a wide range of applications. Here, we tackle the specific task of sound-prompted segmentation, aiming to segment image regions corresponding to objects heard in an audio signal. Most existing approaches tackle this problem by fine-tuning pre-trained models or by training additional modules specifically for the task. We adopt a different strategy: we introduce a training-free approach that leverages Non-negative Matrix Factorization (NMF) to co-factorize audio and visual features from pre-trained models so as to reveal shared interpretable concepts. These concepts are passed on to an open-vocabulary segmentation model for precise segmentation maps. By using frozen pre-trained models, our method achieves high generalization and establishes state-of-the-art performance in unsupervised sound-prompted segmentation, significantly surpassing previous unsupervised methods.

Article 1267

Title@2025-05-26 (1): Efficient Optimization Accelerator Framework for Multistate Ising Problems

Title: Efficient Optimization Accelerator Framework for Multistate Ising Problems

Effizientes Optimierungs-Beschleuniger-Framework für Multistate Ising-Probleme

高效高效优化多州化问题加速加速框架 2505.20250v1

Authors: Chirag Garg, Sayeef Salahuddin

Ising Machines are a prominent class of hardware architectures that aim to solve NP-hard combinatorial optimization problems. These machines consist of a network of interacting binary spins/neurons that evolve to represent the optimum ground state energy solution. Generally, combinatorial problems are transformed into quadratic unconstrained binary optimization (QUBO) form to harness the computational efficiency of these Ising machines. However, this transformation, especially for multi-state problems, often leads to a more complex exploration landscape than the original problem, thus severely impacting the solution quality. To address this challenge, we model the spin interactions as a generalized boolean logic function to significantly reduce the exploration space. We benchmark the graph coloring problem from the class of multi-state NP-hard optimization using probabilistic Ising solvers to illustrate the effectiveness of our framework. The proposed methodology achieves similar accuracy compared to state-of-the-art heuristics and machine learning algorithms, and demonstrates significant improvement over the existing Ising methods. Additionally, we demonstrate that combining parallel tempering with our existing framework further reduces the coloring error by up to 50% compared to the conventionally used Gibbs sampling algorithm. We also design a 1024-neuron all-to-all connected probabilistic Ising accelerator that shows up to 10000x performance acceleration compared to heuristics while reducing the number of required physical neurons by 1.5-4x compared to conventional Ising machines. Indeed, this accelerator solution demonstrates improvement across all metrics over the current methods, i.e., energy, performance, area, and solution quality. Thus, this work expands the potential of existing Ising hardware to solve a broad class of these multistate optimization problems.

Article 1268

Title@2025-05-26 (1): RedAHD: Reduction-Based End-to-End Automatic Heuristic Design with Large Language Models

Title: RedAHD: Reduction-Based End-to-End Automatic Heuristic Design with Large Language Models

RedAHD: Reduktionsbasiertes, End-to-End-Automatisches Heuristisches Design mit großen Sprachmodellen

REDAHD: 具有大语言模型的后端至后端自动超量设计 2505.20242v1

Authors: Nguyen Thach, Aida Riahifar, Nathan Huynh, Hau Chan

Solving NP-hard combinatorial optimization problems (COPs) (e.g., traveling salesman problems (TSPs) and capacitated vehicle routing problems (CVRPs)) in practice traditionally involves handcrafting heuristics or specifying a search space for finding effective heuristics. The main challenges of these approaches, however, are the sheer amount of domain knowledge and implementation effort required from human experts. Recently, significant progress has been made to address these challenges, particularly by using large language models (LLMs) to design heuristics within some predetermined generalized algorithmic framework (GAF, e.g., ant colony optimization and guided local search) for building key functions/components (e.g., a priori information on how promising it is to include each edge in a solution for TSP and CVRP). Although existing methods leveraging this idea have been shown to yield impressive optimization performance, they are not fully end-to-end and still require considerable manual intervention. In this paper, we propose a novel end-to-end framework, named RedAHD, that enables these LLM-based heuristic design methods to operate without the need for GAFs. More specifically, RedAHD employs LLMs to automate the process of reduction, i.e., transforming the COP at hand into similar COPs that are better understood, from which LLM-based heuristic design methods can design effective heuristics for directly solving the transformed COPs and, in turn, indirectly solving the original COP. Our experimental results, evaluated on six COPs, show that RedAHD is capable of designing heuristics with competitive or improved results over the state-of-the-art methods with minimal human involvement.

Article 1269

Title@2025-05-26 (1): DreamPRM: Domain-Reweighted Process Reward Model for Multimodal Reasoning

Title: DreamPRM: Domain-Reweighted Process Reward Model for Multimodal Reasoning

DreamPRM: Domain-regewichtetes Prozess-Reward-Modell für multimodale Vernunft

DreamPRM: 多边理由解释的负重评分进程奖励模式 2505.20241v1

Authors: Qi Cao, Ruiyi Wang, Ruiyi Zhang, Sai Ashish Somayajula, Pengtao Xie

Reasoning has substantially improved the performance of large language models (LLMs) on complicated tasks. Central to the current reasoning studies, Process Reward Models (PRMs) offer a fine-grained evaluation of intermediate reasoning steps and guide the reasoning process. However, extending PRMs to multimodal large language models (MLLMs) introduces challenges. Since multimodal reasoning covers a wider range of tasks compared to text-only scenarios, the resulting distribution shift from the training to testing sets is more severe, leading to greater generalization difficulty. Training a reliable multimodal PRM, therefore, demands large and diverse datasets to ensure sufficient coverage. However, current multimodal reasoning datasets suffer from a marked quality imbalance, which degrades PRM performance and highlights the need for an effective data selection strategy. To address the issues, we introduce DreamPRM, a domain-reweighted training framework for multimodal PRMs which employs bi-level optimization. In the lower-level optimization, DreamPRM performs fine-tuning on multiple datasets with domain weights, allowing the PRM to prioritize high-quality reasoning signals and alleviating the impact of dataset quality imbalance. In the upper-level optimization, the PRM is evaluated on a separate meta-learning dataset; this feedback updates the domain weights through an aggregation loss function, thereby improving the generalization capability of trained PRM. Extensive experiments on multiple multimodal reasoning benchmarks covering both mathematical and general reasoning show that test-time scaling with DreamPRM consistently improves the performance of state-of-the-art MLLMs. Further comparisons reveal that DreamPRM’s domain-reweighting strategy surpasses other data selection methods and yields higher accuracy gains than existing test-time scaling approaches.

Article 1270

Title@2025-05-26 (1): SITCOM: Step-wise Triple-Consistent Diffusion Sampling for Inverse Problems

Title: SITCOM: Step-wise Triple-Consistent Diffusion Sampling for Inverse Problems

SITCOM: Triple-Consistent Diffusions-Probenahme für inverse Probleme

SITCOM: 反问题递进三联扩散抽样 2410.04479v2

Authors: Ismail Alkhouri, Shijun Liang, Cheng-Han Huang, Jimmy Dai, Qing Qu, Saiprasad Ravishankar, Rongrong Wang

Diffusion models (DMs) are a class of generative models that allow sampling from a distribution learned over a training set. When applied to solving inverse problems, the reverse sampling steps are modified to approximately sample from a measurement-conditioned distribution. However, these modifications may be unsuitable for certain settings (e.g., presence of measurement noise) and non-linear tasks, as they often struggle to correct errors from earlier steps and generally require a large number of optimization and/or sampling steps. To address these challenges, we state three conditions for achieving measurement-consistent diffusion trajectories. Building on these conditions, we propose a new optimization-based sampling method that not only enforces standard data manifold measurement consistency and forward diffusion consistency, as seen in previous studies, but also incorporates our proposed step-wise and network-regularized backward diffusion consistency that maintains a diffusion trajectory by optimizing over the input of the pre-trained model at every sampling step. By enforcing these conditions (implicitly or explicitly), our sampler requires significantly fewer reverse steps. Therefore, we refer to our method as Step-wise Triple-Consistent Sampling (SITCOM). Compared to SOTA baselines, our experiments across several linear and non-linear tasks (with natural and medical images) demonstrate that SITCOM achieves competitive or superior results in terms of standard similarity metrics and run-time.

Article 1271

Title@2025-05-26 (1): A Temporal Difference Method for Stochastic Continuous Dynamics

Title: A Temporal Difference Method for Stochastic Continuous Dynamics

Eine zeitliche Differenzmethode für stochastische kontinuierliche Dynamik

存储连续动态的时差方法 2505.15544v3

Authors: Haruki Settai, Naoya Takeishi, Takehisa Yairi

For continuous systems modeled by dynamical equations such as ODEs and SDEs, Bellman’s principle of optimality takes the form of the Hamilton-Jacobi-Bellman (HJB) equation, which provides the theoretical target of reinforcement learning (RL). Although recent advances in RL successfully leverage this formulation, the existing methods typically assume the underlying dynamics are known a priori because they need explicit access to the coefficient functions of dynamical equations to update the value function following the HJB equation. We address this inherent limitation of HJB-based RL; we propose a model-free approach still targeting the HJB equation and propose the corresponding temporal difference method. We demonstrate its potential advantages over transition kernel-based formulations, both qualitatively and empirically. The proposed formulation paves the way toward bridging stochastic optimal control and model-free reinforcement learning.

Article 1272

Title@2025-05-26 (1): RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

Title: RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

RAGEN: Selbst-Evolution in LLM-Agenten durch Multi-Turn-Verstärkungs-Lernen verstehen

通过多阶段强化学习了解LLM代理商的自我演变 2504.20073v2

Authors: Zihan Wang, Kangrui Wang, Qineng Wang, Pingyue Zhang, Linjie Li, Zhengyuan Yang, Xing Jin, Kefan Yu, Minh Nhat Nguyen, Licheng Liu, Eli Gottlieb, Yiping Lu, Kyunghyun Cho, Jiajun Wu, Li Fei-Fei, Lijuan Wang, Yejin Choi, Manling Li

Training large language models (LLMs) as interactive agents presents unique challenges including long-horizon decision making and interacting with stochastic environment feedback. While reinforcement learning (RL) has enabled progress in static tasks, multi-turn agent RL training remains underexplored. We propose StarPO (State-Thinking-Actions-Reward Policy Optimization), a general framework for trajectory-level agent RL, and introduce RAGEN, a modular system for training and evaluating LLM agents. Our study on four stylized environments reveals three core findings. First, our agent RL training shows a recurring mode, Echo Trap, marked by reward variance cliffs and gradient spikes; we address this with StarPO-S, a stabilized variant with trajectory filtering, critic incorporation, and gradient stabilization. Second, we find that the shaping of RL rollouts would benefit from diverse initial states, medium interaction granularity, and more frequent sampling. Third, we show that without fine-grained, reasoning-aware reward signals, agent reasoning hardly emerges through multi-turn RL, and agents may show shallow strategies or hallucinated thoughts. Code and environments are available at https://github.com/RAGEN-AI/RAGEN.

Article 1273

Title@2025-05-26 (1): SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Title: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

SFT-Erinnerungen, RL Generalisiert: Eine vergleichende Studie des Stiftungsmodells nach der Ausbildung

SFT Memorizes,RL一般化:基金会培训模式模型比较研究 2501.17161v2

Authors: Tianzhe Chu, Yuexiang Zhai, Jihan Yang, Shengbang Tong, Saining Xie, Dale Schuurmans, Quoc V. Le, Sergey Levine, Yi Ma

Supervised fine-tuning (SFT) and reinforcement learning (RL) are widely used post-training techniques for foundation models. However, their roles in enhancing model generalization capabilities remain unclear. This paper studies the difference between SFT and RL on generalization and memorization, focusing on text-based rule variants and visual variants. We introduce GeneralPoints, an arithmetic reasoning card game, and adopt V-IRL, a real-world navigation environment, to assess how models trained with SFT and RL generalize to unseen variants in both textual and visual domains. We show that RL, especially when trained with an outcome-based reward, generalizes across both rule-based textual and visual variants. SFT, in contrast, tends to memorize training data and struggles to generalize to out-of-distribution scenarios. Further analysis reveals that RL improves the model’s underlying visual recognition capabilities, contributing to its enhanced generalization in the visual domain. Despite RL’s superior generalization, we show that SFT remains essential for effective RL training; SFT stabilizes the model’s output format, enabling subsequent RL to achieve its performance gains. These findings demonstrate the capability of RL for acquiring generalizable knowledge in complex, multi-modal tasks.

Article 1274

Title@2025-05-26 (1): Variational Deep Learning via Implicit Regularization

Title: Variational Deep Learning via Implicit Regularization

Variationales Deep Learning durch Implizite Regularisierung

通过隐性规范化进行不同的深层学习 2505.20235v1

Authors: Jonathan Wenger, Beau Coker, Juraj Marusic, John P. Cunningham

Modern deep learning models generalize remarkably well in-distribution, despite being overparametrized and trained with little to no explicit regularization. Instead, current theory credits implicit regularization imposed by the choice of architecture, hyperparameters and optimization procedure. However, deploying deep learning models out-of-distribution, in sequential decision-making tasks, or in safety-critical domains, necessitates reliable uncertainty quantification, not just a point estimate. The machinery of modern approximate inference – Bayesian deep learning – should answer the need for uncertainty quantification, but its effectiveness has been challenged by our inability to define useful explicit inductive biases through priors, as well as the associated computational burden. Instead, in this work we demonstrate, both theoretically and empirically, how to regularize a variational deep network implicitly via the optimization procedure, just as for standard deep learning. We fully characterize the inductive bias of (stochastic) gradient descent in the case of an overparametrized linear model as generalized variational inference and demonstrate the importance of the choice of parametrization. Finally, we show empirically that our approach achieves strong in- and out-of-distribution performance without tuning of additional hyperparameters and with minimal time and memory overhead over standard deep learning.

Article 1275

Title@2025-05-26 (1): Multimodal Federated Learning With Missing Modalities through Feature Imputation Network

Title: Multimodal Federated Learning With Missing Modalities through Feature Imputation Network

Multimodales Federated Learning mit fehlenden Modalitäten durch Feature Imputation Network

通过特征截肢网络以失踪模式进行多模式联邦学习 2505.20232v1

Authors: Pranav Poudel, Aavash Chhetri, Prashnna Gyawali, Georgios Leontidis, Binod Bhattarai

Multimodal federated learning holds immense potential for collaboratively training models from multiple sources without sharing raw data, addressing both data scarcity and privacy concerns, two key challenges in healthcare. A major challenge in training multimodal federated models in healthcare is the presence of missing modalities due to multiple reasons, including variations in clinical practice, cost and accessibility constraints, retrospective data collection, privacy concerns, and occasional technical or human errors. Previous methods typically rely on publicly available real datasets or synthetic data to compensate for missing modalities. However, obtaining real datasets for every disease is impractical, and training generative models to synthesize missing modalities is computationally expensive and prone to errors due to the high dimensionality of medical data. In this paper, we propose a novel, lightweight, low-dimensional feature translator to reconstruct bottleneck features of the missing modalities. Our experiments on three different datasets (MIMIC-CXR, NIH Open-I, and CheXpert), in both homogeneous and heterogeneous settings consistently improve the performance of competitive baselines. The code and implementation details are available at: https://github.com/bhattarailab/FedFeatGen

Article 1276

Title@2025-05-26 (1): From What to How: Attributing CLIP’s Latent Components Reveals Unexpected Semantic Reliance

Title: From What to How: Attributing CLIP’s Latent Components Reveals Unexpected Semantic Reliance

Von was zu wie: Zuweisen von CLIPs latenten Komponenten zeigt ungeahnte semantische Zuverlässigkeit

从何到如何: 将 CLIP 的内部部件流出异常的语义依赖性归结为 CLIP 的内部批量。 2505.20229v1

Authors: Maximilian Dreyer, Lorenz Hufe, Jim Berend, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek

Transformer-based CLIP models are widely used for text-image probing and feature extraction, making it relevant to understand the internal mechanisms behind their predictions. While recent works show that Sparse Autoencoders (SAEs) yield interpretable latent components, they focus on what these encode and miss how they drive predictions. We introduce a scalable framework that reveals what latent components activate for, how they align with expected semantics, and how important they are to predictions. To achieve this, we adapt attribution patching for instance-wise component attributions in CLIP and highlight key faithfulness limitations of the widely used Logit Lens technique. By combining attributions with semantic alignment scores, we can automatically uncover reliance on components that encode semantically unexpected or spurious concepts. Applied across multiple CLIP variants, our method uncovers hundreds of surprising components linked to polysemous words, compound nouns, visual typography and dataset artifacts. While text embeddings remain prone to semantic ambiguity, they are more robust to spurious correlations compared to linear classifiers trained on image embeddings. A case study on skin lesion detection highlights how such classifiers can amplify hidden shortcuts, underscoring the need for holistic, mechanistic interpretability. We provide code at https://github.com/maxdreyer/attributing-clip.

Article 1277

Title@2025-05-26 (1): FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models

Title: FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models

FLAME-MoE: Eine transparente End-to-End-Forschungsplattform für Mixture-of-Experts-Sprachmodelle

FLAME-MOE:混合专家语言模型透明端对端研究平台 2505.20225v1

Authors: Hao Kang, Zichun Yu, Chenyan Xiong

Recent large language models such as Gemini-1.5, DeepSeek-V3, and Llama-4 increasingly adopt Mixture-of-Experts (MoE) architectures, which offer strong efficiency-performance trade-offs by activating only a fraction of the model per token. Yet academic researchers still lack a fully open, end-to-end MoE platform for investigating scaling, routing, and expert behavior. We release FLAME-MoE, a completely open-source research suite composed of seven decoder-only models, ranging from 38M to 1.7B active parameters, whose architecture (64 experts with top-8 gating and 2 shared experts) closely reflects modern production LLMs. All training data pipelines, scripts, logs, and checkpoints are publicly available to enable reproducible experimentation. Across six evaluation tasks, FLAME-MoE improves average accuracy by up to 3.4 points over dense baselines trained with identical FLOPs. Leveraging full training trace transparency, we present initial analyses showing that (i) experts increasingly specialize on distinct token subsets, (ii) co-activation matrices remain sparse, reflecting diverse expert usage, and (iii) routing behavior stabilizes early in training. All code, training logs, and model checkpoints are available at https://github.com/cmu-flame/FLAME-MoE.
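
The phrase "activating only a fraction of the model per token" can be illustrated with a minimal top-k gating sketch. This is a generic illustration and not FLAME-MoE's actual router: it assumes the router keeps the k highest-scoring experts and applies a softmax over only those scores, which is one of several conventions in use. The function name `top_k_gate` and the toy logits are hypothetical.

```python
import math


def top_k_gate(router_logits, k):
    """Select the k highest-scoring experts and normalize their weights.

    Assumption: softmax is taken over the selected logits only; some
    routers instead apply softmax over all experts before selecting.
    Returns a dict mapping expert index -> mixing weight.
    """
    ranked = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )
    chosen = ranked[:k]
    # Subtract the max logit for numerical stability.
    peak = max(router_logits[i] for i in chosen)
    exps = {i: math.exp(router_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


# Toy example: 4 experts, route each token to the top 2.
gates = top_k_gate([0.1, 2.0, -1.0, 1.5], k=2)
```

Only the experts that appear in `gates` would run for this token; the remaining experts contribute no compute, which is where the efficiency gain comes from.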

Article 1278

Title@2025-05-26 (1): Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects

Title: Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects

Chain-of-Thought für autonomes Fahren: Eine umfassende Umfrage und Zukunftsaussichten

寻求自主驾驶:全面调查和未来前景 2505.20223v1

Authors: Yixin Cui, Haotian Lin, Shuo Yang, Yixiao Wang, Yanjun Huang, Hong Chen

The rapid evolution of large language models in natural language processing has substantially elevated their semantic understanding and logical reasoning capabilities. Such proficiencies have been leveraged in autonomous driving systems, contributing to significant improvements in system performance. Models such as OpenAI o1 and DeepSeek-R1 leverage Chain-of-Thought (CoT) reasoning, an advanced cognitive method that simulates human thinking processes, demonstrating remarkable reasoning capabilities in complex tasks. By structuring complex driving scenarios within a systematic reasoning framework, this approach has emerged as a prominent research focus in autonomous driving, substantially improving the system’s ability to handle challenging cases. This paper investigates how CoT methods improve the reasoning abilities of autonomous driving models. Based on a comprehensive literature review, we present a systematic analysis of the motivations, methodologies, challenges, and future research directions of CoT in autonomous driving. Furthermore, we propose the insight of combining CoT with self-learning to facilitate self-evolution in driving systems. To ensure the relevance and timeliness of this study, we have compiled a dynamic repository of literature and open-source projects, diligently updated to incorporate forefront developments. The repository is publicly available at https://github.com/cuiyx1720/Awesome-CoT4AD.

Article 1279

Title@2025-05-26 (1): Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Title: Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Rollen Sie die Würfel & Blick, bevor Sie springen: Gehen über die kreativen Grenzen der Next-Token-Vorhersage

跳跃前的骰子滚动和看一看:超越了次声预测的创造性极限 2504.15266v2

Authors: Vaishnavh Nagarajan, Chen Henry Wu, Charles Ding, Aditi Raghunathan

We design a suite of minimal algorithmic tasks that are a loose abstraction of open-ended real-world tasks. This allows us to cleanly and controllably quantify the creative limits of the present-day language model. Much like real-world tasks that require a creative, far-sighted leap of thought, our tasks require an implicit, open-ended stochastic planning step that either (a) discovers new connections in an abstract knowledge graph (as in wordplay, drawing analogies, or research) or (b) constructs new patterns (as in designing math problems or new proteins). In these tasks, we empirically and conceptually argue that next-token learning is myopic and memorizes excessively; multi-token approaches, namely teacherless training and diffusion models, comparatively excel in producing diverse and original output. Secondly, to elicit randomness without hurting coherence, we find that injecting noise at the input layer (dubbed seed-conditioning) works surprisingly as well as (and in some conditions, better than) temperature sampling from the output layer. Thus, our work offers a principled, minimal test-bed for analyzing open-ended creative skills, and offers new arguments for going beyond next-token learning and temperature sampling. We make part of the code available under https://github.com/chenwu98/algorithmic-creativity

Article 1280

Title@2025-05-26 (1): Gradient Flow Matching for Learning Update Dynamics in Neural Network Training

Title: Gradient Flow Matching for Learning Update Dynamics in Neural Network Training

Gradient Flow Passend zum Lernen von Update-Dynamik im neuralen Netzwerktraining

神经网络培训中学习更新动态动态的渐进流程匹配 2505.20221v1

Authors: Xiao Shou, Yanna Ding, Jianxi Gao

Training deep neural networks remains computationally intensive due to the iterative nature of gradient-based optimization. We propose Gradient Flow Matching (GFM), a continuous-time modeling framework that treats neural network training as a dynamical system governed by learned optimizer-aware vector fields. By leveraging conditional flow matching, GFM captures the underlying update rules of optimizers such as SGD, Adam, and RMSprop, enabling smooth extrapolation of weight trajectories toward convergence. Unlike black-box sequence models, GFM incorporates structural knowledge of gradient-based updates into the learning objective, facilitating accurate forecasting of final weights from partial training sequences. Empirically, GFM achieves forecasting accuracy that is competitive with Transformer-based models and significantly outperforms LSTM and other classical baselines. Furthermore, GFM generalizes across neural architectures and initializations, providing a unified framework for studying optimization dynamics and accelerating convergence prediction.

Article 1281

Title@2025-05-26 (1): Open the Eyes of MPNN: Vision Enhances MPNN in Link Prediction

Title: Open the Eyes of MPNN: Vision Enhances MPNN in Link Prediction

Öffnen Sie die Augen von MPNN: Vision verbessert MPNN in Link Prediction

MPNNN的 “ 睁开眼 “ :愿景在 “ 连结预测 “ 中加强MPNN 2505.08266v2

Authors: Yanbin Wei, Xuehao Wang, Zhan Zhuang, Yang Chen, Shuhao Chen, Yulong Zhang, Yu Zhang, James Kwok

Message-passing graph neural networks (MPNNs) and structural features (SFs) are cornerstones for the link prediction task. However, as a common and intuitive mode of understanding, the potential of visual perception has been overlooked in the MPNN community. For the first time, we equip MPNNs with vision structural awareness by proposing an effective framework called Graph Vision Network (GVN), along with a more efficient variant (E-GVN). Extensive empirical results demonstrate that with the proposed frameworks, GVN consistently benefits from the vision enhancement across seven link prediction datasets, including challenging large-scale graphs. Such improvements are compatible with existing state-of-the-art (SOTA) methods and GVNs achieve new SOTA results, thereby underscoring a promising novel direction for link prediction.

Article 1282

Title@2025-05-26 (1): New Perspectives on the Polyak Stepsize: Surrogate Functions and Negative Results

Title: New Perspectives on the Polyak Stepsize: Surrogate Functions and Negative Results

Neue Perspektiven auf die Polyak Stepsize: Surrogate-Funktionen und negative Ergebnisse

关于 “ 多边步骤的新观点:代理功能和消极结果 “ 2505.20219v1

Authors: Francesco Orabona, Ryan D’Orazio

The Polyak stepsize has been proven to be a fundamental stepsize in convex optimization, giving near optimal gradient descent rates across a wide range of assumptions. The universality of the Polyak stepsize has also inspired many stochastic variants, with theoretical guarantees and strong empirical performance. Despite the many theoretical results, our understanding of the convergence properties and shortcomings of the Polyak stepsize or its variants is both incomplete and fractured across different analyses. We propose a new, unified, and simple perspective for the Polyak stepsize and its variants as gradient descent on a surrogate loss. We show that each variant is equivalent to minimizing a surrogate function with stepsizes that adapt to a guaranteed local curvature. Our general surrogate loss perspective is then used to provide a unified analysis of existing variants across different assumptions. Moreover, we show a number of negative results proving that the non-convergence appearing in some of the upper bounds is indeed real.
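
As a reference point, the classical Polyak stepsize for a convex $f$ with known optimal value $f^\star$ is $\eta_t = (f(x_t) - f^\star)/\|g_t\|^2$, where $g_t$ is a (sub)gradient at $x_t$. The sketch below is only an illustration of that classical rule on a hypothetical quadratic; it is not the surrogate construction from the paper, and the names `polyak_gd`, `f`, and `grad` are assumptions made for this example.

```python
import numpy as np

def polyak_gd(f, grad, x0, f_star, steps=100):
    """Gradient descent with the classical Polyak stepsize (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad(x)
        gap = f(x) - f_star          # suboptimality gap, requires knowing f_star
        gnorm2 = float(g @ g)        # squared gradient norm
        if gap <= 0.0 or gnorm2 == 0.0:
            break                    # already optimal (or stationary)
        x = x - (gap / gnorm2) * g   # Polyak step: eta_t = gap / ||g||^2
    return x

# Hypothetical test problem (an assumption): f(x) = 0.5 * ||x||^2 with f_star = 0.
f = lambda x: 0.5 * float(x @ x)
grad = lambda x: x
x_final = polyak_gd(f, grad, x0=[3.0, -4.0], f_star=0.0, steps=50)
```

On this quadratic the rule reduces to a constant stepsize of 1/2, so each iterate halves the distance to the minimizer.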

Article 1283

Title@2025-05-26 (1): Fine-grained List-wise Alignment for Generative Medication Recommendation

Title: Fine-grained List-wise Alignment for Generative Medication Recommendation

Feinkörnige List-Wise-Ausrichtung für Generative Medikamente Empfehlung

生产用药建议精制清单调整 2505.20218v1

Authors: Chenxiao Fan, Chongming Gao, Wentao Shi, Yaxin Gong, Zihao Zhao, Fuli Feng

Accurate and safe medication recommendations are critical for effective clinical decision-making, especially in multimorbidity cases. However, existing systems rely on point-wise prediction paradigms that overlook synergistic drug effects and potential adverse drug-drug interactions (DDIs). We propose FLAME, a fine-grained list-wise alignment framework for large language models (LLMs), enabling drug-by-drug generation of drug lists. FLAME formulates recommendation as a sequential decision process, where each step adds or removes a single drug. To provide fine-grained learning signals, we devise step-wise Group Relative Policy Optimization (GRPO) with potential-based reward shaping, which explicitly models DDIs and optimizes the contribution of each drug to the overall prescription. Furthermore, FLAME enhances patient modeling by integrating structured clinical knowledge and collaborative information into the representation space of LLMs. Experiments on benchmark datasets demonstrate that FLAME achieves state-of-the-art performance, delivering superior accuracy, controllable safety-accuracy trade-offs, and strong generalization across diverse clinical scenarios. Our code is available at https://github.com/cxfann/Flame.

Article 1284

Title@2025-05-26 (1): Parameter-Efficient Fine-Tuning with Column Space Projection

Title: Parameter-Efficient Fine-Tuning with Column Space Projection

Parameter-Effizient Feintuning mit Säulenraumprojektion

带有列空间投射的高效参数精密设计 2505.20211v1

Authors: Junseo Hwang, Wonguk Cho, Taesup Kim

Fine-tuning large language models (LLMs) with minimal computational overhead is essential for efficiently adapting them to downstream tasks under resource constraints. Parameter-efficient fine-tuning (PEFT) methods, such as Low-Rank Adaptation (LoRA), facilitate this by updating only a small subset of parameters. However, recent studies show that LoRA diverges from full fine-tuning (Full FT) in its learning behavior, particularly in terms of spectral properties. Motivated by these findings, we propose PiCa, the first theoretically grounded PEFT method based on the spectral properties of fine-tuned weights. PiCa projects gradients onto the low-rank column subspace of pre-trained weights and exhibits learning patterns more closely aligned with Full FT. Furthermore, we show that combining PiCa with weight sharing drastically reduces the number of trainable parameters without compromising performance, achieving better performance than LoRA with 13x fewer trainable parameters. Extensive experiments demonstrate that PiCa achieves state-of-the-art performance compared to existing PEFT methods.
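
The projection step can be pictured with a small NumPy sketch. It assumes, as one plausible reading of the abstract, that the "low-rank column subspace" is spanned by the top-$r$ left singular vectors of the pre-trained weight matrix; the function names, shapes, and rank below are hypothetical and not taken from the paper.

```python
import numpy as np

def column_space_basis(w0, rank):
    """Top-`rank` left singular vectors of the pre-trained weights (assumed basis)."""
    u, _, _ = np.linalg.svd(w0, full_matrices=False)
    return u[:, :rank]

def project_gradient(u_r, grad):
    """Orthogonal projection of a gradient onto span(u_r): U_r U_r^T G."""
    return u_r @ (u_r.T @ grad)

# Hypothetical shapes and random data, for illustration only.
rng = np.random.default_rng(0)
w0 = rng.standard_normal((8, 6))     # stand-in for a pre-trained weight matrix
grad = rng.standard_normal((8, 6))   # stand-in for a full fine-tuning gradient
u_r = column_space_basis(w0, rank=2)
g_proj = project_gradient(u_r, grad)
```

Because `u_r` has orthonormal columns, projecting twice gives the same result as projecting once, and the projected gradient has rank at most `rank`.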

Article 1285

Title@2025-05-26 (1): FedECA: A Federated External Control Arm Method for Causal Inference with Time-To-Event Data in Distributed Settings

Title: FedECA: A Federated External Control Arm Method for Causal Inference with Time-To-Event Data in Distributed Settings

FedECA: Eine Federated External Control Arm Methode für ursächliche Schlussfolgerungen mit Zeit-bis-Event-Daten in verteilten Einstellungen

FedECA:在分布环境中利用时间到时间的数据进行因果关系推断的联邦外部控制武器法 2311.16984v9

Authors: Jean Ogier du Terrail, Quentin Klopfenstein, Honghao Li, Imke Mayer, Nicolas Loiseau, Mohammad Hallal, Michael Debouver, Thibault Camalon, Thibault Fouqueray, Jorge Arellano Castro, Zahia Yanes, Laëtitia Dahan, Julien Taïeb, Pierre Laurent-Puig, Jean-Baptiste Bachet, Shulin Zhao, Remy Nicolle, Jérome Cros, Daniel Gonzalez, Robert Carreras-Torres, Adelaida Garcia Velasco, Kawther Abdilleh, Sudheer Doss, Félix Balazard, Mathieu Andreux

External control arms (ECA) can inform the early clinical development of experimental drugs and provide efficacy evidence for regulatory approval. However, the main challenge in implementing ECA lies in accessing real-world or historical clinical trials data. Indeed, regulations protecting patients’ rights by strictly controlling data processing often make it difficult to pool data from multiple sources in a central server. To address these limitations, we develop a new method, ‘FedECA’, that leverages federated learning (FL) to enable inverse probability of treatment weighting (IPTW) for time-to-event outcomes on separate cohorts without needing to pool data. To showcase the potential of FedECA, we apply it in different settings of increasing complexity, culminating in a real-world use case in which FedECA is used to compare the treatment effect of two approved chemotherapy regimens using data from three separate cohorts of patients with metastatic pancreatic cancer. By sharing our code, we hope FedECA will foster the creation of federated research networks and thus accelerate drug development.

Article 1286

Title@2025-05-26 (1): Temporal Sampling for Forgotten Reasoning in LLMs

Title: Temporal Sampling for Forgotten Reasoning in LLMs

Zeitliche Probenahme für vergessene Vernunft in LLMs

LLM 被遗忘原因的时间抽样 2505.20196v1

Authors: Yuetai Li, Zhangchen Xu, Fengqing Jiang, Bhaskar Ramasubramanian, Luyao Niu, Bill Yuchen Lin, Xiang Yue, Radha Poovendran

Fine-tuning large language models (LLMs) is intended to improve their reasoning capabilities, yet we uncover a counterintuitive effect: models often forget how to solve problems they previously answered correctly during training. We term this phenomenon temporal forgetting and show that it is widespread across model sizes, fine-tuning methods (both Reinforcement Learning and Supervised Fine-Tuning), and multiple reasoning benchmarks. To address this gap, we introduce Temporal Sampling, a simple decoding strategy that draws outputs from multiple checkpoints along the training trajectory. This approach recovers forgotten solutions without retraining or ensembling, and leads to substantial improvements in reasoning performance, with gains of 4 to 19 points in Pass@k and consistent gains in Majority@k across several benchmarks. We further extend our method to LoRA-adapted models, demonstrating that storing only adapter weights across checkpoints achieves similar benefits with minimal storage cost. By leveraging the temporal diversity inherent in training, Temporal Sampling offers a practical, compute-efficient way to surface hidden reasoning ability and rethink how we evaluate LLMs.
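
One simple way to spread a sampling budget of $k$ outputs over several checkpoints is a round-robin split, sketched below. The abstract does not say how samples are allocated, so the round-robin rule, the function name `allocate_samples`, and the checkpoint labels are all assumptions made for illustration.

```python
def allocate_samples(k, checkpoints):
    """Split a budget of k samples across checkpoints in round-robin order.

    Round-robin is an assumed allocation rule, not necessarily the paper's.
    """
    counts = {ckpt: 0 for ckpt in checkpoints}
    for i in range(k):
        counts[checkpoints[i % len(checkpoints)]] += 1
    return counts

# Hypothetical checkpoint labels along one training run.
budget = allocate_samples(8, ["step_1000", "step_2000", "step_3000"])
```

The total budget stays at $k$, so the comparison with sampling $k$ outputs from the final checkpoint alone uses the same number of generations.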

Article 1287

Title: FunReason: Enhancing Large Language Models’ Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement

FunReason: Erweiterung der Funktion großer Sprachmodelle durch Multiscale-Verluste und automatisierte Datenverfeinerung durch Selbst-Refinement

FunReason:通过自我改进、多尺度损失和数据自动化改进加强大语言模型功能 2505.20192v1

Authors: Bingguang Hao, Maolin Wang, Zengzhuang Xu, Cunyin Peng, Yicheng Chen, Xiangyu Zhao, Jinjie Gu, Chenyi Zhuang

The integration of large language models (LLMs) with function calling has emerged as a crucial capability for enhancing their practical utility in real-world applications. However, effectively combining reasoning processes with accurate function execution remains a significant challenge. Traditional training approaches often struggle to balance the detailed reasoning steps with the precision of function calls, leading to suboptimal performance. To address these limitations, we introduce FunReason, a novel framework that enhances LLMs’ function calling capabilities through an automated data refinement strategy and a Self-Refinement Multiscale Loss (SRML) approach. FunReason leverages LLMs’ natural reasoning abilities to generate high-quality training examples, focusing on query parseability, reasoning coherence, and function call precision. The SRML approach dynamically balances the contribution of reasoning processes and function call accuracy during training, addressing the inherent trade-off between these two critical aspects. FunReason achieves performance comparable to GPT-4o while effectively mitigating catastrophic forgetting during fine-tuning. FunReason provides a comprehensive solution for enhancing LLMs’ function calling capabilities by introducing a balanced training methodology and a data refinement pipeline. For code and dataset, please refer to our repository at GitHub https://github.com/BingguangHao/FunReason

Article 1288

Title@2025-05-26 (1): Private Geometric Median in Nearly-Linear Time

Title: Private Geometric Median in Nearly-Linear Time

Private Geometrische Medien in fast linearer Zeit

近利时私人几何中位数 2505.20189v1

Authors: Syamantak Kumar, Daogao Liu, Kevin Tian, Chutong Yang

Estimating the geometric median of a dataset is a robust counterpart to mean estimation, and is a fundamental problem in computational geometry. Recently, [HSU24] gave an $(\varepsilon, \delta)$-differentially private algorithm obtaining an $\alpha$-multiplicative approximation to the geometric median objective, $\frac{1}{n} \sum_{i \in [n]} \|\cdot - \mathbf{x}_i\|$, given a dataset $\mathcal{D} := \{\mathbf{x}_i\}_{i \in [n]} \subset \mathbb{R}^d$. Their algorithm requires $n \gtrsim \sqrt{d} \cdot \frac{1}{\alpha\varepsilon}$ samples, which they prove is information-theoretically optimal. This result is surprising because its error scales with the \emph{effective radius} of $\mathcal{D}$ (i.e., of a ball capturing most points), rather than the worst-case radius. We give an improved algorithm that obtains the same approximation quality, also using $n \gtrsim \sqrt{d} \cdot \frac{1}{\alpha\varepsilon}$ samples, but in time $\widetilde{O}(nd + \frac{d}{\alpha^2})$. Our runtime is nearly-linear, plus the cost of the cheapest non-private first-order method due to [CLM+16]. To achieve our results, we use subsampling and geometric aggregation tools inspired by FriendlyCore [TCK+22] to speed up the “warm start” component of the [HSU24] algorithm, combined with a careful custom analysis of DP-SGD’s sensitivity for the geometric median objective.
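
To make the objective concrete, the sketch below evaluates the average-distance objective and minimizes it with the classical Weiszfeld iteration. This is a non-private baseline chosen only for illustration; it is not the algorithm of the paper or of [HSU24], and the function names and toy dataset are assumptions.

```python
import numpy as np

def geometric_median_objective(y, points):
    """Average Euclidean distance from y to the data points."""
    return float(np.mean(np.linalg.norm(points - y, axis=1)))

def weiszfeld(points, steps=200, eps=1e-12):
    """Non-private Weiszfeld iteration, started from the coordinate-wise mean."""
    y = points.mean(axis=0)
    for _ in range(steps):
        dist = np.maximum(np.linalg.norm(points - y, axis=1), eps)  # avoid divide-by-zero
        w = 1.0 / dist                                              # inverse-distance weights
        y = (w[:, None] * points).sum(axis=0) / w.sum()
    return y

# Toy dataset (an assumption): three nearby points and one far outlier.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [100.0, 100.0]])
median = weiszfeld(points)
mean = points.mean(axis=0)
```

The outlier drags the mean far from the cluster, while the geometric median stays close to the three nearby points, which is the robustness property the abstract refers to.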

Article 1289

Title@2025-05-26 (1): Research on feature fusion and multimodal patent text based on graph attention network

Title: Research on feature fusion and multimodal patent text based on graph attention network

Forschungsarbeiten über Feature Fusion und multimodalen Patenttext auf der Grundlage von Graphen Aufmerksamkeit Netzwerk

根据图示关注网络研究地物聚合和多式专利法 2505.20188v1

Authors: Zhenzhen Song, Ziwei Liu, Hongji Li

To address the problems of cross-modal feature fusion, inefficient long-text modeling, and a lack of hierarchical semantic coherence in patent text semantic mining, this study proposes HGM-Net, a deep learning framework that integrates Hierarchical Comparative Learning (HCL), a Multi-modal Graph Attention Network (M-GAT), and Multi-Granularity Sparse Attention (MSA). HCL builds dynamic masking, contrastive, and cross-structural similarity constraints across the word, sentence, and paragraph hierarchies to strengthen the local semantic and global thematic consistency of patent text; M-GAT models patent classification codes, citation relations, and text semantics as heterogeneous graph structures, and achieves dynamic fusion of multi-source features through cross-modal gated attention; MSA adopts a hierarchical sparsity strategy to optimize the computational efficiency of long-text modeling at word, phrase, sentence, and paragraph granularity. Experiments show that the framework has significant advantages over existing deep learning methods in tasks such as patent classification and similarity matching, and offers a solution of both theoretical and practical value for improving patent examination efficiency and mining technology relevance.

Article 1290

Title@2025-05-26 (1): UniMoMo: Unified Generative Modeling of 3D Molecules for De Novo Binder Design

Title: UniMoMo: Unified Generative Modeling of 3D Molecules for De Novo Binder Design

UniMoMo: Unified Generative Modellierung von 3D-Molekülen für De Novo Binder Design

UniMomo:De Novo Binder 设计3D Molecules的统一生成模型 2503.19300v3

Authors: Xiangzhe Kong, Zishen Zhang, Ziting Zhang, Rui Jiao, Jianzhu Ma, Wenbing Huang, Kai Liu, Yang Liu

The design of target-specific molecules such as small molecules, peptides, and antibodies is vital for biological research and drug discovery. Existing generative methods are restricted to single-domain molecules, failing to address versatile therapeutic needs or utilize cross-domain transferability to enhance model performance. In this paper, we introduce Unified generative Modeling of 3D Molecules (UniMoMo), the first framework capable of designing binders of multiple molecular domains using a single model. In particular, UniMoMo unifies the representations of different molecules as graphs of blocks, where each block corresponds to either a standard amino acid or a molecular fragment. Subsequently, UniMoMo utilizes a geometric latent diffusion model for 3D molecular generation, featuring an iterative full-atom autoencoder to compress blocks into latent space points, followed by an E(3)-equivariant diffusion process. Extensive benchmarks across peptides, antibodies, and small molecules demonstrate the superiority of our unified framework over existing domain-specific models, highlighting the benefits of multi-domain training.

Article 1291

Title@2025-05-26 (1): Linearization of ReLU Activation Function for Neural Network-Embedded Optimization: Optimal Day-Ahead Energy Scheduling

Title: Linearization of ReLU Activation Function for Neural Network-Embedded Optimization: Optimal Day-Ahead Energy Scheduling

Linearisierung der ReLU-Aktivierungsfunktion für neurale Netzwerk-Embedded-Optimierung: Optimale Day-Ahead-Energieplanung

ReLU神经网络激活功能的线性化 2310.01758v2

Authors: Cunzhi Zhao, Fan Jiang, Xingpeng Li

Recently, neural networks have been widely applied in the power system area. They can be used to better predict input information and to model system performance with increased accuracy. In some applications, such as battery degradation neural network-based microgrid day-ahead energy scheduling, the input features of the trained learning model are variables to be solved in optimization models that enforce limits on the output of the same learning model. This creates a neural network-embedded optimization problem; the use of nonlinear activation functions in the neural network makes such problems extremely hard to solve, if not unsolvable. To address this emerging challenge, this paper investigates different methods for linearizing the nonlinear activation functions, with a particular focus on the widely used rectified linear unit (ReLU) function. Four linearization methods tailored for the ReLU activation function are developed, analyzed, and compared in this paper. Each method employs a set of linear constraints to replace the ReLU function, effectively linearizing the optimization problem, which can overcome the computational challenges associated with the nonlinearity of the neural network model. These proposed linearization methods provide valuable tools for effectively solving optimization problems that integrate neural network models with ReLU activation functions.
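
For orientation, one widely used way to replace $y = \max(x, 0)$ with linear constraints is the big-M encoding with a binary variable $z$ and known bounds $lo \le x \le hi$ (with $lo < 0 < hi$): $y \ge x$, $y \ge 0$, $y \le x - lo\,(1 - z)$, $y \le hi \cdot z$. The abstract does not describe its four methods, so this standard encoding may or may not coincide with any of them; the function names and bounds below are assumptions used only to check the constraints numerically.

```python
def bigm_feasible(x, y, z, lo, hi, tol=1e-9):
    """Check the four big-M constraints for one ReLU unit with z in {0, 1}."""
    return (
        y >= x - tol                       # y >= x
        and y >= -tol                      # y >= 0
        and y <= x - lo * (1 - z) + tol    # y <= x - lo * (1 - z)
        and y <= hi * z + tol              # y <= hi * z
    )

def relu_via_bigm(x, lo, hi):
    """Return the output y that the constraints admit for a given input x."""
    for z in (0, 1):
        y = x if z == 1 else 0.0           # z = 1 forces y = x, z = 0 forces y = 0
        if bigm_feasible(x, y, z, lo, hi):
            return y
    raise ValueError("x lies outside the assumed bounds [lo, hi]")

# Hypothetical pre-activation bounds, for illustration only.
y_neg = relu_via_bigm(-2.0, lo=-5.0, hi=5.0)
y_pos = relu_via_bigm(3.0, lo=-5.0, hi=5.0)
```

Tight bounds `lo` and `hi` matter in practice, since loose bounds weaken the linear relaxation that a mixed-integer solver works with.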

Article 1292

Title@2025-05-26 (1): Bayesian Optimisation Against Climate Change: Applications and Benchmarks

Title: Bayesian Optimisation Against Climate Change: Applications and Benchmarks

Bayesische Optimierung gegen den Klimawandel: Anwendungen und Benchmarks

Bayesian最佳应对气候变化:应用和基准 2306.04343v2

Authors: Sigrid Passano Hellan, Christopher G. Lucas, Nigel H. Goddard

Bayesian optimisation is a powerful method for optimising black-box functions, popular in settings where the true function is expensive to evaluate and no gradient information is available. Bayesian optimisation can improve responses to many optimisation problems within climate change for which simulator models are unavailable or expensive to sample from. While there have been several demonstrations of climate-related applications, there has been no unifying review of applications and benchmarks. We provide such a review here, to encourage the use of Bayesian optimisation for important and well-suited applications. We identify four main application domains: material discovery, wind farm layout, optimal renewable control and environmental monitoring. For each domain we identify a public benchmark or data set that is easy to use and evaluate systems against, while being representative of real-world problems. Due to the lack of a suitable benchmark for environmental monitoring, we propose LAQN-BO, based on air pollution data. Our contributions are: a) summarising Bayesian optimisation applications related to climate change; b) identifying a representative range of benchmarks, providing example code where necessary; and c) introducing a new benchmark, LAQN-BO.

Article 1293

Title@2025-05-26 (1): On the Volatility of Shapley-Based Contribution Metrics in Federated Learning

Title: On the Volatility of Shapley-Based Contribution Metrics in Federated Learning

Über die Volatilität von Shapley-Based Contribution Metrics im Federated Learning

联邦学习中基于毛质的贡献度量变化无常 2405.08044v4

Authors: Arno Geimer, Beltran Fiz, Radu State

Federated learning (FL) is a collaborative and privacy-preserving Machine Learning paradigm, allowing the development of robust models without the need to centralize sensitive data. A critical challenge in FL lies in fairly and accurately allocating contributions from diverse participants. Inaccurate allocation can undermine trust, lead to unfair compensation, and thus participants may lack the incentive to join or actively contribute to the federation. Various remuneration strategies have been proposed to date, including auction-based approaches and Shapley-value-based methods, the latter offering a means to quantify the contribution of each participant. However, little to no work has studied the stability of these contribution evaluation methods. In this paper, we evaluate participant contributions in federated learning using gradient-based model reconstruction techniques with Shapley values and compare the round-based contributions to a classic data contribution measurement scheme. We provide an extensive analysis of the discrepancies of Shapley values across a set of aggregation strategies and examine them on an overall and a per-client level. We show that, between different aggregation techniques, Shapley values lead to unstable reward allocations among participants. Our analysis spans various data heterogeneity distributions, including independent and identically distributed (IID) and non-IID scenarios.
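
For readers unfamiliar with the metric, the Shapley value of a participant is its marginal contribution to a utility function, averaged over all orderings of the participants. The sketch below computes it exactly for a tiny federation; the utility function (a made-up accuracy table) and the client names are hypothetical, and the paper's gradient-based reconstruction scheme is not reproduced here.

```python
from itertools import permutations

def shapley_values(players, utility):
    """Exact Shapley values by averaging marginal gains over all orderings."""
    values = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            values[p] += utility(with_p) - utility(coalition)  # marginal contribution
            coalition = with_p
    return {p: v / len(orders) for p, v in values.items()}

# Hypothetical utility: model accuracy by coalition size, plus a bonus for client "a".
ACCURACY_BY_SIZE = {0: 0.0, 1: 0.6, 2: 0.8, 3: 0.9}

def utility(coalition):
    return ACCURACY_BY_SIZE[len(coalition)] + (0.05 if "a" in coalition else 0.0)

values = shapley_values(["a", "b", "c"], utility)
```

The values sum to the utility of the full federation (the efficiency property), so any change in the utility, for example from a different aggregation strategy, shifts how that total is split among clients.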

Article 1294

Title@2025-05-26 (1): No Free Lunch: Non-Asymptotic Analysis of Prediction-Powered Inference

Title: No Free Lunch: Non-Asymptotic Analysis of Prediction-Powered Inference

Kein kostenloses Mittagessen: Nicht-asymptotische Analyse von Vorhersage-Powered Inferenz

无免费午餐:预测力推论的非心理分析 2505.20178v1

Authors: Pranav Mani, Peng Xu, Zachary C. Lipton, Michael Oberst

Prediction-Powered Inference (PPI) is a popular strategy for combining gold-standard and possibly noisy pseudo-labels to perform statistical estimation. Prior work has shown an asymptotic “free lunch” for PPI++, an adaptive form of PPI, showing that the asymptotic variance of PPI++ is always less than or equal to the variance obtained from using gold-standard labels alone. Notably, this result holds regardless of the quality of the pseudo-labels. In this work, we demystify this result by conducting an exact finite-sample analysis of the estimation error of PPI++ on the mean estimation problem. We give a “no free lunch” result, characterizing the settings (and sample sizes) where PPI++ has provably worse estimation error than using gold-standard labels alone. Specifically, PPI++ will outperform if and only if the correlation between pseudo-labels and gold-standard labels is above a certain level that depends on the number of labeled samples ($n$). In some cases our results simplify considerably: For Gaussian data, the correlation must be at least $1/\sqrt{n - 2}$ in order to see improvement, and a similar result holds for binary labels. In experiments, we illustrate that our theoretical findings hold on real-world datasets, and give insights into trade-offs between single-sample and sample-splitting variants of PPI++.
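
A minimal sketch of the mean-estimation setting may help. It assumes the usual power-tuned form of the estimator, $\hat\theta_\lambda = \bar{Y}_n + \lambda\,(\bar{f}_N - \bar{f}_n)$, with a plug-in $\lambda$ estimated from the labeled data; the plug-in formula, the function names, and the synthetic data are assumptions for illustration, while the Gaussian threshold $1/\sqrt{n-2}$ is taken from the abstract.

```python
import numpy as np

def ppi_mean(y_lab, f_lab, f_unlab, lam):
    """Power-tuned PPI estimate of the mean; lam = 0 recovers the labeled-only mean."""
    return float(np.mean(y_lab) + lam * (np.mean(f_unlab) - np.mean(f_lab)))

def plugin_lambda(y_lab, f_lab, n_unlab):
    """Assumed plug-in choice: Cov(Y, f) / ((1 + n/N) * Var(f)), from labeled data."""
    n = len(y_lab)
    cov = np.cov(y_lab, f_lab, ddof=1)[0, 1]
    var_f = np.var(f_lab, ddof=1)
    return float(cov / ((1.0 + n / n_unlab) * var_f))

def gaussian_threshold(n):
    """Correlation level quoted in the abstract for Gaussian data: 1 / sqrt(n - 2)."""
    return 1.0 / np.sqrt(n - 2)

# Synthetic data (an assumption): pseudo-labels are noisy copies of the true labels.
rng = np.random.default_rng(0)
y_lab = rng.normal(size=50)
f_lab = y_lab + rng.normal(scale=0.5, size=50)
f_unlab = rng.normal(size=5000) + rng.normal(scale=0.5, size=5000)

lam = plugin_lambda(y_lab, f_lab, n_unlab=len(f_unlab))
estimate = ppi_mean(y_lab, f_lab, f_unlab, lam)
```

Because $\lambda$ is itself estimated from the $n$ labeled samples, its noise enters the finite-sample error, which is the effect the threshold on the correlation accounts for.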

Article 1295

Title@2025-05-26 (1): The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination

Title: The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination

Die Macht des iterativen Filterns für überwachtes Lernen mit (schwerer) Kontaminierung

受监督学习(重)污染的迭代过滤功能 2505.20177v1

Authors: Adam R. Klivans, Konstantinos Stavropoulos, Kevin Tian, Arsen Vasilyan

Inspired by recent work on learning with distribution shift, we give a general outlier removal algorithm called iterative polynomial filtering and show a number of striking applications for supervised learning with contamination: (1) We show that any function class that can be approximated by low-degree polynomials with respect to a hypercontractive distribution can be efficiently learned under bounded contamination (also known as nasty noise). This is a surprising resolution to a longstanding gap between the complexity of agnostic learning and learning with contamination, as it was widely believed that low-degree approximators only implied tolerance to label noise. (2) For any function class that admits the (stronger) notion of sandwiching approximators, we obtain near-optimal learning guarantees even with respect to heavy additive contamination, where far more than $1/2$ of the training set may be added adversarially. Prior related work held only for regression and in a list-decodable setting. (3) We obtain the first efficient algorithms for tolerant testable learning of functions of halfspaces with respect to any fixed log-concave distribution. Even the non-tolerant case for a single halfspace in this setting had remained open. These results significantly advance our understanding of efficient supervised learning under contamination, a setting that has been much less studied than its unsupervised counterpart.

Article 1296

Title@2025-05-26 (1): “KAN you hear me?” Exploring Kolmogorov-Arnold Networks for Spoken Language Understanding

Title: “KAN you hear me?” Exploring Kolmogorov-Arnold Networks for Spoken Language Understanding

“KAN hörst du mich?” Kolmogorov-Arnold-Netzwerke für gesprochenes Sprachverständnis erkunden

探索科尔莫戈洛夫-阿诺尔德语言理解网络 2505.20176v1

Authors: Alkis Koudounas, Moreno La Quatra, Eliana Pastor, Sabato Marco Siniscalchi, Elena Baralis

Kolmogorov-Arnold Networks (KANs) have recently emerged as a promising alternative to traditional neural architectures, yet their application to speech processing remains underexplored. This work presents the first investigation of KANs for Spoken Language Understanding (SLU) tasks. We experiment with 2D-CNN models on two datasets, integrating KAN layers in five different configurations within the dense block. The best-performing setup, which places a KAN layer between two linear layers, is directly applied to transformer-based models and evaluated on five SLU datasets with increasing complexity. Our results show that KAN layers can effectively replace the linear layers, achieving comparable or superior performance in most cases. Finally, we provide insights into how KAN and linear layers on top of transformers differently attend to input regions of the raw waveforms.

Article 1297

Title@2025-05-26 (1): mPOLICE: Provable Enforcement of Multi-Region Affine Constraints in Deep Neural Networks

Title: mPOLICE: Provable Enforcement of Multi-Region Affine Constraints in Deep Neural Networks

mPOLICE: Wahrscheinliche Durchsetzung von Multi-Region Affine-Konstraints in tiefen neuralen Netzwerken

MPOLICE: 在深神经网络中以可行方式执行多种区域同系限制 2502.02434v2

Authors: Mohammadmehdi Ataei, Hyunmin Cheong, Adrian Butscher

Deep neural networks are increasingly used in safety-critical domains such as robotics and scientific modeling, where strict adherence to output constraints is essential. Methods like POLICE, which are tailored for single convex regions, face challenges when extended to multiple disjoint regions, often leading to constraint violations or unwanted affine behavior across regions. This paper proposes mPOLICE, a new approach that generalizes POLICE to provably enforce affine constraints over multiple disjoint convex regions. At its core, mPOLICE assigns distinct neuron activation patterns to each constrained region, enabling localized affine behavior and avoiding unintended generalization. This is implemented through a layer-wise optimization of the network parameters. Additionally, we introduce a training algorithm that incorporates mPOLICE into conventional deep learning pipelines, balancing task-specific performance with constraint enforcement using periodic sign pattern enforcement. We validate the flexibility and effectiveness of mPOLICE through experiments across various applications, including safety-critical reinforcement learning, implicit 3D shape representation with geometric constraints, and fluid dynamics simulations with boundary condition enforcement. Importantly, mPOLICE incurs no runtime overhead during inference, making it a practical and reliable solution for constraint handling in deep neural networks.

Article 1298

Title@2025-05-26 (1): Virtual Cells: Predict, Explain, Discover

Title: Virtual Cells: Predict, Explain, Discover

Virtuelle Zellen: Vorhersagen, Erklären, Entdecken

虚拟细胞: 预测、解释、发现 2505.14613v2

Authors: Emmanuel Noutahi, Jason Hartford, Prudencio Tossou, Shawn Whitfield, Alisandra K. Denton, Cas Wognum, Kristina Ulicna, Michael Craig, Jonathan Hsu, Michael Cuccarese, Emmanuel Bengio, Dominique Beaini, Christopher Gibson, Daniel Cohen, Berton Earnshaw

Drug discovery is fundamentally a process of inferring the effects of treatments on patients, and would therefore benefit immensely from computational models that can reliably simulate patient responses, enabling researchers to generate and test large numbers of therapeutic hypotheses safely and economically before initiating costly clinical trials. Even a more specific model that predicts the functional response of cells to a wide range of perturbations would be tremendously valuable for discovering safe and effective treatments that successfully translate to the clinic. Creating such virtual cells has long been a goal of the computational research community that unfortunately remains unachieved given the daunting complexity and scale of cellular biology. Nevertheless, recent advances in AI, computing power, lab automation, and high-throughput cellular profiling provide new opportunities for reaching this goal. In this perspective, we present a vision for developing and evaluating virtual cells that builds on our experience at Recursion. We argue that in order to be a useful tool to discover novel biology, virtual cells must accurately predict the functional response of a cell to perturbations and explain how the predicted response is a consequence of modifications to key biomolecular interactions. We then introduce key principles for designing therapeutically-relevant virtual cells, describe a lab-in-the-loop approach for generating novel insights with them, and advocate for biologically-grounded benchmarks to guide virtual cell development. Finally, we make the case that our approach to virtual cells provides a useful framework for building other models at higher levels of organization, including virtual patients. We hope that these directions prove useful to the research community in developing virtual models optimized for positive impact on drug discovery outcomes.

Article 1299

Title@2025-05-26 (1): A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation

Title: A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation

Ein theoretischer Rahmen für Grokking: Interpolation gefolgt von Riemannsche Norm Minimierung

Grokking理论框架:内插,然后是Riemannian Norm 最小化 2505.20172v1

Authors: Etienne Boursier, Scott Pesme, Radu-Alexandru Dragomir

We study the dynamics of gradient flow with small weight decay on general training losses $F: \mathbb{R}^d \to \mathbb{R}$. Under mild regularity assumptions and assuming convergence of the unregularised gradient flow, we show that the trajectory with weight decay $\lambda$ exhibits a two-phase behaviour as $\lambda \to 0$. During the initial fast phase, the trajectory follows the unregularised gradient flow and converges to a manifold of critical points of $F$. Then, at time of order $1/\lambda$, the trajectory enters a slow drift phase and follows a Riemannian gradient flow minimising the $\ell_2$-norm of the parameters. This purely optimisation-based phenomenon offers a natural explanation for the \textit{grokking} effect observed in deep learning, where the training loss rapidly reaches zero while the test loss plateaus for an extended period before suddenly improving. We argue that this generalisation jump can be attributed to the slow norm reduction induced by weight decay, as explained by our analysis. We validate this mechanism empirically on several synthetic regression tasks.
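
The two-phase picture can be seen on a toy problem. The sketch below Euler-integrates $\dot{w} = -\nabla F(w) - \lambda w$ for the hypothetical loss $F(w) = \tfrac{1}{2}(w_1 + w_2 - 1)^2$, whose minimizers form the line $w_1 + w_2 = 1$. The loss, the initial point, the step size, and the value of $\lambda$ are all assumptions chosen for illustration and are not taken from the paper's experiments.

```python
import numpy as np

def simulate(w0, lam, dt, steps):
    """Explicit Euler discretization of gradient flow with weight decay lam."""
    w = np.asarray(w0, dtype=float)
    for _ in range(steps):
        residual = w.sum() - 1.0             # F(w) = 0.5 * (w1 + w2 - 1)^2
        grad = residual * np.ones_like(w)    # gradient of F
        w = w - dt * (grad + lam * w)        # flow of F plus weight decay
    return w

def loss(w):
    return 0.5 * (w.sum() - 1.0) ** 2

# Fast phase: time 5, well before 1/lam = 100. Slow phase: time 1000, well after it.
w_fast = simulate([1.0, -0.5], lam=0.01, dt=0.05, steps=100)
w_slow = simulate([1.0, -0.5], lam=0.01, dt=0.05, steps=20000)
```

After the fast phase the loss is already near zero but the two coordinates are still far apart; only on the $1/\lambda$ time scale does the iterate slide along the line of minimizers toward the balanced, smaller-norm solution.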

Article 1300

Title@2025-05-26 (1): From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data

Title: From Alignment to Advancement: Bootstrapping Audio-Language Alignment with Synthetic Data

Von der Ausrichtung zur Weiterentwicklung: Bootstrapping Audio-Language Alignment mit synthetischen Daten

从对齐到推进: 用合成数据推动音频语言对齐 2505.20166v1

Authors: Chun-Yi Kuan, Hung-yi Lee

Audio-aware large language models (ALLMs) have recently made great strides in understanding and processing audio inputs. These models are typically adapted from text-based large language models (LLMs) through additional training on audio-related tasks. However, this adaptation process presents two major limitations. First, ALLMs often suffer from catastrophic forgetting, where important textual capabilities such as instruction-following are lost after training on audio data. In some cases, models may even hallucinate sounds that are not present in the input audio, raising concerns about their reliability. Second, achieving cross-modal alignment between audio and language typically relies on large collections of task-specific question-answer pairs for instruction tuning, making the process resource-intensive. To address these issues, we leverage the backbone LLMs from ALLMs to synthesize general-purpose caption-style alignment data. We refer to this process as bootstrapping audio-language alignment via synthetic data generation from backbone LLMs (BALSa). Building on BALSa, we introduce LISTEN (Learning to Identify Sounds Through Extended Negative Samples), a contrastive-like training method designed to improve ALLMs’ ability to distinguish between present and absent sounds. We further extend BALSa to multi-audio scenarios, where the model either explains the differences between audio inputs or produces a unified caption that describes them all, thereby enhancing audio-language alignment. Experimental results indicate that our method effectively mitigates audio hallucinations while reliably maintaining strong performance in audio understanding, reasoning, and instruction-following skills. Moreover, incorporating multi-audio training further enhances the model’s comprehension and reasoning capabilities. Overall, BALSa offers an efficient and scalable approach to the development of ALLMs.

Article 1301

Title@2025-05-26 (1): Capability-Based Scaling Laws for LLM Red-Teaming

Title: Capability-Based Scaling Laws for LLM Red-Teaming

Capability-Based Scaling-Gesetze für LLM Red-Teaming

LLM 红色团队合作以能力为基础的增强法律 2505.20162v1

Authors: Alexander Panfilov, Paul Kassianik, Maksym Andriushchenko, Jonas Geiping

As large language models grow in capability and agency, identifying vulnerabilities through red-teaming becomes vital for safe deployment. However, traditional prompt-engineering approaches may prove ineffective once red-teaming turns into a weak-to-strong problem, where target models surpass red-teamers in capabilities. To study this shift, we frame red-teaming through the lens of the capability gap between attacker and target. We evaluate more than 500 attacker-target pairs using LLM-based jailbreak attacks that mimic human red-teamers across diverse families, sizes, and capability levels. Three strong trends emerge: (i) more capable models are better attackers, (ii) attack success drops sharply once the target’s capability exceeds the attacker’s, and (iii) attack success rates correlate with high performance on social science splits of the MMLU-Pro benchmark. From these trends, we derive a jailbreaking scaling law that predicts attack success for a fixed target based on attacker-target capability gap. These findings suggest that fixed-capability attackers (e.g., humans) may become ineffective against future models, increasingly capable open-source models amplify risks for existing systems, and model providers must accurately measure and control models’ persuasive and manipulative abilities to limit their effectiveness as attackers.

Article 1302

Title@2025-05-26 (1): Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning

Title: Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning

Prismatische Synthese: Gradientenbasierte Datendiversifizierung steigert Generalisierung in LLM-Reasoning

理论综合:基于逐步的数据多样化促进LLM理由说明的概括化 2505.20161v1

Authors: Jaehun Jung, Seungju Han, Ximing Lu, Skyler Hallinan, David Acuna, Shrimai Prabhumoye, Mostafa Patwary, Mohammad Shoeybi, Bryan Catanzaro, Yejin Choi

Effective generalization in language models depends critically on the diversity of their training data. Yet existing diversity metrics often fall short of this goal, relying on surface-level heuristics that are decoupled from model behavior. This motivates us to ask: What kind of diversity in training data actually drives generalization in language models – and how can we measure and amplify it? Through large-scale empirical analyses spanning over 300 training runs, carefully controlled for data scale and quality, we show that data diversity can be a strong predictor of generalization in LLM reasoning – as measured by average model performance on unseen out-of-distribution benchmarks. We introduce G-Vendi, a metric that quantifies diversity via the entropy of model-induced gradients. Despite using a small off-the-shelf proxy model for gradients, G-Vendi consistently outperforms alternative measures, achieving strong correlation (Spearman’s $\rho \approx 0.9$) with out-of-distribution (OOD) performance on both natural language inference (NLI) and math reasoning tasks. Building on this insight, we present Prismatic Synthesis, a framework for generating diverse synthetic data by targeting underrepresented regions in gradient space. Experimental results show that Prismatic Synthesis consistently improves model performance as we scale synthetic data – not just on in-distribution tests but across unseen, out-of-distribution benchmarks – significantly outperforming state-of-the-art models that rely on data generators 20 times larger than ours. For example, PrismMath-7B, our model distilled from a 32B LLM, outperforms R1-Distill-Qwen-7B – the same base model trained on proprietary data generated by 671B R1 – on 6 out of 7 challenging benchmarks.
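As a rough illustration of an entropy-based diversity score over gradients, the sketch below computes a Vendi-style score: the exponential of the Shannon entropy of the eigenvalues of a normalized similarity kernel. It is not the authors' implementation. The function name `vendi_score`, the cosine kernel, and the toy gradient matrices are assumptions, and the abstract does not say how G-Vendi builds its gradient features.

```python
import numpy as np

def vendi_score(grads: np.ndarray) -> float:
    """Exponential of the eigenvalue entropy of a cosine-similarity kernel.

    grads: array of shape (n_samples, dim), one (hypothetical) gradient
    vector per training sample. Returns a value between 1 and n_samples.
    """
    # Normalize rows so the kernel has ones on its diagonal.
    unit = grads / np.linalg.norm(grads, axis=1, keepdims=True)
    kernel = unit @ unit.T / len(unit)      # trace equals 1
    eigvals = np.linalg.eigvalsh(kernel)
    eigvals = eigvals[eigvals > 1e-12]      # drop numerical zeros
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))

# Toy check: identical gradients are minimally diverse,
# mutually orthogonal gradients are maximally diverse.
same = np.tile([1.0, 2.0, 3.0, 4.0], (4, 1))
orthogonal = np.eye(4)
print(vendi_score(same), vendi_score(orthogonal))
```

Under these assumptions the score behaves like an effective number of distinct gradient directions, which is one way to read "diversity" in gradient space.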

Article 1303

Title@2025-05-26 (1): Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities

Title: Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities

Gedachte politische Optimierung: Überwindung externer Leitlinien und interner Fähigkeiten

优化政策:将外部指导和内部能力结合起来 2505.15692v2

Authors: Jinyang Wu, Chonghua Liao, Mingkuan Feng, Shuai Zhang, Zhengqi Wen, Pengpeng Shao, Huazhe Xu, Jianhua Tao

Reinforcement learning (RL) has emerged as an effective method for training reasoning models. However, existing RL approaches typically bias the model’s output distribution toward reward-maximizing paths without introducing external knowledge. This limits their exploration capacity and results in a narrower reasoning capability boundary compared to base models. To address this limitation, we propose TAPO (Thought-Augmented Policy Optimization), a novel framework that augments RL by incorporating external high-level guidance (“thought patterns”). By adaptively integrating structured thoughts during training, TAPO effectively balances model-internal exploration and external guidance exploitation. Extensive experiments show that our approach significantly outperforms GRPO by 99% on AIME, 41% on AMC, and 17% on Minerva Math. Notably, these high-level thought patterns, abstracted from only 500 prior samples, generalize effectively across various tasks and models. This highlights TAPO’s potential for broader applications across multiple tasks and domains. Our further analysis reveals that introducing external guidance produces powerful reasoning models with superior explainability of inference behavior and enhanced output readability.

Article 1304

Title@2025-05-26 (1): Polynomial, trigonometric, and tropical activations

Title: Polynomial, trigonometric, and tropical activations

Polynomische, trigonometrische und tropische Aktivierungen

多边、三角和热带活性 2502.01247v2

Authors: Ismail Khalfaoui-Hassani, Stefan Kesselheim

Which functions can be used as activations in deep neural networks? This article explores families of functions based on orthonormal bases, including the Hermite polynomial basis and the Fourier trigonometric basis, as well as a basis resulting from the tropicalization of a polynomial basis. Our study shows that, through simple variance-preserving initialization and without additional clamping mechanisms, these activations can successfully be used to train deep models, such as GPT-2 for next-token prediction on OpenWebText and ConvNeXt for image classification on ImageNet. Our work addresses the issue of exploding and vanishing activations and gradients, particularly prevalent with polynomial activations, and opens the door for improving the efficiency of large-scale learning tasks. Furthermore, our approach provides insight into the structure of neural networks, revealing that networks with polynomial activations can be interpreted as multivariate polynomial mappings. Finally, using Hermite interpolation, we show that our activations can closely approximate classical ones in pre-trained models by matching both the function and its derivative, making them especially useful for fine-tuning tasks. These activations are available in the torchortho library, which can be accessed via: https://github.com/K-H-Ismail/torchortho.
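To make the idea of an activation built on an orthonormal polynomial basis concrete, here is a minimal sketch using NumPy's probabilists' Hermite polynomials. It does not use the torchortho API. The function names, the unit-norm coefficient initialization, and the degree are assumptions chosen for illustration, and the paper's actual variance-preserving scheme may differ.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def hermite_activation(x, coeffs):
    """Evaluate sum_n coeffs[n] * He_n(x) / sqrt(n!).

    He_n are the probabilists' Hermite polynomials; dividing by sqrt(n!)
    makes them orthonormal under the standard normal distribution.
    """
    scale = np.array([1.0 / math.sqrt(math.factorial(n)) for n in range(len(coeffs))])
    return He.hermeval(x, np.asarray(coeffs) * scale)

def second_moment_under_gaussian(coeffs, quad_points=32):
    """E[f(z)^2] for z ~ N(0, 1), computed by Gauss-Hermite quadrature."""
    nodes, weights = He.hermegauss(quad_points)
    values = hermite_activation(nodes, coeffs)
    return float(np.sum(weights * values**2) / math.sqrt(2.0 * math.pi))

# Hypothetical initialization: random coefficients rescaled to unit norm.
rng = np.random.default_rng(0)
coeffs = rng.normal(size=4)
coeffs /= np.linalg.norm(coeffs)
print(second_moment_under_gaussian(coeffs))
```

Because the basis is orthonormal, the second moment of the output under a standard normal input equals the squared norm of the coefficient vector, so unit-norm coefficients keep the activation scale at one regardless of degree.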

Article 1305

Title@2025-05-26 (1): On the (Non) Injectivity of Piecewise Linear Janossy Pooling

Title: On the (Non) Injectivity of Piecewise Linear Janossy Pooling

Auf der (Nicht-)Injektivität der stückweise linearen Janossy-Pooling

在Peaxy Linear Janosy 集合的喷射上, 2505.20150v1

Authors: Ilai Reshef, Nadav Dym

Multiset functions, which are functions that map multisets to vectors, are a fundamental tool in the construction of neural networks for multisets and graphs. To guarantee that the vector representation of the multiset is faithful, it is often desirable to have multiset mappings that are both injective and bi-Lipschitz. Currently, there are several constructions of multiset functions achieving both these guarantees, leading to improved performance in some tasks but often also to higher compute time than standard constructions. Accordingly, it is natural to inquire whether simpler multiset functions achieving the same guarantees are available. In this paper, we make a large step towards giving a negative answer to this question. We consider the family of k-ary Janossy pooling, which includes many of the most popular multiset models, and prove that no piecewise linear Janossy pooling function can be injective. On the positive side, we show that when restricted to multisets without multiplicities, even simple deep-sets models suffice for injectivity and bi-Lipschitzness.
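For readers unfamiliar with the construction, the sketch below shows 2-ary Janossy pooling in its simplest form: a function of pairs summed over all ordered pairs of distinct positions in the multiset. The pair function `relu_gap` and the two example multisets are hypothetical choices. A single scalar-valued function cannot be injective on its own, so the collision here only illustrates what non-injectivity means and is not a reproduction of the paper's proof.

```python
from itertools import permutations

def janossy_pool(multiset, f, k=2):
    """Sum f over all ordered k-tuples of distinct positions in the multiset."""
    return sum(f(*tup) for tup in permutations(multiset, k))

def relu_gap(a, b):
    """A piecewise linear pair function (hypothetical choice)."""
    return max(a - b, 0.0)

# Two different multisets that this pooled function cannot tell apart.
first = [0.0, 1.0, 3.0]
second = [0.0, 2.0, 3.0]
print(janossy_pool(first, relu_gap), janossy_pool(second, relu_gap))
```

The pooled value does not depend on the order in which elements are listed, which is the permutation invariance that makes Janossy pooling a multiset function.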

Article 1306

Title@2025-05-26 (1): SeMe: Training-Free Language Model Merging via Semantic Alignment

Title: SeMe: Training-Free Language Model Merging via Semantic Alignment

SeMe: Training-freies Sprachmodell Zusammenführen über semantische Ausrichtung

SeME:通过语义一致合并的无培训语言模式 2505.20144v1

Authors: Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang

Despite the remarkable capabilities of Language Models (LMs) across diverse tasks, no single model consistently outperforms others, necessitating efficient methods to combine their strengths without expensive retraining. Existing model merging techniques, such as parameter averaging and task-guided fusion, often rely on data-dependent computations or fail to preserve internal knowledge, limiting their robustness and scalability. We introduce SeMe (Semantic-based Merging), a novel, data-free, and training-free approach that leverages latent semantic alignment to merge LMs at a fine-grained, layer-wise level. Unlike prior work, SeMe not only preserves model behaviors but also explicitly stabilizes internal knowledge, addressing a critical gap in LM fusion. Through extensive experiments across diverse architectures and tasks, we demonstrate that SeMe outperforms existing methods in both performance and efficiency while eliminating reliance on external data. Our work establishes a new paradigm for knowledge-aware model merging and provides insights into the semantic structure of LMs, paving the way for more scalable and interpretable model composition.

Article 1307

Title@2025-05-26 (1): Model Stitching by Functional Latent Alignment

Title: Model Stitching by Functional Latent Alignment

Modellstitching durch funktionale Latent Alignment

通过功能性前端对齐进行模型切换 2505.20142v1

Authors: Ioannis Athanasiadis, Anmar Karmush, Michael Felsberg

Evaluating functional similarity involves quantifying the degree to which independently trained neural networks learn functionally similar representations. Reliably inferring the functional similarity of these networks remains an open problem with far-reaching implications for AI. Model stitching has emerged as a promising paradigm, where an optimal affine transformation aligns two models to solve a task, with the stitched model serving as a proxy for functional similarity. In this work, we draw inspiration from the knowledge distillation literature and propose Functional Latent Alignment (FuLA) as a novel optimality condition for model stitching. We revisit previously explored functional similarity testbeds and introduce a new one, based on which FuLA emerges as an overall more reliable method of functional similarity. Specifically, our experiments in (a) adversarial training, (b) shortcut training, and (c) cross-layer stitching reveal that FuLA is less prone to artifacts tied to training on task cues while achieving non-trivial alignments that are missed by stitch-level matching.

Article 1308

Title@2025-05-26 (1): GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models

Title: GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models

GUARD: Rollenspiel zur Generierung von Jailbreakings in natürlicher Sprache zur Prüfung der Einhaltung der Leitlinie für große Sprachmodelle

GUARD: 利用《大语言模式遵守试验准则准则》创造以自然语言破门破门 2402.03299v5

Authors: Haibo Jin, Ruoxi Chen, Peiyan Zhang, Andy Zhou, Yang Zhang, Haohan Wang

The discovery of “jailbreaks” that bypass the safety filters of Large Language Models (LLMs) and elicit harmful responses has encouraged the community to implement safety measures. One major safety measure is to proactively test LLMs with jailbreaks prior to release. Such testing requires a method that can generate jailbreaks at scale and efficiently. In this paper, we follow a novel yet intuitive strategy to generate jailbreaks in the style of human-written ones. We propose a role-playing system that assigns four different roles to the user LLMs to collaborate on new jailbreaks. Furthermore, we collect existing jailbreaks and split them into different independent characteristics using clustering frequency and semantic patterns sentence by sentence. We organize these characteristics into a knowledge graph, making them more accessible and easier to retrieve. Our system of different roles leverages this knowledge graph to generate new jailbreaks, which have proved effective in inducing LLMs to generate unethical or guideline-violating responses. In addition, we pioneer a setting in our system that automatically follows government-issued guidelines to generate jailbreaks that test whether LLMs follow those guidelines. We refer to our system as GUARD (Guideline Upholding through Adaptive Role-play Diagnostics). We have empirically validated the effectiveness of GUARD on three cutting-edge open-sourced LLMs (Vicuna-13B, LongChat-7B, and Llama-2-7B), as well as a widely used commercial LLM (ChatGPT). Moreover, our work extends to the realm of vision language models (MiniGPT-v2 and Gemini Vision Pro), showcasing GUARD’s versatility and contributing valuable insights for the development of safer, more reliable LLM-based applications across diverse modalities.

Article 1309

Title@2025-05-26 (1): Error Optimization: Overcoming Exponential Signal Decay in Deep Predictive Coding Networks

Title: Error Optimization: Overcoming Exponential Signal Decay in Deep Predictive Coding Networks

Fehler-Optimierung: Überwindung exponentieller Signaldekay in tiefen vorausschauenden Codierungsnetzwerken

错误优化 : 克服深预报编码网络中的指数信号衰减 2505.20137v1

Authors: Cédric Goemaere, Gaspard Oliviers, Rafal Bogacz, Thomas Demeester

Predictive Coding (PC) offers a biologically plausible alternative to backpropagation for neural network training, yet struggles with deeper architectures. This paper identifies the root cause: an inherent signal decay problem where gradients attenuate exponentially with depth, becoming computationally negligible due to numerical precision constraints. To address this fundamental limitation, we introduce Error Optimization (EO), a novel reparameterization that preserves PC’s theoretical properties while eliminating signal decay. By optimizing over prediction errors rather than states, EO enables signals to reach all layers simultaneously and without attenuation, converging orders of magnitude faster than standard PC. Experiments across multiple architectures and datasets demonstrate that EO matches backpropagation’s performance even for deeper models where conventional PC struggles. Besides practical improvements, our work provides theoretical insight into PC dynamics and establishes a foundation for scaling biologically-inspired learning to deeper architectures on digital hardware and beyond.

Article 1310

Title@2025-05-26 (1): P$^2$ Law: Scaling Law for Post-Training After Model Pruning

Title: P$^2$ Law: Scaling Law for Post-Training After Model Pruning

P$^2$ Gesetz: Skalierungsgesetz für Post-Training nach Modellprüfung

P$2美元法律:示范 “ 谨慎 “ 后培训后培训后扩大法 2411.10272v3

Authors: Xiaodong Chen, Yuxuan Hu, Xiaokang Zhang, Yanling Wang, Cuiping Li, Hong Chen, Jing Zhang

Pruning has become a widely adopted technique for reducing the hardware requirements of large language models (LLMs). To recover model performance after pruning, post-training is commonly employed to mitigate the resulting performance degradation. While post-training benefits from larger datasets, once the dataset size is already substantial, increasing the training data provides only limited performance gains. To balance post-training cost and model performance, it is necessary to explore the optimal amount of post-training data. Through extensive experiments on the Llama-3 and Qwen-2.5 series models, pruned using various common pruning methods, we uncover the scaling Law for Post-training after model Pruning, referred to as the P$^2$ Law. This law identifies four key factors for predicting the pruned model’s post-training loss: the model size before pruning, the number of post-training tokens, the pruning rate, and the model’s loss before pruning. Moreover, P$^2$ Law can generalize to larger dataset sizes, larger model sizes, and higher pruning rates, offering valuable insights for the post-training of pruned LLMs.

Article 1311

Title@2025-05-26 (1): AweDist: Attention-aware Embedding Distillation for New Input Token Embeddings

Title: AweDist: Attention-aware Embedding Distillation for New Input Token Embeddings

AweDist: Aufmerksamkeitsbewusste Einbettung Destillation für neue Eingabe-Token-Einbettungen

AweDist: 新的输入式嵌入式嵌入器的注意嵌入蒸馏 2505.20133v1

Authors: Konstantin Dobler, Desmond Elliott, Gerard de Melo

Current language models rely on static vocabularies determined at pretraining time, which can lead to decreased performance and increased computational cost for domains underrepresented in the original vocabulary. New tokens can be added to solve this problem, when coupled with a good initialization for their new embeddings. However, existing embedding initialization methods either require expensive further training or pretraining of additional modules. In this paper, we propose AweDist and show that by distilling representations obtained using the original tokenization, we can quickly learn high-quality input embeddings for new tokens. Experimental results with a wide range of open-weight models show that AweDist is able to outperform even strong baselines.

Article 1312

Title@2025-05-26 (1): InfoBridge: Mutual Information estimation via Bridge Matching

Title: InfoBridge: Mutual Information estimation via Bridge Matching

InfoBridge: Gegenseitige Informationsschätzung über Bridge Matching

InfoBridge:通过桥梁匹配进行相互信息估计 2502.01383v2

Authors: Sergei Kholkin, Ivan Butakov, Evgeny Burnaev, Nikita Gushchin, Alexander Korotin

Diffusion bridge models have recently become a powerful tool in the field of generative modeling. In this work, we leverage their power to address another important problem in machine learning and information theory, the estimation of the mutual information (MI) between two random variables. We show that by using the theory of diffusion bridges, one can construct an unbiased estimator for data posing difficulties for conventional MI estimators. We showcase the performance of our estimator on two standard MI estimation benchmarks, i.e., low-dimensional and image-based, and on real-world data, i.e., protein language model embeddings.

Article 1313

Title@2025-05-26 (1): Outcome-based Reinforcement Learning to Predict the Future

Title: Outcome-based Reinforcement Learning to Predict the Future

Ergebnisbasiertes Bewehrungslernen zur Vorhersage der Zukunft

基于成果的强化学习,以预测未来 2505.17989v2

Authors: Benjamin Turtel, Danny Franklin, Kris Skotheim, Luke Hewitt, Philipp Schoenegger

Reinforcement learning with verifiable rewards (RLVR) has boosted math and coding in large language models, yet there has been little effort to extend RLVR into messier, real-world domains like forecasting. One sticking point is that outcome-based reinforcement learning for forecasting must learn from binary, delayed, and noisy rewards, a regime where standard fine-tuning is brittle. We show that outcome-only online RL on a 14B model can match frontier-scale accuracy and surpass it in calibration and hypothetical prediction market betting by adapting two leading algorithms, Group-Relative Policy Optimisation (GRPO) and ReMax, to the forecasting setting. Our adaptations remove per-question variance scaling in GRPO, apply baseline-subtracted advantages in ReMax, hydrate training with 100k temporally consistent synthetic questions, and introduce lightweight guard-rails that penalise gibberish, non-English responses and missing rationales, enabling a single stable pass over 110k events. Scaling ReMax to 110k questions and ensembling seven predictions yields a 14B model that matches frontier baseline o1 on accuracy on our holdout set (Brier = 0.193, p = 0.23) while beating it in calibration (ECE = 0.042, p < 0.001). A simple trading rule turns this calibration edge into $127 of hypothetical profit versus $92 for o1 (p = 0.037). This demonstrates that refined RLVR methods can convert small-scale LLMs into potentially economically valuable forecasting tools, with implications for scaling this to larger models.
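The two headline metrics can be computed in a few lines. The sketch below shows the Brier score and a simple equal-width-bin expected calibration error (ECE) for binary forecasts. The binning scheme, the number of bins, and the example forecasts are assumptions, since the abstract does not state how ECE was binned.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def expected_calibration_error(probs, outcomes, n_bins=10):
    """Weighted mean gap between average forecast and observed frequency per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)   # p == 1.0 goes in the last bin
        bins[index].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        frequency = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(mean_prob - frequency)
    return ece

# Made-up forecasts for five resolved binary questions.
probs = [0.9, 0.8, 0.3, 0.6, 0.1]
outcomes = [1, 1, 0, 1, 0]
print(brier_score(probs, outcomes), expected_calibration_error(probs, outcomes))
```

Lower is better for both metrics. A forecaster that always answers 0.5 scores a Brier of 0.25, which is a useful reference point when reading values such as 0.193.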

Article 1314

Title@2025-05-26 (1): Tensorization is a powerful but underexplored tool for compression and interpretability of neural networks

Title: Tensorization is a powerful but underexplored tool for compression and interpretability of neural networks

Tensorisierung ist ein leistungsfähiges, aber unerforschtes Werkzeug zur Kompression und Interpretationsfähigkeit neuronaler Netzwerke

电温是压缩和解释神经网络的强大但探索不足的工具 2505.20132v1

Authors: Safa Hamreras, Sukhbinder Singh, Román Orús

Tensorizing a neural network involves reshaping some or all of its dense weight matrices into higher-order tensors and approximating them using low-rank tensor network decompositions. This technique has shown promise as a model compression strategy for large-scale neural networks. However, despite encouraging empirical results, tensorized neural networks (TNNs) remain underutilized in mainstream deep learning. In this position paper, we offer a perspective on both the potential and current limitations of TNNs. We argue that TNNs represent a powerful yet underexplored framework for deep learning, one that deserves greater attention from both engineering and theoretical communities. Beyond compression, we highlight the value of TNNs as a flexible class of architectures with distinctive scaling properties and increased interpretability. A central feature of TNNs is the presence of bond indices, which introduce new latent spaces not found in conventional networks. These internal representations may provide deeper insight into the evolution of features across layers, potentially advancing the goals of mechanistic interpretability. We conclude by outlining several key research directions aimed at overcoming the practical barriers to scaling and adopting TNNs in modern deep learning workflows.
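The reshape-then-decompose step described above can be sketched for the smallest non-trivial case: one weight matrix split into two cores joined by a single bond index, obtained with a truncated SVD. This is an illustrative sketch, not a method taken from the paper. The function names, the two-core layout, and the Kronecker-structured test matrix are assumptions.

```python
import numpy as np

def tensorize_two_cores(weight, m1, m2, n1, n2, bond_dim):
    """Factor an (m1*m2, n1*n2) matrix into two cores joined by a bond index.

    Returns core_a of shape (m1, n1, r) and core_b of shape (r, m2, n2),
    where r = min(bond_dim, number of singular values available).
    """
    tensor = weight.reshape(m1, m2, n1, n2)
    # Group (row, column) index pairs per core: (m1, n1) versus (m2, n2).
    regrouped = tensor.transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
    u, s, vt = np.linalg.svd(regrouped, full_matrices=False)
    r = min(bond_dim, len(s))
    core_a = (u[:, :r] * s[:r]).reshape(m1, n1, r)
    core_b = vt[:r].reshape(r, m2, n2)
    return core_a, core_b

def reconstruct(core_a, core_b):
    """Contract the bond index and restore the original matrix layout."""
    m1, n1, _ = core_a.shape
    _, m2, n2 = core_b.shape
    tensor = np.einsum("air,rbj->abij", core_a, core_b)   # (m1, m2, n1, n2)
    return tensor.reshape(m1 * m2, n1 * n2)

# A 32 x 32 matrix with exact Kronecker structure needs only bond dimension 1.
rng = np.random.default_rng(0)
weight = np.kron(rng.normal(size=(4, 4)), rng.normal(size=(8, 8)))
core_a, core_b = tensorize_two_cores(weight, 4, 8, 4, 8, bond_dim=1)
print(weight.size, core_a.size + core_b.size)
print(np.allclose(reconstruct(core_a, core_b), weight))
```

The bond dimension controls the trade-off: a larger value stores more parameters and approximates a generic matrix more closely, while the bond index itself is the extra latent space that the abstract points to for interpretability.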

Article 1315

Title@2025-05-26 (1): MolEditRL: Structure-Preserving Molecular Editing via Discrete Diffusion and Reinforcement Learning

Title: MolEditRL: Structure-Preserving Molecular Editing via Discrete Diffusion and Reinforcement Learning

MolEditRL: Strukturschonende molekulare Bearbeitung durch diskretes Diffusions- und Verstärkungslernen

MoldEditRL:通过分解分解和扩散及强化学习保持结构的分子编辑 2505.20131v1

Authors: Yuanxin Zhuang, Dazhong Shen, Ying Sun

Molecular editing aims to modify a given molecule to optimize desired chemical properties while preserving structural similarity. However, current approaches typically rely on string-based or continuous representations, which fail to adequately capture the discrete, graph-structured nature of molecules, resulting in limited structural fidelity and poor controllability. In this paper, we propose MolEditRL, a molecular editing framework that explicitly integrates structural constraints with precise property optimization. Specifically, MolEditRL consists of two stages: (1) a discrete graph diffusion model pretrained to reconstruct target molecules conditioned on source structures and natural language instructions; (2) an editing-aware reinforcement learning fine-tuning stage that further enhances property alignment and structural preservation by explicitly optimizing editing decisions under graph constraints. For comprehensive evaluation, we construct MolEdit-Instruct, the largest and most property-rich molecular editing dataset, comprising 3 million diverse examples spanning single- and multi-property tasks across 10 chemical attributes. Experimental results demonstrate that MolEditRL significantly outperforms state-of-the-art methods in both property optimization accuracy and structural fidelity, achieving a 74% improvement in editing success rate while using 98% fewer parameters.

Article 1316

Title@2025-05-26 (1): Balancing Interference and Correlation in Spatial Experimental Designs: A Causal Graph Cut Approach

Title: Balancing Interference and Correlation in Spatial Experimental Designs: A Causal Graph Cut Approach

Balance zwischen Interferenz und Korrelation in räumlichen Experimentaldesigns: Ein ursächlicher Graphenschnitt-Ansatz

空间实验设计中平衡干扰和关联:因果图表切割法 2505.20130v1

Authors: Zhu Jin, Li Jingyi, Zhou Hongyi, Lin Yinan, Lin Zhenhua, Shi Chengchun

This paper focuses on the design of spatial experiments to optimize the amount of information derived from the experimental data and enhance the accuracy of the resulting causal effect estimator. We propose a surrogate function for the mean squared error (MSE) of the estimator, which facilitates the use of classical graph cut algorithms to learn the optimal design. Our proposal offers three key advances: (1) it accommodates moderate to large spatial interference effects; (2) it adapts to different spatial covariance functions; (3) it is computationally efficient. Theoretical results and numerical experiments based on synthetic environments and a dispatch simulator that models a city-scale ridesharing market further validate the effectiveness of our design. A Python implementation of our method is available at https://github.com/Mamba413/CausalGraphCut.

Article 1317

Title@2025-05-26 (1): Uncertainty Quantification for LLM-Based Survey Simulations

Title: Uncertainty Quantification for LLM-Based Survey Simulations

Ungewissheitsquantifizierung für LLM-basierte Umfragesimulationen

以LLM为基础的LLM调查模拟器的不确定性定量 2502.17773v3

Authors: Chengpiao Huang, Yuhang Wu, Kaizheng Wang

We investigate the use of large language models (LLMs) to simulate human responses to survey questions, and perform uncertainty quantification to gain reliable insights. Our approach converts imperfect LLM-simulated responses into confidence sets for population parameters of human responses, addressing the distribution shift between the simulated and real populations. A key innovation lies in determining the optimal number of simulated responses: too many produce overly narrow confidence sets with poor coverage, while too few yield excessively loose estimates. To resolve this, our method adaptively selects the simulation sample size, ensuring valid average-case coverage guarantees. It is broadly applicable to any LLM, irrespective of its fidelity, and any procedure for constructing confidence sets. Additionally, the selected sample size quantifies the degree of misalignment between the LLM and the target human population. We illustrate our method on real datasets and LLMs.

Article 1318

Title@2025-05-26 (1): From Tables to Time: How TabPFN-v2 Outperforms Specialized Time Series Forecasting Models

Title: From Tables to Time: How TabPFN-v2 Outperforms Specialized Time Series Forecasting Models

Von Tabellen zur Zeit: Wie TabPFN-v2 Modelle der speziellen Zeitreihenvorhersage übertrifft

从表格到时间: TabPFN-v2 如何表现超过专门时间序列预测模型 2501.02945v3

Authors: Shi Bin Hoo, Samuel Müller, David Salinas, Frank Hutter

Foundation models have become increasingly popular for forecasting due to their ability to provide predictions without requiring a lot of training data. In this work, we demonstrate how TabPFN-v2, a general tabular foundation model, can be effectively applied to time series forecasting. We introduce TabPFN-TS, a simple method that combines TabPFN-v2 with lightweight feature engineering to enable both point and probabilistic forecasting. Despite its simplicity and compact size (11M parameters), TabPFN-TS achieves top rank on the public GIFT-Eval leaderboard in both forecasting tasks. Through ablation studies, we investigate factors contributing to this surprising effectiveness, especially considering TabPFN-v2 was pretrained solely on synthetic tabular data with no exposure to time series. Our results highlight the potential of tabular foundation models like TabPFN-v2 as a valuable new approach for time series forecasting. Our implementation is available at https://github.com/PriorLabs/tabpfn-time-series.

Article 1319

Title@2025-05-26 (1): Understanding Generalization in Diffusion Models via Probability Flow Distance

Title: Understanding Generalization in Diffusion Models via Probability Flow Distance

Verallgemeinerung in Diffusionsmodellen über Wahrscheinlichkeitsflussentfernung verstehen

通过概率流动远距离理解扩散模型的通用化 2505.20123v1

Authors: Huijie Zhang, Zijian Huang, Siyi Chen, Jinfan Zhou, Zekai Zhang, Peng Wang, Qing Qu

Diffusion models have emerged as a powerful class of generative models, capable of producing high-quality samples that generalize beyond the training data. However, evaluating this generalization remains challenging: theoretical metrics are often impractical for high-dimensional data, while no practical metrics rigorously measure generalization. In this work, we bridge this gap by introducing probability flow distance ($\texttt{PFD}$), a theoretically grounded and computationally efficient metric to measure distributional generalization. Specifically, $\texttt{PFD}$ quantifies the distance between distributions by comparing their noise-to-data mappings induced by the probability flow ODE. Moreover, by using $\texttt{PFD}$ under a teacher-student evaluation protocol, we empirically uncover several key generalization behaviors in diffusion models, including: (1) scaling behavior from memorization to generalization, (2) early learning and double descent training dynamics, and (3) bias-variance decomposition. Beyond these insights, our work lays a foundation for future empirical and theoretical studies on generalization in diffusion models.

Article 1320

Title@2025-05-26 (1): Likelihood-Ratio Regularized Quantile Regression: Adapting Conformal Prediction to High-Dimensional Covariate Shifts

Title: Likelihood-Ratio Regularized Quantile Regression: Adapting Conformal Prediction to High-Dimensional Covariate Shifts

Likelihood-Ratio Regularized Quantile Regression: Anpassung der konformen Vorhersage an hochdimensionale Kovariate Verschiebungen

常规量化递减:调整对高多元共变变化的正规预测 2502.13030v2

Authors: Sunay Joshi, Shayan Kiyani, George Pappas, Edgar Dobriban, Hamed Hassani

We consider the problem of conformal prediction under covariate shift. Given labeled data from a source domain and unlabeled data from a covariate shifted target domain, we seek to construct prediction sets with valid marginal coverage in the target domain. Most existing methods require estimating the unknown likelihood ratio function, which can be prohibitive for high-dimensional data such as images. To address this challenge, we introduce the likelihood ratio regularized quantile regression (LR-QR) algorithm, which combines the pinball loss with a novel choice of regularization in order to construct a threshold function without directly estimating the unknown likelihood ratio. We show that the LR-QR method has coverage at the desired level in the target domain, up to a small error term that we can control. Our proofs draw on a novel analysis of coverage via stability bounds from learning theory. Our experiments demonstrate that the LR-QR algorithm outperforms existing methods on high-dimensional prediction tasks, including a regression task for the Communities and Crime dataset, an image classification task from the WILDS repository, and an LLM question-answering task on the MMLU benchmark.
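The pinball loss mentioned above is the standard quantile-regression objective, and a short sketch makes its asymmetry visible. This shows only the loss itself. The regularizer that defines LR-QR is not described in the abstract and is not reproduced here, and the example scores and the level 0.85 are made up for illustration.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss at level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by 1 - tau.
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)

# The constant that minimizes the loss over a sample is an empirical tau-quantile.
scores = [0.1, 0.4, 0.5, 0.9, 1.3, 2.0, 2.2, 3.1, 4.0, 7.5]
best = min(sorted(scores), key=lambda q: pinball_loss(scores, [q] * len(scores), tau=0.85))
print(best)
```

In a conformal setting the level would typically be tied to the target coverage, so that the learned threshold sits near the corresponding quantile of the conformity scores.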

Article 1321

Title@2025-05-26 (1): Algorithmic Control Improves Residential Building Energy and EV Management when PV Capacity is High but Battery Capacity is Low

Title: Algorithmic Control Improves Residential Building Energy and EV Management when PV Capacity is High but Battery Capacity is Low

Algorithmische Steuerung verbessert Wohngebäude Energie-und EV-Management, wenn PV-Kapazität ist hoch, aber Batterie-Kapazität ist gering

当光电池容量高但电池容量低时,控制电量控制改进住宅建筑的能源和EV管理,改善住宅建筑的能源和EV管理 2505.20377v1

Authors: Lennart Ullner, Alona Zharova, Felix Creutzig

Efficient energy management in prosumer households is key to alleviating grid stress in an energy transition marked by electric vehicles (EV), renewable energies and battery storage. However, it is unclear how households optimize prosumer EV charging. Here we study real-world data from 90 households on fixed-rate electricity tariffs in German-speaking countries to investigate the potential of Deep Reinforcement Learning (DRL) and other control approaches (Rule-Based, Model Predictive Control) to manage the dynamic and uncertain environment of Home Energy Management (HEM) and optimize household charging patterns. The DRL agent efficiently aligns charging of EV and battery storage with photovoltaic (PV) surplus. We find that frequent EV charging transactions, early EV connections and PV surplus increase optimization potential. A detailed analysis of nine households (1 hour resolution, 1 year) demonstrates that high battery capacity facilitates self optimization; in this case further algorithmic control shows little value. In cases with relatively low battery capacity, algorithmic control with DRL improves energy management and cost savings by a relevant margin. This result is further corroborated by our simulation of a synthetic household. We conclude that prosumer households with optimization potential would profit from DRL, thus benefiting also the full electricity system and its decarbonization.

Article 1322

Title@2025-05-26 (1): Generative diffusion for perceptron problems: statistical physics analysis and efficient algorithms

Title: Generative diffusion for perceptron problems: statistical physics analysis and efficient algorithms

Generative Diffusion für Perceptronprobleme: statistische Physikanalyse und effiziente Algorithmen

生成感官问题扩散:统计物理分析和有效算法 2502.16292v2

Authors: Elizaveta Demyanenko, Davide Straziota, Carlo Baldassi, Carlo Lucibello

We consider random instances of non-convex perceptron problems in the high-dimensional limit of a large number of examples $M$ and weights $N$, with finite load $\alpha = M/N$. We develop a formalism based on replica theory to predict the fundamental limits of efficiently sampling the solution space using generative diffusion algorithms, conjectured to be saturated when the score function is provided by Approximate Message Passing. For the spherical perceptron with negative margin $\kappa$, we find that the uniform distribution over solutions can be efficiently sampled in most of the Replica Symmetric region of the $\alpha-\kappa$ plane. In contrast, for binary weights, sampling from the uniform distribution remains intractable. A theoretical analysis of this obstruction leads us to identify a potential $U(s) = -\log(s)$, under which the corresponding tilted distribution becomes efficiently samplable via diffusion. Moreover, we show numerically that an annealing procedure over the shape of this potential yields a fast and robust Markov Chain Monte Carlo algorithm for sampling the solution space of the binary perceptron.

Article 1323

Title@2025-05-26 (1): Proxy-Free GFlowNet

Title: Proxy-Free GFlowNet

Proxy-freies GFlowNet

无代理的GFlowNet 2505.20110v1

Authors: Ruishuo Chen, Xun Wang, Rui Hu, Zhuoran Li, Longbo Huang

Generative Flow Networks (GFlowNets) are a promising class of generative models designed to sample diverse, high-reward structures by modeling distributions over compositional objects. In many real-world applications, obtaining the reward function for such objects is expensive, time-consuming, or requires human input, making it necessary to train GFlowNets from historical datasets. Most existing methods adopt a model-based approach, learning a proxy model from the dataset to approximate the reward function. However, this strategy inherently ties the quality of the learned policy to the accuracy of the proxy, introducing additional complexity and uncertainty into the training process. To overcome these limitations, we propose Trajectory-Distilled GFlowNet (TD-GFN), a proxy-free training framework that eliminates the need for out-of-dataset reward queries. Our method is motivated by the key observation that different edges in the associated directed acyclic graph (DAG) contribute unequally to effective policy learning. TD-GFN leverages inverse reinforcement learning to estimate edge-level rewards from the offline dataset, which are then used to prune the DAG and guide backward trajectory sampling during training. This approach directs the policy toward high-reward regions while reducing the complexity of model fitting. Empirical results across multiple tasks show that TD-GFN trains both efficiently and reliably, significantly outperforming existing baselines in convergence speed and sample quality.

Article 1324

Title@2025-05-26 (1): Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning

Title: Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning

Verfeinerung von Text-zu-Multiview-Diffusion durch Verstärkungslernen

通过强化学习改进微小的中文本到多视图传播 2505.20107v1

Authors: Ziyi Zhang, Li Shen, Deheng Ye, Yong Luo, Huangxuan Zhao, Lefei Zhang

Text-to-multiview (T2MV) generation, which produces coherent multiview images from a single text prompt, remains computationally intensive, while accelerated T2MV methods using few-step diffusion models often sacrifice image fidelity and view consistency. To address this, we propose a novel reinforcement learning (RL) finetuning framework tailored for few-step T2MV diffusion models to jointly optimize per-view fidelity and cross-view consistency. Specifically, we first reformulate T2MV denoising across all views as a single unified Markov decision process, enabling multiview-aware policy optimization driven by a joint-view reward objective. Next, we introduce ZMV-Sampling, a test-time T2MV sampling technique that adds an inversion-denoising pass to reinforce both viewpoint and text conditioning, resulting in improved T2MV generation at the cost of inference time. To internalize its performance gains into the base sampling policy, we develop MV-ZigAL, a novel policy optimization strategy that uses reward advantages of ZMV-Sampling over standard sampling as learning signals for policy updates. Finally, noting that the joint-view reward objective under-optimizes per-view fidelity but naively optimizing single-view metrics neglects cross-view alignment, we reframe RL finetuning for T2MV diffusion models as a constrained optimization problem that maximizes per-view fidelity subject to an explicit joint-view constraint, thereby enabling more efficient and balanced policy updates. By integrating this constrained optimization paradigm with MV-ZigAL, we establish our complete RL finetuning framework, referred to as MVC-ZigAL, which effectively refines the few-step T2MV diffusion baseline in both fidelity and consistency while preserving its few-step efficiency.

Article 1325

Title@2025-05-26 (1): Preference-Based Gradient Estimation for ML-Guided Approximate Combinatorial Optimization

Title: Preference-Based Gradient Estimation for ML-Guided Approximate Combinatorial Optimization

Präferenzbasierte Gradientenschätzung für ML-geführte annähernde Kombinator-Optimierung

ML- Guided 近似组合优化的基于优惠的渐进式测算 2502.19377v2

Authors: Arman Mielke, Uwe Bauknecht, Thilo Strauss, Mathias Niepert

Combinatorial optimization (CO) problems arise across a broad spectrum of domains, including medicine, logistics, and manufacturing. While exact solutions are often computationally infeasible, many practical applications require high-quality solutions within a given time budget. To address this, we propose a learning-based approach that enhances existing non-learned approximation algorithms for CO. Specifically, we parameterize these approximation algorithms and train graph neural networks (GNNs) to predict parameter values that yield near-optimal solutions. Our method is trained end-to-end in a self-supervised fashion, using a novel gradient estimation scheme that treats the approximation algorithm as a black box. This approach combines the strengths of learning and traditional algorithms: the GNN learns from data to guide the algorithm toward better solutions, while the approximation algorithm ensures feasibility. We validate our method on two well-known combinatorial optimization problems: the travelling salesman problem (TSP) and the minimum k-cut problem. Our results demonstrate that the proposed approach is competitive with state-of-the-art learned CO solvers.

Article 1326

Title@2025-05-26 (1): Spurious Privacy Leakage in Neural Networks

Title: Spurious Privacy Leakage in Neural Networks

Spurious Privacy Leakage in neuralen Netzwerken

神经网络中的净隐私渗漏 2505.20095v1

Authors: Chenxiang Zhang, Jun Pang, Sjouke Mauw

Neural networks are vulnerable to privacy attacks aimed at stealing sensitive data. The risks can be amplified in a real-world scenario, particularly when models are trained on limited and biased data. In this work, we investigate the impact of spurious correlation bias on privacy vulnerability. We introduce spurious privacy leakage, a phenomenon where spurious groups are significantly more vulnerable to privacy attacks than non-spurious groups. We further show that group privacy disparity increases in tasks with simpler objectives (e.g. fewer classes) due to the persistence of spurious features. Surprisingly, we find that reducing spurious correlation using spurious-robust methods does not mitigate spurious privacy leakage. This leads us to introduce a perspective on privacy disparity based on memorization, where mitigating spurious correlation does not mitigate the memorization of spurious data and therefore does not improve the privacy level either. Lastly, we compare the privacy of different model architectures trained with spurious data, demonstrating that, contrary to prior works, architectural choice can affect privacy outcomes.

Article 1327

Title@2025-05-26 (1): A fast sound power prediction tool for genset noise using machine learning

Title: A fast sound power prediction tool for genset noise using machine learning

Ein schnelles Sound-Power-Prognose-Tool für Genset-Rausch mit maschinellem Lernen

利用机器学习来快速可靠电源预测工具,用于使用机器学习的genseet噪音 2505.20079v1

Authors: Saurabh Pargal, Abhijit A. Sane

This paper investigates the application of three machine learning regression algorithms, Kernel Ridge Regression (KRR), Huber Regressor (HR), and Gaussian Process Regression (GPR), for predicting sound power levels of gensets, offering significant value for marketing and sales teams during the early bidding process. When engine sizes and genset enclosure dimensions are tentative, and measured noise data is unavailable, these algorithms enable reliable noise level estimation for unbuilt gensets. The study utilizes high-fidelity datasets from over 100 experiments conducted at Cummins Acoustics Technology Center (ATC) in a hemi-anechoic chamber, adhering to ISO 3744 standards. By using readily available information from the bidding and initial design stages, KRR predicts sound power to within 5 dBA on average. While HR and GPR show slightly higher prediction errors, all models effectively capture the overall noise trends across various genset configurations. These findings present a promising method for early-stage noise estimation in genset design.

Article 1328

Title@2025-05-26 (1): Grokking ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior

Title: Grokking ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior

Grokking ExPLAIND: Vereinheitlichung von Modell, Daten und Trainingszuweisung zum Studieren von Modellverhalten

Grokking ExPLAIND: 用于研究模型行为的统一模型、数据和培训归属 2505.20076v1

Authors: Florian Eichin, Yupei Du, Philipp Mondorf, Barbara Plank, Michael A. Hedderich

Post-hoc interpretability methods typically attribute a model’s behavior to its components, data, or training trajectory in isolation. This leads to explanations that lack a unified view and may miss key interactions. While combining existing methods or applying them at different training stages offers broader insights, these approaches usually lack theoretical support. In this work, we present ExPLAIND, a unified framework that integrates all three perspectives. First, we generalize recent work on gradient path kernels, which reformulate models trained by gradient descent as a kernel machine, to more realistic training settings. Empirically, we find that both a CNN and a Transformer model are replicated accurately by this reformulation. Second, we derive novel parameter- and step-wise influence scores from the kernel feature maps. We show their effectiveness in parameter pruning that is comparable to existing methods, reinforcing their value for model component attribution. Finally, jointly interpreting model components and data over the training process, we leverage ExPLAIND to analyze a Transformer that exhibits Grokking. Among other things, our findings support previously proposed stages of Grokking, while refining the final phase as one of alignment of input embeddings and final layers around a representation pipeline learned after the memorization phase. Overall, ExPLAIND provides a theoretically grounded, unified framework to interpret model behavior and training dynamics.

Article 1329

Title@2025-05-26 (1): An Out-Of-Distribution Membership Inference Attack Approach for Cross-Domain Graph Attacks

Title: An Out-Of-Distribution Membership Inference Attack Approach for Cross-Domain Graph Attacks

Ein Out-Of-Distribution-Mitgliedschaft Inferenz Angriff Ansatz für Cross-Domain Graph Attacks

跨领域石块袭击的批外分配成员推推攻击方法 2505.20074v1

Authors: Jinyan Wang, Liu Yang, Yuecen Wei, Jiaxuan Si, Chenhao Guo, Qingyun Sun, Xianxian Li, Xingcheng Fu

Graph Neural Network-based methods face privacy leakage risks due to the introduction of topological structures about the targets, which allows attackers to bypass the target’s prior knowledge of the sensitive attributes and realize membership inference attacks (MIA) by observing and analyzing the topology distribution. As privacy concerns grow, the assumption of MIA, which presumes that attackers can obtain an auxiliary dataset with the same distribution, is increasingly deviating from reality. In this paper, we categorize the distribution diversity issue in real-world MIA scenarios as an Out-Of-Distribution (OOD) problem, and propose a novel Graph OOD Membership Inference Attack (GOOD-MIA) to achieve cross-domain graph attacks. Specifically, we construct shadow subgraphs with distributions from different domains to model the diversity of real-world data. We then explore the stable node representations that remain unchanged under external influences and consider eliminating redundant information from confounding environments and extracting task-relevant key information to more clearly distinguish between the characteristics of training data and unseen data. This OOD-based design makes cross-domain graph attacks possible. Finally, we perform risk extrapolation to optimize the attack’s domain adaptability during attack inference to generalize the attack to other domains. Experimental results demonstrate that GOOD-MIA achieves superior attack performance in datasets designed for multiple domains.

Article 1330

Title@2025-05-26 (1): SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety

Title: SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety

SafeDPO: Ein einfacher Ansatz zur direkten Preference-Optimierung mit erhöhter Sicherheit

SafeDPO: 以强化安全方式直接优化优惠的简单办法 2505.20065v1

Authors: Geon-Hyeong Kim, Youngsoo Jang, Yu Jin Kim, Byoungjip Kim, Honglak Lee, Kyunghoon Bae, Moontae Lee

As Large Language Models (LLMs) continue to advance and find applications across a growing number of fields, ensuring the safety of LLMs has become increasingly critical. To address safety concerns, recent studies have proposed integrating safety constraints into Reinforcement Learning from Human Feedback (RLHF). However, these approaches tend to be complex, as they encompass complicated procedures in RLHF along with additional steps required by the safety constraints. Inspired by Direct Preference Optimization (DPO), we introduce a new algorithm called SafeDPO, which is designed to directly optimize the safety alignment objective in a single stage of policy learning, without requiring relaxation. SafeDPO introduces only one additional hyperparameter to further enhance safety and requires only minor modifications to standard DPO. As a result, it eliminates the need to fit separate reward and cost models or to sample from the language model during fine-tuning, while still enhancing the safety of LLMs. Finally, we demonstrate that SafeDPO achieves competitive performance compared to state-of-the-art safety alignment algorithms, both in terms of aligning with human preferences and improving safety.
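
For orientation, the standard DPO objective that SafeDPO is said to modify can be sketched per preference pair as follows. This is only the well-known baseline loss; SafeDPO's safety-specific changes and its extra hyperparameter are not reproduced. The function names and the example log-probabilities are assumptions made for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one pair: -log sigmoid(beta * margin)."""
    # Margin: how much more the policy prefers the chosen response than the
    # reference model does, measured in log-probability.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log sigmoid(z) equals softplus(-z).
    return softplus(-beta * margin)


# Hypothetical sequence log-probabilities for a single preference pair.
loss_at_reference = dpo_loss(-5.0, -7.0, -5.0, -7.0)
loss_after_update = dpo_loss(-4.0, -8.0, -5.0, -7.0)
```

When the policy equals the reference model, the margin is zero and the loss is log 2; moving probability mass toward the chosen response lowers it.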

Article 1331

Title@2025-05-26 (1): SAEs Are Good for Steering – If You Select the Right Features

Title: SAEs Are Good for Steering – If You Select the Right Features

SAEs sind gut für das Lenken – wenn Sie die richtigen Funktionen auswählen

SAEs 有利于指导 – – 如果您选择了正确的特性 2505.20063v1

Authors: Dana Arad, Aaron Mueller, Yonatan Belinkov

Sparse Autoencoders (SAEs) have been proposed as an unsupervised approach to learn a decomposition of a model’s latent space. This enables useful applications such as steering - influencing the output of a model towards a desired concept - without requiring labeled data. Current methods identify SAE features to steer by analyzing the input tokens that activate them. However, recent work has highlighted that activations alone do not fully describe the effect of a feature on the model’s output. In this work, we draw a distinction between two types of features: input features, which mainly capture patterns in the model’s input, and output features, which have a human-understandable effect on the model’s output. We propose input and output scores to characterize and locate these types of features, and show that high values for both scores rarely co-occur in the same features. These findings have practical implications: after filtering out features with low output scores, we obtain 2-3x improvements when steering with SAEs, making them competitive with supervised methods.

Article 1332

Title@2025-05-26 (1): Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting

Title: Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting

Time-VLM: Erforschung multimodaler Vision-Sprachenmodelle für Augmented Time Series Forecasting

时间-VLM:探索扩大时间序列预测的多模式愿景-语言模型 2502.04395v2

Authors: Siru Zhong, Weilin Ruan, Ming Jin, Huan Li, Qingsong Wen, Yuxuan Liang

Recent advancements in time series forecasting have explored augmenting models with text or vision modalities to improve accuracy. While text provides contextual understanding, it often lacks fine-grained temporal details. Conversely, vision captures intricate temporal patterns but lacks semantic context, limiting the complementary potential of these modalities. To address this, we propose Time-VLM, a novel multimodal framework that leverages pre-trained Vision-Language Models (VLMs) to bridge temporal, visual, and textual modalities for enhanced forecasting. Our framework comprises three key components: (1) a Retrieval-Augmented Learner, which extracts enriched temporal features through memory bank interactions; (2) a Vision-Augmented Learner, which encodes time series as informative images; and (3) a Text-Augmented Learner, which generates contextual textual descriptions. These components collaborate with frozen pre-trained VLMs to produce multimodal embeddings, which are then fused with temporal features for final prediction. Extensive experiments demonstrate that Time-VLM achieves superior performance, particularly in few-shot and zero-shot scenarios, thereby establishing a new direction for multimodal time series forecasting. Code is available at https://github.com/CityMind-Lab/ICML25-TimeVLM.

Article 1333

Title@2025-05-26 (1): Sable: a Performant, Efficient and Scalable Sequence Model for MARL

Title: Sable: a Performant, Efficient and Scalable Sequence Model for MARL

Sable: ein leistungsfähiges, effizientes und skalierbares Sequenzmodell für MARL

电缆:MARL的性能、高效和可缩放序列模型 2410.01706v5

Authors: Omayma Mahjoub, Sasha Abramowitz, Ruan de Kock, Wiem Khlifi, Simon du Toit, Jemma Daniel, Louay Ben Nessir, Louise Beyers, Claude Formanek, Liam Clark, Arnu Pretorius

As multi-agent reinforcement learning (MARL) progresses towards solving larger and more complex problems, it becomes increasingly important that algorithms exhibit the key properties of (1) strong performance, (2) memory efficiency, and (3) scalability. In this work, we introduce Sable, a performant, memory-efficient, and scalable sequence modeling approach to MARL. Sable works by adapting the retention mechanism in Retentive Networks (Sun et al., 2023) to achieve computationally efficient processing of multi-agent observations with long context memory for temporal reasoning. Through extensive evaluations across six diverse environments, we demonstrate how Sable is able to significantly outperform existing state-of-the-art methods in a large number of diverse tasks (34 out of 45 tested). Furthermore, Sable maintains performance as we scale the number of agents, handling environments with more than a thousand agents while exhibiting a linear increase in memory usage. Finally, we conduct ablation studies to isolate the source of Sable’s performance gains and confirm its efficient computational memory usage.
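
The retention mechanism of Retentive Networks that the abstract refers to has a recurrent form and an equivalent parallel form. The sketch below is a simplified single-head version without the normalization and positional rotation used in practice. It is not Sable's multi-agent adaptation, and the function names and toy inputs are assumptions.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def retention_recurrent(queries, keys, values, gamma):
    """Recurrent form: S_n = gamma * S_{n-1} + k_n^T v_n, o_n = q_n S_n."""
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # memory does not grow with sequence length
    outputs = []
    for q, k, v in zip(queries, keys, values):
        state = [[gamma * state[i][j] + k[i] * v[j] for j in range(d_v)]
                 for i in range(d_k)]
        outputs.append([sum(q[i] * state[i][j] for i in range(d_k))
                        for j in range(d_v)])
    return outputs


def retention_parallel(queries, keys, values, gamma):
    """Parallel form: o_n = sum over m <= n of gamma^(n-m) * (q_n . k_m) * v_m."""
    d_v = len(values[0])
    outputs = []
    for n, q in enumerate(queries):
        out = [0.0] * d_v
        for m in range(n + 1):
            weight = gamma ** (n - m) * dot(q, keys[m])
            for j in range(d_v):
                out[j] += weight * values[m][j]
        outputs.append(out)
    return outputs


# Toy inputs: three time steps, key dimension 2, value dimension 2.
qs = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
ks = [[1.0, 2.0], [0.0, 1.0], [1.0, 1.0]]
vs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
recurrent_out = retention_recurrent(qs, ks, vs, gamma=0.9)
parallel_out = retention_parallel(qs, ks, vs, gamma=0.9)
```

The recurrent form keeps a fixed-size state per step, which is the property that makes long-context processing memory efficient; the parallel form is the one typically used during training.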

Article 1334

Title@2025-05-26 (1): Ankh3: Multi-Task Pretraining with Sequence Denoising and Completion Enhances Protein Representations

Title: Ankh3: Multi-Task Pretraining with Sequence Denoising and Completion Enhances Protein Representations

Ankh3: Multi-Task Pretraining mit Sequenz Denoisieren und Vollendung verbessert Proteindarstellungen

Ankh3: 具有序列取消和完成的多任务预先培训,加强蛋白质代表制 2505.20052v1

Authors: Hazem Alsamkary, Mohamed Elshaffei, Mohamed Elkerdawy, Ahmed Elnaggar

Protein language models (PLMs) have emerged as powerful tools to detect complex patterns of protein sequences. However, the capability of PLMs to fully capture information on protein sequences might be limited by focusing on single pre-training tasks. Although adding data modalities or supervised objectives can improve the performance of PLMs, pre-training often remains focused on denoising corrupted sequences. To push the boundaries of PLMs, our research investigated a multi-task pre-training strategy. We developed Ankh3, a model jointly optimized on two objectives: masked language modeling with multiple masking probabilities and protein sequence completion relying only on protein sequences as input. This multi-task pre-training demonstrated that PLMs can learn richer and more generalizable representations solely from protein sequences. The results demonstrated improved performance in downstream tasks, such as secondary structure prediction, fluorescence, GB1 fitness, and contact prediction. The integration of multiple tasks gave the model a more comprehensive understanding of protein properties, leading to more robust and accurate predictions.

Article 1335

Title@2025-05-26 (1): Catoni-Style Change Point Detection for Regret Minimization in Non-Stationary Heavy-Tailed Bandits

Title: Catoni-Style Change Point Detection for Regret Minimization in Non-Stationary Heavy-Tailed Bandits

Catoni-Style Change Point Detection für Reue Minimierung in nicht-stationären schwer-gefährdeten Banditen

用于在非连续重型重航匪徒中最遗憾最小化的卡特托尼- 轮式变速点探测 2505.20051v1

Authors: Gianmarco Genalti, Sujay Bhatt, Nicola Gatti, Alberto Maria Metelli

Regret minimization in stochastic non-stationary bandits has gained popularity over the last decade, as it can model a broad class of real-world problems, from advertising to recommendation systems. Existing literature relies on various assumptions about the reward-generating process, such as Bernoulli or subgaussian rewards. However, in settings such as finance and telecommunications, heavy-tailed distributions naturally arise. In this work, we tackle the heavy-tailed piecewise-stationary bandit problem. Heavy-tailed bandits, introduced by Bubeck et al. (2013), operate on the minimal assumption that the finite absolute centered moments of maximum order $1+\epsilon$ are uniformly bounded by a constant $v<+\infty$, for some $\epsilon \in (0,1]$. We focus on the most popular non-stationary bandit setting, i.e., the piecewise-stationary setting, in which the mean of reward-generating distributions may change at unknown time steps. We provide a novel Catoni-style change-point detection strategy tailored for heavy-tailed distributions that relies on recent advancements in the theory of sequential estimation, which is of independent interest. We introduce Robust-CPD-UCB, which combines this change-point detection strategy with optimistic algorithms for bandits, providing its regret upper bound and an impossibility result on the minimum attainable regret for any policy. Finally, we validate our approach through numerical experiments on synthetic and real-world datasets.
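
To give a sense of what "Catoni-style" estimation refers to, the sketch below implements the classic Catoni M-estimator of the mean, which damps the influence of extreme samples. It is not the paper's change-point detector or Robust-CPD-UCB; the scale parameter `alpha` and the sample values are arbitrary assumptions chosen for illustration.

```python
import math


def catoni_psi(x):
    """Catoni's influence function: sign(x) * log(1 + |x| + x^2 / 2)."""
    return math.copysign(math.log1p(abs(x) + 0.5 * x * x), x)


def catoni_mean(samples, alpha, iterations=200):
    """Solve sum_i psi(alpha * (x_i - theta)) = 0 for theta by bisection."""
    lo, hi = min(samples), max(samples)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # The sum is non-increasing in theta, so its sign tells us which half to keep.
        total = sum(catoni_psi(alpha * (x - mid)) for x in samples)
        if total > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# One extreme outlier drags the sample mean to 200; the Catoni estimate moves far less.
samples = [0.0, 0.0, 0.0, 0.0, 1000.0]
sample_mean = sum(samples) / len(samples)
robust_estimate = catoni_mean(samples, alpha=0.1)
```

Because the influence function grows only logarithmically, a single large reward cannot dominate the estimate, which is the behavior one wants under heavy tails.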

Article 1336

Title@2025-05-26 (1): Synthetic Time Series Forecasting with Transformer Architectures: Extensive Simulation Benchmarks

Title: Synthetic Time Series Forecasting with Transformer Architectures: Extensive Simulation Benchmarks

Synthetische Zeitreihenprognosen mit Transformer-Architekturen: Umfangreiche Simulations-Benchmarks

利用变形建筑结构预测合成时间序列:广泛模拟基准 2505.20048v1

Authors: Ali Forootani, Mohammad Khosravi

Time series forecasting plays a critical role in domains such as energy, finance, and healthcare, where accurate predictions inform decision-making under uncertainty. Although Transformer-based models have demonstrated success in sequential modeling, their adoption for time series remains limited by challenges such as noise sensitivity, long-range dependencies, and a lack of inductive bias for temporal structure. In this work, we present a unified and principled framework for benchmarking three prominent Transformer forecasting architectures (Autoformer, Informer, and PatchTST), each evaluated through three architectural variants: Minimal, Standard, and Full, representing increasing levels of complexity and modeling capacity. We conduct over 1500 controlled experiments on a suite of ten synthetic signals, spanning five patch lengths and five forecast horizons under both clean and noisy conditions. Our analysis reveals consistent patterns across model families. To advance this landscape further, we introduce the Koopman-enhanced Transformer framework, Deep Koopformer, which integrates operator-theoretic latent state modeling to improve stability and interpretability. We demonstrate its efficacy on nonlinear and chaotic dynamical systems. Our results highlight Koopman-based Transformers as a promising hybrid approach for robust, interpretable, and theoretically grounded time series forecasting in noisy and complex real-world conditions.

Article 1337

Title@2025-05-26 (1): Convex Approximation of Two-Layer ReLU Networks for Hidden State Differential Privacy

Title: Convex Approximation of Two-Layer ReLU Networks for Hidden State Differential Privacy

Convex-Annäherung von Zwei-Layer-ReLU-Netzwerken für versteckte staatliche differentielle Privatsphäre

隐藏式国家差异隐私双线雷路网络的连接近似 2407.04884v3

Authors: Rob Romijnders, Antti Koskela

The hidden state threat model of differential privacy (DP) assumes that the adversary has access only to the final trained machine learning (ML) model, without seeing intermediate states during training. However, the current privacy analyses under this model are restricted to convex optimization problems, reducing their applicability to multi-layer neural networks, which are essential in modern deep learning applications. Notably, the most successful applications of the hidden state privacy analyses in classification tasks have only been for logistic regression models. We demonstrate that it is possible to privately train convex problems with privacy-utility trade-offs comparable to those of 2-layer ReLU networks trained with DP stochastic gradient descent (DP-SGD). This is achieved through a stochastic approximation of a dual formulation of the ReLU minimization problem, resulting in a strongly convex problem. This enables the use of existing hidden state privacy analyses and provides accurate privacy bounds also for the noisy cyclic mini-batch gradient descent (NoisyCGD) method with fixed disjoint mini-batches. Empirical results on benchmark classification tasks demonstrate that NoisyCGD can achieve privacy-utility trade-offs on par with DP-SGD applied to 2-layer ReLU networks.
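
The noisy cyclic mini-batch update named in the abstract can be pictured with a toy one-dimensional least-squares problem. The sketch below only shows the update pattern (fixed disjoint batches visited in the same order each epoch, clipped per-example gradients, Gaussian noise). It does not implement the paper's convex dual formulation and performs no privacy accounting; the function name, hyperparameters, and data are assumptions.

```python
import random


def noisy_cyclic_gd(xs, ys, batches, epochs, lr, clip, noise_std, seed=0):
    """Fit y ~ w * x with noisy cyclic mini-batch gradient descent (toy sketch)."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        for batch in batches:  # same fixed, disjoint batches in the same order every epoch
            grad_sum = 0.0
            for i in batch:
                grad = (w * xs[i] - ys[i]) * xs[i]  # gradient of 0.5 * (w*x - y)^2
                grad_sum += max(-clip, min(clip, grad))  # per-example clipping
            # Gaussian noise scaled to the clipping bound.
            grad_sum += rng.gauss(0.0, noise_std * clip)
            w -= lr * grad_sum / len(batch)
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
fixed_batches = [[0, 1], [2, 3]]
# With the noise switched off the iterate converges to the true slope.
w_noiseless = noisy_cyclic_gd(xs, ys, fixed_batches, epochs=200, lr=0.05,
                              clip=10.0, noise_std=0.0)
w_noisy = noisy_cyclic_gd(xs, ys, fixed_batches, epochs=200, lr=0.05,
                          clip=10.0, noise_std=0.01)
```

Setting `noise_std` to zero is useful only as a sanity check; any privacy guarantee depends on the noise scale and on an analysis that this sketch does not attempt.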

Article 1338

Title@2025-05-26 (1): Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning

Title: Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning

Kontrolle des neuralen Zusammenbruchs verbessert Out-of-Distribution Detection und Transfer Learning

控制神经崩溃增强传播外探测和转让学习 2502.10691v2

Authors: Md Yousuf Harun, Jhair Gallardo, Christopher Kanan

Out-of-distribution (OOD) detection and OOD generalization are widely studied in Deep Neural Networks (DNNs), yet their relationship remains poorly understood. We empirically show that the degree of Neural Collapse (NC) in a network layer is inversely related with these objectives: stronger NC improves OOD detection but degrades generalization, while weaker NC enhances generalization at the cost of detection. This trade-off suggests that a single feature space cannot simultaneously achieve both tasks. To address this, we develop a theoretical framework linking NC to OOD detection and generalization. We show that entropy regularization mitigates NC to improve generalization, while a fixed Simplex Equiangular Tight Frame (ETF) projector enforces NC for better detection. Based on these insights, we propose a method to control NC at different DNN layers. In experiments, our method excels at both tasks across OOD datasets and DNN architectures.
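
The Simplex Equiangular Tight Frame mentioned above has a simple closed form in its standard definition: for K classes, the vectors are the columns of sqrt(K/(K-1)) * (I - (1/K) * 1 1^T). The sketch below builds that matrix and leaves out any rotation or the way the paper wires it into a projector, so it should be read as the textbook construction rather than the authors' implementation.

```python
import math


def simplex_etf(num_classes):
    """Return the K x K simplex ETF matrix sqrt(K/(K-1)) * (I - 11^T / K)."""
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    return [[scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
            for i in range(k)]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# The matrix is symmetric, so its rows are the class vectors.
etf = simplex_etf(4)
self_similarity = dot(etf[0], etf[0])   # each vector has unit norm
cross_similarity = dot(etf[0], etf[1])  # every pair has inner product -1/(K-1)
```

Unit norms with equal, maximally negative pairwise similarity are the geometry that class means approach under neural collapse, which is why fixing such a frame encourages it.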

Article 1339

Title@2025-05-26 (1): Beyond Simple Concatenation: Fairly Assessing PLM Architectures for Multi-Chain Protein-Protein Interactions Prediction

Title: Beyond Simple Concatenation: Fairly Assessing PLM Architectures for Multi-Chain Protein-Protein Interactions Prediction

Beyond Simple Concatenation: Fairly Assessing PLM Architectures for Multi-Chain Protein-Protein Interaktionen Prediction

超越简单星系:公平评估多沙因蛋白因-蛋白因相互作用预测的PLM结构 2505.20036v1

Authors: Hazem Alsamkary, Mohamed Elshaffei, Mohamed Soudy, Sara Ossman, Abdallah Amr, Nehal Adel Abdelsalam, Mohamed Elkerdawy, Ahmed Elnaggar

Protein-protein interactions (PPIs) are fundamental to numerous cellular processes, and their characterization is vital for understanding disease mechanisms and guiding drug discovery. While protein language models (PLMs) have demonstrated remarkable success in predicting protein structure and function, their application to sequence-based PPI binding affinity prediction remains relatively underexplored. This gap is often attributed to the scarcity of high-quality, rigorously refined datasets and the reliance on simple strategies for concatenating protein representations. In this work, we address these limitations. First, we introduce a meticulously curated version of the PPB-Affinity dataset, comprising 8,207 unique protein-protein interaction entries, obtained by resolving annotation inconsistencies and duplicate entries for multi-chain protein interactions. This dataset applies a stringent sequence identity threshold (at most 30%) to ensure robust splitting into training, validation, and test sets, minimizing data leakage. Second, we propose and systematically evaluate four architectures for adapting PLMs to PPI binding affinity prediction: embeddings concatenation (EC), sequences concatenation (SC), hierarchical pooling (HP), and pooled attention addition (PAD). These architectures were assessed using two training methods: full fine-tuning and a lightweight approach employing ConvBERT heads over frozen PLM features. Our comprehensive experiments across multiple leading PLMs (ProtT5, ESM2, Ankh, Ankh2, and ESM3) demonstrated that the HP and PAD architectures consistently outperform conventional concatenation methods, achieving up to a 12% increase in Spearman correlation. These results highlight the necessity of sophisticated architectural designs to fully exploit the capabilities of PLMs for nuanced PPI binding affinity prediction.

Article 1340

Title@2025-05-26 (1): TeleSparse: Practical Privacy-Preserving Verification of Deep Neural Networks

Title: TeleSparse: Practical Privacy-Preserving Verification of Deep Neural Networks

TeleSparse: Praktische Datenschutz-Bewahrung von Tiefen-Neural-Netzwerken

远程分离:深海神经网络的实际隐私保护核查 2504.19274v2

Authors: Mohammad M Maheri, Hamed Haddadi, Alex Davidson

Verification of the integrity of deep learning inference is crucial for understanding whether a model is being applied correctly. However, such verification typically requires access to model weights and (potentially sensitive or private) training data. So-called Zero-knowledge Succinct Non-Interactive Arguments of Knowledge (ZK-SNARKs) would appear to provide the capability to verify model inference without access to such sensitive data. However, applying ZK-SNARKs to modern neural networks, such as transformers and large vision models, introduces significant computational overhead. We present TeleSparse, a ZK-friendly post-processing mechanism to produce practical solutions to this problem. TeleSparse tackles two fundamental challenges inherent in applying ZK-SNARKs to modern neural networks: (1) Reducing circuit constraints: Over-parameterized models result in numerous constraints for ZK-SNARK verification, driving up memory and proof generation costs. We address this by applying sparsification to neural network models, enhancing proof efficiency without compromising accuracy or security. (2) Minimizing the size of lookup tables required for non-linear functions, by optimizing activation ranges through neural teleportation, a novel adaptation for narrowing activation functions’ range. TeleSparse reduces prover memory usage by 67% and proof generation time by 46% on the same model, with an accuracy trade-off of approximately 1%. We implement our framework using the Halo2 proving system and demonstrate its effectiveness across multiple architectures (Vision-transformer, ResNet, MobileNet) and datasets (ImageNet, CIFAR-10, CIFAR-100). This work opens new directions for ZK-friendly model design, moving toward scalable, resource-efficient verifiable deep learning.

Article 1341

Title: ViTaPEs: Visuotactile Position Encodings for Cross-Modal Alignment in Multimodal Transformers

ViTaPEs: Visuotaktile Positionskodierungen für die modulübergreifende Ausrichtung in multimodalen Transformatoren

ViTAPEs:多式变换器中跨模式对齐的变量定位位置编码 2505.20032v1

Authors: Fotios Lygerakis, Ozan Özdenizci, Elmar Rückert

Tactile sensing provides local essential information that is complementary to visual perception, such as texture, compliance, and force. Despite recent advances in visuotactile representation learning, challenges remain in fusing these modalities and generalizing across tasks and environments without heavy reliance on pre-trained vision-language models. Moreover, existing methods do not study positional encodings, thereby overlooking the multi-scale spatial reasoning needed to capture fine-grained visuotactile correlations. We introduce ViTaPEs, a transformer-based framework that robustly integrates visual and tactile input data to learn task-agnostic representations for visuotactile perception. Our approach exploits a novel multi-scale positional encoding scheme to capture intra-modal structures, while simultaneously modeling cross-modal cues. Unlike prior work, we provide provable guarantees in visuotactile fusion, showing that our encodings are injective, rigid-motion-equivariant, and information-preserving, validating these properties empirically. Experiments on multiple large-scale real-world datasets show that ViTaPEs not only surpasses state-of-the-art baselines across various recognition tasks but also demonstrates zero-shot generalization to unseen, out-of-domain scenarios. We further demonstrate the transfer-learning strength of ViTaPEs in a robotic grasping task, where it outperforms state-of-the-art baselines in predicting grasp success. Project page: https://sites.google.com/view/vitapes

Article 1342

Title@2025-05-26 (1): Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions

Title: Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions

Mehrere Abstiege im Deep Learning als Folge von Order-Chaos-Übergängen

作为有秩序的赵国过渡的一个序列的深层学习中的多种族后裔 2505.20030v1

Authors: Wenbo Wei, Nicholas Chong Jia Le, Choy Heng Lai, Ling Feng

We observe a novel ‘multiple-descent’ phenomenon during the training process of LSTM, in which the test loss repeatedly goes through long cycles of rising and falling after the model is overtrained. By carrying out asymptotic stability analysis of the models, we found that the cycles in test loss are closely associated with the phase transition process between order and chaos, and the local optimal epochs are consistently at the critical transition point between the two phases. More importantly, the global optimal epoch occurs at the first transition from order to chaos, where the ‘width’ of the ‘edge of chaos’ is the widest, allowing the best exploration of better weight configurations for learning.

Article 1343

Title@2025-05-26 (1): Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)

Title: Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)

Korrelation von Instruktions-Tuning (in multimodalen Modellen) mit visionssprachlicher Verarbeitung (im Gehirn)

与视觉语言处理(大脑中)相交校正(多式联运模式) 2505.20029v1

Authors: Subba Reddy Oota, Akshett Jindal, Ishani Mondal, Khushbu Pahwa, Satya Sai Srinath Namburi, Manish Shrivastava, Maneesh Singh, Bapi S. Raju, Manish Gupta

Transformer-based language models, though not explicitly trained to mimic brain recordings, have demonstrated surprising alignment with brain activity. Progress in these models (through increased size, instruction-tuning, and multimodality) has led to better representational alignment with neural data. Recently, a new class of instruction-tuned multimodal LLMs (MLLMs) has emerged, showing remarkable zero-shot capabilities in open-ended multimodal vision tasks. However, it is unknown whether MLLMs, when prompted with natural instructions, lead to better brain alignment and effectively capture instruction-specific representations. To address this, we first investigate brain alignment, i.e., measuring the degree of predictivity of neural visual activity using text output response embeddings from MLLMs as participants engage in watching natural scenes. Experiments with 10 different instructions show that MLLMs exhibit significantly better brain alignment than vision-only models and perform comparably to non-instruction-tuned multimodal models like CLIP. We also find that while these MLLMs are effective at generating high-quality responses suitable to the task-specific instructions, not all instructions are relevant for brain alignment. Further, by varying instructions, we make the MLLMs encode instruction-specific visual concepts related to the input image. This analysis shows that MLLMs effectively capture count-related and recognition-related concepts, demonstrating strong alignment with brain activity. Notably, the majority of the explained variance of the brain encoding models is shared between MLLM embeddings of image captioning and other instructions. These results suggest that enhancing MLLMs’ ability to capture task-specific information could lead to better differentiation between various types of instructions, thereby improving their precision in predicting brain responses.

Article 1344

Title: Multi-modal brain encoding models for multi-modal stimuli

Multimodale Gehirnkodierungsmodelle für multimodale Reize

多模式刺激多模式大脑编码模型 2505.20027v1

Authors: Subba Reddy Oota, Khushbu Pahwa, Mounika Marreddy, Maneesh Singh, Manish Gupta, Bapi S. Raju

Despite participants engaging in unimodal stimuli, such as watching images or silent videos, recent work has demonstrated that multi-modal Transformer models can predict visual brain activity impressively well, even with incongruent modality representations. This raises the question of how accurately these multi-modal models can predict brain activity when participants are engaged in multi-modal stimuli. As these models grow increasingly popular, their use in studying neural activity provides insights into how our brains respond to such multi-modal naturalistic stimuli, i.e., where the brain separates and integrates information across modalities through a hierarchy of early sensory regions to higher cognition. We investigate this question by using multiple unimodal models and two types of multi-modal models (cross-modal and jointly pretrained) to determine which type of model is more relevant to fMRI brain activity when participants are engaged in watching movies. We observe that both types of multi-modal models show improved alignment in several language and visual regions. This study also helps in identifying which brain regions process unimodal versus multi-modal information. We further investigate the contribution of each modality to multi-modal alignment by carefully removing unimodal features one by one from multi-modal representations, and find that there is additional information beyond the unimodal embeddings that is processed in the visual and language regions. Based on this investigation, we find that for cross-modal models, brain alignment is partially attributed to the video modality, whereas for jointly pretrained models, it is partially attributed to both the video and audio modalities. This serves as a strong motivation for the neuroscience community to investigate the interpretability of these models for deepening our understanding of multi-modal information processing in the brain.

Article 1345

Title@2025-05-26 (1): Gradient Inversion Transcript: Leveraging Robust Generative Priors to Reconstruct Training Data from Gradient Leakage

Title: Gradient Inversion Transcript: Leveraging Robust Generative Priors to Reconstruct Training Data from Gradient Leakage

Gradient Inversion Transcript: Leveraging Robust Generative Priors to Reconstruct Trainingsdaten von Gradient Leakage

梯度反转轨迹:从梯度渗漏中重新构建培训数据的杠杆化强力生成前程 2505.20026v1

Authors: Xinping Chen, Chen Liu

We propose Gradient Inversion Transcript (GIT), a novel generative approach for reconstructing training data from leaked gradients. GIT employs a generative attack model, whose architecture is tailored to align with the structure of the leaked model based on theoretical analysis. Once trained offline, GIT can be deployed efficiently and only relies on the leaked gradients to reconstruct the input data, rendering it applicable under various distributed learning environments. When used as a prior for other iterative optimization-based methods, GIT not only accelerates convergence but also enhances the overall reconstruction quality. GIT consistently outperforms existing methods across multiple datasets and demonstrates strong robustness under challenging conditions, including inaccurate gradients, data distribution shifts and discrepancies in model parameters.

Article 1346

Title@2025-05-26 (1): Human-Aligned Image Models Improve Visual Decoding from the Brain

Title: Human-Aligned Image Models Improve Visual Decoding from the Brain

Menschlich ausgerichtete Imagemodelle verbessern die visuelle Dekodierung aus dem Gehirn

人与人之间的图像模型改进大脑的视觉解码 2502.03081v2

Authors: Nona Rajabi, Antônio H. Ribeiro, Miguel Vasco, Farzaneh Taleb, Mårten Björkman, Danica Kragic

Decoding visual images from brain activity has significant potential for advancing brain-computer interaction and enhancing the understanding of human perception. Recent approaches align the representation spaces of images and brain activity to enable visual decoding. In this paper, we introduce the use of human-aligned image encoders to map brain signals to images. We hypothesize that these models more effectively capture perceptual attributes associated with the rapid visual stimuli presentations commonly used in visual brain data recording experiments. Our empirical results support this hypothesis, demonstrating that this simple modification improves image retrieval accuracy by up to 21% compared to state-of-the-art methods. Comprehensive experiments confirm consistent performance improvements across diverse EEG architectures, image encoders, alignment methods, participants, and brain imaging modalities.

Article 1347

Title@2025-05-26 (1): Ontology- and LLM-based Data Harmonization for Federated Learning in Healthcare

Title: Ontology- and LLM-based Data Harmonization for Federated Learning in Healthcare

Ontologie- und LLM-basierte Datenharmonisierung für das Federated Learning in Healthcare

以本体学和LLM为基础的保健方面联邦学习数据统一 2505.20020v1

Authors: Natallia Kokash, Lei Wang, Thomas H. Gillespie, Adam Belloum, Paola Grosso, Sara Quinney, Lang Li, Bernard de Bono

The rise of electronic health records (EHRs) has unlocked new opportunities for medical research, but privacy regulations and data heterogeneity remain key barriers to large-scale machine learning. Federated learning (FL) enables collaborative modeling without sharing raw data, yet faces challenges in harmonizing diverse clinical datasets. This paper presents a two-step data alignment strategy integrating ontologies and large language models (LLMs) to support secure, privacy-preserving FL in healthcare, demonstrating its effectiveness in a real-world project involving semantic mapping of EHR data.

Article 1348

Title@2025-05-26 (1): ProcessBench: Identifying Process Errors in Mathematical Reasoning

Title: ProcessBench: Identifying Process Errors in Mathematical Reasoning

ProcessBench: Identifizierung von Prozessfehlern in mathematischer Reasoning

进程快节: 识别数学原因中的进程错误 2412.06559v4

Authors: Chujie Zheng, Zhenru Zhang, Beichen Zhang, Runji Lin, Keming Lu, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin

As language models regularly make mistakes when solving math problems, automated identification of errors in the reasoning process becomes increasingly significant for their scalable oversight. In this paper, we introduce ProcessBench for measuring the ability to identify erroneous steps in mathematical reasoning. It consists of 3,400 test cases, primarily focused on competition- and Olympiad-level math problems. Each test case contains a step-by-step solution with error location annotated by human experts. Models are required to identify the earliest step that contains an error, or conclude that all steps are correct. We conduct extensive evaluation on ProcessBench, involving two types of models: process reward models (PRMs) and critic models, where for the latter we prompt general language models to critique each solution step by step. We draw two main observations: (1) Existing PRMs typically fail to generalize to more challenging math problems beyond GSM8K and MATH. They underperform both critic models (i.e., prompted general language models) and our own trained PRM that is straightforwardly fine-tuned on the PRM800K dataset. (2) The best open-source model, QwQ-32B-Preview, has demonstrated critique capability competitive with the proprietary model GPT-4o, although it still lags behind the reasoning-specialized o1-mini. We hope ProcessBench can foster future research in reasoning process assessment, paving the way toward scalable oversight of language models.
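
The task format in this abstract (return the earliest erroneous step, or report that all steps are correct) can be scored with a few lines of code. The sketch below is a minimal illustration and not the benchmark's official evaluation script: the `-1` convention for "all steps correct", the function name `score_predictions`, and the decision to report accuracy separately on erroneous and error-free samples are all assumptions made for this example.

```python
from typing import List, Tuple

ALL_CORRECT = -1  # assumed label meaning "no step contains an error"


def score_predictions(labels: List[int], preds: List[int]) -> Tuple[float, float]:
    """Return (accuracy on erroneous samples, accuracy on error-free samples).

    Each label or prediction is the 0-based index of the earliest erroneous
    step, or ALL_CORRECT when every step is judged correct.
    """
    if len(labels) != len(preds):
        raise ValueError("labels and preds must have the same length")

    err_hits = err_total = ok_hits = ok_total = 0
    for label, pred in zip(labels, preds):
        if label == ALL_CORRECT:
            ok_total += 1
            ok_hits += int(pred == label)
        else:
            err_total += 1
            # Only an exact match on the earliest erroneous step counts.
            err_hits += int(pred == label)

    err_acc = err_hits / err_total if err_total else 0.0
    ok_acc = ok_hits / ok_total if ok_total else 0.0
    return err_acc, ok_acc


if __name__ == "__main__":
    # Hypothetical annotations and model outputs, for illustration only.
    labels = [2, ALL_CORRECT, 0, ALL_CORRECT]
    preds = [2, ALL_CORRECT, 1, 3]
    print(score_predictions(labels, preds))
```

Keeping the two accuracies separate makes it visible when a model scores well simply by always flagging an error or by never flagging one.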

Article 1349

Title@2025-05-26 (1): Kernel-based estimators for functional causal effects

Title: Kernel-based estimators for functional causal effects

kernbasierte Schätzwerte für funktionelle kausale Effekte

功能因果效应的内核核心估计值 2503.05024v3

Authors: Yordan P. Raykov, Hengrui Luo, Justin D. Strait, Wasiur R. KhudaBukhsh

We propose causal effect estimators based on empirical Fréchet means and operator-valued kernels, tailored to functional data spaces. These methods address the challenges of high-dimensionality, sequential ordering, and model complexity while preserving robustness to treatment misspecification. Using structural assumptions, we obtain compact representations of potential outcomes, enabling scalable estimation of causal effects over time and across covariates. We provide both theoretical results, regarding the consistency of functional causal effects, and an empirical comparison of a range of proposed causal effect estimators. Applications to binary treatment settings with functional outcomes illustrate the framework’s utility in biomedical monitoring, where outcomes exhibit complex temporal dynamics. Our estimators accommodate scenarios with registered covariates and outcomes, aligning them to the Fréchet means, as well as cases requiring higher-order representations to capture intricate covariate-outcome interactions. These advancements extend causal inference to dynamic and non-linear domains, offering new tools for understanding complex treatment effects in functional data settings.

Article 1350

Title@2025-05-26 (1): Data-Dependent Regret Bounds for Constrained MABs

Title: Data-Dependent Regret Bounds for Constrained MABs

Datendependent Regret Bounds for Constrained MABs

受约束 MAB 的受控数据依赖的 Regret Bounds 2505.20010v1

Authors: Gianmarco Genalti, Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

This paper initiates the study of data-dependent regret bounds in constrained MAB settings. These bounds depend on the sequence of losses that characterize the problem instance. Thus, they can be much smaller than classical $\widetilde{\mathcal{O}}(\sqrt{T})$ regret bounds, while being equivalent to them in the worst case. Despite this, data-dependent regret bounds have been completely overlooked in constrained MAB settings. The goal of this paper is to answer the following question: Can data-dependent regret bounds be derived in the presence of constraints? We answer this question affirmatively in constrained MABs with adversarial losses and stochastic constraints. Specifically, our main focus is on the most challenging and natural settings with hard constraints, where the learner must ensure that the constraints are always satisfied with high probability. We design an algorithm with a regret bound consisting of two data-dependent terms. The first term captures the difficulty of satisfying the constraints, while the second one encodes the complexity of learning independently of the presence of constraints. We also prove a lower bound showing that these two terms are not artifacts of our specific approach and analysis, but rather the fundamental components that inherently characterize the complexities of the problem. Finally, in designing our algorithm, we also derive some novel results in the related (and easier) soft constraints settings, which may be of independent interest.

Article 1351

Title@2025-05-26 (1): Prediction-Powered E-Values

Title: Prediction-Powered E-Values

Voraussichtliche E-Werte

预测力电子价值 2502.04294v2

Authors: Daniel Csillag, Claudio José Struchiner, Guilherme Tegoni Goedert

Quality statistical inference requires a sufficient amount of data, which can be missing or hard to obtain. To this end, prediction-powered inference has risen as a promising methodology, but existing approaches are largely limited to Z-estimation problems such as inference of means and quantiles. In this paper, we apply ideas of prediction-powered inference to e-values. By doing so, we inherit all the usual benefits of e-values – such as anytime-validity, post-hoc validity and versatile sequential inference – as well as greatly expand the set of inferences achievable in a prediction-powered manner. In particular, we show that every inference procedure that can be framed in terms of e-values has a prediction-powered counterpart, given by our method. We showcase the effectiveness of our framework across a wide range of inference tasks, from simple hypothesis testing and confidence intervals to more involved procedures for change-point detection and causal discovery, which were out of reach of previous techniques. Our approach is modular and easily integrable into existing algorithms, making it a compelling choice for practical applications.
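
For readers unfamiliar with e-values, the following sketch shows a plain e-process (it is not the prediction-powered construction of the paper). It tests the null hypothesis that a coin is fair by multiplying likelihood ratios against a fixed alternative; under the null each factor has expectation 1, so by Ville's inequality the running product exceeds `1/alpha` with probability at most `alpha` at any stopping time. The alternative `q = 0.7` and the function names are assumptions chosen for illustration.

```python
from typing import Iterable, List, Optional


def e_process(flips: Iterable[int], q: float = 0.7) -> List[float]:
    """Running likelihood-ratio e-process for H0: P(heads) = 0.5.

    Each factor is q/0.5 for heads (1) or (1-q)/0.5 for tails (0); its
    expectation under H0 is 1, so the product is a nonnegative martingale
    that starts at 1.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    wealth = 1.0
    path = []
    for x in flips:
        wealth *= (q if x == 1 else 1.0 - q) / 0.5
        path.append(wealth)
    return path


def first_rejection(path: List[float], alpha: float = 0.05) -> Optional[int]:
    """Return the first 0-based index where the e-process reaches 1/alpha."""
    threshold = 1.0 / alpha
    for t, e in enumerate(path):
        if e >= threshold:
            return t
    return None


if __name__ == "__main__":
    # Hypothetical data: a run of heads quickly accumulates evidence.
    path = e_process([1] * 9)
    print(first_rejection(path))
```

The anytime-validity mentioned in the abstract is what allows `first_rejection` to stop at a data-dependent time without inflating the error rate.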

Article 1352

Title@2025-05-26 (1): TabPFN: One Model to Rule Them All?

Title: TabPFN: One Model to Rule Them All?

TabPFN: Ein Modell, um sie alle zu beherrschen?

TabPFN: 一种模式来统治他们吗? 2505.20003v1

Authors: Qiong Zhang, Yan Shuo Tan, Qinglong Tian, Pengfei Li

Hollmann et al. (Nature 637 (2025) 319-326) recently introduced TabPFN, a transformer-based deep learning model for regression and classification on tabular data, which they claim “outperforms all previous methods on datasets with up to 10,000 samples by a wide margin, using substantially less training time.” Furthermore, they have called TabPFN a “foundation model” for tabular data, as it can support “data generation, density estimation, learning reusable embeddings and fine-tuning”. If these statements are well-supported, TabPFN may have the potential to supersede existing modeling approaches on a wide range of statistical tasks, mirroring a similar revolution in other areas of artificial intelligence that began with the advent of large language models. In this paper, we provide a tailored explanation of how TabPFN works for a statistics audience, by emphasizing its interpretation as approximate Bayesian inference. We also provide more evidence of TabPFN’s “foundation model” capabilities: We show that an out-of-the-box application of TabPFN vastly outperforms specialized state-of-the-art methods for semi-supervised parameter estimation, prediction under covariate shift, and heterogeneous treatment effect estimation. We further show that TabPFN can outperform LASSO at sparse regression and can break a robustness-efficiency trade-off in classification. All experiments can be reproduced using the code provided at https://github.com/qinglong-tian/tabpfn_study.

Article 1353

Title@2025-05-26 (1): Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents

Title: Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents

Unvollkommenheit: Simulieren von Studenten mit unterschiedlichen kognitiven Ebenen mit LLM-basierten Agenten

普及缺陷:利用基于LLM的代理物模拟具有不同认知水平的学生 2505.19997v1

Authors: Tao Wu, Jingyuan Chen, Wang Lin, Mengze Li, Yumeng Zhu, Ang Li, Kun Kuang, Fei Wu

Large language models (LLMs) are revolutionizing education, with LLM-based agents playing a key role in simulating student behavior. A major challenge in student simulation is modeling the diverse learning patterns of students at various cognitive levels. However, current LLMs, typically trained as “helpful assistants”, aim to generate perfect responses. As a result, they struggle to simulate students with diverse cognitive abilities, as they often produce overly advanced answers, missing the natural imperfections that characterize student learning and resulting in unrealistic simulations. To address this issue, we propose a training-free framework for student simulation. We begin by constructing a cognitive prototype for each student using a knowledge graph, which captures their understanding of concepts from past learning records. This prototype is then mapped to new tasks to predict student performance. Next, we simulate student solutions based on these predictions and iteratively refine them using a beam search method to better replicate realistic mistakes. To validate our approach, we construct the Student_100 dataset, consisting of 100 students working on Python programming and 5,000 learning records. Experimental results show that our method consistently outperforms baseline models, achieving 100% improvement in simulation accuracy.

Article 1354

Title@2025-05-26 (1): Learning Optimal Multimodal Information Bottleneck Representations

Title: Learning Optimal Multimodal Information Bottleneck Representations

Optimales Lernen multimodaler Informationen Engpässe Vertretungen

学习最佳最佳多模式信息 2505.19996v1

Authors: Qilong Wu, Yiyang Shao, Jun Wang, Xiaobo Sun

Leveraging high-quality joint representations from multimodal data can greatly enhance model performance in various machine-learning based applications. Recent multimodal learning methods, based on the multimodal information bottleneck (MIB) principle, aim to generate optimal MIB with maximal task-relevant information and minimal superfluous information via regularization. However, these methods often set ad hoc regularization weights and overlook imbalanced task-relevant information across modalities, limiting their ability to achieve optimal MIB. To address this gap, we propose a novel multimodal learning framework, Optimal Multimodal Information Bottleneck (OMIB), whose optimization objective guarantees the achievability of optimal MIB by setting the regularization weight within a theoretically derived bound. OMIB further addresses imbalanced task-relevant information by dynamically adjusting regularization weights per modality, promoting the inclusion of all task-relevant information. Moreover, we establish a solid information-theoretical foundation for OMIB’s optimization and implement it under the variational approximation framework for computational efficiency. Finally, we empirically validate the OMIB’s theoretical properties on synthetic data and demonstrate its superiority over the state-of-the-art benchmark methods in various downstream tasks.

Article 1355

Title@2025-05-26 (1): Distortion Resilience for Goal-Oriented Semantic Communication

Title: Distortion Resilience for Goal-Oriented Semantic Communication

Distortion Resilienz für zielorientierte semantische Kommunikation

目标导向语义交流的扭曲复原力 2309.14587v2

Authors: Minh-Duong Nguyen, Quang-Vinh Do, Zhaohui Yang, Quoc-Viet Pham, Won-Joo Hwang

Recent research efforts on Semantic Communication (SemCom) have mostly considered accuracy as a main problem for optimizing goal-oriented communication systems. However, these approaches introduce a paradox: the accuracy of Artificial Intelligence (AI) tasks should naturally emerge through training rather than being dictated by network constraints. Acknowledging this dilemma, this work introduces an innovative approach that leverages the rate distortion theory to analyze distortions induced by communication and compression, thereby analyzing the learning process. Specifically, we examine the distribution shift between the original data and the distorted data, thus assessing its impact on the AI model’s performance. Building upon this analysis, we can preemptively estimate the empirical accuracy of AI tasks, making the goal-oriented SemCom problem feasible. To achieve this objective, we present the theoretical foundation of our approach, accompanied by simulations and experiments that demonstrate its effectiveness. The experimental results indicate that our proposed method enables accurate AI task performance while adhering to network constraints, establishing it as a valuable contribution to the field of signal processing. Furthermore, this work advances research in goal-oriented SemCom and highlights the significance of data-driven approaches in optimizing the performance of intelligent systems.

Article 1356

Title@2025-05-26 (1): Federated Domain Generalization with Data-free On-server Matching Gradient

Title: Federated Domain Generalization with Data-free On-server Matching Gradient

Föderierte Domain-Verallgemeinerung mit datenfreiem On-Server-Zustimmungs-Gradient

具有无数据观测站上与渐变匹配的无数据观测器的联邦通用域 2501.14653v2

Authors: Trong-Binh Nguyen, Minh-Duong Nguyen, Jinsun Park, Quoc-Viet Pham, Won Joo Hwang

Domain Generalization (DG) aims to learn from multiple known source domains a model that can generalize well to unknown target domains. One of the key approaches in DG is training an encoder which generates domain-invariant representations. However, this approach is not applicable in Federated Domain Generalization (FDG), where data from various domains are distributed across different clients. In this paper, we introduce a novel approach, dubbed Federated Learning via On-server Matching Gradient (FedOMG), which can efficiently leverage domain information from distributed domains. Specifically, we utilize the local gradients as information about the distributed models to find an invariant gradient direction across all domains through gradient inner product maximization. The advantages are two-fold: 1) FedOMG can aggregate the characteristics of distributed models on the centralized server without incurring any additional communication cost, and 2) FedOMG is orthogonal to many existing FL/FDG methods, allowing for additional performance improvements by being seamlessly integrated with them. Extensive experimental evaluations on various settings demonstrate the robustness of FedOMG compared to other FL/FDG baselines. Our method outperforms recent SOTA baselines on four FL benchmark datasets (MNIST, EMNIST, CIFAR-10, and CIFAR-100), and three FDG benchmark datasets (PACS, VLCS, and OfficeHome).

Article 1357

Title@2025-05-26 (1): Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach

Title: Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach

Bedauerliche Analyse von durchschnittlichen Unichain-MDPs über einen actor-Critic-Ansatz

通过“行动者-批评办法”对平均回报单链式微DP的遗憾分析 2505.19986v1

Authors: Swetha Ganesh, Vaneet Aggarwal

Actor-Critic methods are widely used for their scalability, yet existing theoretical guarantees for infinite-horizon average-reward Markov Decision Processes (MDPs) often rely on restrictive ergodicity assumptions. We propose NAC-B, a Natural Actor-Critic with Batching, that achieves order-optimal regret of $\tilde{O}(\sqrt{T})$ in infinite-horizon average-reward MDPs under the unichain assumption, which permits both transient states and periodicity. This assumption is among the weakest under which the classic policy gradient theorem remains valid for average-reward settings. NAC-B employs function approximation for both the actor and the critic, enabling scalability to problems with large state and action spaces. The use of batching in our algorithm helps mitigate potential periodicity in the MDP and reduces stochasticity in gradient estimates, and our analysis formalizes these benefits through the introduction of the constants $C_{\text{hit}}$ and $C_{\text{tar}}$, which characterize the rate at which empirical averages over Markovian samples converge to the stationary distribution.

Article 1358

Title@2025-05-26 (1): Bridging The Multi-Modality Gaps of Audio, Visual and Linguistic for Speech Enhancement

Title: Bridging The Multi-Modality Gaps of Audio, Visual and Linguistic for Speech Enhancement

Überbrückung der Multi-Modalitätslücken von Audio, Visual und Linguistik zur Sprachverbesserung

弥合视听和语言的多模式差距,加强语言、视听能力 2501.13375v2

Authors: Meng-Ping Lin, Jen-Cheng Hou, Chia-Wei Chen, Shao-Yi Chien, Jun-Cheng Chen, Xugang Lu, Yu Tsao

Speech enhancement (SE) aims to improve the quality and intelligibility of speech in noisy environments. Recent studies have shown that incorporating visual cues in audio signal processing can enhance SE performance. Given that human speech communication naturally involves audio, visual, and linguistic modalities, it is reasonable to expect additional improvements by integrating linguistic information. However, effectively bridging these modality gaps, particularly during knowledge transfer, remains a significant challenge. In this paper, we propose a novel multi-modal learning framework, termed DLAV-SE, which leverages a diffusion-based model integrating audio, visual, and linguistic information for audio-visual speech enhancement (AVSE). Within this framework, the linguistic modality is modeled using a pretrained language model (PLM), which transfers linguistic knowledge to the audio-visual domain through a cross-modal knowledge transfer (CMKT) mechanism during training. After training, the PLM is no longer required at inference, as its knowledge is embedded into the AVSE model through the CMKT process. We conduct a series of SE experiments to evaluate the effectiveness of our approach. Results show that the proposed DLAV-SE system significantly improves speech quality and reduces generative artifacts, such as phonetic confusion, compared to state-of-the-art (SOTA) methods. Furthermore, visualization analyses confirm that the CMKT method enhances the generation quality of the AVSE outputs. These findings highlight both the promise of diffusion-based methods for advancing AVSE and the value of incorporating linguistic information to further improve system performance.

Article 1359

Title@2025-05-26 (1): Rethinking Probabilistic Circuit Parameter Learning

Title: Rethinking Probabilistic Circuit Parameter Learning

Parameterlernen für probabilistische Schaltkreise neu denken

重新思考概率电路参数学习 2505.19982v1

Authors: Anji Liu, Guy Van den Broeck

Probabilistic Circuits (PCs) offer a computationally scalable framework for generative modeling, supporting exact and efficient inference of a wide range of probabilistic queries. While recent advances have significantly improved the expressiveness and scalability of PCs, effectively training their parameters remains a challenge. In particular, a widely used optimization method, full-batch Expectation-Maximization (EM), requires processing the entire dataset before performing a single update, making it ineffective for large datasets. While empirical extensions to the mini-batch setting have been proposed, it remains unclear what objective these algorithms are optimizing, making it difficult to assess their theoretical soundness. This paper bridges the gap by establishing a novel connection between the general EM objective and the standard full-batch EM algorithm. Building on this, we derive a theoretically grounded generalization to the mini-batch setting and demonstrate its effectiveness through preliminary empirical results.

Article 1360

Title@2025-05-26 (1): Differential Privacy Analysis of Decentralized Gossip Averaging under Varying Threat Models

Title: Differential Privacy Analysis of Decentralized Gossip Averaging under Varying Threat Models

Differential Privacy Analyse dezentralisierter Gossip Average unter unterschiedlichen Bedrohungsmodellen

对不同威胁模式下分散的流民的隐私差异分析 2505.19969v1

Authors: Antti Koskela, Tejas Kulkarni

Fully decentralized training of machine learning models offers significant advantages in scalability, robustness, and fault tolerance. However, achieving differential privacy (DP) in such settings is challenging due to the absence of a central aggregator and varying trust assumptions among nodes. In this work, we present a novel privacy analysis of decentralized gossip-based averaging algorithms with additive node-level noise, both with and without secure summation over each node’s direct neighbors. Our main contribution is a new analytical framework based on a linear systems formulation that accurately characterizes privacy leakage across these scenarios. This framework significantly improves upon prior analyses, for example, reducing the Rényi DP parameter growth from $O(T^2)$ to $O(T)$, where $T$ is the number of training rounds. We validate our analysis with numerical results demonstrating superior DP bounds compared to existing approaches. We further illustrate our analysis with a logistic regression experiment on MNIST image classification in a fully decentralized setting, demonstrating utility comparable to central aggregation methods.
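
The setting in this abstract can be illustrated with a minimal sketch of gossip averaging with additive node-level noise. The synchronous round structure, the ring topology, the Gaussian noise, and all function names are assumptions made for illustration; they are not taken from the paper, and no privacy accounting is performed.

```python
import random


def gossip_round(values, neighbors, noise_std, rng):
    """One synchronous round: every node shares a noised copy of its value,
    then averages its own (un-noised) value with the copies it received."""
    noisy = [v + rng.gauss(0.0, noise_std) for v in values]
    new_values = []
    for i, v in enumerate(values):
        received = [noisy[j] for j in neighbors[i]]
        new_values.append((v + sum(received)) / (1 + len(received)))
    return new_values


def run_gossip(values, neighbors, rounds, noise_std, seed=0):
    """Run several gossip rounds with a seeded random generator."""
    rng = random.Random(seed)
    for _ in range(rounds):
        values = gossip_round(values, neighbors, noise_std, rng)
    return values


# Hypothetical topology: a ring of four nodes, each talking to two neighbors.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

noiseless = run_gossip([0.0, 4.0, 8.0, 12.0], ring, rounds=50, noise_std=0.0)
noised = run_gossip([0.0, 4.0, 8.0, 12.0], ring, rounds=50, noise_std=0.1)
```

With `noise_std=0.0` every node on this ring ends up at the initial mean of the values. With a positive `noise_std` the nodes only approximate it, which is the utility cost that a privacy analysis has to weigh against the protection gained.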

Article 1361

Title@2025-05-26 (1): Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)

Title: Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)

Position: Löse schichtweise lineare Modelle, um neurale dynamische Phänomene zu verstehen (Neuraler Kollaps, Emergence, Lazy/Rich Regime und Grokking)

位置:首先理解神经动态现象的解层图层线性模型(神经崩溃、新出现、Lazy/Rich制度和Grokking) 2502.21009v2

Authors: Yoonsoo Nam, Seok Hyeong Lee, Clementine C J Domine, Yeachan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee

In physics, complex systems are often simplified into minimal, solvable models that retain only the core principles. In machine learning, layerwise linear models (e.g., linear neural networks) act as simplified representations of neural network dynamics. These models follow the dynamical feedback principle, which describes how layers mutually govern and amplify each other’s evolution. This principle extends beyond the simplified models, successfully explaining a wide range of dynamical phenomena in deep neural networks, including neural collapse, emergence, lazy and rich regimes, and grokking. In this position paper, we call for the use of layerwise linear models retaining the core principles of neural dynamical phenomena to accelerate the science of deep learning.

Article 1362

Title@2025-05-26 (1): Learning to Select In-Context Demonstration Preferred by Large Language Model

Title: Learning to Select In-Context Demonstration Preferred by Large Language Model

Lernen, In-Kontext-Demonstration zu wählen Bevorzugt nach großen Sprachmodellen

学习选择大语言模式首选的文本内演示 2505.19966v1

Authors: Zheng Zhang, Shaocheng Lan, Lei Song, Jiang Bian, Yexin Li, Kan Ren

In-context learning (ICL) enables large language models (LLMs) to adapt to new tasks during inference using only a few demonstrations. However, ICL performance is highly dependent on the selection of these demonstrations. Recent work explores retrieval-based methods for selecting query-specific demonstrations, but these approaches often rely on surrogate objectives such as metric learning, failing to directly optimize ICL performance. Consequently, they struggle to identify truly beneficial demonstrations. Moreover, their discriminative retrieval paradigm is ineffective when the candidate pool lacks sufficient high-quality demonstrations. To address these challenges, we propose GenICL, a novel generative preference learning framework that leverages LLM feedback to directly optimize demonstration selection for ICL. Experiments on 19 datasets across 11 task categories demonstrate that GenICL outperforms existing methods in selecting the most effective demonstrations, leading to better ICL performance.

Article 1363

Title@2025-05-26 (1): The Limits of Preference Data for Post-Training

Title: The Limits of Preference Data for Post-Training

Die Grenzen der Präferenzdaten für das Post-Training

培训后优先数据限值 2505.19964v1

Authors: Eric Zhao, Jessica Dai, Pranjal Awasthi

Recent progress in strengthening the capabilities of large language models has stemmed from applying reinforcement learning to domains with automatically verifiable outcomes. A key question is whether we can similarly use RL to optimize for outcomes in domains where evaluating outcomes inherently requires human feedback; for example, in tasks like deep research and trip planning, outcome evaluation is qualitative and there are many possible degrees of success. One attractive and scalable modality for collecting human feedback is preference data: ordinal rankings (pairwise or $k$-wise) that indicate, for $k$ given outcomes, which one is preferred. In this work, we study a critical roadblock: preference data fundamentally and significantly limits outcome-based optimization. Even with idealized preference data (infinite, noiseless, and online), the use of ordinal feedback can prevent obtaining even approximately optimal solutions. We formalize this impossibility using voting theory, drawing an analogy between how a model chooses to answer a query and how voters choose a candidate to elect. This indicates that grounded human scoring and algorithmic innovations are necessary for extending the success of RL post-training to domains demanding human feedback. We also explore why these limitations have disproportionately impacted RLHF when it comes to eliciting reasoning behaviors (e.g., backtracking) versus situations where RLHF has been historically successful (e.g., instruction-tuning and safety training), finding that the limitations of preference data primarily suppress RLHF’s ability to elicit robust strategies – a class that encompasses most reasoning behaviors.
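
A toy illustration of why ordinal feedback can diverge from cardinal scoring is sketched below. The rater scores and function names are invented for this example, and it is not the paper's formal construction: it only shows that a pairwise majority can prefer an outcome whose average score is lower.

```python
def majority_winner(scores_a, scores_b):
    """Winner under pairwise preference data: each rater only reports
    which of the two outcomes they prefer."""
    a_wins = sum(a > b for a, b in zip(scores_a, scores_b))
    b_wins = sum(b > a for a, b in zip(scores_a, scores_b))
    if a_wins == b_wins:
        return "tie"
    return "A" if a_wins > b_wins else "B"


def mean_winner(scores_a, scores_b):
    """Winner under grounded scoring: compare average cardinal scores."""
    mean_a = sum(scores_a) / len(scores_a)
    mean_b = sum(scores_b) / len(scores_b)
    if mean_a == mean_b:
        return "tie"
    return "A" if mean_a > mean_b else "B"


# Hypothetical scores from three raters on a 0-10 scale.
scores_a = [6, 6, 0]   # slightly preferred by two raters, rejected by one
scores_b = [5, 5, 10]  # slightly worse for two raters, excellent for one

ordinal_choice = majority_winner(scores_a, scores_b)
cardinal_choice = mean_winner(scores_a, scores_b)
```

Here two of three raters prefer A, so preference data alone favors A, while the mean score favors B (about 6.67 against 4.0). Preference intensity is exactly the information that ordinal rankings discard.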

Article 1364

Title@2025-05-26 (1): Robustly optimal dynamics for active matter reservoir computing

Title: Robustly optimal dynamics for active matter reservoir computing

Robust optimale Dynamik für Reservoir Computing mit aktiver Materie

活性物质储油层计算强有力的最佳动态 2505.05420v2

Authors: Mario U. Gaimann, Miriam Klopotek

Information processing abilities of active matter are studied in the reservoir computing (RC) paradigm to infer the future state of a chaotic signal. We uncover an exceptional regime of agent dynamics that has been overlooked previously. It appears robustly optimal for performance under many conditions, thus providing valuable insights into computation with physical systems more generally. The key to forming effective mechanisms for information processing appears in the system’s intrinsic relaxation abilities. These are probed without actually enforcing a specific inference goal. The dynamical regime that achieves optimal computation is located just below a critical damping threshold, involving a relaxation with multiple stages, and is readable at the single-particle level. At the many-body level, it yields substrates robustly optimal for RC across varying physical parameters and inference tasks. A system in this regime exhibits a strong diversity of dynamic mechanisms under highly fluctuating driving forces. Correlations of agent dynamics can express a tight relationship between the responding system and the fluctuating forces driving it. As this model is interpretable in physical terms, it facilitates re-framing inquiries regarding learning and unconventional computing with a fresh rationale for many-body physics out of equilibrium.

Article 1365

Title@2025-05-26 (1): Explanatory Summarization with Discourse-Driven Planning

Title: Explanatory Summarization with Discourse-Driven Planning

Erklärende Zusammenfassung mit diskursgetriebener Planung

与 “ 分流规划 “ 结合的解释性总结 2504.19339v3

Authors: Dongqi Liu, Xi Yu, Vera Demberg, Mirella Lapata

Lay summaries for scientific documents typically include explanations to help readers grasp sophisticated concepts or arguments. However, current automatic summarization methods do not explicitly model explanations, which makes it difficult to align the proportion of explanatory content with human-written summaries. In this paper, we present a plan-based approach that leverages discourse frameworks to organize summary generation and guide explanatory sentences by prompting responses to the plan. Specifically, we propose two discourse-driven planning strategies, where the plan is conditioned as part of the input or part of the output prefix, respectively. Empirical experiments on three lay summarization datasets show that our approach outperforms existing state-of-the-art methods in terms of summary quality, enhances model robustness and controllability, and mitigates hallucination.

Article 1366

Title@2025-05-26 (1): RAP: Runtime-Adaptive Pruning for LLM Inference

Title: RAP: Runtime-Adaptive Pruning for LLM Inference

RAP: Runtime-Adaptive Pruning für LLM-Inferenz

RAP:LLM 推断的运行时间-适应性节制 2505.17138v2

Authors: Huanrong Liu, Chunlin Tian, Xuyang Wei, Jiaheng Dai, Qin Liu, Tianqi Wei, Qingbiao Li, Li Li

Large language models (LLMs) excel at language understanding and generation, but their enormous computational and memory requirements hinder deployment. Compression offers a potential solution to mitigate these constraints. However, most existing methods rely on fixed heuristics and thus fail to adapt to runtime memory variations or heterogeneous KV-cache demands arising from diverse user requests. To address these limitations, we propose RAP, an elastic pruning framework driven by reinforcement learning (RL) that dynamically adjusts compression strategies in a runtime-aware manner. Specifically, RAP dynamically tracks the evolving ratio between model parameters and KV-cache across practical execution. Recognizing that FFNs house most parameters, whereas parameter-light attention layers dominate KV-cache formation, the RL agent retains only those components that maximize utility within the current memory budget, conditioned on instantaneous workload and device state. Extensive experimental results demonstrate that RAP outperforms state-of-the-art baselines and is the first approach to jointly consider model weights and KV-cache on the fly.

Article 1367

Title@2025-05-26 (1): Multi-Type Point Cloud Autoencoder: A Complete Equivariant Embedding for Molecule Conformation and Pose

Title: Multi-Type Point Cloud Autoencoder: A Complete Equivariant Embedding for Molecule Conformation and Pose

Multi-Type-Punkt-Cloud-Autoencoder: Ein komplettes Equivariant-Embedding für Molekülkonformation und Pose

多类型点云云自动编码器:分子构造和脉冲的完全等同嵌入 2405.13791v3

Authors: Michael Kilgour, Mark Tuckerman, Jutta Rogal

Representations are a foundational component of any modelling protocol, including on molecules and molecular solids. For tasks that depend on knowledge of both molecular conformation and 3D orientation, such as the modelling of molecular dimers, clusters, or condensed phases, we desire a rotatable representation that is provably complete in the types and positions of atomic nuclei and roto-inversion equivariant with respect to the input point cloud. In this paper, we develop, train, and evaluate a new type of autoencoder, molecular O(3) encoding net (Mo3ENet), for multi-type point clouds, for which we propose a new reconstruction loss, capitalizing on a Gaussian mixture representation of the input and output point clouds. Mo3ENet is end-to-end equivariant, meaning the learned representation can be manipulated on O(3), a practical bonus. An appropriately trained Mo3ENet latent space comprises a universal embedding for scalar and vector molecule property prediction tasks, as well as other downstream tasks incorporating the 3D molecular pose, and we demonstrate its fitness on several such tasks.

Article 1368

Title@2025-05-26 (1): MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research

Title: MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research

MLR-Bench: Bewertung von KI-Agenten auf Open-Ended Machine Learning Research

MLR-Bench:评估AI公司在开放式机械学习研究方面的代理机构 2505.19955v1

Authors: Hui Chen, Miao Xiong, Yujie Lu, Wei Han, Ailin Deng, Yufei He, Jiaying Wu, Yibo Li, Yue Liu, Bryan Hooi

Recent advancements in AI agents have demonstrated their growing potential to drive and support scientific discovery. In this work, we introduce MLR-Bench, a comprehensive benchmark for evaluating AI agents on open-ended machine learning research. MLR-Bench includes three key components: (1) 201 research tasks sourced from NeurIPS, ICLR, and ICML workshops covering diverse ML topics; (2) MLR-Judge, an automated evaluation framework combining LLM-based reviewers with carefully designed review rubrics to assess research quality; and (3) MLR-Agent, a modular agent scaffold capable of completing research tasks through four stages: idea generation, proposal formulation, experimentation, and paper writing. Our framework supports both stepwise assessment across these distinct research stages, and end-to-end evaluation of the final research paper. We then use MLR-Bench to evaluate six frontier LLMs and an advanced coding agent, finding that while LLMs are effective at generating coherent ideas and well-structured papers, current coding agents frequently (e.g., in 80% of the cases) produce fabricated or invalidated experimental results, posing a major barrier to scientific reliability. We validate MLR-Judge through human evaluation, showing high agreement with expert reviewers, supporting its potential as a scalable tool for research evaluation. We open-source MLR-Bench to help the community benchmark, diagnose, and improve AI research agents toward trustworthy and transparent scientific discovery.

Article 1369

Title@2025-05-26 (1): An Explainable Diagnostic Framework for Neurodegenerative Dementias via Reinforcement-Optimized LLM Reasoning

Title: An Explainable Diagnostic Framework for Neurodegenerative Dementias via Reinforcement-Optimized LLM Reasoning

Ein erklärbares Diagnose-Framework für neurodegenerative Dementias durch Verstärkungsoptimierte LLM-Reasoning

通过强化-优化LLM解释性理疗理由的神经医学性痴呆症可解释的诊断框架 2505.19954v1

Authors: Andrew Zamai, Nathanael Fijalkow, Boris Mansencal, Laurent Simon, Eloi Navet, Pierrick Coupe

The differential diagnosis of neurodegenerative dementias is a challenging clinical task, mainly because of the overlap in symptom presentation and the similarity of patterns observed in structural neuroimaging. To improve diagnostic efficiency and accuracy, deep learning-based methods such as Convolutional Neural Networks and Vision Transformers have been proposed for the automatic classification of brain MRIs. However, despite their strong predictive performance, these models find limited clinical utility due to their opaque decision making. In this work, we propose a framework that integrates two core components to enhance diagnostic transparency. First, we introduce a modular pipeline for converting 3D T1-weighted brain MRIs into textual radiology reports. Second, we explore the potential of modern Large Language Models (LLMs) to assist clinicians in the differential diagnosis between Frontotemporal dementia subtypes, Alzheimer’s disease, and normal aging based on the generated reports. To bridge the gap between predictive accuracy and explainability, we employ reinforcement learning to incentivize diagnostic reasoning in LLMs. Without requiring supervised reasoning traces or distillation from larger models, our approach enables the emergence of structured diagnostic rationales grounded in neuroimaging findings. Unlike post-hoc explainability methods that retrospectively justify model decisions, our framework generates diagnostic rationales as part of the inference process, producing causally grounded explanations that inform and guide the model’s decision-making process. In doing so, our framework matches the diagnostic performance of existing deep learning methods while offering rationales that support its diagnostic conclusions.

Article 1370

Title@2025-05-26 (1): Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions

Title: Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions

Welche Datenattribute stimulieren die Mathe- und Code-Reasoning? Eine Untersuchung über Einflussfunktionen

哪些数据属性刺激数学和代码理由? 通过影响函数进行调查 2505.19949v1

Authors: Siqi Kou, Qingyuan Tian, Hanwen Xu, Zihao Zeng, Zhijie Deng

Large language models (LLMs) have demonstrated remarkable reasoning capabilities in math and coding, often bolstered by post-training on the chain-of-thoughts (CoTs) generated by stronger models. However, existing strategies for curating such training data predominantly rely on heuristics, limiting generalizability and failing to capture subtleties underlying the data. To address these limitations, we leverage influence functions to systematically attribute LLMs’ reasoning ability on math and coding to individual training examples, sequences, and tokens, enabling deeper insights into effective data characteristics. Our Influence-based Reasoning Attribution (Infra) uncovers nontrivial cross-domain effects across math and coding tasks: high-difficulty math examples improve both math and code reasoning, while low-difficulty code tasks most effectively benefit code reasoning. Based on these findings, we introduce a simple yet effective dataset reweighting strategy by flipping task difficulty, which doubles AIME24 accuracy from 10% to 20% and boosts LiveCodeBench accuracy from 33.8% to 35.3% for Qwen2.5-7B-Instruct. Moreover, our fine-grained attribution reveals that the sequence-level exploratory behaviors enhance reasoning performance in both math and code, and the token-level influence patterns are distinct for math and code reasoning: the former prefers natural language logic connectors and the latter emphasizes structural syntax.

Article 1371

Title@2025-05-26 (1): SaSi: A Self-augmented and Self-interpreted Deep Learning Approach for Few-shot Cryo-ET Particle Detection

Title: SaSi: A Self-augmented and Self-interpreted Deep Learning Approach for Few-shot Cryo-ET Particle Detection

SaSi: Ein selbst-augmentierter und selbst-interpretierter Deep-Learning-Ansatz für die wenige Schuss Cryo-ET Partikelerkennung

SaSi:对几近的Cryo-ET粒子探测自增强和自我解释的深层学习方法 2505.19948v1

Authors: Gokul Adethya, Bhanu Pratyush Mantha, Tianyang Wang, Xingjian Li, Min Xu

Cryo-electron tomography (cryo-ET) has emerged as a powerful technique for imaging macromolecular complexes in their near-native states. However, the localization of 3D particles in cellular environments still presents a significant challenge due to low signal-to-noise ratios and missing wedge artifacts. Deep learning approaches have shown great potential, but they need huge amounts of data, which can be a challenge in cryo-ET scenarios where labeled data is often scarce. In this paper, we propose a novel Self-augmented and Self-interpreted (SaSi) deep learning approach towards few-shot particle detection in 3D cryo-ET images. Our method builds upon self-augmentation techniques to further boost data utilization and introduces a self-interpreted segmentation strategy for alleviating dependency on labeled data, hence improving generalization and robustness. As demonstrated by experiments conducted on both simulated and real-world cryo-ET datasets, the SaSi approach significantly outperforms existing state-of-the-art methods for particle localization. This research increases understanding of how to detect particles with very few labels in cryo-ET and thus sets a new benchmark for few-shot learning in structural biology.

Article 1372

Title@2025-05-26 (1): Dynamically Learned Test-Time Model Routing in Language Model Zoos with Service Level Guarantees

Title: Dynamically Learned Test-Time Model Routing in Language Model Zoos with Service Level Guarantees

Dynamisch gelerntes Test-Time-Modell-Routing in Sprachmodell Zoos mit Service-Level-Garantien

具有服务级保障的语文示范动物园动态学习测试时间模型运行 2505.19947v1

Authors: Herbert Woisetschläger, Ryan Zhang, Shiqiang Wang, Hans-Arno Jacobsen

Open-weight LLM zoos provide access to numerous high-quality models, but selecting the appropriate model for specific tasks remains challenging and requires technical expertise. Most users simply want factually correct, safe, and satisfying responses without concerning themselves with model technicalities, while inference service providers prioritize minimizing operating costs. These competing interests are typically mediated through service level agreements (SLAs) that guarantee minimum service quality. We introduce MESS+, a stochastic optimization algorithm for cost-optimal LLM request routing while providing rigorous SLA compliance guarantees. MESS+ learns request satisfaction probabilities of LLMs in real-time as users interact with the system, based on which model selection decisions are made by solving a per-request optimization problem. Our algorithm includes a novel combination of virtual queues and request satisfaction prediction, along with a theoretical analysis of cost optimality and constraint satisfaction. Across a wide range of state-of-the-art LLM benchmarks, MESS+ achieves an average of 2x cost savings compared to existing LLM routing techniques.
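
The interplay of a virtual queue and per-request model selection can be sketched with a generic drift-plus-penalty rule. This is a simplified illustration under stated assumptions, not the MESS+ algorithm itself: the scoring rule, the trade-off weight `v`, the cost and satisfaction numbers, and all function names are hypothetical.

```python
def choose_model(costs, sat_probs, queue, v):
    """Pick the model minimizing v * cost - queue * predicted satisfaction.
    A long virtual queue (accumulated SLA debt) pushes toward better models."""
    return min(range(len(costs)), key=lambda m: v * costs[m] - queue * sat_probs[m])


def update_queue(queue, target, satisfied):
    """Virtual queue: grows by the SLA target each request and shrinks by one
    whenever the request was satisfied; it never drops below zero."""
    return max(queue + target - (1.0 if satisfied else 0.0), 0.0)


# Hypothetical zoo of two models: cheap but weaker, expensive but stronger.
costs = [1.0, 5.0]
sat_probs = [0.6, 0.95]

pick_when_queue_empty = choose_model(costs, sat_probs, queue=0.0, v=1.0)
pick_when_queue_long = choose_model(costs, sat_probs, queue=100.0, v=1.0)
```

With an empty queue the rule picks the cheapest model; once unmet satisfaction has accumulated, it switches to the model with the higher predicted satisfaction. In this sketch a larger `v` weights cost more heavily relative to the queue.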

Article 1373

Title@2025-05-26 (1): Inverse Q-Learning Done Right: Offline Imitation Learning in $Q^π$-Realizable MDPs

Title: Inverse Q-Learning Done Right: Offline Imitation Learning in $Q^π$-Realizable MDPs

Inverse Q-Learning Done Right: Offline-Imitation Lernen in $Q^π$-realisierbaren MDPs

逆向Q学习的正确做法:$Q^π$-可实现MDP中的离线模仿学习 2505.19946v1

Authors: Antoine Moulin, Gergely Neu, Luca Viano

We study the problem of offline imitation learning in Markov decision processes (MDPs), where the goal is to learn a well-performing policy given a dataset of state-action pairs generated by an expert policy. Complementing a recent line of work on this topic that assumes the expert belongs to a tractable class of known policies, we approach this problem from a new angle and leverage a different type of structural assumption about the environment. Specifically, for the class of linear $Q^\pi$-realizable MDPs, we introduce a new algorithm called saddle-point offline imitation learning (SPOIL), which is guaranteed to match the performance of any expert up to an additive error $\varepsilon$ with access to $\mathcal{O}(\varepsilon^{-2})$ samples. Moreover, we extend this result to possibly non-linear $Q^\pi$-realizable MDPs at the cost of a worse sample complexity of order $\mathcal{O}(\varepsilon^{-4})$. Finally, our analysis suggests a new loss function for training critic networks from expert data in deep imitation learning. Empirical evaluations on standard benchmarks demonstrate that the neural net implementation of SPOIL is superior to behavior cloning and competitive with state-of-the-art algorithms.
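
To make the gap between the two sample-complexity rates concrete, here is a short worked example. It ignores the constants and other problem-dependent factors hidden by the $\mathcal{O}$-notation, which is an assumption made purely for illustration.

```latex
\[
\varepsilon = 0.1:\qquad
\varepsilon^{-2} = 10^{2} = 100,
\qquad
\varepsilon^{-4} = 10^{4} = 10\,000 .
\]
\[
\varepsilon \mapsto \varepsilon/2:\qquad
(\varepsilon/2)^{-2} = 4\,\varepsilon^{-2},
\qquad
(\varepsilon/2)^{-4} = 16\,\varepsilon^{-4} .
\]
```

Under these simplifications, halving the target error multiplies the sample requirement by 4 in the linear case and by 16 in the non-linear case.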

Article 1374

Title: RefinedFields: Radiance Fields Refinement for Planar Scene Representations

Verfeinerte Felder: Strahlungsfelder Verfeinerung für planare Szenendarstellungen

精炼田地: 辐射田地 2312.00639v4

Authors: Karim Kassab, Antoine Schnepf, Jean-Yves Franceschi, Laurent Caraffa, Jeremie Mary, Valérie Gouet-Brunet

Planar scene representations have recently attracted increased interest for modeling scenes from images, as their lightweight planar structure enables compatibility with image-based models. Notably, K-Planes have gained particular attention as they extend planar scene representations to support in-the-wild scenes, in addition to object-level scenes. However, their visual quality has recently lagged behind that of state-of-the-art techniques. To reduce this gap, we propose RefinedFields, a method that leverages pre-trained networks to refine K-Planes scene representations via optimization guidance using an alternating training procedure. We carry out extensive experiments and verify the merit of our method on synthetic data and real tourism photo collections. RefinedFields enhances rendered scenes with richer details and improves upon its base representation on the task of novel view synthesis. Our project page can be found at https://refinedfields.github.io.

Article 1375

Title@2025-05-26 (1): Can Visual Encoder Learn to See Arrows?

Title: Can Visual Encoder Learn to See Arrows?

Kann Visual Encoder lernen, Pfeile zu sehen?

视觉编码器能学会看到箭头吗 ? 2505.19944v1

Authors: Naoyuki Terashita, Yusuke Tozaki, Hideaki Omote, Congkha Nguyen, Ryosuke Nakamoto, Yuta Koreeda, Hiroaki Ozaki

A diagram is a visual representation of relationships illustrated with edges (lines or arrows), and is widely used in industrial and scientific communication. Although recognizing diagrams is essential for vision language models (VLMs) to comprehend domain-specific knowledge, recent studies reveal that many VLMs fail to identify edges in images. We hypothesize that these failures stem from an over-reliance on textual and positional biases, preventing VLMs from learning explicit edge features. Based on this idea, we empirically investigate whether the image encoder in VLMs can learn edge representation through training on a diagram dataset in which edges are biased by neither textual nor positional information. To this end, we conduct contrastive learning on an artificially generated diagram–caption dataset to train an image encoder and evaluate its diagram-related features on three tasks: probing, image retrieval, and captioning. Our results show that the finetuned model outperforms pretrained CLIP in all tasks and surpasses zero-shot GPT-4o and LLaVA-Mistral in the captioning task. These findings confirm that eliminating textual and positional biases fosters accurate edge recognition in VLMs, offering a promising path for advancing diagram understanding.

Article 1376

Title@2025-05-26 (1): Beyond Freezing: Sparse Tuning Enhances Plasticity in Continual Learning with Pre-Trained Models

Title: Beyond Freezing: Sparse Tuning Enhances Plasticity in Continual Learning with Pre-Trained Models

Beyond Freezing: Sparse Tuning verbessert Plastizität im kontinuierlichen Lernen mit vortrainierten Modellen

超出冻结范围:在继续学习过程中,采用培训前模式,粗略的加注可增强可塑性 2505.19943v1

Authors: Huan Zhang, Fan Lyu, Shuyu Dong, Shenghua Fan, Yujin Zheng, Dingwen Wang

Continual Learning with Pre-trained Models holds great promise for efficient adaptation across sequential tasks. However, most existing approaches freeze PTMs and rely on auxiliary modules like prompts or adapters, limiting model plasticity and leading to suboptimal generalization when facing significant distribution shifts. While full fine-tuning can improve adaptability, it risks disrupting crucial pre-trained knowledge. In this paper, we propose Mutual Information-guided Sparse Tuning (MIST), a plug-and-play method that selectively updates a small subset of PTM parameters, less than 5%, based on sensitivity to mutual information objectives. MIST enables effective task-specific adaptation while preserving generalization. To further reduce interference, we introduce strong sparsity regularization by randomly dropping gradients during tuning, resulting in fewer than 0.5% of parameters being updated per step. Applied before standard freeze-based methods, MIST consistently boosts performance across diverse continual learning benchmarks. Experiments show that integrating our method into multiple baselines yields significant performance gains. Our code is available at https://github.com/zhwhu/MIST.
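
The idea of updating only a small, score-selected subset of parameters and randomly dropping gradients can be sketched as follows. This is a minimal illustration, not the MIST implementation: the `scores` argument stands in for the mutual-information sensitivity described above and is assumed to be given, and the function name, fractions, and learning rate are hypothetical.

```python
import random


def sparse_update(params, grads, scores, keep_frac, drop_prob, lr, rng):
    """Apply one gradient step to the top `keep_frac` of parameters by score,
    skipping each selected gradient with probability `drop_prob`."""
    k = max(1, int(len(params) * keep_frac))
    selected = sorted(range(len(params)), key=lambda i: scores[i], reverse=True)[:k]
    updated = list(params)
    for i in selected:
        if rng.random() >= drop_prob:  # keep this gradient
            updated[i] -= lr * grads[i]
    return updated


params = [1.0, 1.0, 1.0, 1.0]
grads = [1.0, 1.0, 1.0, 1.0]
scores = [0.1, 0.9, 0.5, 0.2]  # hypothetical sensitivity scores

no_drop = sparse_update(params, grads, scores, keep_frac=0.5,
                        drop_prob=0.0, lr=0.1, rng=random.Random(0))
all_drop = sparse_update(params, grads, scores, keep_frac=0.5,
                         drop_prob=1.0, lr=0.1, rng=random.Random(0))
```

With `drop_prob=0.0` only the two highest-scoring parameters move; with `drop_prob=1.0` nothing is updated. Intermediate values give the per-step sparsity that the abstract describes as a regularizer.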

Article 1377

Title@2025-05-26 (1): Task-Oriented Low-Label Semantic Communication With Self-Supervised Learning

Title: Task-Oriented Low-Label Semantic Communication With Self-Supervised Learning

Aufgabenorientierte labelarme semantische Kommunikation mit selbstüberwachtem Lernen

以任务为导向的低标签低标签语义交流与自控学习 2505.19940v1

Authors: Run Gu, Wei Xu, Zhaohui Yang, Dusit Niyato, Aylin Yener

Task-oriented semantic communication enhances transmission efficiency by conveying semantic information rather than exact messages. Deep learning (DL)-based semantic communication can effectively cultivate the essential semantic knowledge for semantic extraction, transmission, and interpretation by leveraging massive labeled samples for downstream task training. In this paper, we propose a self-supervised learning-based semantic communication framework (SLSCom) to enhance task inference performance, particularly in scenarios with limited access to labeled samples. Specifically, we develop a task-relevant semantic encoder using unlabeled samples, which can be collected by devices in real-world edge networks. To facilitate task-relevant semantic extraction, we introduce self-supervision for learning contrastive features and formulate the information bottleneck (IB) problem to balance the tradeoff between the informativeness of the extracted features and task inference performance. Given the computational challenges of the IB problem, we devise a practical and effective solution by employing self-supervised classification and reconstruction pretext tasks. We further propose efficient joint training methods to enhance end-to-end inference accuracy over wireless channels, even with few labeled samples. We evaluate the proposed framework on image classification tasks over multipath wireless channels. Extensive simulation results demonstrate that SLSCom significantly outperforms conventional digital coding methods and existing DL-based approaches across varying labeled data set sizes and SNR conditions, even when the unlabeled samples are irrelevant to the downstream tasks.

Article 1378

Title@2025-05-26 (1): Efficient Time Series Processing for Transformers and State-Space Models through Token Merging

Title: Efficient Time Series Processing for Transformers and State-Space Models through Token Merging

Effiziente Zeitreihenverarbeitung für Transformatoren und State-Space-Modelle durch Token Merging

通过 Token 合并对变形器和国家空间模型的有效时间序列处理 2405.17951v2

Authors: Leon Götz, Marcel Kollovieh, Stephan Günnemann, Leo Schwinn

Despite recent advances in subquadratic attention mechanisms or state-space models, processing long token sequences still imposes significant computational requirements. Token merging has emerged as a solution to increase computational efficiency in computer vision architectures. In this work, we perform the first investigations of token merging in time series analysis on both transformers and state-space models. We further introduce local merging, a domain-specific token merging algorithm that selectively combines tokens within a local neighborhood, achieving two major benefits: a) Local merging can adjust its computational complexity from quadratic to linear based on the neighborhood size to effectively scale to long sequences; b) Local merging is the first causal merging scheme enabling token merging in transformer decoders. Further, we identify spectral properties of the input data that reliably predict the potential benefits of local merging without requiring evaluation on downstream tasks. Our comprehensive empirical evaluation demonstrates that local merging offers substantial efficiency gains with minimal impact on accuracy, achieving up to 5400% acceleration on the recently proposed Chronos foundation model.
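The idea of merging tokens only within a local neighborhood can be illustrated with a minimal sketch. It assumes a neighborhood of size one (each even-indexed token is compared only with its right-hand neighbor), cosine similarity as the matching score, and averaging as the merge rule. The function names `cosine` and `local_merge` are hypothetical, and this is not the authors' implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def local_merge(tokens, r):
    """Merge the r most similar adjacent (even, odd) token pairs by averaging.

    Only neighbors are compared, so the number of similarity evaluations
    grows linearly with the sequence length (an assumption of this sketch).
    """
    pairs = [
        (cosine(tokens[i], tokens[i + 1]), i)
        for i in range(0, len(tokens) - 1, 2)
    ]
    chosen = {i for _, i in sorted(pairs, reverse=True)[:r]}
    merged = []
    i = 0
    while i < len(tokens):
        if i in chosen:
            # Replace the pair (i, i + 1) with its element-wise mean.
            merged.append([(x + y) / 2 for x, y in zip(tokens[i], tokens[i + 1])])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


sequence = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
shorter = local_merge(sequence, 1)  # the first two tokens are identical and get merged
```

Because a token is only ever merged with its immediate neighbor, the temporal order of the sequence is preserved. A fully causal variant would additionally need to restrict which tokens may be merged at each decoding step, which this sketch does not attempt.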

Article 1379

Title@2025-05-26 (1): Constructing a BPE Tokenization DFA

Title: Constructing a BPE Tokenization DFA

Aufbau einer BPE Tokenization DFA

正在构建 BPE 磁盘化 DFA 2405.07671v2

Authors: Martin Berglund, Willeke Martens, Brink van der Merwe

Many natural language processing systems operate over tokenizations of text to address the open-vocabulary problem. In this paper, we give and analyze an algorithm for the efficient construction of deterministic finite automata (DFA) designed to operate directly on tokenizations produced by the popular byte pair encoding (BPE) technique. This makes it possible to apply many existing techniques and algorithms to the tokenized case, such as pattern matching, equivalence checking of tokenization dictionaries, and composing tokenized languages in various ways. The construction preserves some key properties of the automaton, and we use this to establish asymptotic bounds on the state complexity of the automata that result. Finally, we demonstrate how to construct an input-deterministic (subsequential) string-to-string transducer which precisely describes the relationship between strings and their correct tokenizations.
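For readers unfamiliar with BPE, the tokenization that such an automaton has to recognize can be sketched procedurally. The snippet below assumes one common formulation: start from single characters and repeatedly apply the highest-priority applicable merge rule to all of its occurrences, left to right. The function name `bpe_tokenize` and the toy merge list are hypothetical, and this is the procedural tokenizer, not the paper's DFA construction.

```python
def bpe_tokenize(text, merges):
    """Tokenize text with an ordered list of BPE merge rules.

    merges is a list of (left, right) pairs; earlier pairs have higher priority.
    """
    rank = {pair: i for i, pair in enumerate(merges)}
    tokens = list(text)
    while len(tokens) > 1:
        # Find the highest-priority rule that applies anywhere in the sequence.
        applicable = [rank[p] for p in zip(tokens, tokens[1:]) if p in rank]
        if not applicable:
            break
        left, right = merges[min(applicable)]
        out, j = [], 0
        while j < len(tokens):
            if j + 1 < len(tokens) and tokens[j] == left and tokens[j + 1] == right:
                out.append(left + right)
                j += 2
            else:
                out.append(tokens[j])
                j += 1
        tokens = out
    return tokens


toy_merges = [("a", "b"), ("ab", "c")]
result = bpe_tokenize("abcab", toy_merges)
```

Not every sequence of dictionary tokens that concatenates to a string is its correct tokenization: for "abcab" the procedure above yields "abc" followed by "ab", while "ab", "c", "ab" is a valid split into tokens but not the BPE output. Distinguishing the two is what makes operating directly on tokenizations non-trivial.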

Article 1380

Title@2025-05-26 (1): Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Title: Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Modellierung von Multi-Task-Modellen, die als adaptives projektives Gradientenabsinken zusammenwachsen

模拟多任务模式模型合并为适应性预测梯度下层 2501.01230v3

Authors: Yongxian Wei, Anke Tang, Li Shen, Zixuan Hu, Chun Yuan, Xiaochun Cao

Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. Existing methods attempt to alleviate task conflicts by sparsifying task vectors or promoting orthogonality among them. However, they overlook the fundamental target of model merging: the merged model performs as closely as possible to task-specific models on respective tasks. We find these methods inevitably discard task-specific information that, while causing conflicts, is crucial for performance. Based on our findings, we frame model merging as a constrained optimization problem ($\textit{i.e.}$, minimizing the gap between the merged model and individual models, subject to the constraint of retaining shared knowledge) and solve it via adaptive projective gradient descent. Specifically, we align the merged model with individual models by decomposing and reconstituting the loss function, alleviating conflicts through $\textit{data-free}$ optimization of task vectors. To retain shared knowledge, we optimize this objective by projecting gradients within a $\textit{shared subspace}$ spanning all tasks. Moreover, we view merging coefficients as adaptive learning rates and propose a task-aware, training-free strategy. Experiments show that our plug-and-play approach consistently outperforms previous methods, achieving state-of-the-art results across diverse architectures and tasks in both vision and NLP domains.

Article 1381

Title@2025-05-26 (1): Logic Gate Neural Networks are Good for Verification

Title: Logic Gate Neural Networks are Good for Verification

Logic Gate Neural Networks sind gut für die Verifikation

逻辑门神经网络有利于核查 2505.19932v1

Authors: Fabian Kresse, Emily Yu, Christoph H. Lampert, Thomas A. Henzinger

Learning-based systems are increasingly deployed across various domains, yet the complexity of traditional neural networks poses significant challenges for formal verification. Unlike conventional neural networks, learned Logic Gate Networks (LGNs) replace multiplications with Boolean logic gates, yielding a sparse, netlist-like architecture that is inherently more amenable to symbolic verification, while still delivering promising performance. In this paper, we introduce a SAT encoding for verifying global robustness and fairness in LGNs. We evaluate our method on five benchmark datasets, including a newly constructed 5-class variant, and find that LGNs are both verification-friendly and maintain strong predictive performance.

Article 1382

Title@2025-05-26 (1): JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs

Title: JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs

JailbreakRadar: Umfassende Bewertung von Jailbreak Attacken gegen LLMs

JailbreakRadar:全面评估对LLMs的越狱袭击 2402.05668v3

Authors: Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang

Jailbreak attacks aim to bypass the safeguards of LLMs. While researchers have proposed different jailbreak attacks in depth, they have done so in isolation, either with unaligned settings or by comparing a limited range of methods. To fill this gap, we present a large-scale evaluation of various jailbreak attacks. We collect 17 representative jailbreak attacks, summarize their features, and establish a novel jailbreak attack taxonomy. Then we conduct comprehensive measurement and ablation studies across nine aligned LLMs on 160 forbidden questions from 16 violation categories. Also, we test jailbreak attacks under eight advanced defenses. Based on our taxonomy and experiments, we identify some important patterns; for example, heuristic-based attacks can achieve high attack success rates but are easily mitigated by defenses, which makes them less practical. Our study offers valuable insights for future research on jailbreak attacks and defenses. We hope our work can help the community avoid incremental work and serve as an effective benchmark tool for practitioners.

Article 1383

Title@2025-05-26 (1): Semantic-Aware Resource Management for C-V2X Platooning via Multi-Agent Reinforcement Learning

Title: Semantic-Aware Resource Management for C-V2X Platooning via Multi-Agent Reinforcement Learning

Semantic-Aware Ressourcenmanagement für C-V2X Platooning über Multi-Agent Verstärkungslernen

通过多机构强化学习进行 C-V2X 等离子处理的语义软件资源管理 2411.04672v2

Authors: Wenjun Zhang, Qiong Wu, Pingyi Fan, Kezhi Wang, Nan Cheng, Wen Chen, Khaled B. Letaief

Semantic communication transmits the extracted features of information rather than raw data, significantly reducing redundancy, which is crucial for addressing spectrum and energy challenges in 6G networks. In this paper, we introduce semantic communication into a cellular vehicle-to-everything (C-V2X)-based autonomous vehicle platoon system for the first time, aiming to achieve efficient management of communication resources in a dynamic environment. Firstly, we construct a mathematical model for semantic communication in platoon systems, in which the DeepSC model and MU-DeepSC model are used to semantically encode and decode unimodal and multi-modal data, respectively. Then, we propose the quality of experience (QoE) metric based on semantic similarity and semantic rate. Meanwhile, we consider the success rate of semantic information transmission (SRS) metric to ensure the fairness of channel resource allocation. Next, the optimization problem is posed with the aim of maximizing the QoE in vehicle-to-vehicle (V2V) links while improving SRS. To solve this mixed-integer nonlinear programming (MINLP) problem and adapt to time-varying channel conditions, we propose a distributed semantic-aware multi-modal resource allocation (SAMRA) algorithm based on multi-agent reinforcement learning (MARL), referred to as SAMRAMARL. The algorithm can dynamically allocate channels and power and determine semantic symbol length based on the contextual importance of the transmitted information, ensuring efficient resource utilization. Finally, extensive simulations demonstrate that SAMRAMARL outperforms existing methods, achieving significant gains in QoE, SRS, and communication delay in C-V2X platooning scenarios.

Article 1384

Title@2025-05-26 (1): Cellwise and Casewise Robust Covariance in High Dimensions

Title: Cellwise and Casewise Robust Covariance in High Dimensions

Cellwise und Casewise Robuste Kovarianz in hohen Abmessungen

高维度的单元格和大小写常量 2505.19925v1

Authors: Fabio Centofanti, Mia Hubert, Peter J. Rousseeuw

The sample covariance matrix is a cornerstone of multivariate statistics, but it is highly sensitive to outliers. These can be casewise outliers, such as cases belonging to a different population, or cellwise outliers, which are deviating cells (entries) of the data matrix. Recently some robust covariance estimators have been developed that can handle both types of outliers, but their computation is only feasible up to at most 20 dimensions. To remedy this we propose the cellRCov method, a robust covariance estimator that simultaneously handles casewise outliers, cellwise outliers, and missing data. It relies on a decomposition of the covariance on principal and orthogonal subspaces, leveraging recent work on robust PCA. It also employs a ridge-type regularization to stabilize the estimated covariance matrix. We establish some theoretical properties of cellRCov, including its casewise and cellwise influence functions as well as consistency and asymptotic normality. A simulation study demonstrates the superior performance of cellRCov in contaminated and missing data scenarios. Furthermore, its practical utility is illustrated in a real-world application to anomaly detection. We also construct and illustrate the cellRCCA method for robust and regularized canonical correlation analysis.

Article 1385

Title@2025-05-26 (1): Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL

Title: Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL

Bellman-Updates vertrauen lernen: Selektive State-Adaptive Regularisierung für Offline RL

学习信任 Bellman 更新信息: 选择性国家适应性离线转线常规化 2505.19923v1

Authors: Qin-Wen Luo, Ming-Kun Xie, Ye-Wen Wang, Sheng-Jun Huang

Offline reinforcement learning (RL) aims to learn an effective policy from a static dataset. To alleviate extrapolation errors, existing studies often uniformly regularize the value function or policy updates across all states. However, due to substantial variations in data quality, the fixed regularization strength often leads to a dilemma: Weak regularization strength fails to address extrapolation errors and value overestimation, while strong regularization strength shifts policy learning toward behavior cloning, impeding potential performance enabled by Bellman updates. To address this issue, we propose the selective state-adaptive regularization method for offline RL. Specifically, we introduce state-adaptive regularization coefficients to trust state-level Bellman-driven results, while selectively applying regularization on high-quality actions, aiming to avoid performance degradation caused by tight constraints on low-quality actions. By establishing a connection between the representative value regularization method, CQL, and explicit policy constraint methods, we effectively extend selective state-adaptive regularization to these two mainstream offline RL approaches. Extensive experiments demonstrate that the proposed method significantly outperforms the state-of-the-art approaches in both offline and offline-to-online settings on the D4RL benchmark.

Article 1386

Title@2025-05-26 (1): (Un)supervised Learning of Maximal Lyapunov Functions

Title: (Un)supervised Learning of Maximal Lyapunov Functions

(Un)überwachtes Lernen von maximalen Lyapunov-Funktionen

(无受监督的学习 Maximal Lyapunov 函数的学习 2408.17246v2

Authors: Matthieu Barreau, Nicola Bastianello

In this paper, we address the problem of discovering maximal Lyapunov functions, as a means of determining the region of attraction of a dynamical system. To this end, we design a novel neural network architecture, which we prove to be a universal approximator of (maximal) Lyapunov functions. The architecture combines a local quadratic approximation with the output of a neural network, which models global higher-order terms in the Taylor expansion. We formulate the problem of training the Lyapunov function as an unsupervised optimization problem with dynamical constraints, which can be solved leveraging techniques from physics-informed learning. We propose and analyze a tailored training algorithm, based on the primal-dual algorithm, that can efficiently solve the problem. Additionally, we show how the learning problem formulation can be adapted to integrate data, when available. We apply the proposed approach to different classes of systems, showing that it matches or outperforms state-of-the-art alternatives in the accuracy of the approximated regions of attraction.

Article 1387

Title@2025-05-26 (1): A Probabilistic Model for Non-Contrastive Learning

Title: A Probabilistic Model for Non-Contrastive Learning

Ein probabilistisches Modell für nicht kontrastives Lernen

非交流性学习概率模型 2501.13031v2

Authors: Maximilian Fleissner, Pascal Esser, Debarghya Ghoshdastidar

Self-supervised learning (SSL) aims to find meaningful representations from unlabeled data by encoding semantic similarities through data augmentations. Despite its current popularity, theoretical insights about SSL are still scarce. For example, it is not yet known whether commonly used SSL loss functions can be related to a statistical model, much in the same way as OLS, generalized linear models, or PCA naturally emerge as maximum likelihood estimates of an underlying generative process. In this short paper, we consider a latent variable statistical model for SSL that exhibits an interesting property: depending on the informativeness of the data augmentations, the MLE of the model either reduces to PCA or approaches a simple non-contrastive loss. We analyze the model and also empirically illustrate our findings.

Article 1388

Title@2025-05-26 (1): APE: A Data-Centric Benchmark for Efficient LLM Adaptation in Text Summarization

Title: APE: A Data-Centric Benchmark for Efficient LLM Adaptation in Text Summarization

APE: Ein datenzentrischer Benchmark für effiziente LLM-Anpassung in der Textzusammenfassung

APE: 文本摘要中高效LLM适应数据中心基准 2505.19912v1

Authors: Javier Marín

We present Adjacent Possible Exploration (APE), a simple yet effective method for adapting large language models to specific tasks using minimal computational resources. Unlike traditional fine-tuning that requires extensive compute, APE iteratively fine-tunes models on small, carefully selected data batches (200 examples), retaining only improvements. On news summarization, APE achieves 40 percent BLEU improvement using just a T4 GPU in 60 minutes, matching or exceeding more complex methods like LoRA while remaining conceptually simple. Our approach is particularly valuable for researchers and practitioners with limited computational resources. We provide open-source code and demonstrate APE’s effectiveness through both automatic metrics and human evaluation. While inspired by evolutionary theory’s “adjacent possible”, APE’s core insight has a very practical application: small, iterative data perturbations can efficiently guide LLMs toward task-specific performance without expensive retraining.

Article 1389

Title@2025-05-26 (1): Inverse Problem Sampling in Latent Space Using Sequential Monte Carlo

Title: Inverse Problem Sampling in Latent Space Using Sequential Monte Carlo

Inverse Problem-Sampling im Latent Space mit Sequential Monte Carlo

利用定序蒙特卡洛在低层空间进行逆向问题抽样 2502.05908v2

Authors: Idan Achituve, Hai Victor Habi, Amir Rosenfeld, Arnon Netzer, Idit Diamant, Ethan Fetaya

In image processing, solving inverse problems is the task of finding plausible reconstructions of an image that was corrupted by some (usually known) degradation operator. Commonly, this process is done using a generative image model that can guide the reconstruction towards solutions that appear natural. The success of diffusion models over the last few years has made them a leading candidate for this task. However, the sequential nature of diffusion models makes this conditional sampling process challenging. Furthermore, since diffusion models are often defined in the latent space of an autoencoder, the encoder-decoder transformations introduce additional difficulties. To address these challenges, we suggest a novel sampling method based on sequential Monte Carlo (SMC) in the latent space of diffusion models. We name our method LD-SMC. We define a generative model for the data using additional auxiliary observations and perform posterior inference with SMC sampling based on a backward diffusion process. Empirical evaluations on ImageNet and FFHQ show the benefits of LD-SMC over competing methods in various inverse problem tasks and especially in challenging inpainting tasks.

Article 1390

Title@2025-05-26 (1): ESLM: Risk-Averse Selective Language Modeling for Efficient Pretraining

Title: ESLM: Risk-Averse Selective Language Modeling for Efficient Pretraining

ESLM: Risiko-Averse Selective Language Modeling für effizientes Vortraining

ESLM: 有效培训前风险-反风险选择语言建模 2505.19893v1

Authors: Melis Ilayda Bal, Volkan Cevher, Michael Muehlebach

Large language model pretraining is compute-intensive, yet many tokens contribute marginally to learning, resulting in inefficiency. We introduce Efficient Selective Language Modeling (ESLM), a risk-aware algorithm that improves training efficiency and distributional robustness by performing online token-level batch selection. ESLM leverages per-token statistics (e.g., entropy or loss) and applies value-at-risk thresholding to retain only the most informative tokens per batch. This data-centric mechanism reshapes the training loss, prioritizing high-risk tokens and eliminating redundant gradient computation. We frame ESLM as a bilevel game: the model competes with a masking adversary that selects worst-case token subsets under a constrained thresholding rule. In the loss-based setting, ESLM recovers conditional value-at-risk loss minimization, providing a principled connection to distributionally robust optimization. We extend our approach to Ada-ESLM, which adaptively tunes the selection confidence during training. Experiments on GPT-2 pretraining show that ESLM significantly reduces training FLOPs while maintaining or improving both perplexity and downstream performance compared to baselines. Our approach also scales across model sizes and pretraining corpora, and integrates naturally with knowledge distillation.
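The value-at-risk thresholding described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: per-token losses are given as a plain list, the threshold is the empirical quantile at a confidence level `alpha`, and the retained losses are simply averaged. The names `var_threshold` and `selective_loss` are hypothetical, and the actual ESLM procedure (including the entropy-based variant and Ada-ESLM) may differ.

```python
import math


def var_threshold(losses, alpha):
    """Empirical value-at-risk: the alpha-quantile of the per-token losses."""
    ordered = sorted(losses)
    k = min(len(ordered) - 1, math.floor(alpha * len(ordered)))
    return ordered[k]


def selective_loss(losses, alpha):
    """Average only the losses at or above the VaR threshold.

    Returns the mean of the retained losses and the number of retained tokens.
    The mean over this upper tail is an empirical conditional value-at-risk.
    """
    tau = var_threshold(losses, alpha)
    kept = [loss for loss in losses if loss >= tau]
    return sum(kept) / len(kept), len(kept)


batch_losses = [1.0, 4.0, 2.0, 3.0]
tail_mean, n_kept = selective_loss(batch_losses, alpha=0.5)
```

In a training loop, only the retained tokens would contribute gradients, which is where the compute savings come from; a higher `alpha` keeps fewer, higher-loss tokens.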

Article 1391

Title@2025-05-26 (1): APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs

Title: APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs

APB: Beschleunigen des verteilten Long-Context-Schlussfolgerungens durch Übergeben von komprimierten Kontextblöcken über GPUs

APB: 通过横跨 GPU 传递压缩的上下文区块加速分布式长文字推文 2502.12085v2

Authors: Yuxiang Huang, Mingye Li, Xu Han, Chaojun Xiao, Weilin Zhao, Sun Ao, Hao Zhou, Jie Zhou, Zhiyuan Liu, Maosong Sun

While long-context inference is crucial for advancing large language model (LLM) applications, its prefill speed remains a significant bottleneck. Current approaches, including sequence parallelism strategies and compute reduction through approximate attention mechanisms, still fall short of delivering optimal inference efficiency. This hinders scaling the inputs to longer sequences and processing long-context queries in a timely manner. To address this, we introduce APB, an efficient long-context inference framework that leverages multi-host approximate attention to enhance prefill speed by reducing compute and enhancing parallelism simultaneously. APB introduces a communication mechanism for essential key-value pairs within a sequence parallelism framework, enabling a faster inference speed while maintaining task performance. We implement APB by incorporating a tailored FlashAttn kernel alongside optimized distribution strategies, supporting diverse models and parallelism configurations. APB achieves speedups of up to 9.2x, 4.2x, and 1.6x compared with FlashAttn, RingAttn, and StarAttn, respectively, without any observable task performance degradation. We provide the implementation and experiment code of APB in https://github.com/thunlp/APB.

Article 1392

Title@2025-05-26 (1): A Langevin sampling algorithm inspired by the Adam optimizer

Title: A Langevin sampling algorithm inspired by the Adam optimizer

Ein Langevin-Sampling-Algorithmus, inspiriert vom Adam-Optimierer

由亚当优化器启发的Langevin取样算法 2504.18911v2

Authors: Benedict Leimkuhler, René Lohmann, Peter Whalley

We present a framework for adaptive-stepsize MCMC sampling based on time-rescaled Langevin dynamics, in which the stepsize variation is dynamically driven by an additional degree of freedom. Our approach augments the phase space by an additional variable which in turn defines a time reparameterization. The use of an auxiliary relaxation equation allows accumulation of a moving average of a local monitor function and provides for precise control of the timestep while circumventing the need to modify the drift term in the physical system. Our algorithm is straightforward to implement and can be readily combined with any off-the-peg fixed-stepsize Langevin integrator. As a particular example, we consider control of the stepsize by monitoring the norm of the log-posterior gradient, which takes inspiration from the Adam optimizer, the stepsize being automatically reduced in regions of steep change of the log posterior and increased on plateaus, improving numerical stability and convergence speed. As in Adam, the stepsize variation depends on the recent history of the gradient norm, which enhances stability and improves accuracy compared to more immediate control approaches. We demonstrate the potential benefit of this method–both in accuracy and in stability–in numerical experiments including Neal’s funnel and a Bayesian neural network for classification of MNIST data.

Article 1393

Title@2025-05-26 (1): Learning mechanical systems from real-world data using discrete forced Lagrangian dynamics

Title: Learning mechanical systems from real-world data using discrete forced Lagrangian dynamics

Mechanische Systeme aus realen Daten mit diskreter, erzwungener Lagrange-Dynamik lernen

使用离散强制拉格朗江动力从真实世界数据中学习机械系统 2505.20370v1

Authors: Martine Dyring Hansen, Elena Celledoni, Benjamin Kwanen Tapley

We introduce a data-driven method for learning the equations of motion of mechanical systems directly from position measurements, without requiring access to velocity data. This is particularly relevant in system identification tasks where only positional information is available, such as motion capture, pixel data or low-resolution tracking. Our approach takes advantage of the discrete Lagrange-d’Alembert principle and the forced discrete Euler-Lagrange equations to construct a physically grounded model of the system’s dynamics. We decompose the dynamics into conservative and non-conservative components, which are learned separately using feed-forward neural networks. In the absence of external forces, our method reduces to a variational discretization of the action principle naturally preserving the symplectic structure of the underlying Hamiltonian system. We validate our approach on a variety of synthetic and real-world datasets, demonstrating its effectiveness compared to baseline methods. In particular, we apply our model to (1) measured human motion data and (2) latent embeddings obtained via an autoencoder trained on image sequences. We demonstrate that we can faithfully reconstruct and separate both the conservative and forced dynamics, yielding interpretable and physically consistent predictions.

Article 1394

Title@2025-05-26 (1): Single-Agent vs. Multi-Agent LLM Strategies for Automated Student Reflection Assessment

Title: Single-Agent vs. Multi-Agent LLM Strategies for Automated Student Reflection Assessment

Single-Agent vs. Multi-Agent LLM-Strategien für die automatisierte Bewertung von Studentenreflexionen

学生自动反省评估战略 2504.05716v2

Authors: Gen Li, Li Chen, Cheng Tang, Valdemar Švábenský, Daisuke Deguchi, Takayoshi Yamashita, Atsushi Shimada

We explore the use of Large Language Models (LLMs) for automated assessment of open-text student reflections and prediction of academic performance. Traditional methods for evaluating reflections are time-consuming and may not scale effectively in educational settings. In this work, we employ LLMs to transform student reflections into quantitative scores using two assessment strategies (single-agent and multi-agent) and two prompting techniques (zero-shot and few-shot). Our experiments, conducted on a dataset of 5,278 reflections from 377 students over three academic terms, demonstrate that the single-agent with few-shot strategy achieves the highest match rate with human evaluations. Furthermore, models utilizing LLM-assessed reflection scores outperform baselines in both at-risk student identification and grade prediction tasks. These findings suggest that LLMs can effectively automate reflection assessment, reduce educators’ workload, and enable timely support for students who may need additional assistance. Our work emphasizes the potential of integrating advanced generative AI technologies into educational practices to enhance student engagement and academic success.

Article 1395

Title@2025-05-26 (1): Target Specific De Novo Design of Drug Candidate Molecules with Graph Transformer-based Generative Adversarial Networks

Title: Target Specific De Novo Design of Drug Candidate Molecules with Graph Transformer-based Generative Adversarial Networks

Zielspezifisches De Novo-Design von Wirkstoff-Kandidatenmolekülen mit Graph Transformer-basierten Generativen Adversarial-Netzwerken

配有基于图形变形器的成形反转基因网络的药物候选分子具体新设计 2302.07868v7

Authors: Atabey Ünlü, Elif Çevrim, Melih Gökay Yiğit, Ahmet Sarıgün, Hayriye Çelikbilek, Osman Bayram, Deniz Cansen Kahraman, Abdurrahman Olğaç, Ahmet Sureyya Rifaioğlu, Erden Banoğlu, Tunca Doğan

Discovering novel drug candidate molecules is one of the most fundamental and critical steps in drug development. Generative deep learning models, which create synthetic data given a probability distribution, offer a high potential for designing de novo molecules. However, to be utilisable in real-life drug development pipelines, these models should be able to design drug-like and target-centric molecules. In this study, we propose an end-to-end generative system, DrugGEN, for the de novo design of drug candidate molecules that interact with intended target proteins. The proposed method represents molecules as graphs and processes them via a generative adversarial network comprising graph transformer layers. The system is trained using a large dataset of drug-like compounds and target-specific bioactive molecules to design effective inhibitory molecules against the AKT1 protein, which is critically important in developing treatments for various types of cancer. We conducted molecular docking and dynamics to assess the target-centric generation performance of the model, as well as attention score visualisation to examine model interpretability. In parallel, selected compounds were chemically synthesised and evaluated in the context of in vitro enzymatic assays, which identified two bioactive molecules that inhibited AKT1 at low micromolar concentrations. These results indicate that DrugGEN’s de novo molecules have a high potential for interacting with the AKT1 protein at the level of its native ligands. Using the open-access DrugGEN codebase, it is possible to easily train models for other druggable proteins, given a dataset of experimentally known bioactive molecules.

Article 1396

Title@2025-05-26 (1): Risk-Averse Reinforcement Learning with Itakura-Saito Loss

Title: Risk-Averse Reinforcement Learning with Itakura-Saito Loss

Risiko-Averse Verstärkungs-Lernen mit Itakura-Saito-Verlust

以Itakura-Saito损失进行反风险强化学习 2505.16925v2

Authors: Igor Udovichenko, Olivier Croissant, Anita Toleutaeva, Evgeny Burnaev, Alexander Korotin

Risk-averse reinforcement learning finds application in various high-stakes fields. Unlike classical reinforcement learning, which aims to maximize expected returns, risk-averse agents choose policies that minimize risk, occasionally sacrificing expected value. These preferences can be framed through utility theory. We focus on the specific case of the exponential utility function, where one can derive the Bellman equations and employ various reinforcement learning algorithms with few modifications. To address this, we introduce to the broad machine learning community a numerically stable and mathematically sound loss function based on the Itakura-Saito divergence for learning state-value and action-value functions. We evaluate the Itakura-Saito loss function against established alternatives, both theoretically and empirically. In the experimental section, we explore multiple scenarios, some with known analytical solutions, and show that the considered loss function outperforms the alternatives.
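For reference, the standard Itakura-Saito divergence between two positive scalars p and q is p/q - log(p/q) - 1. The sketch below implements only this textbook definition; the function name `itakura_saito` is hypothetical, and how the paper turns the divergence into a loss for state-value and action-value functions is not reproduced here.

```python
import math


def itakura_saito(p, q):
    """Itakura-Saito divergence between two positive scalars.

    It is non-negative, zero only when p == q, and not symmetric in p and q.
    """
    if p <= 0 or q <= 0:
        raise ValueError("both arguments must be strictly positive")
    ratio = p / q
    return ratio - math.log(ratio) - 1.0


same = itakura_saito(2.0, 2.0)      # identical arguments give zero divergence
forward = itakura_saito(2.0, 1.0)   # equals 1 - log(2)
backward = itakura_saito(1.0, 2.0)  # equals log(2) - 0.5, so the divergence is asymmetric
```

The divergence depends only on the ratio p/q, so it is invariant to rescaling both arguments by the same positive factor. That property is relevant when the quantities being compared are exponentials of returns that can span many orders of magnitude.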

Article 1397

Title@2025-05-26 (1): Explaining the role of Intrinsic Dimensionality in Adversarial Training

Title: Explaining the role of Intrinsic Dimensionality in Adversarial Training

Erklärung der Rolle der Intrinsischen Dimensionalität im Adversarial Training

解释内在多面性在相互培训中的作用 2405.17130v2

Authors: Enes Altinisik, Safa Messaoud, Husrev Taha Sencar, Hassan Sajjad, Sanjay Chawla

Adversarial Training (AT) impacts different architectures in distinct ways: vision models gain robustness but face reduced generalization, encoder-based models exhibit limited robustness improvements with minimal generalization loss, and recent work in latent-space adversarial training (LAT) demonstrates that decoder-based models achieve improved robustness by applying AT across multiple layers. We provide the first explanation for these trends by leveraging the manifold conjecture: off-manifold adversarial examples (AEs) enhance robustness, while on-manifold AEs improve generalization. We show that vision and decoder-based models exhibit low intrinsic dimensionality in earlier layers (favoring off-manifold AEs), whereas encoder-based models do so in later layers (favoring on-manifold AEs). Exploiting this property, we introduce SMAAT, which improves the scalability of AT for encoder-based models by perturbing the layer with the lowest intrinsic dimensionality. This reduces the projected gradient descent (PGD) chain length required for AE generation, cutting GPU time by 25-33% while significantly boosting robustness. We validate SMAAT across multiple tasks, including text generation, sentiment classification, safety filtering, and retrieval augmented generation setups, demonstrating superior robustness with comparable generalization to standard training.

Article 1398

Title@2025-05-26 (1): Multi-Graph Inductive Representation Learning for Large-Scale Urban Rail Demand Prediction under Disruptions

Title: Multi-Graph Inductive Representation Learning for Large-Scale Urban Rail Demand Prediction under Disruptions

Multi-Graph Induktives Representationslernen für großflächige Nachfragevorhersage für die Stadtbahn unter Störungen

大型城市铁路需求预测中断下的大型城市铁路需求预测 2408.15619v2

Authors: Dang Viet Anh Nguyen, J. Victor Flensburg, Fabrizio Cerreto, Bianca Pascariu, Paola Pellegrini, Carlos Lima Azevedo, Filipe Rodrigues

With the expansion of cities over time, URT (Urban Rail Transit) networks have also grown significantly. Demand prediction plays an important role in supporting planning, scheduling, fleet management, and other operational decisions. In this study, we propose an Origin-Destination (OD) demand prediction model called Multi-Graph Inductive Representation Learning (mGraphSAGE) for large-scale URT networks under operational uncertainties. Our main contributions are twofold: (i) we enhance prediction results while ensuring scalability for large networks by relying simultaneously on multiple graphs, where each OD pair is a node and each graph captures a distinct OD relationship, such as temporal or spatial correlation; (ii) we show the importance of including operational uncertainties such as train delays and cancellations as inputs in demand prediction for daily operations. The model is validated on three different scales of the URT network in Copenhagen, Denmark. Experimental results show that by leveraging information from neighboring ODs and learning node representations via sampling and aggregation, mGraphSAGE is particularly suitable for OD demand prediction in large-scale URT networks, outperforming reference machine learning methods. Furthermore, during periods with train cancellations and delays, the performance gap between mGraphSAGE and other methods widens in favor of mGraphSAGE compared to normal operating conditions, demonstrating its ability to leverage system reliability information for predicting OD demand under uncertainty.

Article 1399

Title@2025-05-26 (1): Deep Active Inference Agents for Delayed and Long-Horizon Environments

Title: Deep Active Inference Agents for Delayed and Long-Horizon Environments

Tiefe aktive Inferenz-Agenten für verzögerte und lang-Horizonte Umgebungen

延迟和长-Horizon环境的深海活性推断剂 2505.19867v1

Authors: Yavar Taheri Yeganeh, Mohsen Jafari, Andrea Matta

With the recent success of world-model agents, which extend the core idea of model-based reinforcement learning by learning a differentiable model for sample-efficient control across diverse tasks, active inference (AIF) offers a complementary, neuroscience-grounded paradigm that unifies perception, learning, and action within a single probabilistic framework powered by a generative model. Despite this promise, practical AIF agents still rely on accurate immediate predictions and exhaustive planning, a limitation that is exacerbated in delayed environments requiring plans over long horizons of tens to hundreds of steps. Moreover, most existing agents are evaluated on robotic or vision benchmarks which, while natural for biological agents, fall short of real-world industrial complexity. We address these limitations with a generative-policy architecture featuring (i) a multi-step latent transition that lets the generative model predict an entire horizon in a single look-ahead, (ii) an integrated policy network that enables the transition and receives gradients of the expected free energy, (iii) an alternating optimization scheme that updates model and policy from a replay buffer, and (iv) a single gradient step that plans over long horizons, eliminating exhaustive planning from the control loop. We evaluate our agent in an environment that mimics a realistic industrial scenario with delayed and long-horizon settings. The empirical results confirm the effectiveness of the proposed approach, demonstrating that coupling a world model with the AIF formalism yields an end-to-end probabilistic controller capable of effective decision making in delayed, long-horizon settings without handcrafted rewards or expensive planning.

Article 1400

Title@2025-05-26 (1): HS-STAR: Hierarchical Sampling for Self-Taught Reasoners via Difficulty Estimation and Budget Reallocation

Title: HS-STAR: Hierarchical Sampling for Self-Taught Reasoners via Difficulty Estimation and Budget Reallocation

HS-STAR: Hierarchische Probenahme für selbstlernende Vernunfter über Schwierigkeitsschätzung und Budget-Umverteilung

HS-STAR:通过难以估计和预算重新定位为自学理性者进行等级抽样 2505.19866v1

Authors: Feng Xiong, Hongling Xu, Yifei Wang, Runxi Cheng, Yong Wang, Xiangxiang Chu

Self-taught reasoners (STaRs) enhance the mathematical reasoning abilities of large language models (LLMs) by leveraging self-generated responses for self-training. Recent studies have incorporated reward models to guide response selection or decoding, aiming to obtain higher-quality data. However, they typically allocate a uniform sampling budget across all problems, overlooking the varying utility of problems at different difficulty levels. In this work, we conduct an empirical study and find that problems near the boundary of the LLM’s reasoning capability offer significantly greater learning utility than both easy and overly difficult ones. To identify and exploit such problems, we propose HS-STaR, a Hierarchical Sampling framework for Self-Taught Reasoners. Given a fixed sampling budget, HS-STaR first performs lightweight pre-sampling with a reward-guided difficulty estimation strategy to efficiently identify boundary-level problems. Subsequently, it dynamically reallocates the remaining budget toward these high-utility problems during a re-sampling phase, maximizing the generation of valuable training data. Extensive experiments across multiple reasoning benchmarks and backbone LLMs demonstrate that HS-STaR significantly outperforms other baselines without requiring additional sampling budget.
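
The budget reallocation idea can be illustrated with a small sketch. This is a hypothetical heuristic, not the procedure from the paper: it assumes that pre-sampling yields a pass rate p per problem and weights each problem by p(1 - p), which peaks at p = 0.5, as a stand-in for "near the capability boundary". The function name and the weighting rule are assumptions made for illustration.

```python
from typing import List, Sequence


def reallocate_budget(pass_rates: Sequence[float], remaining_budget: int) -> List[int]:
    """Split an integer sampling budget across problems.

    Hypothetical rule: weight each problem by p * (1 - p), where p is its
    estimated pass rate from pre-sampling, so that problems that are always
    or never solved receive little or no extra budget.
    """
    weights = [p * (1.0 - p) for p in pass_rates]
    total = sum(weights)
    if total == 0.0:
        # No problem looks informative: fall back to a uniform split.
        weights = [1.0] * len(pass_rates)
        total = float(len(pass_rates))

    raw = [remaining_budget * w / total for w in weights]
    alloc = [int(x) for x in raw]  # floor, since all values are non-negative

    # Hand out the samples lost to rounding, largest fractional part first.
    leftover = remaining_budget - sum(alloc)
    by_fraction = sorted(range(len(raw)), key=lambda i: raw[i] - alloc[i], reverse=True)
    for i in by_fraction[:leftover]:
        alloc[i] += 1
    return alloc
```

With pass rates `[0.0, 0.5, 1.0]` and a remaining budget of 10, this rule gives the whole budget to the middle problem.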

Article 1401

Title@2025-05-26 (1): Information-theoretic Generalization Analysis for Expected Calibration Error

Title: Information-theoretic Generalization Analysis for Expected Calibration Error

Informationstheoretische Generalisierungsanalyse für erwarteten Kalibrierungsfehler

预期校准错误信息理论概括分析 2405.15709v2

Authors: Futoshi Futami, Masahiro Fujisawa

While the expected calibration error (ECE), which employs binning, is widely adopted to evaluate the calibration performance of machine learning models, theoretical understanding of its estimation bias is limited. In this paper, we present the first comprehensive analysis of the estimation bias in the two common binning strategies, uniform mass and uniform width binning. Our analysis establishes upper bounds on the bias, achieving an improved convergence rate. Moreover, our bounds reveal, for the first time, the optimal number of bins to minimize the estimation bias. We further extend our bias analysis to generalization error analysis based on the information-theoretic approach, deriving upper bounds that enable the numerical evaluation of how small the ECE is for unknown data. Experiments using deep learning models show that our bounds are nonvacuous thanks to this information-theoretic generalization analysis approach.
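
The two binning strategies named above can be sketched as follows. This is a minimal illustration, assuming a binned ECE of the common form sum_k (|B_k|/n) * |acc(B_k) - conf(B_k)| computed from per-sample confidences in [0, 1] and 0/1 correctness flags; the function names are hypothetical, and the paper's exact estimator may differ.

```python
from typing import List, Sequence


def _ece_from_bins(bins: List[List[int]], conf: Sequence[float], correct: Sequence[int]) -> float:
    """Weighted average of |accuracy - mean confidence| over non-empty bins."""
    n = len(conf)
    total = 0.0
    for idx in bins:
        if not idx:
            continue
        avg_conf = sum(conf[i] for i in idx) / len(idx)
        acc = sum(correct[i] for i in idx) / len(idx)
        total += (len(idx) / n) * abs(acc - avg_conf)
    return total


def ece_uniform_width(conf: Sequence[float], correct: Sequence[int], num_bins: int) -> float:
    """ECE with bins of equal width on [0, 1]."""
    bins: List[List[int]] = [[] for _ in range(num_bins)]
    for i, c in enumerate(conf):
        k = min(int(c * num_bins), num_bins - 1)  # confidence 1.0 goes to the last bin
        bins[k].append(i)
    return _ece_from_bins(bins, conf, correct)


def ece_uniform_mass(conf: Sequence[float], correct: Sequence[int], num_bins: int) -> float:
    """ECE with bins holding (nearly) equal numbers of samples."""
    order = sorted(range(len(conf)), key=lambda i: conf[i])
    n = len(order)
    bins = [order[(k * n) // num_bins:((k + 1) * n) // num_bins] for k in range(num_bins)]
    return _ece_from_bins(bins, conf, correct)
```

The estimate depends on the number of bins: with confidences `[0.9, 0.9, 0.1, 0.1]` and correctness `[1, 1, 0, 0]`, two bins give about 0.1 under either strategy, while a single bin gives about 0.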

Article 1402

Title@2025-05-26 (1): FruitNeRF++: A Generalized Multi-Fruit Counting Method Utilizing Contrastive Learning and Neural Radiance Fields

Title: FruitNeRF++: A Generalized Multi-Fruit Counting Method Utilizing Contrastive Learning and Neural Radiance Fields

FruitNeRF++: Eine generalisierte Multi-Fruit-Counting-Methode, die kontrastives Lernen und neurale Strahlungsfelder nutzt

水果NeRF++:通用的多功能计数方法,利用矛盾学习和神经辐射场 2505.19863v1

Authors: Lukas Meyer, Andrei-Timotei Ardelean, Tim Weyrich, Marc Stamminger

We introduce FruitNeRF++, a novel fruit-counting approach that combines contrastive learning with neural radiance fields to count fruits from unstructured input photographs of orchards. Our work is based on FruitNeRF, which employs a neural semantic field combined with a fruit-specific clustering approach. The requirement for adaptation for each fruit type limits the method's applicability and makes it difficult to use in practice. To lift this limitation, we design a shape-agnostic multi-fruit counting framework that complements the RGB and semantic data with instance masks predicted by a vision foundation model. The masks are used to encode the identity of each fruit as instance embeddings into a neural instance field. By volumetrically sampling the neural fields, we extract a point cloud embedded with the instance features, which can be clustered in a fruit-agnostic manner to obtain the fruit count. We evaluate our approach using a synthetic dataset containing apples, plums, lemons, pears, peaches, and mangoes, as well as a real-world benchmark apple dataset. Our results demonstrate that FruitNeRF++ is easier to control and compares favorably to other state-of-the-art methods.

Article 1403

Title@2025-05-26 (1): KAN we improve on HEP classification tasks? Kolmogorov-Arnold Networks applied to an LHC physics example

Title: KAN we improve on HEP classification tasks? Kolmogorov-Arnold Networks applied to an LHC physics example

KAN verbessern wir die HEP-Klassifizierungsaufgaben? Kolmogorov-Arnold Networks für ein LHC-Physikbeispiel

KAN我们改进了HEP分类任务? Kolmogorov-Arnold网络应用到一个LHC物理范例 2408.02743v2

Authors: Johannes Erdmann, Florian Mausolf, Jan Lukas Späh

Recently, Kolmogorov-Arnold Networks (KANs) have been proposed as an alternative to multilayer perceptrons, suggesting advantages in performance and interpretability. We study a typical binary event classification task in high-energy physics including high-level features and comment on the performance and interpretability of KANs in this context. Consistent with expectations, we find that the learned activation functions of a one-layer KAN resemble the univariate log-likelihood ratios of the respective input features. In deeper KANs, the activations in the first layer differ from those in the one-layer KAN, which indicates that the deeper KANs learn more complex representations of the data, a pattern commonly observed in other deep-learning architectures. We study KANs with different depths and widths and we compare them to multilayer perceptrons in terms of performance and number of trainable parameters. For the chosen classification task, we do not find that KANs are more parameter efficient. However, small KANs may offer advantages in terms of interpretability that come at the cost of only a moderate loss in performance.

Article 1404

Title@2025-05-26 (1): Variance-Reduced Cascade Q-learning: Algorithms and Sample Complexity

Title: Variance-Reduced Cascade Q-learning: Algorithms and Sample Complexity

Varianzreduziertes Kaskade Q-Lernen: Algorithmen und Probenkomplexität

差异减少的连级学习:等级和抽样复杂性 2408.06544v2

Authors: Mohammad Boveiri, Peyman Mohajerin Esfahani

We study the problem of estimating the optimal Q-function of $\gamma$-discounted Markov decision processes (MDPs) under the synchronous setting, where independent samples for all state-action pairs are drawn from a generative model at each iteration. We introduce and analyze a novel model-free algorithm called Variance-Reduced Cascade Q-learning (VRCQ). VRCQ comprises two key building blocks: (i) the established direct variance reduction technique and (ii) our proposed variance reduction scheme, Cascade Q-learning. By leveraging these techniques, VRCQ provides superior guarantees in the $\ell_\infty$-norm compared with the existing model-free stochastic approximation-type algorithms. Specifically, we demonstrate that VRCQ is minimax optimal. Additionally, when the action set is a singleton (so that the Q-learning problem reduces to policy evaluation), it achieves non-asymptotic instance optimality while requiring the minimum number of samples theoretically possible. Our theoretical results and their practical implications are supported by numerical experiments.
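
To make the synchronous setting concrete, here is a sketch of plain synchronous Q-learning with a generative model. It is the standard baseline, not VRCQ: neither variance-reduction component is included. The MDP encoding, the function names, and the rescaled linear step size 1 / (1 + (1 - gamma) k) are assumptions chosen for illustration.

```python
import random
from typing import List, Sequence, Tuple

# For each state and action: a list of (probability, next_state) pairs.
Transitions = Sequence[Sequence[Sequence[Tuple[float, int]]]]


def sample_next_state(dist: Sequence[Tuple[float, int]], rng: random.Random) -> int:
    """Draw a next state from a list of (probability, next_state) pairs."""
    u = rng.random()
    cumulative = 0.0
    for prob, nxt in dist:
        cumulative += prob
        if u < cumulative:
            return nxt
    return dist[-1][1]  # guard against rounding in the probabilities


def synchronous_q_learning(
    P: Transitions,
    R: Sequence[Sequence[float]],
    gamma: float,
    num_iters: int,
    seed: int = 0,
) -> List[List[float]]:
    """Plain synchronous Q-learning: every (s, a) pair is updated each iteration."""
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for k in range(1, num_iters + 1):
        lam = 1.0 / (1.0 + (1.0 - gamma) * k)  # assumed step-size schedule
        new_Q = [row[:] for row in Q]
        for s in range(n_states):
            for a in range(n_actions):
                # One independent sample per state-action pair from the generative model.
                s_next = sample_next_state(P[s][a], rng)
                target = R[s][a] + gamma * max(Q[s_next])
                new_Q[s][a] = (1.0 - lam) * Q[s][a] + lam * target
        Q = new_Q
    return Q
```

All updates within an iteration read the previous iterate `Q`, which is what distinguishes the synchronous setting from updating along a single trajectory.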

Article 1405

Title@2025-05-26 (1): REA-RL: Reflection-Aware Online Reinforcement Learning for Efficient Large Reasoning Models

Title: REA-RL: Reflection-Aware Online Reinforcement Learning for Efficient Large Reasoning Models

REA-RL: Reflection-Aware Online-Verstärkungs-Lernen für effiziente große Vernunftmodelle

REA-RL:为高效大型理由模型进行反思-软件在线强化学习 2505.19862v1

Authors: Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Jun Rao, Min Zhang

Large Reasoning Models (LRMs) demonstrate strong performance in complex tasks but often face the challenge of overthinking, leading to substantially higher inference costs. Existing approaches synthesize shorter reasoning responses for LRMs to learn, but are inefficient for online usage due to the time-consuming data generation and filtering processes. Meanwhile, online reinforcement learning mainly adopts a length reward to encourage short reasoning responses, but tends to lose the reflection ability and harm performance. To address these issues, we propose REA-RL, which introduces a small reflection model for efficient scaling in online training, offering both parallel sampling and sequential revision. In addition, a reflection reward is designed to further prevent LRMs from favoring short yet non-reflective responses. Experiments show that both methods maintain or enhance performance while significantly improving inference efficiency. Their combination achieves a good balance between performance and efficiency, reducing inference costs by 35% without compromising performance. Further analysis demonstrates that our methods are effective by maintaining reflection frequency for hard problems while appropriately reducing it for simpler ones without losing reflection ability. Code is available at https://github.com/hexuandeng/REA-RL.

Article 1406

Title@2025-05-26 (1): Editing as Unlearning: Are Knowledge Editing Methods Strong Baselines for Large Language Model Unlearning?

Title: Editing as Unlearning: Are Knowledge Editing Methods Strong Baselines for Large Language Model Unlearning?

Editing as Unlearning: Sind Methoden der Wissensbearbeitung starke Grundlagen für großes Sprachmodell Unlearning?

编辑即遗忘:知识编辑方法是否是大语言模型遗忘的强基线? 2505.19855v1

Authors: Zexi Li, Xiangzhu Wang, William F. Shen, Meghdad Kurmanji, Xinchi Qiu, Dongqi Cai, Chao Wu, Nicholas D. Lane

Large language model (LLM) unlearning, i.e., selectively removing information from LLMs, is vital for responsible model deployment. In contrast, LLM knowledge editing aims to modify LLM knowledge instead of removing it. Though editing and unlearning seem to be two distinct tasks, we find there is a tight connection between them. In this paper, we conceptualize unlearning as a special case of editing where information is modified to a refusal or “empty set” $\emptyset$ response, signifying its removal. This paper thus investigates whether knowledge editing techniques are strong baselines for LLM unlearning. We evaluate state-of-the-art (SOTA) editing methods (e.g., ROME, MEMIT, GRACE, WISE, and AlphaEdit) against existing unlearning approaches on pretrained and finetuned knowledge. Results show that certain editing methods, notably WISE and AlphaEdit, are effective unlearning baselines, especially for pretrained knowledge, and excel in generating human-aligned refusal answers. To better adapt editing methods for unlearning applications, we propose practical recipes including self-improvement and query merging. The former leverages the LLM’s own in-context learning ability to craft a more human-aligned unlearning target, and the latter enables ROME and MEMIT to perform well in unlearning longer sample sequences. We advocate for the unlearning community to adopt SOTA editing methods as baselines and explore unlearning from an editing perspective for more holistic LLM memory control.

Article 1407

Title@2025-05-26 (1): DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning

Title: DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning

DISCOVER: Automatisiertes Curricula für Sparse-Reward-Verstärkungs-Lernen

DISCOV: 失学-退职强化学习自动化课程 2505.19850v1

Authors: Leander Diaz-Bone, Marco Bagatella, Jonas Hübotter, Andreas Krause

Sparse-reward reinforcement learning (RL) can model a wide range of highly complex tasks. Solving sparse-reward tasks is RL’s core premise - requiring efficient exploration coupled with long-horizon credit assignment - and overcoming these challenges is key for building self-improving agents with superhuman ability. We argue that solving complex and high-dimensional tasks requires solving simpler tasks that are relevant to the target task. In contrast, most prior work designs strategies for selecting exploratory tasks with the objective of solving any task, making exploration of challenging high-dimensional, long-horizon tasks intractable. We find that the sense of direction, necessary for effective exploration, can be extracted from existing RL algorithms, without needing any prior information. Based on this finding, we propose a method for directed sparse-reward goal-conditioned very long-horizon RL (DISCOVER), which selects exploratory goals in the direction of the target task. We connect DISCOVER to principled exploration in bandits, formally bounding the time until the target task becomes achievable in terms of the agent’s initial distance to the target, but independent of the volume of the space of all tasks. Empirically, we perform a thorough evaluation in high-dimensional environments. We find that the directed goal selection of DISCOVER solves exploration problems that are beyond the reach of prior state-of-the-art exploration methods in RL.

Article 1408

Title@2025-05-26 (1): Efficient Deconvolution in Populational Inverse Problems

Title: Efficient Deconvolution in Populational Inverse Problems

Effiziente Dekonvolution in inversen Bevölkerungsproblemen

人口逆向问题的有效演变 2505.19841v1

Authors: Arnaud Vadeboncoeur, Mark Girolami, Andrew M. Stuart

This work is focussed on the inversion task of inferring the distribution over parameters of interest leading to multiple sets of observations. The potential to solve such distributional inversion problems is driven by increasing availability of data, but a major roadblock is blind deconvolution, arising when the observational noise distribution is unknown. However, when data originates from collections of physical systems, a population, it is possible to leverage this information to perform deconvolution. To this end, we propose a methodology leveraging large data sets of observations, collected from different instantiations of the same physical processes, to simultaneously deconvolve the data-corrupting noise distribution and identify the distribution over model parameters defining the physical processes. A parameter-dependent mathematical model of the physical process is employed. A loss function characterizing the match between the observed data and the output of the mathematical model is defined; it is minimized as a function of both the parameter inputs to the model of the physics and the parameterized observational noise. This coupled problem is addressed with a modified gradient descent algorithm that leverages specific structure in the noise model. Furthermore, a new active learning scheme is proposed, based on adaptive empirical measures, to train a surrogate model to be accurate in parameter regions of interest; this approach accelerates computation and enables automatic differentiation of black-box, potentially nondifferentiable, code computing parameter-to-solution maps. The proposed methodology is demonstrated on porous medium flow, damped elastodynamics, and simplified models of atmospheric dynamics.

Article 1409

Title@2025-05-26 (1): One Surrogate to Fool Them All: Universal, Transferable, and Targeted Adversarial Attacks with CLIP

Title: One Surrogate to Fool Them All: Universal, Transferable, and Targeted Adversarial Attacks with CLIP

Ein Surrogate an Narren: All: Universelle, übertragbare und gezielte Widersacherangriffe mit CLIP

以CLIP取代 “ 愚人Them all “ :通用、可转移和有针对性的对立攻击 2505.19840v1

Authors: Binyan Xu, Xilin Dai, Di Tang, Kehuan Zhang

Deep Neural Networks (DNNs) have achieved widespread success yet remain prone to adversarial attacks. Typically, such attacks either involve frequent queries to the target model or rely on surrogate models closely mirroring the target model (often trained with subsets of the target model’s training data) to achieve high attack success rates through transferability. However, in realistic scenarios where training data is inaccessible and excessive queries can raise alarms, crafting adversarial examples becomes more challenging. In this paper, we present UnivIntruder, a novel attack framework that relies solely on a single, publicly available CLIP model and publicly available datasets. By using textual concepts, UnivIntruder generates universal, transferable, and targeted adversarial perturbations that mislead DNNs into misclassifying inputs into adversary-specified classes defined by textual concepts. Our extensive experiments show that our approach achieves an Attack Success Rate (ASR) of up to 85% on ImageNet and over 99% on CIFAR-10, significantly outperforming existing transfer-based methods. Additionally, we reveal real-world vulnerabilities, showing that even without querying target models, UnivIntruder compromises image search engines like Google and Baidu with ASRs up to 84%, and vision language models like GPT-4 and Claude-3.5 with ASRs up to 80%. These findings underscore the practicality of our attack in scenarios where traditional avenues are blocked, highlighting the need to reevaluate security paradigms in AI applications.

Article 1410

Title@2025-05-26 (1): Multi-Agent Reinforcement Learning in Cybersecurity: From Fundamentals to Applications

Title: Multi-Agent Reinforcement Learning in Cybersecurity: From Fundamentals to Applications

Multi-Agenten-Verstärkung Lernen in Cybersicherheit: Von Grundlagen zu Anwendungen

网络安全多机构强化多机构网络安全学习:从基础到应用 2505.19837v1

Authors: Christoph R. Landolt, Christoph Würsch, Roland Meier, Alain Mermoud, Julian Jang-Jaccard

Multi-Agent Reinforcement Learning (MARL) has shown great potential as an adaptive solution for addressing modern cybersecurity challenges. MARL enables decentralized, adaptive, and collaborative defense strategies and provides an automated mechanism to combat dynamic, coordinated, and sophisticated threats. This survey investigates the current state of research in MARL applications for automated cyber defense (ACD), focusing on intruder detection and lateral movement containment. Additionally, it examines the role of Autonomous Intelligent Cyber-defense Agents (AICA) and Cyber Gyms in training and validating MARL agents. Finally, the paper outlines existing challenges, such as scalability and adversarial robustness, and proposes future research directions. It also discusses how MARL integrates into AICA to provide adaptive, scalable, and dynamic solutions to counter the increasingly sophisticated landscape of cyber threats. It highlights the transformative potential of MARL in areas like intrusion detection and lateral movement containment, and underscores the value of Cyber Gyms for training and validation of AICA.

Article 1411

Title@2025-05-26 (1): DiffNMR: Advancing Inpainting of Randomly Sampled Nuclear Magnetic Resonance Signals

Title: DiffNMR: Advancing Inpainting of Randomly Sampled Nuclear Magnetic Resonance Signals

DiffNMR: Advancing Inpainting von zufällig gemusterten Kernmagnetresonanzsignalen

DiffNMR:推进随机抽样核磁共振信号的油漆 2505.20367v1

Authors: Sen Yan, Fabrizio Gabellieri, Etienne Goffinet, Filippo Castiglione, Thomas Launey

Nuclear Magnetic Resonance (NMR) spectroscopy leverages nuclear magnetization to probe molecules’ chemical environment, structure, and dynamics, with applications spanning from pharmaceuticals to the petroleum industry. Despite its utility, the high cost of NMR instrumentation and operation and the lengthy duration of experiments necessitate the development of computational techniques to optimize acquisition times. Non-uniform sampling (NUS) is widely employed as a sub-sampling method to address these challenges, but it often introduces artifacts and degrades spectral quality, offsetting the benefits of reduced acquisition times. In this work, we propose the use of deep learning techniques to enhance the reconstruction quality of NUS spectra. Specifically, we explore the application of diffusion models, a relatively untapped approach in this domain. Our methodology involves applying diffusion models to both time-time and time-frequency NUS data, yielding satisfactory reconstructions of challenging spectra from the benchmark Artina dataset. This approach demonstrates the potential of diffusion models to improve the efficiency and accuracy of NMR spectroscopy, as well as the superiority of time-frequency domain data over time-time data, opening new landscapes for future studies.

Article 1412

Title@2025-05-26 (1): Revisiting Glorot Initialization for Long-Range Linear Recurrences

Title: Revisiting Glorot Initialization for Long-Range Linear Recurrences

Wiederbesuch der Glorot-Initialisierung für langanhaltende lineare Wiederholungen

重新审查长频线性线性重现的地球初始化 2505.19827v1

Authors: Noga Bar, Mariia Seleznova, Yotam Alexander, Gitta Kutyniok, Raja Giryes

Proper initialization is critical for Recurrent Neural Networks (RNNs), particularly in long-range reasoning tasks, where repeated application of the same weight matrix can cause vanishing or exploding signals. A common baseline for linear recurrences is Glorot initialization, which is designed to ensure stable signal propagation but is derived under the infinite-width, fixed-length regime, an unrealistic setting for RNNs processing long sequences. In this work, we show that Glorot initialization is in fact unstable: small positive deviations in the spectral radius are amplified through time and cause the hidden state to explode. Our theoretical analysis demonstrates that sequences of length $t = O(\sqrt{n})$, where $n$ is the hidden width, are sufficient to induce instability. To address this, we propose a simple, dimension-aware rescaling of Glorot that shifts the spectral radius slightly below one, preventing rapid signal explosion or decay. These results suggest that standard initialization schemes may break down in the long-sequence regime, motivating a separate line of theory for stable recurrent initialization.
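
The setting can be explored with a small experiment. The sketch below draws a square Glorot (normal) matrix, for which the entry variance 2 / (fan_in + fan_out) equals 1/n, and records the hidden-state norm of the linear recurrence h_t = W h_{t-1}. The `rescale` helper applies a uniform shrink factor chosen by the caller; that factor is a placeholder, not the dimension-aware rescaling derived in the paper, and all function names are hypothetical.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def glorot_square(n: int, rng: random.Random) -> Matrix:
    """n x n matrix with i.i.d. N(0, 2 / (n + n)) entries (Glorot normal)."""
    std = math.sqrt(2.0 / (n + n))
    return [[rng.gauss(0.0, std) for _ in range(n)] for _ in range(n)]


def rescale(W: Matrix, factor: float) -> Matrix:
    """Multiply every entry by `factor`, which scales the spectral radius by |factor|."""
    return [[factor * w for w in row] for row in W]


def matvec(W: Matrix, h: List[float]) -> List[float]:
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def hidden_norms(W: Matrix, h0: List[float], steps: int) -> List[float]:
    """Euclidean norms of h_1, ..., h_steps for the recurrence h_t = W h_{t-1}."""
    norms = []
    h = list(h0)
    for _ in range(steps):
        h = matvec(W, h)
        norms.append(math.sqrt(sum(x * x for x in h)))
    return norms
```

Because the recurrence is linear, shrinking W by a factor c multiplies the norm at step t by exactly c^t, so even a factor slightly below one has a large effect over long sequences.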

Article 1413

Title@2025-05-26 (1): Foundation Models for Tabular Data within Systemic Contexts Need Grounding

Title: Foundation Models for Tabular Data within Systemic Contexts Need Grounding

Basismodelle für tabellarische Daten in systemischen Kontexten benötigen Erdung

系统环境中需要依据的表格数据基础模型 2505.19825v1

Authors: Tassilo Klein, Johannes Hoffart

Current research on tabular foundation models often overlooks the complexities of large-scale, real-world data by treating tables as isolated entities and assuming information completeness, thereby neglecting the vital operational context. To address this, we introduce the concept of Semantically Linked Tables (SLT), recognizing that tables are inherently connected to both declarative and procedural operational knowledge. We propose Foundation Models for Semantically Linked Tables (FMSLT), which integrate these components to ground tabular data within its true operational context. This comprehensive representation unlocks the full potential of machine learning for complex, interconnected tabular data across diverse domains. Realizing FMSLTs requires access to operational knowledge that is often unavailable in public datasets, highlighting the need for close collaboration between domain experts and researchers. Our work exposes the limitations of current tabular foundation models and proposes a new direction centered on FMSLTs, aiming to advance robust, context-aware models for structured data.

Article 1414

Title@2025-05-26 (1): An Introductory Survey to Autoencoder-based Deep Clustering – Sandboxes for Combining Clustering with Deep Learning

Title: An Introductory Survey to Autoencoder-based Deep Clustering – Sandboxes for Combining Clustering with Deep Learning

Eine Einführungsstudie zum Autoencoder-basierten Deep Clustering – Sandboxen für die Kombination von Clustering mit Deep Learning

以自动编码器为基础的深层集束 – – 将集束与深层学习相结合的沙箱的介绍性调查 2504.02087v2

Authors: Collin Leiber, Lukas Miklautz, Claudia Plant, Christian Böhm

Autoencoders offer a general way of learning low-dimensional, non-linear representations from data without labels. This is achieved without making any particular assumptions about the data type or other domain knowledge. The generality and domain agnosticism in combination with their simplicity make autoencoders a perfect sandbox for researching and developing novel (deep) clustering algorithms. Clustering methods group data based on similarity, a task that benefits from the lower-dimensional representation learned by an autoencoder, mitigating the curse of dimensionality. Specifically, the combination of deep learning with clustering, called Deep Clustering, makes it possible to learn a representation tailored to specific clustering tasks, leading to high-quality results. This survey provides an introduction to fundamental autoencoder-based deep clustering algorithms that serve as building blocks for many modern approaches.

Article 1415

Title@2025-05-26 (1): LAPA-based Dynamic Privacy Optimization for Wireless Federated Learning in Heterogeneous Environments

Title: LAPA-based Dynamic Privacy Optimization for Wireless Federated Learning in Heterogeneous Environments

LAPA-basierte Dynamic Privacy Optimization for Wireless Federated Learning in heterogenen Umgebungen

以LAPA为基础的在多种不同环境无线联邦学习的动态隐私优化 2505.19823v1

Authors: Pengcheng Sun, Erwu Liu, Wei Ni, Rui Wang, Yuanzhe Geng, Lijuan Lai, Abbas Jamalipour

Federated Learning (FL) is a distributed machine learning paradigm designed to protect the data privacy of devices; however, this privacy can still be broken by gradient leakage attacks via parameter inversion techniques. Differential privacy (DP) reduces the risk of private data leakage by adding artificial noise to the gradients, but it is detrimental to FL utility at the same time, especially when the data is non-independent and identically distributed (Non-IID). Based on the impact of heterogeneous data on aggregation performance, this paper proposes a Lightweight Adaptive Privacy Allocation (LAPA) strategy, which assigns personalized privacy budgets to devices in each aggregation round without transmitting any additional information beyond gradients, ensuring both privacy protection and aggregation efficiency. Furthermore, the Deep Deterministic Policy Gradient (DDPG) algorithm is employed to optimize the transmission power, in order to determine the optimal timing at which the adaptively attenuated artificial noise aligns with the communication noise, enabling an effective balance between DP and system utility. Finally, a reliable aggregation strategy is designed by integrating communication quality and data distribution characteristics, which improves aggregation performance while preserving privacy. Experimental results demonstrate that the personalized noise allocation and dynamic optimization strategy based on LAPA proposed in this paper enhance convergence performance while satisfying the privacy requirements of FL.
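
The basic step of adding artificial noise to gradients can be sketched as follows. This is the textbook Gaussian mechanism with L2 clipping, not LAPA itself: the per-device budget `epsilon` is simply passed in, the noise scale uses the classical bound sigma = S * sqrt(2 ln(1.25 / delta)) / epsilon for epsilon in (0, 1), and the sensitivity S is assumed to equal the clipping norm. The function names are hypothetical.

```python
import math
import random
from typing import List


def clip_l2(grad: List[float], clip_norm: float) -> List[float]:
    """Scale the gradient down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def gaussian_sigma(epsilon: float, delta: float, sensitivity: float) -> float:
    """Noise scale of the classical Gaussian mechanism (requires 0 < epsilon < 1)."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("this bound assumes 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize(grad: List[float], clip_norm: float, epsilon: float,
              delta: float, rng: random.Random) -> List[float]:
    """Clip a device's gradient, then add i.i.d. Gaussian noise to each coordinate."""
    clipped = clip_l2(grad, clip_norm)
    sigma = gaussian_sigma(epsilon, delta, clip_norm)  # assumed sensitivity = clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]
```

A smaller `epsilon` (a tighter privacy budget) yields a larger `sigma`, which is the privacy-utility tension that a personalized allocation scheme tries to manage.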

Article 1416

Title@2025-05-26 (1): Poison in the Well: Feature Embedding Disruption in Backdoor Attacks

Title: Poison in the Well: Feature Embedding Disruption in Backdoor Attacks

Gift im Brunnen: Feature Einbetten von Disruption in Backdoor-Angriffe

井中毒:幕后袭击中的特异性嵌入干扰 2505.19821v1

Authors: Zhou Feng, Jiahao Chen, Chunyi Zhou, Yuwen Pu, Qingming Li, Shouling Ji

Backdoor attacks embed malicious triggers into training data, enabling attackers to manipulate neural network behavior during inference while maintaining high accuracy on benign inputs. However, existing backdoor attacks face limitations manifesting in excessive reliance on training data, poor stealth, and instability, which hinder their effectiveness in real-world applications. Therefore, this paper introduces ShadowPrint, a versatile backdoor attack that targets feature embeddings within neural networks to achieve high ASRs and stealthiness. Unlike traditional approaches, ShadowPrint reduces reliance on training data access and operates effectively with exceedingly low poison rates (as low as 0.01%). It leverages a clustering-based optimization strategy to align feature embeddings, ensuring robust performance across diverse scenarios while maintaining stability and stealth. Extensive evaluations demonstrate that ShadowPrint achieves superior ASR (up to 100%), steady CA (with decay no more than 1% in most cases), and low DDR (averaging below 5%) across both clean-label and dirty-label settings, and with poison rates ranging from as low as 0.01% to 0.05%, setting a new standard for backdoor attack capabilities and emphasizing the need for advanced defense strategies focused on feature space manipulations.

Article 1417

Title@2025-05-26 (1): InfoCons: Identifying Interpretable Critical Concepts in Point Clouds via Information Theory

Title: InfoCons: Identifying Interpretable Critical Concepts in Point Clouds via Information Theory

InfoCons: Identifizieren von interpretierbaren kritischen Konzepten in Punktwolken über Informationstheorie

信息库:通过信息理论确定点云中可解释的关键概念 2505.19820v1

Authors: Feifei Li, Mi Zhang, Zhaoxiang Wang, Min Yang

Interpretability of point cloud (PC) models becomes imperative given their deployment in safety-critical scenarios such as autonomous vehicles. We focus on attributing PC model outputs to interpretable critical concepts, defined as meaningful subsets of the input point cloud. To enable human-understandable diagnostics of model failures, an ideal critical subset should be faithful (preserving points that causally influence predictions) and conceptually coherent (forming semantically meaningful structures that align with human perception). We propose InfoCons, an explanation framework that applies information-theoretic principles to decompose the point cloud into 3D concepts, enabling the examination of their causal effect on model predictions with learnable priors. We evaluate InfoCons on synthetic datasets for classification, comparing it qualitatively and quantitatively with four baselines. We further demonstrate its scalability and flexibility on two real-world datasets and in two applications that utilize critical scores of PC.

Article 1418

Title: Fast Differentiable Modal Simulation of Non-linear Strings, Membranes, and Plates

Schnelle differenzierbare Modale Simulation von nichtlinearen Strings, Membranen und Platten

非线性字符串、膜和平板等非线性字符串的快速可区分模式模拟 2505.05940v2

Authors: Rodrigo Diaz, Mark Sandler

Modal methods for simulating vibrations of strings, membranes, and plates are widely used in acoustics and physically informed audio synthesis. However, traditional implementations, particularly for non-linear models like the von Kármán plate, are computationally demanding and lack differentiability, limiting inverse modelling and real-time applications. We introduce a fast, differentiable, GPU-accelerated modal framework built with the JAX library, providing efficient simulations and enabling gradient-based inverse modelling. Benchmarks show that our approach significantly outperforms CPU and GPU-based implementations, particularly for simulations with many modes. Inverse modelling experiments demonstrate that our approach can recover physical parameters, including tension, stiffness, and geometry, from both synthetic and experimental data. Although fitting physical parameters is more sensitive to initialisation compared to other methods, it provides greater interpretability and more compact parameterisation. The code is released as open source to support future research and applications in differentiable physical modelling and sound synthesis.

Article 1419

Title@2025-05-26 (1): Jailbreak-AudioBench: In-Depth Evaluation and Analysis of Jailbreak Threats for Large Audio Language Models

Title: Jailbreak-AudioBench: In-Depth Evaluation and Analysis of Jailbreak Threats for Large Audio Language Models

Jailbreak-AudioBench: In-Depth-Bewertung und Analyse von Jailbreak-Bedrohungen für große Audio-Sprachenmodelle

监狱破碎-AudioBennch:对大型音频语言模型的监狱破碎威胁进行内部评价和分析 2501.13772v2

Authors: Hao Cheng, Erjia Xiao, Jing Shao, Yichi Wang, Le Yang, Chao Sheng, Philip Torr, Jindong Gu, Renjing Xu

Large Language Models (LLMs) demonstrate impressive zero-shot performance across a wide range of natural language processing tasks. Integrating various modality encoders further expands their capabilities, giving rise to Multimodal Large Language Models (MLLMs) that process not only text but also visual and auditory modality inputs. However, these advanced capabilities may also pose significant security risks, as models can be exploited to generate harmful or inappropriate content through jailbreak attacks. While prior work has extensively explored how manipulating textual or visual modality inputs can circumvent safeguards in LLMs and MLLMs, the vulnerability of Large Audio-Language Models (LALMs) to audio-specific jailbreaks remains largely underexplored. To address this gap, we introduce Jailbreak-AudioBench, which consists of the Toolbox, curated Dataset, and comprehensive Benchmark. The Toolbox supports not only text-to-audio conversion but also a range of audio editing techniques. The curated Dataset provides diverse explicit and implicit jailbreak audio examples in both original and edited forms. Utilizing this dataset, we evaluate multiple state-of-the-art LALMs, establishing the most comprehensive audio jailbreak benchmark to date. Finally, Jailbreak-AudioBench establishes a foundation for advancing future research on LALM safety alignment by enabling the in-depth exposure of more powerful jailbreak threats, such as query-based audio editing, and by facilitating the development of effective defense mechanisms.

Article 1420

Title@2025-05-26 (1): Density Ratio-Free Doubly Robust Proxy Causal Learning

Title: Density Ratio-Free Doubly Robust Proxy Causal Learning

Dichte Verhältnis-frei doppelt robust Proxy Kausal Lernen

低密度比率-无杜布利强力代理原因学习 2505.19807v1

Authors: Bariscan Bozkurt, Houssam Zenati, Dimitri Meunier, Liyuan Xu, Arthur Gretton

We study the problem of causal function estimation in the Proxy Causal Learning (PCL) framework, where confounders are not observed but proxies for the confounders are available. Two main approaches have been proposed: outcome bridge-based and treatment bridge-based methods. In this work, we propose two kernel-based doubly robust estimators that combine the strengths of both approaches, and naturally handle continuous and high-dimensional variables. Our identification strategy builds on a recent density ratio-free method for treatment bridge-based PCL; furthermore, in contrast to previous approaches, it does not require indicator functions or kernel smoothing over the treatment variable. These properties make it especially well-suited for continuous or high-dimensional treatments. By using kernel mean embeddings, we have closed-form solutions and strong consistency guarantees. Our estimators outperform existing methods on PCL benchmarks, including a prior doubly robust method that requires both kernel smoothing and density ratio estimation.

Article 1421

Title@2025-05-26 (1): Continuous Simplicial Neural Networks

Title: Continuous Simplicial Neural Networks

Kontinuierliche simplizielle Neuralnetze

简单连续神经网络 2503.12919v2

Authors: Aref Einizade, Dorina Thanou, Fragkiskos D. Malliaros, Jhony H. Giraldo

Simplicial complexes provide a powerful framework for modeling high-order interactions in structured data, making them particularly suitable for applications such as trajectory prediction and mesh processing. However, existing simplicial neural networks (SNNs), whether convolutional or attention-based, rely primarily on discrete filtering techniques, which can be restrictive. In contrast, partial differential equations (PDEs) on simplicial complexes offer a principled approach to capture continuous dynamics in such structures. In this work, we introduce the continuous simplicial neural network (COSIMO), a novel SNN architecture derived from PDEs on simplicial complexes. We provide theoretical and experimental justifications of COSIMO’s stability under simplicial perturbations. Furthermore, we investigate the over-smoothing phenomenon, a common issue in geometric deep learning, demonstrating that COSIMO offers better control over this effect than discrete SNNs. Our experiments on real-world datasets demonstrate that COSIMO achieves competitive performance compared to state-of-the-art SNNs in complex and noisy environments.

Article 1422

Title@2025-05-26 (1): Modulated differentiable STFT and balanced spectrum metric for freight train wheelset bearing cross-machine transfer monitoring under speed fluctuations

Title: Modulated differentiable STFT and balanced spectrum metric for freight train wheelset bearing cross-machine transfer monitoring under speed fluctuations

Modulierte differenzierbare STFT und symmetrische Spektralmetrik für Güterzug-Radsatzlager-Übertragungsüberwachung unter Geschwindigkeitsschwankungen

根据速度波动情况对具有跨机械转移监测的货运火车轮轮车采用机动机动的可机动机动式STFT和平衡频谱度指标 2406.11917v3

Authors: Chao He, Hongmei Shi, Ruixin Li, Jianbo Li, ZuJun Yu

As key components, wheelset bearings and their service conditions have a direct impact on the safe operation of heavy-haul railway freight trains. However, train speed fluctuation and the scarcity of fault samples are the two main problems that restrict the accuracy of bearing fault diagnosis. Therefore, a cross-machine transfer diagnosis network (pyDSN), coupled with an interpretable modulated differentiable short-time Fourier transform (STFT) and a physics-informed balanced spectrum quality metric, is proposed to learn domain-invariant and discriminative features under time-varying speeds. First, because fixed windows are insufficient for extracting the frequency components of time-varying speed signals, a modulated differentiable STFT (MDSTFT), which is interpretable with STFT-informed theoretical support, is proposed to extract a robust time-frequency spectrum (TFS). During training, multiple windows with different lengths change dynamically. In addition to the classification metric and the domain discrepancy metric, we introduce a third kind of metric, referred to as the physics-informed metric, to enhance the transferable TFS. A physics-informed balanced spectrum quality (BSQ) regularization loss is devised to guide the optimization direction for the MDSTFT and the model. With it, the model not only acquires a high-quality TFS but also becomes a physics-restricted domain adaptation network that learns real-world physics knowledge, ultimately diminishing the domain discrepancy across different datasets. Experiments conducted in the scenario of transferring from laboratory datasets to a freight train dataset indicate that the hybrid-driven pyDSN outperforms existing methods and has practical value.

Article 1423

Title@2025-05-26 (1): Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks

Title: Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks

Erforschung des Bewusstseins in LLMs: Eine systematische Untersuchung von Theorien, Implementierungen und Grenzrisiken

探索LLMM中的觉悟:对理论、实施和前沿风险的系统调查 2505.19806v1

Authors: Sirui Chen, Shuqin Ma, Shu Yu, Hanwang Zhang, Shengjie Zhao, Chaochao Lu

Consciousness stands as one of the most profound and distinguishing features of the human mind, fundamentally shaping our understanding of existence and agency. As large language models (LLMs) develop at an unprecedented pace, questions concerning intelligence and consciousness have become increasingly significant. However, discourse on LLM consciousness remains largely unexplored territory. In this paper, we first clarify frequently conflated terminologies (e.g., LLM consciousness and LLM awareness). Then, we systematically organize and synthesize existing research on LLM consciousness from both theoretical and empirical perspectives. Furthermore, we highlight potential frontier risks that conscious LLMs might introduce. Finally, we discuss current challenges and outline future directions in this emerging field. The references discussed in this paper are organized at https://github.com/OpenCausaLab/Awesome-LLM-Consciousness.

Article 1424

Title@2025-05-26 (1): GraphAU-Pain: Graph-based Action Unit Representation for Pain Intensity Estimation

Title: GraphAU-Pain: Graph-based Action Unit Representation for Pain Intensity Estimation

GraphAU-Pain: Darstellung der Graph-basierten Aktionseinheit für Schmerzintensitätsabschätzung

图AAU-Pain: 以图表为基础的行动股疼痛强度估计代表 2505.19802v1

Authors: Zhiyu Wang, Yang Liu, Hatice Gunes

Understanding pain-related facial behaviors is essential for digital healthcare in terms of effective monitoring, assisted diagnostics, and treatment planning, particularly for patients unable to communicate verbally. Existing data-driven methods of detecting pain from facial expressions are limited in interpretability and severity quantification. To this end, we propose GraphAU-Pain, leveraging a graph-based framework to model facial Action Units (AUs) and their interrelationships for pain intensity estimation. AUs are represented as graph nodes, with co-occurrence relationships as edges, enabling a more expressive depiction of pain-related facial behaviors. By utilizing a relational graph neural network, our framework offers improved interpretability and significant performance gains. Experiments conducted on the publicly available UNBC dataset demonstrate the effectiveness of GraphAU-Pain, achieving an F1-score of 66.21% and accuracy of 87.61% in pain intensity estimation.
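
The graph construction described above (AUs as nodes, co-occurrence as edges) can be sketched in a few lines. This is only an illustrative assumption about how co-occurrence edges might be counted from per-frame AU activations; the AU labels, the frame data, and the `min_count` threshold are hypothetical and not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(frames, min_count=1):
    """Count how often each pair of AUs is active in the same frame.

    frames: iterable of collections of active AU labels, one per frame.
    Returns a dict mapping (au_a, au_b) with au_a < au_b to its co-occurrence count,
    keeping only pairs seen at least min_count times.
    """
    counts = Counter()
    for active in frames:
        for a, b in combinations(sorted(set(active)), 2):
            counts[(a, b)] += 1
    return {edge: c for edge, c in counts.items() if c >= min_count}


# Hypothetical per-frame AU activations.
frames = [
    ["AU4", "AU6", "AU7"],
    ["AU4", "AU7"],
    ["AU6", "AU9"],
    ["AU4", "AU6", "AU7"],
]
edges = cooccurrence_edges(frames, min_count=2)
```

The resulting weighted edge list could then be passed to a graph neural network library as an adjacency structure; that step is omitted here.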

Article 1425

Title@2025-05-26 (1): Non-asymptotic convergence analysis of the stochastic gradient Hamiltonian Monte Carlo algorithm with discontinuous stochastic gradient with applications to training of ReLU neural networks

Title: Non-asymptotic convergence analysis of the stochastic gradient Hamiltonian Monte Carlo algorithm with discontinuous stochastic gradient with applications to training of ReLU neural networks

对随机梯度汉密尔顿·汉密尔顿·蒙特-蒙特卡洛算法进行非症状趋同分析,使用不连续的随机梯度,并用于RELU神经网络培训 2409.17107v2

Authors: Luxu Liang, Ariel Neufeld, Ying Zhang

In this paper, we provide a non-asymptotic analysis of the convergence of the stochastic gradient Hamiltonian Monte Carlo (SGHMC) algorithm to a target measure in Wasserstein-1 and Wasserstein-2 distance. Crucially, compared to the existing literature on SGHMC, we allow its stochastic gradient to be discontinuous. This allows us to provide explicit upper bounds, which can be controlled to be arbitrarily small, for the expected excess risk of non-convex stochastic optimization problems with discontinuous stochastic gradients, including, among others, the training of neural networks with ReLU activation function. To illustrate the applicability of our main results, we consider numerical experiments on quantile estimation and on several optimization problems involving ReLU neural networks relevant in finance and artificial intelligence.

Article 1426

Title@2025-05-26 (1): The Missing Point in Vision Transformers for Universal Image Segmentation

Title: The Missing Point in Vision Transformers for Universal Image Segmentation

Der fehlende Punkt in Vision Transformers für die universelle Bildsegmentierung

通用图像分割的愿景变异器中的缺失点 2505.19795v1

Authors: Sajjad Shahabodini, Mobina Mansoori, Farnoush Bayatmakou, Jamshid Abouei, Konstantinos N. Plataniotis, Arash Mohammadi

Image segmentation remains a challenging task in computer vision, demanding robust mask generation and precise classification. Recent mask-based approaches yield high-quality masks by capturing global context. However, accurately classifying these masks, especially in the presence of ambiguous boundaries and imbalanced class distributions, remains an open challenge. In this work, we introduce ViT-P, a novel two-stage segmentation framework that decouples mask generation from classification. The first stage employs a proposal generator to produce class-agnostic mask proposals, while the second stage utilizes a point-based classification model built on the Vision Transformer (ViT) to refine predictions by focusing on mask central points. ViT-P serves as a pre-training-free adapter, allowing the integration of various pre-trained vision transformers without modifying their architecture, ensuring adaptability to dense prediction tasks. Furthermore, we demonstrate that coarse and bounding box annotations can effectively enhance classification without requiring additional training on fine annotation datasets, reducing annotation costs while maintaining strong performance. Extensive experiments across COCO, ADE20K, and Cityscapes datasets validate the effectiveness of ViT-P, achieving state-of-the-art results with 54.0 PQ on ADE20K panoptic segmentation, 87.4 mIoU on Cityscapes semantic segmentation, and 63.6 mIoU on ADE20K semantic segmentation. The code and pretrained models are available at: https://github.com/sajjad-sh33/ViT-P.

Article 1427

Title@2025-05-26 (1): What Can RL Bring to VLA Generalization? An Empirical Study

Title: What Can RL Bring to VLA Generalization? An Empirical Study

Was kann RL zur VLA-Verallgemeinerung bringen? Eine empirische Studie

RL能带给VLA的概括化带来什么?经验研究。 2505.19789v1

Authors: Jijia Liu, Feng Gao, Bingwen Wei, Xinlei Chen, Qingmin Liao, Yi Wu, Chao Yu, Yu Wang

Large Vision-Language Action (VLA) models have shown significant potential for embodied AI. However, their predominant training via supervised fine-tuning (SFT) limits generalization due to susceptibility to compounding errors under distribution shifts. Reinforcement learning (RL) offers a path to overcome these limitations by optimizing for task objectives via trial-and-error, yet a systematic understanding of its specific generalization benefits for VLAs compared to SFT is lacking. To address this, our study introduces a comprehensive benchmark for evaluating VLA generalization and systematically investigates the impact of RL fine-tuning across diverse visual, semantic, and execution dimensions. Our extensive experiments reveal that RL fine-tuning, particularly with PPO, significantly enhances generalization in semantic understanding and execution robustness over SFT, while maintaining comparable visual robustness. We identify PPO as a more effective RL algorithm for VLAs than LLM-derived methods like DPO and GRPO. We also develop a simple recipe for efficient PPO training on VLAs, and demonstrate its practical utility for improving VLA generalization. The project page is at https://rlvla.github.io

Article 1428

Title@2025-05-26 (1): MedDreamer: Model-Based Reinforcement Learning with Latent Imagination on Complex EHRs for Clinical Decision Support

Title: MedDreamer: Model-Based Reinforcement Learning with Latent Imagination on Complex EHRs for Clinical Decision Support

MedDreamer: Modellbasiertes Verstärkungslernen mit latenter Imagination auf komplexen EHRs für die klinische Entscheidungsunterstützung

Medreamer:以模型为基础的强化学习,对临床决定支助的复杂电子人力资源进行中层想象 2505.19785v1

Authors: Qianyi Xu, Gousia Habib, Dilruk Perera, Mengling Feng

Timely and personalized treatment decisions are essential across a wide range of healthcare settings where patient responses vary significantly and evolve over time. Clinical data used to support these decisions are often irregularly sampled, sparse, and noisy. Existing decision support systems commonly rely on discretization and imputation, which can distort critical temporal dynamics and degrade decision quality. Moreover, they often overlook the clinical significance of irregular recording frequencies, filtering out patterns in how and when data is collected. Reinforcement Learning (RL) is a natural fit for clinical decision-making, enabling sequential, long-term optimization in dynamic, uncertain environments. However, most existing treatment recommendation systems are model-free and trained solely on offline data, making them sample-inefficient, sensitive to data quality, and poorly generalizable across tasks or cohorts. To address these limitations, we propose MedDreamer, a two-phase model-based RL framework for personalized treatment recommendation. MedDreamer uses a world model with an Adaptive Feature Integration (AFI) module to effectively model irregular, sparse clinical data. Through latent imagination, it simulates plausible patient trajectories to enhance learning, refining its policy using a mix of real and imagined experiences. This enables learning policies that go beyond suboptimal historical decisions while remaining grounded in clinical data. To our knowledge, this is the first application of latent imagination to irregular healthcare data. Evaluations on sepsis and mechanical ventilation (MV) treatment using two large-scale EHR datasets show that MedDreamer outperforms both model-free and model-based baselines in clinical outcomes and off-policy metrics.

Article 1429

Title@2025-05-26 (1): Out-of-distribution Reject Option Method for Dataset Shift Problem in Early Disease Onset Prediction

Title: Out-of-distribution Reject Option Method for Dataset Shift Problem in Early Disease Onset Prediction

Out-of-Distribution Ablehnung der Option Methode für Datensatz Verschiebung Problem bei Früherkrankungen Beginn Vorhersage

用于早期疾病上移预测中数据集移位问题的不分发拒绝选项方法 2405.19864v2

Authors: Taisei Tosaki, Eiichiro Uchino, Ryosuke Kojima, Yohei Mineharu, Yuji Okamoto, Mikio Arita, Nobuyuki Miyai, Yoshinori Tamada, Tatsuya Mikami, Koichi Murashita, Shigeyuki Nakaji, Yasushi Okuno

Machine learning is increasingly used to predict lifestyle-related disease onset using health and medical data. However, its predictive accuracy in practice is often hindered by dataset shift, which refers to discrepancies in data distribution between the training and testing datasets. This issue leads to the misclassification of out-of-distribution (OOD) data. To diminish the effect of dataset shift in real-world settings, this paper proposes the out-of-distribution reject option for prediction (ODROP). This method integrates an OOD detection model to preclude OOD data from the prediction phase. We used two real-world health checkup datasets (Hirosaki and Wakayama) with dataset shift, across three disease onset prediction tasks: diabetes, dyslipidemia, and hypertension. Both components of the ODROP method – the OOD detection model and the prediction model – were trained on the Hirosaki dataset. We assessed the effectiveness of ODROP on the Wakayama dataset using AUROC-rejection rate curve plots. Among the five OOD detection approaches (the variational autoencoder, neural network ensemble std, neural network ensemble epistemic, neural network energy, and neural network gaussian mixture based energy measurement), the variational autoencoder method demonstrated notably higher stability and a greater improvement in AUROC. For example, in the Wakayama dataset, the AUROC for diabetes onset increased from 0.80 without ODROP to 0.90 at a 31.1% rejection rate, and for dyslipidemia, it improved from 0.70 without ODROP to 0.76 at a 34% rejection rate. In addition, we categorized dataset shifts into two types using SHAP clustering – those that considerably affect predictions and those that do not. This study is the first to apply OOD detection to actual health and medical data, demonstrating its potential to substantially improve the accuracy and reliability of disease prediction models amidst dataset shift.
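
One point on an AUROC-rejection rate curve can be computed as in the sketch below: reject the fraction of test samples with the highest OOD scores, then evaluate AUROC on what remains. The labels, prediction scores, and OOD scores are made-up toy values, and the function names are hypothetical; the paper's OOD detectors (for example the variational autoencoder) are not reproduced here.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def keep_after_rejection(ood_scores, rejection_rate):
    """Return sorted indices kept after rejecting the highest-OOD-score fraction."""
    n_reject = int(len(ood_scores) * rejection_rate)
    order = sorted(range(len(ood_scores)), key=lambda i: ood_scores[i])
    return sorted(order[: len(ood_scores) - n_reject])


# Toy data: true onset labels, model prediction scores, and OOD scores per sample.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
pred_scores = [0.9, 0.2, 0.8, 0.3, 0.1, 0.7, 0.6, 0.4]
ood_scores = [0.1, 0.2, 0.1, 0.3, 0.9, 0.8, 0.2, 0.3]

kept = keep_after_rejection(ood_scores, rejection_rate=0.25)
auroc_all = auroc(labels, pred_scores)
auroc_kept = auroc([labels[i] for i in kept], [pred_scores[i] for i in kept])
```

Sweeping `rejection_rate` over a grid and plotting the resulting AUROC values would give the kind of curve the abstract refers to.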

Article 1430

Title@2025-05-26 (1): Mol-LLM: Multimodal Generalist Molecular LLM with Improved Graph Utilization

Title: Mol-LLM: Multimodal Generalist Molecular LLM with Improved Graph Utilization

Mol-LLM: Multimodaler Generalist Molecular LLM mit verbesserter Graphenverwendung

Mol-LLM:利用改进图表的多式通用主义分子有限力M 2502.02810v2

Authors: Chanhui Lee, Hanbum Ko, Yuheon Song, YongJun Jeong, Rodrigo Hormazabal, Sehui Han, Kyunghoon Bae, Sungbin Lim, Sungwoong Kim

Recent advances in large language models (LLMs) have led to models that tackle diverse molecular tasks, such as chemical reaction prediction and molecular property prediction. Large-scale molecular instruction-tuning datasets have enabled sequence-only (e.g., SMILES or SELFIES) generalist molecular LLMs, and researchers are now exploring multimodal approaches that incorporate molecular structural information for further gains. However, a genuinely multimodal, generalist LLM that covers a broad spectrum of molecular tasks has yet to be fully investigated. We observe that naive next token prediction training ignores graph-structural information, limiting an LLM’s ability to exploit molecular graphs. To address this, we propose (i) Molecular structure Preference Optimization (MolPO), which facilitates graph usage by optimizing preferences between pairs of correct and perturbed molecular structures, and (ii) an advanced graph encoder with a tailored pre-training strategy to improve the effect of graph utilization by MolPO. Building on these contributions, we introduce Mol-LLM, the first multimodal generalist model that (a) handles a broad spectrum of molecular tasks among molecular LLMs, (b) explicitly leverages molecular-structure information, and (c) takes advantage of extensive instruction tuning. Mol-LLM attains state-of-the-art or comparable results across the most comprehensive molecular-LLM benchmark, even on out-of-distribution datasets for reaction and property prediction, where it surpasses prior generalist molecular LLMs by a large margin.

Article 1431

Title@2025-05-26 (1): Advancements in Medical Image Classification through Fine-Tuning Natural Domain Foundation Models

Title: Advancements in Medical Image Classification through Fine-Tuning Natural Domain Foundation Models

Fortschritte bei der Klassifikation medizinischer Bilder durch Modelle der Fine-Tuning Natural Domain Foundation

通过精美开发自然域基金会模型提高医学图像分类 2505.19779v1

Authors: Mobina Mansoori, Sajjad Shahabodini, Farnoush Bayatmakou, Jamshid Abouei, Konstantinos N. Plataniotis, Arash Mohammadi

Foundation models are large-scale models, pre-trained on massive datasets, that perform a wide range of tasks. These models have shown consistently improved results with the introduction of new methods. It is crucial to analyze how these trends impact the medical field and determine whether these advancements can drive meaningful change. This study investigates the application of recent state-of-the-art foundation models, DINOv2, MAE, VMamba, CoCa, SAM2, and AIMv2, for medical image classification. We explore their effectiveness on datasets including CBIS-DDSM for mammography, ISIC2019 for skin lesions, APTOS2019 for diabetic retinopathy, and CHEXPERT for chest radiographs. By fine-tuning these models and evaluating their configurations, we aim to understand the potential of these advancements in medical image classification. The results indicate that these advanced models significantly enhance classification outcomes, demonstrating robust performance despite limited labeled data. Based on our results, AIMv2, DINOv2, and SAM2 models outperformed others, demonstrating that progress in natural domain training has positively impacted the medical domain and improved classification outcomes. Our code is publicly available at: https://github.com/sajjad-sh33/Medical-Transfer-Learning.

Article 1432

Title@2025-05-26 (1): Query Performance Prediction using Relevance Judgments Generated by Large Language Models

Title: Query Performance Prediction using Relevance Judgments Generated by Large Language Models

Abfrage der Leistungsvorhersage anhand von Relevanzurteilen, die von großen Sprachmodellen erzeugt werden

使用大语言模型产生的相关性判断的查询性绩效预测 2404.01012v3

Authors: Chuan Meng, Negar Arabzadeh, Arian Askari, Mohammad Aliannejadi, Maarten de Rijke

Query performance prediction (QPP) aims to estimate the retrieval quality of a search system for a query without human relevance judgments. Previous QPP methods typically return a single scalar value and do not require the predicted values to approximate a specific information retrieval (IR) evaluation measure, leading to certain drawbacks: (i) a single scalar is insufficient to accurately represent different IR evaluation measures, especially when metrics do not highly correlate, and (ii) a single scalar limits the interpretability of QPP methods because solely using a scalar is insufficient to explain QPP results. To address these issues, we propose a QPP framework using automatically generated relevance judgments (QPP-GenRE), which decomposes QPP into independent subtasks of predicting the relevance of each item in a ranked list to a given query. This allows us to predict any IR evaluation measure using the generated relevance judgments as pseudo-labels. This also allows us to interpret predicted IR evaluation measures, and identify, track and rectify errors in generated relevance judgments to improve QPP quality. We predict an item’s relevance by using open-source large language models (LLMs) to ensure scientific reproducibility. We face two main challenges: (i) excessive computational costs of judging an entire corpus for predicting a metric considering recall, and (ii) limited performance in prompting open-source LLMs in a zero-/few-shot manner. To solve the challenges, we devise an approximation strategy to predict an IR measure considering recall and propose to fine-tune open-source LLMs using human-labeled relevance judgments. Experiments on the TREC 2019 to 2022 deep learning tracks and CAsT-19 and 20 datasets show that QPP-GenRE achieves state-of-the-art QPP quality for both lexical and neural rankers.
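
The core step described above, turning per-item relevance judgments into a predicted IR measure, can be sketched as follows. The judgments here are a hard-coded list standing in for the output of an LLM judge, and the function names are hypothetical; only precision-oriented measures are shown, since recall-oriented measures need the approximation strategy mentioned in the abstract.

```python
def precision_at_k(judgments, k):
    """Fraction of the top-k ranked items judged relevant (1 = relevant, 0 = not)."""
    if k <= 0:
        return 0.0
    return sum(judgments[:k]) / k


def reciprocal_rank(judgments):
    """1 / rank of the first item judged relevant, or 0.0 if none is relevant."""
    for rank, relevant in enumerate(judgments, start=1):
        if relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical generated judgments for the top 5 items of one ranked list.
generated_judgments = [0, 1, 1, 0, 1]
predicted_p_at_5 = precision_at_k(generated_judgments, 5)
predicted_rr = reciprocal_rank(generated_judgments)
```

Because the prediction is built from item-level pseudo-labels, an incorrect predicted score can be traced back to the specific judgments that caused it.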

Article 1433

Title@2025-05-26 (1): Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO

Title: Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO

Verständnis der Leistungslücke im Preference Learning: Eine Dichotomie von RLHF und DPO

了解优先学习方面的绩效差距:RLHF和DPO的二分切开术 2505.19770v1

Authors: Ruizhe Shi, Minhak Song, Runlong Zhou, Zihan Zhang, Maryam Fazel, Simon S. Du

We present a fine-grained theoretical analysis of the performance gap between reinforcement learning from human feedback (RLHF) and direct preference optimization (DPO) under a representation gap. Our study decomposes this gap into two sources: an explicit representation gap under exact optimization and an implicit representation gap under finite samples. In the exact optimization setting, we characterize how the relative capacities of the reward and policy model classes influence the final policy qualities. We show that RLHF, DPO, or online DPO can outperform one another depending on the type of model mis-specifications. Notably, online DPO can outperform both RLHF and standard DPO when the reward and policy model classes are isomorphic and both mis-specified. In the approximate optimization setting, we provide a concrete construction where the ground-truth reward is implicitly sparse and show that RLHF requires significantly fewer samples than DPO to recover an effective reward model – highlighting a statistical advantage of two-stage learning. Together, these results provide a comprehensive understanding of the performance gap between RLHF and DPO under various settings, and offer practical insights into when each method is preferred.


Article 1434

Title@2025-05-26 (1): Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases

Title: Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases

Diff-Def: Diffusionsgenerierte Deformationsfelder für Bedingte Atlase

Diff-Def: 用于条件图集的扩散生成形变场 2403.16776v2

Authors: Sophie Starck, Vasiliki Sideri-Lampretsa, Bernhard Kainz, Martin J. Menten, Tamara T. Mueller, Daniel Rueckert

Anatomical atlases are widely used for population studies and analysis. Conditional atlases target a specific sub-population defined via certain conditions, such as demographics or pathologies, and allow for the investigation of fine-grained anatomical differences like morphological changes associated with ageing or disease. Existing approaches use either registration-based methods that are often unable to handle large anatomical variations or generative adversarial models, which are challenging to train since they can suffer from training instabilities. Instead of generating atlases directly as intensities, we propose using latent diffusion models to generate deformation fields, which transform a general population atlas into one representing a specific sub-population. Our approach ensures structural integrity, enhances interpretability and avoids hallucinations that may arise during direct image synthesis by generating this deformation field and regularising it using a neighbourhood of images. We compare our method to several state-of-the-art atlas generation methods using brain MR images from the UK Biobank. Our method generates highly realistic atlases with smooth transformations and high anatomical fidelity, outperforming existing baselines. We demonstrate the quality of these atlases through comprehensive evaluations, including quantitative metrics for anatomical accuracy, perceptual similarity, and qualitative analyses displaying the consistency and realism of the generated atlases.


Article 1435

Title@2025-05-26 (1): Agentic Predictor: Performance Prediction for Agentic Workflows via Multi-View Encoding

Title: Agentic Predictor: Performance Prediction for Agentic Workflows via Multi-View Encoding

Agentic Predictor: Leistungsvorhersage für Agentic Workflows über Multi-View-Encoding

AG 预测员:通过多查看编码对AG-工作流程的性能预测 2505.19764v1

Authors: Patara Trirat, Wonyong Jeong, Sung Ju Hwang

Large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks, but optimizing LLM-based agentic systems remains challenging due to the vast search space of agent configurations, prompting strategies, and communication patterns. Existing approaches often rely on heuristic-based tuning or exhaustive evaluation, which can be computationally expensive and suboptimal. This paper proposes Agentic Predictor, a lightweight predictor for efficient agentic workflow evaluation. Agentic Predictor is equipped with a multi-view workflow encoding technique that leverages multi-view representation learning of agentic systems by incorporating code architecture, textual prompts, and interaction graph features. To achieve high predictive accuracy while significantly reducing the number of required workflow evaluations for training a predictor, Agentic Predictor employs cross-domain unsupervised pretraining. By learning to approximate task success rates, Agentic Predictor enables fast and accurate selection of optimal agentic workflow configurations for a given task, significantly reducing the need for expensive trial-and-error evaluations. Experiments on a carefully curated benchmark spanning three domains show that our predictor outperforms state-of-the-art methods in both predictive accuracy and workflow utility, highlighting the potential of performance predictors in streamlining the design of LLM-based agentic workflows.


Article 1436

Title@2025-05-26 (1): Unfolding AlphaFold’s Bayesian Roots in Probability Kinematics

Title: Unfolding AlphaFold’s Bayesian Roots in Probability Kinematics

AlphaFolds Bayesische Wurzeln in der Wahrscheinlichkeitskinematik entfalten

将 AlphaFold 的贝叶根在概率 Kinematics 中卸载 2505.19763v1

Authors: Thomas Hamelryck, Kanti V. Mardia

We present a novel theoretical interpretation of AlphaFold1. The seminal breakthrough of AlphaFold1 in protein structure prediction by deep learning relied on a learned potential energy function, in contrast to the later end-to-end architectures of AlphaFold2 and AlphaFold3. While this potential was originally justified by referring to physical potentials of mean force (PMFs), we reinterpret AlphaFold1’s potential as an instance of probability kinematics - also known as Jeffrey conditioning - a principled but underrecognised generalization of conventional Bayesian updating. Probability kinematics accommodates uncertain or soft evidence in the form of updated probabilities over a partition. This perspective reveals AlphaFold1’s potential as a form of generalized Bayesian updating, rather than a thermodynamic potential. To confirm our probabilistic framework’s scope and precision, we analyze a synthetic 2D model in which an angular random walk prior is updated with evidence on distances via probability kinematics, mirroring AlphaFold1’s approach. This theoretical contribution connects AlphaFold1 to a broader class of well-justified Bayesian methods, allowing precise quantification, surpassing merely qualitative heuristics based on PMFs. More broadly, given the achievements of AlphaFold1, probability kinematics holds considerable promise for probabilistic deep learning, as it allows for the formulation of complex models from a few simpler components.
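
For readers unfamiliar with probability kinematics, the sketch below shows the textbook Jeffrey conditioning rule on a small discrete joint distribution: the conditionals given each partition cell are kept fixed while the marginal over the partition is replaced by the new, uncertain evidence. The numbers and the function name are made up for illustration and are not taken from the paper.

```python
import numpy as np


def jeffrey_update(joint: np.ndarray, new_marginal: np.ndarray) -> np.ndarray:
    """Jeffrey conditioning on a discrete joint distribution.

    Rows index the partition cells E_i, columns index outcomes of A.
    The update keeps P(A | E_i) fixed and sets P_new(E_i) = new_marginal[i],
    so P_new(A) = sum_i P(A | E_i) * new_marginal[i].
    """
    old_marginal = joint.sum(axis=1, keepdims=True)  # P(E_i)
    conditional = joint / old_marginal               # P(A | E_i)
    return conditional * new_marginal[:, None]


# Illustrative prior joint over (E_i, A_j); rows sum to P(E) = [0.4, 0.6].
prior = np.array([[0.30, 0.10],
                  [0.20, 0.40]])
# Soft evidence: updated probabilities over the partition.
q = np.array([0.7, 0.3])

posterior = jeffrey_update(prior, q)
```

When `q` puts all its mass on a single cell, the rule reduces to ordinary Bayesian conditioning on that cell, which is the sense in which it generalizes conventional updating.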


Article 1437

Title@2025-05-26 (1): In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement

Title: In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement

In-Context-Demonstrationsfragen: Zur Prompt-Optimierung für Pseudo-Supervision-Verfeinerung

内文示范事项:关于Pseudo-监督改进的迅速优化 2410.03124v2

Authors: Zhen-Yu Zhang, Jiandong Zhang, Huaxiu Yao, Gang Niu, Masashi Sugiyama

Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality. Most existing methods rely on human supervision or parameter retraining, both of which are costly in terms of data collection and computational resources. To handle these challenges, a direct solution is to generate “high-confidence” data from unsupervised downstream tasks and use them for in-context prompting or prompt optimization to refine the pseudo-supervision. However, relying solely on such data may lead to overfitting. In this paper, we leverage the in-context learning (ICL) abilities of LLMs and propose a novel approach, the pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision. The proposed learning objective ensures that the optimized prompt guides the LLM to generate consistent responses for a given input when pseudo-supervised data from the downstream task are used as demonstrations, enabling refinement over the entire pseudo-supervision. The prompt is optimized by translating gradient signals into textual critiques, which serve as feedback to iteratively refine the prompt and model responses. Theoretical analysis in a simplified classification setting shows that the refined pseudo-supervision exhibits a geometric clustering structure, helping to mitigate overfitting. Experiments on question answering, natural language inference benchmarks, and a real-world molecule optimization task show the effectiveness of the proposed algorithm.


Article 1438

Title@2025-05-26 (1): Semantic-Aware Interpretable Multimodal Music Auto-Tagging

Title: Semantic-Aware Interpretable Multimodal Music Auto-Tagging

Semantic-Aware Interpretierbare multimodale Musik Auto-Tagging

解析多式音乐自动调制 2505.17233v2

Authors: Andreas Patakis, Vassilis Lyberatos, Spyridon Kantarelis, Edmund Dervakos, Giorgos Stamou

Music auto-tagging is essential for organizing and discovering music in extensive digital libraries. While foundation models achieve exceptional performance in this domain, their outputs often lack interpretability, limiting trust and usability for researchers and end-users alike. In this work, we present an interpretable framework for music auto-tagging that leverages groups of musically meaningful multimodal features, derived from signal processing, deep learning, ontology engineering, and natural language processing. To enhance interpretability, we cluster features semantically and employ an expectation maximization algorithm, assigning distinct weights to each group based on its contribution to the tagging process. Our method achieves competitive tagging performance while offering a deeper understanding of the decision-making process, paving the way for more transparent and user-centric music tagging systems.


Article 1439

Title@2025-05-26 (1): CIDRe: A Reference-Free Multi-Aspect Criterion for Code Comment Quality Measurement

Title: CIDRe: A Reference-Free Multi-Aspect Criterion for Code Comment Quality Measurement

CIDRe: Ein referenzfreies Multi-Aspekt-Kriterium für die Qualitätsmessung von Code Comment

CIDRe: 守则评论质量衡量的无参考性、无参考性、多特征的多标准标准 2505.19757v1

Authors: Maria Dziuba, Valentin Malykh

Effective generation of structured code comments requires robust quality metrics for dataset curation, yet existing approaches (SIDE, MIDQ, STASIS) suffer from limited code-comment analysis. We propose CIDRe, a language-agnostic reference-free quality criterion combining four synergistic aspects: (1) relevance (code-comment semantic alignment), (2) informativeness (functional coverage), (3) completeness (presence of all structure sections), and (4) description length (detail sufficiency). We validate our criterion on a manually annotated dataset. Experiments demonstrate CIDRe’s superiority over existing metrics, achieving improvement in cross-entropy evaluation. When applied to filter comments, the models finetuned on CIDRe-filtered data show statistically significant quality gains in GPT-4o-mini assessments.


Article 1440

Title@2025-05-26 (1): Discrete Markov Bridge

Title: Discrete Markov Bridge

Diskretierte Markov-Brücke

分立马尔科夫桥 2505.19752v1

Authors: Hengli Li, Yuxuan Wang, Song-Chun Zhu, Ying Nian Wu, Zilong Zheng

Discrete diffusion has recently emerged as a promising paradigm in discrete data modeling. However, existing methods typically rely on a fixed rate transition matrix during training, which not only limits the expressiveness of latent representations, a fundamental strength of variational methods, but also constrains the overall design space. To address these limitations, we propose Discrete Markov Bridge, a novel framework specifically designed for discrete representation learning. Our approach is built upon two key components: Matrix Learning and Score Learning. We conduct a rigorous theoretical analysis, establishing formal performance guarantees for Matrix Learning and proving the convergence of the overall framework. Furthermore, we analyze the space complexity of our method, addressing practical constraints identified in prior studies. Extensive empirical evaluations validate the effectiveness of the proposed Discrete Markov Bridge, which achieves an Evidence Lower Bound (ELBO) of 1.38 on the Text8 dataset, outperforming established baselines. Moreover, the proposed model demonstrates competitive performance on the CIFAR-10 dataset, achieving results comparable to those obtained by image-specific generation approaches.


Article 1441

Title@2025-05-26 (1): Machine Learning Algorithm for Noise Reduction and Disease-Causing Gene Feature Extraction in Gene Sequencing Data

Title: Machine Learning Algorithm for Noise Reduction and Disease-Causing Gene Feature Extraction in Gene Sequencing Data

Maschinelles Lernen Algorithmen zur Lärmreduzierung und krankheitsverursachende Gen-Feature-Extraktion in Gensequenzierungsdaten

用于减少噪音和在基因测序数据中进行疾病传播的基因特征采掘的机器学习算法 2505.19740v1

Authors: Weichen Si, Yihao Ou, Zhen Tian

In this study, we propose a machine learning-based method for noise reduction and disease-causing gene feature extraction in gene sequencing data. The DeepSeqDenoise algorithm combines a CNN and an RNN to effectively remove sequencing noise, improving the signal-to-noise ratio by 9.4 dB. We screened 17 key features through feature engineering and constructed an integrated learning model that predicts disease-causing genes with 94.3% accuracy. We successfully identified 57 new candidate disease-causing genes in a cardiovascular disease cohort validation, and detected 3 missed variants in clinical applications. The method significantly outperforms existing tools and provides strong support for accurate diagnosis of genetic diseases.


Article 1442

Title@2025-05-26 (1): Weighted Leave-One-Out Cross Validation

Title: Weighted Leave-One-Out Cross Validation

Gewichtete Leave-One-Out Cross-Validierung

加权请假一次性离职后交叉验证 2505.19737v1

Authors: Luc Pronzato, Maria-João Rendas

We present a weighted version of Leave-One-Out (LOO) cross-validation for estimating the Integrated Squared Error (ISE) when approximating an unknown function by a predictor that depends linearly on evaluations of the function over a finite collection of sites. The method relies on the construction of the best linear estimator of the squared prediction error at an arbitrary unsampled site based on squared LOO residuals, assuming that the function is a realization of a Gaussian Process (GP). A theoretical analysis of performance of the ISE estimator is presented, and robustness with respect to the choice of the GP kernel is investigated first analytically, then through numerical examples. Overall, the estimation of ISE is significantly more precise than with classical, unweighted, LOO cross validation. Application to model selection is briefly considered through examples.


Article 1443

Title@2025-05-26 (1): Using Time Structure to Estimate Causal Effects

Title: Using Time Structure to Estimate Causal Effects

Zeitstruktur zur Schätzung von Kausalitätseffekten verwenden

利用时间结构估计因果关系 2504.11076v2

Authors: Tom Hochsprung, Jakob Runge, Andreas Gerhardus

There exist several approaches for estimating causal effects in time series when latent confounding is present. Many of these approaches rely on additional auxiliary observed variables or time series such as instruments, negative controls or time series that satisfy the front- or backdoor criterion in certain graphs. In this paper, we present a novel approach for estimating direct (and via Wright’s path rule total) causal effects in a time series setup which does not rely on additional auxiliary observed variables or time series. This approach assumes that the underlying time series is a Structural Vector Autoregressive (SVAR) process and estimates direct causal effects by solving certain linear equation systems made up of different covariances and model parameters. We state sufficient graphical criteria in terms of the so-called full time graph under which these linear equation systems are uniquely solvable and under which their solutions contain the to-be-identified direct causal effects as components. We also state sufficient lag-based criteria under which the previously mentioned graphical conditions are satisfied and, thus, under which direct causal effects are identifiable. Several numerical experiments underline the correctness and applicability of our results.


Article 1444

Title@2025-05-26 (1): Accelerating Nash Learning from Human Feedback via Mirror Prox

Title: Accelerating Nash Learning from Human Feedback via Mirror Prox

Beschleunigendes Nash-Lernen aus menschlichem Feedback über Spiegelprox

通过镜像Prox从人类反馈中加快学习 2505.19731v1

Authors: Daniil Tiapkin, Daniele Calandriello, Denis Belomestny, Eric Moulines, Alexey Naumov, Kashif Rasul, Michal Valko, Pierre Menard

Traditional Reinforcement Learning from Human Feedback (RLHF) often relies on reward models, frequently assuming preference structures like the Bradley-Terry model, which may not accurately capture the complexities of real human preferences (e.g., intransitivity). Nash Learning from Human Feedback (NLHF) offers a more direct alternative by framing the problem as finding a Nash equilibrium of a game defined by these preferences. In this work, we introduce Nash Mirror Prox ($\mathtt{Nash-MP}$), an online NLHF algorithm that leverages the Mirror Prox optimization scheme to achieve fast and stable convergence to the Nash equilibrium. Our theoretical analysis establishes that Nash-MP exhibits last-iterate linear convergence towards the $\beta$-regularized Nash equilibrium. Specifically, we prove that the KL-divergence to the optimal policy decreases at a rate of order $(1+2\beta)^{-N/2}$, where $N$ is the number of preference queries. We further demonstrate last-iterate linear convergence for the exploitability gap and uniformly for the span semi-norm of log-probabilities, with all these rates being independent of the size of the action space. Furthermore, we propose and analyze an approximate version of Nash-MP where proximal steps are estimated using stochastic policy gradients, making the algorithm closer to applications. Finally, we detail a practical implementation strategy for fine-tuning large language models and present experiments that demonstrate its competitive performance and compatibility with existing methods.
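
To make the underlying optimization scheme concrete, here is a minimal sketch of generic entropic Mirror Prox on a small zero-sum matrix game. It is not Nash-MP: it omits the $\beta$-regularization and the preference model described in the abstract, and the game matrix, step size, and function names are assumptions chosen for illustration. The key feature is the two-step update, where an extrapolated point is computed first and its gradient is then applied from the original point.

```python
import numpy as np


def mirror_prox_matrix_game(A, x0, y0, eta=0.1, steps=2000):
    """Entropic Mirror Prox for min_x max_y x^T A y over probability simplices.

    Returns the averages of the extrapolated (half-step) iterates.
    """
    def mwu(p, grad, sign):
        # Multiplicative-weights step: the mirror step for the entropy map.
        w = p * np.exp(sign * eta * grad)
        return w / w.sum()

    x, y = x0.copy(), y0.copy()
    x_avg, y_avg = np.zeros_like(x), np.zeros_like(y)
    for _ in range(steps):
        # Extrapolation step: gradients evaluated at the current point.
        x_half = mwu(x, A @ y, -1.0)
        y_half = mwu(y, A.T @ x, +1.0)
        # Update step: gradients evaluated at the extrapolated point,
        # applied from the current point.
        x = mwu(x, A @ y_half, -1.0)
        y = mwu(y, A.T @ x_half, +1.0)
        x_avg += x_half / steps
        y_avg += y_half / steps
    return x_avg, y_avg


def duality_gap(A, x, y):
    """Exploitability of (x, y): best response value of y minus that of x."""
    return float(np.max(A.T @ x) - np.min(A @ y))


# Rock-paper-scissors payoff matrix (illustrative); its equilibrium is uniform.
A = np.array([[0.0, 1.0, -1.0],
              [-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0]])
x0 = np.array([0.8, 0.1, 0.1])
y0 = np.array([0.1, 0.1, 0.8])

gap_start = duality_gap(A, x0, y0)
x_avg, y_avg = mirror_prox_matrix_game(A, x0, y0)
gap_end = duality_gap(A, x_avg, y_avg)
```

The sketch averages iterates because that is the classical guarantee for the unregularized game; the abstract's last-iterate linear rate is a result about the regularized Nash-MP setting.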


Article 1445

Title@2025-05-26 (1): Stuffed Mamba: Oversized States Lead to the Inability to Forget

Title: Stuffed Mamba: Oversized States Lead to the Inability to Forget

Gefüllte Mamba: Übergroße Staaten führen zu der Unfähigkeit zu vergessen

马姆巴:国家规模过大,导致无法忘却 2410.07145v2

Authors: Yingfa Chen, Xinrong Zhang, Shengding Hu, Xu Han, Zhiyuan Liu, Maosong Sun

Recent advancements in recurrent architectures, such as Mamba and RWKV, have showcased strong language capabilities. Unlike transformer-based models, these architectures encode all contextual information into a fixed-size state, leading to great inference efficiency. However, this approach can cause information interference, where different token data conflicts, resulting in performance degradation and incoherent outputs beyond a certain context length. To prevent this, most RNNs incorporate mechanisms designed to “forget” earlier tokens. In this paper, we reveal that Mamba-based models struggle to effectively forget earlier tokens even with built-in forgetting mechanisms. We demonstrate that this issue stems from training on contexts that are too short for the state size, enabling the model to perform well without needing to learn how to forget. Then, we show that the minimum training length required for the model to learn forgetting scales linearly with the state size, and the maximum context length for accurate retrieval of a 5-digit passkey scales exponentially with the state size, indicating that the model retains some information beyond the point where forgetting begins. These findings highlight a critical limitation in current RNN architectures and provide valuable insights for improving long-context modeling. Our work suggests that future RNN designs must account for the interplay between state size, training length, and forgetting mechanisms to achieve robust performance in long-context tasks.


Article 1446

Title@2025-05-26 (1): A Structured Tour of Optimization with Finite Differences

Title: A Structured Tour of Optimization with Finite Differences

Eine strukturierte Tour der Optimierung mit endlichen Unterschieden

结构化优化与有限差异旅游 2505.19720v1

Authors: Marco Rando, Cesare Molinari, Lorenzo Rosasco, Silvia Villa

Finite-difference methods are widely used for zeroth-order optimization in settings where gradient information is unavailable or expensive to compute. These procedures mimic first-order strategies by approximating gradients through function evaluations along a set of random directions. From a theoretical perspective, recent studies indicate that imposing structure (such as orthogonality) on the chosen directions allows for the derivation of convergence rates comparable to those achieved with unstructured random directions (i.e., directions sampled independently from a distribution). Empirically, although structured directions are expected to enhance performance, they often introduce additional computational costs, which can limit their applicability in high-dimensional settings. In this work, we examine the impact of structured direction selection in finite-difference methods. We review and extend several strategies for constructing structured direction matrices and compare them with unstructured approaches in terms of computational cost, gradient approximation quality, and convergence behavior. Our evaluation spans both synthetic tasks and real-world applications such as adversarial perturbation. The results demonstrate that structured directions can be generated with computational costs comparable to unstructured ones while significantly improving gradient estimation accuracy and optimization performance.
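
The contrast between structured and unstructured directions can be sketched as follows: a forward finite-difference gradient surrogate built from orthonormal directions obtained by a QR factorization of a Gaussian matrix. This is one common construction, assumed here for illustration; the function name, the `d / num_dirs` scaling, and the test function are not taken from the paper.

```python
import numpy as np


def structured_fd_gradient(f, x, num_dirs, h=1e-6, rng=None):
    """Forward finite-difference gradient surrogate along orthonormal directions.

    Directions are the columns of the Q factor of a (d, num_dirs) Gaussian
    matrix, so they are mutually orthogonal with unit norm (num_dirs <= d).
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    Q, _ = np.linalg.qr(rng.standard_normal((d, num_dirs)))
    fx = f(x)
    g = np.zeros(d)
    for i in range(num_dirs):
        p = Q[:, i]
        # Directional derivative estimate along p, projected back onto p.
        g += (f(x + h * p) - fx) / h * p
    return (d / num_dirs) * g


def f(x):
    # Illustrative smooth test function with gradient 2x + [3, 0, 0, 0].
    return float(np.sum(x ** 2) + 3.0 * x[0])


x = np.array([1.0, -2.0, 0.5, 4.0])
true_grad = 2.0 * x + np.array([3.0, 0.0, 0.0, 0.0])
estimate = structured_fd_gradient(f, x, num_dirs=4, rng=np.random.default_rng(0))
```

With `num_dirs` equal to the dimension, the orthonormal directions span the whole space and the surrogate matches the gradient up to discretization error; with fewer directions it is a scaled projection of the gradient onto a random subspace.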


Article 1447

Title@2025-05-26 (1): OCN: Effectively Utilizing Higher-Order Common Neighbors for Better Link Prediction

Title: OCN: Effectively Utilizing Higher-Order Common Neighbors for Better Link Prediction

OCN: Höhere Ordnung effektiv nutzen gemeinsame Nachbarn für bessere Link-Vorhersage

OCN:有效利用高端共同邻居改善联系预测 2505.19719v1

Authors: Juntong Wang, Xiyuan Wang, Muhan Zhang

Common Neighbors (CNs) and their higher-order variants are important pairwise features widely used in state-of-the-art link prediction methods. However, existing methods often struggle with the repetition across different orders of CNs and fail to fully leverage their potential. We identify that these limitations stem from two key issues: redundancy and over-smoothing in high-order common neighbors. To address these challenges, we design orthogonalization to eliminate redundancy between different-order CNs and normalization to mitigate over-smoothing. By combining these two techniques, we propose Orthogonal Common Neighbor (OCN), a novel approach that significantly outperforms the strongest baselines by an average of 7.7% on popular link prediction benchmarks. A thorough theoretical analysis is provided to support our method. Ablation studies also verify the effectiveness of our orthogonalization and normalization techniques.


Article 1448

Title@2025-05-26 (1): Graceful Forgetting in Generative Language Models

Title: Graceful Forgetting in Generative Language Models

Anmutiges Vergessen in generativen Sprachmodellen

在创用语言模型中优雅地忘却 2505.19715v1

Authors: Chunyang Jiang, Chi-min Chan, Yiyang Cai, Yulong Liu, Wei Xue, Yike Guo

Recently, the pretrain-finetune paradigm has become a cornerstone in various deep learning areas. While in general the pre-trained model would promote both effectiveness and efficiency of downstream-task fine-tuning, studies have shown that not all knowledge acquired during pre-training is beneficial. Some of the knowledge may actually bring detrimental effects to the fine-tuning tasks, which is also known as negative transfer. To address this problem, graceful forgetting has emerged as a promising approach. The core principle of graceful forgetting is to enhance the learning plasticity of the target task by selectively discarding irrelevant knowledge. However, this approach remains underexplored in the context of generative language models, and it is often challenging to migrate existing forgetting algorithms to these models due to architecture incompatibility. To bridge this gap, in this paper we propose a novel framework, Learning With Forgetting (LWF), to achieve graceful forgetting in generative language models. With the Fisher Information Matrix weighting the intended parameter updates, LWF computes forgetting confidence to evaluate self-generated knowledge regarding the forgetting task, and consequently, knowledge with high confidence is periodically unlearned during fine-tuning. Our experiments demonstrate that, although thoroughly uncovering the mechanisms of knowledge interaction remains challenging in pre-trained language models, applying graceful forgetting can contribute to enhanced fine-tuning performance.


Article 1449

Title@2025-05-26 (1): MT$^{3}$: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning

Title: MT$^{3}$: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning

MT$^{3}$: Skalierung von MLLM-basierten Textbildmaschinenübersetzungen über Multi-Task-Verstärkungslernen

MT$^{3}$:通过多任务强化学习,扩大基于MLLM的文本图像机翻译 2505.19714v1

Authors: Zhaopeng Feng, Yupu Liang, Shaosheng Cao, Jiayuan Su, Jiahan Ren, Zhe Xu, Yao Hu, Wenxuan Huang, Jian Wu, Zuozhu Liu

Text Image Machine Translation (TIMT), the task of translating textual content embedded in images, is critical for applications in accessibility, cross-lingual information access, and real-world document understanding. However, TIMT remains a complex challenge due to the need for accurate optical character recognition (OCR), robust visual-text reasoning, and high-quality translation, often requiring cascading multi-stage pipelines. Recent advances in large-scale Reinforcement Learning (RL) have improved reasoning in Large Language Models (LLMs) and Multimodal LLMs (MLLMs), but their application to end-to-end TIMT is still underexplored. To bridge this gap, we introduce MT$^{3}$, the first framework to apply Multi-Task RL to MLLMs for end-to-end TIMT. MT$^{3}$ adopts a multi-task optimization paradigm targeting three key sub-skills: text recognition, context-aware reasoning, and translation. It is trained using a novel multi-mixed reward mechanism that adapts rule-based RL strategies to TIMT’s intricacies, offering fine-grained, non-binary feedback across tasks. Furthermore, to facilitate the evaluation of TIMT in authentic cross-cultural and real-world social media contexts, we introduce XHSPost, the first social media TIMT benchmark. Our MT$^{3}$-7B-Zero achieves state-of-the-art results on the latest in-domain MIT-10M benchmark, outperforming strong baselines such as Qwen2.5-VL-72B and InternVL2.5-78B by notable margins across multiple metrics. Additionally, the model shows strong generalization to out-of-distribution language pairs and datasets. In-depth analyses reveal how multi-task synergy, reinforcement learning initialization, curriculum design, and reward formulation contribute to advancing MLLM-driven TIMT.


Article 1450

Title@2025-05-26 (1): On the Relation between Rectified Flows and Optimal Transport

Title: On the Relation between Rectified Flows and Optimal Transport

Über die Beziehung zwischen rektifizierten Strömungen und optimalem Verkehr

纠正性流动与最佳运输之间的关系 2505.19712v1

Authors: Johannes Hertrich, Antonin Chambolle, Julie Delon

This paper investigates the connections between rectified flows, flow matching, and optimal transport. Flow matching is a recent approach to learning generative models by estimating velocity fields that guide transformations from a source to a target distribution. Rectified flow matching aims to straighten the learned transport paths, yielding more direct flows between distributions. Our first contribution is a set of invariance properties of rectified flows and explicit velocity fields. In addition, we also provide explicit constructions and analysis in the Gaussian (not necessarily independent) and Gaussian mixture settings and study the relation to optimal transport. Our second contribution addresses recent claims suggesting that rectified flows, when constrained such that the learned velocity field is a gradient, can yield (asymptotically) solutions to optimal transport problems. We study the existence of solutions for this problem and demonstrate that they only relate to optimal transport under assumptions that are significantly stronger than those previously acknowledged. In particular, we present several counter-examples that invalidate earlier equivalence results in the literature, and we argue that enforcing a gradient constraint on rectified flows is, in general, not a reliable method for computing optimal transport maps.


Article 1451

Title@2025-05-26 (1): Automated Scientific Discovery: From Equation Discovery to Autonomous Discovery Systems

Title: Automated Scientific Discovery: From Equation Discovery to Autonomous Discovery Systems

Automatisierte wissenschaftliche Entdeckung: Von der Gleichungserkundung zu autonomen Entdeckungssystemen

自动科学发现:从赤道发现到自主发现系统 2305.02251v2

Authors: Stefan Kramer, Mattia Cerrato, Jannis Brugger, Sašo Džeroski, Ross King

The paper surveys automated scientific discovery, from equation discovery and symbolic regression to autonomous discovery systems and agents. It discusses the individual approaches from a “big picture” perspective and in context, but also discusses open issues and recent topics like the various roles of deep neural networks in this area, aiding in the discovery of human-interpretable knowledge. Further, we will present closed-loop scientific discovery systems, starting with the pioneering work on the Adam system up to current efforts in fields from material science to astronomy. Finally, we will elaborate on autonomy from a machine learning perspective, but also in analogy to the autonomy levels in autonomous driving. The maximal level, level five, is defined to require no human intervention at all in the production of scientific knowledge. Achieving this is one step towards solving the Nobel Turing Grand Challenge to develop AI Scientists: AI systems capable of making Nobel-quality scientific discoveries highly autonomously at a level comparable, and possibly superior, to the best human scientists by 2050.


Article 1452

Title@2025-05-26 (1): Solving Euler equations with Multiple Discontinuities via Separation-Transfer Physics-Informed Neural Networks

Title: Solving Euler equations with Multiple Discontinuities via Separation-Transfer Physics-Informed Neural Networks

Lösen von Euler-Gleichungen mit mehreren Diskontinuitäten über Separation-Transfer-Physik-informierte Neuronale Netzwerke

通过分离-传输、物理内建神经网络解决多断裂的电动方程式 2505.20361v1

Authors: Chuanxing Wang, Hui Luo, Kai Wang, Guohuai Zhu, Mingxing Luo

Despite the remarkable progress of physics-informed neural networks (PINNs) in scientific computing, they continue to face challenges when solving hydrodynamic problems with multiple discontinuities. In this work, we propose Separation-Transfer Physics Informed Neural Networks (ST-PINNs) to address such problems. By sequentially resolving discontinuities from strong to weak and leveraging transfer learning during training, ST-PINNs significantly reduce the problem complexity and enhance solution accuracy. To the best of our knowledge, this is the first study to apply a PINNs-based approach to the two-dimensional unsteady planar shock refraction problem, offering new insights into the application of PINNs to complex shock-interface interactions. Numerical experiments demonstrate that ST-PINNs more accurately capture sharp discontinuities and substantially reduce solution errors in hydrodynamic problems involving multiple discontinuities.


Article 1453

Title: Future-Oriented Navigation: Dynamic Obstacle Avoidance with One-Shot Energy-Based Multimodal Motion Prediction

Zukunftsorientierte Navigation: Dynamische Hindernisvermeidung mit einer heißen energiebasierten Multimodal-Bewegungsvorhersage

面向未来的导航:以单热能源为基础的多模式动力预测,动态障碍避免动态障碍 2505.00237v2

Authors: Ze Zhang, Georg Hess, Junjie Hu, Emmanuel Dean, Lennart Svensson, Knut Åkesson

This paper proposes an integrated approach for the safe and efficient control of mobile robots in dynamic and uncertain environments. The approach consists of two key steps: one-shot multimodal motion prediction to anticipate motions of dynamic obstacles and model predictive control to incorporate these predictions into the motion planning process. Motion prediction is driven by an energy-based neural network that generates high-resolution, multi-step predictions in a single operation. The prediction outcomes are further utilized to create geometric shapes formulated as mathematical constraints. Instead of treating each dynamic obstacle individually, predicted obstacles are grouped by proximity in an unsupervised way to improve performance and efficiency. The overall collision-free navigation is handled by model predictive control with a specific design for proactive dynamic obstacle avoidance. The proposed approach allows mobile robots to navigate effectively in dynamic environments. Its performance is assessed across various scenarios that represent typical warehouse settings. The results demonstrate that the proposed approach outperforms other existing dynamic obstacle avoidance methods.

Article 1454

Title@2025-05-26 (1): HRP: High-Rank Preheating for Superior LoRA Initialization

Title: HRP: High-Rank Preheating for Superior LoRA Initialization

HRP: Hochanker Vorwärmung für die Superior LoRA Initialisierung

HRP: 高级LORA初始化的高热预热 2502.07739v3

Authors: Yuzhu Chen, Yingjie Wang, Shi Fu, Li Shen, Yongcheng Jing, Xinmei Tian, Dacheng Tao

This paper studies the crucial impact of initialization in Low-Rank Adaptation (LoRA). Through theoretical analysis, we demonstrate that the fine-tuned result of LoRA is highly sensitive to initialization, which is likely to lead to suboptimal low-rank results. This issue can be mitigated by adjusting the initial direction towards the main singular vectors of the target $\Delta W$, which is, however, typically unknown in real-world scenarios. To approximate this initial direction, we propose High-Rank Preheating (HRP), which first trains LoRA with a higher preheating rank for a few steps, then uses the main singular vectors of the derived $BA^\top$ as initialization for the main fine-tuning process. With only a modification in the initial direction, we prove that HRP makes LoRA achieve better fine-tuned results than random initialization in expectation, and the enhancement grows with the preheating rank. We validate our theoretical findings through extensive experiments in various models and tasks, where HRP significantly enhances LoRA’s effectiveness and outperforms other initialization strategies and other LoRA variants.
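
A minimal sketch of the second HRP step as described above: take the top singular vectors of the preheated product $BA^\top$ and use them as a rank-$r$ starting point. The preheating run itself is replaced here by random matrices, and splitting the singular values evenly between the two factors is an assumption, since the abstract only says the singular vectors set the initial direction. The name `hrp_style_init` is hypothetical.

```python
import numpy as np

def hrp_style_init(B_pre, A_pre, rank):
    """Return rank-`rank` factors (B0, A0) built from the top singular
    vectors of B_pre @ A_pre.T. Illustrative only."""
    product = B_pre @ A_pre.T                      # shape (d_out, d_in)
    U, S, Vt = np.linalg.svd(product, full_matrices=False)
    sqrt_s = np.sqrt(S[:rank])                     # assumption: split scale evenly
    B0 = U[:, :rank] * sqrt_s                      # scale each left singular vector
    A0 = Vt[:rank, :].T * sqrt_s                   # scale each right singular vector
    return B0, A0

# Stand-in for factors obtained after a few high-rank preheating steps.
rng = np.random.default_rng(0)
d_out, d_in, pre_rank, rank = 16, 12, 8, 2
B_pre = rng.normal(size=(d_out, pre_rank))
A_pre = rng.normal(size=(d_in, pre_rank))

B0, A0 = hrp_style_init(B_pre, A_pre, rank)
```

With this choice, `B0 @ A0.T` is the truncated-SVD (best rank-`rank`) approximation of the preheated product, which is one natural reading of "main singular vectors as initialization".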

Article 1455

Title@2025-05-26 (1): Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments

Title: Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments

Mosaic: Datenfreies Wissen Destillieren über Mixture-of-Experts für Heterogene verteilte Umgebungen

Mosaic:通过混合专家进行无数据知识蒸馏,促进异基因分布式环境 2505.19699v1

Authors: Junming Liu, Yanting Gao, Siyuan Meng, Yifei Sun, Aoqi Wu, Yufei Jin, Yirong Chen, Ding Wang, Guosun Zeng

Federated Learning (FL) is a decentralized machine learning paradigm that enables clients to collaboratively train models while preserving data privacy. However, the coexistence of model and data heterogeneity gives rise to inconsistent representations and divergent optimization dynamics across clients, ultimately hindering robust global performance. To transcend these challenges, we propose Mosaic, a novel data-free knowledge distillation framework tailored for heterogeneous distributed environments. Mosaic first trains local generative models to approximate each client’s personalized distribution, enabling synthetic data generation that safeguards privacy through strict separation from real data. Subsequently, Mosaic forms a Mixture-of-Experts (MoE) from client models based on their specialized knowledge, and distills it into a global model using the generated data. To further enhance the MoE architecture, Mosaic integrates expert predictions via a lightweight meta model trained on a few representative prototypes. Extensive experiments on standard image classification benchmarks demonstrate that Mosaic consistently outperforms state-of-the-art approaches under both model and data heterogeneity. The source code has been published at https://github.com/Wings-Of-Disaster/Mosaic.

Article 1456

Title@2025-05-26 (1): Graph Guided Diffusion: Unified Guidance for Conditional Graph Generation

Title: Graph Guided Diffusion: Unified Guidance for Conditional Graph Generation

Graph Guided Diffusion: Unified Guidance for Conditional Graph Generation

向导扩散:有条件图形生成统一指南 2505.19685v1

Authors: Victor M. Tenorio, Nicolas Zilberstein, Santiago Segarra, Antonio G. Marques

Diffusion models have emerged as powerful generative models for graph generation, yet their use for conditional graph generation remains a fundamental challenge. In particular, guiding diffusion models on graphs under arbitrary reward signals is difficult: gradient-based methods, while powerful, are often unsuitable due to the discrete and combinatorial nature of graphs, and non-differentiable rewards further complicate gradient-based guidance. We propose Graph Guided Diffusion (GGDiff), a novel guidance framework that interprets conditional diffusion on graphs as a stochastic control problem to address this challenge. GGDiff unifies multiple guidance strategies, including gradient-based guidance (for differentiable rewards), control-based guidance (using control signals from forward reward evaluations), and zero-order approximations (bridging gradient-based and gradient-free optimization). This comprehensive, plug-and-play framework enables zero-shot guidance of pre-trained diffusion models under both differentiable and non-differentiable reward functions, adapting well-established guidance techniques to graph generation–a direction largely unexplored. Our formulation balances computational efficiency, reward alignment, and sample quality, enabling practical conditional generation across diverse reward types. We demonstrate the efficacy of GGDiff in various tasks, including constraints on graph motifs, fairness, and link prediction, achieving superior alignment with target rewards while maintaining diversity and fidelity.

Article 1457

Title@2025-05-26 (1): CauSkelNet: Causal Representation Learning for Human Behaviour Analysis

Title: CauSkelNet: Causal Representation Learning for Human Behaviour Analysis

CauSkelNet: Kausales Repräsentationslernen für die menschliche Verhaltensanalyse

CauSkelNet: 人类行为分析的因果关系学习 2409.15564v3

Authors: Xingrui Gu, Chuyi Jiang, Erte Wang, Zekun Wu, Qiang Cui, Leimin Tian, Lianlong Wu, Siyang Song, Chuang Yu

Traditional machine learning methods for movement recognition often struggle with limited model interpretability and a lack of insight into human movement dynamics. This study introduces a novel representation learning framework based on causal inference to address these challenges. Our two-stage approach combines the Peter-Clark (PC) algorithm and Kullback-Leibler (KL) divergence to identify and quantify causal relationships between human joints. By capturing joint interactions, the proposed causal Graph Convolutional Network (GCN) produces interpretable and robust representations. Experimental results on the EmoPain dataset demonstrate that the causal GCN outperforms traditional GCNs in accuracy, F1 score, and recall, particularly in detecting protective behaviors. This work contributes to advancing human motion analysis and lays a foundation for adaptive and intelligent healthcare solutions.

Article 1458

Title@2025-05-26 (1): Deep Actor-Critics with Tight Risk Certificates

Title: Deep Actor-Critics with Tight Risk Certificates

Deep Actor-Critics mit engen Risikozertifikaten

具有严格风险证书的深行为者-批评者 2505.19682v1

Authors: Bahareh Tasdighi, Manuel Haussmann, Yi-Shan Wu, Andres R. Masegosa, Melih Kandemir

After a period of research, deep actor-critic algorithms have reached a level where they influence our everyday lives. They serve as the driving force behind the continual improvement of large language models through user-collected feedback. However, their deployment in physical systems is not yet widely adopted, mainly because no validation scheme exists that quantifies their risk of malfunction. We demonstrate that it is possible to develop tight risk certificates for deep actor-critic algorithms that predict generalization performance from validation-time observations. Our key insight centers on the effectiveness of minimal evaluation data. Surprisingly, a small, feasible number of evaluation roll-outs collected from a pretrained policy suffices to produce accurate risk certificates when combined with a simple adaptation of PAC-Bayes theory. Specifically, we adopt a recently introduced recursive PAC-Bayes approach, which splits validation data into portions and recursively builds PAC-Bayes bounds on the excess loss of each portion’s predictor, using the predictor from the previous portion as a data-informed prior. Our empirical results across multiple locomotion tasks and policy expertise levels demonstrate risk certificates that are tight enough to be considered for practical use.
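
To make "risk certificate" concrete, the sketch below computes a plain, non-recursive PAC-Bayes-kl bound (the Maurer form, $\mathrm{kl}(\hat L \,\|\, L) \le (\mathrm{KL}(\rho\|\pi) + \ln(2\sqrt{n}/\delta))/n$) and inverts it by bisection. It is a simplified stand-in, not the recursive scheme the abstract describes; it assumes losses in $[0, 1]$, and all function names are hypothetical.

```python
import math

def binary_kl(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    q = min(max(q, eps), 1 - eps)
    p = min(max(p, eps), 1 - eps)
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

def kl_inverse_upper(q, bound, tol=1e-9):
    """Largest p >= q such that binary_kl(q, p) <= bound, found by bisection."""
    lo, hi = q, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_kl(q, mid) > bound:
            hi = mid
        else:
            lo = mid
    return lo

def pac_bayes_kl_certificate(empirical_loss, kl_posterior_prior, n, delta=0.05):
    """Upper bound on the expected loss that holds with probability >= 1 - delta."""
    rhs = (kl_posterior_prior + math.log(2 * math.sqrt(n) / delta)) / n
    return kl_inverse_upper(empirical_loss, rhs)

# Hypothetical numbers: 10% empirical loss, KL(posterior || prior) = 5 nats.
cert_small = pac_bayes_kl_certificate(0.1, 5.0, n=1000)
cert_large = pac_bayes_kl_certificate(0.1, 5.0, n=10000)
```

The certificate shrinks towards the empirical loss as `n` grows or as the KL term shrinks, which is why a data-informed prior (as in the recursive approach) can tighten the bound.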

Article 1459

Title@2025-05-26 (1): Cut out and Replay: A Simple yet Versatile Strategy for Multi-Label Online Continual Learning

Title: Cut out and Replay: A Simple yet Versatile Strategy for Multi-Label Online Continual Learning

Cut out und Replay: Eine einfache, aber vielseitige Strategie für Multi-Label Online Continual Learning

剪切和重放:一个简单但通俗易懂的多标签在线持续学习战略 2505.19680v1

Authors: Xinrui Wang, Shao-yuan Li, Jiaqiang Zhang, Songcan Chen

Multi-Label Online Continual Learning (MOCL) requires models to learn continuously from endless multi-label data streams, facing complex challenges including persistent catastrophic forgetting, potential missing labels, and uncontrollable imbalanced class distributions. While existing MOCL methods attempt to address these challenges through various techniques, they all overlook label-specific region identifying and feature learning - a fundamental solution rooted in multi-label learning but challenging to achieve in the online setting with incremental and partial supervision. To this end, we first leverage the inherent structural information of input data to evaluate and verify the innate localization capability of different pre-trained models. Then, we propose CUTER (CUT-out-and-Experience-Replay), a simple yet versatile strategy that provides fine-grained supervision signals by further identifying, strengthening and cutting out label-specific regions for efficient experience replay. It not only enables models to simultaneously address catastrophic forgetting, missing labels, and class imbalance challenges, but also serves as an orthogonal solution that seamlessly integrates with existing approaches. Extensive experiments on multiple multi-label image benchmarks demonstrate the superiority of our proposed method. The code is available at https://github.com/wxr99/Cut-Replay.

Article 1460

Title@2025-05-26 (1): Optimal Multi-Fidelity Best-Arm Identification

Title: Optimal Multi-Fidelity Best-Arm Identification

Optimale Multi-Fidelity Best-Arm-Identifikation

最佳最佳多纤维最佳武器标识 2406.03033v2

Authors: Riccardo Poiani, Rémy Degenne, Emilie Kaufmann, Alberto Maria Metelli, Marcello Restelli

In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible. We study multi-fidelity best-arm identification, in which the algorithm can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost. Several methods have been proposed for tackling this problem, but their optimality remains elusive, notably due to loose lower bounds on the total cost needed to identify the best arm. Our first contribution is a tight, instance-dependent lower bound on the cost complexity. The study of the optimization problem featured in the lower bound provides new insights to devise computationally efficient algorithms, and leads us to propose a gradient-based approach with asymptotically optimal cost complexity. We demonstrate the benefits of the new algorithm compared to existing methods in experiments. Our theoretical and empirical findings also shed light on an intriguing concept of optimal fidelity for each arm.

Article 1461

Title@2025-05-26 (1): Bridging Privacy and Robustness for Trustworthy Machine Learning

Title: Bridging Privacy and Robustness for Trustworthy Machine Learning

Überbrückung von Privatsphäre und Robustheit für vertrauenswürdiges maschinelles Lernen

连接隐私和强力,促进可信赖的机器学习 2403.16591v4

Authors: Xiaojin Zhang, Wei Chen

The advent of machine learning has led to transformative changes across various domains, but the sensitive nature of data raises concerns about privacy and security. While Local Differential Privacy (LDP) has been a cornerstone in addressing these concerns, recent research has proposed privacy concepts aligned with the Bayesian inference perspective of an adversary, such as Average Bayesian Privacy (ABP) and Maximum Bayesian Privacy (MBP). This paper explores the intricate relationships between LDP, ABP, and MBP, and their implications for algorithmic robustness. We establish theoretical connections between these privacy notions, proving that LDP implies MBP and vice versa under certain conditions, and deriving bounds connecting MBP and ABP. We also investigate the relationship between PAC robust learning and privacy preservation, demonstrating how to derive PAC robustness from privacy-preserving algorithms and construct privacy-preserving algorithms from PAC robust ones. Our findings provide valuable insights for constructing privacy-preserving and robust machine learning algorithms.
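
For readers unfamiliar with Local Differential Privacy, the sketch below shows binary randomized response, a standard textbook mechanism that satisfies $\epsilon$-LDP. It is included only to illustrate the LDP notion the abstract builds on; it is not taken from the paper and says nothing about ABP or MBP. Function names are hypothetical.

```python
import math
import random

def randomized_response(bit, epsilon, rng=random):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit

def ldp_ratio(epsilon):
    """Worst-case likelihood ratio P[output=1 | input=1] / P[output=1 | input=0]."""
    p_truth = math.exp(epsilon) / (1 + math.exp(epsilon))
    return p_truth / (1 - p_truth)

# One privatized report for a user whose true bit is 1 (seeded for repeatability).
report = randomized_response(1, epsilon=1.0, rng=random.Random(0))
```

The ratio returned by `ldp_ratio` equals $e^{\epsilon}$, which is exactly the bound the $\epsilon$-LDP definition requires for every pair of inputs and every output.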

Article 1462

Title@2025-05-26 (1): Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling

Title: Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling

Zero-Shot-Streaming-Text zur Sprachsynthese mit Transducer und Auto-Regressive Modellierung

零热流文本,用于带有传感器和自动递减建模的语音合成 2505.19669v1

Authors: Haiyang Sun, Shujie Hu, Shujie Liu, Lingwei Meng, Hui Wang, Bing Han, Yifan Yang, Yanqing Liu, Sheng Zhao, Yan Lu, Yanmin Qian

Zero-shot streaming text-to-speech is an important research topic in human-computer interaction. Existing methods primarily use a lookahead mechanism, relying on future text to achieve natural streaming speech synthesis, which introduces high processing latency. To address this issue, we propose SMLLE, a streaming framework for generating high-quality speech frame-by-frame. SMLLE employs a Transducer to convert text into semantic tokens in real time while simultaneously obtaining duration alignment information. The combined outputs are then fed into a fully autoregressive (AR) streaming model to reconstruct mel-spectrograms. To further stabilize the generation process, we design a Delete <Bos> Mechanism that allows the AR model to access future text while introducing as little delay as possible. Experimental results suggest that SMLLE outperforms current streaming TTS methods and achieves performance comparable to sentence-level TTS systems. Samples are available on https://anonymous.4open.science/w/demo_page-48B7/.

Article 1463

Title@2025-05-26 (1): GTR: Graph-Table-RAG for Cross-Table Question Answering

Title: GTR: Graph-Table-RAG for Cross-Table Question Answering

GTR: Graph-Table-RAG für Cross-Table-Frageantworten

GTR:用于跨表问题解答的图表表-RAG 2504.01346v3

Authors: Jiaru Zou, Dongqi Fu, Sirui Chen, Xinrui He, Zihao Li, Yada Zhu, Jiawei Han, Jingrui He

Beyond pure text, a substantial amount of knowledge is stored in tables. In real-world scenarios, user questions often require retrieving answers that are distributed across multiple tables. GraphRAG has recently attracted much attention for enhancing LLMs’ reasoning capabilities by organizing external knowledge to address ad-hoc and complex questions, exemplifying a promising direction for cross-table question answering. In this paper, to address the current gap in available data, we first introduce a multi-table benchmark, MultiTableQA, comprising 60k tables and 25k user queries collected from real-world sources. Then, we propose the first Graph-Table-RAG framework, namely GTR, which reorganizes table corpora into a heterogeneous graph, employs a hierarchical coarse-to-fine retrieval process to extract the most relevant tables, and integrates graph-aware prompting for downstream LLMs’ tabular reasoning. Extensive experiments show that GTR exhibits superior cross-table question-answering performance while maintaining high deployment efficiency, demonstrating its real-world practical applicability.

Article 1464

Title@2025-05-26 (1): Unveil Multi-Picture Descriptions for Multilingual Mild Cognitive Impairment Detection via Contrastive Learning

Title: Unveil Multi-Picture Descriptions for Multilingual Mild Cognitive Impairment Detection via Contrastive Learning

Mehrbildbeschreibungen für mehrsprachige, leichte Kognitive Impairment-Erkennung durch kontrastives Lernen enthüllen

通过差异学习发现多语种轻视认知缺陷的单形多语种描述 2505.17067v2

Authors: Kristin Qi, Jiali Cheng, Youxiang Zhu, Hadi Amiri, Xiaohui Liang

Detecting Mild Cognitive Impairment from picture descriptions is critical yet challenging, especially in multilingual and multiple picture settings. Prior work has primarily focused on English speakers describing a single picture (e.g., the ‘Cookie Theft’). The TAUKDIAL-2024 challenge expands this scope by introducing multilingual speakers and multiple pictures, which presents new challenges in analyzing picture-dependent content. To address these challenges, we propose a framework with three components: (1) enhancing discriminative representation learning via supervised contrastive learning, (2) involving image modality rather than relying solely on speech and text modalities, and (3) applying a Product of Experts (PoE) strategy to mitigate spurious correlations and overfitting. Our framework improves MCI detection performance, achieving a +7.1% increase in Unweighted Average Recall (UAR) (from 68.1% to 75.2%) and a +2.9% increase in F1 score (from 80.6% to 83.5%) compared to the text unimodal baseline. Notably, the contrastive learning component yields greater gains for the text modality compared to speech. These results highlight our framework’s effectiveness in multilingual and multi-picture MCI detection.
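
The generic Product of Experts rule multiplies the class probabilities of several models and renormalizes. The sketch below shows only that rule; how the paper wires PoE into training to reduce spurious correlations is not specified in the abstract, and the three per-modality probability vectors are invented for illustration.

```python
def product_of_experts(expert_probs):
    """Multiply per-class probabilities across experts and renormalize.
    Assumes every expert gives strictly positive probability to at least one
    common class, so the normalizer is non-zero."""
    n_classes = len(expert_probs[0])
    combined = [1.0] * n_classes
    for probs in expert_probs:
        for k in range(n_classes):
            combined[k] *= probs[k]
    total = sum(combined)
    return [c / total for c in combined]

# Hypothetical [P(MCI), P(control)] outputs from three modality-specific experts.
text_expert = [0.6, 0.4]
speech_expert = [0.7, 0.3]
image_expert = [0.5, 0.5]

fused = product_of_experts([text_expert, speech_expert, image_expert])
```

An uninformative expert such as `image_expert` leaves the fused distribution unchanged, while agreeing experts sharpen it; a single confident dissenting expert can veto a class, which is the property that makes PoE useful against shortcut features.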

Article 1465

Title@2025-05-26 (1): Best-Arm Identification in Unimodal Bandits

Title: Best-Arm Identification in Unimodal Bandits

Best-Arm-Identifikation in unimodalen Banditen

统一强盗中的最佳武器识别 2411.01898v2

Authors: Riccardo Poiani, Marc Jourdan, Emilie Kaufmann, Rémy Degenne

We study the fixed-confidence best-arm identification problem in unimodal bandits, in which the means of the arms increase with the index of the arm up to their maximum, then decrease. We derive two lower bounds on the stopping time of any algorithm. The instance-dependent lower bound suggests that due to the unimodal structure, only three arms contribute to the leading confidence-dependent cost. However, a worst-case lower bound shows that a linear dependence on the number of arms is unavoidable in the confidence-independent cost. We propose modifications of Track-and-Stop and a Top Two algorithm that leverage the unimodal structure. Both versions of Track-and-Stop are asymptotically optimal for one-parameter exponential families. The Top Two algorithm is asymptotically near-optimal for Gaussian distributions and we prove a non-asymptotic guarantee matching the worst-case lower bound. The algorithms can be implemented efficiently and we demonstrate their competitive empirical performance.
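
The unimodal structure can be stated in a few lines of code. The sketch below checks whether a vector of arm means rises to a single peak and then falls, and returns the peak together with its immediate neighbours. Reading the "three arms" of the instance-dependent bound as the best arm and its two neighbours is an interpretation, not something the abstract states; strict monotonicity is also an assumption.

```python
def is_unimodal(means):
    """True if means strictly increase up to one peak and strictly decrease after it."""
    peak = max(range(len(means)), key=means.__getitem__)
    rising = all(means[i] < means[i + 1] for i in range(peak))
    falling = all(means[i] > means[i + 1] for i in range(peak, len(means) - 1))
    return rising and falling

def peak_neighbourhood(means):
    """Indices of the best arm and its adjacent arms (fewer than three at the boundary)."""
    peak = max(range(len(means)), key=means.__getitem__)
    return [i for i in (peak - 1, peak, peak + 1) if 0 <= i < len(means)]

# Hypothetical arm means.
unimodal_means = [0.1, 0.4, 0.9, 0.5, 0.2]
bimodal_means = [0.1, 0.9, 0.3, 0.8]
```

Under this structure, an arm that beats both of its neighbours is the global best arm, which is the intuition for why the hardest comparisons are local to the peak.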

Article 1466

Title@2025-05-26 (1): MoESD: Unveil Speculative Decoding’s Potential for Accelerating Sparse MoE

Title: MoESD: Unveil Speculative Decoding’s Potential for Accelerating Sparse MoE

MoESD: Spekulatives Decoding-Potential zur Beschleunigung von Sparse MoE enthüllen

MOESD: Unveil 投机性代谢潜力加速偏散的中导体 2505.19645v1

Authors: Zongle Huang, Lei Zhu, Zongyuan Zhan, Ting Hu, Weikai Mao, Xianzhi Yu, Yongpan Liu, Tianyu Zhang

Large Language Models (LLMs) have achieved remarkable success across many applications, with Mixture of Experts (MoE) models demonstrating great potential. Compared to traditional dense models, MoEs achieve better performance with less computation. Speculative decoding (SD) is a widely used technique to accelerate LLM inference without accuracy loss, but it has been considered efficient only for dense models. In this work, we first demonstrate that, under medium batch sizes, MoE surprisingly benefits more from SD than dense models. Furthermore, as MoE becomes sparser – the prevailing trend in MoE designs – the batch size range where SD acceleration is expected to be effective becomes broader. To quantitatively understand tradeoffs involved in SD, we develop a reliable modeling based on theoretical analyses. While current SD research primarily focuses on improving acceptance rates of algorithms, changes in workload and model architecture can still lead to degraded SD acceleration even with high acceptance rates. To address this limitation, we introduce a new metric ‘target efficiency’ that characterizes these effects, thus helping researchers identify system bottlenecks and understand SD acceleration more comprehensively. For scenarios like private serving, this work unveils a new perspective to speed up MoE inference, where existing solutions struggle. Experiments on different GPUs show up to 2.29x speedup for Qwen2-57B-A14B at medium batch sizes and validate our theoretical predictions.
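
The claim that a high acceptance rate does not guarantee speedup can be illustrated with a toy cost model. The first function is the standard expected number of tokens produced per speculative step when each drafted token is accepted independently with the same probability (an idealizing assumption). The second function, including its cost parameters, is a hypothetical simplification for illustration; it is not the paper's model and does not implement its "target efficiency" metric.

```python
def expected_tokens_per_step(acceptance_rate, draft_length):
    """Expected tokens emitted per draft-then-verify step, assuming each of the
    `draft_length` drafted tokens is accepted independently with `acceptance_rate`."""
    a, g = acceptance_rate, draft_length
    if a == 1.0:
        return g + 1
    return (1 - a ** (g + 1)) / (1 - a)

def modelled_speedup(acceptance_rate, draft_length, draft_cost, verify_cost):
    """Toy speedup estimate. Costs are relative to one ordinary target decoding
    step (= 1.0): `draft_cost` per drafted token, `verify_cost` per verification pass."""
    step_cost = draft_length * draft_cost + verify_cost
    return expected_tokens_per_step(acceptance_rate, draft_length) / step_cost

# Same acceptance rate, different verification cost (hypothetical values).
cheap_verify = modelled_speedup(0.8, 4, draft_cost=0.05, verify_cost=1.0)
costly_verify = modelled_speedup(0.8, 4, draft_cost=0.05, verify_cost=3.0)
```

In this toy model the acceptance rate is identical in both cases, yet the speedup drops sharply when verifying several tokens costs much more than decoding one, which is the kind of workload and architecture effect the abstract points to.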

Article 1467

Title@2025-05-26 (1): Navigating Conflicting Views: Harnessing Trust for Learning

Title: Navigating Conflicting Views: Harnessing Trust for Learning

Navigieren gegensätzlicher Ansichten: Vertrauen fürs Lernen gewinnen

引导冲突观点:利用信任学习 2406.00958v3

Authors: Jueqing Lu, Wray Buntine, Yuanyuan Qi, Joanna Dipnall, Belinda Gabbe, Lan Du

Resolving conflicts is critical for improving the reliability of multi-view classification. While prior work focuses on learning consistent and informative representations across views, it often assumes perfect alignment and equal importance of all views, an assumption rarely met in real-world scenarios, as some views may express distinct information. To address this, we develop a computational trust-based discounting method that enhances the Evidential Multi-view framework by accounting for the instance-wise reliability of each view through a probability-sensitive trust mechanism. We evaluate our method on six real-world datasets using Top-1 Accuracy, Fleiss’ Kappa, and a new metric, Multi-View Agreement with Ground Truth, to assess prediction reliability. We also assess the effectiveness of uncertainty in indicating prediction correctness via AUROC. Additionally, we test the scalability of our method through end-to-end training on a large-scale dataset. The experimental results show that computational trust can effectively resolve conflicts, paving the way for more reliable multi-view classification models in real-world applications.

Article 1468

Title@2025-05-26 (1): When fractional quasi p-norms concentrate

Title: When fractional quasi p-norms concentrate

Wenn fraktioniertes Quasi-P-Normen-Konzentrat

当分微分准微调集中时 2505.19635v1

Authors: Ivan Y. Tyukin, Bogdan Grechuk, Evgeny M. Mirkes, Alexander N. Gorban

Concentration of distances in high dimension is an important factor for the development and design of stable and reliable data analysis algorithms. In this paper, we address the fundamental long-standing question about the concentration of distances in high dimension for fractional quasi $p$-norms, $p\in(0,1)$. The topic has been at the centre of various theoretical and empirical controversies. Here we, for the first time, identify conditions when fractional quasi $p$-norms concentrate and when they don’t. We show that contrary to some earlier suggestions, for broad classes of distributions, fractional quasi $p$-norms admit exponential and uniform in $p$ concentration bounds. For these distributions, the results effectively rule out previously proposed approaches to alleviate concentration by “optimally” setting the values of $p$ in $(0,1)$. At the same time, we specify conditions and the corresponding families of distributions for which one can still control concentration rates by appropriate choices of $p$. We also show that in an arbitrarily small vicinity of a distribution from a large class of distributions for which uniform concentration occurs, there are uncountably many other distributions featuring anti-concentration properties. Importantly, this behavior enables devising relevant data encoding or representation schemes favouring or discouraging distance concentration. The results shed new light on this long-standing problem and resolve the tension around the topic in both theory and empirical evidence reported in the literature.
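
A small experiment makes the object of study tangible. The sketch below defines the quasi $p$-norm $\|x\|_p = (\sum_i |x_i|^p)^{1/p}$ for $p \in (0,1)$ and measures a simple relative contrast, $(\max - \min)/\min$, of distances from the origin to random points. Sampling coordinates i.i.d. uniform on $[0,1)$ is an assumption chosen for illustration; the abstract does not say which distributions fall into which regime, and the function names are hypothetical.

```python
import random

def quasi_p_norm(x, p):
    """(sum |x_i|^p)^(1/p); for 0 < p < 1 this violates the triangle inequality."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def relative_contrast(dim, p, n_points=200, seed=0):
    """(max - min) / min of quasi-p-norm distances from the origin to
    `n_points` points with i.i.d. uniform[0, 1) coordinates."""
    rng = random.Random(seed)
    dists = [quasi_p_norm([rng.random() for _ in range(dim)], p)
             for _ in range(n_points)]
    return (max(dists) - min(dists)) / min(dists)

# Compare a low-dimensional and a high-dimensional sample at p = 0.5.
contrast_low_dim = relative_contrast(dim=2, p=0.5)
contrast_high_dim = relative_contrast(dim=1000, p=0.5)
```

For this particular sampling scheme the contrast shrinks as the dimension grows even with $p = 0.5$, in line with the abstract's point that choosing a fractional $p$ does not by itself prevent concentration; repeating the experiment with other distributions is the natural way to probe the anti-concentration cases.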

Article 1469

Title@2025-05-26 (1): Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs

Title: Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs

Entkoppelung Spatio-Temporale Vorhersage: Wenn leichte große Modelle adaptive Hypergraphen treffen

脱钩的SPadio-TT时间预测:当轻量大模型与适应性高光谱相匹配时 2505.19620v1

Authors: Jiawen Chen, Qi Shao, Duxin Chen, Wenwu Yu

Spatio-temporal prediction is a pivotal task with broad applications in traffic management, climate monitoring, energy scheduling, etc. However, existing methodologies often struggle to balance model expressiveness and computational efficiency, especially when scaling to large real-world datasets. To tackle these challenges, we propose STH-SepNet (Spatio-Temporal Hypergraph Separation Networks), a novel framework that decouples temporal and spatial modeling to enhance both efficiency and precision. Therein, the temporal dimension is modeled using lightweight large language models, which effectively capture low-rank temporal dynamics. Concurrently, the spatial dimension is addressed through an adaptive hypergraph neural network, which dynamically constructs hyperedges to model intricate, higher-order interactions. A carefully designed gating mechanism is integrated to seamlessly fuse temporal and spatial representations. By leveraging the fundamental principles of low-rank temporal dynamics and spatial interactions, STH-SepNet offers a pragmatic and scalable solution for spatio-temporal prediction in real-world applications. Extensive experiments on large-scale real-world datasets across multiple benchmarks demonstrate the effectiveness of STH-SepNet in boosting predictive performance while maintaining computational efficiency. This work may provide a promising lightweight framework for spatio-temporal prediction, aiming to reduce computational demands while enhancing predictive performance. Our code is available at https://github.com/SEU-WENJIA/ST-SepNet-Lightweight-LLMs-Meet-Adaptive-Hypergraphs.

Article 1470

Title@2025-05-26 (1): SESaMo: Symmetry-Enforcing Stochastic Modulation for Normalizing Flows

Title: SESaMo: Symmetry-Enforcing Stochastic Modulation for Normalizing Flows

SESaMo: Symmetrie-verstärkende stochastische Modulation für normalisierende Strömungen

SESaMo: 正常流动的对称性-强化斯托调动 2505.19619v1

Authors: Janik Kreit, Dominic Schuh, Kim A. Nicoli, Lena Funcke

Deep generative models have recently garnered significant attention across various fields, from physics to chemistry, where sampling from unnormalized Boltzmann-like distributions represents a fundamental challenge. In particular, autoregressive models and normalizing flows have become prominent due to their appealing ability to yield closed-form probability densities. Moreover, it is well-established that incorporating prior knowledge - such as symmetries - into deep neural networks can substantially improve training performance. In this context, recent advances have focused on developing symmetry-equivariant generative models, achieving remarkable results. Building upon these foundations, this paper introduces Symmetry-Enforcing Stochastic Modulation (SESaMo). Similar to equivariant normalizing flows, SESaMo enables the incorporation of inductive biases (e.g., symmetries) into normalizing flows through a novel technique called stochastic modulation. This approach enhances the flexibility of the generative model, allowing it to effectively learn a variety of exact and broken symmetries. Our numerical experiments benchmark SESaMo in different scenarios, including an 8-Gaussian mixture model and physically relevant field theories, such as the $\phi^4$ theory and the Hubbard model.

Article 1471

Title@2025-05-26 (1): When the Left Foot Leads to the Right Path: Bridging Initial Prejudice and Trainability

Title: When the Left Foot Leads to the Right Path: Bridging Initial Prejudice and Trainability

Wenn der linke Fuß auf den rechten Weg führt: Überbrückung von anfänglichen Vorurteilen und Trainingsfähigkeit

当左脚引向右路时:弥合最初的偏见和可训练性 2505.12096v2

Authors: Alberto Bassi, Carlo Albert, Aurelien Lucchi, Marco Baity-Jesi, Emanuele Francazi

Understanding the statistical properties of deep neural networks (DNNs) at initialization is crucial for elucidating both their trainability and the intrinsic architectural biases they encode prior to data exposure. Mean-field (MF) analyses have demonstrated that the parameter distribution in randomly initialized networks dictates whether gradients vanish or explode. Concurrently, untrained DNNs were found to exhibit an initial-guessing bias (IGB), in which large regions of the input space are assigned to a single class. In this work, we derive a theoretical proof establishing the correspondence between IGB and previous MF theories, thereby connecting a network prejudice toward specific classes with the conditions for fast and accurate learning. This connection yields the counter-intuitive conclusion: the initialization that optimizes trainability is necessarily biased, rather than neutral. Furthermore, we extend the MF/IGB framework to multi-node activation functions, offering practical guidelines for designing initialization schemes that ensure stable optimization in architectures employing max- and average-pooling layers.

Article 1472

Title@2025-05-26 (1): Learning and Interpreting Gravitational-Wave Features from CNNs with a Random Forest Approach

Title: Learning and Interpreting Gravitational-Wave Features from CNNs with a Random Forest Approach

Erlernen und Dolmetschen von Gravitational-Wave-Features von CNNs mit einem zufälligen Waldansatz

使用随机森林方法从有线电视新闻网读取和解释引力维学特征 2505.20357v1

Authors: Jun Tian, He Wang, Jibo He, Yu Pan, Shuo Cao, Qingquan Jiang

Convolutional neural networks (CNNs) have become widely adopted in gravitational wave (GW) detection pipelines due to their ability to automatically learn hierarchical features from raw strain data. However, the physical meaning of these learned features remains underexplored, limiting the interpretability of such models. In this work, we propose a hybrid architecture that combines a CNN-based feature extractor with a random forest (RF) classifier to improve both detection performance and interpretability. Unlike prior approaches that directly connect classifiers to CNN outputs, our method introduces four physically interpretable metrics - variance, signal-to-noise ratio (SNR), waveform overlap, and peak amplitude - computed from the final convolutional layer. These are jointly used with the CNN output in the RF classifier to enable more informed decision boundaries. Tested on long-duration strain datasets, our hybrid model outperforms a baseline CNN model, achieving a relative improvement of 21% in sensitivity at a fixed false alarm rate of 10 events per month. Notably, it also shows improved detection of low-SNR signals (SNR $\le$ 10), which are especially vulnerable to misclassification in noisy environments. Feature attribution via the RF model reveals that both CNN-extracted and handcrafted features contribute significantly to classification decisions, with learned variance and CNN outputs ranked among the most informative. These findings suggest that physically motivated post-processing of CNN feature maps can serve as a valuable tool for interpretable and efficient GW detection, bridging the gap between deep learning and domain knowledge.

Article 1473

Title@2025-05-26 (1): Diagnosing and Mitigating Modality Interference in Multimodal Large Language Models

Title: Diagnosing and Mitigating Modality Interference in Multimodal Large Language Models

Diagnostizieren und Abmildern von Modalitätsstörungen in multimodalen großen Sprachmodellen

多式联运大语言模型中的诊断和减缓模式干预 2505.19616v1

Authors: Rui Cai, Bangzheng Li, Xiaofei Wen, Muhao Chen, Zhe Zhao

Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities across tasks, yet they often exhibit difficulty in distinguishing task-relevant from irrelevant signals, particularly in tasks like Visual Question Answering (VQA), which can lead to susceptibility to misleading or spurious inputs. We refer to this broader limitation as the Cross-Modality Competency Problem: the model’s inability to fairly evaluate all modalities. This vulnerability becomes more evident in modality-specific tasks such as image classification or pure text question answering, where models are expected to rely solely on one modality. In such tasks, spurious information from irrelevant modalities often leads to significant performance degradation. We refer to this failure as Modality Interference, which serves as a concrete and measurable instance of the cross-modality competency problem. We further design a perturbation-based causal diagnostic experiment to verify and quantify this problem. To mitigate modality interference, we propose a novel framework to fine-tune MLLMs, including perturbation-based data augmentations with both heuristic perturbations and adversarial perturbations via Projected Gradient Descent (PGD), and a consistency regularization strategy applied to model outputs with original and perturbed inputs. Experiments on multiple benchmark datasets (image-heavy, text-heavy, and VQA tasks) and multiple model families with different scales demonstrate significant improvements in robustness and cross-modality competency, indicating our method’s effectiveness in boosting unimodal reasoning ability while enhancing performance on multimodal tasks.
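
The abstract mentions adversarial perturbations generated with Projected Gradient Descent (PGD). As a minimal sketch of the generic L-infinity PGD procedure only, the example below attacks a toy logistic classifier; the model, the weights `w`, the input `x`, and the values of `eps`, `step`, and `n_steps` are all assumptions made for illustration and are not taken from the paper, which applies perturbations to MLLM inputs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(w, x, y):
    """Binary cross-entropy of a linear classifier on one input x with label y in {0, 1}."""
    p = sigmoid(w @ x)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def pgd_attack(w, x, y, eps=0.1, step=0.02, n_steps=20):
    """L-infinity PGD: ascend the loss, then project back into the eps-ball around x."""
    x_adv = x.copy()
    for _ in range(n_steps):
        grad = (sigmoid(w @ x_adv) - y) * w          # gradient of the loss w.r.t. the input
        x_adv = x_adv + step * np.sign(grad)         # signed gradient ascent step
        x_adv = x + np.clip(x_adv - x, -eps, eps)    # projection onto the eps-ball
    return x_adv

# Hypothetical toy classifier and input.
w = np.array([1.5, -2.0, 0.5])
x = np.array([0.2, -0.1, 0.4])
y = 1
x_adv = pgd_attack(w, x, y)
clean_loss = logistic_loss(w, x, y)
adv_loss = logistic_loss(w, x_adv, y)
```

A consistency regularizer of the kind the abstract describes would then compare model outputs on `x` and `x_adv`; that step is omitted here because the abstract does not specify its form.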

Article 1474

Title@2025-05-26 (1): Multiplicity is an Inevitable and Inherent Challenge in Multimodal Learning

Title: Multiplicity is an Inevitable and Inherent Challenge in Multimodal Learning

Vielfältigkeit ist eine unvermeidliche und inhärente Herausforderung im multimodalen Lernen

多重性是多模式学习中不可避免和内在的挑战。 2505.19614v1

Authors: Sanghyuk Chun

Multimodal learning has seen remarkable progress, particularly with the emergence of large-scale pre-training across various modalities. However, most current approaches are built on the assumption of a deterministic, one-to-one alignment between modalities. This oversimplifies real-world multimodal relationships, which are inherently many-to-many. This phenomenon, named multiplicity, is not a side-effect of noise or annotation error, but an inevitable outcome of semantic abstraction, representational asymmetry, and task-dependent ambiguity in multimodal tasks. This position paper argues that multiplicity is a fundamental bottleneck that manifests across all stages of the multimodal learning pipeline: from data construction to training and evaluation. This paper examines the causes and consequences of multiplicity, and highlights how multiplicity introduces training uncertainty, unreliable evaluation, and low dataset quality. This position paper calls for new research directions on multimodal learning: novel multiplicity-aware learning frameworks and dataset construction protocols considering multiplicity.

Article 1475

Title@2025-05-26 (1): Skrull: Towards Efficient Long Context Fine-tuning through Dynamic Data Scheduling

Title: Skrull: Towards Efficient Long Context Fine-tuning through Dynamic Data Scheduling

Skrull: Auf dem Weg zu einem effizienten langen Kontext Feinabstimmung durch Dynamic Data Scheduling

Skrull:通过动态数据安排,实现高效长处微调 2505.19609v1

Authors: Hongtao Xu, Wenting Shen, Yuanxin Wei, Ang Wang, Guo Runfan, Tianxing Wang, Yong Li, Mingzhen Li, Weile Jia

Long-context supervised fine-tuning (Long-SFT) plays a vital role in enhancing the performance of large language models (LLMs) on long-context tasks. To smoothly adapt LLMs to long-context scenarios, this process typically entails training on mixed datasets containing both long and short sequences. However, this heterogeneous sequence length distribution poses significant challenges for existing training systems, as they fail to simultaneously achieve high training efficiency for both long and short sequences, resulting in sub-optimal end-to-end system performance in Long-SFT. In this paper, we present a novel perspective on data scheduling to address the challenges posed by the heterogeneous data distributions in Long-SFT. We propose Skrull, a dynamic data scheduler specifically designed for efficient Long-SFT. Through dynamic data scheduling, Skrull balances the computation requirements of long and short sequences, improving overall training efficiency. Furthermore, we formulate the scheduling process as a joint optimization problem and thoroughly analyze the trade-offs involved. Based on this analysis, Skrull employs a lightweight scheduling algorithm to achieve near-zero cost online scheduling in Long-SFT. Finally, we implement Skrull upon DeepSpeed, a state-of-the-art distributed training system for LLMs. Experimental results demonstrate that Skrull outperforms DeepSpeed by 3.76x on average (up to 7.54x) in real-world Long-SFT scenarios.

Article 1476

Title@2025-05-26 (1): Energy-based Preference Optimization for Test-time Adaptation

Title: Energy-based Preference Optimization for Test-time Adaptation

Energiebasierte Preference-Optimierung für die Testzeitanpassung

以能源为基础的试验时间适应最佳应用 2505.19607v1

Authors: Yewon Han, Seoyun Yang, Taesup Kim

Test-Time Adaptation (TTA) enhances model robustness by enabling adaptation to target distributions that differ from training distributions, improving real-world generalizability. Existing TTA approaches focus on adjusting the conditional distribution; however, these methods often depend on uncertain predictions in the absence of label information, leading to unreliable performance. Energy-based frameworks suggest a promising alternative to address distribution shifts without relying on uncertain predictions, instead computing the marginal distribution of target data. However, they involve the critical challenge of requiring extensive SGLD sampling, which is impractical for test-time scenarios requiring immediate adaptation. In this work, we propose Energy-based Preference Optimization for Test-time Adaptation (EPOTTA), which is based on a sampling-free strategy. We first parameterize the target model using a pretrained model and a residual energy function, enabling marginal likelihood maximization of target data without sampling. Building on the observation that the parameterization is mathematically equivalent to the DPO objective, we then directly adapt the model to a target distribution without explicitly training the residual. Our experiments verify that EPOTTA is well-calibrated and performant while achieving computational efficiency.

Article 1477

Title@2025-05-26 (1): Kuramoto-FedAvg: Using Synchronization Dynamics to Improve Federated Learning Optimization under Statistical Heterogeneity

Title: Kuramoto-FedAvg: Using Synchronization Dynamics to Improve Federated Learning Optimization under Statistical Heterogeneity

Kuramoto-FedAvg: Synchronisationsdynamik zur Verbesserung der Federated Learning Optimization unter statistischer Heterogenität

Kuramoto-FedAvg:利用同步动态改善统计多样性下的联邦学习优化 2505.19605v1

Authors: Aggrey Muhebwa, Khotso Selialia, Fatima Anwar, Khalid K. Osman

Federated learning on heterogeneous (non-IID) client data experiences slow convergence due to client drift. To address this challenge, we propose Kuramoto-FedAvg, a federated optimization algorithm that reframes the weight aggregation step as a synchronization problem inspired by the Kuramoto model of coupled oscillators. The server dynamically weighs each client’s update based on its phase alignment with the global update, amplifying contributions that align with the global gradient direction while minimizing the impact of updates that are out of phase. We theoretically prove that this synchronization mechanism reduces client drift, providing a tighter convergence bound compared to the standard FedAvg under heterogeneous data distributions. Empirical validation supports our theoretical findings, showing that Kuramoto-FedAvg significantly accelerates convergence and improves accuracy across multiple benchmark datasets. Our work highlights the potential of coordination and synchronization-based strategies for managing gradient diversity and accelerating federated optimization in realistic non-IID settings.
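
To make the idea of alignment-dependent aggregation concrete, here is a hedged sketch that weights each client update by its cosine similarity with the plain mean update and zeroes out updates pointing the opposite way. The cosine rule, the clipping at zero, and the function name `alignment_weighted_average` are assumptions chosen for illustration; the abstract does not state the weighting formula, so this should not be read as the Kuramoto-FedAvg rule itself.

```python
import numpy as np

def alignment_weighted_average(client_updates):
    """Aggregate client updates, weighting each by its cosine alignment with the mean update."""
    updates = np.asarray(client_updates, dtype=float)
    mean_update = updates.mean(axis=0)                  # equal-weight average as reference direction
    norms = np.linalg.norm(updates, axis=1) * np.linalg.norm(mean_update)
    cosines = updates @ mean_update / np.maximum(norms, 1e-12)
    weights = np.maximum(cosines, 0.0)                  # opposed ("out of phase") updates get weight 0
    if weights.sum() == 0.0:                            # degenerate case: fall back to the plain mean
        return mean_update, np.full(len(updates), 1.0 / len(updates))
    weights = weights / weights.sum()
    return weights @ updates, weights

# Hypothetical updates from three clients; the third one points away from the others.
updates = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.2]]
aggregate, weights = alignment_weighted_average(updates)
```

In this toy case the third client receives zero weight, so the aggregate follows the two aligned clients instead of being dragged toward the origin as the equal-weight mean would be.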

Article 1478

Title@2025-05-26 (1): Evaluating Machine Translation Models for English-Hindi Language Pairs: A Comparative Analysis

Title: Evaluating Machine Translation Models for English-Hindi Language Pairs: A Comparative Analysis

Machine Translation Models für Englisch-Hindi Sprachpaare bewerten: Eine vergleichende Analyse

英语-印地语语言对的机器翻译模型评估:比较分析 2505.19604v1

Authors: Ahan Prasannakumar Shetty

Machine translation has become a critical tool in bridging linguistic gaps, especially between languages as diverse as English and Hindi. This paper comprehensively evaluates various machine translation models for translating between English and Hindi. We assess the performance of these models using a diverse set of automatic evaluation metrics, both lexical and machine learning-based. Our evaluation leverages an English-Hindi parallel corpus of 18,000+ entries and a custom FAQ dataset comprising questions from government websites. The study aims to provide insights into the effectiveness of different machine translation approaches in handling both general and specialized language domains. Results indicate varying performance levels across different metrics, highlighting strengths and areas for improvement in current translation systems.

Article 1479

Title@2025-05-26 (1): Distributional Reinforcement Learning with Dual Expectile-Quantile Regression

Title: Distributional Reinforcement Learning with Dual Expectile-Quantile Regression

Verstärktes Lernen mit Dual Expectile-Quantile Regression

双预期量递减分布强化学习 2305.16877v4

Authors: Sami Jullien, Romain Deffayet, Jean-Michel Renders, Paul Groth, Maarten de Rijke

Distributional reinforcement learning (RL) has proven useful in multiple benchmarks as it enables approximating the full distribution of returns and extracts rich feedback from environment samples. The commonly used quantile regression approach to distributional RL – based on asymmetric $L_1$ losses – provides a flexible and effective way of learning arbitrary return distributions. In practice, it is often improved by using a more efficient, asymmetric hybrid $L_1$-$L_2$ Huber loss for quantile regression. However, by doing so, distributional estimation guarantees vanish, and we empirically observe that the estimated distribution rapidly collapses to its mean. Indeed, asymmetric $L_2$ losses, corresponding to expectile regression, cannot be readily used for distributional temporal difference. Motivated by the efficiency of $L_2$-based learning, we propose to jointly learn expectiles and quantiles of the return distribution in a way that allows efficient learning while keeping an estimate of the full distribution of returns. We prove that our proposed operator converges to the distributional Bellman operator in the limit of infinite estimated quantile and expectile fractions, and we benchmark a practical implementation on a toy example and at scale. On the Atari benchmark, our approach matches the performance of the Huber-based IQN-1 baseline after $200$M training frames but avoids distributional collapse and keeps estimates of the full distribution of returns.
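
The two loss families contrasted in the abstract can be written down directly. The sketch below defines the asymmetric $L_1$ (quantile, or pinball) loss and the asymmetric $L_2$ (expectile) loss, then minimizes each over a grid of candidate values to show that at $\tau = 0.5$ they recover the median and the mean respectively. The sample values and the grid search are illustrative assumptions; this is not the paper's joint expectile-quantile operator.

```python
import numpy as np

def quantile_loss(u, tau):
    """Asymmetric L1 (pinball) loss on residuals u = target - prediction."""
    return np.where(u >= 0, tau, 1.0 - tau) * np.abs(u)

def expectile_loss(u, tau):
    """Asymmetric L2 loss on residuals u = target - prediction."""
    return np.where(u >= 0, tau, 1.0 - tau) * u ** 2

def minimize_on_grid(loss_fn, samples, tau, grid):
    """Return the grid point with the lowest mean loss over the samples."""
    mean_losses = [loss_fn(samples - g, tau).mean() for g in grid]
    return grid[int(np.argmin(mean_losses))]

# Hypothetical return samples with one outlier.
samples = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
grid = np.linspace(0.0, 10.0, 1001)
median_estimate = minimize_on_grid(quantile_loss, samples, 0.5, grid)
mean_estimate = minimize_on_grid(expectile_loss, samples, 0.5, grid)
```

The outlier moves the expectile estimate but not the quantile estimate, which is one way to see why the two statistics carry different information about the return distribution.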

Article 1480

Title@2025-05-26 (1): Rep3D: Re-parameterize Large 3D Kernels with Low-Rank Receptive Modeling for Medical Imaging

Title: Rep3D: Re-parameterize Large 3D Kernels with Low-Rank Receptive Modeling for Medical Imaging

Rep3D: Große 3D-Kernel mit Low-Rank-Empfangsmodellierung für die medizinische Bildgebung neu parametrieren

Rep3D: 医疗成像低射感应模型的大型 3D 内核再修复 2505.19603v1

Authors: Ho Hin Lee, Quan Liu, Shunxing Bao, Yuankai Huo, Bennett A. Landman

In contrast to vision transformers, which model long-range dependencies through global self-attention, large kernel convolutions provide a more efficient and scalable alternative, particularly in high-resolution 3D volumetric settings. However, naively increasing kernel size often leads to optimization instability and degradation in performance. Motivated by the spatial bias observed in effective receptive fields (ERFs), we hypothesize that different kernel elements converge at variable rates during training. To support this, we derive a theoretical connection between element-wise gradients and first-order optimization, showing that structurally re-parameterized convolution blocks inherently induce spatially varying learning rates. Building on this insight, we introduce Rep3D, a 3D convolutional framework that incorporates a learnable spatial prior into large kernel training. A lightweight two-stage modulation network generates a receptive-biased scaling mask, adaptively re-weighting kernel updates and enabling local-to-global convergence behavior. Rep3D adopts a plain encoder design with large depthwise convolutions, avoiding the architectural complexity of multi-branch compositions. We evaluate Rep3D on five challenging 3D segmentation benchmarks and demonstrate consistent improvements over state-of-the-art baselines, including transformer-based and fixed-prior re-parameterization methods. By unifying spatial inductive bias with optimization-aware learning, Rep3D offers an interpretable and scalable solution for 3D medical image analysis. The source code is publicly available at https://github.com/leeh43/Rep3D.

Article 1481

Title@2025-05-26 (1): Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression

Title: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression

Speichereffiziente visuelle Autoregressive Modellierung mit Scale-Aware-KV-Cache-Kompression

KV缓存压缩的内存有效视觉自动递减模型 2505.19602v1

Authors: Kunjun Li, Zigeng Chen, Cheng-Yen Yang, Jenq-Neng Hwang

Visual Autoregressive (VAR) modeling has garnered significant attention for its innovative next-scale prediction approach, which yields substantial improvements in efficiency, scalability, and zero-shot generalization. Nevertheless, the coarse-to-fine methodology inherent in VAR results in exponential growth of the KV cache during inference, causing considerable memory consumption and computational redundancy. To address these bottlenecks, we introduce ScaleKV, a novel KV cache compression framework tailored for VAR architectures. ScaleKV leverages two critical observations: varying cache demands across transformer layers and distinct attention patterns at different scales. Based on these insights, ScaleKV categorizes transformer layers into two functional groups: drafters and refiners. Drafters exhibit dispersed attention across multiple scales, thereby requiring greater cache capacity. Conversely, refiners focus attention on the current token map to process local details, consequently necessitating substantially reduced cache capacity. ScaleKV optimizes the multi-scale inference pipeline by identifying scale-specific drafters and refiners, facilitating differentiated cache management tailored to each scale. Evaluation on the state-of-the-art text-to-image VAR model family, Infinity, demonstrates that our approach effectively reduces the required KV cache memory to 10% while preserving pixel-level fidelity.

Article 1482

Title@2025-05-26 (1): Preference Optimization by Estimating the Ratio of the Data Distribution

Title: Preference Optimization by Estimating the Ratio of the Data Distribution

Präferenzoptimierung durch Schätzung des Verhältnisses der Datenverteilung

通过估计数据分配比率实现最佳优化 2505.19601v1

Authors: Yeongmin Kim, Heesun Bae, Byeonghu Na, Il-Chul Moon

Direct preference optimization (DPO) is widely used as a simple and stable method for aligning large language models (LLMs) with human preferences. This paper investigates a generalized DPO loss that enables a policy model to match the target policy from a likelihood ratio estimation perspective. The ratio of the target policy provides a unique identification of the policy distribution without relying on reward models or partition functions. This allows the generalized loss to retain both simplicity and theoretical guarantees, which prior work such as $f$-PO fails to achieve simultaneously. We propose Bregman preference optimization (BPO), a generalized framework for ratio matching that provides a family of objective functions achieving target policy optimality. BPO subsumes DPO as a special case and offers tractable forms for all instances, allowing implementation with a few lines of code. We further develop scaled Basu’s power divergence (SBA), a gradient scaling method that can be used for BPO instances. The BPO framework complements other DPO variants and is applicable to target policies defined by these variants. In experiments, unlike other probabilistic loss extensions such as $f$-DPO or $f$-PO, which exhibit a trade-off between generation fidelity and diversity, instances of BPO improve both win rate and entropy compared with DPO. When applied to Llama-3-Instruct-8B, BPO achieves state-of-the-art performance among Llama-3-8B backbones, with a 55.9% length-controlled win rate on AlpacaEval2.
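
For reference, the DPO special case that BPO is said to subsume can be computed from four sequence log-probabilities. The sketch below implements the standard DPO loss for a single preference pair; the numeric log-probabilities and `beta=0.1` are hypothetical values chosen for illustration, and the generalized Bregman objectives from the paper are not reproduced here.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair, from sequence log-probabilities."""
    chosen_log_ratio = policy_logp_chosen - ref_logp_chosen
    rejected_log_ratio = policy_logp_rejected - ref_logp_rejected
    margin = beta * (chosen_log_ratio - rejected_log_ratio)
    # -log(sigmoid(margin)), written as log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))

# Policy identical to the reference: the margin is 0 and the loss is log(2).
loss_at_reference = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# Policy raises the chosen response and lowers the rejected one relative to the reference.
loss_after_improvement = dpo_loss(-10.0, -18.0, -12.0, -15.0)
```

The loss depends only on the difference of the two policy-to-reference log-ratios, which is the likelihood-ratio quantity the abstract builds on.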

Article 1483

Title@2025-05-26 (1): Inconsistent Tokenizations Cause Language Models to be Perplexed by Japanese Grammar

Title: Inconsistent Tokenizations Cause Language Models to be Perplexed by Japanese Grammar

Inkonsistente Tokenisierungen führen dazu, dass Sprachmodelle von japanischer Grammatik verblüfft werden.

前后不一致的招数导致语言模式被日语语法所混淆 2505.19599v1

Authors: Andrew Gambardella, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo

Typical methods for evaluating the performance of language models evaluate their ability to answer questions accurately. These evaluation metrics are acceptable for determining the extent to which language models can understand and reason about text in a general sense, but fail to capture nuanced capabilities, such as the ability of language models to recognize and obey rare grammar points, particularly in languages other than English. We measure the perplexity of language models when confronted with the “first person psych predicate restriction” grammar point in Japanese. Weblab is the only tested open source model in the 7-10B parameter range which consistently assigns higher perplexity to ungrammatical psych predicate sentences than grammatical ones. We give evidence that Weblab’s uniformly bad tokenization is a possible root cause for its good performance, and show that Llama 3’s perplexity on grammatical psych predicate sentences can be reduced by orders of magnitude (28x difference) by restricting test sentences to those with uniformly well-behaved tokenizations. We show in further experiments on machine translation tasks that language models will use alternative grammar patterns in order to produce grammatical sentences when tokenization issues prevent the most natural sentence from being output.
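
Since the study rests on comparing perplexities, a short reminder of how the quantity is computed may help. The sketch below uses the usual definition, the exponential of the negative mean per-token log-probability; the per-token probabilities are made-up values for illustration and do not come from any model in the paper.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Hypothetical per-token log-probabilities for two sentences with different token counts.
grammatical = [math.log(0.5), math.log(0.25), math.log(0.5)]
ungrammatical = [math.log(0.5), math.log(0.01), math.log(0.5), math.log(0.1)]
ppl_grammatical = perplexity(grammatical)
ppl_ungrammatical = perplexity(ungrammatical)
```

Because the average is taken per token, the same sentence can receive a different perplexity under a different tokenization, which is why tokenization consistency matters for this kind of comparison.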

Article 1484

Title@2025-05-26 (1): Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs

Title: Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs

Residualverbindungen und Normalisierung können Oversmoothing in GNNs beweisbar verhindern

残差连接和归一化可证明地防止GNN中的过度平滑 2406.02997v3

Authors: Michael Scholkemper, Xinyi Wu, Ali Jadbabaie, Michael T. Schaub

Residual connections and normalization layers have become standard design choices for graph neural networks (GNNs), and were proposed as solutions to mitigate the oversmoothing problem in GNNs. However, how exactly these methods help alleviate the oversmoothing problem from a theoretical perspective is not well understood. In this work, we provide a formal and precise characterization of (linearized) GNNs with residual connections and normalization layers. We establish that (a) for residual connections, the incorporation of the initial features at each layer can prevent the signal from becoming too smooth, and determines the subspace of possible node representations; (b) batch normalization prevents a complete collapse of the output embedding space to a one-dimensional subspace through the individual rescaling of each column of the feature matrix. This results in the convergence of node representations to the top-$k$ eigenspace of the message-passing operator; (c) moreover, we show that the centering step of a normalization layer – which can be understood as a projection – alters the graph signal in message-passing in such a way that relevant information can become harder to extract. We therefore introduce a novel, principled normalization layer called GraphNormv2 in which the centering step is learned such that it does not distort the original graph signal in an undesirable way. Experimental results confirm the effectiveness of our method.
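
Claim (a) can be illustrated numerically on a toy graph. The sketch below runs linear message passing with a row-normalized adjacency matrix, once without and once with an initial-feature residual term, and measures how far node representations are from being identical. The four-node graph, the mixing weight `alpha=0.2`, the layer count, and the `node_spread` measure are assumptions for illustration; this is a simplified linear model without learned weights, not the paper's analysis.

```python
import numpy as np

# Toy graph: a triangle (nodes 0, 1, 2) with a pendant node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A_hat = A + np.eye(4)                          # add self-loops
P = A_hat / A_hat.sum(axis=1, keepdims=True)   # row-normalized message-passing operator

def propagate(X0, n_layers, alpha=0.0):
    """Linear message passing; alpha > 0 mixes the initial features back in at every layer."""
    X = X0.copy()
    for _ in range(n_layers):
        X = (1.0 - alpha) * (P @ X) + alpha * X0
    return X

def node_spread(X):
    """Largest per-feature standard deviation across nodes (0 means all nodes are identical)."""
    return X.std(axis=0).max()

X0 = np.eye(4)                                 # one-hot initial node features
spread_plain = node_spread(propagate(X0, n_layers=200))
spread_residual = node_spread(propagate(X0, n_layers=200, alpha=0.2))
```

Without the residual term the node features converge to a common vector, so `spread_plain` shrinks toward zero, while the residual variant keeps nodes distinguishable at any depth.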

Article 1485

Title@2025-05-26 (1): How Well Can Differential Privacy Be Audited in One Run?

Title: How Well Can Differential Privacy Be Audited in One Run?

Wie gut kann die Privatsphäre in einem einzigen Lauf überprüft werden?

如何在单一运行中对差异隐私进行审计? 2503.07199v2

Authors: Amit Keinan, Moshe Shenfeld, Katrina Ligett

Recent methods for auditing the privacy of machine learning algorithms have improved computational efficiency by simultaneously intervening on multiple training examples in a single training run. Steinke et al. (2024) prove that one-run auditing indeed lower bounds the true privacy parameter of the audited algorithm, and give impressive empirical results. Their work leaves open the question of how precisely one-run auditing can uncover the true privacy parameter of an algorithm, and how that precision depends on the audited algorithm. In this work, we characterize the maximum achievable efficacy of one-run auditing and show that the key barrier to its efficacy is interference between the observable effects of different data elements. We present new conceptual approaches to minimize this barrier, towards improving the performance of one-run auditing of real machine learning algorithms.

Article 1486

Title@2025-05-26 (1): Learning to Reason without External Rewards

Title: Learning to Reason without External Rewards

Vernunft lernen ohne externe Belohnungen

学习没有外部奖励的理性 2505.19590v1

Authors: Xuandong Zhao, Zhewei Kang, Aosong Feng, Sergey Levine, Dawn Song

Training large language models (LLMs) for complex reasoning via Reinforcement Learning with Verifiable Rewards (RLVR) is effective but limited by reliance on costly, domain-specific supervision. We explore Reinforcement Learning from Internal Feedback (RLIF), a framework that enables LLMs to learn from intrinsic signals without external rewards or labeled data. We propose Intuitor, an RLIF method that uses a model’s own confidence, termed self-certainty, as its sole reward signal. Intuitor replaces external rewards in Group Relative Policy Optimization (GRPO) with self-certainty scores, enabling fully unsupervised learning. Experiments demonstrate that Intuitor matches GRPO’s performance on mathematical benchmarks while achieving superior generalization to out-of-domain tasks like code generation, without requiring gold solutions or test cases. Our findings show that intrinsic model signals can drive effective learning across domains, offering a scalable alternative to RLVR for autonomous AI systems where verifiable rewards are unavailable. Code is available at https://github.com/sunblaze-ucb/Intuitor

Article 1487

Title@2025-05-26 (1): WQLCP: Weighted Adaptive Conformal Prediction for Robust Uncertainty Quantification Under Distribution Shifts

Title: WQLCP: Weighted Adaptive Conformal Prediction for Robust Uncertainty Quantification Under Distribution Shifts

WQLCP: Gewichtete adaptive konforme Vorhersage für robuste Unsicherheit Quantifizierung unter Verteilungsverschiebungen

WQLCP: 分配变化下强势不确定性量化的加权适应性统一预测 2505.19587v1

Authors: Shadi Alijani, Homayoun Najjaran

Conformal prediction (CP) provides a framework for constructing prediction sets with guaranteed coverage, assuming exchangeable data. However, real-world scenarios often involve distribution shifts that violate exchangeability, leading to unreliable coverage and inflated prediction sets. To address this challenge, we first introduce Reconstruction Loss-Scaled Conformal Prediction (RLSCP), which utilizes reconstruction losses derived from a Variational Autoencoder (VAE) as an uncertainty metric to scale score functions. While RLSCP demonstrates performance improvements, mainly resulting in better coverage, it quantifies quantiles based on a fixed calibration dataset without considering the discrepancies between test and train datasets in an unexchangeable setting. In the next step, we propose Weighted Quantile Loss-scaled Conformal Prediction (WQLCP), which refines RLSCP by incorporating a weighted notion of exchangeability, adjusting the calibration quantile threshold based on weights with respect to the ratio of calibration and test loss values. This approach improves the CP-generated prediction set outputs in the presence of distribution shifts. Experiments on large-scale datasets, including ImageNet variants, demonstrate that WQLCP outperforms existing baselines by consistently maintaining coverage while reducing prediction set sizes, providing a robust solution for CP under distribution shifts.
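
As background for the scaled and weighted variants described here, the following is a minimal sketch of the standard split conformal procedure under exchangeability: calibrate a score threshold at rank $\lceil (n+1)(1-\alpha) \rceil$, then keep every class whose nonconformity score falls within it. The calibration scores, the class probabilities, and the choice of `1 - probability` as the score are hypothetical; the reconstruction-loss scaling of RLSCP and the weighting of WQLCP are not implemented.

```python
import math
import numpy as np

def conformal_threshold(calibration_scores, alpha):
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    scores = np.sort(np.asarray(calibration_scores, dtype=float))
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:                      # too few calibration points for this alpha
        return float("inf")
    return float(scores[rank - 1])

def prediction_set(class_probs, threshold):
    """Keep every class whose nonconformity score (1 - probability) is within the threshold."""
    return [k for k, p in enumerate(class_probs) if 1.0 - p <= threshold]

# Hypothetical calibration scores and one test-time probability vector.
calibration_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80]
threshold = conformal_threshold(calibration_scores, alpha=0.25)
labels = prediction_set([0.55, 0.35, 0.10], threshold)
```

Under distribution shift the fixed calibration threshold no longer matches the test scores, which is the gap that adjusting the calibration quantile is meant to address.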

Article 1488

Title: Accelerating Prefilling for Long-Context LLMs via Sparse Pattern Sharing

Beschleunigung der Vorfüllung für Langkontext-LLMs über Sparse Pattern Sharing

通过 Sparse 模式共享加速预填长文本 LLMs 2505.19578v1

Authors: Dan Peng, Zhihui Fu, Zewen Ye, Zhuoran Song, Jun Wang

Sparse attention methods exploit the inherent sparsity in attention to speed up the prefilling phase of long-context inference, mitigating the quadratic complexity of full attention computation. While existing sparse attention methods rely on predefined patterns or inaccurate estimations to approximate attention behavior, they often fail to fully capture the true dynamics of attention, resulting in reduced efficiency and compromised accuracy. Instead, we propose a highly accurate sparse attention mechanism that shares similar yet precise attention patterns across heads, enabling a more realistic capture of the dynamic behavior of attention. Our approach is grounded in two key observations: (1) attention patterns demonstrate strong inter-head similarity, and (2) this similarity remains remarkably consistent across diverse inputs. By strategically sharing computed accurate patterns across attention heads, our method effectively captures actual patterns while requiring full attention computation for only a small subset of heads. Comprehensive evaluations demonstrate that our approach achieves superior or comparable speedup relative to state-of-the-art methods while delivering the best overall accuracy.

Article 1489

Title@2025-05-26 (1): GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning

Title: GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning

GraLoRA: Granulare Low-Rank-Anpassung für den Parameter-Effizient Feintuning

GRALORA: 用于参数有效精密调整的颗粒式低兰克适应 2505.20355v1

Authors: Yeonjoon Jung, Daehyun Ahn, Hyungjun Kim, Taesu Kim, Eunhyeok Park

Low-Rank Adaptation (LoRA) is a popular method for parameter-efficient fine-tuning (PEFT) of generative models, valued for its simplicity and effectiveness. Despite recent enhancements, LoRA still suffers from a fundamental limitation: overfitting when the bottleneck is widened. It performs best at ranks 32-64, yet its accuracy stagnates or declines at higher ranks, still falling short of full fine-tuning (FFT) performance. We identify the root cause as LoRA’s structural bottleneck, which introduces gradient entanglement to the unrelated input channels and distorts gradient propagation. To address this, we introduce a novel structure, Granular Low-Rank Adaptation (GraLoRA) that partitions weight matrices into sub-blocks, each with its own low-rank adapter. With negligible computational or storage cost, GraLoRA overcomes LoRA’s limitations, effectively increases the representational capacity, and more closely approximates FFT behavior. Experiments on code generation and commonsense reasoning benchmarks show that GraLoRA consistently outperforms LoRA and other baselines, achieving up to +8.5% absolute gain in Pass@1 on HumanEval+. These improvements hold across model sizes and rank settings, making GraLoRA a scalable and robust solution for PEFT. Code, data, and scripts are available at https://github.com/SqueezeBits/GraLoRA.git
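
The structural difference between a single low-rank adapter and per-block adapters can be sketched with plain matrices. The example below builds a standard rank-`r` LoRA update and a block-wise update over a `k x k` grid of sub-blocks; giving each block rank `r // k` is an assumption made here so that both variants use the same number of parameters, and the function names, sizes, and random initialization are illustrative rather than taken from the GraLoRA code.

```python
import numpy as np

def lora_delta(m, n, r, rng):
    """Standard LoRA update: one rank-r product B @ A covering the whole m x n weight."""
    B = rng.standard_normal((m, r))
    A = rng.standard_normal((r, n))
    return B @ A, B.size + A.size

def blockwise_lora_delta(m, n, r, k, rng):
    """Block-wise variant: a k x k grid of sub-blocks, each with its own rank-(r // k) adapter."""
    bm, bn, br = m // k, n // k, r // k
    delta = np.zeros((m, n))
    n_params = 0
    for i in range(k):
        for j in range(k):
            B = rng.standard_normal((bm, br))
            A = rng.standard_normal((br, bn))
            delta[i * bm:(i + 1) * bm, j * bn:(j + 1) * bn] = B @ A
            n_params += B.size + A.size
    return delta, n_params

rng = np.random.default_rng(0)
m, n, r, k = 64, 64, 8, 4
lora_update, lora_params = lora_delta(m, n, r, rng)
block_update, block_params = blockwise_lora_delta(m, n, r, k, rng)
lora_rank = np.linalg.matrix_rank(lora_update)
block_rank = np.linalg.matrix_rank(block_update)
```

With these settings both updates use the same parameter budget, yet the block-wise update is not confined to rank `r`, which is one way to read the abstract's claim about increased representational capacity at negligible extra cost.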

Article 1490

Title@2025-05-26 (1): Situationally-Aware Dynamics Learning

Title: Situationally-Aware Dynamics Learning

Situational-Aware Dynamics Learning

情况认知动态学习 2505.19574v1

Authors: Alejandro Murillo-Gonzalez, Lantao Liu

Autonomous robots operating in complex, unstructured environments face significant challenges due to latent, unobserved factors that obscure their understanding of both their internal state and the external world. Addressing this challenge would enable robots to develop a more profound grasp of their operational context. To tackle this, we propose a novel framework for online learning of hidden state representations, with which the robots can adapt in real-time to uncertain and dynamic conditions that would otherwise be ambiguous and result in suboptimal or erroneous behaviors. Our approach is formalized as a Generalized Hidden Parameter Markov Decision Process, which explicitly models the influence of unobserved parameters on both transition dynamics and reward structures. Our core innovation lies in learning the joint distribution of state transitions online, which serves as an expressive representation of latent ego- and environmental factors. This probabilistic approach supports the identification of, and adaptation to, different operational situations, improving robustness and safety. Through a multivariate extension of Bayesian Online Changepoint Detection, our method segments changes in the underlying data generating process governing the robot’s dynamics. The robot’s transition model is then informed with a symbolic representation of the current situation derived from the joint distribution of latest state transitions, enabling adaptive and context-aware decision-making. To showcase the real-world effectiveness, we validate our approach in the challenging task of unstructured terrain navigation, where unmodeled and unmeasured terrain characteristics can significantly impact the robot’s motion. Extensive experiments in both simulation and the real world reveal significant improvements in data efficiency, policy performance, and the emergence of safer, adaptive navigation strategies.

Article 1491

Title@2025-05-26 (1): Truncated Kernel Stochastic Gradient Descent on Spheres

Title: Truncated Kernel Stochastic Gradient Descent on Spheres

Beschnittener Kern Stochastischer Gradient Abstieg auf Sphären

球体上被排出核心内核岩层渐变源 2410.01570v5

Authors: Jinhui Bai, Lei Shi

Inspired by the structure of spherical harmonics, we propose the truncated kernel stochastic gradient descent (T-kernel SGD) algorithm with a least-square loss function for spherical data fitting. T-kernel SGD introduces a novel regularization strategy by implementing stochastic gradient descent through a closed-form solution of the projection of the stochastic gradient in a low-dimensional subspace. In contrast to traditional kernel SGD, the regularization strategy implemented by T-kernel SGD is more effective in balancing bias and variance by dynamically adjusting the hypothesis space during iterations. The most significant advantage of the proposed algorithm is that it can achieve theoretically optimal convergence rates using a constant step size (independent of the sample size) while overcoming the inherent saturation problem of kernel SGD. Additionally, we leverage the structure of spherical polynomials to derive an equivalent T-kernel SGD, significantly reducing storage and computational costs compared to kernel SGD. Typically, T-kernel SGD requires only $\mathcal{O}(n^{1+\frac{d}{d-1}\epsilon})$ computational complexity and $\mathcal{O}(n^{\frac{d}{d-1}\epsilon})$ storage to achieve optimal rates for the d-dimensional sphere, where $0<\epsilon<\frac{1}{2}$ can be arbitrarily small if the optimal fitting or the underlying space possesses sufficient regularity. This regularity is determined by the smoothness parameter of the objective function and the decaying rate of the eigenvalues of the integral operator associated with the kernel function, both of which reflect the difficulty of the estimation problem. Our main results quantitatively characterize how this prior information influences the convergence of T-kernel SGD. The numerical experiments further validate the theoretical findings presented in this paper.

Article 1492

Title@2025-05-26 (1): MSD-LLM: Predicting Ship Detention in Port State Control Inspections with Large Language Model

Title: MSD-LLM: Predicting Ship Detention in Port State Control Inspections with Large Language Model

MSD-LLM: Schiffshaft in Hafenstaatkontrolle mit großem Sprachmodell vorhersagen

MSD-LLM:用大语言模型预测港口国控制检查中船舶扣留情况 2505.19568v1

Authors: Jiongchao Jin, Xiuju Fu, Xiaowei Gao, Tao Cheng, Ran Yan

Maritime transportation is the backbone of global trade, making ship inspection essential for ensuring maritime safety and environmental protection. Port State Control (PSC), conducted by national ports, enforces compliance with safety regulations, with ship detention being the most severe consequence, impacting both ship schedules and company reputations. Traditional machine learning methods for ship detention prediction are limited by the capacity of representation learning and thus suffer from low accuracy. Meanwhile, autoencoder-based deep learning approaches face challenges due to the severe data imbalance in historical PSC detention records. To address these limitations, we propose Maritime Ship Detention with Large Language Models (MSD-LLM), integrating a dual robust subspace recovery (DSR) layer-based autoencoder with a progressive learning pipeline to handle imbalanced data and extract meaningful PSC representations. Then, a large language model groups and ranks features to identify likely detention cases, enabling dynamic thresholding for flexible detention predictions. Extensive evaluations on 31,707 PSC inspection records from the Asia-Pacific region show that MSD-LLM outperforms state-of-the-art methods by more than 12% in Area Under the Curve (AUC) for Singapore ports. Additionally, it demonstrates robustness to real-world challenges, making it adaptable to diverse maritime risk assessment scenarios.

Article 1493

Title@2025-05-26 (1): BackSlash: Rate Constrained Optimized Training of Large Language Models

Title: BackSlash: Rate Constrained Optimized Training of Large Language Models

BackSlash: Rate Constrained Optimized Training of Large Language Models

对大语言模式优化培训 2504.16968v3

Authors: Jun Wu, Jiangtao Wen, Yuxing Han

The rapid advancement of large language models (LLMs) has driven extensive research into parameter compression after training has been completed, yet compression during the training phase remains largely unexplored. In this work, we introduce Rate-Constrained Training (BackSlash), a novel training-time compression approach based on rate-distortion optimization (RDO). BackSlash enables a flexible trade-off between model accuracy and complexity, significantly reducing parameter redundancy while preserving performance. Experiments across various architectures and tasks demonstrate that BackSlash can reduce memory usage by 60% to 90% without accuracy loss and provides significant compression gain compared to compression after training. Moreover, BackSlash proves to be highly versatile: it enhances generalization with small Lagrange multipliers, improves model robustness to pruning (maintaining accuracy even at 80% pruning rates), and enables network simplification for accelerated inference on edge devices.

Article 1494

Title@2025-05-26 (1): Lego Sketch: A Scalable Memory-augmented Neural Network for Sketching Data Streams

Title: Lego Sketch: A Scalable Memory-augmented Neural Network for Sketching Data Streams

Lego Sketch: Ein skalierbares neurales Netzwerk für das Sketching von Datenströmen

Lego Sketch: 一个可缩放的内存放大神经网络,用于切割数据流 2505.19561v1

Authors: Yuan Feng, Yukun Cao, Hairu Wang, Xike Xie, S Kevin Zhou

Sketches, probabilistic structures for estimating item frequencies in infinite data streams with limited space, are widely used across various domains. Recent studies have shifted the focus from handcrafted sketches to neural sketches, leveraging memory-augmented neural networks (MANNs) to enhance the streaming compression capabilities and achieve better space-accuracy trade-offs. However, existing neural sketches struggle to scale across different data domains and space budgets due to inflexible MANN configurations. In this paper, we introduce a scalable MANN architecture that brings to life the Lego sketch, a novel sketch with superior scalability and accuracy. Much like assembling creations with modular Lego bricks, the Lego sketch dynamically coordinates multiple memory bricks to adapt to various space budgets and diverse data domains. Our theoretical analysis guarantees its high scalability and provides the first error bound for neural sketches. Furthermore, extensive experimental evaluations demonstrate that the Lego sketch exhibits superior space-accuracy trade-offs, outperforming existing handcrafted and neural sketches. Our code is available at https://github.com/FFY0/LegoSketch_ICML.
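
For context on the handcrafted baselines mentioned in the abstract above, here is a minimal sketch of a Count-Min sketch, a classic handcrafted frequency sketch. It is not the Lego sketch or any neural sketch. The class name `CountMinSketch`, the salted SHA-256 hashing, and the `width` and `depth` parameters are illustrative assumptions, not details from the paper.

```python
import hashlib


class CountMinSketch:
    """Classic handcrafted frequency sketch (illustrative, not the Lego sketch)."""

    def __init__(self, width, depth):
        self.width = width  # counters per row (assumed parameter name)
        self.depth = depth  # number of independent hash rows
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Salt the hash with the row number to get one hash function per row.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum over rows
        # never underestimates the true frequency.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))


sketch = CountMinSketch(width=64, depth=4)
for token in ["a"] * 50 + ["b"] * 20 + ["c"] * 5:
    sketch.add(token)
print(sketch.estimate("a"), sketch.estimate("b"), sketch.estimate("c"))
```

The space budget here is fixed by `width * depth`, which is the kind of rigid configuration that the space-accuracy trade-off in the abstract refers to.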

Article 1495

Title@2025-05-26 (1): EuroCon: Benchmarking Parliament Deliberation for Political Consensus Finding

Title: EuroCon: Benchmarking Parliament Deliberation for Political Consensus Finding

EuroCon: Benchmarking Parlament Beratung für politische Konsensfindung

EuroCon:确定议会审议政治共识结果的基准 2505.19558v1

Authors: Zhaowei Zhang, Minghua Yi, Mengmeng Wang, Fengshuo Bai, Zilong Zheng, Yipeng Kang, Yaodong Yang

Achieving political consensus is crucial yet challenging for the effective functioning of social governance. However, although frontier AI systems represented by large language models (LLMs) have developed rapidly in recent years, their capabilities in this area are still understudied. In this paper, we introduce EuroCon, a novel benchmark constructed from 2,225 high-quality deliberation records of the European Parliament over 13 years, ranging from 2009 to 2022, to evaluate the ability of LLMs to reach political consensus among divergent party positions across diverse parliament settings. Specifically, EuroCon incorporates four factors to build each simulated parliament setting: specific political issues, political goals, participating parties, and power structures based on seat distribution. We also develop an evaluation framework for EuroCon to simulate real voting outcomes in different parliament settings, assessing whether LLM-generated resolutions meet predefined political goals. Our experimental results demonstrate that even state-of-the-art models remain unsatisfactory on complex tasks like passing resolutions by a two-thirds majority and addressing security issues, while revealing some common strategies LLMs use to find consensus under different power structures, such as prioritizing the stance of the dominant party. These findings highlight EuroCon’s promise as an effective platform for studying LLMs’ ability to find political consensus.

Article 1496

Title@2025-05-26 (1): Aligning Multiclass Neural Network Classifier Criterion with Task Performance Metrics

Title: Aligning Multiclass Neural Network Classifier Criterion with Task Performance Metrics

Ausrichten von Multiclass Neural Network Klassifikator Kriterium mit Task Performance Metrics

将多等神经网络分类标准与任务性性能计量对齐 2405.20954v2

Authors: Deyuan Li, Taesoo Daniel Lee, Marynel Vázquez, Nathan Tsoi

Multiclass neural network classifiers are typically trained using cross-entropy loss but evaluated using metrics derived from the confusion matrix, such as Accuracy, $F_\beta$-Score, and Matthews Correlation Coefficient. This mismatch between the training objective and evaluation metric can lead to suboptimal performance, particularly when the user’s priorities differ from what cross-entropy implicitly optimizes. For example, in the presence of class imbalance, $F_1$-Score may be preferred over Accuracy. Similarly, given a preference towards precision, the $F_{\beta=0.25}$-Score will better reflect this preference than $F_1$-Score. However, standard cross-entropy loss does not accommodate such a preference. Building on prior work leveraging soft-set confusion matrices and a continuous piecewise-linear Heaviside approximation, we propose Evaluation Aligned Surrogate Training (EAST), a novel approach to train multiclass classifiers using close surrogates of confusion-matrix based metrics, thereby aligning a neural network classifier’s predictions more closely to a target evaluation metric than typical cross-entropy loss. EAST introduces three key innovations: First, we propose a novel dynamic thresholding approach during training. Second, we propose using a multiclass soft-set confusion matrix. Third, we introduce an annealing process that gradually aligns the surrogate loss with the target evaluation metric. Our theoretical analysis shows that EAST results in consistent estimators of the target evaluation metric. Furthermore, we show that the learned network parameters converge asymptotically to values that optimize for the target evaluation metric. Extensive experiments validate the effectiveness of our approach, demonstrating improved alignment between training objectives and evaluation metrics, while outperforming existing methods across many datasets.
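
To make the metrics discussed above concrete, the following is a minimal sketch of a multiclass soft confusion matrix and the per-class $F_\beta$-Score computed from it. It is not the EAST method: it uses the predicted probabilities directly as soft memberships rather than the piecewise-linear Heaviside approximation, and it omits dynamic thresholding and annealing. The function names `soft_confusion_matrix` and `f_beta_per_class` are hypothetical.

```python
def soft_confusion_matrix(probs, labels, num_classes):
    """Row = true class, column = predicted class; entries sum probabilities."""
    cm = [[0.0] * num_classes for _ in range(num_classes)]
    for p, y in zip(probs, labels):
        for j in range(num_classes):
            cm[y][j] += p[j]
    return cm


def f_beta_per_class(cm, beta):
    """F_beta = (1 + b^2) TP / ((1 + b^2) TP + b^2 FN + FP) for each class."""
    k = len(cm)
    b2 = beta * beta
    scores = []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted elsewhere
        fp = sum(cm[r][c] for r in range(k)) - tp   # other classes predicted as c
        denom = (1 + b2) * tp + b2 * fn + fp
        scores.append((1 + b2) * tp / denom if denom > 0 else 0.0)
    return scores


# Toy data (assumed): four samples, two classes, confident predictions.
probs = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
labels = [0, 0, 1, 1]
cm = soft_confusion_matrix(probs, labels, num_classes=2)
print(f_beta_per_class(cm, beta=1.0))
print(f_beta_per_class(cm, beta=0.25))
```

In this toy case class 0 has perfect precision but only half recall, so its score is higher under $\beta=0.25$ than under $\beta=1$, which matches the abstract's point that a small $\beta$ expresses a preference for precision.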

Article 1497

Title@2025-05-26 (1): On scalable and efficient training of diffusion samplers

Title: On scalable and efficient training of diffusion samplers

Zur skalierbaren und effizienten Schulung von Diffusionssammlern

对推广采样员进行可推广和高效率的培训 2505.19552v1

Authors: Minkyu Kim, Kiyoung Seong, Dongyeop Woo, Sungsoo Ahn, Minsu Kim

We address the challenge of training diffusion models to sample from unnormalized energy distributions in the absence of data, the so-called diffusion samplers. Although these approaches have shown promise, they struggle to scale in more demanding scenarios where energy evaluations are expensive and the sampling space is high-dimensional. To address this limitation, we propose a scalable and sample-efficient framework that properly harmonizes powerful classical sampling methods with the diffusion sampler. Specifically, we utilize Markov chain Monte Carlo (MCMC) samplers with a novelty-based auxiliary energy as a Searcher to collect off-policy samples, where the auxiliary energy function encourages exploration of modes the diffusion sampler rarely visits. These off-policy samples are then combined with on-policy data to train the diffusion sampler, thereby expanding its coverage of the energy landscape. Furthermore, we identify primacy bias, i.e., the preference of samplers for early experience during training, as the main cause of mode collapse during training, and introduce a periodic re-initialization trick to resolve this issue. Our method significantly improves sample efficiency on standard benchmarks for diffusion samplers and also excels at higher-dimensional problems and real-world molecular conformer generation.

Article 1498

Title@2025-05-26 (1): Unlocking the Power of Diffusion Models in Sequential Recommendation: A Simple and Effective Approach

Title: Unlocking the Power of Diffusion Models in Sequential Recommendation: A Simple and Effective Approach

Entsperren der Macht von Diffusionsmodellen in der sequentiellen Empfehlung: Ein einfacher und effektiver Ansatz

在 “ 序列建议:简单而有效办法 “ 中解锁扩散模型扩散能力 2505.19544v1

Authors: Jialei Chen, Yuanbo Xu, Yiheng Jiang

In this paper, we focus on the often-overlooked issue of embedding collapse in existing diffusion-based sequential recommendation models and propose ADRec, an innovative framework designed to mitigate this problem. Diverging from previous diffusion-based methods, ADRec applies an independent noise process to each token and performs diffusion across the entire target sequence during training. ADRec captures token interdependency through auto-regression while modeling per-token distributions through token-level diffusion. This dual approach enables the model to effectively capture both sequence dynamics and item representations, overcoming the limitations of existing methods. To further mitigate embedding collapse, we propose a three-stage training strategy: (1) pre-training the embedding weights, (2) aligning these weights with the ADRec backbone, and (3) fine-tuning the model. During inference, ADRec applies the denoising process only to the last token, ensuring that the meaningful patterns in historical interactions are preserved. Our comprehensive empirical evaluation across six datasets underscores the effectiveness of ADRec in enhancing both the accuracy and efficiency of diffusion-based sequential recommendation systems.

Article 1499

Title@2025-05-26 (1): Cuff-KT: Tackling Learners’ Real-time Learning Pattern Adjustment via Tuning-Free Knowledge State Guided Model Updating

Title: Cuff-KT: Tackling Learners’ Real-time Learning Pattern Adjustment via Tuning-Free Knowledge State Guided Model Updating

Cuff-KT: Anpassung von Lernmustern in Echtzeit durch Tuning-Free Knowledge State Guided Model Aktualisieren

Cuff-KT:通过更新无资-无知识国家指导模式,解决学生实时学习模式调整问题 2505.19543v1

Authors: Yiyun Zhou, Zheqi Lv, Shengyu Zhang, Jingyuan Chen

Knowledge Tracing (KT) is a core component of Intelligent Tutoring Systems, modeling learners’ knowledge state to predict future performance and provide personalized learning support. Traditional KT models assume that learners’ learning abilities remain relatively stable over short periods or change in predictable ways based on prior performance. However, in reality, learners’ abilities change irregularly due to factors like cognitive fatigue, motivation, and external stress. We introduce the task of handling such changes, which we refer to as Real-time Learning Pattern Adjustment (RLPA). Existing KT models, when faced with RLPA, lack sufficient adaptability because they fail to account in a timely manner for the dynamic nature of different learners’ evolving learning patterns. Current strategies for enhancing adaptability rely on retraining, which leads to significant overfitting and high time overhead. To address this, we propose Cuff-KT, comprising a controller and a generator. The controller assigns value scores to learners, while the generator generates personalized parameters for selected learners. Cuff-KT adapts to data changes quickly, flexibly, and controllably without fine-tuning. Experiments on five datasets from different subjects demonstrate that Cuff-KT significantly improves the performance of five KT models with different structures under intra- and inter-learner shifts, with an average relative increase in AUC of 10% and 4%, respectively, at a negligible time cost, effectively tackling the RLPA task. Our code and datasets are fully available at https://github.com/zyy-2001/Cuff-KT.

Article 1500

Title@2025-05-26 (1): FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation

Title: FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation

FastCache: Schnelles Caching für Diffusionstransformator durch erlernbare lineare Annäherung

快速缓存: 通过可学习的线性近似化快速缓存扩散变异器 2505.20353v1

Authors: Dong Liu, Jiayi Zhang, Yifan Li, Yanxuan Yu, Ben Lengerich, Ying Nian Wu

Diffusion Transformers (DiT) are powerful generative models but remain computationally intensive due to their iterative structure and deep transformer stacks. To alleviate this inefficiency, we propose FastCache, a hidden-state-level caching and compression framework that accelerates DiT inference by exploiting redundancy within the model’s internal representations. FastCache introduces a dual strategy: (1) a spatial-aware token selection mechanism that adaptively filters redundant tokens based on hidden state saliency, and (2) a transformer-level cache that reuses latent activations across timesteps when changes are statistically insignificant. These modules work jointly to reduce unnecessary computation while preserving generation fidelity through learnable linear approximation. Theoretical analysis shows that FastCache maintains bounded approximation error under a hypothesis-testing-based decision rule. Empirical evaluations across multiple DiT variants demonstrate substantial reductions in latency and memory usage, with the best generation quality among the compared cache methods, as measured by FID and t-FID. The code for FastCache is available on GitHub at https://github.com/NoakLiu/FastCache-xDiT.

Article 1501

Title@2025-05-26 (1): R3: Robust Rubric-Agnostic Reward Models

Title: R3: Robust Rubric-Agnostic Reward Models

R3: Robuste Rubric-Agnostische Belohnungsmodelle

R3:坚固的Rubric-不可知奖赏模型 2505.13388v2

Authors: David Anugraha, Zilu Tang, Lester James V. Miranda, Hanyang Zhao, Mohammad Rifqi Farhansyah, Garry Kuwanto, Derry Wijaya, Genta Indra Winata

Reward models are essential for aligning language model outputs with human preferences, yet existing approaches often lack both controllability and interpretability. These models are typically optimized for narrow objectives, limiting their generalizability to broader downstream tasks. Moreover, their scalar outputs are difficult to interpret without contextual reasoning. To address these limitations, we introduce R3, a novel reward modeling framework that is rubric-agnostic, generalizable across evaluation dimensions, and provides interpretable, reasoned score assignments. R3 enables more transparent and flexible evaluation of language models, supporting robust alignment with diverse human values and use cases. Our models, data, and code are available as open source at https://github.com/rubricreward/r3

Article 1502

Title@2025-05-26 (1): Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs

Title: Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs

Amulett: Neuausrichtung während der Testzeit für Personalisierte Präferenzanpassung von LLMs

缩略图:在试验期间重新对准,以适应LLM的个性化偏好 2502.19148v2

Authors: Zhaowei Zhang, Fengshuo Bai, Qizhi Chen, Chengdong Ma, Mingzhi Wang, Haoran Sun, Zilong Zheng, Yaodong Yang

How to align large language models (LLMs) with user preferences from a static general dataset has been frequently studied. However, user preferences are usually personalized, changing, and diverse regarding culture, values, or time. This leads to the problem that the actual user preferences often do not coincide with those trained by the model developers in the practical use of LLMs. Since we cannot collect enough data and retrain for every demand, researching efficient real-time preference adaptation methods based on the backbone LLMs during test time is important. To this end, we introduce Amulet, a novel, training-free framework that formulates the decoding process of every token as a separate online learning problem with the guidance of simple user-provided prompts, thus enabling real-time optimization to satisfy users’ personalized preferences. To reduce the computational cost brought by this optimization process for each token, we additionally provide a closed-form solution for each iteration step of the optimization process, thereby reducing the computational time cost to a negligible level. The detailed experimental results demonstrate that Amulet can achieve significant performance improvements in rich settings with combinations of different LLMs, datasets, and user preferences, while maintaining acceptable computational efficiency.

Article 1503

Title@2025-05-26 (1): CITRAS: Covariate-Informed Transformer for Time Series Forecasting

Title: CITRAS: Covariate-Informed Transformer for Time Series Forecasting

CITRAS: Kovariat-informierter Transformer für die Zeitreihenprognose

CITRAS: 用于时间序列预测的共变-内建变换器 2503.24007v2

Authors: Yosuke Yamaguchi, Issei Suemitsu, Wenpeng Wei

In practical time series forecasting, covariates provide rich contextual information that can potentially enhance the forecast of target variables. Although some covariates extend into the future forecasting horizon (e.g., calendar events, discount schedules), most multivariate models fail to leverage this pivotal insight due to the length discrepancy with target variables. Additionally, capturing the dependency between target variables and covariates is non-trivial, as models must precisely reflect the local impact of covariates while also capturing global cross-variate dependencies. To overcome these challenges, we propose CITRAS, a decoder-only Transformer that flexibly leverages multiple targets, past covariates, and future covariates. While preserving strong autoregressive capabilities, CITRAS introduces two novel mechanisms in patch-wise cross-variate attention: Key-Value (KV) Shift and Attention Score Smoothing. KV Shift seamlessly incorporates future covariates into the forecasting of target variables based on their concurrent dependencies. Additionally, Attention Score Smoothing refines locally accurate patch-wise cross-variate dependencies into global variate-level dependencies by smoothing the past series of attention scores. Experimentally, CITRAS outperforms state-of-the-art models on thirteen real-world benchmarks from both covariate-informed and multivariate settings, demonstrating its versatile ability to leverage cross-variate and cross-time dependencies for improved forecasting accuracy.

Article 1504

Title@2025-05-26 (1): Continuous-Time Analysis of Heavy Ball Momentum in Min-Max Games

Title: Continuous-Time Analysis of Heavy Ball Momentum in Min-Max Games

Kontinuierliche Zeitanalyse von schweren Ball Momentum in Min-Max-Spiele

Min-Min-Max运动会重球势连续分析 2505.19537v1

Authors: Yi Feng, Kaito Fujii, Stratis Skoulakis, Xiao Wang, Volkan Cevher

Since Polyak’s pioneering work, heavy ball (HB) momentum has been widely studied in minimization. However, its role in min-max games remains largely unexplored. Because HB momentum is a key component of practical min-max algorithms like Adam, this gap limits their effectiveness. In this paper, we present a continuous-time analysis for HB with simultaneous and alternating update schemes in min-max games. Locally, we prove that smaller momentum enhances algorithmic stability by enabling local convergence across a wider range of step sizes, with alternating updates generally converging faster. Globally, we study the implicit regularization of HB, and find that smaller momentum guides algorithm trajectories towards shallower slope regions of the loss landscapes, with alternating updates amplifying this effect. Surprisingly, all these phenomena differ from those observed in minimization, where larger momentum yields similar effects. Our results reveal fundamental differences between HB in min-max games and minimization, and numerical experiments further validate our theoretical results.
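
The two update schemes named above can be illustrated with a small discrete-time simulation. This is a sketch under stated assumptions, not the paper's continuous-time analysis: the bilinear game $f(x, y) = xy$, the function name `heavy_ball_gda`, and all parameter values are chosen here for illustration only.

```python
import math


def heavy_ball_gda(x, y, step, beta, n_steps, alternating):
    """Heavy-ball gradient descent-ascent on the toy game f(x, y) = x * y.

    x minimizes f and y maximizes f. With alternating=True, the y-update
    uses the freshly updated x; otherwise both players use the old iterate.
    Returns the distance of the final iterate from the equilibrium (0, 0).
    """
    x_prev, y_prev = x, y
    for _ in range(n_steps):
        # Descent step for x: the gradient of x*y with respect to x is y.
        x_new = x - step * y + beta * (x - x_prev)
        # Ascent step for y: the gradient of x*y with respect to y is x.
        x_for_y = x_new if alternating else x
        y_new = y + step * x_for_y + beta * (y - y_prev)
        x_prev, y_prev = x, y
        x, y = x_new, y_new
    return math.hypot(x, y)


simultaneous = heavy_ball_gda(1.0, 0.0, step=0.1, beta=0.0, n_steps=1000, alternating=False)
alternating = heavy_ball_gda(1.0, 0.0, step=0.1, beta=0.0, n_steps=1000, alternating=True)
print(simultaneous, alternating)
```

With `beta=0.0` on this toy game, the simultaneous iterate spirals away from the equilibrium while the alternating iterate stays close to its starting radius. Varying `beta` gives a quick way to experiment with how momentum interacts with each scheme.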

Article 1505

Title@2025-05-26 (1): Training-Free Multi-Step Audio Source Separation

Title: Training-Free Multi-Step Audio Source Separation

Schulungsfreie Mehrstufen-Audio-Quellentrennung

无培训的多步骤音频来源分离 2505.19534v1

Authors: Yongyi Zang, Jingyi Li, Qiuqiang Kong

Audio source separation aims to separate a mixture into target sources. Previous audio source separation systems usually conduct one-step inference, which does not fully explore the separation ability of models. In this work, we reveal that pretrained one-step audio source separation models can be leveraged for multi-step separation without additional training. We propose a simple yet effective inference method that iteratively applies separation by optimally blending the input mixture with the previous step’s separation result. At each step, we determine the optimal blending ratio by maximizing a metric. We prove that our method always yields improvement over one-step inference, provide error bounds based on model smoothness and metric robustness, and provide theoretical analysis connecting our method to denoising along linear interpolation paths between noise and clean distributions, a property we link to denoising diffusion bridge models. Our approach effectively delivers improved separation performance as a “free lunch” from existing models. Our empirical results demonstrate that our multi-step separation approach consistently outperforms one-step inference across both speech enhancement and music source separation tasks, and can achieve scaling performance similar to training a larger model, using more data, or in some cases employing a multi-step training objective. These improvements appear not only on the optimization metric during multi-step inference, but also extend to nearly all non-optimized metrics (with one exception). We also discuss limitations of our approach and directions for future research.
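
The iterative blending loop described above can be sketched as follows. This is a toy illustration under assumptions, not the authors' implementation: the function name `multi_step_separate`, the fixed grid of blending ratios (a stand-in for the optimal ratio search), the shrinkage "separator", and the reference-based metric are all hypothetical.

```python
def multi_step_separate(mixture, separator, metric, n_steps=3,
                        ratios=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Re-apply a one-step separator to blends of mixture and previous estimate.

    Ratio 1.0 feeds the raw mixture back in, so every step's best candidate
    scores at least as well as plain one-step inference under `metric`.
    """
    estimate = separator(mixture)  # ordinary one-step inference
    for _ in range(n_steps - 1):
        candidates = []
        for r in ratios:
            blended = [r * m + (1.0 - r) * e for m, e in zip(mixture, estimate)]
            candidates.append(separator(blended))
        estimate = max(candidates, key=metric)  # keep the best-scoring blend
    return estimate


# Toy setup (all assumed): a known clean signal, additive noise, a crude
# shrinkage "separator", and negative squared error as the metric.
clean = [1.0, -2.0, 0.5]
mixture = [c + n for c, n in zip(clean, [0.4, 0.4, -0.4])]
separator = lambda signal: [0.8 * v for v in signal]
metric = lambda est: -sum((e - c) ** 2 for e, c in zip(est, clean))

one_step = separator(mixture)
multi_step = multi_step_separate(mixture, separator, metric)
print(metric(one_step), metric(multi_step))
```

Including ratio 1.0 in the grid is what makes the "no worse than one-step" property hold by construction in this sketch. The paper's own guarantee and error bounds rest on its stated smoothness and robustness assumptions.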

Article 1506

Title@2025-05-26 (1): ExAnte: A Benchmark for Ex-Ante Inference in Large Language Models

Title: ExAnte: A Benchmark for Ex-Ante Inference in Large Language Models

ExAnte: Ein Benchmark für Ex-Ante-Schlussfolgerungen in großen Sprachmodellen

ExAnte:大语言模型前推定基准 2505.19533v1

Authors: Yachuan Liu, Xiaochun Wei, Lin Shi, Xinnuo Li, Bohan Zhang, Paramveer Dhillon, Qiaozhu Mei

Large language models (LLMs) face significant challenges in ex-ante reasoning, where analysis, inference, or predictions must be made without access to information from future events. Even with explicit prompts enforcing temporal cutoffs, LLMs often generate outputs influenced by internalized knowledge of events beyond the specified cutoff. This paper introduces a novel task and benchmark designed to evaluate the ability of LLMs to reason while adhering to such temporal constraints. The benchmark includes a variety of tasks: stock prediction, Wikipedia event prediction, scientific publication prediction, and Question Answering (QA), designed to assess factual knowledge under temporal cutoff constraints. We use leakage rate to quantify models’ reliance on future information beyond cutoff timestamps. Experimental results reveal that LLMs struggle to consistently adhere to temporal cutoffs across common prompting strategies and tasks, demonstrating persistent challenges in ex-ante reasoning. This benchmark provides a potential evaluation framework to advance the development of LLMs’ temporal reasoning ability for time-sensitive applications.

Article 1507

Title@2025-05-26 (1): Fox in the Henhouse: Supply-Chain Backdoor Attacks Against Reinforcement Learning

Title: Fox in the Henhouse: Supply-Chain Backdoor Attacks Against Reinforcement Learning

Fox im Henhouse: Supply-Chain-Hintertür greift gegen Verstärkungslernen an

Henhouse的狐狸:供应-Chain对加强学习的后门攻击 2505.19532v1

Authors: Shijie Liu, Andrew C. Cullen, Paul Montague, Sarah Erfani, Benjamin I. P. Rubinstein

The current state-of-the-art backdoor attacks against Reinforcement Learning (RL) rely upon unrealistically permissive access models that assume the attacker can read (or even write) the victim’s policy parameters, observations, or rewards. In this work, we question whether such a strong assumption is required to launch backdoor attacks against RL. To answer this question, we propose the Supply-Chain Backdoor (SCAB) attack, which targets a common RL workflow: training agents using external agents that are provided separately or embedded within the environment. In contrast to prior works, our attack only relies on legitimate interactions of the RL agent with the supplied agents. Despite this limited access model, by poisoning a mere 3% of training experiences, our attack can successfully activate over 90% of triggered actions, reducing the average episodic return by 80% for the victim. Our novel attack demonstrates that RL attacks are likely to become a reality under untrusted RL training supply-chains.

Article 1508

Title@2025-05-26 (1): Minimalist Softmax Attention Provably Learns Constrained Boolean Functions

Title: Minimalist Softmax Attention Provably Learns Constrained Boolean Functions

Minimalistische Softmax-Achtung lernt nachweislich eingeschränkte Boolean-Funktionen

最小软性软性关注 2505.19531v1

Authors: Jerry Yao-Chieh Hu, Xiwen Zhang, Maojiang Su, Zhao Song, Han Liu

We study the computational limits of learning $k$-bit Boolean functions (specifically, $\mathrm{AND}$, $\mathrm{OR}$, and their noisy variants), using a minimalist single-head softmax-attention mechanism, where $k=\Theta(d)$ relevant bits are selected from $d$ inputs. We show that these simple $\mathrm{AND}$ and $\mathrm{OR}$ functions are unsolvable with a single-head softmax-attention mechanism alone. However, with teacher forcing, the same minimalist attention is capable of solving them. These findings offer two key insights: Architecturally, solving these Boolean tasks requires only minimalist attention, without deep Transformer blocks or FFNs. Methodologically, one gradient descent update with supervision suffices and replaces the multi-step Chain-of-Thought (CoT) reasoning scheme of [Kim and Suzuki, ICLR 2025] for solving Boolean problems. Together, the bounds expose a fundamental gap between what this minimal architecture achieves under ideal supervision and what is provably impossible under standard training.

Article 1509

Title@2025-05-26 (1): SLOT: Sample-specific Language Model Optimization at Test-time

Title: SLOT: Sample-specific Language Model Optimization at Test-time

SLOT: Beispielspezifische Sprachmodelloptimierung zur Testzeit

SLOT: 测试时特定抽样语文示范模式优化 2505.12392v2

Authors: Yang Hu, Xingyu Zhang, Xueji Fang, Zhiyang Chen, Xiao Wang, Huatian Zhang, Guojun Qi

We propose SLOT (Sample-specific Language Model Optimization at Test-time), a novel and parameter-efficient test-time inference approach that enhances a language model’s ability to more accurately respond to individual prompts. Existing Large Language Models (LLMs) often struggle with complex instructions, leading to poor performance on those not well represented among general samples. To address this, SLOT performs a few optimization steps at test time to update a lightweight sample-specific parameter vector. This vector is added to the final hidden layer before the output head, and enables efficient adaptation by caching the last-layer features during per-sample optimization. By minimizing the cross-entropy loss on the input prompt only, SLOT helps the model better align with and follow each given instruction. In experiments, we demonstrate that our method outperforms the compared models across multiple benchmarks and LLMs. For example, Qwen2.5-7B with SLOT achieves an accuracy gain of 8.6% on GSM8K from 57.54% to 66.19%, while DeepSeek-R1-Distill-Llama-70B with SLOT achieves a SOTA accuracy of 68.69% on GPQA among 70B-level models. Our code is available at https://github.com/maple-research-lab/SLOT.

Article 1510

Title@2025-05-26 (1): Navigating loss manifolds via rigid body dynamics: A promising avenue for robustness and generalisation

Title: Navigating loss manifolds via rigid body dynamics: A promising avenue for robustness and generalisation

Navigieren von Verlustkrümmern über starre Körperdynamik: Ein vielversprechender Weg für Robustheit und Verallgemeinerung

通过僵硬体体体动态来控制损失方块:加强和普及的有希望的途径 2505.19527v1

Authors: Mohammed D. Belgoumri, Mohamed Reda Bouadjenek, Hakim Hacid, Imran Razzak, Sunil Aryal

Training large neural networks through gradient-based optimization requires navigating high-dimensional loss landscapes, which often exhibit pathological geometry, leading to undesirable training dynamics. In particular, poor generalization frequently results from convergence to sharp minima that are highly sensitive to input perturbations, causing the model to overfit the training data while failing to generalize to unseen examples. Furthermore, these optimization procedures typically display strong dependence on the fine structure of the loss landscape, leading to unstable training dynamics due to the fractal-like nature of the loss surface. In this work, we propose an alternative optimizer that simultaneously reduces this dependence and avoids sharp minima, thereby improving generalization. This is achieved by simulating the motion of the center of a ball rolling on the loss landscape. The degree to which our optimizer departs from standard gradient descent is controlled by a hyperparameter representing the radius of the ball. Changing this hyperparameter allows for probing the loss landscape at different scales, making it a valuable tool for understanding its geometry.

Article 1511

Title@2025-05-26 (1): Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate

Title: Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate

Rethinking Gating Mechanism in Sparse MoE: Arbiträre Modalitätsinputs mit vertrauensgeführtem Tor bearbeiten

微粒MOE中的重新思考定位机制:用信任引导门处理任意模式投入 2505.19525v1

Authors: Liangwei Nathan Zheng, Wei Emma Zhang, Mingyu Guo, Miao Xu, Olaf Maennel, Weitong Chen

Effectively managing missing modalities is a fundamental challenge in real-world multimodal learning scenarios, where data incompleteness often results from systematic collection errors or sensor failures. Sparse Mixture-of-Experts (SMoE) architectures have the potential to naturally handle multimodal data, with individual experts specializing in different modalities. However, existing SMoE approaches often lack the ability to handle missing modalities, leading to performance degradation and poor generalization in real-world applications. We propose Conf-SMoE, which introduces a two-stage imputation module to handle the missing-modality problem in the SMoE architecture, and we explain expert collapse through theoretical analysis supported by strong empirical evidence. Inspired by our theoretical analysis, Conf-SMoE proposes a novel expert gating mechanism by detaching the softmax routing score to a task confidence score w.r.t. the ground truth. This naturally relieves expert collapse without introducing an additional load-balancing loss function. We show that the insights on expert collapse align with other gating mechanisms, such as Gaussian and Laplacian gates. We also evaluate the proposed method on four real-world datasets under three experimental settings to provide a comprehensive analysis of Conf-SMoE on modality fusion and resistance to missing modalities.

Article 1512

Title@2025-05-26 (1): Semi-Supervised Model-Free Bayesian State Estimation from Compressed Measurements

Title: Semi-Supervised Model-Free Bayesian State Estimation from Compressed Measurements

Halbüberwachte modellfreie bayesische Staatsschätzung aus komprimierten Messungen

根据压缩计量法对贝耶斯州无模式模型的半有效估算 2407.07368v5

Authors: Anubhab Ghosh, Yonina C. Eldar, Saikat Chatterjee

We consider data-driven Bayesian state estimation from compressed measurements (BSCM) of a model-free process. The dimension of the temporal measurement vector is lower than that of the temporal state vector to be estimated, leading to an under-determined inverse problem. For a ‘model-free process,’ the underlying dynamical model of the state’s evolution is unknown. Hence, it is difficult to use traditional model-driven methods such as Kalman and particle filters. Instead, we consider data-driven methods. We experimentally show that two existing unsupervised learning-based data-driven methods, data-driven nonlinear state estimation (DANSE) and the deep Markov model (DMM), fail to address the BSCM problem in a model-free process. While DANSE provides good predictive/forecasting performance when modeling the temporal measurement data as a time series, its unsupervised learning lacks suitable regularization for tackling the BSCM task. We then propose a semi-supervised learning approach and develop a semi-supervised learning-based DANSE method, referred to as SemiDANSE. In SemiDANSE, we use a large amount of unlabelled data along with a limited amount of labelled data, i.e., pairwise measurement-and-state data, which provides the desired regularization. Using three benchmark dynamical systems, we empirically show that the data-driven SemiDANSE provides competitive state estimation performance for BSCM using a handful of different measurement systems, against a hybrid method called KalmanNet and two model-driven methods (extended Kalman filter and unscented Kalman filter) that know the dynamical models exactly.

Article 1513

Title@2025-05-26 (1): Applications and Effect Evaluation of Generative Adversarial Networks in Semi-Supervised Learning

Title: Applications and Effect Evaluation of Generative Adversarial Networks in Semi-Supervised Learning

Anwendungen und Wirkungsbewertung generativer adversarialer Netzwerke im semi-überwachten Lernen

半监测学习中产生反效果网络的应用和效果评价 2505.19522v1

Authors: Jiyu Hu, Haijiang Zeng, Zhen Tian

Image classification, a core task in computer vision, relies on high-quality labelled data, which restricts the wide application of deep learning models in practical scenarios. To alleviate the problem of insufficient labelled samples, semi-supervised learning has gradually become a research hotspot. In this paper, we construct a semi-supervised image classification model based on Generative Adversarial Networks (GANs). By introducing a collaborative training mechanism among generators, discriminators and classifiers, we make effective use of limited labelled data and a large amount of unlabelled data, improve image generation quality and classification accuracy, and provide an effective solution for image recognition in complex environments.

Article 1514

Title@2025-05-26 (1): Learning Dynamics under Environmental Constraints via Measurement-Induced Bundle Structures

Title: Learning Dynamics under Environmental Constraints via Measurement-Induced Bundle Structures

Dynamisches Lernen unter Umweltauflagen durch messinduzierte Bundle-Strukturen

通过衡量产生的捆绑结构,在环境制约因素下学习动力 2505.19521v1

Authors: Dongzhe Zheng, Wenjie Mei

Learning unknown dynamics under environmental (or external) constraints is fundamental to many fields (e.g., modern robotics), and it is particularly challenging when constraint information is only locally available and uncertain. Existing approaches that require global constraints or use probabilistic filtering fail to fully exploit the geometric structure inherent in local measurements (obtained, e.g., from sensors) and constraints. This paper presents a geometric framework unifying measurements, constraints, and dynamics learning through a fiber bundle structure over the state space. This naturally induced geometric structure enables measurement-aware Control Barrier Functions that adapt to local sensing (or measurement) conditions. By integrating Neural ODEs, our framework learns continuous-time dynamics while preserving geometric constraints, with theoretical guarantees of learning convergence and constraint satisfaction that depend on sensing quality. The geometric framework not only enables efficient dynamics learning but also suggests promising directions for integration with reinforcement learning approaches. Extensive simulations demonstrate significant improvements in both learning efficiency and constraint satisfaction over traditional methods, especially under limited and uncertain sensing conditions.

Article 1515

Title@2025-05-26 (1): SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback

Title: SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback

SIPDO: Closed-Loop Prompt Optimierung über Synthetic Data Feedback

SIPDO:通过合成数据反馈,通过闭闭电话快速优化 2505.19514v1

Authors: Yaoning Yu, Ye Yu, Kai Wei, Haojing Luo, Haohan Wang

Prompt quality plays a critical role in the performance of large language models (LLMs), motivating a growing body of work on prompt optimization. Most existing methods optimize prompts over a fixed dataset, assuming static input distributions and offering limited support for iterative improvement. We introduce SIPDO (Self-Improving Prompts through Data-Augmented Optimization), a closed-loop framework for prompt learning that integrates synthetic data generation into the optimization process. SIPDO couples a synthetic data generator with a prompt optimizer, where the generator produces new examples that reveal current prompt weaknesses and the optimizer incrementally refines the prompt in response. This feedback-driven loop enables systematic improvement of prompt performance without assuming access to external supervision or new tasks. Experiments across question answering and reasoning benchmarks show that SIPDO outperforms standard prompt tuning methods, highlighting the value of integrating data synthesis into prompt learning workflows.

Article 1516

Title@2025-05-26 (1): Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models

Title: Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models

Benchmarking multimodaler Wissenskonflikt für große multimodale Modelle

确定大型多式联运模式多模式知识冲突基准 2505.19509v1

Authors: Yifan Jia, Kailin Jiang, Yuyang Liang, Qihan Ren, Yi Xin, Rui Yang, Fenze Feng, Mingcai Chen, Hengyang Lu, Haozhe Wang, Xiaoye Qu, Dongrui Liu, Lizhen Cui, Yuntao Du

Large Multimodal Models (LMMs) face notable challenges when encountering multimodal knowledge conflicts, particularly under retrieval-augmented generation (RAG) frameworks where the contextual information from external sources may contradict the model’s internal parametric knowledge, leading to unreliable outputs. However, existing benchmarks fail to reflect such realistic conflict scenarios. Most focus solely on intra-memory conflicts, while context-memory and inter-context conflicts remain largely under-investigated. Furthermore, commonly used factual knowledge-based evaluations are often overlooked, and existing datasets lack a thorough investigation into conflict detection capabilities. To bridge this gap, we propose MMKC-Bench, a benchmark designed to evaluate factual knowledge conflicts in both context-memory and inter-context scenarios. MMKC-Bench encompasses three types of multimodal knowledge conflicts and includes 1,573 knowledge instances and 3,381 images across 23 broad types, collected through automated pipelines with human verification. We evaluate three representative series of LMMs on both model behavior analysis and conflict detection tasks. Our findings show that while current LMMs are capable of recognizing knowledge conflicts, they tend to favor internal parametric knowledge over external evidence. We hope MMKC-Bench will foster further research in multimodal knowledge conflict and enhance the development of multimodal RAG systems. The source code is available at https://github.com/MLLMKCBENCH/MLLMKC.

Article 1517

Title@2025-05-26 (1): Multimodal Machine Translation with Visual Scene Graph Pruning

Title: Multimodal Machine Translation with Visual Scene Graph Pruning

Multimodale maschinelle Übersetzung mit visuellen Szenendiagrammen

带有视觉场景图的多式机器翻译 2505.19507v1

Authors: Chenyu Lu, Shiliang Sun, Jing Zhao, Nan Zhang, Tengfei Song, Hao Yang

Multimodal machine translation (MMT) seeks to address the challenges posed by linguistic polysemy and ambiguity in translation tasks by incorporating visual information. A key bottleneck in current MMT research is the effective utilization of visual data. Previous approaches have focused on extracting global or region-level image features and using attention or gating mechanisms for multimodal information fusion. However, these methods have neither adequately tackled the issue of visual information redundancy in MMT nor proposed effective solutions. In this paper, we introduce a novel approach: multimodal machine translation with visual Scene Graph Pruning (PSG), which leverages language scene graph information to guide the pruning of redundant nodes in visual scene graphs, thereby reducing noise in downstream translation tasks. Through extensive comparative experiments with state-of-the-art methods and ablation studies, we demonstrate the effectiveness of the PSG model. Our results also highlight the potential of visual information pruning for advancing the field of MMT.

Article 1518

Title@2025-05-26 (1): Understanding Why Large Language Models Can Be Ineffective in Time Series Analysis: The Impact of Modality Alignment

Title: Understanding Why Large Language Models Can Be Ineffective in Time Series Analysis: The Impact of Modality Alignment

Verständnis, warum große Sprachmodelle in der Zeitreihenanalyse unwirksam sein können: Die Auswirkungen der Modalitätsausrichtung

理解为何大语言模型在时间序列分析中无效:方式调整的影响 2410.12326v2

Authors: Liangwei Nathan Zheng, Chang George Dong, Wei Emma Zhang, Lin Yue, Miao Xu, Olaf Maennel, Weitong Chen

Large Language Models (LLMs) have demonstrated impressive performance in time series analysis and seem to capture temporal relationships better than traditional transformer-based approaches. However, since LLMs are not designed for time series tasks, simpler models like linear regressions can often achieve comparable performance with far less complexity. In this study, we perform extensive experiments to assess the effectiveness of applying LLMs to key time series tasks, including forecasting, classification, imputation, and anomaly detection. We compare the performance of LLMs against simpler baseline models, such as single-layer linear models and randomly initialized LLMs. Our results reveal that LLMs offer minimal advantages for these core time series tasks and may even distort the temporal structure of the data. In contrast, simpler models consistently outperform LLMs while requiring far fewer parameters. Furthermore, we analyze existing reprogramming techniques and show, through data manifold analysis, that these methods fail to effectively align time series data with language and display “pseudo-alignment” behavior in embedding space. Our findings suggest that the performance of LLM-based methods in time series tasks arises from the intrinsic characteristics and structure of time series data, rather than any meaningful alignment with the language model architecture.

Article 1519

Title@2025-05-26 (1): DOGe: Defensive Output Generation for LLM Protection Against Knowledge Distillation

Title: DOGe: Defensive Output Generation for LLM Protection Against Knowledge Distillation

DOGe: Defensive Output Generation für LLM-Schutz vor Wissensdestillation

DOGe: 防知识蒸馏保护LLM的防御性产出产生 2505.19504v1

Authors: Pingzhi Li, Zhen Tan, Huaizhi Qu, Huan Liu, Tianlong Chen

Large Language Models (LLMs) represent substantial intellectual and economic investments, yet their effectiveness can inadvertently facilitate model imitation via knowledge distillation (KD). In practical scenarios, competitors can distill proprietary LLM capabilities by simply observing publicly accessible outputs, akin to reverse-engineering a complex performance by observation alone. Existing protective methods like watermarking only identify imitation post-hoc, while other defenses assume the student model mimics the teacher’s internal logits, rendering them ineffective against distillation purely from observed output text. This paper confronts the challenge of actively protecting LLMs within the realistic constraints of API-based access. We introduce an effective and efficient Defensive Output Generation (DOGe) strategy that subtly modifies the output behavior of an LLM. Its outputs remain accurate and useful for legitimate users, yet are designed to be misleading for distillation, significantly undermining imitation attempts. We achieve this by fine-tuning only the final linear layer of the teacher LLM with an adversarial loss. This targeted training approach anticipates and disrupts distillation attempts during inference time. Our experiments show that, while the original performance of the teacher model is preserved or even improved, student models distilled from the defensively generated teacher outputs show catastrophically reduced performance, demonstrating our method’s effectiveness as a practical safeguard against KD-based model imitation.

Article 1520

Title@2025-05-26 (1): Differentially private ratio statistics

Title: Differentially private ratio statistics

Statistiken über unterschiedliche private Verhältnisse

差异性私人比率统计 2505.20351v1

Authors: Tomer Shoham, Katrina Ligett

Ratio statistics, such as relative risk and odds ratios, play a central role in hypothesis testing, model evaluation, and decision-making across many areas of machine learning, including causal inference and fairness analysis. However, despite privacy concerns surrounding many datasets and despite increasing adoption of differential privacy, differentially private ratio statistics have largely been neglected by the literature and have only recently received an initial treatment by Lin et al. [1]. This paper attempts to fill this lacuna, giving results that can guide practice in evaluating ratios when the results must be protected by differential privacy. In particular, we show that even a simple algorithm can provide excellent properties concerning privacy, sample accuracy, and bias, not just asymptotically but also at quite small sample sizes. Additionally, we analyze a differentially private estimator for relative risk, prove its consistency, and develop a method for constructing valid confidence intervals. Our approach bridges a gap in the differential privacy literature and provides a practical solution for ratio estimation in private machine learning pipelines.
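
To make the idea of a private ratio statistic concrete, here is a minimal sketch of one simple construction. It is not necessarily the algorithm analyzed in the paper. It adds Laplace noise to the four cells of a 2x2 contingency table and then forms the relative risk as post-processing. The function names, the `floor` clamp, and the add/remove-one-record neighboring relation (which gives the count vector an L1 sensitivity of 1) are all assumptions made for illustration.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_relative_risk(a, b, c, d, epsilon, rng, floor=0.5):
    """Noisy relative risk from a 2x2 table.

    a, b: exposed group with / without the outcome
    c, d: unexposed group with / without the outcome
    Assumes add/remove-one-record neighbors, so Laplace(1/epsilon) noise on
    the four counts is epsilon-DP; clamping and the ratio are post-processing.
    """
    noisy = [max(n + laplace_noise(1.0 / epsilon, rng), floor) for n in (a, b, c, d)]
    na, nb, nc, nd = noisy
    return (na / (na + nb)) / (nc / (nc + nd))

rng = random.Random(0)
estimate = dp_relative_risk(30, 70, 10, 90, epsilon=1.0, rng=rng)
```

The non-private relative risk for this table is (30/100)/(10/100) = 3. The clamp keeps the ratio finite when noise drives a small count to zero or below, at the cost of some bias. A production mechanism would also need a noise sampler hardened against floating-point attacks, which this sketch does not attempt.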

Article 1521

Title@2025-05-26 (1): Learning for Dynamic Combinatorial Optimization without Training Data

Title: Learning for Dynamic Combinatorial Optimization without Training Data

Lernen für dynamische kombinatorische Optimierung ohne Trainingsdaten

没有培训数据的动态组合优化学习 2505.19497v1

Authors: Yiqiao Liao, Farinaz Koushanfar, Parinaz Naghizadeh

We introduce DyCO-GNN, a novel unsupervised learning framework for Dynamic Combinatorial Optimization that requires no training data beyond the problem instance itself. DyCO-GNN leverages structural similarities across time-evolving graph snapshots to accelerate optimization while maintaining solution quality. We evaluate DyCO-GNN on dynamic maximum cut, maximum independent set, and the traveling salesman problem across diverse datasets of varying sizes, demonstrating its superior performance under tight and moderate time budgets. DyCO-GNN consistently outperforms the baseline methods, achieving high-quality solutions up to 3-60x faster, highlighting its practical effectiveness in rapidly evolving resource-constrained settings.

Article 1522

Title@2025-05-26 (1): MetaSTNet: Multimodal Meta-learning for Cellular Traffic Conformal Prediction

Title: MetaSTNet: Multimodal Meta-learning for Cellular Traffic Conformal Prediction

MetaSTNet: Multimodales Meta-Learning für zellulären Verkehr Konforme Vorhersage

MetaSTNet: 细胞交通预测的多模式元学习 2505.21553v1

Authors: Hui Ma, Kai Yang

Network traffic prediction techniques have attracted much attention since they are valuable for network congestion control and user experience improvement. While existing prediction techniques can achieve favorable performance when there is sufficient training data, it remains a great challenge to make accurate predictions when only a small amount of training data is available. To tackle this problem, we propose a deep learning model, entitled MetaSTNet, based on a multimodal meta-learning framework. It is an end-to-end network architecture that trains the model in a simulator and transfers the meta-knowledge to a real-world environment, which can quickly adapt and obtain accurate predictions on a new task with only a small amount of real-world training data. In addition, we further employ cross conformal prediction to assess the calibrated prediction intervals. Extensive experiments have been conducted on real-world datasets to illustrate the efficiency and effectiveness of MetaSTNet.

Article 1523

Title@2025-05-26 (1): Discounted Online Convex Optimization: Uniform Regret Across a Continuous Interval

Title: Discounted Online Convex Optimization: Uniform Regret Across a Continuous Interval

Discounted Online Convex-Optimierung: Einheitlicher Bedauern über einen kontinuierlichen Intervall

贴现的在线 Convex 优化: 连续间隔的统一遗憾 2505.19491v1

Authors: Wenhao Yang, Sifan Yang, Lijun Zhang

Reflecting the greater significance of recent history over the distant past in non-stationary environments, $\lambda$-discounted regret has been introduced in online convex optimization (OCO) to gracefully forget past data as new information arrives. When the discount factor $\lambda$ is given, online gradient descent with an appropriate step size achieves an $O(1/\sqrt{1-\lambda})$ discounted regret. However, the value of $\lambda$ is often not predetermined in real-world scenarios. This gives rise to a significant open question: is it possible to develop a discounted algorithm that adapts to an unknown discount factor? In this paper, we answer this question affirmatively by providing a novel analysis demonstrating that smoothed OGD (SOGD) achieves a uniform $O(\sqrt{\log T/(1-\lambda)})$ discounted regret, holding simultaneously for all values of $\lambda$ across a continuous interval. The basic idea is to maintain multiple OGD instances to handle different discount factors, and to aggregate their outputs sequentially with an online prediction algorithm named Discounted-Normal-Predictor (DNP) (Kapralov and Panigrahy, 2010). Our analysis reveals that DNP can combine the decisions of two experts, even when they operate on discounted regret with different discount factors.
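
As a concrete reading of $\lambda$-discounted regret, the sketch below assumes the usual definition $\sum_{t=1}^{T} \lambda^{T-t}\,(f_t(x_t) - f_t(u))$ against a fixed comparator $u$, and evaluates it for plain projected online gradient descent on one-dimensional quadratic losses. The loss family, the domain $[-1, 1]$, the step size, and the function names are illustrative assumptions. The sketch does not implement SOGD or the DNP aggregation.

```python
def project(x, lo=-1.0, hi=1.0):
    # Euclidean projection onto the interval [lo, hi].
    return max(lo, min(hi, x))

def run_ogd(targets, eta):
    """Play OGD on f_t(x) = (x - z_t)^2 over [-1, 1]; return per-round losses."""
    x, losses = 0.0, []
    for z in targets:
        losses.append((x - z) ** 2)
        x = project(x - eta * 2.0 * (x - z))  # gradient step, then projection
    return losses

def discounted_regret(alg_losses, cmp_losses, lam):
    # Round t (0-indexed) gets weight lam^(T-1-t), so the last round has weight 1.
    T = len(alg_losses)
    return sum(lam ** (T - 1 - t) * (a - c)
               for t, (a, c) in enumerate(zip(alg_losses, cmp_losses)))

# A drifting environment: the minimizer jumps from -0.5 to 0.5 halfway through.
targets = [-0.5] * 50 + [0.5] * 50
alg = run_ogd(targets, eta=0.1)
u = 0.5                                   # comparator matching the recent regime
cmp = [(u - z) ** 2 for z in targets]
regret = discounted_regret(alg, cmp, lam=0.9)
```

With $\lambda = 0.9$ the early rounds, where the comparator $u = 0.5$ is a poor fit, are almost entirely forgotten, which is the "graceful forgetting" the abstract describes.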

Article 1524

Title@2025-05-26 (1): Understanding Transformer from the Perspective of Associative Memory

Title: Understanding Transformer from the Perspective of Associative Memory

Transformer aus der Perspektive des assoziativen Gedächtnisses verstehen

从共同记忆的角度理解变异器 2505.19488v1

Authors: Shu Zhong, Mingyu Xu, Tenglong Ao, Guang Shi

In this paper, we share our reflections and insights on understanding Transformer architectures through the lens of associative memory, a classic psychological concept inspired by human cognition. We start with the basics of associative memory (think simple linear attention) and then dive into two dimensions. Memory Capacity: How much can a Transformer really remember, and how well? We introduce retrieval SNR to measure this and use a kernel perspective to mathematically reveal why Softmax Attention is so effective. We also show how FFNs can be seen as a type of associative memory, leading to insights on their design and potential improvements. Memory Update: How do these memories learn and evolve? We present a unified framework for understanding how different Transformer variants (like DeltaNet and Softmax Attention) update their “knowledge base”. This leads us to tackle two provocative questions: 1. Are Transformers fundamentally limited in what they can express, and can we break these barriers? 2. If a Transformer had infinite context, would it become infinitely intelligent? We want to demystify Transformer architecture, offering a clearer understanding of existing designs. This exploration aims to provide fresh insights and spark new avenues for Transformer innovation.
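
The associative-memory view of linear attention can be sketched in a few lines. The memory is a matrix `S` that accumulates outer products of values and keys, and a query reads it back with a matrix-vector product. The second writer below uses the standard delta rule, the kind of update associated with DeltaNet, with step size `beta`. The function names and the toy keys and values are illustrative assumptions, not code from the paper.

```python
def outer_add(S, v, k, scale=1.0):
    # Return S + scale * v k^T as a new matrix (list of rows).
    return [[S[i][j] + scale * v[i] * k[j] for j in range(len(k))]
            for i in range(len(v))]

def read(S, q):
    # Retrieval: S q, i.e. each stored value weighted by <key, q>.
    return [sum(s * x for s, x in zip(row, q)) for row in S]

def write_additive(S, k, v):
    # Linear-attention style update: accumulate v k^T.
    return outer_add(S, v, k)

def write_delta(S, k, v, beta=1.0):
    # Delta-rule update: correct the memory by the current retrieval error for k.
    err = [vi - pi for vi, pi in zip(v, read(S, k))]
    return outer_add(S, err, k, beta)

k1, k2 = [1.0, 0.0], [0.0, 1.0]            # orthonormal keys, so no crosstalk
S = [[0.0, 0.0], [0.0, 0.0]]
S = write_additive(S, k1, [1.0, 2.0])
S = write_additive(S, k2, [3.0, 4.0])

recalled = read(S, k1)                                   # value stored under k1
piled_up = read(write_additive(S, k1, [5.0, 6.0]), k1)   # old + new value
replaced = read(write_delta(S, k1, [5.0, 6.0]), k1)      # new value only
```

With orthonormal keys, retrieval is exact. When keys overlap, each read picks up interference from other stored pairs, which is the kind of effect a retrieval signal-to-noise measure is meant to quantify.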

Article 1525

Title@2025-05-26 (1): VLMLight: Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning

Title: VLMLight: Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning

VLMLight: Verkehrssignalsteuerung über Vision-Language Meta-Control und Dual-Branch-Reasoning

VLMLight:通过视觉语言、超控制和双层理由解释控制交通信号控制 2505.19486v1

Authors: Maonan Wang, Yirong Chen, Aoyu Pang, Yuxin Cai, Chung Shue Chen, Yuheng Kan, Man-On Pun

Traffic signal control (TSC) is a core challenge in urban mobility, where real-time decisions must balance efficiency and safety. Existing methods, ranging from rule-based heuristics to reinforcement learning (RL), often struggle to generalize to complex, dynamic, and safety-critical scenarios. We introduce VLMLight, a novel TSC framework that integrates vision-language meta-control with dual-branch reasoning. At the core of VLMLight is the first image-based traffic simulator that enables multi-view visual perception at intersections, allowing policies to reason over rich cues such as vehicle type, motion, and spatial density. A large language model (LLM) serves as a safety-prioritized meta-controller, selecting between a fast RL policy for routine traffic and a structured reasoning branch for critical cases. In the latter, multiple LLM agents collaborate to assess traffic phases, prioritize emergency vehicles, and verify rule compliance. Experiments show that VLMLight reduces waiting times for emergency vehicles by up to 65% over RL-only systems, while preserving real-time performance in standard conditions with less than 1% degradation. VLMLight offers a scalable, interpretable, and safety-aware solution for next-generation traffic signal control.

Article 1526

Title@2025-05-26 (1): Understanding the learned look-ahead behavior of chess neural networks

Title: Understanding the learned look-ahead behavior of chess neural networks

Das gelernte Look-Ahead-Verhalten von neuronalen Schachnetzwerken verstehen

了解国际象棋神经网络所学的直视行为 2505.21552v1

Authors: Diogo Cruz

We investigate the look-ahead capabilities of chess-playing neural networks, specifically focusing on the Leela Chess Zero policy network. We build on the work of Jenner et al. (2024) by analyzing the model’s ability to consider future moves and alternative sequences beyond the immediate next move. Our findings reveal that the network’s look-ahead behavior is highly context-dependent, varying significantly based on the specific chess position. We demonstrate that the model can process information about board states up to seven moves ahead, utilizing similar internal mechanisms across different future time steps. Additionally, we provide evidence that the network considers multiple possible move sequences rather than focusing on a single line of play. These results offer new insights into the emergence of sophisticated look-ahead capabilities in neural networks trained on strategic tasks, contributing to our understanding of AI reasoning in complex domains. Our work also showcases the effectiveness of interpretability techniques in uncovering cognitive-like processes in artificial intelligence systems.

Article 1527

Title@2025-05-26 (1): Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs

Title: Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs

Gewinnen Sie schnell oder verlieren Sie langsam: Ausgleichende Geschwindigkeit und Genauigkeit in Latenz-Sensitive Entscheidungen von LLMs

慢赢或慢输:LLMs的延缓敏感决定中平衡速度和准确性 2505.19481v1

Authors: Hao Kang, Qingru Zhang, Han Cai, Weiyuan Xu, Tushar Krishna, Yilun Du, Tsachy Weissman

Large language models (LLMs) have shown remarkable performance across diverse reasoning and generation tasks, and are increasingly deployed as agents in dynamic environments such as code generation and recommendation systems. However, many real-world applications, such as high-frequency trading and real-time competitive gaming, require decisions under strict latency constraints, where faster responses directly translate into higher rewards. Despite the importance of this latency-quality trade-off, it remains underexplored in the context of LLM-based agents. In this work, we present the first systematic study of this trade-off in real-time decision-making tasks. To support our investigation, we introduce two new benchmarks: HFTBench, a high-frequency trading simulation, and StreetFighter, a competitive gaming platform. Our analysis reveals that the optimal latency-quality balance varies by task, and that sacrificing quality for lower latency can significantly enhance downstream performance. To address this, we propose FPX, an adaptive framework that dynamically selects model size and quantization level based on real-time demands. Our method achieves the best performance on both benchmarks, improving win rate by up to 80% in Street Fighter and boosting daily yield by up to 26.52% in trading, underscoring the need for latency-aware evaluation and deployment strategies for real-world LLM-based agents. Our benchmarks are available at Latency Sensitive Benchmarks.

Article 1528

Title@2025-05-26 (1): Revolutionizing Wildfire Detection with Convolutional Neural Networks: A VGG16 Model Approach

Title: Revolutionizing Wildfire Detection with Convolutional Neural Networks: A VGG16 Model Approach

Revolutionierung der Wildfire-Detektion mit konvolutionären neuralen Netzwerken: Ein VGG16-Modellansatz

与革命神经神经网络一起革命性野火探测革命:VGG16示范方法 2505.19479v1

Authors: Lakshmi Aishwarya Malladi, Navarun Gupta, Ahmed El-Sayed, Xingguo Xiong

Over 8,024 wildfire incidents were documented in 2024 alone, resulting in thousands of fatalities and significant damage to infrastructure and ecosystems. Wildfires in the United States have inflicted devastating losses. Wildfires are becoming more frequent and intense, which highlights how urgently efficient warning systems are needed to avoid disastrous outcomes. The goal of this study is to enhance the accuracy of wildfire detection by using a Convolutional Neural Network (CNN) built on the VGG16 architecture. The study uses the D-FIRE dataset, which includes several kinds of wildfire and non-wildfire images. The main challenges include low-resolution images, dataset imbalance, and the need for real-time applicability. These problems were addressed by enriching the dataset with data augmentation techniques and optimizing the VGG16 model for binary classification. Despite the limitations of the dataset, the model achieved a low false negative rate, which is essential for reducing the number of undetected fires. This work shows that deep learning models such as VGG16 can offer a reliable, automated approach to early wildfire recognition, helping authorities respond quickly. To reduce the impact of wildfires, our future work will concentrate on integrating the system with real-time surveillance networks and enlarging the dataset to cover more varied fire situations.

Article 1529

Title@2025-05-26 (1): Weighted quantization using MMD: From mean field to mean shift via gradient flows

Title: Weighted quantization using MMD: From mean field to mean shift via gradient flows

Gewichtete Quantisierung mit MMD: Vom mittleren Feld zur mittleren Verschiebung über Gradientenströme

使用 MMD 加权量化: 从平均字段到通过梯度流转移 2502.10600v2

Authors: Ayoub Belhadji, Daniel Sharp, Youssef Marzouk

Approximating a probability distribution using a set of particles is a fundamental problem in machine learning and statistics, with applications including clustering and quantization. Formally, we seek a weighted mixture of Dirac measures that best approximates the target distribution. While much existing work relies on the Wasserstein distance to quantify approximation errors, maximum mean discrepancy (MMD) has received comparatively less attention, especially when allowing for variable particle weights. We argue that a Wasserstein-Fisher-Rao gradient flow is well-suited for designing quantizations optimal under MMD. We show that a system of interacting particles satisfying a set of ODEs discretizes this flow. We further derive a new fixed-point algorithm called mean shift interacting particles (MSIP). We show that MSIP extends the classical mean shift algorithm, widely used for identifying modes in kernel density estimators. Moreover, we show that MSIP can be interpreted as preconditioned gradient descent and that it acts as a relaxation of Lloyd’s algorithm for clustering. Our unification of gradient flows, mean shift, and MMD-optimal quantization yields algorithms that are more robust than state-of-the-art methods, as demonstrated via high-dimensional and multi-modal numerical experiments.
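
For readers unfamiliar with the classical mean shift algorithm that MSIP is said to extend, the following sketch runs the standard fixed-point iteration for a one-dimensional Gaussian kernel density estimate. It is the textbook algorithm only, not MSIP. The optional `weights` argument, the function name, and the toy data are assumptions added for illustration.

```python
import math

def mean_shift_mode(x, points, bandwidth, weights=None, iters=100, tol=1e-10):
    """Move x to a nearby mode of a (weighted) Gaussian KDE over `points`."""
    weights = weights or [1.0] * len(points)
    for _ in range(iters):
        # Kernel responsibility of each data point for the current location.
        k = [w * math.exp(-((x - p) ** 2) / (2.0 * bandwidth ** 2))
             for p, w in zip(points, weights)]
        new_x = sum(ki * p for ki, p in zip(k, points)) / sum(k)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x

# Two well-separated clusters; starting at 4.0 climbs to the mode near 5.
points = [-0.2, 0.0, 0.2, 4.8, 5.0, 5.2]
mode = mean_shift_mode(4.0, points, bandwidth=0.5)
```

Each update replaces the current location with a kernel-weighted average of the data, so the iterate climbs the density estimate until it stops moving. In this classical form the data points are fixed and only the query location moves. The starting point must lie within a few bandwidths of the data, otherwise the kernel weights underflow and the division fails.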

Article 1530

Title@2025-05-26 (1): Information-theoretic Generalization Analysis for VQ-VAEs: A Role of Latent Variables

Title: Information-theoretic Generalization Analysis for VQ-VAEs: A Role of Latent Variables

Informationstheoretische Generalisierungsanalyse für VQ-VAEs: Eine Rolle latenter Variablen

VQ-VAEs 信息理论概括分析:隐性变量的作用 2505.19470v1

Authors: Futoshi Futami, Masahiro Fujisawa

Latent variables (LVs) play a crucial role in encoder-decoder models by enabling effective data compression, prediction, and generation. Although their theoretical properties, such as generalization, have been extensively studied in supervised learning, similar analyses for unsupervised models such as variational autoencoders (VAEs) remain underexplored. In this work, we extend information-theoretic generalization analysis to vector-quantized (VQ) VAEs with discrete latent spaces, introducing a novel data-dependent prior to rigorously analyze the relationship among LVs, generalization, and data generation. We derive a novel generalization error bound for the reconstruction loss of VQ-VAEs, which depends solely on the complexity of the LVs and the encoder, independent of the decoder. Additionally, we provide an upper bound on the 2-Wasserstein distance between the distributions of the true data and the generated data, explaining how the regularization of the LVs contributes to data generation performance.

Article 1531

Title@2025-05-26 (1): Diversity-Driven Generative Dataset Distillation Based on Diffusion Model with Self-Adaptive Memory

Title: Diversity-Driven Generative Dataset Distillation Based on Diffusion Model with Self-Adaptive Memory

Diversity-getriebene Generative Datensatzdestillation basierend auf Diffusionsmodell mit selbstadaptivem Speicher

基于带有自适应内存的传播模型的传播模型的多样化生成数据集蒸馏 2505.19469v1

Authors: Mingzhuo Li, Guang Li, Jiafeng Mao, Takahiro Ogawa, Miki Haseyama

Dataset distillation enables the training of deep neural networks with comparable performance in significantly reduced time by compressing large datasets into small and representative ones. Although the introduction of generative models has made great achievements in this field, the distributions of their distilled datasets are not diverse enough to represent the original ones, leading to a decrease in downstream validation accuracy. In this paper, we present a diversity-driven generative dataset distillation method based on a diffusion model to solve this problem. We introduce self-adaptive memory to align the distribution between distilled and real datasets, assessing the representativeness. The degree of alignment leads the diffusion model to generate more diverse datasets during the distillation process. Extensive experiments show that our method outperforms existing state-of-the-art methods in most situations, proving its ability to tackle dataset distillation tasks.

Article 1532

Title@2025-05-26 (1): Parrot: Multilingual Visual Instruction Tuning

Title: Parrot: Multilingual Visual Instruction Tuning

Papagei: Mehrsprachige visuelle Anleitung

Parrot: 多语言视觉教学图示 2406.02539v3

Authors: Hai-Long Sun, Da-Wei Zhou, Yang Li, Shiyin Lu, Chao Yi, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, De-Chuan Zhan, Han-Jia Ye

The rapid development of Multimodal Large Language Models (MLLMs), such as GPT-4o, marks a significant step toward artificial general intelligence. Existing methods typically align vision encoders with LLMs via supervised fine-tuning (SFT), but this often deteriorates their ability to handle multiple languages as training progresses. We empirically observe that imbalanced SFT datasets, largely English-centric, degrade performance on non-English languages due to the failure in multilingual token alignment. To address this, we propose PARROT, a novel approach that leverages textual guidance for visual token alignment at the language level. PARROT conditions visual tokens on diverse language inputs and uses Mixture-of-Experts (MoE) to align multilingual tokens. By computing cross-attention between initial visual features and textual embeddings, we select the most relevant experts, converting visual tokens into language-specific representations. Additionally, we introduce the Massive Multilingual Multimodal Benchmark (MMMB), a new benchmark comprising 6 languages, 15 categories, and 12,000 questions, to assess multilingual capabilities. PARROT achieves state-of-the-art performance on both the multilingual benchmarks and a wide range of multimodal tasks. Code and dataset are available at: https://github.com/AIDC-AI/Parrot

Article 1533

Title@2025-05-26 (1): Towards End-to-End Training of Automatic Speech Recognition for Nigerian Pidgin

Title: Towards End-to-End Training of Automatic Speech Recognition for Nigerian Pidgin

Auf dem Weg zum Ende der Ausbildung zur automatischen Spracherkennung für nigerianische Pidgin

走向尼日利亚皮吉纳自动语音识别的端至端培训 2010.11123v2

Authors: Amina Mardiyyah Rufai, Afolabi Abeeb, Esther Oduntan, Tayo Arulogun, Oluwabukola Adegboro, Daniel Ajisafe

The prevalence of automatic speech recognition (ASR) systems in spoken language applications has increased significantly in recent years. Notably, many African languages lack sufficient linguistic resources to support the robustness of these systems. This paper focuses on the development of an end-to-end speech recognition system customized for Nigerian Pidgin English. We investigated and evaluated different pretrained state-of-the-art architectures on a new dataset. Our empirical results demonstrate notable performance of the Wav2Vec2 XLSR-53 variant on our dataset, achieving a word error rate (WER) of 29.6% on the test set and surpassing other architectures such as NEMO QUARTZNET and Wav2Vec2.0 BASE-100H in quantitative assessments. Additionally, we demonstrate that pretrained state-of-the-art architectures do not work well out-of-the-box. We performed zero-shot evaluation using XLSR-English as the baseline, chosen for its similarity to Nigerian Pidgin. This yielded a higher WER of 73.7%. By adapting this architecture to nuances represented in our dataset, we reduce the error by 59.84% relative to this baseline. Our dataset comprises 4,288 recorded utterances from 10 native speakers, partitioned into training, validation, and test sets. This study underscores the potential for improving ASR systems for under-resourced languages like Nigerian Pidgin English, contributing to greater inclusion in speech technology applications. We publicly release our unique parallel dataset (speech-to-text) on Nigerian Pidgin, as well as the model weights on Hugging Face. Our code will be made available to foster future research from the community.
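
The figures in the abstract above can be checked with a few lines of arithmetic. The sketch below shows the standard word-level edit-distance definition of WER and the relative reduction implied by the two reported WER values (73.7% zero-shot, 29.6% after adaptation). The function names and the toy token strings are assumptions for illustration; this is not the authors' evaluation code.

```python
def word_error_rate(reference, hypothesis):
    # WER = word-level edit distance (substitutions, insertions, deletions)
    # divided by the number of words in the reference.
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

def relative_reduction(baseline, improved):
    # Percentage by which `improved` lowers `baseline`.
    return (baseline - improved) / baseline * 100.0

# Toy placeholder tokens: one substitution plus one deletion over four words.
toy_wer = word_error_rate("a b c d", "a x c")

# WER values reported in the abstract.
reduction = relative_reduction(73.7, 29.6)
print(f"toy WER = {toy_wer:.2f}, relative reduction = {reduction:.2f}%")
```

The relative reduction evaluates to 59.84%, which matches the figure quoted in the abstract and confirms that it is a relative, not absolute, improvement.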

Article 1534

Title@2025-05-26 (1): Decision Flow Policy Optimization

Title: Decision Flow Policy Optimization

Optimierung der Entscheidungsflusspolitik

优化决策流程政策 2505.20350v1

Authors: Jifeng Hu, Sili Huang, Siyuan Guo, Zhaogeng Liu, Li Shen, Lichao Sun, Hechang Chen, Yi Chang, Dacheng Tao

In recent years, generative models have shown remarkable capabilities across diverse fields, including images, videos, language, and decision-making. By applying powerful generative models such as flow-based models to reinforcement learning, we can effectively model complex multi-modal action distributions and achieve superior robotic control in continuous action spaces, surpassing the limitations of single-modal action distributions with traditional Gaussian-based policies. Previous methods usually adopt the generative models as behavior models to fit state-conditioned action distributions from datasets, with policy optimization conducted separately through additional policies using value-based sample weighting or gradient-based updates. However, this separation prevents the simultaneous optimization of multi-modal distribution fitting and policy improvement, ultimately hindering the training of models and degrading the performance. To address this issue, we propose Decision Flow, a unified framework that integrates multi-modal action distribution modeling and policy optimization. Specifically, our method formulates the action generation procedure of flow-based models as a flow decision-making process, where each action generation step corresponds to one flow decision. Consequently, our method seamlessly optimizes the flow policy while capturing multi-modal action distributions. We provide rigorous proofs of Decision Flow and validate the effectiveness through extensive experiments across dozens of offline RL environments. Compared with established offline RL baselines, the results demonstrate that our method achieves or matches the SOTA performance.

Article 1535

Title@2025-05-26 (1): Origin Tracer: A Method for Detecting LoRA Fine-Tuning Origins in LLMs

Title: Origin Tracer: A Method for Detecting LoRA Fine-Tuning Origins in LLMs

Herkunfts-Tracer: Eine Methode zur Erkennung von LoRA-Feinabstimmungs-Ursprungen in LLMs

来源追踪器:用LLMM探测LORA精导来源的方法 2505.19466v1

Authors: Hongyu Liang, Yuting Zheng, Yihan Li, Yiran Zhang, Shiyu Liang

As large language models (LLMs) continue to advance, their deployment often involves fine-tuning to enhance performance on specific downstream tasks. However, this customization is sometimes accompanied by misleading claims about a model’s origins, raising significant concerns about transparency and trust within the open-source community. Existing model verification techniques typically assess functional, representational, and weight similarities. However, these approaches often struggle against obfuscation techniques, such as permutations and scaling transformations. To address this limitation, we propose a novel detection method, Origin-Tracer, that rigorously determines whether a model has been fine-tuned from a specified base model. This method includes the ability to extract the LoRA rank utilized during the fine-tuning process, providing a more robust verification framework. This framework is the first to provide a formalized approach specifically aimed at pinpointing the sources of model fine-tuning. We empirically validated our method on thirty-one diverse open-source models under conditions that simulate real-world obfuscation scenarios. We also analyze the effectiveness of our framework and discuss its limitations. The results demonstrate the effectiveness of our approach and indicate its potential to establish new benchmarks for model verification.

Article 1536

Title@2025-05-26 (1): Residual Cross-Attention Transformer-Based Multi-User CSI Feedback with Deep Joint Source-Channel Coding

Title: Residual Cross-Attention Transformer-Based Multi-User CSI Feedback with Deep Joint Source-Channel Coding

Residual Cross-Attention Transformer-basierte Multi-User CSI Feedback mit Deep Joint Source-Channel Coding

CSI 与深源-源-汇联合编码的反馈 2505.19465v1

Authors: Hengwei Zhang, Minghui Wu, Li Qiao, Ling Liu, Ziqi Han, Zhen Gao

This letter proposes a deep-learning (DL)-based multi-user channel state information (CSI) feedback framework for massive multiple-input multiple-output systems, where deep joint source-channel coding (DJSCC) is utilized to improve the CSI reconstruction accuracy. Specifically, we design a multi-user joint CSI feedback framework, whereby the CSI correlation of nearby users is utilized to reduce the feedback overhead. Under the framework, we propose a new residual cross-attention transformer architecture, which is deployed at the base station to further improve the CSI feedback performance. Moreover, to tackle the “cliff-effect” of conventional bit-level CSI feedback approaches, we integrate DJSCC into the multi-user CSI feedback and use a two-stage training scheme to adapt to varying uplink noise levels. Experimental results demonstrate the superiority of our methods in CSI feedback performance, with low network complexity and better scalability.

Article 1537

Title@2025-05-26 (1): Your Classifier Can Do More: Towards Bridging the Gaps in Classification, Robustness, and Generation

Title: Your Classifier Can Do More: Towards Bridging the Gaps in Classification, Robustness, and Generation

Ihr Klassifikator kann mehr: Auf dem Weg zur Überbrückung der Lücken in Klassifizierung, Robustheit und Generation

您的分类员可以做更多的事情: 缩小分类、强健和代际差距 2505.19459v1

Authors: Kaichao Jiang, He Wang, Xiaoshuai Hao, Xiulong Yang, Ajian Liu, Qi Chu, Yunfeng Diao

Joint Energy-based Models (JEMs), a class of hybrid generative-discriminative models, are well known for their ability to achieve both high classification accuracy and generative capability within a single model. However, their robustness still lags significantly behind that of classifiers based on adversarial training (AT). Conversely, while AT is currently the most effective approach to improving the classifier’s robustness, it typically sacrifices accuracy on clean data and lacks generative capability. The triple trade-off between classification accuracy, generative capability, and robustness raises a natural question that has rarely been explored: can a single model simultaneously achieve high classification accuracy, adversarial robustness, and generative performance? To address this question, we systematically analyze the energy distribution differences of clean, adversarial, and generated samples across various JEM variants and adversarially trained models. We observe that AT tends to reduce the energy gap between clean and adversarial samples, while JEMs reduce the gap between clean and synthetic ones. This observation suggests a key insight: if the energy distributions of all three data types can be aligned, we might unify the strengths of AT and JEMs, resolving their inherent trade-offs. Building on this idea, we propose Energy-based Joint Distribution Adversarial Training (EB-JDAT) to jointly model the clean data distribution, the adversarial distribution, and the classifier by maximizing their joint probability. EB-JDAT is a general and flexible optimization method, compatible with various JEM variants. Extensive experimental results demonstrate that EB-JDAT not only maintains the near-original accuracy and generative capability of JEMs, but also significantly enhances robustness, even surpassing state-of-the-art ATs.

Article 1538

Title@2025-05-26 (1): Recurrent Self-Attention Dynamics: An Energy-Agnostic Perspective from Jacobians

Title: Recurrent Self-Attention Dynamics: An Energy-Agnostic Perspective from Jacobians

Recurrent Self-Attention Dynamics: Eine energie-agnostische Perspektive von Jacobians

《自我注意动态:雅各布人对能源不可知的视角》 2505.19458v1

Authors: Akiyoshi Tomihari, Ryo Karakida

The theoretical understanding of self-attention (SA) has been steadily progressing. A prominent line of work studies a class of SA layers that admit an energy function decreased by state updates. While it provides valuable insights into inherent biases in signal propagation, it often relies on idealized assumptions or additional constraints not necessarily present in standard SA. Thus, to broaden our understanding, this work aims to relax these energy constraints and provide an energy-agnostic characterization of inference dynamics by dynamical systems analysis. In more detail, we first consider relaxing the symmetry and single-head constraints traditionally required in energy-based formulations. Next, to investigate more general SA architectures capable of oscillatory dynamics without necessarily admitting an energy function, we analyze the Jacobian matrix of the state. We reveal that normalization layers effectively normalize the Jacobian’s complex eigenvalues, forcing the dynamics close to a critical state. This significantly enhances inference performance. Furthermore, we utilize the Jacobian perspective to develop regularization methods for training and a pseudo-energy for monitoring inference dynamics.

Article 1539

Title: MM-Prompt: Cross-Modal Prompt Tuning for Continual Visual Question Answering

MM-Prompt: Cross-Modal Prompt Tuning zur kontinuierlichen visuellen Fragestellung

MM-Prompt: 用于持续视觉问答的跨模式快速测试 2505.19455v1

Authors: Xu Li, Fan Lyu

Continual Visual Question Answering (CVQA) based on pre-trained models (PTMs) has achieved promising progress by leveraging prompt tuning to enable continual multi-modal learning. However, most existing methods adopt cross-modal prompt isolation, constructing visual and textual prompts separately, which exacerbates modality imbalance and leads to degraded performance over time. To tackle this issue, we propose MM-Prompt, a novel framework incorporating cross-modal prompt query and cross-modal prompt recovery. The former enables balanced prompt selection by incorporating cross-modal signals during query formation, while the latter promotes joint prompt reconstruction through iterative cross-modal interactions, guided by an alignment loss to prevent representational drift. Extensive experiments show that MM-Prompt surpasses prior approaches in accuracy and knowledge retention, while maintaining balanced modality engagement throughout continual learning.

Article 1540

Title@2025-05-26 (1): MetaGMT: Improving Actionable Interpretability of Graph Multilinear Networks via Meta-Learning Filtration

Title: MetaGMT: Improving Actionable Interpretability of Graph Multilinear Networks via Meta-Learning Filtration

MetaGMT: Durch Meta-Learning Filtration die Durchführbarkeit von Graphen-Multilinearen Netzwerken verbessern

MetGMT:通过Met-Learn Filtation改进图形多线网络可操作的解释性 2505.19445v1

Authors: Rishabh Bhattacharya, Hari Shankar, Vaishnavi Shivkumar, Ponnurangam Kumaraguru

The growing adoption of Graph Neural Networks (GNNs) in high-stakes domains like healthcare and finance demands reliable explanations of their decision-making processes. While inherently interpretable GNN architectures like Graph Multi-linear Networks (GMT) have emerged, they remain vulnerable to generating explanations based on spurious correlations, potentially undermining trust in critical applications. We present MetaGMT, a meta-learning framework that enhances explanation fidelity through a novel bi-level optimization approach. We demonstrate that MetaGMT significantly improves both explanation quality (AUC-ROC, Precision@K) and robustness to spurious patterns across the BA-2Motifs, MUTAG, and SP-Motif benchmarks. Our approach maintains competitive classification accuracy while producing more faithful explanations than baseline methods (with an increase of up to 8% in Explanation ROC on SP-Motif 0.5). These advancements in interpretability could enable safer deployment of GNNs in sensitive domains by (1) facilitating model debugging through more reliable explanations, (2) supporting targeted retraining when biases are identified, and (3) enabling meaningful human oversight. By addressing the critical challenge of explanation reliability, our work contributes to building more trustworthy and actionable GNN systems for real-world applications.

Article 1541

Title@2025-05-26 (1): Discovering Forbidden Topics in Language Models

Title: Discovering Forbidden Topics in Language Models

Verbotene Themen in Sprachmodellen entdecken

发现语言模型中的禁止专题 2505.17441v2

Authors: Can Rager, Chris Wendler, Rohit Gandikota, David Bau

Refusal discovery is the task of identifying the full set of topics that a language model refuses to discuss. We introduce this new problem setting and develop a refusal discovery method, LLM-crawler, that uses token prefilling to find forbidden topics. We benchmark the LLM-crawler on Tulu-3-8B, an open-source model with public safety tuning data. Our crawler manages to retrieve 31 out of 36 topics within a budget of 1000 prompts. Next, we scale the crawl to a frontier model using the prefilling option of Claude-Haiku. Finally, we crawl three widely used open-weight models: Llama-3.3-70B and two of its variants finetuned for reasoning, DeepSeek-R1-70B and Perplexity-R1-1776-70B. DeepSeek-R1-70B reveals patterns consistent with censorship tuning: the model exhibits “thought suppression” behavior that indicates memorization of CCP-aligned responses. Although Perplexity-R1-1776-70B is robust to censorship, LLM-crawler elicits CCP-aligned refusal answers in the quantized model. Our findings highlight the critical need for refusal discovery methods to detect biases, boundaries, and alignment failures of AI systems.

Article 1542

Title@2025-05-26 (1): MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding

Title: MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding

MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding

MORE-Brain:可解释和可通用跨主题FMRI视觉解码专家有条不紊混合 2505.15946v2

Authors: Yuxiang Wei, Yanteng Zhang, Xi Xiao, Tianyang Wang, Xiao Wang, Vince D. Calhoun

Decoding visual experiences from fMRI offers a powerful avenue to understand human perception and develop advanced brain-computer interfaces. However, current progress often prioritizes maximizing reconstruction fidelity while overlooking interpretability, an essential aspect for deriving neuroscientific insight. To address this gap, we propose MoRE-Brain, a neuro-inspired framework designed for high-fidelity, adaptable, and interpretable visual reconstruction. MoRE-Brain uniquely employs a hierarchical Mixture-of-Experts architecture where distinct experts process fMRI signals from functionally related voxel groups, mimicking specialized brain networks. The experts are first trained to encode fMRI into the frozen CLIP space. A finetuned diffusion model then synthesizes images, guided by expert outputs through a novel dual-stage routing mechanism that dynamically weighs expert contributions across the diffusion process. MoRE-Brain offers three main advancements: First, it introduces a novel Mixture-of-Experts architecture grounded in brain network principles for neuro-decoding. Second, it achieves efficient cross-subject generalization by sharing core expert networks while adapting only subject-specific routers. Third, it provides enhanced mechanistic insight, as the explicit routing reveals precisely how different modeled brain regions shape the semantic and spatial attributes of the reconstructed image. Extensive experiments validate MoRE-Brain’s high reconstruction fidelity, with bottleneck analyses further demonstrating its effective utilization of fMRI signals, distinguishing genuine neural decoding from over-reliance on generative priors. Consequently, MoRE-Brain marks a substantial advance towards more generalizable and interpretable fMRI-based visual decoding. Code will be publicly available soon: https://github.com/yuxiangwei0808/MoRE-Brain.

Article 1543

Title@2025-05-26 (1): RDI: An adversarial robustness evaluation metric for deep neural networks based on model statistical features

Title: RDI: An adversarial robustness evaluation metric for deep neural networks based on model statistical features

RDI: Eine gegnerische Robustheitsbewertungsmetrik für tiefe neuronale Netzwerke basierend auf modellstatistischen Merkmalen

RDI:基于示范统计特征的深神经网络对抗性强力评价标准 2504.18556v2

Authors: Jialei Song, Xingquan Zuo, Feiyang Wang, Hai Huang, Tianle Zhang

Deep neural networks (DNNs) are highly susceptible to adversarial samples, raising concerns about their reliability in safety-critical tasks. Currently, methods of evaluating adversarial robustness are primarily categorized into attack-based and certified robustness evaluation approaches. The former not only relies on specific attack algorithms but is also highly time-consuming, while the latter, due to its analytical nature, is typically difficult to implement for large and complex models. A few studies evaluate model robustness based on the model’s decision boundary, but they suffer from low evaluation accuracy. To address the aforementioned issues, we propose a novel adversarial robustness evaluation metric, Robustness Difference Index (RDI), which is based on model statistical features. RDI draws inspiration from clustering evaluation by analyzing the intra-class and inter-class distances of feature vectors separated by the decision boundary to quantify model robustness. It is attack-independent and has high computational efficiency. Experiments show that RDI demonstrates a stronger correlation with the gold-standard adversarial robustness metric, attack success rate (ASR). The average computation time of RDI is only 1/30 of that of the evaluation method based on the PGD attack. Our open-source code is available at: https://github.com/BUPTAIOC/RDI.

Article 1544

Title@2025-05-26 (1): Fairness Practices in Industry: A Case Study in Machine Learning Teams Building Recommender Systems

Title: Fairness Practices in Industry: A Case Study in Machine Learning Teams Building Recommender Systems

Fairness Practices in der Industrie: Eine Fallstudie in Machine Learning Teams Bau von Recommender Systemen

工业公平做法:机械学习小组建立建议系统个案研究 2505.19441v1

Authors: Jing Nathan Yan, Junxiong Wang, Jeffrey M. Rzeszotarski, Allison Koenecke

The rapid proliferation of recommender systems necessitates robust fairness practices to address inherent biases. Assessing fairness, though, is challenging due to constantly evolving metrics and best practices. This paper analyzes how industry practitioners perceive and incorporate these changing fairness standards in their workflows. Through semi-structured interviews with 11 practitioners from technical teams across a range of large technology companies, we investigate industry implementations of fairness in recommendation system products. We focus on current debiasing practices, applied metrics, collaborative strategies, and integrating academic research into practice. Findings show a preference for multi-dimensional debiasing over traditional demographic methods, and a reliance on intuitive rather than academic metrics. This study also highlights the difficulties in balancing fairness with both the practitioner’s individual (bottom-up) roles and organizational (top-down) workplace constraints, including the interplay with legal and compliance experts. Finally, we offer actionable recommendations for the recommender system community and algorithmic fairness practitioners, underlining the need to refine fairness practices continually.

Article 1545

Title@2025-05-26 (1): The Birth of Knowledge: Emergent Features across Time, Space, and Scale in Large Language Models

Title: The Birth of Knowledge: Emergent Features across Time, Space, and Scale in Large Language Models

Die Geburt des Wissens: Emergente Funktionen über Zeit, Raum und Maßstab in großen Sprachmodellen

知识的诞生:跨越时间、空间和大语言模型规模的新兴特征 2505.19440v1

Authors: Shashata Sawmya, Micah Adler, Nir Shavit

This paper studies the emergence of interpretable categorical features within large language models (LLMs), analyzing their behavior across training checkpoints (time), transformer layers (space), and varying model sizes (scale). Using sparse autoencoders for mechanistic interpretability, we identify when and where specific semantic concepts emerge within neural activations. Results indicate clear temporal and scale-specific thresholds for feature emergence across multiple domains. Notably, spatial analysis reveals unexpected semantic reactivation, with early-layer features re-emerging at later layers, challenging standard assumptions about representational dynamics in transformer models.

Article 1546

Title@2025-05-26 (1): Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression

Title: Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression

Kann komprimierte LLMs wirklich handeln? Eine empirische Bewertung der Agentischen Fähigkeiten in der LLM-Kompression

能否压缩LLM Really Act? 对LLM Actrables in LLM Corpression的代理能力进行经验评估。 2505.19433v1

Authors: Peijie Dong, Zhenheng Tang, Xiang Liu, Lujun Li, Xiaowen Chu, Bo Li

Post-training compression reduces the computational and memory costs of large language models (LLMs), enabling resource-efficient deployment. However, existing compression benchmarks only focus on language modeling (e.g., perplexity) and natural language understanding tasks (e.g., GLUE accuracy), ignoring agentic capabilities: workflow, tool use/function calling, long-context understanding, and real-world application. We introduce the Agent Compression Benchmark (ACBench), the first comprehensive benchmark for evaluating how compression impacts LLMs’ agentic abilities. ACBench spans (1) 12 tasks across 4 capabilities (e.g., WorfBench for workflow generation, Needle-in-Haystack for long-context retrieval), (2) quantization (GPTQ, AWQ) and pruning (Wanda, SparseGPT), and (3) 15 models, including small (Gemma-2B), standard (Qwen2.5 7B-32B), and distilled reasoning LLMs (DeepSeek-R1-Distill). Our experiments reveal compression tradeoffs: 4-bit quantization preserves workflow generation and tool use (1%-3% drop) but degrades real-world application accuracy by 10%-15%. We introduce ERank, Top-k Ranking Correlation, and Energy to systematize analysis. ACBench provides actionable insights for optimizing LLM compression in agentic scenarios. The code can be found at https://github.com/pprp/ACBench.
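
For readers unfamiliar with what 4-bit quantization does to a weight tensor, the sketch below shows plain symmetric round-to-nearest quantization to the signed 4-bit range. This is deliberately the simplest possible scheme and is an assumption for illustration: it is not GPTQ or AWQ, which the abstract names and which use calibration data to choose better quantized values. The weight values and function names are hypothetical.

```python
def quantize_int4(weights):
    # Symmetric per-tensor quantization: map the largest magnitude to code 7,
    # then round every weight to the nearest integer code in [-8, 7].
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    # Recover approximate floating-point weights from the integer codes.
    return [c * scale for c in codes]

# Hypothetical weight values.
weights = [0.82, -0.31, 0.05, -0.67, 0.44, 0.0]
codes, scale = quantize_int4(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

With round-to-nearest, the reconstruction error of each weight is at most half the step size `scale`, which gives a rough sense of the precision given up when moving to 4 bits.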

Article 1547

Title@2025-05-26 (1): Advanced long-term earth system forecasting by learning the small-scale nature

Title: Advanced long-term earth system forecasting by learning the small-scale nature

Fortschrittliche Langzeitprognosen des Erdsystems durch Erlernen der kleinmaßstäblichen Natur

学习小规模性质,进行高级长期地球系统预测 2505.19432v1

Authors: Hao Wu, Yuan Gao, Ruiqi Shu, Kun Wang, Ruijian Gou, Chuhan Wu, Xinliang Liu, Juncai He, Shuhao Cao, Junfeng Fang, Xingjian Shi, Feng Tao, Qi Song, Shengxuan Ji, Yanfei Xiang, Yuze Sun, Jiahao Li, Fan Xu, Huanshuo Dong, Haixin Wang, Fan Zhang, Penghao Zhao, Xian Wu, Qingsong Wen, Deliang Chen, Xiaomeng Huang

Reliable long-term forecasting of Earth system dynamics is heavily hampered by instabilities in current AI models during extended autoregressive simulations. These failures often originate from inherent spectral bias, leading to inadequate representation of critical high-frequency, small-scale processes and subsequent uncontrolled error amplification. We present Triton, an AI framework designed to address this fundamental challenge. Inspired by the way numerical models increase grid resolution to explicitly resolve small scales, Triton employs a hierarchical architecture that processes information across multiple resolutions to mitigate spectral bias and explicitly model cross-scale dynamics. We demonstrate Triton’s superior performance on challenging forecast tasks, achieving stable year-long global temperature forecasts, skillful Kuroshio eddy predictions up to 120 days, and high-fidelity turbulence simulations preserving fine-scale structures, all without external forcing, significantly surpassing baseline AI models in long-term stability and accuracy. By effectively suppressing high-frequency error accumulation, Triton offers a promising pathway towards trustworthy AI-driven simulation for climate and earth system science.

Article 1548

Title@2025-05-26 (1): Importance Weighted Score Matching for Diffusion Samplers with Enhanced Mode Coverage

Title: Importance Weighted Score Matching for Diffusion Samplers with Enhanced Mode Coverage

Bedeutung Gewichteter Score passend für Diffusion Sampler mit erweiterten Modus Abdeckung

具有强化模式覆盖率的传播采样器比对重要加权分数 2505.19431v1

Authors: Chenguang Wang, Xiaoyu Zhang, Kaiyuan Cui, Weichen Zhao, Yongtao Guan, Tianshu Yu

Training neural samplers directly from unnormalized densities without access to target distribution samples presents a significant challenge. A critical desideratum in these settings is achieving comprehensive mode coverage, ensuring the sampler captures the full diversity of the target distribution. However, prevailing methods often circumvent the lack of target data by optimizing reverse KL-based objectives. Such objectives inherently exhibit mode-seeking behavior, potentially leading to incomplete representation of the underlying distribution. While alternative approaches strive for better mode coverage, they typically rely on implicit mechanisms like heuristics or iterative refinement. In this work, we propose a principled approach for training diffusion-based samplers by directly targeting an objective analogous to the forward KL divergence, which is conceptually known to encourage mode coverage. We introduce Importance Weighted Score Matching, a method that optimizes this desired mode-covering objective by re-weighting the score matching loss using tractable importance sampling estimates, thereby overcoming the absence of target distribution data. We also provide theoretical analysis of the bias and variance for our proposed Monte Carlo estimator and the practical loss function used in our method. Experiments on increasingly complex multi-modal distributions, including 2D Gaussian Mixture Models with up to 120 modes and challenging particle systems with inherent symmetries, demonstrate that our approach consistently outperforms existing neural samplers across all distributional distance metrics, achieving state-of-the-art results on all benchmarks.
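
The abstract above relies on importance sampling to stand in for samples from a target that is known only up to a normalizing constant. The sketch below illustrates that general idea with textbook self-normalized importance sampling on a one-dimensional toy problem. It is not the paper's estimator or loss; the target, the proposal, the sample size, and all function names are assumptions made for illustration.

```python
import math
import random

def log_target_unnormalized(x):
    # Unnormalized log-density of N(1, 1); the normalizing constant is omitted
    # on purpose, mirroring the setting where only an unnormalized density is known.
    return -0.5 * (x - 1.0) ** 2

def log_proposal(x, sigma=2.0):
    # Normalized log-density of the proposal N(0, sigma^2), which we can sample.
    return -0.5 * (x / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def snis_expectation(f, n=20000, sigma=2.0, seed=0):
    # Self-normalized importance sampling estimate of E_target[f(x)].
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, sigma) for _ in range(n)]
    log_w = [log_target_unnormalized(x) - log_proposal(x, sigma) for x in xs]
    shift = max(log_w)  # subtract the max before exponentiating for stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    # Dividing by the weight sum cancels the unknown normalizing constant.
    return sum(wi * f(x) for wi, x in zip(w, xs)) / total

estimated_mean = snis_expectation(lambda x: x)
print(estimated_mean)
```

The estimate should land close to the true target mean of 1 even though no sample was ever drawn from the target. Replacing `f` with a per-sample loss gives the general shape of a re-weighted objective, which is the role importance weights play in the abstract's description.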

Article 1549

Title@2025-05-26 (1): MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision

Title: MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision

MAS-ZERO: Konzipieren von Multi-Agenten-Systemen mit Zero Supervision

MAS-ZERO: 设计无监督的多机构系统 2505.14996v2

Authors: Zixuan Ke, Austin Xu, Yifei Ming, Xuan-Phi Nguyen, Caiming Xiong, Shafiq Joty

Multi-agent systems (MAS) leveraging the impressive capabilities of Large Language Models (LLMs) hold significant potential for tackling complex tasks. However, most current MAS depend on manually designed agent roles and communication protocols. These manual designs often fail to align with the underlying LLMs’ strengths and struggle to adapt to novel tasks. Recent automatic MAS approaches attempt to mitigate these limitations but typically necessitate a validation set for tuning and yield static MAS designs lacking adaptability during inference. We introduce MAS-ZERO, the first self-evolved, inference-time framework for automatic MAS design. MAS-ZERO employs meta-level design to iteratively generate, evaluate, and refine MAS configurations tailored to each problem instance, without requiring a validation set. Critically, it enables dynamic agent composition and problem decomposition through meta-feedback on solvability and completeness. Experiments across math, graduate-level QA, and software engineering benchmarks, using both closed-source and open-source LLM backbones of varying sizes, demonstrate that MAS-ZERO outperforms both manual and automatic MAS baselines, achieving a 7.44% average accuracy improvement over the next strongest baseline while maintaining cost-efficiency. These findings underscore the promise of meta-level self-evolved design for creating effective and adaptive MAS.

Article 1550

Title@2025-05-26 (1): WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference

Title: WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference

WINA: Gewichtsinformierte Neuronen-Aktivierung zur Beschleunigung der Large Language Model Inferenz

Authors: Sihan Chen, Dan Zhao, Jongwoo Ko, Colby Banbury, Huiping Zhuang, Luming Liang, Tianyi Chen

The growing computational demands of large language models (LLMs) make efficient inference and activation strategies increasingly critical. While recent approaches, such as Mixture-of-Experts (MoE), leverage selective activation but require specialized training, training-free sparse activation methods offer broader applicability and superior resource efficiency through their plug-and-play design. However, many existing methods rely solely on hidden state magnitudes to determine activation, resulting in high approximation errors and suboptimal inference accuracy. To address these limitations, we propose WINA (Weight Informed Neuron Activation), a novel, simple, and training-free sparse activation framework that jointly considers hidden state magnitudes and the column-wise $\ell_2$-norms of weight matrices. We show that this leads to a sparsification strategy that obtains optimal approximation error bounds with theoretical guarantees tighter than existing techniques. Empirically, WINA also outperforms state-of-the-art methods (e.g., TEAL) by up to $2.94\%$ in average performance at the same sparsity levels, across a diverse set of LLM architectures and datasets. These results position WINA as a new performance frontier for training-free sparse activation in LLM inference, advancing training-free sparse activation methods and setting a robust baseline for efficient inference. The source code is available at https://github.com/microsoft/wina.
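The selection rule described in this abstract can be sketched as follows: each input neuron is scored by its hidden-state magnitude multiplied by the $\ell_2$ norm of the matching weight column, and only the top-k scores are kept. The function names, the top-k rule, and the toy numbers are assumptions for illustration; the released implementation may differ in its details.

```python
import math

def column_l2_norms(weight):
    """l2 norm of each column of a matrix stored as a list of rows."""
    n_cols = len(weight[0])
    return [math.sqrt(sum(row[j] ** 2 for row in weight)) for j in range(n_cols)]

def wina_mask(hidden, weight, k):
    """Binary mask keeping the k inputs with the largest |x_j| * ||W[:, j]||_2."""
    norms = column_l2_norms(weight)
    scores = [abs(x) * n for x, n in zip(hidden, norms)]
    order = sorted(range(len(hidden)), key=lambda j: scores[j], reverse=True)
    keep = set(order[:k])
    return [1 if j in keep else 0 for j in range(len(hidden))]

def sparse_matvec(hidden, weight, mask):
    """Compute W @ (mask * x), skipping the masked-out inputs."""
    return [sum(w * x for w, x, m in zip(row, hidden, mask) if m) for row in weight]

# Toy example: input 0 has the largest magnitude but a tiny weight column.
hidden = [0.9, -0.2, 0.5]
weight = [[0.1, 3.0, 1.0],
          [0.1, 4.0, 1.0]]
mask = wina_mask(hidden, weight, k=2)
output = sparse_matvec(hidden, weight, mask)
```

In this toy case a magnitude-only criterion would keep inputs 0 and 2, whereas the weight-aware score keeps inputs 1 and 2, since input 1 feeds a much larger weight column.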

Article 1551

Title@2025-05-26 (1): The Role of Diversity in In-Context Learning for Large Language Models

Title: The Role of Diversity in In-Context Learning for Large Language Models

Die Rolle der Vielfalt im In-Context-Lernen für große Sprachmodelle

多样性在为大语言模式进行内文学习方面的作用 2505.19426v1

Authors: Wenyang Xiao, Haoyu Zhao, Lingxiao Huang

In-context learning (ICL) is a crucial capability of current large language models (LLMs), where the selection of examples plays a key role in performance. While most existing approaches focus on selecting the most similar examples to the query, the impact of diversity in example selection remains underexplored. We systematically investigate the role of diversity in in-context example selection through experiments across a range of tasks, from sentiment classification to more challenging math and code problems. Experiments on Llama-3.1, Gemma-2, and Mistral-v0.3 families of models show that diversity-aware selection methods improve performance, particularly on complex tasks like math and code, and enhance robustness to out-of-distribution queries. To support these findings, we introduce a theoretical framework that explains the benefits of incorporating diversity in in-context example selection.

Article 1552

Title@2025-05-26 (1): Structure Disruption: Subverting Malicious Diffusion-Based Inpainting via Self-Attention Query Perturbation

Title: Structure Disruption: Subverting Malicious Diffusion-Based Inpainting via Self-Attention Query Perturbation

Strukturstörung: Verringern von bösartiger Diffusions-basierter Inpainting durch Selbstaufmerksamkeit Abfrage Störung

结构混乱:通过自控查询干扰来改变恶意扩散的涂漆 2505.19425v1

Authors: Yuhao He, Jinyu Tian, Haiwei Wu, Jianqing Li

The rapid advancement of diffusion models has enhanced their image inpainting and editing capabilities but also introduced significant societal risks. Adversaries can exploit user images from social media to generate misleading or harmful content. While adversarial perturbations can disrupt inpainting, global perturbation-based methods fail in mask-guided editing tasks due to spatial constraints. To address these challenges, we propose Structure Disruption Attack (SDA), a powerful protection framework for safeguarding sensitive image regions against inpainting-based editing. Building upon the contour-focused nature of self-attention mechanisms of diffusion models, SDA optimizes perturbations by disrupting queries in self-attention during the initial denoising step to destroy the contour generation process. This targeted interference directly disrupts the structural generation capability of diffusion models, effectively preventing them from producing coherent images. We validate our motivation through visualization techniques and extensive experiments on public datasets, demonstrating that SDA achieves state-of-the-art (SOTA) protection performance while maintaining strong robustness.

Article 1553

Title@2025-05-26 (1): Each Graph is a New Language: Graph Learning with LLMs

Title: Each Graph is a New Language: Graph Learning with LLMs

Jeder Graph ist eine neue Sprache: Graph Learning mit LLMs

每图都是一种新语言:用LLM学习图表 2501.11478v3

Authors: Huachi Zhou, Jiahe Du, Chuang Zhou, Chang Yang, Yilin Xiao, Yuxuan Xie, Xiao Huang

Recent efforts leverage Large Language Models (LLMs) for modeling text-attributed graph structures in node classification tasks. These approaches describe graph structures for LLMs to understand or aggregate LLM-generated textual attribute embeddings through graph structure. However, these approaches face two main limitations in modeling graph structures with LLMs. (i) Graph descriptions become verbose in describing high-order graph structure. (ii) Textual attributes alone do not contain adequate graph structure information. It is challenging to model graph structure concisely and adequately with LLMs. LLMs lack built-in mechanisms to model graph structures directly. They also struggle with complex long-range dependencies between high-order nodes and target nodes. Inspired by the observation that LLMs pre-trained on one language can achieve exceptional performance on another with minimal additional training, we propose Graph-Defined Language for Large Language Model (GDL4LLM). This novel framework enables LLMs to transfer their powerful language understanding capabilities to graph-structured data. GDL4LLM translates graphs into a graph language corpus instead of graph descriptions and pre-trains LLMs on this corpus to adequately understand graph structures. During fine-tuning, this corpus describes the structural information of target nodes concisely with only a few tokens. By treating graphs as a new language, GDL4LLM enables LLMs to model graph structures adequately and concisely for node classification tasks. Extensive experiments on three real-world datasets demonstrate that GDL4LLM outperforms description-based and textual attribute embeddings-based baselines by efficiently modeling different orders of graph structure with LLMs.

Article 1554

Title@2025-05-26 (1): Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift

Title: Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift

Im Moment falsch dann: Nicht-Stationäre Direktpräferenz-Optimierung unter Preference Drift

右,右,错误然后: 非标准直接首选优化在偏好驱动器下 2407.18676v2

Authors: Seongho Son, William Bankes, Sayak Ray Chowdhury, Brooks Paige, Ilija Bogunovic

Reinforcement learning from human feedback (RLHF) aligns Large Language Models (LLMs) with human preferences. However, these preferences can often change over time due to external factors (e.g. environment change and societal influence). Consequently, what was wrong then might be right now. Current preference optimization algorithms do not account for temporal preference drift in their modeling, which can lead to severe misalignment. To address this limitation, we use a Dynamic Bradley-Terry model that models preferences via time-dependent reward functions, and propose Non-Stationary Direct Preference Optimisation (NS-DPO). By introducing a discount parameter in the loss function, NS-DPO applies exponential weighting, which proportionally focuses learning on more time-relevant datapoints. We theoretically analyse the convergence of NS-DPO in the offline setting, providing upper bounds on the estimation error caused by non-stationary preferences. Finally, we demonstrate the effectiveness of NS-DPO for fine-tuning LLMs in scenarios with drifting preferences. By simulating preference drift using renowned reward models and modifying popular LLM datasets accordingly, we show that NS-DPO fine-tuned LLMs remain robust under non-stationarity, significantly outperforming baseline algorithms that ignore temporal preference changes, without sacrificing performance in stationary cases.
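The exponential weighting mentioned in this abstract can be sketched as a time-discounted DPO loss. In the sketch below, `margin` stands for the usual DPO quantity (the policy-versus-reference log-ratio of the chosen response minus that of the rejected one); the exponent `current_time - t`, the averaging, and the function names are assumptions for illustration rather than the paper's exact formulation.

```python
import math

def dpo_loss(margin, beta=0.1):
    """Standard DPO loss -log(sigmoid(beta * margin)), computed stably."""
    z = beta * margin
    # -log(sigmoid(z)) = softplus(-z) = max(-z, 0) + log(1 + exp(-|z|))
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))

def ns_dpo_loss(margins, times, current_time, gamma=0.9, beta=0.1):
    """Average DPO loss with each datapoint discounted by gamma ** age.

    `times` holds the (assumed integer) timestamp of each preference pair;
    older pairs receive exponentially smaller weight.
    """
    total = 0.0
    for margin, t in zip(margins, times):
        weight = gamma ** (current_time - t)
        total += weight * dpo_loss(margin, beta)
    return total / len(margins)

# Toy usage: two pairs with the same margin, one old and one recent.
loss = ns_dpo_loss([0.5, 0.5], times=[0, 9], current_time=9, gamma=0.9)
```

Setting `gamma=1.0` recovers the ordinary stationary DPO average, which matches the abstract's remark that the stationary case need not be sacrificed.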

Article 1555

Title@2025-05-26 (1): SaVe-TAG: Semantic-aware Vicinal Risk Minimization for Long-Tailed Text-Attributed Graphs

Title: SaVe-TAG: Semantic-aware Vicinal Risk Minimization for Long-Tailed Text-Attributed Graphs

SaVe-TAG: Semantisch-bewusst Vicinal Risk Minimierung für langgestreckte Text-Attribute Graphen

SaVe-TAG: 长途脱轨文本可归图解析相邻风险最小化 2410.16882v3

Authors: Leyao Wang, Yu Wang, Bo Ni, Yuying Zhao, Hanyu Wang, Yao Ma, Tyler Derr

Real-world graph data often follows long-tailed distributions, making it difficult for Graph Neural Networks (GNNs) to generalize well across both head and tail classes. Recent advances in Vicinal Risk Minimization (VRM) have shown promise in mitigating class imbalance with numeric interpolation; however, existing approaches largely rely on embedding-space arithmetic, which fails to capture the rich semantics inherent in text-attributed graphs. In this work, we propose our method, SaVe-TAG (Semantic-aware Vicinal Risk Minimization for Long-Tailed Text-Attributed Graphs), a novel VRM framework that leverages Large Language Models (LLMs) to perform text-level interpolation, generating on-manifold, boundary-enriching synthetic samples for minority classes. To mitigate the risk of noisy generation, we introduce a confidence-based edge assignment mechanism that uses graph topology as a natural filter to ensure structural consistency. We provide theoretical justification for our method and conduct extensive experiments on benchmark datasets, showing that our approach consistently outperforms both numeric interpolation and prior long-tailed node classification baselines. Our results highlight the importance of integrating semantic and structural signals for balanced and effective learning on text-attributed graphs.

Article 1556

Title@2025-05-26 (1): Strictly Constrained Generative Modeling via Split Augmented Langevin Sampling

Title: Strictly Constrained Generative Modeling via Split Augmented Langevin Sampling

Streng eingeschränkte generative Modellierung über Split Augmented Langevin Sampling

通过分分扩大Langevin抽样进行严格约束的生成模型模拟 2505.18017v2

Authors: Matthieu Blanke, Yongquan Qu, Sara Shamekh, Pierre Gentine

Deep generative models hold great promise for representing complex physical systems, but their deployment is currently limited by the lack of guarantees on the physical plausibility of the generated outputs. Ensuring that known physical constraints are enforced is therefore critical when applying generative models to scientific and engineering problems. We address this limitation by developing a principled framework for sampling from a target distribution while rigorously satisfying physical constraints. Leveraging the variational formulation of Langevin dynamics, we propose Split Augmented Langevin (SAL), a novel primal-dual sampling algorithm that enforces constraints progressively through variable splitting, with convergence guarantees. While the method is developed theoretically for Langevin dynamics, we demonstrate its effective applicability to diffusion models. In particular, we use constrained diffusion models to generate physical fields satisfying energy and mass conservation laws. We apply our method to diffusion-based data assimilation on a complex physical system, where enforcing physical constraints substantially improves both forecast accuracy and the preservation of critical conserved quantities. We also demonstrate the potential of SAL for challenging feasibility problems in optimal control.

Article 1557

Title@2025-05-26 (1): Toward Physics-Informed Machine Learning for Data Center Operations: A Tropical Case Study

Title: Toward Physics-Informed Machine Learning for Data Center Operations: A Tropical Case Study

Auf dem Weg zum physikinformierten maschinellen Lernen für Rechenzentrumsoperationen: Eine Tropische Fallstudie

争取为数据中心业务进行物理一体化机械学习:热带案例研究 2505.19414v1

Authors: Ruihang Wang, Zhiwei Cao, Qingang Zhang, Rui Tan, Yonggang Wen, Tommy Leung, Stuart Kennedy, Justin Teoh

Data centers are the backbone of computing capacity. Operating data centers in tropical regions faces unique challenges due to consistently high ambient temperature and elevated relative humidity throughout the year. These conditions result in increased cooling costs to maintain the reliability of the computing systems. While existing machine learning-based approaches have demonstrated potential to elevate operations to a more proactive and intelligent level, their deployment remains dubious due to concerns about model extrapolation capabilities and associated system safety issues. To address these concerns, this article proposes incorporating the physical characteristics of data centers into traditional data-driven machine learning solutions. We begin by introducing the data center system, including the relevant multiphysics processes and the data-physics availability. Next, we outline the associated modeling and optimization problems and propose an integrated, physics-informed machine learning system to address them. Using the proposed system, we present relevant applications across varying levels of operational intelligence. A case study on an industry-grade tropical data center is provided to demonstrate the effectiveness of our approach. Finally, we discuss key challenges and highlight potential future directions.

Article 1558

Title@2025-05-26 (1): Future Link Prediction Without Memory or Aggregation

Title: Future Link Prediction Without Memory or Aggregation

Zukünftige Link-Vorhersage ohne Gedächtnis oder Aggregation

没有记忆或聚合的未来联系预测 2505.19408v1

Authors: Lu Yi, Runlin Lei, Fengran Mo, Yanping Zheng, Zhewei Wei, Yuhang Ye

Future link prediction on temporal graphs is a fundamental task with wide applicability in real-world dynamic systems. These scenarios often involve both recurring (seen) and novel (unseen) interactions, requiring models to generalize effectively across both types of edges. However, existing methods typically rely on complex memory and aggregation modules, yet struggle to handle unseen edges. In this paper, we revisit the architecture of existing temporal graph models and identify two essential but overlooked modeling requirements for future link prediction: representing nodes with unique identifiers and performing target-aware matching between source and destination nodes. To this end, we propose Cross-Attention based Future Link Predictor on Temporal Graphs (CRAFT), a simple yet effective architecture that discards memory and aggregation modules and instead builds on two components: learnable node embeddings and cross-attention between the destination and the source’s recent interactions. This design provides strong expressive power and enables target-aware modeling of the compatibility between candidate destinations and the source’s interaction patterns. Extensive experiments on diverse datasets demonstrate that CRAFT consistently achieves superior performance with high efficiency, making it well-suited for large-scale real-world applications.

Article 1559

Title@2025-05-26 (1): FedHERO: A Federated Learning Approach for Node Classification Task on Heterophilic Graphs

Title: FedHERO: A Federated Learning Approach for Node Classification Task on Heterophilic Graphs

FedHERO: Ein Federated Learning Approach für Knotenklassifikation Aufgaben auf heterophilen Graphen

FedHERO: 异生物图节点分类任务联邦学习方法 2504.21206v2

Authors: Zihan Chen, Xingbo Fu, Yushun Dong, Jundong Li, Cong Shen

Federated Graph Learning (FGL) empowers clients to collaboratively train Graph Neural Networks (GNNs) in a distributed manner while preserving data privacy. However, FGL methods usually require that the graph data owned by all clients is homophilic to ensure similar neighbor distribution patterns of nodes. Such an assumption ensures that the learned knowledge is consistent across the local models from all clients. Therefore, these local models can be properly aggregated as a global model without undermining the overall performance. Nevertheless, when the neighbor distribution patterns of nodes vary across different clients (e.g., when clients hold graphs with different levels of heterophily), their local models may gain different and even conflicting knowledge from their node-level predictive tasks. Consequently, aggregating these local models usually leads to catastrophic performance deterioration on the global model. To address this challenge, we propose FedHERO, an FGL framework designed to harness and share insights from heterophilic graphs effectively. At the heart of FedHERO is a dual-channel GNN equipped with a structure learner, engineered to discern the structural knowledge encoded in the local graphs. With this specialized component, FedHERO enables the local model for each client to identify and learn patterns that are universally applicable across graphs with different patterns of node neighbor distributions. FedHERO not only enhances the performance of individual client models by leveraging both local and shared structural insights but also sets a new precedent in this field to effectively handle graph data with various node neighbor distribution patterns. We conduct extensive experiments to validate the superior performance of FedHERO against existing alternatives.

Article 1560

Title@2025-05-26 (1): Exploring the Possibility of TypiClust for Low-Budget Federated Active Learning

Title: Exploring the Possibility of TypiClust for Low-Budget Federated Active Learning

Erforschung der Möglichkeit des TypiClusts für budgetarmes, föderiertes aktives Lernen

探讨低预算联邦积极学习的TypiClust 2505.19404v1

Authors: Yuta Ono, Hiroshi Nakamura, Hideki Takase

Federated Active Learning (FAL) seeks to reduce the burden of annotation under the realistic constraints of federated learning by leveraging Active Learning (AL). As FAL settings make it more expensive to obtain ground truth labels, FAL strategies that work well in low-budget regimes, where the amount of annotation is very limited, are needed. In this work, we investigate the effectiveness of TypiClust, a successful low-budget AL strategy, in low-budget FAL settings. Our empirical results show that TypiClust works well even in low-budget FAL settings, in contrast to the relatively low performance of other methods, although these settings present additional challenges, such as data heterogeneity, compared to AL. In addition, we show that FAL settings cause distribution shifts in terms of typicality, but TypiClust is not very vulnerable to these shifts. We also analyze the sensitivity of TypiClust to feature extraction methods, which suggests a way to perform FAL even in limited data situations.

Article 1561

Title@2025-05-26 (1): KHRONOS: a Kernel-Based Neural Architecture for Rapid, Resource-Efficient Scientific Computation

Title: KHRONOS: a Kernel-Based Neural Architecture for Rapid, Resource-Efficient Scientific Computation

KHRONOS: Eine Kernel-basierte Neuralarchitektur für schnelle, ressourceneffiziente wissenschaftliche Berechnung

KHRONOS:一个以核心为基础的神经结构,用于快速、资源高效科学计算 2505.13315v2

Authors: Reza T. Batley, Sourav Saha

Contemporary models of high-dimensional physical systems are constrained by the curse of dimensionality and a reliance on dense data. We introduce KHRONOS (Kernel Expansion Hierarchy for Reduced Order, Neural Optimized Surrogates), an AI framework for model-based, model-free, and model-inversion tasks. KHRONOS constructs continuously differentiable target fields with a hierarchical composition of per-dimension kernel expansions, which are tensorized into modes and then superposed. We evaluate KHRONOS on a canonical 2D Poisson equation benchmark: across 16 to 512 degrees of freedom (DoFs), it obtained $L_2$-square errors of 5e-4 down to 6e-11. This represents a greater than 100-fold gain over Kolmogorov-Arnold Networks (which themselves report a 100-fold improvement over MLPs/PINNs with 100 times fewer parameters) when controlling for the number of parameters. This also represents a 1e6-fold improvement in $L_2$-square error compared to standard linear FEM at comparable DoFs. Inference complexity is dominated by inner products, yielding sub-millisecond full-field predictions that scale to an arbitrary resolution. For inverse problems, KHRONOS facilitates rapid, iterative level set recovery in only a few forward evaluations, with sub-microsecond per-sample latency. KHRONOS’s scalability, expressivity, and interpretability open new avenues in constrained edge computing, online control, computer vision, and beyond.

Article 1562

Title@2025-05-26 (1): Can LLMs Help Uncover Insights about LLMs? A Large-Scale, Evolving Literature Analysis of Frontier LLMs

Title: Can LLMs Help Uncover Insights about LLMs? A Large-Scale, Evolving Literature Analysis of Frontier LLMs

Können LLMs helfen, Erkenntnisse über LLMs zu enthüllen? Eine groß angelegte, sich entwickelnde Literaturanalyse von Frontier LLMs

LLMs 帮助发现关于LLM的见识? 大型、不断发展的前沿LMS文学分析 2502.18791v3

Authors: Jungsoo Park, Junmo Kang, Gabriel Stanovsky, Alan Ritter

The surge of LLM studies makes synthesizing their findings challenging. Analysis of experimental results from literature can uncover important trends across studies, but the time-consuming nature of manual data extraction limits its use. Our study presents a semi-automated approach for literature analysis that accelerates data extraction using LLMs. It automatically identifies relevant arXiv papers, extracts experimental results and related attributes, and organizes them into a structured dataset, LLMEvalDB. We then conduct an automated literature analysis of frontier LLMs, reducing the effort of paper surveying and data extraction by more than 93% compared to manual approaches. We validate LLMEvalDB by showing that it reproduces key findings from a recent manual analysis of Chain-of-Thought (CoT) reasoning and also uncovers new insights that go beyond it, showing, for example, that in-context examples benefit coding & multimodal tasks but offer limited gains in math reasoning tasks compared to zero-shot CoT. Our automatically updatable dataset enables continuous tracking of target models by extracting evaluation studies as new data becomes available. Through LLMEvalDB and empirical analysis, we provide insights into LLMs while facilitating ongoing literature analyses of their behavior.

Article 1563

Title@2025-05-26 (1): Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent

Title: Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent

Auf dem Weg zum Verständnis der Verallgemeinerbarkeit des verzögerten stochastischen Absinkens

了解拖延的拖延的逐步后世后代的普遍适用性 2308.09430v4

Authors: Xiaoge Deng, Li Shen, Shengwei Li, Tao Sun, Dongsheng Li, Dacheng Tao

Stochastic gradient descent (SGD) performed in an asynchronous manner plays a crucial role in training large-scale machine learning models. However, the generalization performance of asynchronous delayed SGD, which is an essential metric for assessing machine learning algorithms, has rarely been explored. Existing generalization error bounds are rather pessimistic and cannot reveal the correlation between asynchronous delays and generalization. In this paper, we investigate a sharper generalization error bound for SGD with asynchronous delay $\tau$. Leveraging the generating function analysis tool, we first establish the average stability of the delayed gradient algorithm. Based on this algorithmic stability, we provide upper bounds on the generalization error of $\tilde{\mathcal{O}}(\frac{T-\tau}{n\tau})$ and $\tilde{\mathcal{O}}(\frac{1}{n})$ for quadratic convex and strongly convex problems, respectively, where $T$ refers to the iteration number and $n$ is the amount of training data. Our theoretical results indicate that asynchronous delays reduce the generalization error of the delayed SGD algorithm. Analogous analysis can be generalized to the random delay setting, and the experimental results validate our theoretical findings.
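For readers unfamiliar with the setting, delayed SGD with a fixed delay $\tau$ can be sketched as an update that applies a gradient evaluated at the iterate from $\tau$ steps earlier. The toy objective (the mean of $0.5 (w - x)^2$ over the data), the function name, and the hyperparameters below are assumptions chosen for illustration; they are not taken from the paper's experiments.

```python
import random
from collections import deque

def delayed_sgd(data, tau, steps, lr=0.05, seed=0):
    """Run SGD on mean(0.5 * (w - x)^2) using gradients that are tau steps stale.

    The update is w_{t+1} = w_t - lr * grad(w_{t - tau}; x_t), where iterates
    with a negative index are taken to be the initial point.
    """
    rng = random.Random(seed)
    w = 0.0
    # history[0] always holds w_{t - tau}; history[-1] holds the current w_t.
    history = deque([w] * (tau + 1), maxlen=tau + 1)
    for _ in range(steps):
        x = rng.choice(data)      # draw one training example
        stale_w = history[0]      # iterate from tau steps ago
        grad = stale_w - x        # gradient of 0.5 * (w - x)^2 at the stale iterate
        w = w - lr * grad
        history.append(w)         # the oldest iterate is dropped automatically
    return w

# tau = 0 recovers ordinary SGD; tau = 3 uses gradients three steps old.
w_plain = delayed_sgd([2.0], tau=0, steps=500)
w_delayed = delayed_sgd([2.0], tau=3, steps=500)
```

With a small enough step size both runs settle near the minimizer; the abstract's claims concern how the delay affects the generalization error, which this toy example does not measure.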

Article 1564

Title@2025-05-26 (1): Are Time-Series Foundation Models Deployment-Ready? A Systematic Study of Adversarial Robustness Across Domains

Title: Are Time-Series Foundation Models Deployment-Ready? A Systematic Study of Adversarial Robustness Across Domains

Sind Time-Series-Stiftungsmodelle bereit? Eine systematische Studie über die widerrechtliche Robustheit über Domains hinweg

时间-系列基金会的模型是部署-准备模型吗? 2505.19397v1

Authors: Jiawen Zhang, Zhenwei Zhang, Shun Zheng, Xumeng Wen, Jia Li, Jiang Bian

Time Series Foundation Models (TSFMs), which are pretrained on large-scale, cross-domain data and capable of zero-shot forecasting in new scenarios without further training, are increasingly adopted in real-world applications. However, as the zero-shot forecasting paradigm gains popularity, a critical yet overlooked question emerges: Are TSFMs robust to adversarial input perturbations? Such perturbations could be exploited in man-in-the-middle attacks or data poisoning. To address this gap, we conduct a systematic investigation into the adversarial robustness of TSFMs. Our results show that even minimal perturbations can induce significant and controllable changes in forecast behaviors, including trend reversal, temporal drift, and amplitude shift, posing serious risks to TSFM-based services. Through experiments on representative TSFMs and multiple datasets, we reveal their consistent vulnerabilities and identify potential architectural designs, such as structural sparsity and multi-task pretraining, that may improve robustness. Our findings offer actionable guidance for designing more resilient forecasting systems and provide a critical assessment of the adversarial robustness of TSFMs.

Article 1565

Title@2025-05-26 (1): Uniform convergence of the smooth calibration error and its relationship with functional gradient

Title: Uniform convergence of the smooth calibration error and its relationship with functional gradient

Einheitliche Konvergenz des glatten Kalibrierfehlers und seines Verhältnisses mit dem funktionellen Gradienten

平稳校准误差及其与功能梯度的关系统一汇合 2505.19396v1

Authors: Futoshi Futami, Atsushi Nitanda

Calibration is a critical requirement for reliable probabilistic prediction, especially in high-risk applications. However, the theoretical understanding of which learning algorithms can simultaneously achieve high accuracy and good calibration remains limited, and many existing studies provide empirical validation or a theoretical guarantee in restrictive settings. To address this issue, in this work, we focus on the smooth calibration error (CE) and provide a uniform convergence bound, showing that the smooth CE is bounded by the sum of the smooth CE over the training dataset and a generalization gap. We further prove that the functional gradient of the loss function can effectively control the training smooth CE. Based on this framework, we analyze three representative algorithms: gradient boosting trees, kernel boosting, and two-layer neural networks. For each, we derive conditions under which both classification and calibration performances are simultaneously guaranteed. Our results offer new theoretical insights and practical guidance for designing reliable probabilistic models with provable calibration guarantees.

Article 1566

Title: Towards the Causal Complete Cause of Multi-Modal Representation Learning

Auf dem Weg zur kausalen vollständigen Ursache des multi-Modalen Repräsentationslernens

走向多模式代表制学习的事业完全原因 2407.14058v6

Authors: Jingyao Wang, Siyu Zhao, Wenwen Qiang, Jiangmeng Li, Changwen Zheng, Fuchun Sun, Hui Xiong

Multi-Modal Learning (MML) aims to learn effective representations across modalities for accurate predictions. Existing methods typically focus on modality consistency and specificity to learn effective representations. However, from a causal perspective, they may lead to representations that contain insufficient and unnecessary information. To address this, we propose that effective MML representations should be causally sufficient and necessary. Considering practical issues like spurious correlations and modality conflicts, we relax the exogeneity and monotonicity assumptions prevalent in prior works and explore the concepts specific to MML, i.e., Causal Complete Cause $C^3$. We begin by defining $C^3$, which quantifies the probability of representations being causally sufficient and necessary. We then discuss the identifiability of $C^3$ and introduce an instrumental variable to support identifying $C^3$ with non-exogeneity and non-monotonicity. Building on this, we conduct the $C^3$ measurement, i.e., $C^3$ risk. We propose a twin network to estimate it through (i) the real-world branch: utilizing the instrumental variable for sufficiency, and (ii) the hypothetical-world branch: applying gradient-based counterfactual modeling for necessity. Theoretical analyses confirm its reliability. Based on these results, we propose $C^3$ Regularization, a plug-and-play method that enforces the causal completeness of the learned representations by minimizing $C^3$ risk. Extensive experiments demonstrate its effectiveness.

Article 1567

Title@2025-05-26 (1): Alignment of large language models with constrained learning

Title: Alignment of large language models with constrained learning

Ausrichtung großer Sprachmodelle mit eingeschränktem Lernen

大型语言模式与限制学习的结合 2505.19387v1

Authors: Botong Zhang, Shuo Li, Ignacio Hounie, Osbert Bastani, Dongsheng Ding, Alejandro Ribeiro

We study the problem of computing an optimal large language model (LLM) policy for a constrained alignment problem, where the goal is to maximize a primary reward objective while satisfying constraints on secondary utilities. Despite the popularity of Lagrangian-based LLM policy search in constrained alignment, iterative primal-dual methods often fail to converge, and non-iterative dual-based methods do not achieve optimality in the LLM parameter space. To address these challenges, we employ Lagrangian duality to develop an iterative dual-based alignment method that alternates between updating the LLM policy via Lagrangian maximization and updating the dual variable via dual descent. In theory, we characterize the primal-dual gap between the primal value in the distribution space and the dual value in the LLM parameter space. We further quantify the optimality gap of the learned LLM policies at near-optimal dual variables with respect to both the objective and the constraint functions. These results prove that dual-based alignment methods can find an optimal constrained LLM policy, up to an LLM parametrization gap. We demonstrate the effectiveness and merits of our approach through extensive experiments conducted on the PKU-SafeRLHF dataset.
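
The alternation described above (maximize the Lagrangian over the policy, then take a dual descent step) can be illustrated on a toy problem. The sketch below is not the authors' method: it works directly in distribution space over three candidate responses rather than in LLM parameter space, and it assumes a KL-regularized objective with a uniform reference policy so that the Lagrangian maximizer has a closed form. The names `lagrangian_maximizer`, `dual_descent`, `beta`, `step`, and the example numbers are all hypothetical.

```python
import numpy as np

def lagrangian_maximizer(reward, utility, lam, beta=1.0):
    """Policy maximizing E[reward + lam * utility] - beta * KL(policy || uniform).

    Assumption: with a uniform reference policy, the maximizer is a softmax
    of the combined score divided by beta.
    """
    logits = (reward + lam * utility) / beta
    logits = logits - logits.max()          # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def dual_descent(reward, utility, threshold, beta=1.0, step=0.5, iters=500):
    """Alternate policy maximization with a projected dual descent step."""
    lam = 0.0
    for _ in range(iters):
        policy = lagrangian_maximizer(reward, utility, lam, beta)
        slack = policy @ utility - threshold   # > 0 means the constraint holds
        lam = max(0.0, lam - step * slack)     # descend on the dual, keep lam >= 0
    return lagrangian_maximizer(reward, utility, lam, beta), lam

# Toy instance: three responses, constraint E[utility] >= 0.6.
reward = np.array([1.0, 0.5, 0.0])
utility = np.array([0.0, 0.5, 1.0])
policy, lam = dual_descent(reward, utility, threshold=0.6)
print(policy, lam, policy @ utility)
```

In this toy instance the unconstrained policy violates the constraint, so the dual variable grows until the expected utility meets the threshold.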

Article 1568

Title@2025-05-26 (1): JingFang: An Expert-Level Large Language Model for Traditional Chinese Medicine Clinical Consultation and Syndrome Differentiation-Based Treatment

Title: JingFang: An Expert-Level Large Language Model for Traditional Chinese Medicine Clinical Consultation and Syndrome Differentiation-Based Treatment

JingFang: Ein sachverständiges Sprachmodell für die traditionelle chinesische Medizin Klinische Beratung und Syndromdifferenzierungsbasierte Behandlung

JingFang:中国传统医学临床咨询和综合症差别治疗专家级大语言模式 2502.04345v2

Authors: Yehan Yang, Tianhao Ma, Ruotai Li, Xinhan Zheng, Guodong Shan, Chisheng Li

The effective application of traditional Chinese medicine (TCM) requires extensive knowledge of TCM and clinical experience. Large Language Models (LLMs) offer a way to address this, but existing LLMs for TCM exhibit critical limitations: incomplete clinical consultation and diagnosis, and inaccurate syndrome differentiation. To address these issues, we establish JingFang (JF), a novel TCM LLM that demonstrates expert-level clinical consultation and syndrome differentiation. We propose a Multi-Agent Collaborative Chain-of-Thought Mechanism (MACCTM) for comprehensive and targeted clinical consultation, giving JF effective and accurate diagnostic ability. In addition, a Syndrome Agent and a Dual-Stage Recovery Scheme (DSRS) are developed to improve the accuracy of syndrome differentiation and the corresponding treatment. JingFang not only facilitates the application of LLMs but also promotes the effective application of TCM for healthcare.

Article 1569

Title@2025-05-26 (1): Unsupervised Anomaly Detection Using Diffusion Trend Analysis for Display Inspection

Title: Unsupervised Anomaly Detection Using Diffusion Trend Analysis for Display Inspection

Unüberwachte Anomalieerkennung mit Diffusion Trendanalyse für Display-Inspektion

用于显示检查的利用扩散趋势分析进行无监督异常探测 2407.09578v2

Authors: Eunwoo Kim, Un Yang, Cheol Lae Roh, Stefano Ermon

Reconstruction-based anomaly detection with denoising diffusion models has limitations in determining appropriate noise parameters that can degrade anomalies while preserving normal characteristics. Also, normal regions can fluctuate considerably during reconstruction, resulting in false detection. In this paper, we propose a method to detect anomalies by analyzing the reconstruction trend as a function of the degree of degradation, effectively solving both problems that impede practical application in display inspection.

Article 1570

Title@2025-05-25 (7): SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning

Title: SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning

SALSA-RL: Stabilitätsanalyse im Latent Space of Actions zur Stärkung des Lernens

SALSA-RL:加强学习行动空间的稳定分析 2502.15512v2

Authors: Xuyang Li, Romit Maulik

Modern deep reinforcement learning (DRL) methods have made significant advances in handling continuous action spaces. However, real-world control systems–especially those requiring precise and reliable performance–often demand interpretability in the sense of a-priori assessments of agent behavior to identify safe or failure-prone interactions with environments. To address this limitation, we propose SALSA-RL (Stability Analysis in the Latent Space of Actions), a novel RL framework that models control actions as dynamic, time-dependent variables evolving within a latent space. By employing a pre-trained encoder-decoder and a state-dependent linear system, our approach enables interpretability through local stability analysis, where instantaneous growth in action-norms can be predicted before their execution. We demonstrate that SALSA-RL can be deployed in a non-invasive manner for assessing the local stability of actions from pretrained RL agents without compromising on performance across diverse benchmark environments. By enabling a more interpretable analysis of action generation, SALSA-RL provides a powerful tool for advancing the design, analysis, and theoretical understanding of RL systems.

Article 1571

Title@2025-05-25 (7): Foundations of Top-$k$ Decoding For Language Models

Title: Foundations of Top-$k$ Decoding For Language Models

Grundlagen von Top-$k$ Dekodierung für Sprachmodelle

语言模式最高价基数 2505.19371v1

Authors: Georgy Noarov, Soham Mallick, Tao Wang, Sunay Joshi, Yan Sun, Yangxinyu Xie, Mengxin Yu, Edgar Dobriban

Top-$k$ decoding is a widely used method for sampling from LLMs: at each token, only the largest $k$ next-token-probabilities are kept, and the next token is sampled after re-normalizing them to sum to unity. Top-$k$ and other sampling methods are motivated by the intuition that true next-token distributions are sparse, and the noisy LLM probabilities need to be truncated. However, to our knowledge, a precise theoretical motivation for the use of top-$k$ decoding is missing. In this work, we develop a theoretical framework that both explains and generalizes top-$k$ decoding. We view decoding at a fixed token as the recovery of a sparse probability distribution. We consider Bregman decoders obtained by minimizing a separable Bregman divergence (for both the primal and dual cases) with a sparsity-inducing $\ell_0$ regularization. Despite the combinatorial nature of the objective, we show how to optimize it efficiently for a large class of divergences. We show that the optimal decoding strategies are greedy, and further that the loss function is discretely convex in $k$, so that binary search provably and efficiently finds the optimal $k$. We show that top-$k$ decoding arises as a special case for the KL divergence, and identify new decoding strategies that have distinct behaviors (e.g., non-linearly up-weighting larger probabilities after re-normalization).
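
The basic top-$k$ step described in the first sentence (keep the $k$ largest next-token probabilities, re-normalize, sample) can be sketched as follows. This covers only standard top-$k$ truncation, not the Bregman decoders proposed in the paper; the function names and the example distribution are illustrative assumptions.

```python
import numpy as np

def top_k_renormalize(probs, k):
    """Keep the k largest probabilities, zero the rest, rescale to sum to 1."""
    probs = np.asarray(probs, dtype=float)
    if not 1 <= k <= probs.size:
        raise ValueError("k must be between 1 and the vocabulary size")
    keep = np.argsort(probs)[-k:]            # indices of the k largest entries
    truncated = np.zeros_like(probs)
    truncated[keep] = probs[keep]
    return truncated / truncated.sum()

def sample_top_k(probs, k, rng):
    """Draw one token index from the truncated, re-normalized distribution."""
    return int(rng.choice(len(probs), p=top_k_renormalize(probs, k)))

# Hypothetical next-token distribution over a five-token vocabulary.
next_token_probs = [0.5, 0.3, 0.1, 0.06, 0.04]
rng = np.random.default_rng(0)
print(top_k_renormalize(next_token_probs, k=2))
print(sample_top_k(next_token_probs, k=2, rng=rng))
```

With $k = 2$ only the first two tokens keep non-zero mass, so every sample is token 0 or token 1.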

Article 1572

Title@2025-05-25 (7): SETransformer: A Hybrid Attention-Based Architecture for Robust Human Activity Recognition

Title: SETransformer: A Hybrid Attention-Based Architecture for Robust Human Activity Recognition

SETransformer: Eine hybride, auf Aufmerksamkeit basierende Architektur für robuste menschliche Aktivitätserkennung

转型:以关注为基础的混合结构,以确认强有力的人类活动 2505.19369v1

Authors: Yunbo Liu, Xukui Qin, Yifan Gao, Xiang Li, Chengwei Feng

Human Activity Recognition (HAR) using wearable sensor data has become a central task in mobile computing, healthcare, and human-computer interaction. Despite the success of traditional deep learning models such as CNNs and RNNs, they often struggle to capture long-range temporal dependencies and contextual relevance across multiple sensor channels. To address these limitations, we propose SETransformer, a hybrid deep neural architecture that combines Transformer-based temporal modeling with channel-wise squeeze-and-excitation (SE) attention and a learnable temporal attention pooling mechanism. The model takes raw triaxial accelerometer data as input and leverages global self-attention to capture activity-specific motion dynamics over extended time windows, while adaptively emphasizing informative sensor channels and critical time steps. We evaluate SETransformer on the WISDM dataset and demonstrate that it significantly outperforms conventional models including LSTM, GRU, BiLSTM, and CNN baselines. The proposed model achieves a validation accuracy of 84.68% and a macro F1-score of 84.64%, surpassing all baseline architectures by a notable margin. Our results show that SETransformer is a competitive and interpretable solution for real-world HAR tasks, with strong potential for deployment in mobile and ubiquitous sensing applications.

Article 1573

Title@2025-05-25 (7): One Step Diffusion via Shortcut Models

Title: One Step Diffusion via Shortcut Models

Ein Schritt Diffusion über Shortcut-Modelle

通过快捷键模型进行单步扩散 2410.12557v2

Authors: Kevin Frans, Danijar Hafner, Sergey Levine, Pieter Abbeel

Diffusion models and flow-matching models have enabled generating diverse and realistic images by learning to transfer noise to data. However, sampling from these models involves iterative denoising over many neural network passes, making generation slow and expensive. Previous approaches for speeding up sampling require complex training regimes, such as multiple training phases, multiple networks, or fragile scheduling. We introduce shortcut models, a family of generative models that use a single network and training phase to produce high-quality samples in one or multiple sampling steps. Shortcut models condition the network not only on the current noise level but also on the desired step size, allowing the model to skip ahead in the generation process. Across a wide range of sampling step budgets, shortcut models consistently produce higher quality samples than previous approaches, such as consistency models and reflow. Compared to distillation, shortcut models reduce complexity to a single network and training phase and additionally allow varying step budgets at inference time.

Article 1574

Title@2025-05-25 (7): Adaptive Diffusion Guidance via Stochastic Optimal Control

Title: Adaptive Diffusion Guidance via Stochastic Optimal Control

Adaptive Diffusionsführung über stochastische Optimale Kontrolle

通过斯托卡优化控制进行适应性扩散指导 2505.19367v1

Authors: Iskander Azangulov, Peter Potaptchik, Qinyu Li, Eddie Aamari, George Deligiannidis, Judith Rousseau

Guidance is a cornerstone of modern diffusion models, playing a pivotal role in conditional generation and enhancing the quality of unconditional samples. However, current approaches to guidance scheduling–determining the appropriate guidance weight–are largely heuristic and lack a solid theoretical foundation. This work addresses these limitations on two fronts. First, we provide a theoretical formalization that precisely characterizes the relationship between guidance strength and classifier confidence. Second, building on this insight, we introduce a stochastic optimal control framework that casts guidance scheduling as an adaptive optimization problem. In this formulation, guidance strength is not fixed but dynamically selected based on time, the current sample, and the conditioning class, either independently or in combination. By solving the resulting control problem, we establish a principled foundation for more effective guidance in diffusion models.

Article 1575

Title@2025-05-25 (7): FD-Bench: A Modular and Fair Benchmark for Data-driven Fluid Simulation

Title: FD-Bench: A Modular and Fair Benchmark for Data-driven Fluid Simulation

FD-Bench: Modularer und fairer Benchmark für datengetriebene Fluidsimulation

FD-时区:数据驱动流流模拟模块化公平基准 2505.20349v1

Authors: Haixin Wang, Ruoyan Li, Fred Xu, Fang Sun, Kaiqiao Han, Zijie Huang, Guancheng Wan, Ching Chang, Xiao Luo, Wei Wang, Yizhou Sun

Data-driven modeling of fluid dynamics has advanced rapidly with neural PDE solvers, yet a fair and strong benchmark remains fragmented due to the absence of unified PDE datasets and standardized evaluation protocols. Although architectural innovations are abundant, fair assessment is further impeded by the lack of clear disentanglement between spatial, temporal and loss modules. In this paper, we introduce FD-Bench, the first fair, modular, comprehensive and reproducible benchmark for data-driven fluid simulation. FD-Bench systematically evaluates 85 baseline models across 10 representative flow scenarios under a unified experimental setup. It provides four key contributions: (1) a modular design enabling fair comparisons across spatial, temporal, and loss function modules; (2) the first systematic framework for direct comparison with traditional numerical solvers; (3) fine-grained generalization analysis across resolutions, initial conditions, and temporal windows; and (4) a user-friendly, extensible codebase to support future research. Through rigorous empirical studies, FD-Bench establishes the most comprehensive leaderboard to date, resolving long-standing issues in reproducibility and comparability, and laying a foundation for robust evaluation of future data-driven fluid models. The code is open-sourced at https://anonymous.4open.science/r/FD-Bench-15BC.

Article 1576

Title@2025-05-25 (7): Consistency-based Abductive Reasoning over Perceptual Errors of Multiple Pre-trained Models in Novel Environments

Title: Consistency-based Abductive Reasoning over Perceptual Errors of Multiple Pre-trained Models in Novel Environments

Konsistenzbasierte abduktive Begründung über Wahrnehmungsfehler mehrerer vortrainierter Modelle in neuartigen Umgebungen

创新环境中多个未受过培训的多种模式的认知错误的基于一致性的直截力理由 2505.19361v1

Authors: Mario Leiva, Noel Ngu, Joshua Shay Kricheli, Aditya Taparia, Ransalu Senanayake, Paulo Shakarian, Nathaniel Bastian, John Corcoran, Gerardo Simari

The deployment of pre-trained perception models in novel environments often leads to performance degradation due to distributional shifts. Although recent artificial intelligence approaches for metacognition use logical rules to characterize and filter model errors, improving precision often comes at the cost of reduced recall. This paper addresses the hypothesis that leveraging multiple pre-trained models can mitigate this recall reduction. We formulate the challenge of identifying and managing conflicting predictions from various models as a consistency-based abduction problem. The input predictions and the learned error detection rules derived from each model are encoded in a logic program. We then seek an abductive explanation–a subset of model predictions–that maximizes prediction coverage while ensuring the rate of logical inconsistencies (derived from domain constraints) remains below a specified threshold. We propose two algorithms for this knowledge representation task: an exact method based on Integer Programming (IP) and an efficient Heuristic Search (HS). Through extensive experiments on a simulated aerial imagery dataset featuring controlled, complex distributional shifts, we demonstrate that our abduction-based framework outperforms individual models and standard ensemble baselines, achieving, for instance, average relative improvements of approximately 13.6% in F1-score and 16.6% in accuracy across 15 diverse test datasets when compared to the best individual model. Our results validate the use of consistency-based abduction as an effective mechanism to robustly integrate knowledge from multiple imperfect reasoners in challenging, novel scenarios.

Article 1577

Title@2025-05-25 (7): Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval

Title: Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval

Optimierte Text-Embedding-Modelle und Benchmarks für die Amharische Passage Retrieval

阿姆光通过通过检索的最佳文本嵌入模型和基准 2505.19356v1

Authors: Kidist Amde Mekonnen, Yosef Worku Alemneh, Maarten de Rijke

Neural retrieval methods using transformer-based pre-trained language models have advanced multilingual and cross-lingual retrieval. However, their effectiveness for low-resource, morphologically rich languages such as Amharic remains underexplored due to data scarcity and suboptimal tokenization. We address this gap by introducing Amharic-specific dense retrieval models based on pre-trained Amharic BERT and RoBERTa backbones. Our proposed RoBERTa-Base-Amharic-Embed model (110M parameters) achieves a 17.6% relative improvement in MRR@10 and a 9.86% gain in Recall@10 over the strongest multilingual baseline, Arctic Embed 2.0 (568M parameters). More compact variants, such as RoBERTa-Medium-Amharic-Embed (42M), remain competitive while being over 13x smaller. Additionally, we train a ColBERT-based late interaction retrieval model that achieves the highest MRR@10 score (0.843) among all evaluated models. We benchmark our proposed models against both sparse and dense retrieval baselines to systematically assess retrieval effectiveness in Amharic. Our analysis highlights key challenges in low-resource settings and underscores the importance of language-specific adaptation. To foster future research in low-resource IR, we publicly release our dataset, codebase, and trained models at https://github.com/kidist-amde/amharic-ir-benchmarks.
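
For readers unfamiliar with the two metrics reported above, the sketch below computes MRR@10 and Recall@10 in their standard textbook form. It is not taken from the authors' codebase; the function names, the document IDs, and the toy rankings are hypothetical.

```python
def mrr_at_k(rankings, relevant_sets, k=10):
    """Mean reciprocal rank of the first relevant document within the top k."""
    total = 0.0
    for ranked, relevant in zip(rankings, relevant_sets):
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break                      # only the first hit counts
    return total / len(rankings)

def recall_at_k(rankings, relevant_sets, k=10):
    """Average fraction of each query's relevant documents found in the top k."""
    total = 0.0
    for ranked, relevant in zip(rankings, relevant_sets):
        hits = len(set(ranked[:k]) & set(relevant))
        total += hits / len(relevant)
    return total / len(rankings)

# Two toy queries: the first finds its relevant passage at rank 2,
# the second misses its relevant passage entirely.
rankings = [["d3", "d1", "d2"], ["d9", "d8", "d7"]]
relevant_sets = [{"d1"}, {"d5"}]
print(mrr_at_k(rankings, relevant_sets), recall_at_k(rankings, relevant_sets))
```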

Article 1578

Title@2025-05-25 (7): FlashMD: long-stride, universal prediction of molecular dynamics

Title: FlashMD: long-stride, universal prediction of molecular dynamics

FlashMD: Langstride, universelle Vorhersage der molekularen Dynamik

FlashMD:长途、全方位预测分子动态 2505.19350v1

Authors: Filippo Bigi, Sanggyu Chong, Agustinus Kristiadi, Michele Ceriotti

Molecular dynamics (MD) provides insights into atomic-scale processes by integrating over time the equations that describe the motion of atoms under the action of interatomic forces. Machine learning models have substantially accelerated MD by providing inexpensive predictions of the forces, but they remain constrained to minuscule time integration steps, which are required by the fast time scale of atomic motion. In this work, we propose FlashMD, a method to predict the evolution of positions and momenta over strides that are between one and two orders of magnitude longer than typical MD time steps. We incorporate considerations on the mathematical and physical properties of Hamiltonian dynamics in the architecture, generalize the approach to allow the simulation of any thermodynamic ensemble, and carefully assess the possible failure modes of such a long-stride MD approach. We validate FlashMD’s accuracy in reproducing equilibrium and time-dependent properties, using both system-specific and general-purpose models, extending the ability of MD simulation to reach the long time scales needed to model microscopic processes of high scientific and technological relevance.

Article 1579

Title@2025-05-25 (7): Communication-Efficient Multi-Device Inference Acceleration for Transformer Models

Title: Communication-Efficient Multi-Device Inference Acceleration for Transformer Models

Kommunikationseffiziente Multi-Device-Inferenzbeschleunigung für Transformer-Modelle

变换模型的通信效率高多变量推推加速 2505.19342v1

Authors: Xiao Liu, Lijun Zhang, Deepak Ganesan, Hui Guan

Transformer models power many AI applications but suffer from high inference latency, limiting their use in real-time settings. Multi-device inference can reduce latency by parallelizing computation. Yet, existing methods require high inter-device bandwidth, making them impractical for bandwidth-constrained environments. We propose ASTRA, a communication-efficient framework that accelerates Transformer inference through a novel integration of sequence parallelism and a Mixed-Precision Attention mechanism designed to minimize inter-device communication. ASTRA compresses non-local token embeddings via vector quantization and preserves task accuracy through two optimizations, Noise-Augmented Quantization and Distributed Class Tokens. Experiments on ViT and GPT2 across vision and NLP tasks show that ASTRA achieves up to 2.64X speedups over single-device inference and up to 15.25X speedups over state-of-the-art multi-device inference methods, while operating under bandwidths as low as 10 Mbps. ASTRA is open-sourced at https://github.com/xl1990/Astra.

Article 1580

Title@2025-05-25 (7): Flow Q-Learning

Title: Flow Q-Learning

Fluss Q-Lernen

流动学习 2502.02538v2

Authors: Seohong Park, Qiyang Li, Sergey Levine

We present flow Q-learning (FQL), a simple and performant offline reinforcement learning (RL) method that leverages an expressive flow-matching policy to model arbitrarily complex action distributions in data. Training a flow policy with RL is a tricky problem, due to the iterative nature of the action generation process. We address this challenge by training an expressive one-step policy with RL, rather than directly guiding an iterative flow policy to maximize values. This way, we can completely avoid unstable recursive backpropagation, eliminate costly iterative action generation at test time, yet still mostly maintain expressivity. We experimentally show that FQL leads to strong performance across 73 challenging state- and pixel-based OGBench and D4RL tasks in offline RL and offline-to-online RL. Project page: https://seohong.me/projects/fql/

Article 1581

Title@2025-05-25 (7): Improving Compositional Generation with Diffusion Models Using Lift Scores

Title: Improving Compositional Generation with Diffusion Models Using Lift Scores

Verbesserung der kompositorischen Generierung mit Diffusionsmodellen mit Lift-Scores

利用使用提升分数的传播模型改善组成型 2505.13740v2

Authors: Chenning Yu, Sicun Gao

We introduce a novel resampling criterion using lift scores, for improving compositional generation in diffusion models. By leveraging the lift scores, we evaluate whether generated samples align with each single condition and then compose the results to determine whether the composed prompt is satisfied. Our key insight is that lift scores can be efficiently approximated using only the original diffusion model, requiring no additional training or external modules. We develop an optimized variant that achieves relatively lower computational overhead during inference while maintaining effectiveness. Through extensive experiments, we demonstrate that lift scores significantly improved the condition alignment for compositional generation across 2D synthetic data, CLEVR position tasks, and text-to-image synthesis. Our code is available at http://rainorangelemon.github.io/complift.

Article 1582

Title@2025-05-25 (7): TRANSIT your events into a new mass: Fast background interpolation for weakly-supervised anomaly searches

Title: TRANSIT your events into a new mass: Fast background interpolation for weakly-supervised anomaly searches

Übertragen Sie Ihre Ereignisse in eine neue Masse: Schnelle Hintergrundinterpolation für schwach überwachte Anomaliensuche

将您的事件转换成一个新的质量: 快速背景内插, 用于受微弱监督的异常搜索 2503.04342v2

Authors: Ivan Oleksiyuk, Svyatoslav Voloshynovskiy, Tobias Golling

We introduce a new model for conditional and continuous data morphing called TRansport Adversarial Network for Smooth InTerpolation (TRANSIT). We apply it to create a background data template for weakly-supervised searches at the LHC. The method smoothly transforms sideband events to match signal region mass distributions. We demonstrate the performance of TRANSIT using the LHC Olympics R&D dataset. The model captures non-linear mass correlations of features and produces a template that offers a competitive anomaly sensitivity compared to state-of-the-art transport-based template generators. Moreover, the computational training time required for TRANSIT is an order of magnitude lower than that of competing deep learning methods. This makes it ideal for analyses that iterate over many signal regions and signal models. Unlike generative models, which must learn a full probability density distribution, i.e., the correlations between all the variables, the proposed transport model only has to learn a smooth conditional shift of the distribution. This allows for a simpler, more efficient residual architecture, enabling mass uncorrelated features to pass the network unchanged while the mass correlated features are adjusted accordingly. Furthermore, we show that the latent space of the model provides a set of mass decorrelated features useful for anomaly detection without background sculpting.

Article 1583

Title@2025-05-25 (7): WhisperD: Dementia Speech Recognition and Filler Word Detection with Whisper

Title: WhisperD: Dementia Speech Recognition and Filler Word Detection with Whisper

WhisperD: Dementia Spracherkennung und Filler-Worterkennung mit Whisper

耳语:痴呆症言语识别和用耳语探测填字词 2505.21551v1

Authors: Emmanuel Akinrintoyo, Nadine Abdelhalim, Nicole Salomons

Whisper fails to correctly transcribe dementia speech because persons with dementia (PwDs) often exhibit irregular speech patterns and disfluencies such as pauses, repetitions, and fragmented sentences. It was trained on standard speech and may have had little or no exposure to dementia-affected speech. However, correct transcription is vital for dementia speech for cost-effective diagnosis and the development of assistive technology. In this work, we fine-tune Whisper with the open-source dementia speech dataset (DementiaBank) and our in-house dataset to improve its word error rate (WER). The fine-tuning also includes filler words to ascertain the filler inclusion rate (FIR) and F1 score. The fine-tuned models significantly outperformed the off-the-shelf models. The medium-sized model achieved a WER of 0.24, outperforming previous work. Similarly, there was a notable generalisability to unseen data and speech patterns.
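
Word error rate, the headline metric above, is conventionally the word-level edit distance between the reference transcript and the system output, divided by the number of reference words. The sketch below implements that standard definition; it is not the authors' evaluation script, and the function name and the example sentences are assumptions made for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1          # reference word missing from output
            insertion = curr[j - 1] + 1     # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)

# A transcript that drops a filler word and a repetition (hypothetical example).
reference = "um I went to the the shop"
hypothesis = "I went to the shop"
print(word_error_rate(reference, hypothesis))
```

The example shows why disfluent speech is hard to score well on: dropping the filler "um" and the repeated "the" already costs two errors out of seven reference words.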

Article 1584

Title@2025-05-25 (7): Likert or Not: LLM Absolute Relevance Judgments on Fine-Grained Ordinal Scales

Title: Likert or Not: LLM Absolute Relevance Judgments on Fine-Grained Ordinal Scales

LLM Absolute Relevanz Urteile auf feinkörnigen Ordinalwaagen

理論或非理論:LLM 关于精准奥氏比额的绝对相关性判决 2505.19334v1

Authors: Charles Godfrey, Ping Nie, Natalia Ostapuk, David Ken, Shang Gao, Souheil Inati

Large language models (LLMs) obtain state-of-the-art zero-shot relevance ranking performance on a variety of information retrieval tasks. The two most common prompts to elicit LLM relevance judgments are pointwise scoring (a.k.a. relevance generation), where the LLM sees a single query-document pair and outputs a single relevance score, and listwise ranking (a.k.a. permutation generation), where the LLM sees a query and a list of documents and outputs a permutation, sorting the documents in decreasing order of relevance. The current research community consensus is that listwise ranking yields superior performance, and significant research effort has been devoted to crafting LLM listwise ranking algorithms. The underlying hypothesis is that LLMs are better at making relative relevance judgments than absolute ones. In tension with this hypothesis, we find that the gap between pointwise scoring and listwise ranking shrinks when pointwise scoring is implemented using a sufficiently large ordinal relevance label space, becoming statistically insignificant for many LLM-benchmark dataset combinations (where "significant" means "95% confidence that listwise ranking improves NDCG@10"). Our evaluations span four LLMs, eight benchmark datasets from the BEIR and TREC-DL suites, and two proprietary datasets with relevance labels collected after the training cut-off of all LLMs evaluated.

Article 1585

Title@2025-05-25 (7): Bayesian Comparisons Between Representations

Title: Bayesian Comparisons Between Representations

Bayesische Vergleiche zwischen Repräsentationen

代表之间的贝叶比较 2411.08739v3

Authors: Heiko H. Schütt

Which neural networks are similar is a fundamental question for both machine learning and neuroscience. Here, it is proposed to base comparisons on the predictive distributions of linear readouts from intermediate representations. In Bayesian statistics, the prior predictive distribution is a full description of the inductive bias and generalization of a model, making it a great basis for comparisons. This distribution directly gives the evidence a dataset would provide in favor of the model. If we want to compare multiple models to each other, we can use a metric for probability distributions like the Jensen-Shannon distance or the total variation distance. As these are metrics, this induces pseudo-metrics for representations, which measure how well two representations could be distinguished based on a linear read out. For a linear readout with a Gaussian prior on the read-out weights and Gaussian noise, we can analytically compute the (prior and posterior) predictive distributions without approximations. These distributions depend only on the linear kernel matrix of the representations in the model. Thus, the Bayesian metrics connect to both linear read-out based comparisons and kernel based metrics like centered kernel alignment and representational similarity analysis. The new methods are demonstrated with deep neural networks trained on ImageNet-1k comparing them to each other and a small subset of the Natural Scenes Dataset. The Bayesian comparisons are correlated to but distinct from existing metrics. Evaluations vary slightly less across random image samples and yield informative results with full uncertainty information. Thus the proposed Bayesian metrics nicely extend our toolkit for comparing representations.
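
The closed-form prior predictive distribution mentioned above follows from a standard Gaussian calculation, sketched here. The symbols are assumptions introduced for illustration and are not taken from the paper: $\Phi \in \mathbb{R}^{n \times d}$ is the representation of $n$ stimuli, $w$ the read-out weights, $\sigma_w^2$ the prior variance, and $\sigma_\varepsilon^2$ the noise variance.

```latex
\[
y = \Phi w + \varepsilon, \qquad
w \sim \mathcal{N}\!\left(0,\ \sigma_w^2 I_d\right), \qquad
\varepsilon \sim \mathcal{N}\!\left(0,\ \sigma_\varepsilon^2 I_n\right)
\]
\[
\Longrightarrow \quad
y \sim \mathcal{N}\!\left(0,\ \sigma_w^2 K + \sigma_\varepsilon^2 I_n\right),
\qquad K = \Phi \Phi^{\top}.
\]
```

Because $y$ is a linear function of jointly Gaussian variables, its covariance is $\sigma_w^2 \Phi \Phi^{\top} + \sigma_\varepsilon^2 I_n$, which is why the distribution depends on the representation only through the linear kernel matrix $K$.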

Article 1586

Title@2025-05-25 (7): Paying Alignment Tax with Contrastive Learning

Title: Paying Alignment Tax with Contrastive Learning

Steuern mit kontraproduktivem Lernen ausgleichen

与反向学习支付一致税 2505.19327v1

Authors: Buse Sibel Korkmaz, Rahul Nair, Elizabeth M. Daly, Antonio del Rio Chanona

Current debiasing approaches often result in a degradation in model capabilities such as factual accuracy and knowledge retention. Through systematic evaluation across multiple benchmarks, we demonstrate that existing debiasing methods face fundamental trade-offs, particularly in smaller models, leading to reduced truthfulness, knowledge loss, or unintelligible outputs. To address these limitations, we propose a contrastive learning framework that learns through carefully constructed positive and negative examples. Our approach introduces contrast computation and dynamic loss scaling to balance bias mitigation with faithfulness preservation. Experimental results across multiple model scales demonstrate that our method achieves substantial improvements in both toxicity reduction and faithfulness preservation. Most importantly, we show that our framework is the first to consistently improve both metrics simultaneously, avoiding the capability degradation characteristic of existing approaches. These results suggest that explicit modeling of both positive and negative examples through contrastive learning could be a promising direction for reducing the alignment tax in language model debiasing.

Article 1587

Title@2025-05-25 (7): An Adversarial Analysis of Thompson Sampling for Full-information Online Learning: from Finite to Infinite Action Spaces

Title: An Adversarial Analysis of Thompson Sampling for Full-information Online Learning: from Finite to Infinite Action Spaces

Eine Adversarial Analyse von Thompson Sampling für Full-Information Online-Lernen: von Finite zu Unendlichen Aktionsräumen

对Thompson网上全面信息学习抽样分析:从有限到无限行动空间 2502.14790v4

Authors: Alexander Terenin, Jeffrey Negrea

We develop a form of Thompson sampling for online learning under full feedback - also known as prediction with expert advice - where the learner’s prior is defined over the space of an adversary’s future actions, rather than the space of experts. We show that regret decomposes into the regret the learner expected a priori, plus a prior-robustness-type term we call excess regret. In the classical finite-expert setting, this recovers optimal rates. As an initial step towards practical online learning in settings with a potentially uncountably infinite number of experts, we show that Thompson sampling over the $d$-dimensional unit cube, using a certain Gaussian process prior widely used in the Bayesian optimization literature, has a $\mathcal{O}\Big(\beta\sqrt{Td\log(1+\sqrt{d}\frac{\lambda}{\beta})}\Big)$ rate against a $\beta$-bounded $\lambda$-Lipschitz adversary.

Article 1588

Title@2025-05-25 (7): Regress, Don’t Guess – A Regression-like Loss on Number Tokens for Language Models

Title: Regress, Don’t Guess – A Regression-like Loss on Number Tokens for Language Models

Regress, nicht raten – Ein Rückschritt-ähnlicher Verlust an Zahlenzeichen für Sprachmodelle

Regress, don’t guess - 语言模型数字调的回归式损失 2411.02083v2

Authors: Jonas Zausinger, Lars Pennig, Anamarija Kozina, Sean Sdahl, Julian Sikora, Adrian Dendorfer, Timofey Kuznetsov, Mohamad Hagog, Nina Wiedemann, Kacper Chlodny, Vincent Limbach, Anna Ketteler, Thorben Prein, Vishwa Mohan Singh, Michael Morris Danziger, Jannis Born

While language models have exceptional capabilities at text generation, they lack a natural inductive bias for emitting numbers and thus struggle in tasks involving quantitative reasoning, especially arithmetic. One fundamental limitation is the nature of the Cross Entropy loss, which assumes a nominal scale and thus cannot convey proximity between generated number tokens. In response, we here present a regression-like loss that operates purely on token level. Our proposed Number Token Loss (NTL) comes in two flavors and minimizes either the Lp norm or the Wasserstein distance between the numerical values of the real and predicted number tokens. NTL can easily be added to any language model and extend the Cross Entropy objective during training without runtime overhead. We evaluate the proposed scheme on various mathematical datasets and find that it consistently improves performance in math-related tasks. In a direct comparison on a regression task, we find that NTL can match the performance of a regression head, despite operating on token level. Finally, we scale NTL up to 3B parameter models and observe improved performance, demonstrating its potential for seamless integration into LLMs. We hope that this work can inspire LLM developers to improve their pretraining objectives. The code is available via: https://tum-ai.github.io/number-token-loss/
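
To make the two flavors concrete, here is a minimal NumPy sketch, assuming a vocabulary restricted to the ten digit tokens and a single target position. The MSE variant compares the expected numeric value under the predicted distribution with the true value; the Wasserstein variant uses the fact that the Wasserstein-1 distance to a point mass at the true value is the expected absolute deviation. Function names and the toy logits are assumptions for illustration; the authors' code is at the URL above.

```python
import numpy as np

def softmax(logits):
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def cross_entropy(number_logits, true_index):
    """Standard CE: only the probability of the correct token matters."""
    return float(-np.log(softmax(number_logits)[true_index]))

def ntl_mse(number_logits, token_values, true_value):
    """Squared error between the expected token value and the true value."""
    probs = softmax(number_logits)
    expected_value = float(probs @ token_values)
    return (expected_value - true_value) ** 2

def ntl_wasserstein(number_logits, token_values, true_value):
    """Wasserstein-1 distance between the prediction and a point mass at true_value."""
    probs = softmax(number_logits)
    return float(probs @ np.abs(token_values - true_value))

token_values = np.arange(10, dtype=float)   # digit tokens "0".."9"
true_digit = 7

near_miss = np.full(10, -10.0)
near_miss[6] = 10.0                          # confidently predicts 6
far_miss = np.full(10, -10.0)
far_miss[1] = 10.0                           # confidently predicts 1

ce_near = cross_entropy(near_miss, true_digit)
ce_far = cross_entropy(far_miss, true_digit)
mse_near = ntl_mse(near_miss, token_values, true_digit)
mse_far = ntl_mse(far_miss, token_values, true_digit)
was_near = ntl_wasserstein(near_miss, token_values, true_digit)
was_far = ntl_wasserstein(far_miss, token_values, true_digit)
```

Cross entropy assigns the same loss to both wrong predictions, whereas both number-token variants penalize predicting 1 far more than predicting 6 when the target is 7. In training, such a term would be added to the cross-entropy objective with a weighting factor.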

Article 1589

Title@2025-05-25 (7): PIGPVAE: Physics-Informed Gaussian Process Variational Autoencoders

Title: PIGPVAE: Physics-Informed Gaussian Process Variational Autoencoders

PIGPVAE: Physik-informierte Gauß-Prozessvariationelle Autoencoder

PIGPVAE: 物理化高斯进程变异自动编码器 2505.19320v1

Authors: Michail Spitieris, Massimiliano Ruocco, Abdulmajid Murad, Alessandro Nocente

Recent advances in generative AI offer promising solutions for synthetic data generation but often rely on large datasets for effective training. To address this limitation, we propose a novel generative model that learns from limited data by incorporating physical constraints to enhance performance. Specifically, we extend the VAE architecture by incorporating physical models in the generative process, enabling it to capture underlying dynamics more effectively. While physical models provide valuable insights, they struggle to capture complex temporal dependencies present in real-world data. To bridge this gap, we introduce a discrepancy term to account for unmodeled dynamics, represented within a latent Gaussian Process VAE (GPVAE). Furthermore, we apply regularization to ensure the generated data aligns closely with observed data, enhancing both the diversity and accuracy of the synthetic samples. The proposed method is applied to indoor temperature data, achieving state-of-the-art performance. Additionally, we demonstrate that PIGPVAE can produce realistic samples beyond the observed distribution, highlighting its robustness and usefulness under distribution shifts.

Article 1590

Title@2025-05-25 (7): Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?

Title: Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?

Sind Transformer durch die Verbindung getrennter Kenntnisse in Trainingsdaten in der Lage, Vernunft zu erreichen?

将培训数据方面的单独知识连接起来的变换者是否具有理性? 2501.15857v6

Authors: Yutong Yin, Zhaoran Wang

Humans exhibit remarkable compositional reasoning by integrating knowledge from various sources. For example, if someone learns $B = f(A)$ from one source and $C = g(B)$ from another, they can deduce $C = g(B) = g(f(A))$ even without encountering $ABC$ together, showcasing the generalization ability of human intelligence. In this paper, we introduce a synthetic learning task, “FTCT” (Fragmented at Training, Chained at Testing), to validate the potential of Transformers in replicating this skill and interpret its inner mechanism. In the training phase, data consist of separated knowledge fragments from an overall causal graph. During testing, Transformers must infer complete causal graph traces by integrating these fragments. Our findings demonstrate that few-shot Chain-of-Thought prompting enables Transformers to perform compositional reasoning on FTCT by revealing correct combinations of fragments, even if such combinations were absent in the training data. Furthermore, the emergence of compositional reasoning ability is strongly correlated with the model complexity and training-testing data similarity. We propose, both theoretically and empirically, that Transformers learn an underlying generalizable program from training, enabling effective compositional reasoning during testing.

Article 1591

Title@2025-05-25 (7): Effort-aware Fairness: Incorporating a Philosophy-informed, Human-centered Notion of Effort into Algorithmic Fairness Metrics

Title: Effort-aware Fairness: Incorporating a Philosophy-informed, Human-centered Notion of Effort into Algorithmic Fairness Metrics

Effort-aware Fairness: Aufnahme einer philosophisch-informierten, menschlich-zentrierten Nennung von Effort in algorithmische Fairness-Metriken

努力做到公平:将了解哲学、以人为中心的努力理念纳入到算法公平度量中 2505.19317v1

Authors: Tin Nguyen, Jiannan Xu, Zora Che, Phuong-Anh Nguyen-Le, Rushil Dandamudi, Donald Braman, Furong Huang, Hal Daumé III, Zubin Jelveh

Although popularized AI fairness metrics, e.g., demographic parity, have uncovered bias in AI-assisted decision-making outcomes, they do not consider how much effort one has spent to get to where one is today in the input feature space. However, the notion of effort is important in how Philosophy and humans understand fairness. We propose a philosophy-informed way to conceptualize and evaluate Effort-aware Fairness (EaF) based on the concept of Force, or temporal trajectory of predictive features coupled with inertia. In addition to our theoretical formulation of EaF metrics, our empirical contributions include: 1/ a pre-registered human subjects experiment, which demonstrates that for both stages of the (individual) fairness evaluation process, people consider the temporal trajectory of a predictive feature more than its aggregate value; 2/ pipelines to compute Effort-aware Individual/Group Fairness in the criminal justice and personal finance contexts. Our work may enable AI model auditors to uncover and potentially correct unfair decisions against individuals who spent significant efforts to improve but are still stuck with systemic/early-life disadvantages outside their control.

Article 1592

Title@2025-05-25 (7): Demand Selection for VRP with Emission Quota

Title: Demand Selection for VRP with Emission Quota

Auswahl der Nachfrage nach VRP mit Emissionsquoten

具有排放配额的VRP需求选择 2505.19315v1

Authors: Farid Najar, Dominique Barth, Yann Strozecki

Combinatorial optimization (CO) problems are traditionally addressed using Operations Research (OR) methods, including metaheuristics. In this study, we introduce a demand selection problem for the Vehicle Routing Problem (VRP) with an emission quota, referred to as QVRP. The objective is to minimize the number of omitted deliveries while respecting the pollution quota. We focus on the demand selection part, called Maximum Feasible Vehicle Assignment (MFVA), while the construction of a routing for the VRP instance is solved using classical OR methods. We propose several methods for selecting the packages to omit, both from machine learning (ML) and OR. Our results show that, in this static problem setting, classical OR-based methods consistently outperform ML-based approaches.

Article 1593

Title@2025-05-25 (7): Concept Reachability in Diffusion Models: Beyond Dataset Constraints

Title: Concept Reachability in Diffusion Models: Beyond Dataset Constraints

Konzept-Erreichbarkeit in Diffusions-Modellen: Jenseits von Datensatzbeschränkungen

传播模型中可达到的概念:超越数据集的制约 2505.19313v1

Authors: Marta Aparicio Rodriguez, Xenia Miscouridou, Anastasia Borovykh

Despite significant advances in the quality and complexity of the generations in text-to-image models, prompting does not always lead to the desired outputs. Controlling model behaviour by directly steering intermediate model activations has emerged as a viable alternative, allowing one to reach concepts in latent space that may otherwise remain inaccessible by prompting. In this work, we introduce a set of experiments to deepen our understanding of concept reachability. We design a training data setup with three key obstacles: scarcity of concepts, underspecification of concepts in the captions, and data biases with tied concepts. Our results show: (i) concept reachability in latent space exhibits a distinct phase transition, with only a small number of samples being sufficient to enable reachability, (ii) where in the latent space the intervention is performed critically impacts reachability, showing that certain concepts are reachable only at certain stages of transformation, and (iii) while prompting ability rapidly diminishes with a decrease in quality of the dataset, concepts often remain reliably reachable through steering. Model providers can leverage this to bypass costly retraining and dataset curation and instead innovate with user-facing control mechanisms.

Article 1594

Title@2025-05-25 (7): Stochastic Hessian Fittings with Lie Groups

Title: Stochastic Hessian Fittings with Lie Groups

Stochastische hessische Beschläge mit Lie Groups

配有谎言组的假体装配机 2402.11858v5

Authors: Xi-Lin Li

This report investigates the fitting of Hessian or its inverse for stochastic optimizations using a Hessian fitting criterion derived from the preconditioned stochastic gradient descent (PSGD) method. This criterion is closely related to many widely used second-order and adaptive gradient optimization methods, including BFGS, the Gauss-Newton algorithm, natural gradient descent, and AdaGrad. Our analyses reveal the efficiency and reliability differences of a broad range of preconditioner fitting methods, ranging from closed-form to iterative approaches, using Hessian-vector products or stochastic gradients only, with Hessian fittings across various geometric settings (the Euclidean space, the manifold of symmetric positive definite (SPD) matrices and a variety of Lie groups). The most intriguing finding is that the Hessian fitting problem is strongly convex under mild conditions in certain general Lie groups. This result turns the Hessian fitting into a well-behaved Lie group optimization problem and facilitates the designs of highly efficient and elegant Lie group sparse preconditioner fitting methods for large-scale stochastic optimizations.
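
As a small numerical illustration, the sketch below assumes the fitting criterion takes the form commonly stated in the PSGD literature, c(P) = E[hᵀ P h + vᵀ P⁻¹ v] with h = H v; the abstract itself does not spell the criterion out, so this form is an assumption here. For v ~ N(0, I) the expectation reduces to tr(P H²) + tr(P⁻¹), which for a symmetric positive definite H is minimized at P = H⁻¹ with value 2 tr(H). The matrices and names are illustrative only.

```python
import numpy as np

def fitting_criterion(precond, hessian):
    """tr(P H^2) + tr(P^-1), i.e. E[h^T P h + v^T P^-1 v] for v ~ N(0, I), h = H v."""
    return float(np.trace(precond @ hessian @ hessian)
                 + np.trace(np.linalg.inv(precond)))

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4))
hessian = a @ a.T + 4.0 * np.eye(4)       # a symmetric positive definite "Hessian"

optimum = np.linalg.inv(hessian)           # candidate minimizer P = H^-1
best = fitting_criterion(optimum, hessian)

# Compare against nearby SPD preconditioners of the form B P B^T.
perturbed_values = []
for _ in range(5):
    b = np.eye(4) + 0.1 * rng.normal(size=(4, 4))
    candidate = b @ optimum @ b.T
    perturbed_values.append(fitting_criterion(candidate, hessian))
```

Every perturbed preconditioner yields a larger criterion value than `best`, consistent with the criterion being convex in P on the SPD cone in this simple Euclidean setting. The Lie-group parameterizations studied in the report are not reproduced here.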

Article 1595

Title@2025-05-25 (7): Fractional-Boundary-Regularized Deep Galerkin Method for Variational Inequalities in Mixed Optimal Stopping and Control

Title: Fractional-Boundary-Regularized Deep Galerkin Method for Variational Inequalities in Mixed Optimal Stopping and Control

Fraktional-Boundary-Regularized Deep Galerkin-Methode für unterschiedliche Ungleichheiten in gemischten Optimalen Stoppen und Steuern

用于混合最佳制止和控制中差异性不平等的分数-界分- 常规深加热法 2505.19309v1

Authors: Yun Zhao, Harry Zheng

Mixed optimal stopping and stochastic control problems define variational inequalities with non-linear Hamilton-Jacobi-Bellman (HJB) operators, whose numerical solution is notoriously difficult and lacks reliable benchmarks. We first use the dual approach to transform the operator into a linear one, and then introduce a Fractional-Boundary-Regularized Deep Galerkin Method (FBR-DGM) that augments the classical $L^2$ loss with Sobolev-Slobodeckij norms on the parabolic boundary, enforcing regularity and yielding consistent improvements in the network approximation and its derivatives. The improved accuracy allows the network to be converted back to the original solution using the dual transform. The self-consistency and stability of the network can be tested by checking the primal-dual relationship among optimal value, optimal wealth, and optimal control, offering innovative benchmarks in the absence of analytical solutions.
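
For readers unfamiliar with the boundary term, the standard Sobolev-Slobodeckij norm of fractional order $s \in (0,1)$ on a $d$-dimensional set $\Gamma$ is recalled below, followed by a schematic loss of the kind the abstract describes. The symbols $\mathcal{A}$ (the linearized operator), $g$ (boundary data), $\lambda$ (weight), $\Omega$ and $\partial_p \Omega$ (domain and parabolic boundary) are notation assumed here for illustration; the paper's exact loss may differ.

```latex
% Sobolev-Slobodeckij norm of order s in (0,1), with p = 2
\|u\|_{W^{s,2}(\Gamma)}^{2}
  = \|u\|_{L^{2}(\Gamma)}^{2}
  + \int_{\Gamma}\!\int_{\Gamma}
      \frac{|u(x) - u(y)|^{2}}{|x - y|^{d + 2s}} \, dx \, dy

% Schematic boundary-regularized Deep Galerkin loss (assumed form)
\mathcal{L}(\theta)
  = \|\mathcal{A} u_{\theta}\|_{L^{2}(\Omega)}^{2}
  + \lambda \, \|u_{\theta} - g\|_{W^{s,2}(\partial_{p}\Omega)}^{2}
```

The double integral penalizes differences between nearby boundary points more strongly than a plain $L^2$ boundary term, which is the sense in which the norm enforces regularity.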

Article 1596

Title@2025-05-25 (7): From Single Images to Motion Policies via Video-Generation Environment Representations

Title: From Single Images to Motion Policies via Video-Generation Environment Representations

Von Einzelbildern zu Motion Policies über Video-Generation Umweltvertretungen

从单一图像到通过视频环境代表从单一图像到运动政策 2505.19306v1

Authors: Weiming Zhi, Ziyong Ma, Tianyi Zhang, Matthew Johnson-Roberson

Autonomous robots typically need to construct representations of their surroundings and adapt their motions to the geometry of their environment. Here, we tackle the problem of constructing a policy model for collision-free motion generation, consistent with the environment, from a single input RGB image. Extracting 3D structures from a single image often involves monocular depth estimation. Developments in depth estimation have given rise to large pre-trained models such as DepthAnything. However, using outputs of these models for downstream motion generation is challenging due to frustum-shaped errors that arise. Instead, we propose a framework known as Video-Generation Environment Representation (VGER), which leverages the advances of large-scale video generation models to generate a moving camera video conditioned on the input image. Frames of this video, which form a multiview dataset, are then input into a pre-trained 3D foundation model to produce a dense point cloud. We then introduce a multi-scale noise approach to train an implicit representation of the environment structure and build a motion generation model that complies with the geometry of the representation. We extensively evaluate VGER over a diverse set of indoor and outdoor environments. We demonstrate its ability to produce smooth motions that account for the captured geometry of a scene, all from a single RGB input image.

Article 1597

Title@2025-05-25 (7): Time Series Embedding Methods for Classification Tasks: A Review

Title: Time Series Embedding Methods for Classification Tasks: A Review

Zeitreihen Einbetten von Methoden für die Klassifizierung Aufgaben: Eine Überprüfung

分类任务所含方法:审查 2501.13392v2

Authors: Habib Irani, Yasamin Ghahremani, Arshia Kermani, Vangelis Metsis

Time series analysis has become crucial in various fields, from engineering and finance to healthcare and social sciences. Due to their multidimensional nature, time series often need to be embedded into a fixed-dimensional feature space to enable processing with various machine learning algorithms. In this paper, we present a comprehensive review and quantitative evaluation of time series embedding methods for effective representations in machine learning and deep learning models. We introduce a taxonomy of embedding techniques, categorizing them based on their theoretical foundations and application contexts. Our work provides a quantitative evaluation of representative methods from each category by assessing their performance on downstream classification tasks across diverse real-world datasets. Our experimental results demonstrate that the performance of embedding methods varies significantly depending on the dataset and classification algorithm used, highlighting the importance of careful model selection and extensive experimentation for specific applications. To facilitate further research and practical applications, we provide an open-source code repository implementing these embedding methods. This study contributes to the field by offering a systematic comparison of time series embedding techniques, guiding practitioners in selecting appropriate methods for their specific applications, and providing a foundation for future advancements in time series analysis.

Article 1598

Title@2025-05-25 (7): LLM-Based Emulation of the Radio Resource Control Layer: Towards AI-Native RAN Protocols

Title: LLM-Based Emulation of the Radio Resource Control Layer: Towards AI-Native RAN Protocols

LLM-basierte Emulation der Funkressourcenkontrollschicht: Auf dem Weg zu KI-Native RAN-Protokollen

基于LLM的无线电资源控制层模拟模拟无线电资源控制层:迈向AI-NTRAN议定书 2505.16821v2

Authors: Ziming Liu, Bryan Liu, Alvaro Valcarce, Xiaoli Chu

Integrating large AI models (LAMs) into 6G mobile networks promises to redefine protocol design and control-plane intelligence by enabling autonomous, cognitive network operations. While industry concepts, such as ETSI’s Experiential Networked Intelligence (ENI), envision LAM-driven agents for adaptive network slicing and intent-based management, practical implementations still face challenges in protocol literacy and real-world deployment. This paper presents an end-to-end demonstration of a LAM that generates standards-compliant, ASN.1-encoded Radio Resource Control (RRC) messages as part of control-plane procedures inside a gNB. We treat RRC messaging as a domain-specific language and fine-tune a decoder-only transformer model (LLaMA class) using parameter-efficient Low-Rank Adaptation (LoRA) on RRC messages linearized to retain their ASN.1 syntactic structure before standard byte-pair encoding tokenization. This enables combinatorial generalization over RRC protocol states while minimizing training overhead. On 30k field-test request-response pairs, our 8B model achieves a median cosine similarity of 0.97 with ground-truth messages on an edge GPU – a 61% relative gain over a zero-shot LLaMA-3 8B baseline – indicating substantially improved structural and semantic RRC fidelity. Overall, our results show that LAMs, when augmented with Radio Access Network (RAN)-specific reasoning, can directly orchestrate control-plane procedures, representing a stepping stone toward the AI-native air-interface paradigm. Beyond RRC emulation, this work lays the groundwork for future AI-native wireless standards.

Article 1599

Title@2025-05-25 (7): On the status of current quantum machine learning software

Title: On the status of current quantum machine learning software

Zum Status der aktuellen Quantenmaschinen-Lernsoftware

关于当前量子机器学习软件现状 2503.08962v2

Authors: Manish K. Gupta, Tomasz Rybotycki, Piotr Gawron

Recent advances in the implementation of noisy intermediate-scale quantum (NISQ) devices allow us to study their application to real-life computational problems. However, hardware challenges are not the only ones that hinder our quantum computation capabilities. Software limitations are the other, less explored side of this coin. Using satellite image segmentation as an example task, we investigated how difficult it is to run a hybrid quantum-classical model on a real, publicly available quantum device. We also analyzed the costs of such an endeavor and the change in the quality of the model.

Article 1600

Title@2025-05-25 (7): 100-LongBench: Are de facto Long-Context Benchmarks Literally Evaluating Long-Context Ability?

Title: 100-LongBench: Are de facto Long-Context Benchmarks Literally Evaluating Long-Context Ability?

100-LongBench: Sind de facto Long-Context-Benchmarks wortwörtlich die Lang-Context-Fähigkeit zu bewerten?

100-LongBench:事实上的长文本基准是否实际评价长文本能力? 2505.19293v1

Authors: Wang Yang, Hongye Jin, Shaochen Zhong, Song Jiang, Qifan Wang, Vipin Chaudhary, Xiaotian Han

Long-context capability is considered one of the most important abilities of LLMs, as a truly long context-capable LLM enables users to effortlessly process many originally exhausting tasks – e.g., digesting a long-form document to find answers vs. directly asking an LLM about it. However, existing real-task-based long-context evaluation benchmarks have two major shortcomings. First, benchmarks like LongBench often do not provide proper metrics to separate long-context performance from the model’s baseline ability, making cross-model comparison unclear. Second, such benchmarks are usually constructed with fixed input lengths, which limits their applicability across different models and fails to reveal when a model begins to break down. To address these issues, we introduce a length-controllable long-context benchmark and a novel metric that disentangles baseline knowledge from true long-context capabilities. Experiments demonstrate the superiority of our approach in effectively evaluating LLMs.

Article 1601

Title@2025-05-25 (7): Hypercube-RAG: Hypercube-Based Retrieval-Augmented Generation for In-domain Scientific Question-Answering

Title: Hypercube-RAG: Hypercube-Based Retrieval-Augmented Generation for In-domain Scientific Question-Answering

Hypercube-RAG: Hypercube-based Retrieval-Augmented Generation for In-domain Scientific Question-Answering

Hypercube-RAG: 内地科学问题解答的超立方体回收回溯性养代 2505.19288v1

Authors: Jimeng Shi, Sizhe Zhou, Bowen Jin, Wei Hu, Shaowen Wang, Giri Narasimhan, Jiawei Han

Large language models (LLMs) often need to incorporate external knowledge to solve theme-specific problems. Retrieval-augmented generation (RAG), which empowers LLMs to generate more qualified responses with retrieved external data and knowledge, has shown its high promise. However, traditional semantic similarity-based RAGs struggle to return concise yet highly relevant information for domain knowledge-intensive tasks, such as scientific question-answering (QA). Built on a multi-dimensional (cube) structure called Hypercube, which can index documents in an application-driven, human-defined, multi-dimensional space, we introduce the Hypercube-RAG, a novel RAG framework for precise and efficient retrieval. Given a query, Hypercube-RAG first decomposes it based on its entities and topics and then retrieves relevant documents from cubes by aligning these decomposed components with hypercube dimensions. Experiments on three in-domain scientific QA datasets demonstrate that our method improves accuracy by 3.7% and boosts retrieval efficiency by 81.2%, measured as relative gains over the strongest RAG baseline. More importantly, our Hypercube-RAG inherently offers explainability by revealing the underlying predefined hypercube dimensions used for retrieval. The code and data sets are available at https://github.com/JimengShi/Hypercube-RAG.
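
The retrieval step can be pictured with a toy index. The sketch below is a hypothetical illustration only: the class `HypercubeIndex`, the dimension names (`location`, `event`, `theme`) and the hand-written query decomposition are all assumptions, and the real system at the repository above presumably extracts entities and topics automatically. Documents are filed under labelled values along each dimension, and a decomposed query is matched dimension by dimension.

```python
from collections import defaultdict

class HypercubeIndex:
    """Toy multi-dimensional index: dimension -> value -> set of document ids."""

    def __init__(self, dimensions):
        self.dimensions = tuple(dimensions)
        self.cells = {dim: defaultdict(set) for dim in self.dimensions}

    def add(self, doc_id, labels):
        """File a document under its labelled values for each dimension."""
        for dim, values in labels.items():
            for value in values:
                self.cells[dim][value.lower()].add(doc_id)

    def retrieve(self, query_components):
        """Rank documents by how many query dimensions they match."""
        hits = defaultdict(int)
        for dim, values in query_components.items():
            matched = set()
            for value in values:
                matched |= self.cells[dim].get(value.lower(), set())
            for doc_id in matched:
                hits[doc_id] += 1
        # Most matched dimensions first; ties broken by document id.
        return sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))

index = HypercubeIndex(["location", "event", "theme"])
index.add("doc1", {"location": ["Florida"], "event": ["hurricane"], "theme": ["flooding"]})
index.add("doc2", {"location": ["Florida"], "event": ["drought"]})
index.add("doc3", {"location": ["Texas"], "event": ["hurricane"]})

# Hand-decomposed query: "hurricane impacts in Florida"
ranked = index.retrieve({"location": ["florida"], "event": ["hurricane"]})
```

Here `ranked` places `doc1` first because it matches both query dimensions. Since every hit is tied to a named dimension and value, the reason a document was returned can be read off directly, which is the kind of explainability the abstract refers to.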

Article 1602

Title@2025-05-25 (7): Provably Overwhelming Transformer Models with Designed Inputs

Title: Provably Overwhelming Transformer Models with Designed Inputs

Wahrscheinlich überwältigende Transformer-Modelle mit designten Eingängen

具有设计投入的、可预见地压得压得压倒的变压器模型 2502.06038v2

Authors: Lev Stambler, Seyed Sajjad Nezhadi, Matthew Coudron

We develop an algorithm which, given a trained transformer model $\mathcal{M}$ as input, as well as a string of tokens $s$ of length $n_{fix}$ and an integer $n_{free}$, can generate a mathematical proof that $\mathcal{M}$ is “overwhelmed” by $s$, in time and space $\widetilde{O}(n_{fix}^2 + n_{free}^3)$. We say that $\mathcal{M}$ is “overwhelmed” by $s$ when the output of the model evaluated on this string plus any additional string $t$, $\mathcal{M}(s + t)$, is completely insensitive to the value of the string $t$ whenever length($t$) $\leq n_{free}$. Along the way, we prove a particularly strong worst-case form of “over-squashing”, which we use to bound the model’s behavior. Our technique uses computer-aided proofs to establish this type of operationally relevant guarantee about transformer models. We empirically test our algorithm on a single layer transformer complete with an attention head, layer-norm, MLP/ReLU layers, and RoPE positional encoding. We believe that this work is a stepping stone towards the difficult task of obtaining useful guarantees for trained transformer models.

Article 1603

Title@2025-05-25 (7): A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning

Title: A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning

Eine Momentaufnahme des Einflusses: Ein lokales Daten-Attributions-Framework für Online-Verstärkungs-Lernen

《影响概览:在线强化学习地方数据归属框架》 2505.19281v1

Authors: Yuzheng Hu, Fan Wu, Haotian Ye, David Forsyth, James Zou, Nan Jiang, Jiaqi W. Ma, Han Zhao

Online reinforcement learning (RL) excels in complex, safety-critical domains, yet it faces challenges such as sample inefficiency, training instability, and a lack of interpretability. Data attribution offers a principled way to trace model behavior back to individual training samples. However, in online RL, each training sample not only drives policy updates but also influences future data collection, violating the fixed dataset assumption in existing attribution methods. In this paper, we initiate the study of data attribution for online RL, focusing on the widely used Proximal Policy Optimization (PPO) algorithm. We start by establishing a local attribution framework, interpreting model checkpoints with respect to the records in the recent training buffer. We design two target functions, capturing agent action and cumulative return respectively, and measure each record’s contribution through gradient similarity between its training loss and these targets. We demonstrate the power of this framework through three concrete applications: diagnosis of learning, temporal analysis of behavior formation, and targeted intervention during training. Leveraging this framework, we further propose an algorithm, iterative influence-based filtering (IIF), for online RL training that iteratively performs experience filtering to refine policy updates. From standard RL benchmarks (classic control, navigation, locomotion) to RLHF for large language models, IIF reduces sample complexity, speeds up training, and achieves higher returns. Overall, these results advance interpretability, efficiency, and effectiveness of online RL.
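
The gradient-similarity idea can be sketched as follows, assuming the per-record loss gradients and the target gradient have already been computed and flattened into vectors. The score used here is the first-order change in the target caused by one gradient-descent step on a record's loss, which is proportional to the negative inner product of the two gradients; the paper's exact similarity measure, sign convention and filtering rule may differ, and the function names and numbers are illustrative.

```python
import numpy as np

def influence_scores(record_loss_grads, target_grad):
    """First-order effect on the target of a descent step on each record's loss.

    A step theta <- theta - lr * grad_loss_i changes the target by roughly
    -lr * <grad_target, grad_loss_i>, so positive scores mark helpful records.
    """
    return -(record_loss_grads @ target_grad)

def filter_records(record_loss_grads, target_grad):
    """Keep only records whose update is predicted to increase the target."""
    scores = influence_scores(record_loss_grads, target_grad)
    return np.flatnonzero(scores > 0.0), scores

# Toy buffer: three records, two parameters.
target_grad = np.array([1.0, 0.0])        # gradient of, e.g., expected return
record_loss_grads = np.array([
    [-1.0, 0.2],    # descent step moves along +target_grad: helpful
    [0.5, 1.0],     # descent step moves against target_grad: harmful
    [-0.1, -3.0],   # mostly orthogonal, slightly helpful
])

kept, scores = filter_records(record_loss_grads, target_grad)
```

In an iterative scheme of the kind the abstract describes, such filtering would be repeated each round on the current buffer before the policy update.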

Article 1604

Title@2025-05-25 (7): Optimal Transport Barycenter via Nonconvex-Concave Minimax Optimization

Title: Optimal Transport Barycenter via Nonconvex-Concave Minimax Optimization

Optimaler Transport Barycenter über Nonconvex-Concave Minimax-Optimierung

通过非 connconvex- concave Minimax 优化化优化运输博利中心 2501.14635v2

Authors: Kaheon Kim, Rentian Yao, Changbo Zhu, Xiaohui Chen

The optimal transport barycenter (a.k.a. Wasserstein barycenter) is a fundamental notion of averaging that extends from the Euclidean space to the Wasserstein space of probability distributions. Computation of the unregularized barycenter for discretized probability distributions on point clouds is a challenging task when the domain dimension $d > 1$. Most practical algorithms for approximating the barycenter problem are based on entropic regularization. In this paper, we introduce a nearly linear time $O(m \log{m})$ and linear space complexity $O(m)$ primal-dual algorithm, the Wasserstein-Descent $\dot{\mathbb{H}}^1$-Ascent (WDHA) algorithm, for computing the exact barycenter when the input probability density functions are discretized on an $m$-point grid. The key success of the WDHA algorithm hinges on alternating between two different yet closely related Wasserstein and Sobolev optimization geometries for the primal barycenter and dual Kantorovich potential subproblems. Under reasonable assumptions, we establish the convergence rate and iteration complexity of WDHA to its stationary point when the step size is appropriately chosen. Superior computational efficacy, scalability, and accuracy over the existing Sinkhorn-type algorithms are demonstrated on high-resolution (e.g., $1024 \times 1024$ images) 2D synthetic and real data.
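
To illustrate the notion of averaging involved, the sketch below computes the exact Wasserstein-2 barycenter in the one-dimensional special case, where it reduces to averaging sorted samples (equivalently, quantile functions). This is a swapped-in textbook case, not the WDHA algorithm: the closed form does not carry over to the $d > 1$ setting the paper addresses. It assumes every input has the same number of equally weighted samples; names are illustrative.

```python
import numpy as np

def barycenter_1d(samples_list, weights=None):
    """Exact W2 barycenter of 1D empirical measures with equal sample counts."""
    sorted_samples = np.stack(
        [np.sort(np.asarray(samples, dtype=float)) for samples in samples_list]
    )
    if weights is None:
        weights = np.full(len(samples_list), 1.0 / len(samples_list))
    # Weighted average of order statistics, i.e. of the quantile functions.
    return np.asarray(weights, dtype=float) @ sorted_samples

first = [0.0, 1.0, 2.0]
second = [10.0, 12.0, 11.0]
bary = barycenter_1d([first, second])
```

The barycenter of the two inputs is a measure supported on 5, 6 and 7: it keeps the shape of the inputs and sits halfway between them, whereas a plain mixture of the two measures would be bimodal.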

Article 1605

Title@2025-05-25 (7): Achieving $\tilde{\mathcal{O}}(1/N)$ Optimality Gap in Restless Bandits through Gaussian Approximation

Title: Achieving $\tilde{\mathcal{O}}(1/N)$ Optimality Gap in Restless Bandits through Gaussian Approximation

Erreichen von $\tilde{\mathcal{O}}(1/N)$ Optimality Gap in ruhelosen Banditen durch Gaußsche Annäherung

通过高斯近似度实现无休止强盗的最佳差距 $\tilde{\mathcal{O}}(1/N)$ 2410.15003v2

Authors: Chen Yan, Weina Wang, Lei Ying

We study the finite-horizon Restless Multi-Armed Bandit (RMAB) problem with $N$ homogeneous arms. Prior work has shown that when an RMAB satisfies a non-degeneracy condition, Linear-Programming-based (LP-based) policies derived from the fluid approximation, which captures the mean dynamics of the system, achieve an exponentially small optimality gap. However, it is common for RMABs to be degenerate, in which case LP-based policies can result in a $\Theta(1/\sqrt{N})$ optimality gap per arm. In this paper, we propose a novel Stochastic-Programming-based (SP-based) policy that, under a uniqueness assumption, achieves an $\tilde{\mathcal{O}}(1/N)$ optimality gap for degenerate RMABs. Our approach is based on the construction of a Gaussian stochastic system that captures not only the mean but also the variance of the RMAB dynamics, resulting in a more accurate approximation than the fluid approximation. We then solve a stochastic program for this system to obtain our policy. This is the first result to establish an $\tilde{\mathcal{O}}(1/N)$ optimality gap for degenerate RMABs.

Article 1606

Title@2025-05-25 (7): Cellular Traffic Prediction via Byzantine-robust Asynchronous Federated Learning

Title: Cellular Traffic Prediction via Byzantine-robust Asynchronous Federated Learning

Zelluläre Verkehrsvorhersage über byzantinisches-robustes Asynchrones Federated Learning

通过Byzantine-Robust 亚同步联谊会学习的细胞交通预测 2505.19263v1

Authors: Hui Ma, Kai Yang, Yang Jiao

Network traffic prediction plays a crucial role in intelligent network operation. Traditional prediction methods often rely on centralized training, necessitating the transfer of vast amounts of traffic data to a central server. This approach can lead to latency and privacy concerns. To address these issues, federated learning integrated with differential privacy has emerged as a solution to improve data privacy and model robustness in distributed settings. Nonetheless, existing federated learning protocols are vulnerable to Byzantine attacks, which may significantly compromise model robustness. Developing a robust and privacy-preserving prediction model in the presence of Byzantine clients remains a significant challenge. To this end, we propose an asynchronous differential federated learning framework based on distributionally robust optimization. The proposed framework utilizes multiple clients to train the prediction model collaboratively with local differential privacy. In addition, regularization techniques are employed to further improve the Byzantine robustness of the models. We have conducted extensive experiments on three real-world datasets, and the results show that our proposed distributed algorithm can achieve superior performance over existing methods.

Article 1607

Title@2025-05-25 (7): Towards a Spatiotemporal Fusion Approach to Precipitation Nowcasting

Title: Towards a Spatiotemporal Fusion Approach to Precipitation Nowcasting

Auf dem Weg zu einem Spatiotemporalen Fusionsansatz zur Niederschlagung von Nowcasting

迈向对降水即时播送采取相向时间融合办法 2505.19258v1

Authors: Felipe Curcio, Pedro Castro, Augusto Fonseca, Rafaela Castro, Raquel Franco, Eduardo Ogasawara, Victor Stepanenko, Fabio Porto, Mariza Ferro, Eduardo Bezerra

With the increasing availability of meteorological data from various sensors, numerical models and reanalysis products, the need for efficient data integration methods has become paramount for improving weather forecasts and hydrometeorological studies. In this work, we propose a data fusion approach for precipitation nowcasting by integrating data from meteorological and rain gauge stations in the Rio de Janeiro metropolitan area with ERA5 reanalysis data and GFS numerical weather prediction. We employ the spatiotemporal deep learning architecture called STConvS2S, leveraging a structured dataset covering a 9 x 11 grid. The study spans from January 2011 to October 2024, and we evaluate the impact of integrating three surface station systems. Among the tested configurations, the fusion-based model achieves an F1-score of 0.2033 for forecasting heavy precipitation events (greater than 25 mm/h) at a one-hour lead time. Additionally, we present an ablation study to assess the contribution of each station network and propose a refined inference strategy for precipitation nowcasting, integrating the GFS numerical weather prediction (NWP) data with in-situ observations.
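
As a reading aid for the reported metric, the following is a minimal sketch of how an F1-score for the binary event "rain rate greater than 25 mm/h" can be computed from paired observations and forecasts. The function name, the sample values, and the per-sample evaluation are illustrative assumptions; the abstract does not describe the authors' evaluation code.

```python
def f1_for_heavy_rain(observed_mm_h, forecast_mm_h, threshold=25.0):
    """F1-score for the binary event 'rate > threshold' (hypothetical helper)."""
    tp = fp = fn = 0
    for obs, fc in zip(observed_mm_h, forecast_mm_h):
        obs_event = obs > threshold
        fc_event = fc > threshold
        if fc_event and obs_event:
            tp += 1          # event forecast and observed
        elif fc_event:
            fp += 1          # false alarm
        elif obs_event:
            fn += 1          # missed event
    if tp == 0:
        return 0.0           # no hits: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up rain rates in mm/h, for illustration only.
observed = [0.0, 30.0, 12.0, 40.0, 26.0, 3.0]
forecast = [0.0, 28.0, 27.0, 10.0, 31.0, 0.0]
score = f1_for_heavy_rain(observed, forecast)
```

Because heavy events are rare, correctly predicted non-events do not enter the score, which is why F1 values for such thresholds can be low even when most time steps are forecast correctly.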

Article 1608

Title@2025-05-25 (7): Learning-Augmented Online Bipartite Fractional Matching

Title: Learning-Augmented Online Bipartite Fractional Matching

Learning-Augmented Online Bipartite Fraktional Matching

学习增强的在线双两派人数配对 2505.19252v1

Authors: Davin Choo, Billy Jin, Yongho Shin

Online bipartite matching is a fundamental problem in online optimization, extensively studied both in its integral and fractional forms due to its theoretical significance and practical applications, such as online advertising and resource allocation. Motivated by recent progress in learning-augmented algorithms, we study online bipartite fractional matching when the algorithm is given advice in the form of a suggested matching in each iteration. We develop algorithms for both the vertex-weighted and unweighted variants that provably dominate the naive “coin flip” strategy of randomly choosing between the advice-following and advice-free algorithms. Moreover, our algorithm for the vertex-weighted setting extends to the AdWords problem under the small bids assumption, yielding a significant improvement over the seminal work of Mahdian, Nazerzadeh, and Saberi (EC 2007, TALG 2012). Complementing our positive results, we establish a hardness bound on the robustness-consistency tradeoff that is attainable by any algorithm. We empirically validate our algorithms through experiments on synthetic and real-world data.

Article 1609

Title@2025-05-25 (7): Empirical Privacy Variance

Title: Empirical Privacy Variance

Empirische Datenschutzvarianz

隐私经验差异 2503.12314v2

Authors: Yuzheng Hu, Fan Wu, Ruicheng Xian, Yuhang Liu, Lydia Zakynthinou, Pritish Kamath, Chiyuan Zhang, David Forsyth

We propose the notion of empirical privacy variance and study it in the context of differentially private fine-tuning of language models. Specifically, we show that models calibrated to the same $(\varepsilon, \delta)$-DP guarantee using DP-SGD with different hyperparameter configurations can exhibit significant variations in empirical privacy, which we quantify through the lens of memorization. We investigate the generality of this phenomenon across multiple dimensions and discuss why it is surprising and relevant. Through regression analysis, we examine how individual and composite hyperparameters influence empirical privacy. The results reveal a no-free-lunch trade-off: existing practices of hyperparameter tuning in DP-SGD, which focus on optimizing utility under a fixed privacy budget, often come at the expense of empirical privacy. To address this, we propose refined heuristics for hyperparameter selection that explicitly account for empirical privacy, showing that they are both precise and practically useful. Finally, we take preliminary steps to understand empirical privacy variance. We propose two hypotheses, identify limitations in existing techniques like privacy auditing, and outline open questions for future research.

Article 1610

Title@2025-05-25 (7): Improving Value Estimation Critically Enhances Vanilla Policy Gradient

Title: Improving Value Estimation Critically Enhances Vanilla Policy Gradient

Verbesserung der Wertschätzung Kritisch verbessert Vanilla Policy Gradient

显著加强香草政策梯度 2505.19247v1

Authors: Tao Wang, Ruipeng Zhang, Sicun Gao

Modern policy gradient algorithms, such as TRPO and PPO, outperform vanilla policy gradient in many RL tasks. Questioning the common belief that enforcing approximate trust regions leads to steady policy improvement in practice, we show that the more critical factor is the enhanced value estimation accuracy from more value update steps in each iteration. To demonstrate, we show that by simply increasing the number of value update steps per iteration, vanilla policy gradient itself can achieve performance comparable to or better than PPO in all the standard continuous control benchmark environments. Importantly, this simple change to vanilla policy gradient is significantly more robust to hyperparameter choices, opening up the possibility that RL algorithms may still become more effective and easier to use.

Article 1611

Title@2025-05-25 (7): To CoT or To Loop? A Formal Comparison Between Chain-of-Thought and Looped Transformers

Title: To CoT or To Loop? A Formal Comparison Between Chain-of-Thought and Looped Transformers

To CoT or To Loop? Ein formaler Vergleich zwischen Ketten-of-Thought und Schleiftransformatoren

尝试链和循环变换器之间的正式比较 2505.19245v1

Authors: Kevin Xu, Issei Sato

Chain-of-Thought (CoT) and Looped Transformers have been shown to empirically improve performance on reasoning tasks and to theoretically enhance expressivity by recursively increasing the number of computational steps. However, their comparative capabilities are still not well understood. In this paper, we provide a formal analysis of their respective strengths and limitations. We show that Looped Transformers can efficiently simulate parallel computations for deterministic tasks, which we formalize as evaluation over directed acyclic graphs. In contrast, CoT with stochastic decoding excels at approximate inference for compositional structures, namely self-reducible problems. These separations suggest the tasks for which depth-driven recursion is more suitable, thereby offering practical cues for choosing between reasoning paradigms.

Article 1612

Title@2025-05-25 (7): ActiveDPO: Active Direct Preference Optimization for Sample-Efficient Alignment

Title: ActiveDPO: Active Direct Preference Optimization for Sample-Efficient Alignment

ActiveDPO: Aktive Direktpräferenzoptimierung für eine stichprobeneffiziente Ausrichtung

主动式DPO:为抽样有效对齐积极直接首选优化 2505.19241v1

Authors: Xiaoqiang Lin, Arun Verma, Zhongxiang Dai, Daniela Rus, See-Kiong Ng, Bryan Kian Hsiang Low

The recent success of using human preferences to align large language models (LLMs) has significantly improved their performance in various downstream tasks like question answering, mathematical reasoning, and code generation. However, achieving effective LLM alignment depends on high-quality human preference datasets. Collecting these datasets requires human preference annotation, which is costly and resource-intensive, necessitating efficient active data selection methods. Existing methods either lack a strong theoretical foundation or depend on restrictive reward function assumptions (e.g., linearity). To this end, we propose an algorithm, ActiveDPO, that uses a theoretically grounded data selection criterion for non-linear reward functions while directly leveraging the LLM itself to parameterize the reward model that is used for active data selection. As a result, ActiveDPO explicitly accounts for the influence of the LLM on data selection, unlike methods that select the data without considering the LLM that is being aligned, thereby leading to more effective and efficient data collection. Extensive experiments show that ActiveDPO outperforms existing methods across various models and datasets.

Article 1613

Title@2025-05-25 (7): CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling

Title: CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling

CLIP-UP: Ein einfaches und effizientes Mixture-of-Experts CLIP Training Rezept mit Sparse Upcycling

CLIP-UP:一个简单、高效的专家混合体 CLIP 与粗垃圾垃圾垃圾垃圾处理有关的培训名额 2502.00965v2

Authors: Xinze Wang, Chen Chen, Yinfei Yang, Hong-You Chen, Bowen Zhang, Aditya Pal, Xiangxin Zhu, Xianzhi Du

Mixture-of-Experts (MoE) models are crucial for scaling model capacity while controlling inference costs. While integrating MoE into multimodal models like CLIP improves performance, training these models is notoriously challenging and expensive. We propose CLIP-Upcycling (CLIP-UP), an efficient alternative training strategy that converts a pre-trained dense CLIP model into a sparse MoE architecture. Through extensive experimentation with various settings and auxiliary losses, we demonstrate that CLIP-UP significantly reduces training complexity and cost. Remarkably, our sparse CLIP B/16 model, trained with CLIP-UP, outperforms its dense counterpart by 7.2% and 6.6% on COCO and Flickr30k text-to-image Recall@1 benchmarks respectively. It even surpasses the larger CLIP L/14 model on this task while using only 30% of the inference FLOPs. We further demonstrate the generalizability of our training recipe across different scales, establishing sparse upcycling as a practical and scalable approach for building efficient, high-performance CLIP models.

Article 1614

Title@2025-05-25 (7): LLLMs: A Data-Driven Survey of Evolving Research on Limitations of Large Language Models

Title: LLLMs: A Data-Driven Survey of Evolving Research on Limitations of Large Language Models

LLLMs: Eine datengestützte Untersuchung der sich entwickelnden Forschung über Grenzen großer Sprachmodelle

LLLMs:关于大语言模式限制的不断发展的研究数据驱动调查 2505.19240v1

Authors: Aida Kostikova, Zhipin Wang, Deidamea Bajri, Ole Pütz, Benjamin Paaßen, Steffen Eger

Large language model (LLM) research has grown rapidly, along with increasing concern about their limitations such as failures in reasoning, hallucinations, and limited multilingual capability. In this survey, we conduct a data-driven, semi-automated review of research on limitations of LLMs (LLLMs) from 2022 to 2024 using a bottom-up approach. From a corpus of 250,000 ACL and arXiv papers, we identify 14,648 relevant papers using keyword filtering, LLM-based classification (validated against expert labels), and topic clustering (via two approaches, HDBSCAN+BERTopic and LlooM). We find that LLM-related research increases over fivefold in ACL and fourfold in arXiv. Since 2022, LLLMs research grows even faster, reaching over 30% of LLM papers by late 2024. Reasoning remains the most studied limitation, followed by generalization, hallucination, bias, and security. The distribution of topics in the ACL dataset stays relatively stable over time, while arXiv shifts toward safety and controllability (with topics like security risks, alignment, hallucinations, knowledge editing), and multimodality between 2022 and 2024. We release a dataset of annotated abstracts and a validated methodology, and offer a quantitative view of trends in LLM limitations research.

Article 1615

Title@2025-05-25 (7): Learning Transformer-based World Models with Contrastive Predictive Coding

Title: Learning Transformer-based World Models with Contrastive Predictive Coding

Transformer-basierte Weltmodelle mit kontradiktivem Predictive Coding lernen

以学习变换器为基础的世界差异预测编码模式 2503.04416v2

Authors: Maxime Burchi, Radu Timofte

The DreamerV3 algorithm recently obtained remarkable performance across diverse environment domains by learning an accurate world model based on Recurrent Neural Networks (RNNs). Following the success of model-based reinforcement learning algorithms and the rapid adoption of the Transformer architecture for its superior training efficiency and favorable scaling properties, recent works such as STORM have proposed replacing RNN-based world models with Transformer-based world models using masked self-attention. However, despite the improved training efficiency of these methods, their impact on performance remains limited compared to the Dreamer algorithm, struggling to learn competitive Transformer-based world models. In this work, we show that the next state prediction objective adopted in previous approaches is insufficient to fully exploit the representation capabilities of Transformers. We propose to extend world model predictions to longer time horizons by introducing TWISTER (Transformer-based World model wIth contraSTivE Representations), a world model using action-conditioned Contrastive Predictive Coding to learn high-level temporal feature representations and improve the agent performance. TWISTER achieves a human-normalized mean score of 162% on the Atari 100k benchmark, setting a new record among state-of-the-art methods that do not employ look-ahead search.

Article 1616

Title@2025-05-25 (7): Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees

Title: Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees

Effiziente Politikoptimierung in robusten, eingeschränkten MDPs mit Iterationskomplexitätsgarantien

在强力约束下,在具有迭接复杂度保障的多用途发展方案中提高政策效率的优化 2505.19238v1

Authors: Sourav Ganguly, Arnob Ghosh, Kishan Panaganti, Adam Wierman

Constrained decision-making is essential for designing safe policies in real-world control systems, yet simulated environments often fail to capture real-world adversities. We consider the problem of learning a policy that will maximize the cumulative reward while satisfying a constraint, even when there is a mismatch between the real model and an accessible simulator/nominal model. In particular, we consider the robust constrained Markov decision problem (RCMDP) where an agent needs to maximize the reward and satisfy the constraint against the worst possible stochastic model under the uncertainty set centered around an unknown nominal model. Primal-dual methods, effective for standard constrained MDP (CMDP), are not applicable here because of the lack of the strong duality property. Further, one cannot apply the standard robust value-iteration based approach on the composite value function either, as the worst-case models may be different for the reward value function and the constraint value function. We propose a novel technique that effectively minimizes the constraint value function to satisfy the constraints; when all the constraints are satisfied, it simply maximizes the robust reward value function. We prove that such an algorithm finds a feasible policy with at most $\epsilon$ sub-optimality after $O(\epsilon^{-2})$ iterations. In contrast to the state-of-the-art method, we do not need to employ a binary search; we thus reduce the computation time by at least 4x for smaller values of the discount factor ($\gamma$) and by at least 6x for larger values of $\gamma$.

Article 1617

Title@2025-05-25 (7): To See a World in a Spark of Neuron: Disentangling Multi-task Interference for Training-free Model Merging

Title: To See a World in a Spark of Neuron: Disentangling Multi-task Interference for Training-free Model Merging

Eine Welt in einem Funken Neuron zu sehen: Entwirren von Multi-Task-Interferenzen für trainingsfreies Modellverschmelzen

《在中世纪的火花中看到世界:为无培训模式合并拆散多任务干预》 2503.05320v2

Authors: Zitao Fang, Guodong DU, Shuyang Yu, Yifei Guo, Yiwei Zhang, Yiyao Cao, Jing Li, Ho-Kin Tang, Sim Kuan Goh

Fine-tuning pre-trained models on targeted datasets enhances task-specific performance but often comes at the expense of generalization. Model merging techniques, which integrate multiple fine-tuned models into a single multi-task model through task arithmetic, offer a promising solution. However, task interference remains a fundamental challenge, leading to performance degradation and suboptimal merged models. Existing approaches largely overlook the fundamental roles of neurons, their connectivity, and activation, resulting in a merging process and a merged model that does not consider how neurons relay and process information. In this work, we present the first study that relies on neuronal mechanisms for model merging. We decompose task-specific representations into two complementary neuronal subspaces that regulate neuron sensitivity and input adaptability. Leveraging this decomposition, we introduce NeuroMerging, a novel merging framework developed to mitigate task interference within neuronal subspaces, enabling training-free model fusion across diverse tasks. Through extensive experiments, we demonstrate that NeuroMerging achieves superior performance compared to existing methods on multi-task benchmarks across both natural language and vision domains. Our findings highlight the importance of aligning neuronal mechanisms in model merging, offering new insights into mitigating task interference and improving knowledge fusion. Code will be released upon acceptance.

Article 1618

Title@2025-05-25 (7): CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models

Title: CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models

CoreMatching: Co-adaptive Sparse Inference Framework mit Token und Neuron Pruning für eine umfassende Beschleunigung von Vision-Language-Modellen

核心配料:与Token 和Neron Prurning 共同调适的简单推断框架,以全面加速视觉语言模型 2505.19235v1

Authors: Qinsi Wang, Hancheng Ye, Ming-Yu Chung, Yudong Liu, Yueqian Lin, Martin Kuo, Mingyuan Ma, Jianyi Zhang, Yiran Chen

Vision-Language Models (VLMs) excel across diverse tasks but suffer from high inference costs in time and memory. Token sparsity mitigates inefficiencies in token usage, while neuron sparsity reduces high-dimensional computations, both offering promising solutions to enhance efficiency. Recently, these two sparsity paradigms have evolved largely in parallel, fostering the prevailing assumption that they function independently. However, a fundamental yet underexplored question remains: Do they truly operate in isolation, or is there a deeper underlying interplay that has yet to be uncovered? In this paper, we conduct the first comprehensive investigation into this question. By introducing and analyzing the matching mechanism between Core Neurons and Core Tokens, we found that key neurons and tokens for inference mutually influence and reinforce each other. Building on this insight, we propose CoreMatching, a co-adaptive sparse inference framework, which leverages the synergy between token and neuron sparsity to enhance inference efficiency. Through theoretical analysis and efficiency evaluations, we demonstrate that the proposed method surpasses state-of-the-art baselines on ten image understanding tasks and three hardware devices. Notably, on the NVIDIA Titan Xp, it achieved 5x FLOPs reduction and a 10x overall speedup. Code is released at https://github.com/wangqinsi1/2025-ICML-CoreMatching/tree/main.

Article 1619

Title@2025-05-25 (7): Learning Flexible Forward Trajectories for Masked Molecular Diffusion

Title: Learning Flexible Forward Trajectories for Masked Molecular Diffusion

Flexible Forward-Trajektorien für maskierte molekulare Diffusion lernen

蒙面分子扩散学习灵活前向轨迹 2505.16790v2

Authors: Hyunjin Seo, Taewon Kim, Sihyun Yu, SungSoo Ahn

Masked diffusion models (MDMs) have achieved notable progress in modeling discrete data, while their potential in molecular generation remains underexplored. In this work, we explore their potential and introduce the surprising result that naively applying standard MDMs severely degrades the performance. We identify the critical cause of this issue as a state-clashing problem, where the forward diffusion of distinct molecules collapses into a common state, resulting in a mixture of reconstruction targets that cannot be learned using a typical reverse diffusion process with unimodal predictions. To mitigate this, we propose Masked Element-wise Learnable Diffusion (MELD) that orchestrates per-element corruption trajectories to avoid collision between distinct molecular graphs. This is achieved through a parameterized noise scheduling network that assigns distinct corruption rates to individual graph elements, i.e., atoms and bonds. Extensive experiments on diverse molecular benchmarks reveal that MELD markedly enhances overall generation quality compared to element-agnostic noise scheduling, increasing the chemical validity of vanilla MDMs on ZINC250K from 15% to 93%. Furthermore, it achieves state-of-the-art property alignment in conditional generation tasks.

Article 1620

Title@2025-05-25 (7): Statistical Collusion by Collectives on Learning Platforms

Title: Statistical Collusion by Collectives on Learning Platforms

Statistische Kollusion von Kollektiven über Lernplattformen

学习平台集体统计协作 2502.04879v3

Authors: Etienne Gauthier, Francis Bach, Michael I. Jordan

As platforms increasingly rely on learning algorithms, collectives may form and seek ways to influence these platforms to align with their own interests. This can be achieved by coordinated submission of altered data. To evaluate the potential impact of such behavior, it is essential to understand the computations that collectives must perform to impact platforms in this way. In particular, collectives need to make a priori assessments of the effect of the collective before taking action, as they may face potential risks when modifying their data. Moreover, they need to develop implementable coordination algorithms based on quantities that can be inferred from observed data. We develop a framework that provides a theoretical and algorithmic treatment of these issues and present experimental results in a product evaluation domain.

Article 1621

Title@2025-05-25 (7): Imitation Learning via Focused Satisficing

Title: Imitation Learning via Focused Satisficing

Imitation Learning via Focused Satisficing

通过有重点的满意度学习模拟学习 2505.14820v2

Authors: Rushit N. Shah, Nikolaos Agadakos, Synthia Sasulski, Ali Farajzadeh, Sanjiban Choudhury, Brian Ziebart

Imitation learning often assumes that demonstrations are close to optimal according to some fixed, but unknown, cost function. However, according to satisficing theory, humans often choose acceptable behavior based on their personal (and potentially dynamic) levels of aspiration, rather than achieving (near-) optimality. For example, a lunar lander demonstration that successfully lands without crashing might be acceptable to a novice despite being slow or jerky. Using a margin-based objective to guide deep reinforcement learning, our focused satisficing approach to imitation learning seeks a policy that surpasses the demonstrator’s aspiration levels – defined over trajectories or portions of trajectories – on unseen demonstrations without explicitly learning those aspirations. We show experimentally that this focuses the policy to imitate the highest quality (portions of) demonstrations better than existing imitation learning methods, providing much higher rates of guaranteed acceptability to the demonstrator, and competitive true returns on a range of environments.

Article 1622

Title@2025-05-25 (7): CLEVER: A Curated Benchmark for Formally Verified Code Generation

Title: CLEVER: A Curated Benchmark for Formally Verified Code Generation

CLEVER: Ein kuratierter Benchmark für die formal verifizierte Codegenerierung

正式核实的代码生成基准 2505.13938v3

Authors: Amitayush Thakur, Jasper Lee, George Tsoukalas, Meghana Sistla, Matthew Zhao, Stefan Zetzsche, Greg Durrett, Yisong Yue, Swarat Chaudhuri

We introduce ${\rm C{\small LEVER}}$, a high-quality, curated benchmark of 161 problems for end-to-end verified code generation in Lean. Each problem consists of (1) the task of generating a specification that matches a held-out ground-truth specification, and (2) the task of generating a Lean implementation that provably satisfies this specification. Unlike prior benchmarks, ${\rm C{\small LEVER}}$ avoids test-case supervision, LLM-generated annotations, and specifications that leak implementation logic or allow vacuous solutions. All outputs are verified post-hoc using Lean’s type checker to ensure machine-checkable correctness. We use ${\rm C{\small LEVER}}$ to evaluate several few-shot and agentic approaches based on state-of-the-art language models. These methods all struggle to achieve full verification, establishing it as a challenging frontier benchmark for program synthesis and formal reasoning. Our benchmark can be found on GitHub (https://github.com/trishullab/clever) as well as HuggingFace (https://huggingface.co/datasets/amitayusht/clever). All our evaluation code is also available online (https://github.com/trishullab/clever-prover).

Article 1623

Title@2025-05-25 (7): Scalarisation-based risk concepts for robust multi-objective optimisation

Title: Scalarisation-based risk concepts for robust multi-objective optimisation

Scalarisierungsbasierte Risikokonzepte für eine robuste multiobjektive Optimierung

实现稳健的多目标优化的以尺度化为基础的风险风险概念 2405.10221v4

Authors: Ben Tu, Nikolas Kantas, Robert M. Lee, Behrang Shafei

Robust optimisation is a well-established framework for optimising functions in the presence of uncertainty. The inherent goal of this problem is to identify a collection of inputs whose outputs are desirable for the decision maker whilst also being robust to the underlying uncertainties in the problem. In this work, we study the multi-objective case of this problem. We identify that the majority of robust multi-objective algorithms rely on two key operations: robustification and scalarisation. Robustification refers to the strategy that is used to account for the uncertainty in the problem. Scalarisation refers to the procedure that is used to encode the relative importance of each objective to a scalar-valued reward. As these operations are not necessarily commutative, the order in which they are performed has an impact on the resulting solutions that are identified and the final decisions that are made. The purpose of this work is to give a thorough exposition on the effects of these different orderings and in particular highlight when one should opt for one ordering over the other. As part of our analysis, we showcase how many existing risk concepts can be integrated into the specification and solution of a robust multi-objective optimisation problem. Besides this, we also demonstrate how one can define, in a principled way, the notion of a robust Pareto front and a robust performance metric based on our “robustify and scalarise” methodology. To illustrate the efficacy of these new ideas, we present two insightful case studies which are based on real-world data sets.
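
The non-commutativity of the two operations can be seen in a toy example. The sketch below assumes worst-case robustification and a linear weighted-sum scalarisation over two objectives to be maximised; both choices, the design names, and the numbers are illustrative assumptions, not taken from the paper.

```python
def scalarise(objectives, weights):
    """Linear weighted-sum scalarisation (one possible choice)."""
    return sum(w * f for w, f in zip(weights, objectives))


def scalarise_then_robustify(outcomes, weights):
    """Worst case, over scenarios, of the scalarised reward."""
    return min(scalarise(o, weights) for o in outcomes)


def robustify_then_scalarise(outcomes, weights):
    """Scalarisation of the per-objective worst cases."""
    worst_per_objective = [min(column) for column in zip(*outcomes)]
    return scalarise(worst_per_objective, weights)


# Each hypothetical design maps to its (f1, f2) outcomes under two scenarios.
designs = {
    "x": [(1.0, 0.0), (0.0, 1.0)],   # objectives trade off across scenarios
    "y": [(0.3, 0.3), (0.3, 0.3)],   # mediocre but scenario-independent
}
weights = (0.5, 0.5)

best_sr = max(designs, key=lambda d: scalarise_then_robustify(designs[d], weights))
best_rs = max(designs, key=lambda d: robustify_then_scalarise(designs[d], weights))
```

With these numbers the first ordering scores design x at 0.5 and y at 0.3, while the second scores x at 0.0 and y at 0.3, so the two orderings recommend different designs.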

Article 1624

Title@2025-05-25 (7): Dynamic Angle Selection in X-Ray CT: A Reinforcement Learning Approach to Optimal Stopping

Title: Dynamic Angle Selection in X-Ray CT: A Reinforcement Learning Approach to Optimal Stopping

Dynamische Winkelauswahl in X-Ray CT: Ein verstärkten Lernansatz zum optimalen Stoppen

X- Ray CT: 优化停止的强化学习方法 2503.12688v2

Authors: Tianyuan Wang, Felix Lucka, Daniël M. Pelt, K. Joost Batenburg, Tristan van Leeuwen

In industrial X-ray Computed Tomography (CT), the need for rapid in-line inspection is critical. Sparse-angle tomography plays a significant role in this by reducing the required number of projections, thereby accelerating processing and conserving resources. Most existing methods aim to balance reconstruction quality and scanning time, typically relying on fixed scan durations. Adaptive adjustment of the number of angles is essential; for instance, more angles may be required for objects with complex geometries or noisier projections. The concept of optimal stopping, which dynamically adjusts this balance according to varying industrial needs, remains overlooked. Building on our previous work, we integrate optimal stopping into sequential Optimal Experimental Design (sOED) and Reinforcement Learning (RL). We propose a novel method for computing the policy gradient within the Actor-Critic framework, enabling the development of adaptive policies for informative angle selection and scan termination. Additionally, we investigate the gap between simulation and real-world applications in the context of the developed learning-based method. Our trained model, developed using synthetic data, demonstrates reliable performance when applied to experimental X-ray CT data. This approach enhances the flexibility of CT operations and expands the applicability of sparse-angle tomography in industrial settings.

Article 1625

Title@2025-05-25 (7): Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More

Title: Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More

Sprachmodelle, Graph Searching und Überwachung Ehebruch: Wenn mehr Aufsicht weniger ist und wie man mehr macht

语言模式、图图搜索和监督通配:越少越少监督,如何做越多 2503.10542v3

Authors: Arvid Frydenlund

This work concerns the path-star task, a minimal example of searching over a graph. The graph, $G$, is star-shaped with $D$ arms radiating from a start node, $s$. A language model (LM) is given $G$, $s$, and a target node $t$, which ends one of the arms, and is tasked with generating the arm containing $t$. The minimal nature of this task means only a single choice needs to be made: which of the $D$ arms contains $t$? Decoder-only LMs fail to solve this elementary task above $1/D$ chance due to a learned shortcut that absorbs training supervision. We show how this pathology is caused by excess supervision and we present a series of solutions demonstrating that the task is solvable via decoder-only LMs. We find that the task’s minimal nature causes its difficulty, as it prevents task decomposition. Our solutions provide insight into the pathology and its implications for LMs trained via next-token prediction.
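
To make the task concrete, here is a minimal sketch of a path-star instance and a non-learned reference solution. The shuffled edge-list encoding, the function names, and the backtracking solver are assumptions for illustration; the abstract does not specify how the graph is presented to the LM.

```python
import random


def make_path_star(num_arms, arm_length, rng):
    """Build a star graph: num_arms arms of arm_length edges leaving one start node."""
    labels = list(range(1 + num_arms * arm_length))
    rng.shuffle(labels)                      # node labels carry no positional hint
    start = labels[0]
    arms = []
    for a in range(num_arms):
        body = labels[1 + a * arm_length: 1 + (a + 1) * arm_length]
        arms.append([start] + body)
    edges = [(arm[i], arm[i + 1]) for arm in arms for i in range(arm_length)]
    rng.shuffle(edges)                       # edge order carries no hint either
    target_arm = rng.choice(arms)
    return edges, start, target_arm[-1], target_arm


def solve_by_backtracking(edges, start, target):
    """Walk parent pointers from the target back to the start, then reverse."""
    parent = {child: par for par, child in edges}
    path = [target]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


rng = random.Random(0)
edges, start, target, arm = make_path_star(num_arms=5, arm_length=4, rng=rng)
recovered = solve_by_backtracking(edges, start, target)
```

Walking backward from $t$ needs no decision at all, whereas generating the arm forward from $s$ requires picking the correct one of the $D$ outgoing edges at the first step, which is the single choice the abstract refers to.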

Article 1626

Title@2025-05-25 (7): Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law

Title: Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law

Skalierungsgesetze für gradienten Abstieg und Zeichenabstieg für lineare Bigram-Modelle unter Zipf’s Gesetz

齐普夫法下线形大梁模型的渐渐后裔和信号后裔法律扩大法 2505.19227v1

Authors: Frederik Kunstner, Francis Bach

Recent works have highlighted optimization difficulties faced by gradient descent in training the first and last layers of transformer-based language models, which are overcome by optimizers such as Adam. These works suggest that the difficulty is linked to the heavy-tailed distribution of words in text data, where the frequency of the $k$th most frequent word $\pi_k$ is proportional to $1/k$, following Zipf’s law. To better understand the impact of the data distribution on training performance, we study a linear bigram model for next-token prediction when the tokens follow a power law $\pi_k \propto 1/k^\alpha$ parameterized by the exponent $\alpha > 0$. We derive optimization scaling laws for deterministic gradient descent and sign descent as a proxy for Adam as a function of the exponent $\alpha$. Existing theoretical investigations in scaling laws assume that the eigenvalues of the data decay as a power law with exponent $\alpha > 1$. This assumption effectively makes the problem “finite dimensional” as most of the loss comes from a few of the largest eigencomponents. In comparison, we show that the problem is more difficult when the data have heavier tails. The case $\alpha = 1$ as found in text data is “worst-case” for gradient descent, in that the number of iterations required to reach a small relative error scales almost linearly with dimension. While the performance of sign descent also depends on the dimension, for Zipf-distributed data the number of iterations scales only with the square-root of the dimension, leading to a large improvement for large vocabularies.
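
The role of the exponent can be illustrated numerically. The sketch below normalises $\pi_k \propto 1/k^\alpha$ over a finite vocabulary and reports how much probability mass the ten most frequent tokens carry; the vocabulary size, the cut-off of ten, and the function names are arbitrary assumptions, and token frequency mass is used here only as a rough stand-in for the eigencomponent argument in the abstract.

```python
def power_law_frequencies(vocab_size, alpha):
    """Normalised frequencies pi_k proportional to 1 / k**alpha for k = 1..vocab_size."""
    weights = [1.0 / k ** alpha for k in range(1, vocab_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def mass_of_top_words(frequencies, top):
    """Probability mass carried by the `top` most frequent tokens."""
    return sum(frequencies[:top])


for alpha in (0.5, 1.0, 2.0):
    pi = power_law_frequencies(10_000, alpha)
    print(f"alpha={alpha}: top-10 mass = {mass_of_top_words(pi, 10):.3f}")
```

For $\alpha$ well above 1 nearly all of the mass sits in a handful of tokens, while for $\alpha \le 1$ the tail carries most of it, which matches the intuition that heavier tails leave no small set of dominant components.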

Article 1627

Title@2025-05-25 (7): LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models

Title: LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models

LLaDA 1.5: Varianzreduzierte Preference-Optimierung für große Sprachdiffusionsmodelle

LLADA 1.5:大语言传播模式差异-减少优惠 2505.19223v1

Authors: Fengqi Zhu, Rongzhen Wang, Shen Nie, Xiaolu Zhang, Chunwei Wu, Jun Hu, Jun Zhou, Jianfei Chen, Yankai Lin, Ji-Rong Wen, Chongxuan Li

While Masked Diffusion Models (MDMs), such as LLaDA, present a promising paradigm for language modeling, there has been relatively little effort in aligning these models with human preferences via reinforcement learning. The challenge primarily arises from the high variance in Evidence Lower Bound (ELBO)-based likelihood estimates required for preference optimization. To address this issue, we propose Variance-Reduced Preference Optimization (VRPO), a framework that formally analyzes the variance of ELBO estimators and derives bounds on both the bias and variance of preference optimization gradients. Building on this theoretical foundation, we introduce unbiased variance reduction strategies, including optimal Monte Carlo budget allocation and antithetic sampling, that significantly improve the performance of MDM alignment. We demonstrate the effectiveness of VRPO by applying it to LLaDA, and the resulting model, LLaDA 1.5, outperforms its SFT-only predecessor consistently and significantly across mathematical (GSM8K +4.7), code (HumanEval +3.0, MBPP +1.8), and alignment benchmarks (IFEval +4.0, Arena-Hard +4.3). Furthermore, LLaDA 1.5 demonstrates a highly competitive mathematical performance compared to strong language MDMs and ARMs. Project page: https://ml-gsai.github.io/LLaDA-1.5-Demo/.
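
To give a feel for why antithetic sampling reduces Monte Carlo variance, here is a textbook sketch on a toy integral. It is not the VRPO estimator: the abstract does not specify how antithetic sampling is applied to ELBO estimates, so the integrand `math.exp`, the pairing `(u, 1 - u)`, and the function names are all assumptions for illustration.

```python
import math
import random
import statistics


def plain_pairs(f, n_pairs, rng):
    """Average f over two independent uniform draws per pair."""
    return [0.5 * (f(rng.random()) + f(rng.random())) for _ in range(n_pairs)]


def antithetic_pairs(f, n_pairs, rng):
    """Average f over the antithetic pair (u, 1 - u); for monotone f the two terms are negatively correlated."""
    out = []
    for _ in range(n_pairs):
        u = rng.random()
        out.append(0.5 * (f(u) + f(1.0 - u)))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    plain = plain_pairs(math.exp, 5000, rng)
    anti = antithetic_pairs(math.exp, 5000, rng)
    # Both lists estimate the integral of exp(u) over [0, 1], which equals e - 1.
    print("plain:     ", statistics.mean(plain), statistics.variance(plain))
    print("antithetic:", statistics.mean(anti), statistics.variance(anti))
```

Both estimators are unbiased and use two function evaluations per pair, but the antithetic pairs have a much smaller per-pair variance on this toy problem.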

Article 1628

Title@2025-05-25 (7): A Novel Transformer-Based Self-Supervised Learning Method to Enhance Photoplethysmogram Signal Artifact Detection

Title: A Novel Transformer-Based Self-Supervised Learning Method to Enhance Photoplethysmogram Signal Artifact Detection

Eine neuartige, auf Transformer basierende, selbstüberwachte Lernmethode zur Verbesserung der Photoplethysmogramm-Signal-Artefakt-Erkennung

一种基于新颖变形器的以自我监督为基础的学习方法,用以加强光膜成像信号异形探测 2401.01013v2

Authors: Thanh-Dung Le, Clara Macabiau, Kévin Albert, Philippe Jouvet, Rita Noumeir

Recent research at CHU Sainte Justine’s Pediatric Critical Care Unit (PICU) has revealed that traditional machine learning methods, such as semi-supervised label propagation and K-nearest neighbors, outperform Transformer-based models in artifact detection from PPG signals, particularly when data is limited. This study addresses the underutilization of abundant unlabeled data by employing self-supervised learning (SSL) to extract latent features from these data, followed by fine-tuning on labeled data. Our experiments demonstrate that SSL significantly enhances the Transformer model’s ability to learn representations, improving its robustness in artifact classification tasks. Among various SSL techniques, including masking, contrastive learning, and DINO (self-distillation with no labels), contrastive learning exhibited the most stable and superior performance in small PPG datasets. Further, we delve into optimizing contrastive loss functions, which are crucial for contrastive SSL. Inspired by InfoNCE, we introduce a novel contrastive loss function that facilitates smoother training and better convergence, thereby enhancing performance in artifact classification. In summary, this study establishes the efficacy of SSL in leveraging unlabeled data, particularly in enhancing the capabilities of the Transformer model. This approach holds promise for broader applications in PICU environments, where annotated data is often limited.
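
For reference, the standard InfoNCE loss that the abstract cites as inspiration can be sketched as follows. This is the baseline formulation only, not the paper's novel loss, which the abstract does not define. Cosine similarity, the temperature value, and the function names are assumptions, and all vectors are assumed to be non-zero.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Standard InfoNCE: -log( exp(s_pos / T) / (exp(s_pos / T) + sum_j exp(s_neg_j / T)) )."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max before exponentiating for numerical stability
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]
```

The loss is small when the anchor is closer to its positive than to every negative, and grows as a negative becomes as similar to the anchor as the positive is.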

Article 1629

Title@2025-05-25 (7): Where Paths Collide: A Comprehensive Survey of Classic and Learning-Based Multi-Agent Pathfinding

Title: Where Paths Collide: A Comprehensive Survey of Classic and Learning-Based Multi-Agent Pathfinding

Where Paths Collide: Eine umfassende Untersuchung der klassischen und lernbasierten multi-agenten Pathfinding

路径相撞之处:对经典和以学习为基础的多方代理调查的全面调查 2505.19219v1

Authors: Shiyue Wang, Haozheng Xu, Yuhan Zhang, Jingran Lin, Changhong Lu, Xiangfeng Wang, Wenhao Li

Multi-Agent Path Finding (MAPF) is a fundamental problem in artificial intelligence and robotics, requiring the computation of collision-free paths for multiple agents navigating from their start locations to designated goals. As autonomous systems become increasingly prevalent in warehouses, urban transportation, and other complex environments, MAPF has evolved from a theoretical challenge to a critical enabler of real-world multi-robot coordination. This comprehensive survey bridges the long-standing divide between classical algorithmic approaches and emerging learning-based methods in MAPF research. We present a unified framework that encompasses search-based methods (including Conflict-Based Search, Priority-Based Search, and Large Neighborhood Search), compilation-based approaches (SAT, SMT, CSP, ASP, and MIP formulations), and data-driven techniques (reinforcement learning, supervised learning, and hybrid strategies). Through systematic analysis of experimental practices across 200+ papers, we uncover significant disparities in evaluation methodologies, with classical methods typically tested on larger-scale instances (up to 200 by 200 grids with 1000+ agents) compared to learning-based approaches (predominantly 10-100 agents). We provide a comprehensive taxonomy of evaluation metrics, environment types, and baseline selections, highlighting the need for standardized benchmarking protocols. Finally, we outline promising future directions including mixed-motive MAPF with game-theoretic considerations, language-grounded planning with large language models, and neural solver architectures that combine the rigor of classical methods with the flexibility of deep learning. This survey serves as both a comprehensive reference for researchers and a practical guide for deploying MAPF solutions in increasingly complex real-world applications.
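
The "collision-free" requirement is commonly formalized as the absence of vertex conflicts (two agents in the same cell at the same time step) and edge conflicts (two agents swapping cells in one step). The sketch below checks a pair of paths for both. The path representation (a list of grid cells indexed by time step), the wait-at-goal convention, and the function names are assumptions for illustration, not definitions from the survey.

```python
def position_at(path, t):
    """Cell occupied at time t; an agent is assumed to wait at its goal once its path ends."""
    return path[t] if t < len(path) else path[-1]


def first_conflict(path_a, path_b):
    """Return ('vertex', t, cell) or ('edge', t, (cell_from, cell_to)) for the earliest conflict, else None."""
    horizon = max(len(path_a), len(path_b))
    for t in range(horizon):
        a_now, b_now = position_at(path_a, t), position_at(path_b, t)
        if a_now == b_now:
            return ("vertex", t, a_now)
        a_next, b_next = position_at(path_a, t + 1), position_at(path_b, t + 1)
        if a_now == b_next and b_now == a_next:
            # The two agents swap cells between t and t + 1.
            return ("edge", t, (a_now, a_next))
    return None
```

A search-based solver in the Conflict-Based Search family would typically call a check like this on candidate paths and branch on the first conflict it reports.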

Article 1630

Title@2025-05-25 (7): Clustering by Nonparametric Smoothing

Title: Clustering by Nonparametric Smoothing

Clustering durch nichtparametrisches Glätten

以非参数平滑为群集 2503.09134v2

Authors: David P. Hofmeyr

A novel formulation of the clustering problem is introduced in which the task is expressed as an estimation problem, where the object to be estimated is a function which maps a point to its distribution of cluster membership. Unlike existing approaches which implicitly estimate such a function, like Gaussian Mixture Models (GMMs), the proposed approach bypasses any explicit modelling assumptions and exploits the flexible estimation potential of nonparametric smoothing. An intuitive approach for selecting the tuning parameters governing estimation is provided, which allows the proposed method to automatically determine both an appropriate level of flexibility and also the number of clusters to extract from a given data set. Experiments on a large collection of publicly available data sets are used to document the strong performance of the proposed approach, in comparison with relevant benchmarks from the literature. R code to implement the proposed approach is available from https://github.com/DavidHofmeyr/CNS

Article 1631

Title@2025-05-25 (7): Symmetries in Overparametrized Neural Networks: A Mean-Field View

Title: Symmetries in Overparametrized Neural Networks: A Mean-Field View

Symmetrien in überparametrisierten Neuralen Netzwerken: Eine Mittelfeldansicht

过度对称的神经神经网络的对称性:平均实地观点 2405.19995v3

Authors: Javier Maass, Joaquin Fontbona

We develop a Mean-Field (MF) view of the learning dynamics of overparametrized Artificial Neural Networks (NN) under data symmetric in law with respect to the action of a general compact group $G$. We consider for this a class of generalized shallow NNs given by an ensemble of $N$ multi-layer units, jointly trained using stochastic gradient descent (SGD) and possibly symmetry-leveraging (SL) techniques, such as Data Augmentation (DA), Feature Averaging (FA) or Equivariant Architectures (EA). We introduce the notions of weakly and strongly invariant laws (WI and SI) on the parameter space of each single unit, corresponding, respectively, to $G$-invariant distributions, and to distributions supported on parameters fixed by the group action (which encode EA). This allows us to define symmetric models compatible with taking $N\to\infty$ and give an interpretation of the asymptotic dynamics of DA, FA and EA in terms of Wasserstein Gradient Flows describing their MF limits. When activations respect the group action, we show that, for symmetric data, DA, FA and freely-trained models obey the exact same MF dynamic, which stays in the space of WI laws and minimizes therein the population risk. We also give a counterexample to the general attainability of an optimum over SI laws. Despite this, quite remarkably, we show that the set of SI laws is also preserved by the MF dynamics even when freely trained. This contrasts sharply with the finite-$N$ setting, in which EAs are generally not preserved by unconstrained SGD. We illustrate the validity of our findings as $N$ gets larger in a teacher-student experimental setting, training a student NN to learn from a WI, SI or arbitrary teacher model through various SL schemes. Finally, we deduce a data-driven heuristic to discover the largest subspace of parameters supporting SI distributions for a problem, which could be used for designing EA with minimal generalization error.

Article 1632

Title@2025-05-25 (7): Adaptive Cyclic Diffusion for Inference Scaling

Title: Adaptive Cyclic Diffusion for Inference Scaling

Adaptive zyklische Diffusion zur Inferenzskalierung

用于推断力缩放的适应性二次循环传播 2505.14036v2

Authors: Gyubin Lee, Truong Nhat Nguyen Bao, Jaesik Yoon, Dongwoo Lee, Minsu Kim, Yoshua Bengio, Sungjin Ahn

Diffusion models have demonstrated strong generative capabilities across domains ranging from image synthesis to complex reasoning tasks. However, most inference-time scaling methods rely on fixed denoising schedules, limiting their ability to adaptively allocate computation based on instance difficulty or task-specific demands. We introduce the challenge of adaptive inference-time scaling (dynamically adjusting computational effort during inference) and propose Adaptive Bi-directional Cyclic Diffusion (ABCD), a flexible, search-based inference framework. ABCD refines outputs through bi-directional diffusion cycles while adaptively controlling exploration depth and termination. It comprises three components: Cyclic Diffusion Search, Automatic Exploration-Exploitation Balancing, and Adaptive Thinking Time. Experiments show that ABCD improves performance across diverse tasks while maintaining computational efficiency.

Article 1633

Title@2025-05-25 (7): SpeakStream: Streaming Text-to-Speech with Interleaved Data

Title: SpeakStream: Streaming Text-to-Speech with Interleaved Data

SpeakStream: Streaming von Text-zu-Speech mit interleaved Daten

语音Stream:用断开数据流流流文本到语音 2505.19206v1

Authors: Richard He Bai, Zijin Gu, Tatiana Likhomanenko, Navdeep Jaitly

The latency bottleneck of traditional text-to-speech (TTS) systems fundamentally hinders the potential of streaming large language models (LLMs) in conversational AI. These TTS systems, typically trained and run on complete utterances, introduce unacceptable delays, even with optimized inference speeds, when coupled with streaming LLM outputs. This is particularly problematic for creating responsive conversational agents where low first-token latency is critical. In this paper, we present SpeakStream, a streaming TTS system that generates audio incrementally from streaming text using a decoder-only architecture. SpeakStream is trained using a next-step prediction loss on interleaved text-speech data. During inference, it generates speech incrementally while absorbing streaming input text, making it particularly suitable for cascaded conversational AI agents where an LLM streams text to a TTS system. Our experiments demonstrate that SpeakStream achieves state-of-the-art first-token latency while maintaining the quality of non-streaming TTS systems.

Article 1634

Title@2025-05-25 (7): Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety

Title: Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety

Benign Proben Materie! Feinabstimmung auf Aussergewöhnliche Benign Proben stark bricht Sicherheit

重大事件重大事件重大事件安全重大事件重大事件重大事件重大事件重大事件 2505.06843v2

Authors: Zihan Guan, Mengxuan Hu, Ronghang Zhu, Sheng Li, Anil Vullikanti

Recent studies have uncovered a troubling vulnerability in the fine-tuning stage of large language models (LLMs): even fine-tuning on entirely benign datasets can lead to a significant increase in the harmfulness of LLM outputs. Building on this finding, our red teaming study takes this threat one step further by developing a more effective attack. Specifically, we analyze and identify samples within benign datasets that contribute most to safety degradation, then fine-tune LLMs exclusively on these samples. We approach this problem from an outlier detection perspective and propose Self-Inf-N to detect and extract outliers for fine-tuning. Our findings reveal that fine-tuning LLMs on 100 outlier samples selected by Self-Inf-N in the benign datasets severely compromises LLM safety alignment. Extensive experiments across seven mainstream LLMs demonstrate that our attack exhibits high transferability across different architectures and remains effective in practical scenarios. Alarmingly, our results indicate that most existing mitigation strategies fail to defend against this attack, underscoring the urgent need for more robust alignment safeguards. Code is available at https://github.com/GuanZihan/Benign-Samples-Matter.

Article 1635

Title@2025-05-25 (7): FedGuCci: Making Local Models More Connected in Landscape for Federated Learning

Title: FedGuCci: Making Local Models More Connected in Landscape for Federated Learning

FedGuCci: Lokale Modelle in der Landschaft für das Federated Learning stärker miteinander verbunden

FedGuCci:使地方模型在全局景观中更紧密地连接起来,促进联邦学习 2402.18949v3

Authors: Zexi Li, Jie Lin, Zhiqi Li, Didi Zhu, Tao Shen, Tao Lin, Chao Wu, Nicholas D. Lane

Federated learning (FL) involves multiple heterogeneous clients collaboratively training a global model via iterative local updates and model fusion. The generalization of FL’s global model has a large gap compared with centralized training, which is its bottleneck for broader applications. In this paper, we study and improve FL’s generalization through a fundamental "connectivity" perspective, which means how the local models are connected in the parameter region and fused into a generalized global model. The term "connectivity" is derived from linear mode connectivity (LMC), studying the interpolated loss landscape of two different solutions (e.g., modes) of neural networks. Bridging the gap between LMC and FL, in this paper, we leverage fixed anchor models to empirically and theoretically study the transitivity property of connectivity from two models (LMC) to a group of models (model fusion in FL). Based on the findings, we propose FedGuCci(+), improving group connectivity for better generalization. It is shown that our methods can boost the generalization of FL under client heterogeneity across various tasks (4 CV datasets and 6 NLP datasets) and model architectures (e.g., ViTs and PLMs). The code is available here: https://github.com/ZexiLee/fedgucci.
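
Linear mode connectivity is usually probed by evaluating the loss along the straight line between two solutions and measuring how far it rises above the endpoints. A minimal sketch of that measurement follows. The barrier definition used here (largest gap over the linearly interpolated endpoint losses), the toy `double_well` loss, and the function names are assumptions for illustration and are not taken from the FedGuCci paper.

```python
def interpolate(w_a, w_b, t):
    """Point on the straight line between two parameter vectors, at fraction t."""
    return [(1.0 - t) * a + t * b for a, b in zip(w_a, w_b)]


def loss_barrier(loss, w_a, w_b, steps=21):
    """Largest gap between the loss on the linear path and the interpolated endpoint losses."""
    loss_a, loss_b = loss(w_a), loss(w_b)
    gaps = []
    for i in range(steps):
        t = i / (steps - 1)
        on_path = loss(interpolate(w_a, w_b, t))
        gaps.append(on_path - ((1.0 - t) * loss_a + t * loss_b))
    return max(gaps)


def double_well(w):
    # Toy non-convex loss with minima at w = -1 and w = +1 (illustrative only).
    return (w[0] ** 2 - 1.0) ** 2


if __name__ == "__main__":
    print(loss_barrier(double_well, [-1.0], [1.0]))
```

A barrier near zero indicates that the two solutions are linearly connected, while a large barrier, as between the two wells of the toy loss, suggests that averaging them would land in a high-loss region.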

Article 1636

Title@2025-05-25 (7): iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use

Title: iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use

iTool: Verstärkte Feinsteuerung mit dynamischer Kalibrierung bei fortgeschrittenem Werkzeugeinsatz

i Tool:加强先进工具使用动态缺乏度校准的精细测试 2501.09766v4

Authors: Yirong Zeng, Xiao Ding, Yuxian Wang, Weiwen Liu, Wu Ning, Yutai Hou, Xu Huang, Bing Qin, Ting Liu

Augmenting large language models (LLMs) with external tools is a promising approach to enhance their capabilities, especially for complex tasks. Synthesizing tool-use data through real-world simulations is an effective way to achieve this. However, our investigation reveals that training gains decay significantly as synthetic data increases. The model struggles to benefit from more synthetic data, which cannot equip it with advanced tool-use capabilities in complex scenarios. Moreover, we discovered that the above limitation usually manifests as a fragment deficiency (i.e., parameter errors) in the response. To this end, we propose an iterative reinforced fine-tuning strategy designed to alleviate this limitation. This strategy involves: (1) enhancing the diversity of responses for synthetic data through path exploration with Monte Carlo Tree Search; and (2) iteratively pinpointing the model’s deficiency by constructing fine-grained preference pairs, and then correcting it with preference optimization algorithms. The experiments show that our method achieves 13.11% better performance than the same-size base model. It achieves an improvement of 6.5% in complex scenarios compared to the baseline, and it also outperforms larger open-source and closed-source models.

Article 1637

Title@2025-05-25 (7): Diffusion Instruction Tuning

Title: Diffusion Instruction Tuning

Diffusions-Anleitung Tuning

传播指示图 2502.06814v2

Authors: Chen Jin, Ryutaro Tanno, Amrutha Saseendran, Tom Diethe, Philip Teare

We introduce Lavender, a simple supervised fine-tuning (SFT) method that boosts the performance of advanced vision-language models (VLMs) by leveraging state-of-the-art image generation models such as Stable Diffusion. Specifically, Lavender aligns the text-vision attention in the VLM transformer with the equivalent used by Stable Diffusion during SFT, instead of adapting separate encoders. This alignment enriches the model’s visual understanding and significantly boosts performance across in- and out-of-distribution tasks. Lavender requires just 0.13 million training examples, 2.5% of typical large-scale SFT datasets, and fine-tunes on standard hardware (8 GPUs) in a single day. It consistently improves state-of-the-art open-source multimodal LLMs (e.g., Llama-3.2-11B, MiniCPM-Llama3-v2.5), achieving up to 30% gains and a 68% boost on challenging out-of-distribution medical QA tasks. By efficiently transferring the visual expertise of image generators with minimal supervision, Lavender offers a scalable solution for more accurate vision-language systems. All code, training data, and models will be shared at https://astrazeneca.github.io/vlm/.

Article 1638

Title@2025-05-25 (7): Curvature Dynamic Black-box Attack: revisiting adversarial robustness via dynamic curvature estimation

Title: Curvature Dynamic Black-box Attack: revisiting adversarial robustness via dynamic curvature estimation

Krümmung Dynamischer Black-Box-Angriff: Wiederherstellung der gegnerischen Robustheit durch dynamische Krümmungsschätzung

曲线动态黑盒攻击: 通过动态曲线估计, 重新审视对抗性对称稳健性 2505.19194v1

Authors: Peiran Sun

Adversarial attacks reveal the vulnerability of deep learning models. For about a decade, countless attack and defense methods have been proposed, leading to robustified classifiers and a better understanding of models. Among these methods, curvature-based approaches have attracted attention because it is assumed that high curvature may give rise to a rough decision boundary. However, the most commonly used "curvature" is the curvature of the loss function, scores or other parameters from within the model, as opposed to decision boundary curvature, since the former can be computed relatively easily using second-order derivatives. In this paper, we propose a new query-efficient method, dynamic curvature estimation (DCE), to estimate the decision boundary curvature in a black-box setting. Our approach is based on CGBA, a black-box adversarial attack. By performing DCE on a wide range of classifiers, we discovered, statistically, a connection between decision boundary curvature and adversarial robustness. We also propose a new attack method, curvature dynamic black-box attack (CDBA), with improved performance using the dynamically estimated curvature.

Article 1639

Title@2025-05-25 (7): Interpretable Graph Learning Over Sets of Temporally-Sparse Data

Title: Interpretable Graph Learning Over Sets of Temporally-Sparse Data

Interpretable Graph Learning Over Sets von temporär-Spardaten

一组暂时分隔数据上的解释性图表学习 2505.19193v1

Authors: Andrea Zerio, Maya Bechler-Speicher, Maor Huri, Marie Vibeke Vestergaard, Ran Gilad-Bachrach, Tine Jess, Samir Bhatt, Aleksejs Sazonovs

Real-world medical data often includes measurements from multiple signals that are collected at irregular and asynchronous time intervals. For example, different types of blood tests can be measured at different times and frequencies, resulting in fragmented and unevenly scattered temporal data. Similar issues of irregular sampling of different attributes occur in other domains, such as monitoring of large systems using event log files or the spread of fake news on social networks. Effectively learning from such data requires models that can handle sets of temporally sparse and heterogeneous signals. In this paper, we propose Graph Mixing Additive Networks (GMAN), a novel and interpretable-by-design model for learning over irregular sets of temporal signals. Our method achieves state-of-the-art performance in real-world medical tasks, including a 4-point increase in the AUROC score of in-hospital mortality prediction, compared to existing methods. We further showcase GMAN’s flexibility by applying it to a fake news detection task. We demonstrate how its interpretability capabilities, including node-level, graph-level, and subset-level importance, allow for detecting transition phases and gaining medical insights with real-world high-stakes implications. Finally, we provide theoretical insights on GMAN’s expressive power.

Article 1640

Title@2025-05-25 (7): I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts

Title: I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts

I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts

I2MoE:可解释的多式多式互动意识混合企业专家 2505.19190v1

Authors: Jiayi Xin, Sukwon Yun, Jie Peng, Inyoung Choi, Jenna L. Ballard, Tianlong Chen, Qi Long

Modality fusion is a cornerstone of multimodal learning, enabling information integration from diverse data sources. However, vanilla fusion methods are limited by (1) inability to account for heterogeneous interactions between modalities and (2) lack of interpretability in uncovering the multimodal interactions inherent in the data. To this end, we propose I2MoE (Interpretable Multimodal Interaction-aware Mixture of Experts), an end-to-end MoE framework designed to enhance modality fusion by explicitly modeling diverse multimodal interactions, as well as providing interpretation on a local and global level. First, I2MoE utilizes different interaction experts with weakly supervised interaction losses to learn multimodal interactions in a data-driven way. Second, I2MoE deploys a reweighting model that assigns importance scores for the output of each interaction expert, which offers sample-level and dataset-level interpretation. Extensive evaluation of medical and general multimodal datasets shows that I2MoE is flexible enough to be combined with different fusion techniques, consistently improves task performance, and provides interpretation across various real-world scenarios. Code is available at https://github.com/Raina-Xin/I2MoE.

Article 1641

Title@2025-05-25 (7): Chordless Structure: A Pathway to Simple and Expressive GNNs

Title: Chordless Structure: A Pathway to Simple and Expressive GNNs

Chordless Structure: Ein Weg zu einfachen und expressiven GNNs

无字结构:通往简单和表达性全球NNN的路径 2505.19188v1

Authors: Hongxu Pan, Shuxian Hu, Mo Zhou, Zhibin Wang, Rong Gu, Chen Tian, Kun Yang, Sheng Zhong

Researchers have proposed various methods of incorporating more structured information into the design of Graph Neural Networks (GNNs) to enhance their expressiveness. However, these methods are either computationally expensive or lacking in provable expressiveness. In this paper, we observe that the chords increase the complexity of the graph structure while contributing little useful information in many cases. In contrast, chordless structures are more efficient and effective for representing the graph. Therefore, when leveraging the information of cycles, we choose to omit the chords. Accordingly, we propose a Chordless Structure-based Graph Neural Network (CSGNN) and prove that its expressiveness is strictly more powerful than the k-hop GNN (KPGNN) with polynomial complexity. Experimental results on real-world datasets demonstrate that CSGNN outperforms existing GNNs across various graph tasks while incurring lower computational costs and achieving better performance than the GNNs of 3-WL expressiveness.

Article 1642

Title@2025-05-25 (7): Heterogeneous networks in drug-target interaction prediction

Title: Heterogeneous networks in drug-target interaction prediction

Heterogene Netzwerke in der Vorhersage von Wechselwirkungen mit Drogenzielen

药物目标相互作用预测中的不同类型网络 2504.16152v2

Authors: Mohammad Molaee, Nasrollah Moghadam Charkari, Foad Ghaderi

Drug discovery requires a tremendous amount of time and cost. Computational drug-target interaction prediction, a significant part of this process, can reduce these requirements by narrowing the search space for wet lab experiments. In this survey, we provide comprehensive details of graph machine learning-based methods in predicting drug-target interaction, as they have shown promising results in this field. These details include the overall framework, main contribution, datasets, and their source codes. The selected papers were mainly published from 2020 to 2024. Prior to discussing papers, we briefly introduce the datasets commonly used with these methods and measurements to assess their performance. Finally, future challenges and some crucial areas that need to be explored are discussed.

Article 1643

Title@2025-05-25 (7): A Physics-preserved Transfer Learning Method for Differential Equations

Title: A Physics-preserved Transfer Learning Method for Differential Equations

Eine physikkonservierte Transfer-Lernmethode für Differentialgleichungen

不同等分法的受物理保留转移学习方法 2505.01281v2

Authors: Hao-Ran Yang, Chuan-Xian Ren

While data-driven methods such as neural operators have achieved great success in solving differential equations (DEs), they suffer from domain shift problems caused by different learning environments (with data bias or equation changes), which can be alleviated by transfer learning (TL). However, existing TL methods adopted in DE problems lack either generalizability to general DE problems or physics preservation during training. In this work, we focus on a general transfer learning method that adaptively corrects the domain shift and preserves physical information. Mathematically, we characterize the data domain as a product distribution and the essential problems as distribution bias and operator bias. A Physics-preserved Optimal Tensor Transport (POTT) method that simultaneously admits generalizability to common DEs and physics preservation of specific problems is proposed to adapt the data-driven model to the target domain utilizing the push-forward distribution induced by the POTT map. Extensive experiments demonstrate the superior performance, generalizability and physics preservation of the proposed POTT method.

Article 1644

Title@2025-05-25 (7): CAGES: Cost-Aware Gradient Entropy Search for Efficient Local Multi-Fidelity Bayesian Optimization

Title: CAGES: Cost-Aware Gradient Entropy Search for Efficient Local Multi-Fidelity Bayesian Optimization

CAGES: Kostenbewusste Gradienten-Entropie Suche nach effizienter lokaler Multi-Fidelity Bayesian-Optimierung

CAGES: 成本-软件软件渐进式 Entropy 搜索以高效的本地多纤维贝叶斯优化 2405.07760v2

Authors: Wei-Ting Tang, Joel A. Paulson

Bayesian optimization (BO) is a popular approach for optimizing expensive-to-evaluate black-box objective functions. An important challenge in BO is its application to high-dimensional search spaces due in large part to the curse of dimensionality. One way to overcome this challenge is to focus on local BO methods that aim to efficiently learn gradients, which have shown strong empirical performance on high-dimensional problems including policy search in reinforcement learning (RL). Current local BO methods assume access to only a single high-fidelity information source whereas, in many problems, one has access to multiple cheaper approximations of the objective. We propose a novel algorithm, Cost-Aware Gradient Entropy Search (CAGES), for local BO of multi-fidelity black-box functions. CAGES makes no assumption about the relationship between different information sources, making it more flexible than other multi-fidelity methods. It also employs a new information-theoretic acquisition function, which enables systematic identification of samples that maximize the information gain about the unknown gradient per evaluation cost. We demonstrate CAGES can achieve significant performance improvements compared to other state-of-the-art methods on synthetic and benchmark RL problems.

Article 1645

Title@2025-05-25 (7): Federated Learning: From Theory to Practice

Title: Federated Learning: From Theory to Practice

Föderiertes Lernen: Von der Theorie zur Praxis

联邦学习:从理论到实践 2505.19183v1

Authors: A. Jung

This book offers a hands-on introduction to building and understanding federated learning (FL) systems. FL enables multiple devices – such as smartphones, sensors, or local computers – to collaboratively train machine learning (ML) models, while keeping their data private and local. It is a powerful solution when data cannot or should not be centralized due to privacy, regulatory, or technical reasons. The book is designed for students, engineers, and researchers who want to learn how to design scalable, privacy-preserving FL systems. Our main focus is on personalization: enabling each device to train its own model while still benefiting from collaboration with relevant devices. This is achieved by leveraging similarities between (the learning tasks associated with) devices that are encoded by the weighted edges (or links) of a federated learning network (FL network). The key idea is to represent real-world FL systems as networks of devices, where nodes correspond to devices and edges represent communication links and data similarities between them. The training of personalized models for these devices can be naturally framed as a distributed optimization problem. This optimization problem is referred to as generalized total variation minimization (GTVMin) and ensures that devices with similar learning tasks learn similar model parameters. Our approach is both mathematically principled and practically motivated. While we introduce some advanced ideas from optimization theory and graph-based learning, we aim to keep the book accessible. Readers are guided through the core ideas step by step, with intuitive explanations.
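
One way to read the GTVMin idea is as an objective that adds each device's local loss to a penalty on parameter differences across the weighted edges of the FL network. The sketch below evaluates such an objective. The squared Euclidean penalty, the dictionary-based interface, and the names `gtv_objective` and `alpha` are assumptions for illustration: the abstract states only that devices with similar learning tasks should learn similar model parameters.

```python
def gtv_objective(local_losses, params, edges, alpha):
    """Sum of local losses plus alpha times the weighted parameter variation across edges.

    local_losses: dict node -> callable(param vector) -> float  (hypothetical interface)
    params:       dict node -> list of floats (the personalized model of each device)
    edges:        dict (node_i, node_j) -> non-negative edge weight
    alpha:        trade-off between local fit and agreement with neighbours
    """
    fit = sum(loss(params[node]) for node, loss in local_losses.items())
    variation = 0.0
    for (i, j), weight in edges.items():
        # Assumed penalty: squared Euclidean distance between neighbouring parameters.
        variation += weight * sum((a - b) ** 2 for a, b in zip(params[i], params[j]))
    return fit + alpha * variation
```

With `alpha = 0` every device fits only its own data, while a large `alpha` pushes strongly connected devices toward a shared model.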

Article 1646

Title@2025-05-25 (7): DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

Title: DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

DiTAR: Diffusion Transformer Autoregressive Modellierung für Sprachgenerierung

DITAR: 发声的传播变异器自动递减模型 2502.03930v3

Authors: Dongya Jia, Zhuo Chen, Jiawei Chen, Chenpeng Du, Jian Wu, Jian Cong, Xiaobin Zhuang, Chumin Li, Zhen Wei, Yuping Wang, Yuxuan Wang

Several recent studies have attempted to autoregressively generate continuous speech representations without discrete speech tokens by combining diffusion and autoregressive models, yet they often face challenges with excessive computational loads or suboptimal outcomes. In this work, we propose Diffusion Transformer Autoregressive Modeling (DiTAR), a patch-based autoregressive framework combining a language model with a diffusion transformer. This approach significantly enhances the efficacy of autoregressive models for continuous tokens and reduces computational demands. DiTAR utilizes a divide-and-conquer strategy for patch generation, where the language model processes aggregated patch embeddings and the diffusion transformer subsequently generates the next patch based on the output of the language model. For inference, we propose defining temperature as the time point of introducing noise during the reverse diffusion ODE to balance diversity and determinism. We also show in the extensive scaling analysis that DiTAR has superb scalability. In zero-shot speech generation, DiTAR achieves state-of-the-art performance in robustness, speaker similarity, and naturalness.

Article 1647

Title@2025-05-25 (7): Towards Graph Foundation Models: Learning Generalities Across Graphs via Task-Trees

Title: Towards Graph Foundation Models: Learning Generalities Across Graphs via Task-Trees

Auf dem Weg zu Graph Foundation Models: Allgemeines Lernen über Graphen über Task-Trees

走向图图基础模型:通过TLT-Trees对图的学习概观 2412.16441v3

Authors: Zehong Wang, Zheyuan Zhang, Tianyi Ma, Nitesh V Chawla, Chuxu Zhang, Yanfang Ye

Foundation models are pretrained on large-scale corpora to learn generalizable patterns across domains and tasks – such as contours, textures, and edges in images, or tokens and sentences in text. In contrast, discovering such generalities in graph-structured data, especially across heterogeneous graph tasks, remains an open challenge. To address this, we propose a novel approach to cross-task generalization in graphs via task-trees, which serve as unified learning instances aligning node-, edge-, and graph-level tasks. We theoretically analyze the stability, transferability, and generalization properties of task-trees, showing that pretraining a graph neural network (GNN) on diverse task-trees with a reconstruction objective induces transferable knowledge. This enables efficient adaptation to downstream tasks with minimal fine-tuning. To validate our framework, we introduce Graph Generality Identifier on Task-Trees (GIT), a graph foundation model that demonstrates strong performance on over 30 graphs across five domains via fine-tuning, in-context learning, and zero-shot generalization. Code and data are available at https://github.com/Zehong-Wang/GIT.

Article 1648

Title@2025-05-25 (7): Nteasee: Understanding Needs in AI for Health in Africa – A Mixed-Methods Study of Expert and General Population Perspectives

Title: Nteasee: Understanding Needs in AI for Health in Africa – A Mixed-Methods Study of Expert and General Population Perspectives

Nteasee: Die Bedürfnisse von KI für die Gesundheit in Afrika verstehen – Eine gemischte Studie von Experten und allgemeinen Bevölkerungsperspektiven

Nteasee:了解大赦国际关于非洲保健的需要 – – 专家和一般人口观点混合方法研究 2409.12197v4

Authors: Mercy Nyamewaa Asiedu, Iskandar Haykel, Awa Dieng, Kerrie Kauer, Tousif Ahmed, Florence Ofori, Charisma Chan, Stephen Pfohl, Negar Rostamzadeh, Katherine Heller

Artificial Intelligence (AI) for health has the potential to significantly change and improve healthcare. However, in most African countries, how to identify culturally and contextually attuned approaches for deploying these solutions is not well understood. To bridge this gap, we conduct a qualitative study to investigate the best practices, fairness indicators, and potential biases to mitigate when deploying AI for health in African countries, as well as explore opportunities where artificial intelligence could make a positive impact in health. We use a mixed-methods approach combining in-depth interviews (IDIs) and surveys. We conduct 1.5-2 hour IDIs with 50 experts in health, policy, and AI across 17 countries, and through an inductive approach we conduct a qualitative thematic analysis on expert IDI responses. We administer a blinded 30-minute survey with case studies to 672 general population participants across 5 countries in Africa and analyze responses on quantitative scales, statistically comparing responses by country, age, gender, and level of familiarity with AI. We thematically summarize open-ended responses from surveys. Our results show generally positive attitudes and high levels of trust, accompanied by moderate levels of concern, among general population participants regarding AI usage for health in Africa. This contrasts with expert responses, where major themes revolved around trust/mistrust, ethical concerns, and systemic barriers to integration, among others. This work presents the first-of-its-kind qualitative research study of the potential of AI for health in Africa from an algorithmic fairness angle, with perspectives from both experts and the general population. We hope that this work guides policymakers and drives home the need for further research and the inclusion of general population perspectives in decision-making around AI usage.

Article 1649

Title@2025-05-25 (7): Beyond Message Passing: Neural Graph Pattern Machine

Title: Beyond Message Passing: Neural Graph Pattern Machine

Beyond Message Passing: Neural Graph Pattern Machine

超过消息传递: 神经图样机 2501.18739v2

Authors: Zehong Wang, Zheyuan Zhang, Tianyi Ma, Nitesh V Chawla, Chuxu Zhang, Yanfang Ye

Graph learning tasks often hinge on identifying key substructure patterns – such as triadic closures in social networks or benzene rings in molecular graphs – that underpin downstream performance. However, most existing graph neural networks (GNNs) rely on message passing, which aggregates local neighborhood information iteratively and struggles to explicitly capture such fundamental motifs, like triangles, k-cliques, and rings. This limitation hinders both expressiveness and long-range dependency modeling. In this paper, we introduce the Neural Graph Pattern Machine (GPM), a novel framework that bypasses message passing by learning directly from graph substructures. GPM efficiently extracts, encodes, and prioritizes task-relevant graph patterns, offering greater expressivity and improved ability to capture long-range dependencies. Empirical evaluations across four standard tasks – node classification, link prediction, graph classification, and graph regression – demonstrate that GPM outperforms state-of-the-art baselines. Further analysis reveals that GPM exhibits strong out-of-distribution generalization, desirable scalability, and enhanced interpretability. Code and datasets are available at: https://github.com/Zehong-Wang/GPM.

Article 1650

Title@2025-05-25 (7): Saliency-guided Emotion Modeling: Predicting Viewer Reactions from Video Stimuli

Title: Saliency-guided Emotion Modeling: Predicting Viewer Reactions from Video Stimuli

Saliency-guided Emotion Modeling: Vorhersage von Zuschauerreaktionen aus Video-Stimuli

以色素为指导的情感建模:视频刺激的预测查看器反应 2505.19178v1

Authors: Akhila Yaragoppa, Siddharth

Understanding the emotional impact of videos is crucial for applications in content creation, advertising, and Human-Computer Interaction (HCI). Traditional affective computing methods rely on self-reported emotions, facial expression analysis, and biosensing data, yet they often overlook the role of visual saliency – the naturally attention-grabbing regions within a video. In this study, we utilize deep learning to introduce a novel saliency-based approach to emotion prediction by extracting two key features: saliency area and number of salient regions. Using the HD2S saliency model and OpenFace facial action unit analysis, we examine the relationship between video saliency and viewer emotions. Our findings reveal three key insights: (1) Videos with multiple salient regions tend to elicit high-valence, low-arousal emotions, (2) Videos with a single dominant salient region are more likely to induce low-valence, high-arousal responses, and (3) Self-reported emotions often misalign with facial expression-based emotion detection, suggesting limitations in subjective reporting. By leveraging saliency-driven insights, this work provides a computationally efficient and interpretable alternative for emotion modeling, with implications for content creation, personalized media experiences, and affective computing research.

Article 1651

Title@2025-05-25 (7): Mixture of Lookup Experts

Title: Mixture of Lookup Experts

Mischung von Lookup-Experten

查找专家混合 2503.15798v2

Authors: Shibo Jie, Yehui Tang, Kai Han, Yitong Li, Duyu Tang, Zhi-Hong Deng, Yunhe Wang

Mixture-of-Experts (MoE) activates only a subset of experts during inference, allowing the model to maintain low inference FLOPs and latency even as the parameter count scales up. However, since MoE dynamically selects the experts, all the experts need to be loaded into VRAM. Their large parameter size still limits deployment, and offloading, which loads experts into VRAM only when needed, significantly increases inference latency. To address this, we propose Mixture of Lookup Experts (MoLE), a new MoE architecture that is efficient in both communication and VRAM usage. In MoLE, the experts are Feed-Forward Networks (FFNs) during training, taking the output of the embedding layer as input. Before inference, these experts can be re-parameterized as lookup tables (LUTs) that retrieve expert outputs based on input ids, and offloaded to storage devices. Therefore, we do not need to perform expert computations during inference. Instead, we directly retrieve the expert’s computation results based on input ids and load them into VRAM, and thus the resulting communication overhead is negligible. Experiments show that, with the same FLOPs and VRAM usage, MoLE achieves inference speeds comparable to dense models and significantly faster than MoE with experts offloading, while maintaining performance on par with MoE.
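
The re-parameterization step can be pictured with a small sketch. Because each expert only sees the embedding of a token id, its output can be precomputed once for every id in the vocabulary and stored as a table. The two-layer ReLU FFN, the shapes, and the function names below are assumptions for illustration; routing and the way expert outputs are combined are not described in the abstract and are left out.

```python
import numpy as np

def ffn(x, W1, b1, W2, b2):
    """A generic two-layer ReLU feed-forward network (assumed expert form)."""
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def build_lookup_tables(embedding, experts):
    """Precompute every expert's output for every token id.

    embedding: (vocab, d) embedding matrix.
    experts: list of (W1, b1, W2, b2) parameter tuples.
    Returns an array of shape (n_experts, vocab, d_out).
    """
    return np.stack([ffn(embedding, *params) for params in experts])

def lookup_expert_outputs(luts, token_ids):
    """Replace expert computation at inference time with a table lookup."""
    return luts[:, token_ids, :]
```

At inference time only the rows for the ids in the current batch need to be fetched, which is consistent with the abstract's point that the transferred data is small.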

Article 1652

Title@2025-05-25 (7): Computational Inertia as a Conserved Quantity in Frictionless and Damped Learning Dynamics

Title: Computational Inertia as a Conserved Quantity in Frictionless and Damped Learning Dynamics

Computational Inertia als konservierte Menge in friktionsloser und gedämpfter Lerndynamik

计算无损和断裂学习动力学的计算因电量 2505.19171v1

Authors: Atahan Karagoz

We identify a conserved quantity in continuous-time optimization dynamics, termed computational inertia. Defined as the sum of kinetic energy (parameter velocity) and potential energy (loss), this scalar remains invariant under idealized, frictionless training. We formalize this conservation law, derive its analytic decay under damping and stochastic perturbations, and demonstrate its behavior in a synthetic system. The invariant offers a compact lens for interpreting learning trajectories, and may inform theoretical tools for analyzing convergence, stability, and training geometry.
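
The conservation claim can be checked with a short calculation. The derivation below assumes unit-mass second-order dynamics and a linear damping term with coefficient $\gamma \ge 0$; the abstract does not state the exact equations of motion, so these forms are assumptions, and the stochastic case is not covered.

```latex
\begin{align*}
E(t) &= \tfrac{1}{2}\,\|\dot{\theta}(t)\|^2 + L(\theta(t)), \\
\text{frictionless: } \ddot{\theta} = -\nabla L(\theta)
  &\;\Rightarrow\; \frac{dE}{dt}
     = \dot{\theta}^{\top}\ddot{\theta} + \nabla L(\theta)^{\top}\dot{\theta} = 0, \\
\text{damped: } \ddot{\theta} = -\nabla L(\theta) - \gamma\,\dot{\theta}
  &\;\Rightarrow\; \frac{dE}{dt} = -\gamma\,\|\dot{\theta}\|^2 \le 0 .
\end{align*}
```

Under these assumptions the quantity is exactly constant without friction and decreases monotonically with damping, at a rate set by the kinetic term.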

Article 1653

Title@2025-05-25 (7): JEDI: The Force of Jensen-Shannon Divergence in Disentangling Diffusion Models

Title: JEDI: The Force of Jensen-Shannon Divergence in Disentangling Diffusion Models

JEDI: Die Macht der Jensen-Shannon-Divergenz bei entwirrenden Diffusionsmodellen

JEDI: 詹森-夏农分解扩散模型的分解力量 2505.19166v1

Authors: Eric Tillmann Bill, Enis Simsar, Thomas Hofmann

We introduce JEDI, a test-time adaptation method that enhances subject separation and compositional alignment in diffusion models without requiring retraining or external supervision. JEDI operates by minimizing semantic entanglement in attention maps using a novel Jensen-Shannon divergence based objective. To improve efficiency, we leverage adversarial optimization, reducing the number of updating steps required. JEDI is model-agnostic and applicable to architectures such as Stable Diffusion 1.5 and 3.5, consistently improving prompt alignment and disentanglement in complex scenes. Additionally, JEDI provides a lightweight, CLIP-free disentanglement score derived from internal attention distributions, offering a principled benchmark for compositional alignment under test-time conditions. We will publicly release the implementation of our method.
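
For readers unfamiliar with the divergence the method is built on, here is a minimal sketch of the Jensen-Shannon divergence between two attention maps treated as probability distributions. How JEDI turns this quantity into its objective is not described in the abstract; the normalization, the `eps` smoothing, and the function names are assumptions for this example.

```python
import numpy as np

def normalize(attn, eps=1e-12):
    """Flatten a non-negative attention map and normalize it to sum to one."""
    a = np.asarray(attn, dtype=float).ravel() + eps
    return a / a.sum()

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats."""
    return float(np.sum(p * np.log(p / q)))

def js_divergence(attn_a, attn_b):
    """Jensen-Shannon divergence: symmetric and bounded above by ln 2."""
    p, q = normalize(attn_a), normalize(attn_b)
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Two maps that attend to the same region give a value near zero, while maps with disjoint support approach ln 2, so the value can be read as a measure of how separated two subjects' attention is.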

Article 1654

Title@2025-05-25 (7): CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter

Title: CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter

CORAL: Lerne konsistente Repräsentationen über mehrstufiges Training mit leichterem spekulativen Entwurfer

CORAL: 利用轻型投机性起草者在多阶段培训中学习一致的代表性 2502.16880v3

Authors: Yepeng Weng, Dianwen Mei, Huishi Qiu, Xujie Chen, Li Liu, Jiang Tian, Zhongchao Shi

Speculative decoding is a powerful technique that accelerates Large Language Model (LLM) inference by leveraging a lightweight speculative draft model. However, existing designs suffer in performance due to misalignment between training and inference. Recent methods have tried to solve this issue by adopting a multi-step training strategy, but the complex inputs of different training steps make it harder for the draft model to converge. To address this, we propose CORAL, a novel framework that improves both accuracy and efficiency in speculative drafting. CORAL introduces Cross-Step Representation Alignment, a method that enhances consistency across multiple training steps, significantly improving speculative drafting performance. Additionally, we identify the LM head as a major bottleneck in the inference speed of the draft model. We introduce a weight-grouping mechanism that selectively activates a subset of LM head parameters during inference, substantially reducing the latency of the draft model. We evaluate CORAL on three LLM families and three benchmark datasets, achieving speedup ratios of 2.50x-4.07x, outperforming state-of-the-art methods such as EAGLE-2 and HASS. Our results demonstrate that CORAL effectively mitigates training-inference misalignment and delivers significant speedup for modern LLMs with large vocabularies.

Article 1655

Title@2025-05-25 (7): Efficient Training of Multi-task Neural Solver for Combinatorial Optimization

Title: Efficient Training of Multi-task Neural Solver for Combinatorial Optimization

Effiziente Schulung von Multi-Task-Neural Solver zur kombinatorischen Optimierung

综合优化多任务神经溶剂高效培训 2305.06361v5

Authors: Chenguang Wang, Zhang-Hua Fu, Pinyan Lu, Tianshu Yu

Efficiently training a multi-task neural solver for various combinatorial optimization problems (COPs) has been less studied so far. Naive application of conventional multi-task learning approaches often falls short in delivering a high-quality, unified neural solver. This deficiency primarily stems from the significant computational demands and a lack of adequate consideration for the complexities inherent in COPs. In this paper, we propose a general and efficient training paradigm to deliver a unified combinatorial multi-task neural solver. To this end, we resort to the theoretical loss decomposition for multiple tasks under an encoder-decoder framework, which enables more efficient training via proper bandit task-sampling algorithms through an intra-task influence matrix. By employing theoretically grounded approximations, our method significantly enhances overall performance, regardless of whether it is within constrained training budgets, across equivalent training epochs, or in terms of generalization capabilities, when compared to conventional training schedules. On the real-world datasets of TSPLib and CVRPLib, our method also achieved the best results compared to single task learning and multi-task learning approaches. Additionally, the influence matrix provides empirical evidence supporting common practices in the field of learning to optimize, further substantiating the effectiveness of our approach. Our code is open-sourced and available at https://github.com/LOGO-CUHKSZ/MTL-COP.

Article 1656

Title@2025-05-25 (7): Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation

Title: Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation

Divide-Then-Aggregat: Eine effiziente Tool-Learning-Methode über parallele Tool-Invokation

分离后生成工具:通过平行工具使用使用效率高的工具学习方法 2501.12432v2

Authors: Dongsheng Zhu, Weixian Shi, Zhengliang Shi, Zhaochun Ren, Shuaiqiang Wang, Lingyong Yan, Dawei Yin

Although current Large Language Models (LLMs) exhibit impressive capabilities, performing complex real-world tasks still requires tool learning. Mainstream methods, such as CoT/ReAct, rely on step-by-step tool invocation to interact with external environments, but they are limited in perceptual scope and lack adequate task-planning capability. To address these limitations, other studies introduce the Depth-First Search-based Decision Tree (DFSDT), which still suffers from high computational cost. In this paper, we introduce a novel parallel tool invocation paradigm, DTA-Llama (Divide-Then-Aggregate Llama). First, we transform traditional tree-based tool search paths into a Directed Acyclic Graph (DAG) structure, generating a high-quality parallel tool invocation dataset. DTA-Llama is then trained on the dataset to learn to iteratively divide the current task into several parallel tool invocation sub-tasks and aggregate the invocation results to decide the next actions. Furthermore, we introduce an efficient inference framework inspired by the Process/Threads mechanism when applying DTA-Llama to practical tasks. Experimental results show that our approach substantially enhances task performance while reducing token consumption and inference time. Llama2-7B, using our method, is comparable to the official parallel function calling method of GPT-3.5. The relevant code, dataset, and model weights are available at https://corn0205.github.io/

Article 1657

Title@2025-05-25 (7): Mean-Shift Distillation for Diffusion Mode Seeking

Title: Mean-Shift Distillation for Diffusion Mode Seeking

Mean-Shift-Destillation für den Diffusionsmodus

用于扩散模式搜索的中质蒸馏 2502.15989v2

Authors: Vikas Thamizharasan, Nikitas Chatzis, Iliyan Georgiev, Matthew Fisher, Evangelos Kalogerakis, Difan Liu, Nanxuan Zhao, Michal Lukac

We present mean-shift distillation, a novel diffusion distillation technique that provides a provably good proxy for the gradient of the diffusion output distribution. This is derived directly from mean-shift mode seeking on the distribution, and we show that its extrema are aligned with the modes. We further derive an efficient product distribution sampling procedure to evaluate the gradient. Our method is formulated as a drop-in replacement for score distillation sampling (SDS), requiring neither model retraining nor extensive modification of the sampling procedure. We show that it exhibits superior mode alignment as well as improved convergence in both synthetic and practical setups, yielding higher-fidelity results when applied to both text-to-image and text-to-3D applications with Stable Diffusion.
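
For context, the classical mean-shift procedure that the method takes its name from can be sketched on a set of samples as follows. This is the textbook sample-based algorithm with a Gaussian kernel, not the paper's distillation procedure, which operates on a diffusion model's output distribution; the bandwidth and function names are assumptions for illustration.

```python
import numpy as np

def mean_shift_step(x, samples, bandwidth):
    """Move x to the kernel-weighted mean of the samples (Gaussian kernel)."""
    sq_dist = np.sum((samples - x) ** 2, axis=1)
    weights = np.exp(-0.5 * sq_dist / bandwidth**2)
    return weights @ samples / weights.sum()

def seek_mode(x0, samples, bandwidth=0.5, steps=100, tol=1e-8):
    """Iterate mean-shift updates from x0 until the point stops moving."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x_new = mean_shift_step(x, samples, bandwidth)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

Each update moves the point uphill on a kernel density estimate, so the fixed points are the estimate's modes. That is the sense in which mean-shift is a mode-seeking method.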

Article 1658

Title@2025-05-25 (7): Do Large Language Models (Really) Need Statistical Foundations?

Title: Do Large Language Models (Really) Need Statistical Foundations?

Brauchen große Sprachmodelle (wirklich) statistische Grundlagen?

大语言模式(真正)是否需要统计基础? 2505.19145v1

Authors: Weijie Su

Large language models (LLMs) represent a new paradigm for processing unstructured data, with applications across an unprecedented range of domains. In this paper, we address, through two arguments, whether the development and application of LLMs would genuinely benefit from foundational contributions from the statistics discipline. First, we argue affirmatively, beginning with the observation that LLMs are inherently statistical models due to their profound data dependency and stochastic generation processes, where statistical insights are naturally essential for handling variability and uncertainty. Second, we argue that the persistent black-box nature of LLMs – stemming from their immense scale, architectural complexity, and development practices often prioritizing empirical performance over theoretical interpretability – renders closed-form or purely mechanistic analyses generally intractable, thereby necessitating statistical approaches due to their flexibility and often demonstrated effectiveness. To substantiate these arguments, the paper outlines several research areas – including alignment, watermarking, uncertainty quantification, evaluation, and data mixture optimization – where statistical methodologies are critically needed and are already beginning to make valuable contributions. We conclude with a discussion suggesting that statistical research concerning LLMs will likely form a diverse “mosaic” of specialized topics rather than deriving from a single unifying theory, and highlighting the importance of timely engagement by our statistics community in LLM research.

Article 1659

Title@2025-05-25 (7): ADGSyn: Dual-Stream Learning for Efficient Anticancer Drug Synergy Prediction

Title: ADGSyn: Dual-Stream Learning for Efficient Anticancer Drug Synergy Prediction

ADGSyn: Dual-Stream-Lernen für effiziente Anti-Krebs-Arzneimittel-Synergie-Vorhersage

ADGSyn:双层学习促进高效抗癌药物协同效应预测 2505.19144v1

Authors: Yuxuan Nie, Yutong Song, Hong Peng

Drug combinations play a critical role in cancer therapy by significantly enhancing treatment efficacy and overcoming drug resistance. However, the combinatorial space of possible drug pairs grows exponentially, making experimental screening highly impractical. Therefore, developing efficient computational methods to predict promising drug combinations and guide experimental validation is of paramount importance. In this work, we propose ADGSyn, an innovative method for predicting drug synergy. The key components of our approach include: (1) shared projection matrices combined with attention mechanisms to enable cross-drug feature alignment; (2) automatic mixed precision (AMP)-optimized graph operations that reduce memory consumption by 40% while accelerating training speed threefold; and (3) residual pathways stabilized by LayerNorm to ensure stable gradient propagation during training. Evaluated on the O’Neil dataset containing 13,243 drug–cell line combinations, ADGSyn demonstrates superior performance over eight baseline methods. Moreover, the framework supports full-batch processing of up to 256 molecular graphs on a single GPU, setting a new standard for efficiency in drug synergy prediction within the field of computational oncology.

Article 1660

Title@2025-05-25 (7): AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning

Title: AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning

AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering durch Verstärkungslernen

AdaCot:通过强化学习开拓探索的探索链 2505.11896v2

Authors: Chenwei Lou, Zewei Sun, Xinnian Liang, Meng Qu, Wei Shen, Wenqi Wang, Yuntao Li, Qingping Yang, Shuangzhi Wu

Large Language Models (LLMs) have demonstrated remarkable capabilities but often face challenges with tasks requiring sophisticated reasoning. While Chain-of-Thought (CoT) prompting significantly enhances reasoning, it indiscriminately generates lengthy reasoning steps for all queries, leading to substantial computational costs and inefficiency, especially for simpler inputs. To address this critical issue, we introduce AdaCoT (Adaptive Chain-of-Thought), a novel framework enabling LLMs to adaptively decide when to invoke CoT. AdaCoT frames adaptive reasoning as a Pareto optimization problem that seeks to balance model performance with the costs associated with CoT invocation (both frequency and computational overhead). We propose a reinforcement learning (RL) based method, specifically utilizing Proximal Policy Optimization (PPO), to dynamically control the CoT triggering decision boundary by adjusting penalty coefficients, thereby allowing the model to determine CoT necessity based on implicit query complexity. A key technical contribution is Selective Loss Masking (SLM), designed to counteract decision boundary collapse during multi-stage RL training, ensuring robust and stable adaptive triggering. Experimental results demonstrate that AdaCoT successfully navigates the Pareto frontier, achieving substantial reductions in CoT usage for queries not requiring elaborate reasoning. For instance, on our production traffic testset, AdaCoT reduced CoT triggering rates to as low as 3.18% and decreased average response tokens by 69.06%, while maintaining high performance on complex tasks.

Article 1661

Title@2025-05-25 (7): CER: Confidence Enhanced Reasoning in LLMs

Title: CER: Confidence Enhanced Reasoning in LLMs

CER: Vertrauen in LLMs gestärkte Vernunft

CER: LLM 中增强信任的理由 2502.14634v2

Authors: Ali Razghandi, Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah

Ensuring the reliability of Large Language Models (LLMs) in complex reasoning tasks remains a formidable challenge, particularly in scenarios that demand precise mathematical calculations and knowledge-intensive open-domain generation. In this work, we introduce an uncertainty-aware framework designed to enhance the accuracy of LLM responses by systematically incorporating model confidence at critical decision points. We propose an approach that encourages multi-step reasoning in LLMs and quantifies the confidence of intermediate answers, such as numerical results in mathematical reasoning and proper nouns in open-domain generation. Then, the overall confidence of each reasoning chain is evaluated based on the confidence of these critical intermediate steps. Finally, we aggregate the answers of the generated response paths in a way that reflects the reliability of each generated response (as opposed to self-consistency, in which each generated chain contributes equally to majority voting). We conducted extensive experiments on five datasets, three mathematical datasets and two open-domain datasets, using four LLMs. The results consistently validate the effectiveness of our novel confidence aggregation method, leading to an accuracy improvement of up to 7.4% and 5.8% over baseline approaches in math and open-domain generation tasks, respectively. Code is publicly available at https://github.com/Aquasar11/CER.
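
The contrast between equal-weight majority voting and confidence-weighted aggregation can be illustrated with a small sketch. The abstract does not say how step confidences are combined into a chain confidence; taking their product is an assumption made here, and the function names are hypothetical rather than taken from the released code.

```python
import math
from collections import Counter, defaultdict

def majority_vote(chains):
    """Self-consistency baseline: every chain counts equally."""
    return Counter(answer for answer, _ in chains).most_common(1)[0][0]

def chain_confidence(step_confidences):
    """Assumed aggregation: product of the intermediate-step confidences."""
    return math.prod(step_confidences)

def confidence_weighted_vote(chains):
    """Sum chain confidences per final answer and return the best-scoring answer.

    chains: list of (final_answer, [confidence of each intermediate step]).
    """
    scores = defaultdict(float)
    for answer, step_confidences in chains:
        scores[answer] += chain_confidence(step_confidences)
    return max(scores, key=scores.get)
```

For example, with one confident chain ending in "42" and two low-confidence chains ending in "41", majority voting picks "41" while the weighted vote picks "42".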

Article 1662

Title@2025-05-25 (7): Uncertainty Quantification for Physics-Informed Neural Networks with Extended Fiducial Inference

Title: Uncertainty Quantification for Physics-Informed Neural Networks with Extended Fiducial Inference

Ungewissheitsquantifizierung für physikinformierte Neuronale Netzwerke mit erweiterter fiduzieller Schlussfolgerung

具有扩展影响推断力的物理成形神经网络的不确定性量化 2505.19136v1

Authors: Frank Shih, Zhenghao Jiang, Faming Liang

Uncertainty quantification (UQ) in scientific machine learning is increasingly critical as neural networks are widely adopted to tackle complex problems across diverse scientific disciplines. For physics-informed neural networks (PINNs), a prominent model in scientific machine learning, uncertainty is typically quantified using Bayesian or dropout methods. However, both approaches suffer from a fundamental limitation: the prior distribution or dropout rate required to construct honest confidence sets cannot be determined without additional information. In this paper, we propose a novel method within the framework of extended fiducial inference (EFI) to provide rigorous uncertainty quantification for PINNs. The proposed method leverages a narrow-neck hyper-network to learn the parameters of the PINN and quantify their uncertainty based on imputed random errors in the observations. This approach overcomes the limitations of Bayesian and dropout methods, enabling the construction of honest confidence sets based solely on observed data. This advancement represents a significant breakthrough for PINNs, greatly enhancing their reliability, interpretability, and applicability to real-world scientific and engineering challenges. Moreover, it establishes a new theoretical framework for EFI, extending its application to large-scale models, eliminating the need for sparse hyper-networks, and significantly improving the automaticity and robustness of statistical inference.

Article 1663

Title@2025-05-25 (7): Incentivizing High-Quality Human Annotations with Golden Questions

Title: Incentivizing High-Quality Human Annotations with Golden Questions

Anreize für hochwertige menschliche Anmerkungen mit goldenen Fragen

以金质问题激励高品质人文说明 2505.19134v1

Authors: Shang Liu, Zhongze Cai, Hanzhao Wang, Zhongyao Ma, Xiaocheng Li

Human-annotated data plays a vital role in training large language models (LLMs), such as in supervised fine-tuning and human preference alignment. However, it is not guaranteed that paid human annotators produce high-quality data. In this paper, we study how to incentivize human annotators to do so. We start from a principal-agent model to model the dynamics between the company (the principal) and the annotator (the agent), where the principal can only monitor the annotation quality by examining $n$ samples. We investigate the maximum likelihood estimators (MLE) and the corresponding hypothesis testing to incentivize annotators: the agent is given a bonus if the MLE passes the test. By analyzing the variance of the outcome, we show that the strategic behavior of the agent makes the hypothesis testing very different from traditional ones: Unlike the exponential rate proved by the large deviation theory, the principal-agent model’s hypothesis testing rate is of $\Theta(1/\sqrt{n \log n})$. Our theory implies two criteria for the golden questions to monitor the performance of the annotators: they should be of (1) high certainty and (2) similar format to normal ones. In that light, we select a set of golden questions in human preference data. By doing incentive-compatible experiments, we find out that the annotators’ behavior is better revealed by those golden questions, compared to traditional survey techniques such as instructed manipulation checks.

Article 1664

Title@2025-05-25 (7): Fast and Accurate Power Load Data Completion via Regularization-optimized Low-Rank Factorization

Title: Fast and Accurate Power Load Data Completion via Regularization-optimized Low-Rank Factorization

Schnelle und präzise Leistungslastdatenvervollständigung über Regularisierungsoptimierte Low-Rank-Fabrikisierung

通过正规化、优化低射速电荷因子化完成快速和准确电源负载数据 2505.19133v1

Authors: Yan Xia, Hao Feng, Hongwei Sun, Junjie Wang, Qicong Hu

Low-rank representation learning has emerged as a powerful tool for recovering missing values in power load data due to its ability to exploit the inherent low-dimensional structures of spatiotemporal measurements. Among various techniques, low-rank factorization models are favoured for their efficiency and interpretability. However, their performance is highly sensitive to the choice of regularization parameters, which are typically fixed or manually tuned, resulting in limited generalization capability or slow convergence in practical scenarios. In this paper, we propose a Regularization-optimized Low-Rank Factorization, which introduces a Proportional-Integral-Derivative controller to adaptively adjust the regularization coefficient. Furthermore, we provide a detailed algorithmic complexity analysis, showing that our method preserves the computational efficiency of stochastic gradient descent while improving adaptivity. Experimental results on real-world power load datasets validate the superiority of our method in both imputation accuracy and training efficiency compared to existing baselines.
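
To make the control idea concrete, here is a minimal sketch of a discrete PID controller nudging a regularization coefficient. The abstract does not say which error signal drives the controller or how its output enters the update, so the additive update, the clipping at `lam_min`, the gains, and all names below are hypothetical.

```python
class PIDController:
    """Discrete proportional-integral-derivative controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error):
        """Return the control signal for the current error value."""
        self.integral += error
        derivative = 0.0 if self.prev_error is None else error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def adapt_regularization(lam, error, controller, lam_min=0.0):
    """Assumed rule: add the control signal to lambda and keep it non-negative."""
    return max(lam_min, lam + controller.update(error))
```

In a training loop one might call `adapt_regularization` once per epoch, feeding in a signal such as the gap between validation and training error. That choice of signal is an assumption for this example, not something stated in the abstract.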

Article 1665

Title@2025-05-25 (7): Rank-One Modified Value Iteration

Title: Rank-One Modified Value Iteration

Rang eins geänderte Wert Iteration

Rank-One 修改值迭代 2505.01828v2

Authors: Arman Sharifi Kolarijani, Tolga Ok, Peyman Mohajerin Esfahani, Mohamad Amin Sharif Kolarijani

In this paper, we provide a novel algorithm for solving planning and learning problems of Markov decision processes. The proposed algorithm follows a policy iteration-type update by using a rank-one approximation of the transition probability matrix in the policy evaluation step. This rank-one approximation is closely related to the stationary distribution of the corresponding transition probability matrix, which is approximated using the power method. We provide theoretical guarantees for the convergence of the proposed algorithm to optimal (action-)value function with the same rate and computational complexity as the value iteration algorithm in the planning problem and as the Q-learning algorithm in the learning problem. Through our extensive numerical simulations, however, we show that the proposed algorithm consistently outperforms first-order algorithms and their accelerated versions for both planning and learning problems.
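
The power method mentioned above can be sketched as follows. This is a minimal illustration, assuming a row-stochastic transition matrix given as nested lists; the function names are hypothetical. The rank-one matrix built here has every row equal to the stationary distribution, which is one natural reading of the abstract and not necessarily the exact form the authors use.

```python
def stationary_distribution(P, iters=1000, tol=1e-12):
    """Approximate d with d = d P by repeated left-multiplication."""
    n = len(P)
    d = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(d[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, d)) < tol:
            return nxt
        d = nxt
    return d


def rank_one_approximation(d):
    """Rank-one matrix whose rows all equal d (assumed form)."""
    return [list(d) for _ in d]
```

For the two-state chain `[[0.9, 0.1], [0.5, 0.5]]` the iteration converges to roughly `(5/6, 1/6)`, since the second eigenvalue of that chain is 0.4.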

Article 1666

Title@2025-05-25 (7): Natural Language Generation from Visual Events: Challenges and Future Directions

Title: Natural Language Generation from Visual Events: Challenges and Future Directions

Natürliche Sprachgenerierung aus visuellen Veranstaltungen: Herausforderungen und Zukunftsrichtungen

从视觉活动中产生自然语言:挑战和未来方向 2502.13034v2

Authors: Aditya K Surikuchi, Raquel Fernández, Sandro Pezzelle

The ability to use natural language to talk about visual events is at the core of human intelligence and a crucial feature of any artificial intelligence system. In recent years, a substantial body of work in visually grounded NLP has focused on describing content depicted in single images. By contrast, comparatively less attention has been devoted to exhaustively modeling scenarios in which natural language is employed to interpret and talk about events presented through videos or sequences of images. In this position paper, we argue that any NLG task dealing with sequences of images or frames is an instance of the broader, more general problem of modeling the intricate relationships between visual events unfolding over time and the features of the language used to interpret, describe, or narrate them. Therefore, solving these tasks requires models to be capable of identifying and managing such intricacies. We consider five seemingly different tasks, which we argue are compelling instances of this broader multimodal problem. Consistently, we claim that these tasks pose a common set of challenges and share similarities in terms of modeling and evaluation approaches. Building on this perspective, we identify key open questions and propose several research directions for future investigation. We claim that improving language-and-vision models’ understanding of visual events is both timely and essential, given their growing applications. Additionally, this challenge offers significant scientific insight, advancing model development through principles of human cognition and language use.

Article 1667

Title: Interacting Large Language Model Agents. Interpretable Models and Social Learning

Interagieren von Large Language Model Agents. Interpretierbare Modelle und soziales Lernen

跨大语言示范工具、可解释模型和社会学习 2411.01271v2

Authors: Adit Jain, Vikram Krishnamurthy

This paper discusses the theory and algorithms for interacting large language model agents (LLMAs) using methods from statistical signal processing and microeconomics. While both fields are mature, their application to decision-making involving interacting LLMAs remains unexplored. Motivated by Bayesian sentiment analysis on online platforms, we construct interpretable models and algorithms that enable LLMAs to interact and perform Bayesian inference. Because interacting LLMAs learn from both prior decisions and external inputs, they can exhibit bias and herding behavior. Thus, developing interpretable models and stochastic control algorithms is essential to understand and mitigate these behaviors. This paper has three main results. First, we show using Bayesian revealed preferences from microeconomics that an individual LLMA satisfies the necessary and sufficient conditions for rationally inattentive (bounded rationality) Bayesian utility maximization and, given an observation, the LLMA chooses an action that maximizes a regularized utility. Second, we utilize Bayesian social learning to construct interpretable models for LLMAs that interact sequentially with each other and the environment while performing Bayesian inference. Our proposed models capture the herding behavior exhibited by interacting LLMAs. Third, we propose a stochastic control framework to delay herding and improve state estimation accuracy under two settings: (a) centrally controlled LLMAs and (b) autonomous LLMAs with incentives. We demonstrate the effectiveness of our methods on real datasets for hate speech classification and product quality assessment, using open-source models like LLaMA and closed-source models like ChatGPT. The main takeaway of this paper, based on empirical analysis and mathematical formalism, is that LLMAs act as rationally bounded Bayesian agents that exhibit social learning when interacting.

Article 1668

Title@2025-05-25 (7): Adaptive Sensor Steering Strategy Using Deep Reinforcement Learning for Dynamic Data Acquisition in Digital Twins

Title: Adaptive Sensor Steering Strategy Using Deep Reinforcement Learning for Dynamic Data Acquisition in Digital Twins

Adaptive Sensorlenkungsstrategie mit tief greifendem Verstärkungslernen für die dynamische Datenerfassung in digitalen Zwillingen

利用深强化学习促进数字双对动态数据采集的适应感感感感指导战略 2504.10248v2

Authors: Collins O. Ogbodo, Timothy J. Rogers, Mattia Dal Borgo, David J. Wagg

This paper introduces a sensor steering methodology based on deep reinforcement learning to enhance the predictive accuracy and decision support capabilities of digital twins by optimising the data acquisition process. Traditional sensor placement techniques are often constrained by one-off optimisation strategies, which limit their applicability for online applications requiring continuous informative data assimilation. The proposed approach addresses this limitation by offering an adaptive framework for sensor placement within the digital twin paradigm. The sensor placement problem is formulated as a Markov decision process, enabling the training and deployment of an agent capable of dynamically repositioning sensors in response to the evolving conditions of the physical structure as represented by the digital twin. This ensures that the digital twin maintains a highly representative and reliable connection to its physical counterpart. The proposed framework is validated through a series of comprehensive case studies involving a cantilever plate structure subjected to diverse conditions, including healthy and damaged conditions. The results demonstrate the capability of the deep reinforcement learning agent to adaptively reposition sensors improving the quality of data acquisition and hence enhancing the overall accuracy of digital twins.

Article 1669

Title@2025-05-25 (7): Birch SGD: A Tree Graph Framework for Local and Asynchronous SGD Methods

Title: Birch SGD: A Tree Graph Framework for Local and Asynchronous SGD Methods

Birke SGD: Ein Baumdiagramm-Framework für lokale und asynchrone SGD-Methoden

Birch SGD: 当地和非同步 SGD 方法树图框架 2505.09218v2

Authors: Alexander Tyurin, Danil Sivtsov

We propose a new unifying framework, Birch SGD, for analyzing and designing distributed SGD methods. The central idea is to represent each method as a weighted directed tree, referred to as a computation tree. Leveraging this representation, we introduce a general theoretical result that reduces convergence analysis to studying the geometry of these trees. This perspective yields a purely graph-based interpretation of optimization dynamics, offering a new and intuitive foundation for method development. Using Birch SGD, we design eight new methods and analyze them alongside previously known ones, with at least six of the new methods shown to have optimal computational time complexity. Our research leads to two key insights: (i) all methods share the same “iteration rate” of $O\left(\frac{(R + 1) L \Delta}{\varepsilon} + \frac{\sigma^2 L \Delta}{\varepsilon^2}\right)$, where $R$ is the maximum “tree distance” along the main branch of a tree; and (ii) different methods exhibit different trade-offs: for example, some update iterates more frequently, improving practical performance, while others are more communication-efficient or focus on other aspects. Birch SGD serves as a unifying framework for navigating these trade-offs. We believe these results provide a unified foundation for understanding, analyzing, and designing efficient asynchronous and parallel optimization methods.

Article 1670

Title@2025-05-25 (7): Deep Active Speech Cancellation with Mamba-Masking Network

Title: Deep Active Speech Cancellation with Mamba-Masking Network

Deep Active Speech Stornierung mit Mamba-Masking Network

使用 Mamba- Masking 网络的深活动语音取消 2502.01185v2

Authors: Yehuda Mishaly, Lior Wolf, Eliya Nachmani

We present a novel deep learning network for Active Speech Cancellation (ASC), advancing beyond Active Noise Cancellation (ANC) methods by effectively canceling both noise and speech signals. The proposed Mamba-Masking architecture introduces a masking mechanism that directly interacts with the encoded reference signal, enabling adaptive and precisely aligned anti-signal generation, even under rapidly changing, high-frequency conditions, as commonly found in speech. Complementing this, a multi-band segmentation strategy further improves phase alignment across frequency bands. Additionally, we introduce an optimization-driven loss function that provides near-optimal supervisory signals for anti-signal generation. Experimental results demonstrate substantial performance gains, achieving up to 7.2 dB improvement in ANC scenarios and 6.2 dB in ASC, significantly outperforming existing methods.

Article 1671

Title@2025-05-25 (7): Exploring Magnitude Preservation and Rotation Modulation in Diffusion Transformers

Title: Exploring Magnitude Preservation and Rotation Modulation in Diffusion Transformers

Erforschung der Magnitudenerhaltung und Rotationsmodulation in Diffusionstransformatoren

在扩散变异器中探索磁力保护与旋转调节 2505.19122v1

Authors: Eric Tillman Bill, Cristian Perez Jensen, Sotiris Anagnostidis, Dimitri von Rütte

Denoising diffusion models exhibit remarkable generative capabilities, but remain challenging to train due to their inherent stochasticity, where high-variance gradient estimates lead to slow convergence. Previous works have shown that magnitude preservation helps with stabilizing training in the U-net architecture. This work explores whether this effect extends to the Diffusion Transformer (DiT) architecture. As such, we propose a magnitude-preserving design that stabilizes training without normalization layers. Motivated by the goal of maintaining activation magnitudes, we additionally introduce rotation modulation, which is a novel conditioning method using learned rotations instead of traditional scaling or shifting. Through empirical evaluations and ablation studies on small-scale models, we show that magnitude-preserving strategies significantly improve performance, notably reducing FID scores by $\sim$12.8%. Further, we show that rotation modulation combined with scaling is competitive with AdaLN, while requiring $\sim$5.4% fewer parameters. This work provides insights into conditioning strategies and magnitude control. We will publicly release the implementation of our method.

Article 1672

Title@2025-05-25 (7): FP4 All the Way: Fully Quantized Training of LLMs

Title: FP4 All the Way: Fully Quantized Training of LLMs

FP4: Vollständig quantifizierte Ausbildung von LLMs

FP4 全程:充分量化的LLM培训 2505.19115v1

Authors: Brian Chmiel, Maxim Fishman, Ron Banner, Daniel Soudry

We demonstrate, for the first time, fully quantized training (FQT) of large language models (LLMs) using predominantly 4-bit floating-point (FP4) precision for weights, activations, and gradients on datasets up to 200 billion tokens. We extensively investigate key design choices for FP4, including block sizes, scaling formats, and rounding methods. Our analysis shows that the NVFP4 format, where each block of 16 FP4 values (E2M1) shares a scale represented in E4M3, provides optimal results. We use stochastic rounding for backward and update passes and round-to-nearest for the forward pass to enhance stability. Additionally, we identify a theoretical and empirical threshold for effective quantized training: when the gradient norm falls below approximately $\sqrt{3}$ times the quantization noise, quantized training becomes less effective. Leveraging these insights, we successfully train a 7-billion-parameter model on 256 Intel Gaudi2 accelerators. The resulting FP4-trained model achieves downstream task performance comparable to a standard BF16 baseline, confirming that FP4 training is a practical and highly efficient approach for large-scale LLM training. A reference implementation is supplied in https://github.com/Anonymous1252022/fp4-all-the-way .
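
The two rounding modes named above can be illustrated on the E2M1 value grid, whose non-negative magnitudes are 0, 0.5, 1, 1.5, 2, 3, 4 and 6. The sketch below is a simplification under stated assumptions: the shared block scale is kept as an ordinary Python float instead of being stored in E4M3, ties in round-to-nearest go to the lower grid value, and all function names are hypothetical. It is not the reference implementation linked in the abstract.

```python
import math
import random

# Non-negative magnitudes representable in E2M1.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def round_nearest(mag):
    """Deterministic rounding to the closest grid value (ties go down)."""
    return min(E2M1_GRID, key=lambda g: abs(g - mag))


def round_stochastic(mag, rng):
    """Round to a neighbouring grid value so that the result is unbiased."""
    mag = min(mag, E2M1_GRID[-1])
    for lo, hi in zip(E2M1_GRID, E2M1_GRID[1:]):
        if lo <= mag <= hi:
            p_hi = (mag - lo) / (hi - lo)  # probability of rounding up
            return hi if rng.random() < p_hi else lo
    return E2M1_GRID[-1]


def quantize_block(block, stochastic=False, rng=None):
    """Quantize one block with a shared scale mapping max |x| to 6."""
    rng = rng or random.Random(0)
    amax = max(abs(x) for x in block)
    scale = amax / E2M1_GRID[-1] if amax > 0 else 1.0  # assumed: float scale
    out = []
    for x in block:
        mag = abs(x) / scale
        q = round_stochastic(mag, rng) if stochastic else round_nearest(mag)
        out.append(math.copysign(q * scale, x))
    return out
```

Following the abstract, `stochastic=False` would correspond to the forward pass and `stochastic=True` to the backward and update passes; averaging many stochastic roundings of the same value recovers that value in expectation, which is the property that keeps gradient estimates unbiased.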

Article 1673

Title@2025-05-25 (7): Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling

Title: Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling

Verwandeln von Müll in Schatz: Beschleunigen von Inferenzen von großen Sprachmodellen mit Token-Recycling

将垃圾变成宝库:利用 Token 回收加快大语言模型的推理 2408.08696v3

Authors: Xianzhen Luo, Yixuan Wang, Qingfu Zhu, Zhiming Zhang, Xuanyu Zhang, Qing Yang, Dongliang Xu

The massive parameter counts of LLMs have made inference latency a fundamental bottleneck. Speculative decoding represents a lossless approach to accelerate inference through a guess-and-verify paradigm. Some methods rely on additional architectures to guess draft tokens, which need extra training before use. Alternatively, retrieval-based training-free techniques build libraries from pre-existing corpora or by n-gram generation. However, they face challenges like large storage requirements, time-consuming retrieval, and limited adaptability. Observing that candidate tokens generated during the decoding process are likely to reoccur in future sequences, we propose Token Recycling. It stores candidate tokens in an adjacency matrix and employs a breadth-first-search (BFS)-like algorithm to construct a draft tree, which is then validated through tree attention. New candidate tokens from the decoding process are then used to update the matrix. Token Recycling requires <2MB of additional storage and achieves approximately 2x speedup across all sizes of LLMs. It significantly outperforms existing training-free methods by 30% and even a widely recognized training method by 25%.
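
The store-and-expand loop described above can be sketched in a few lines. This is an illustrative simplification, not the authors' implementation: the "adjacency matrix" is modelled as a dictionary from a token to its stored candidate successors, the tree is a flat list of `(token, parent_index)` pairs, and the names `build_draft_tree` and `update_matrix` as well as the node and depth budgets are assumptions. Verification through tree attention is omitted.

```python
from collections import deque


def build_draft_tree(adj, root, max_nodes=8, max_depth=3):
    """Expand stored candidates breadth-first into a draft tree."""
    tree = [(root, -1)]       # (token, index of parent); root has no parent
    queue = deque([(0, 0)])   # (node index in tree, depth)
    while queue and len(tree) < max_nodes:
        idx, depth = queue.popleft()
        if depth == max_depth:
            continue
        for cand in adj.get(tree[idx][0], []):
            if len(tree) >= max_nodes:
                break
            tree.append((cand, idx))
            queue.append((len(tree) - 1, depth + 1))
    return tree


def update_matrix(adj, token, candidates, k=2):
    """Overwrite the stored successors of `token` with the newest top-k."""
    adj[token] = list(candidates)[:k]
```

After each verification step, the candidates the model produced for every position would be written back with `update_matrix`, so the next draft tree reflects the most recent decoding context.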

Article 1674

Title@2025-05-25 (7): Stochastic Compositional Optimization with Compositional Constraints

Title: Stochastic Compositional Optimization with Compositional Constraints

Stochastische kompositorische Optimierung mit kompositorischen Einschränkungen

具有组成限制的斯托具组成优化 2209.04086v2

Authors: Shuoguang Yang, Wei You, Zhe Zhang, Ethan X. Fang

Stochastic compositional optimization (SCO) has attracted considerable attention because of its broad applicability to important real-world problems. However, existing works on SCO assume that the projection within a solution update is simple, which fails to hold for problem instances where the constraints are in the form of expectations, such as empirical conditional value-at-risk constraints. We study a novel model that incorporates single-level expected value and two-level compositional constraints into the current SCO framework. Our model can be applied widely to data-driven optimization and risk management, including risk-averse optimization and high-moment portfolio selection, and can handle multiple constraints. We further propose a class of primal-dual algorithms that generates sequences converging to the optimal solution at the rate of $\mathcal{O}(\frac{1}{\sqrt{N}})$ under both single-level expected value and two-level compositional constraints, where $N$ is the iteration counter, establishing the benchmarks in expected value constrained SCO.

Article 1675

Title@2025-05-25 (7): An Interpretable Representation Learning Approach for Diffusion Tensor Imaging

Title: An Interpretable Representation Learning Approach for Diffusion Tensor Imaging

Ein interpretierbarer Representations-Lernansatz für Diffusion Tensor Imaging

传播显像成像的可解释代表性学习方法 2505.19110v1

Authors: Vishwa Mohan Singh, Alberto Gaston Villagran Asiares, Luisa Sophie Schuhmacher, Kate Rendall, Simon Weißbrod, David Rügamer, Inga Körte

Diffusion Tensor Imaging (DTI) tractography offers detailed insights into the structural connectivity of the brain, but presents challenges in effective representation and interpretation in deep learning models. In this work, we propose a novel 2D representation of DTI tractography that encodes tract-level fractional anisotropy (FA) values into a 9x9 grayscale image. This representation is processed through a Beta-Total Correlation Variational Autoencoder with a Spatial Broadcast Decoder to learn a disentangled and interpretable latent embedding. We evaluate the quality of this embedding using supervised and unsupervised representation learning strategies, including auxiliary classification, triplet loss, and SimCLR-based contrastive learning. Compared to the 1D Group deep neural network (DNN) baselines, our approach improves the F1 score in a downstream sex classification task by 15.74% and shows a better disentanglement than the 3D representation.

Article 1676

Title@2025-05-25 (7): Optimization-Inspired Few-Shot Adaptation for Large Language Models

Title: Optimization-Inspired Few-Shot Adaptation for Large Language Models

Optimization-Inspired Wenig-Shot-Anpassung für große Sprachmodelle

优化- 激发了对大语言模型的微热适应 2505.19107v1

Authors: Boyan Gao, Xin Wang, Yibo Yang, David Clifton

Large Language Models (LLMs) have demonstrated remarkable performance in real-world applications. However, adapting LLMs to novel tasks via fine-tuning often requires substantial training data and computational resources that are impractical in few-shot scenarios. Existing approaches, such as in-context learning and Parameter-Efficient Fine-Tuning (PEFT), face key limitations: in-context learning introduces additional inference computational overhead with limited performance gains, while PEFT models are prone to overfitting on the few demonstration examples. In this work, we reinterpret the forward pass of LLMs as an optimization process, a sequence of preconditioned gradient descent steps refining internal representations. Based on this connection, we propose Optimization-Inspired Few-Shot Adaptation (OFA), integrating a parameterization that learns preconditioners without introducing additional trainable parameters, and an objective that improves optimization efficiency by learning preconditioners based on a convergence bound, while simultaneously steering the optimization path toward the flat local minimum. Our method overcomes both issues of ICL-based and PEFT-based methods, and demonstrates superior performance over the existing methods on a variety of few-shot adaptation tasks in experiments.

Article 1677

Title@2025-05-25 (7): Statistical inference for Linear Stochastic Approximation with Markovian Noise

Title: Statistical inference for Linear Stochastic Approximation with Markovian Noise

Statistische Schlussfolgerung zur linearen stochastischen Annäherung an Markovsche Geräusche

与Markovian噪音的线性斯托口接近的统计推推 2505.19102v1

Authors: Sergey Samsonov, Marina Sheshukova, Eric Moulines, Alexey Naumov

In this paper, we derive non-asymptotic Berry-Esseen bounds for Polyak-Ruppert averaged iterates of the Linear Stochastic Approximation (LSA) algorithm driven by Markovian noise. Our analysis yields $\mathcal{O}(n^{-1/4})$ convergence rates to the Gaussian limit in the Kolmogorov distance. We further establish the non-asymptotic validity of a multiplier block bootstrap procedure for constructing confidence intervals, guaranteeing consistent inference under Markovian sampling. Our work provides the first non-asymptotic guarantees on the rate of convergence of bootstrap-based confidence intervals for stochastic approximation with Markov noise. Moreover, we recover the classical rate of order $\mathcal{O}(n^{-1/8})$ up to logarithmic factors for estimating the asymptotic variance of the iterates of the LSA algorithm.
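
To make the object of study concrete, here is a toy scalar instance of LSA with Polyak-Ruppert averaging under Markovian noise. Everything specific in it is an assumption chosen for illustration: the two-state chain with stay probability 0.9, the per-state coefficients, the step size $0.5/\sqrt{k+1}$, and the function name. The stationary distribution of the chain is uniform, so the averaged system is $\bar{A}\theta = \bar{b}$ with $\bar{A} = 1$ and $\bar{b} = 1$, giving the target $\theta^\star = 1$.

```python
import random


def averaged_lsa(n_steps, seed=0):
    """Polyak-Ruppert average of scalar LSA iterates on a 2-state chain."""
    rng = random.Random(seed)
    A = (1.5, 0.5)  # state-dependent coefficient, stationary mean 1.0
    b = (2.0, 0.0)  # state-dependent offset, stationary mean 1.0
    state, theta, running_sum = 0, 0.0, 0.0
    for k in range(n_steps):
        step = 0.5 / (k + 1) ** 0.5
        # LSA update using the observation attached to the current state.
        theta -= step * (A[state] * theta - b[state])
        running_sum += theta
        # The chain stays put with probability 0.9, otherwise flips.
        if rng.random() > 0.9:
            state = 1 - state
    return running_sum / n_steps
```

The returned average fluctuates around 1 with a spread that shrinks as `n_steps` grows; the bounds in the abstract quantify how quickly the distribution of such averages approaches its Gaussian limit.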

Article 1678

Title@2025-05-25 (7): Towards Robust Influence Functions with Flat Validation Minima

Title: Towards Robust Influence Functions with Flat Validation Minima

Auf dem Weg zu robusten Einflussfunktionen mit Flat Validation Minima

以平滑校准微型方式向强力影响函数方向 2505.19097v1

Authors: Xichen Ye, Yifan Wu, Weizhong Zhang, Cheng Jin, Yifan Chen

The Influence Function (IF) is a widely used technique for assessing the impact of individual training samples on model predictions. However, existing IF methods often fail to provide reliable influence estimates in deep neural networks, particularly when applied to noisy training data. This issue does not stem from inaccuracies in parameter change estimation, which has been the primary focus of prior research, but rather from deficiencies in loss change estimation, specifically due to the sharpness of validation risk. In this work, we establish a theoretical connection between influence estimation error, validation set risk, and its sharpness, underscoring the importance of flat validation minima for accurate influence estimation. Furthermore, we introduce a novel estimation form of Influence Function specifically designed for flat validation minima. Experimental results across various tasks validate the superiority of our approach.

Article 1679

Title@2025-05-25 (7): A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random

Title: A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random

Ein einheitliches Framework zur variablen Auswahl im modellbasierten Clustering mit Fehlen nicht zufällig

以模型为基础的集束模式中变量选择的统一框架, 随机不失踪 2505.19093v1

Authors: Binh H. Ho, Long Nguyen Chi, TrungTin Nguyen, Binh T. Nguyen, Van Ha Hoang, Christopher Drovandi

Model-based clustering integrated with variable selection is a powerful tool for uncovering latent structures within complex data. However, its effectiveness is often hindered by challenges such as identifying relevant variables that define heterogeneous subgroups and handling data that are missing not at random, a prevalent issue in fields like transcriptomics. While several notable methods have been proposed to address these problems, they typically tackle each issue in isolation, thereby limiting their flexibility and adaptability. This paper introduces a unified framework designed to address these challenges simultaneously. Our approach incorporates a data-driven penalty matrix into penalized clustering to enable more flexible variable selection, along with a mechanism that explicitly models the relationship between missingness and latent class membership. We demonstrate that, under certain regularity conditions, the proposed framework achieves both asymptotic consistency and selection consistency, even in the presence of missing data. This unified strategy significantly enhances the capability and efficiency of model-based clustering, advancing methodologies for identifying informative variables that define homogeneous subgroups in the presence of complex missing data patterns. The performance of the framework, including its computational efficiency, is evaluated through simulations and demonstrated using both synthetic and real-world transcriptomic datasets.

Article 1680

Title@2025-05-25 (7): ReadBench: Measuring the Dense Text Visual Reading Ability of Vision-Language Models

Title: ReadBench: Measuring the Dense Text Visual Reading Ability of Vision-Language Models

ReadBench: Vermessen der Dichte an Text Visuelle Lesefähigkeit von Vision-Sprachen-Modellen

” 阅读 “ :衡量视觉-语言模型的阅读能力 2505.19091v1

Authors: Benjamin Clavié, Florian Brand

Recent advancements in Large Vision-Language Models (VLMs) have greatly enhanced their capability to jointly process text and images. However, despite extensive benchmarks evaluating visual comprehension (e.g., diagrams, color schemes, OCR tasks…), there is limited assessment of VLMs’ ability to read and reason about text-rich images effectively. To fill this gap, we introduce ReadBench, a multimodal benchmark specifically designed to evaluate the reading comprehension capabilities of VLMs. ReadBench transposes contexts from established text-only benchmarks into images of text while keeping textual prompts and questions intact. Evaluating leading VLMs with ReadBench, we find minimal-but-present performance degradation on short text-image inputs, while performance sharply declines for longer, multi-page contexts. Our experiments further reveal that text resolution has negligible effects on multimodal performance. These findings highlight needed improvements in VLMs, particularly their reasoning over visually presented extensive textual content, a capability critical for practical applications. ReadBench is available at https://github.com/answerdotai/ReadBench .

Article 1681

Title@2025-05-25 (7): CMoS: Rethinking Time Series Prediction Through the Lens of Chunk-wise Spatial Correlations

Title: CMoS: Rethinking Time Series Prediction Through the Lens of Chunk-wise Spatial Correlations

CMoS: Die Vorhersage der Zeitreihen durch die Linse der spaltweisen räumlichen Korrelationen neu denken

CMoS: 重新思考时间序列,通过整节空间交汇的镜头预测 2505.19090v1

Authors: Haotian Si, Changhua Pei, Jianhui Li, Dan Pei, Gaogang Xie

Recent advances in lightweight time series forecasting models suggest the inherent simplicity of time series forecasting tasks. In this paper, we present CMoS, a super-lightweight time series forecasting model. Instead of learning the embedding of the shapes, CMoS directly models the spatial correlations between different time series chunks. Additionally, we introduce a Correlation Mixing technique that enables the model to capture diverse spatial correlations with minimal parameters, and an optional Periodicity Injection technique to ensure faster convergence. Despite using as little as 1% of the parameter count of the lightweight model DLinear, CMoS outperforms existing state-of-the-art models across multiple datasets in our experiments. Furthermore, the learned weights of CMoS exhibit great interpretability, providing practitioners with valuable insights into temporal structures within specific application scenarios.

Article 1682

Title@2025-05-25 (7): Temperature is All You Need for Generalization in Langevin Dynamics and other Markov Processes

Title: Temperature is All You Need for Generalization in Langevin Dynamics and other Markov Processes

Temperatur ist alles, was Sie für die Generalisierung in Langevin Dynamics und anderen Markov-Prozessen benötigen

Langevin Dynamics 和其他Markov 进程需要的温度是全部您需要的普遍化 2505.19087v1

Authors: Itamar Harel, Yonathan Wolanowsky, Gal Vardi, Nathan Srebro, Daniel Soudry

We analyze the generalization gap (gap between the training and test errors) when training a potentially over-parametrized model using a Markovian stochastic training algorithm, initialized from some distribution $\theta_0 \sim p_0$. We focus on Langevin dynamics with a positive temperature $\beta^{-1}$, i.e., gradient descent on a training loss $L$ with infinitesimal step size, perturbed with $\beta^{-1}$-variance Gaussian noise, and lightly regularized or bounded. There, we bound the generalization gap, at any time during training, by $\sqrt{(\beta\mathbb{E} L (\theta_0) + \log(1/\delta))/N}$ with probability $1-\delta$ over the dataset, where $N$ is the sample size, and $\mathbb{E} L (\theta_0) =O(1)$ with standard initialization scaling. In contrast to previous guarantees, our bound does not depend on training time or rely on mixing, nor does it depend on dimensionality, gradient norms, or any other properties of the loss or model. This guarantee follows from a general analysis of any Markov process-based training that has a Gibbs-style stationary distribution. The proof is surprisingly simple, once we observe that the marginal distribution divergence from initialization remains bounded, as implied by a generalized second law of thermodynamics.

Article 1683

Title@2025-05-25 (7): Jodi: Unification of Visual Generation and Understanding via Joint Modeling

Title: Jodi: Unification of Visual Generation and Understanding via Joint Modeling

Jodi: Vereinheitlichung der visuellen Erzeugung und des Verständnisses durch gemeinsame Modellierung

Jodi:通过联合建模统一视觉生成和理解 2505.19084v1

Authors: Yifeng Xu, Zhenliang He, Meina Kan, Shiguang Shan, Xilin Chen

Visual generation and understanding are two deeply interconnected aspects of human intelligence, yet they have been traditionally treated as separate tasks in machine learning. In this paper, we propose Jodi, a diffusion framework that unifies visual generation and understanding by jointly modeling the image domain and multiple label domains. Specifically, Jodi is built upon a linear diffusion transformer along with a role switch mechanism, which enables it to perform three particular types of tasks: (1) joint generation, where the model simultaneously generates images and multiple labels; (2) controllable generation, where images are generated conditioned on any combination of labels; and (3) image perception, where multiple labels can be predicted at once from a given image. Furthermore, we present the Joint-1.6M dataset, which contains 200,000 high-quality images collected from public sources, automatic labels for 7 visual domains, and LLM-generated captions. Extensive experiments demonstrate that Jodi excels in both generation and understanding tasks and exhibits strong extensibility to a wider range of visual domains. Code is available at https://github.com/VIPL-GENUN/Jodi.

Article 1684

Title@2025-05-25 (7): Geometric Determinations Of Characteristic Redshifts From DESI-DR2 BAO and DES-SN5YR Observations: Hints For New Expansion Rate Anomalies

Title: Geometric Determinations Of Characteristic Redshifts From DESI-DR2 BAO and DES-SN5YR Observations: Hints For New Expansion Rate Anomalies

Geometrische Bestimmung charakteristischer Rotverschiebungen aus DESI-DR2 BAO und DES-SN5YR Beobachtungen: Hinweise für neue Erweiterungsraten Anomalien

DESI-DR2 BAO和DES-SN5YR观测的特征红移的几何测定:新膨胀率异常现象的提示 2505.19083v1

Authors: Purba Mukherjee, Anjan A Sen

In this work, we perform a model-agnostic reconstruction of the cosmic expansion history by combining DESI-DR2 BAO and DES-SN5YR data, with a focus on geometric determination of characteristic redshifts where notable tensions in the expansion rate are found to emerge. Employing Gaussian process regression alongside knot-based spline techniques, we reconstruct cosmic distances and their derivatives to pinpoint these characteristic redshifts and infer $E(z)$. Our analysis reveals significant deviations of approximately 4 to 5$\sigma$ from the Planck 2018 $\Lambda$CDM predictions, particularly pronounced in the redshift range $z \sim 0.35-0.55$. These anomalies are consistently observed across both reconstruction methods and combined datasets, indicating robust late-time departures that could signal new physics beyond the standard cosmological framework. The joint use of BAO and SN probes enhances the precision of our constraints, allowing us to isolate these deviations without reliance on specific cosmological assumptions. Our findings underscore the role of characteristic redshifts as sensitive indicators of expansion rate anomalies and motivate further scrutiny with forthcoming datasets from DESI-5YR BAO, Euclid, and LSST. These future surveys will tighten constraints and help distinguish whether these late-time anomalies arise from new fundamental physics or unresolved systematics in the data.

Article 1685

Title@2025-05-25 (7): On Continuity of Robust and Accurate Classifiers

Title: On Continuity of Robust and Accurate Classifiers

Über die Kontinuität von robusten und präzisen Klassifikatoren

关于强力和准确性分类的连续性 2309.17048v2

Authors: Ramin Barati, Reza Safabakhsh, Mohammad Rahmati

The reliability of a learning model is key to the successful deployment of machine learning in various applications. However, it is difficult to describe the phenomenon due to the complicated nature of the problems in machine learning. It has been shown that adversarial training can improve the robustness of the hypothesis. However, this improvement usually comes at the cost of decreased performance on natural samples. Hence, it has been suggested that robustness and accuracy of a hypothesis are at odds with each other. In this paper, we put forth the alternative proposal that it is the continuity of a hypothesis that is incompatible with its robustness and accuracy in many of these scenarios. In other words, a continuous function cannot effectively learn the optimal robust hypothesis. We introduce a framework for a rigorous study of harmonic and holomorphic hypotheses in learning theory terms and provide empirical evidence that continuous hypotheses do not perform as well as discontinuous hypotheses in some common machine learning tasks. From a practical point of view, our results suggest that a robust and accurate learning rule would train different continuous hypotheses for different regions of the domain. From a theoretical perspective, our analysis explains the adversarial examples phenomenon in these situations as a conflict between the continuity of a sequence of functions and its uniform convergence to a discontinuous function. Given that many of the contemporary machine learning models are continuous functions, it is important to theoretically study the continuity of robust and accurate classifiers as it is consequential in their construction, analysis and evaluation.

Article 1686

Title@2025-05-25 (7): Flow Annealed Importance Sampling Bootstrap meets Differentiable Particle Physics

Title: Flow Annealed Importance Sampling Bootstrap meets Differentiable Particle Physics

Flow Annealed Bedeutung Sampling Bootstrap trifft differenzierbare Teilchenphysik

流动的隐形重要性取样器装置符合可区分的粒子物理 2411.16234v2

Authors: Annalena Kofler, Vincent Stimper, Mikhail Mikhasenko, Michael Kagan, Lukas Heinrich

High-energy physics requires the generation of large numbers of simulated data samples from complex but analytically tractable distributions called matrix elements. Surrogate models, such as normalizing flows, are gaining popularity for this task due to their computational efficiency. We adopt an approach based on Flow Annealed importance sampling Bootstrap (FAB) that evaluates the differentiable target density during training and helps avoid the costly generation of training data in advance. We show that FAB reaches higher sampling efficiency with fewer target evaluations in high dimensions in comparison to other methods.

Article 1687

Title@2025-05-25 (7): Cluster-Aware Multi-Round Update for Wireless Federated Learning in Heterogeneous Environments

Title: Cluster-Aware Multi-Round Update for Wireless Federated Learning in Heterogeneous Environments

Cluster-Aware Multi-Round Update für drahtloses Federated Learning in heterogenen Umgebungen

为不同不同环境无线联邦学习提供多功能集群软件多功能更新 2505.06268v2

Authors: Pengcheng Sun, Erwu Liu, Wei Ni, Kanglei Yu, Rui Wang, Abbas Jamalipour

The aggregation efficiency and accuracy of wireless Federated Learning (FL) are significantly affected by resource constraints, especially in heterogeneous environments where devices exhibit distinct data distributions and communication capabilities. This paper proposes a clustering strategy that leverages prior knowledge similarity to group devices with similar data and communication characteristics, mitigating performance degradation from heterogeneity. On this basis, a novel Cluster-Aware Multi-round Update (CAMU) strategy is proposed, which treats clusters as the basic units and adjusts the local update frequency based on the clustered contribution threshold, effectively reducing update bias and enhancing aggregation accuracy. The theoretical convergence of the CAMU strategy is rigorously validated. Meanwhile, based on the convergence upper bound, the local update frequency and transmission power of each cluster are jointly optimized to achieve an optimal balance between computation and communication resources under constrained conditions, significantly improving the convergence efficiency of FL. Experimental results demonstrate that the proposed method effectively improves the model performance of FL in heterogeneous environments and achieves a better balance between communication cost and computational load under limited resources.

Article 1688

Title@2025-05-25 (7): Recalibrating binary probabilistic classifiers

Title: Recalibrating binary probabilistic classifiers

Rekalibrierung von binären probabilistischen Klassifikatoren

重新计算二进制概率分解器 2505.19068v1

Authors: Dirk Tasche

Recalibration of binary probabilistic classifiers to a target prior probability is an important task in areas like credit risk management. We analyse methods for recalibration from a distribution shift perspective. Distribution shift assumptions linked to the area under the curve (AUC) of a probabilistic classifier are found to be useful for the design of meaningful recalibration methods. Two new methods called parametric covariate shift with posterior drift (CSPD) and ROC-based quasi moment matching (QMM) are proposed and tested together with some other methods in an example setting. The outcomes of the test suggest that the QMM methods discussed in the paper can provide appropriately conservative results in evaluations with concave functionals, such as risk weight functions for credit risk.
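
To make the task concrete, here is a minimal sketch of the classical prior-shift (label-shift) adjustment, which rescales the posterior odds by the ratio of target to training prior odds. It is a textbook baseline under the assumption that class-conditional feature distributions do not change, and it is not the CSPD or QMM method proposed in the paper. The function name `recalibrate_to_prior` is hypothetical.

```python
def recalibrate_to_prior(p, train_prior, target_prior):
    """Adjust a posterior probability p under the prior-shift assumption.

    Assumes the class-conditional feature distributions are unchanged and
    only the positive-class prior moves from train_prior to target_prior.
    """
    if not (0.0 < train_prior < 1.0 and 0.0 < target_prior < 1.0):
        raise ValueError("priors must lie strictly between 0 and 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # Reweight each class by the ratio of its target prior to its training prior.
    pos = p * target_prior / train_prior
    neg = (1.0 - p) * (1.0 - target_prior) / (1.0 - train_prior)
    return pos / (pos + neg)
```

For example, a score of 0.5 from a classifier trained on balanced data maps to 0.1 when the target prior is 0.1. Note that the mean of the adjusted scores need not match the target prior exactly, which is one reason more refined recalibration methods are studied.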

Article 1689

Title@2025-05-25 (7): Adversarial Bandit over Bandits: Hierarchical Bandits for Online Configuration Management

Title: Adversarial Bandit over Bandits: Hierarchical Bandits for Online Configuration Management

Adversarial Bandit über Bandits: Hierarchische Bandits für Online-Konfigurationsmanagement

反强盗强盗: 用于在线配置管理的等级强盗 2505.19061v1

Authors: Chen Avin, Zvi Lotker, Shie Mannor, Gil Shabat, Hanan Shteingart, Roey Yadgar

Motivated by dynamic parameter optimization in finite, but large action (configurations) spaces, this work studies the nonstochastic multi-armed bandit (MAB) problem in metric action spaces with oblivious Lipschitz adversaries. We propose ABoB, a hierarchical Adversarial Bandit over Bandits algorithm that can use state-of-the-art existing “flat” algorithms, but additionally clusters similar configurations to exploit local structures and adapt to changing environments. We prove that in the worst-case scenario, such a clustering approach cannot hurt too much and ABoB guarantees a standard worst-case regret bound of $O\left(k^{\frac{1}{2}}T^{\frac{1}{2}}\right)$, where $T$ is the number of rounds and $k$ is the number of arms, matching the traditional flat approach. However, under favorable conditions related to the algorithm properties, cluster properties, and certain Lipschitz conditions, the regret bound can be improved to $O\left(k^{\frac{1}{4}}T^{\frac{1}{2}}\right)$. Simulations and experiments on a real storage system demonstrate that ABoB, using standard algorithms like EXP3 and Tsallis-INF, achieves lower regret and faster convergence than the flat method, with up to 50% improvement in known previous setups, nonstochastic and stochastic, as well as in our settings.
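
As background for the “flat” baseline mentioned above, the following is a minimal sketch of the standard EXP3 algorithm with uniform exploration. It does not implement ABoB's hierarchy or clustering. The function name `exp3`, the `reward_fn(t, arm)` callback, and the default `gamma` are assumptions made for illustration; rewards are assumed to lie in [0, 1].

```python
import math
import random


def exp3(reward_fn, n_arms, n_rounds, gamma=0.1, seed=0):
    """Run EXP3 and return (pull counts per arm, total collected reward).

    reward_fn(t, arm) is a hypothetical callback returning a reward in [0, 1].
    """
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    pulls = [0] * n_arms
    total_reward = 0.0
    for t in range(n_rounds):
        total_w = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        probs = [(1.0 - gamma) * w / total_w + gamma / n_arms for w in weights]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        # Importance-weighted estimate keeps the reward estimate unbiased.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / n_arms)
        # Rescale so the largest weight is 1; this leaves probs unchanged
        # and avoids floating-point overflow over long horizons.
        top = max(weights)
        weights = [w / top for w in weights]
        pulls[arm] += 1
        total_reward += reward
    return pulls, total_reward
```

A hierarchical scheme in the spirit of the abstract would run one such learner over clusters and another inside each cluster; how ABoB couples the two levels is specified in the paper, not here.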

Article 1690

Title@2025-05-25 (7): An Initial Exploration of Fine-tuning Small Language Models for Smart Contract Reentrancy Vulnerability Detection

Title: An Initial Exploration of Fine-tuning Small Language Models for Smart Contract Reentrancy Vulnerability Detection

Eine erste Erkundung von Feinsteuerungs-Kleinsprachenmodellen für intelligente Vertragsrepentrancy Sicherheitserkennung

初步探索智能合同留置率易变性探测智能合同微调小型语言模型 2505.19059v1

Authors: Ignacio Mariano Andreozzi Pofcher, Joshua Ellul

Large Language Models (LLMs) are increasingly used for coding tasks, including helping coders identify bugs, and are a promising avenue for supporting vulnerability detection, particularly given the flexibility of such generative AI models and tools. Yet LLMs may not be suitable for many tasks; smaller language models that fit, run, and train easily on a developer’s computer may be a better choice. In this paper we explore and evaluate whether smaller language models can be fine-tuned to achieve reasonable results in a niche area: vulnerability detection, specifically detecting the reentrancy bug in Solidity smart contracts.

Article 1691

Title@2025-05-25 (7): Policy Gradient with Tree Expansion

Title: Policy Gradient with Tree Expansion

Politischer Gradient mit Baumerweiterung

随着树树扩张的政策渐变 2301.13236v2

Authors: Gal Dalal, Assaf Hallak, Gugan Thoppe, Shie Mannor, Gal Chechik

Policy gradient methods are notorious for having a large variance and high sample complexity. To mitigate this, we introduce SoftTreeMax – a generalization of softmax that employs planning. In SoftTreeMax, we extend the traditional logits with the multi-step discounted cumulative reward, topped with the logits of future states. We analyze SoftTreeMax and explain how tree expansion helps to reduce its gradient variance. We prove that the variance depends on the chosen tree-expansion policy. Specifically, we show that the closer the induced transitions are to being state-independent, the stronger the variance decay. With approximate forward models, we prove that the resulting gradient bias diminishes with the approximation error while retaining the same variance reduction. Ours is the first result to bound the gradient bias for an approximate model. In a practical implementation of SoftTreeMax, we utilize a parallel GPU-based simulator for fast and efficient tree expansion. Using this implementation in Atari, we show that SoftTreeMax reduces the gradient variance by three orders of magnitude. This leads to better sample complexity and improved performance compared to distributed PPO.

Article 1692

Title@2025-05-25 (7): Distributionally Robust Deep Q-Learning

Title: Distributionally Robust Deep Q-Learning

Verteilungsstarkes tiefes Q-Lernen

分布强力深学习 Q- 学习 2505.19058v1

Authors: Chung I Lu, Julian Sester, Aijia Zhang

We propose a novel distributionally robust $Q$-learning algorithm for the non-tabular case accounting for continuous state spaces where the state transition of the underlying Markov decision process is subject to model uncertainty. The uncertainty is taken into account by considering the worst-case transition from a ball around a reference probability measure. To determine the optimal policy under the worst-case state transition, we solve the associated non-linear Bellman equation by dualising and regularising the Bellman operator with the Sinkhorn distance, which is then parameterized with deep neural networks. This approach allows us to modify the Deep Q-Network algorithm to optimise for the worst-case state transition. We illustrate the tractability and effectiveness of our approach through several applications, including a portfolio optimisation task based on S&P 500 data.

Article 1693

Title@2025-05-25 (7): An Embarrassingly Simple Defense Against LLM Abliteration Attacks

Title: An Embarrassingly Simple Defense Against LLM Abliteration Attacks

Eine erschreckend einfache Verteidigung gegen LLM-Abliterationsangriffe

一种令人尴尬的简单防御对付LLM 缩写攻击 2505.19056v1

Authors: Harethah Abu Shairah, Hasan Abed Al Kader Hammoud, Bernard Ghanem, George Turkiyyah

Large language models (LLMs) are typically aligned to comply with safety guidelines by refusing harmful instructions. A recent attack, termed abliteration, isolates and suppresses the single latent direction most responsible for refusal behavior, enabling the model to generate unethical content. We propose a defense that modifies how models generate refusals. We construct an extended-refusal dataset that contains harmful prompts with a full response that justifies the reason for refusal. We then fine-tune Llama-2-7B-Chat and Qwen2.5-Instruct (1.5B and 3B parameters) on our extended-refusal dataset, and evaluate the resulting systems on a set of harmful prompts. In our experiments, extended-refusal models maintain high refusal rates, dropping at most by 10%, whereas baseline models’ refusal rates drop by 70-80% after abliteration. A broad evaluation of safety and utility shows that extended-refusal fine-tuning neutralizes the abliteration attack while preserving general performance.

Article 1694

Title@2025-05-25 (7): Reduce Computational Cost In Deep Reinforcement Learning Via Randomized Policy Learning

Title: Reduce Computational Cost In Deep Reinforcement Learning Via Randomized Policy Learning

Computerische Kosten im Deep-Verstärkung-Lernen durch Randomized Policy Learning reduzieren

降低深强化学习的计算成本 2505.19054v1

Authors: Zhuochen Liu, Rahul Jain, Quan Nguyen

Recent advancements in reinforcement learning (RL) have leveraged neural networks to achieve state-of-the-art performance across various control tasks. However, these successes often come at the cost of significant computational resources, as training deep neural networks requires substantial time and data. In this paper, we introduce an actor-critic algorithm that utilizes randomized neural networks to drastically reduce computational costs while maintaining strong performance. Despite its simple architecture, our method effectively solves a range of control problems, including the locomotion control of a highly dynamic 12-motor quadruped robot, and achieves results comparable to leading algorithms such as Proximal Policy Optimization (PPO). Notably, our approach does not outperform other algorithms in terms of sample efficiency but rather in terms of wall-clock training time. That is, although our algorithm requires more timesteps to converge to an optimal policy, the actual time required for training turns out to be lower.

Article 1695

Title@2025-05-25 (7): Structured Reinforcement Learning for Combinatorial Decision-Making

Title: Structured Reinforcement Learning for Combinatorial Decision-Making

Strukturiertes Stärkungslernen für kombinatorische Entscheidungsfindung

结构强化学习促进综合决策决策 2505.19053v1

Authors: Heiko Hoppe, Léo Baty, Louis Bouvier, Axel Parmentier, Maximilian Schiffer

Reinforcement learning (RL) is increasingly applied to real-world problems involving complex and structured decisions, such as routing, scheduling, and assortment planning. These settings challenge standard RL algorithms, which struggle to scale, generalize, and exploit structure in the presence of combinatorial action spaces. We propose Structured Reinforcement Learning (SRL), a novel actor-critic framework that embeds combinatorial optimization layers into the actor neural network. We enable end-to-end learning of the actor via Fenchel-Young losses and provide a geometric interpretation of SRL as a primal-dual algorithm in the dual of the moment polytope. Across six environments with exogenous and endogenous uncertainty, SRL matches or surpasses the performance of unstructured RL and imitation learning on static tasks and improves over these baselines by up to 92% on dynamic problems, with improved stability and convergence speed.

Article 1696

Title@2025-05-25 (7): Efficient Data Selection at Scale via Influence Distillation

Title: Efficient Data Selection at Scale via Influence Distillation

Effiziente Datenauswahl auf Scale durch Einflussdestillation

通过影响蒸馏在规模上高效数据选择 2505.19051v1

Authors: Mahdi Nikdan, Vincent Cohen-Addad, Dan Alistarh, Vahab Mirrokni

Effective data selection is critical for efficient training of modern Large Language Models (LLMs). This paper introduces Influence Distillation, a novel, mathematically-justified framework for data selection that employs second-order information to optimally weight training samples. By distilling each sample’s influence on a target distribution, our method assigns model-specific weights that are used to select training data for LLM fine-tuning, guiding it toward strong performance on the target domain. We derive these optimal weights for both Gradient Descent and Adam optimizers. To ensure scalability and reduce computational cost, we propose a landmark-based approximation: influence is precisely computed for a small subset of “landmark” samples and then efficiently propagated to all other samples to determine their weights. We validate Influence Distillation by applying it to instruction tuning on the Tulu V2 dataset, targeting a range of tasks including GSM8k, SQuAD, and MMLU, across several models from the Llama and Qwen families. Experiments show that Influence Distillation matches or outperforms state-of-the-art performance while achieving up to $3.5\times$ faster selection.

Article 1697

Title@2025-05-25 (7): SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models

Title: SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models

SliM-LLM: Salience-getriebene Mixed-Precision-Quantisierung für große Sprachmodelle

SliM-LLM:大语言模型的盐度驱动混合精度量 2405.14917v2

Authors: Wei Huang, Haotong Qin, Yangdong Liu, Yawei Li, Qinshuo Liu, Xianglong Liu, Luca Benini, Michele Magno, Shiming Zhang, Xiaojuan Qi

Post-training quantization (PTQ) is an effective technique for compressing large language models (LLMs). However, while uniform-precision quantization is computationally efficient, it often compromises model performance. To address this, we propose SliM-LLM, a salience-driven mixed-precision quantization framework that allocates bit-widths at the group level. Our approach leverages the observation that important weights follow a structured distribution and introduces two key components: 1) Salience-Determined Bit Allocation adaptively assigns bit-widths to groups within each layer based on their salience; and 2) Salience-Weighted Quantizer Calibration optimizes quantizer parameters by incorporating element-level salience. With its structured partitioning, SliM-LLM provides a hardware-friendly solution that matches the efficiency of uniform quantization methods while improving accuracy. Experiments show that SliM-LLM achieves superior performance across various LLMs at low bit-widths. For example, a 2-bit quantized LLaMA-7B model reduces memory usage by nearly 6x compared to the floating-point baseline, decreases perplexity by 48% compared to state-of-the-art gradient-free PTQ methods, and maintains GPU inference speed. Additionally, the extended version, SliM-LLM$^+$, which incorporates gradient-based quantization, further reduces perplexity by 35.1%. Our code is available at https://github.com/Aaronhuang-778/SliM-LLM
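
The idea of giving each weight group its own bit-width can be illustrated with a minimal sketch of group-wise min-max uniform quantization. The bit-widths are supplied by hand here; the salience-based allocation and the salience-weighted calibration that define SliM-LLM are not reproduced. The names `quantize_group` and `quantize_groupwise` are hypothetical.

```python
def quantize_group(values, bits):
    """Quantize then dequantize one group with a min-max uniform quantizer."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    levels = (1 << bits) - 1          # number of steps between min and max
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)           # constant group: nothing to quantize
    scale = (hi - lo) / levels
    return [lo + round((v - lo) / scale) * scale for v in values]


def quantize_groupwise(weights, group_size, bits_per_group):
    """Apply a separate bit-width to each consecutive group of weights."""
    n_groups = (len(weights) + group_size - 1) // group_size
    if len(bits_per_group) != n_groups:
        raise ValueError("need exactly one bit-width per group")
    out = []
    for g, bits in enumerate(bits_per_group):
        group = weights[g * group_size:(g + 1) * group_size]
        out.extend(quantize_group(group, bits))
    return out
```

Because every group is still quantized uniformly, the storage layout stays regular, which is the property the abstract describes as hardware-friendly; the average bit-width is simply the mean of `bits_per_group` when groups have equal size.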

Article 1698

Title@2025-05-25 (7): PII-Scope: A Comprehensive Study on Training Data PII Extraction Attacks in LLMs

Title: PII-Scope: A Comprehensive Study on Training Data PII Extraction Attacks in LLMs

PII-Scope: Eine umfassende Studie über Trainingsdaten PII-Extraktionsangriffe in LLMs

PII-范围:关于培训数据的综合研究 2410.06704v2

Authors: Krishna Kanth Nakka, Ahmed Frikha, Ricardo Mendes, Xue Jiang, Xuebing Zhou

In this work, we introduce PII-Scope, a comprehensive benchmark designed to evaluate state-of-the-art methodologies for PII extraction attacks targeting LLMs across diverse threat settings. Our study provides a deeper understanding of these attacks by uncovering several hyperparameters (e.g., demonstration selection) crucial to their effectiveness. Building on this understanding, we extend our study to more realistic attack scenarios, exploring PII attacks that employ advanced adversarial strategies, including repeated and diverse querying, and leveraging iterative learning for continual PII extraction. Through extensive experimentation, our results reveal a notable underestimation of PII leakage in existing single-query attacks. In fact, we show that with sophisticated adversarial capabilities and a limited query budget, PII extraction rates can increase by up to fivefold when targeting the pretrained model. Moreover, we evaluate PII leakage on finetuned models, showing that they are more vulnerable to leakage than pretrained models. Overall, our work establishes a rigorous empirical benchmark for PII extraction attacks in realistic threat scenarios and provides a strong foundation for developing effective mitigation strategies.

Article 1699

Title@2025-05-25 (7): When Models Don’t Collapse: On the Consistency of Iterative MLE

Title: When Models Don’t Collapse: On the Consistency of Iterative MLE

Wenn Modelle nicht zusammenbrechen: Über die Konsistenz iterativer MLE

当模型不折叠时: 在迭代 MLE 一致性上 2505.19046v1

Authors: Daniel Barzilai, Ohad Shamir

The widespread use of generative models has created a feedback loop, in which each generation of models is trained on data partially produced by its predecessors. This process has raised concerns about model collapse: a critical degradation in performance caused by repeated training on synthetic data. However, different analyses in the literature have reached different conclusions as to the severity of model collapse. As such, it remains unclear how concerning this phenomenon is, and under which assumptions it can be avoided. To address this, we theoretically study model collapse for maximum likelihood estimation (MLE), in a natural setting where synthetic data is gradually added to the original data set. Under standard assumptions (similar to those long used for proving asymptotic consistency and normality of MLE), we establish non-asymptotic bounds showing that collapse can be avoided even as the fraction of real data vanishes. On the other hand, we prove that some assumptions (beyond MLE consistency) are indeed necessary: without them, model collapse can occur arbitrarily quickly, even when the original data is still present in the training set. To the best of our knowledge, these are the first rigorous examples of iterative generative modeling with accumulating data that rapidly leads to model collapse.
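
The accumulating-data setting can be made concrete with a toy simulation. The sketch below assumes a one-dimensional Gaussian with known unit variance, so the MLE is just the sample mean; this is an illustrative special case, not the general model class analysed in the paper. The function name and its parameters are hypothetical.

```python
import random
import statistics


def iterative_mle_gaussian_mean(true_mean, n_real, n_synth, generations, seed=0):
    """Simulate iterative MLE where synthetic samples accumulate over time.

    Returns the list of per-generation estimates and the final dataset size.
    """
    rng = random.Random(seed)
    # Generation 0 sees only real data drawn from the true distribution.
    data = [rng.gauss(true_mean, 1.0) for _ in range(n_real)]
    estimates = []
    for _ in range(generations):
        mu_hat = statistics.fmean(data)  # MLE of the mean for known variance
        estimates.append(mu_hat)
        # The fitted model generates synthetic data that is added to,
        # rather than substituted for, everything collected so far.
        data.extend(rng.gauss(mu_hat, 1.0) for _ in range(n_synth))
    return estimates, len(data)
```

Tracking how far the estimates drift from `true_mean` as `generations` grows gives an empirical feel for the question the paper studies theoretically, namely whether the error stays controlled as the fraction of real data shrinks.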

Article 1700

Title@2025-05-25 (7): Offline Clustering of Linear Bandits: Unlocking the Power of Clusters in Data-Limited Environments

Title: Offline Clustering of Linear Bandits: Unlocking the Power of Clusters in Data-Limited Environments

Offline-Clustering von linearen Banditen: Entriegelung der Macht von Clustern in datenbeschränkten Umgebungen

线性强盗离线集群:解锁数据限制环境中的群集力量 2505.19043v1

Authors: Jingyuan Liu, Zeyu Zhang, Xuchuang Wang, Xutong Liu, John C. S. Lui, Mohammad Hajiesmaili, Carlee Joe-Wong

Contextual linear multi-armed bandits are a learning framework for making a sequence of decisions, e.g., advertising recommendations for a sequence of arriving users. Recent works have shown that clustering these users based on the similarity of their learned preferences can significantly accelerate the learning. However, prior work has primarily focused on the online setting, which requires continually collecting user data, ignoring the offline data widely available in many applications. To tackle these limitations, we study the offline clustering of bandits (Off-ClusBand) problem, which studies how to use the offline dataset to learn cluster properties and improve decision-making across multiple users. The key challenge in Off-ClusBand arises from data insufficiency for users: unlike the online case, in the offline case, we have a fixed, limited dataset to work from and thus must determine whether we have enough data to confidently cluster users together. To address this challenge, we propose two algorithms: Off-C$^2$LUB, which we analytically show performs well for arbitrary amounts of user data, and Off-CLUB, which is prone to bias when data is limited but, given sufficient data, matches a theoretical lower bound that we derive for the offline clustered MAB problem. We experimentally validate these results on both real and synthetic datasets.

Article 1701

Title@2025-05-25 (7): Turb-L1: Achieving Long-term Turbulence Tracing By Tackling Spectral Bias

Title: Turb-L1: Achieving Long-term Turbulence Tracing By Tackling Spectral Bias

Turb-L1: Langfristige Turbulenzen erreichen, die durch das Greifen spektraler Bias verfolgt werden

Turb-L1:通过处理光辉双鱼,实现长期动荡追踪 2505.19038v1

Authors: Hao Wu, Yuan Gao, Ruiqi Shu, Zean Han, Fan Xu, Zhihong Zhu, Qingsong Wen, Xian Wu, Kun Wang, Xiaomeng Huang

Accurately predicting the long-term evolution of turbulence is crucial for advancing scientific understanding and optimizing engineering applications. However, existing deep learning methods face significant bottlenecks in long-term autoregressive prediction: they exhibit excessive smoothing and fail to accurately track complex fluid dynamics. Our extensive experimental and spectral analysis of prevailing methods provides an interpretable explanation for this shortcoming, identifying Spectral Bias as the core obstacle. Concretely, spectral bias is the inherent tendency of models to favor low-frequency, smooth features while overlooking critical high-frequency details during training, thus reducing fidelity and causing physical distortions in long-term predictions. Building on this insight, we propose Turb-L1, an innovative turbulence prediction method, which utilizes a Hierarchical Dynamics Synthesis mechanism within a multi-grid architecture to explicitly overcome spectral bias. It accurately captures cross-scale interactions and preserves the fidelity of high-frequency dynamics, enabling reliable long-term tracking of turbulence evolution. Extensive experiments on the 2D turbulence benchmark show that Turb-L1 demonstrates excellent performance: (I) In long-term predictions, it reduces Mean Squared Error (MSE) by $80.3\%$ and increases Structural Similarity (SSIM) by over $9\times$ compared to the SOTA baseline, significantly improving prediction fidelity. (II) It effectively overcomes spectral bias, accurately reproducing the full enstrophy spectrum and maintaining physical realism in high-wavenumber regions, thus avoiding the spectral distortions or spurious energy accumulation seen in other methods.

Article 1702

Title@2025-05-25 (7): Optimal Conformal Prediction under Epistemic Uncertainty

Title: Optimal Conformal Prediction under Epistemic Uncertainty

Optimale konforme Vorhersage unter epistemischer Unsicherheit

在不确定性下最优化的共变预测 2505.19033v1

Authors: Alireza Javanmardi, Soroush H. Zargarbashi, Santo M. A. R. Thies, Willem Waegeman, Aleksandar Bojchevski, Eyke Hüllermeier

Conformal prediction (CP) is a popular frequentist framework for representing uncertainty by providing prediction sets that guarantee coverage of the true label with a user-adjustable probability. In most applications, CP operates on confidence scores coming from a standard (first-order) probabilistic predictor (e.g., softmax outputs). Second-order predictors, such as credal set predictors or Bayesian models, are also widely used for uncertainty quantification and are known for their ability to represent both aleatoric and epistemic uncertainty. Despite their popularity, there is still an open question on “how they can be incorporated into CP”. In this paper, we discuss the desiderata for CP when valid second-order predictions are available. We then introduce Bernoulli prediction sets (BPS), which produce the smallest prediction sets that ensure conditional coverage in this setting. When given first-order predictions, BPS reduces to the well-known adaptive prediction sets (APS). Furthermore, when the validity assumption on the second-order predictions is compromised, we apply conformal risk control to obtain a marginal coverage guarantee while still accounting for epistemic uncertainty.
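
For readers unfamiliar with the first-order special case, here is a minimal sketch of split-conformal adaptive prediction sets (APS) in their non-randomized form. It does not implement BPS or the conformal risk control step. The function names are hypothetical, and `probs` is assumed to be a list of class probabilities from any first-order predictor.

```python
import math


def aps_score(probs, label):
    """Cumulative probability mass of all classes ranked at or above label."""
    order = sorted(range(len(probs)), key=lambda k: -probs[k])
    total = 0.0
    for k in order:
        total += probs[k]
        if k == label:
            return total
    raise ValueError("label out of range")


def conformal_quantile(scores, alpha):
    """Finite-sample corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points: include everything
    return sorted(scores)[rank - 1]


def aps_set(probs, qhat):
    """All labels whose APS score does not exceed the calibrated threshold."""
    return [k for k in range(len(probs)) if aps_score(probs, k) <= qhat]
```

In use, `aps_score` is evaluated on held-out calibration pairs with their true labels, `conformal_quantile` turns those scores into a threshold, and `aps_set` builds the prediction set for a new input. In this non-randomized variant the set can be empty when the top class alone already exceeds the threshold.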

Article 1703

Title@2025-05-25 (7): SoK: Dataset Copyright Auditing in Machine Learning Systems

Title: SoK: Dataset Copyright Auditing in Machine Learning Systems

SoK: Datensatz Copyright Auditing in Machine Learning Systemen

SoK:机器学习系统中的数据集版权审计 2410.16618v2

Authors: Linkang Du, Xuanru Zhou, Min Chen, Chusong Zhang, Zhou Su, Peng Cheng, Jiming Chen, Zhikun Zhang

As the implementation of machine learning (ML) systems becomes more widespread, especially with the introduction of larger ML models, we see a surging demand for massive data. However, this demand inevitably causes infringement and misuse problems with the data, such as using unauthorized online artworks or face images to train ML models. To address this problem, many efforts have been made to audit the copyright of the model training dataset. However, existing solutions vary in auditing assumptions and capabilities, making it difficult to compare their strengths and weaknesses. In addition, robustness evaluations usually consider only part of the ML pipeline and hardly reflect the performance of algorithms in real-world ML applications. Thus, it is essential to take a practical deployment perspective on the current dataset copyright auditing tools, examining their effectiveness and limitations. Concretely, we categorize dataset copyright auditing research into two prominent strands: intrusive methods and non-intrusive methods, depending on whether they require modifications to the original dataset. Then, we break down the intrusive methods into different watermark injection options and examine the non-intrusive methods using various fingerprints. To summarize our results, we offer detailed reference tables, highlight key points, and pinpoint unresolved issues in the current literature. By combining the pipeline in ML systems and analyzing previous studies, we highlight several future directions to make auditing tools more suitable for real-world copyright protection requirements.

Article 1704

Title@2025-05-25 (7): Learn Beneficial Noise as Graph Augmentation

Title: Learn Beneficial Noise as Graph Augmentation

Benefitial Noise als Graph Augmentation lernen

学习以图增益为受益噪音 2505.19024v1

Authors: Siqi Huang, Yanchen Xu, Hongyuan Zhang, Xuelong Li

Although graph contrastive learning (GCL) has been widely investigated, it is still a challenge to generate effective and stable graph augmentations. Existing methods often apply heuristic augmentations like random edge dropping, which may disrupt important graph structures and result in unstable GCL performance. In this paper, we propose Positive-incentive Noise driven Graph Data Augmentation (PiNGDA), where positive-incentive noise (pi-noise) scientifically analyzes the beneficial effect of noise under information theory. To bridge the standard GCL and pi-noise framework, we design a Gaussian auxiliary variable to convert the loss function to information entropy. We prove that the standard GCL with pre-defined augmentations is equivalent to estimating the beneficial noise via point estimation. Following our analysis, PiNGDA is derived from learning the beneficial noise on both topology and attributes through a trainable noise generator for graph augmentations, instead of the simple estimation. Since the generator learns how to produce beneficial perturbations on graph topology and node attributes, PiNGDA is more reliable compared with the existing methods. Extensive experimental results validate the effectiveness and stability of PiNGDA.

Article 1705

Title@2025-05-25 (7): A Smart Healthcare System for Monkeypox Skin Lesion Detection and Tracking

Title: A Smart Healthcare System for Monkeypox Skin Lesion Detection and Tracking

Ein intelligentes Gesundheitssystem für Monkeypox-Hautläsionserkennung und -verfolgung

用于探测和跟踪猴子天花皮肤皮层的智能保健系统 2505.19023v1

Authors: Huda Alghoraibi, Nuha Alqurashi, Sarah Alotaibi, Renad Alkhudaydi, Bdoor Aldajani, Lubna Alqurashi, Jood Batweel, Maha A. Thafar

Monkeypox is a viral disease characterized by distinctive skin lesions and has been reported in many countries. The recent global outbreak has emphasized the urgent need for scalable, accessible, and accurate diagnostic solutions to support public health responses. In this study, we developed ITMAINN, an intelligent, AI-driven healthcare system specifically designed to detect Monkeypox from skin lesion images using advanced deep learning techniques. Our system consists of three main components. First, we trained and evaluated several pretrained models using transfer learning on publicly available skin lesion datasets to identify the most effective models. For binary classification (Monkeypox vs. non-Monkeypox), the Vision Transformer, MobileViT, Transformer-in-Transformer, and VGG16 achieved the highest performance, each with an accuracy and F1-score of 97.8%. For multiclass classification, which contains images of patients with Monkeypox and five other classes (chickenpox, measles, hand-foot-mouth disease, cowpox, and healthy), ResNetViT and ViT Hybrid models achieved 92% accuracy, with F1 scores of 92.24% and 92.19%, respectively. The best-performing and most lightweight model, MobileViT, was deployed within the mobile application. The second component is a cross-platform smartphone application that enables users to detect Monkeypox through image analysis, track symptoms, and receive recommendations for nearby healthcare centers based on their location. The third component is a real-time monitoring dashboard designed for health authorities to support them in tracking cases, analyzing symptom trends, guiding public health interventions, and taking proactive measures. This system is fundamental in developing responsive healthcare infrastructure within smart cities. Our solution, ITMAINN, is part of revolutionizing public health management.

Article 1706

Title@2025-05-25 (7): Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs

Title: Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs

Unbestimmte Quantifizierung auf Funktionsebene für die Kalibrierung von Feinabstimmungen auf LLMs

对LLMML进行校准微调的不确定性定量 2410.06431v3

Authors: Ruijia Niu, Dongxia Wu, Rose Yu, Yi-An Ma

Accurate uncertainty quantification in large language models (LLMs) is essential for providing credible confidence estimates over their outputs. However, fine-tuned LLMs often exhibit overconfidence in uncertain predictions, which stems from their limited ability to generalize with sparse data. Existing parameter-efficient fine-tuning (PEFT) uncertainty quantification methods for LLMs focus on the post-fine-tuning stage, and thus fail to address the core issue: limited specialization of PEFT adapters to accurately capture task-specific input-output relationships. To address these limitations, we propose Functional-Level Uncertainty Quantification for Calibrated Fine-Tuning (UQ4CT), which captures and calibrates uncertainty over the space of functions that map input prompts to outputs. We implement UQ4CT during the fine-tuning stage via a mixture-of-experts framework that hierarchically decomposes the functional space. Empirically, UQ4CT achieves over 25% reduction in Expected Calibration Error (ECE) while preserving high accuracy across five benchmarks. Even under distribution shift, UQ4CT maintains superior ECE performance with high accuracy, showcasing improved generalizability.
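
The metric reported above, Expected Calibration Error, can be sketched as follows. This is the common equal-width binned estimator, written under the assumption that each prediction comes with a confidence in [0, 1] and a correctness flag; the abstract does not state the bin count or binning convention the authors use, so `n_bins=10` and the half-open bins are choices made for this sketch.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average over bins of |accuracy - mean confidence|.

    Bins are half-open intervals [i/n_bins, (i+1)/n_bins); a confidence of
    exactly 1.0 is placed in the last bin.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one prediction is required")

    bins = [[] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, is_correct))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # An overconfident model: 90% confidence but only 75% accuracy.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                     [True, True, True, False]))
```

A perfectly calibrated model scores 0; overconfidence shows up as mean confidence exceeding accuracy within a bin.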

Article 1707

Title@2025-05-25 (7): AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer

Title: AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer

AnchorFormer: Differentielle Anker-Achtung für effizienten Vision Transformer

Anchor Former: 高效愿景变异器的可区别的锁定器注意 2505.16463v2

Authors: Jiquan Shan, Junxiao Wang, Lifeng Zhao, Liang Cai, Hongyuan Zhang, Ioannis Liritzis

Recently, vision transformers (ViTs) have achieved excellent performance on vision tasks by measuring the global self-attention among the image patches. Given $n$ patches, self-attention has quadratic complexity, $\mathcal{O}(n^2)$, and the time cost is high when the input image is split at a fine granularity. Meanwhile, the pivotal information is often randomly gathered in a few regions of an input image, so some tokens may not be helpful for the downstream tasks. To handle this problem, we introduce an anchor-based efficient vision transformer (AnchorFormer), which employs anchor tokens to learn the pivotal information and accelerate inference. First, by estimating the bipartite attention between the anchors and tokens, the complexity is reduced from $\mathcal{O}(n^2)$ to $\mathcal{O}(mn)$, where $m$ is the number of anchors and $m < n$. Notably, by representing the anchors with the neurons in a neural layer, we can differentiably learn these distributions and approximate global self-attention through a Markov process. Moreover, we extend the proposed model to three downstream tasks: classification, detection, and segmentation. Extensive experiments show the effectiveness of our AnchorFormer, e.g., achieving up to a 9.0% higher accuracy or 46.7% FLOPs reduction on ImageNet classification, and 81.3% higher mAP on COCO detection under comparable FLOPs, as compared to the current baselines.
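
The complexity argument above can be made concrete with a small sketch. It routes attention through $m$ anchors using two row-stochastic maps (token to anchor, then anchor to token), so the $n \times n$ attention matrix is never formed. This is an illustrative two-step construction under stated assumptions, not the paper's exact formulation: the dot-product scoring, the function names, and the use of plain Python lists are all choices made here.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def anchor_attention(tokens, anchors, values):
    """tokens: n x d, anchors: m x d, values: n x d_v -> output: n x d_v."""
    # Token-to-anchor transition probabilities (n x m).
    token_to_anchor = [softmax(r) for r in matmul(tokens, transpose(anchors))]
    # Anchor-to-token transition probabilities (m x n).
    anchor_to_token = [softmax(r) for r in matmul(anchors, transpose(tokens))]
    # Summarise values at the anchors first: O(m * n * d_v).
    anchor_summary = matmul(anchor_to_token, values)
    # Then distribute anchor summaries back to tokens: O(n * m * d_v).
    return matmul(token_to_anchor, anchor_summary)


if __name__ == "__main__":
    tokens = [[0.1, 0.2], [0.4, -0.3], [0.0, 0.5], [-0.2, 0.1]]
    anchors = [[0.3, 0.0], [-0.1, 0.4]]
    values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
    for row in anchor_attention(tokens, anchors, values):
        print(row)
```

The product of the two row-stochastic maps is itself row-stochastic, so the result is a valid (implicit) attention distribution over all tokens, which is what makes a two-step Markov-chain reading natural.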

Article 1708

Title@2025-05-25 (7): When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers

Title: When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers

Wann ist Task Vector für die Modellbearbeitung wahrscheinlich wirksam? Eine Generalisierungsanalyse von nichtlinearen Transformern

任务矢量何时对模式编辑有效? 非线性变换器的概括分析 2504.10957v3

Authors: Hongkang Li, Yihua Zhang, Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen

Task arithmetic refers to editing the pre-trained model by adding a weighted sum of task vectors, each of which is the weight update from the pre-trained model to a model fine-tuned for a certain task. This approach recently gained attention as a computationally efficient inference method for model editing, e.g., multi-task learning, forgetting, and out-of-domain generalization capabilities. However, the theoretical understanding of why task vectors can execute various conceptual operations remains limited, due to the high non-convexity of training Transformer-based models. To the best of our knowledge, this paper provides the first theoretical characterization of the generalization guarantees of task vector methods on nonlinear Transformers. We consider a conceptual learning setting, where each task is a binary classification problem based on a discriminative pattern. We theoretically prove the effectiveness of task addition in simultaneously learning a set of irrelevant or aligned tasks, as well as the success of task negation in unlearning one task from irrelevant or contradictory tasks. Moreover, we prove that a proper selection of linear coefficients for task arithmetic achieves guaranteed generalization to out-of-domain tasks. All of our theoretical results hold for both dense-weight parameters and their low-rank approximations. Although established in a conceptual setting, our theoretical findings were validated on a practical machine unlearning task using the large language model Phi-1.5 (1.3B).
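
The definition in the first sentence can be written out directly. The sketch below assumes model weights are stored as a dictionary mapping parameter names to flat lists of floats; that layout and the function names are hypothetical choices for illustration, and the coefficients are picked by hand rather than by the selection rule the paper analyzes.

```python
def task_vector(pretrained, finetuned):
    """Task vector = fine-tuned weights minus pre-trained weights."""
    return {name: [f - p for p, f in zip(pretrained[name], finetuned[name])]
            for name in pretrained}


def apply_task_arithmetic(pretrained, task_vectors, coefficients):
    """Return pretrained + sum_i coefficients[i] * task_vectors[i].

    A positive coefficient adds a task; a negative one negates (unlearns) it.
    """
    edited = {name: list(weights) for name, weights in pretrained.items()}
    for vector, coeff in zip(task_vectors, coefficients):
        for name, delta in vector.items():
            edited[name] = [w + coeff * d for w, d in zip(edited[name], delta)]
    return edited


if __name__ == "__main__":
    base = {"w": [1.0, 2.0]}
    tuned_a = {"w": [2.0, 2.0]}   # fine-tuned on task A
    tuned_b = {"w": [1.0, 4.0]}   # fine-tuned on task B
    vec_a = task_vector(base, tuned_a)
    vec_b = task_vector(base, tuned_b)
    # Add task A, partially negate task B.
    print(apply_task_arithmetic(base, [vec_a, vec_b], [1.0, -0.5]))
```

Editing happens purely in weight space, so no further training is needed once the fine-tuned checkpoints exist.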

Article 1709

Title@2025-05-25 (7): Fractured Chain-of-Thought Reasoning

Title: Fractured Chain-of-Thought Reasoning

Zersplitterte Kette von nachdenklichen Gründen

断断断断断断断断断断断断的探讨链原因 2505.12992v2

Authors: Baohao Liao, Hanze Dong, Yuhui Xu, Doyen Sahoo, Christof Monz, Junnan Li, Caiming Xiong

Inference-time scaling techniques have significantly bolstered the reasoning capabilities of large language models (LLMs) by harnessing additional computational effort at inference without retraining. Similarly, Chain-of-Thought (CoT) prompting and its extension, Long CoT, improve accuracy by generating rich intermediate reasoning trajectories, but these approaches incur substantial token costs that impede their deployment in latency-sensitive settings. In this work, we first show that truncated CoT, which stops reasoning before completion and directly generates the final answer, often matches full CoT sampling while using dramatically fewer tokens. Building on this insight, we introduce Fractured Sampling, a unified inference-time strategy that interpolates between full CoT and solution-only sampling along three orthogonal axes: (1) the number of reasoning trajectories, (2) the number of final solutions per trajectory, and (3) the depth at which reasoning traces are truncated. Through extensive experiments on five diverse reasoning benchmarks and several model scales, we demonstrate that Fractured Sampling consistently achieves superior accuracy-cost trade-offs, yielding steep log-linear scaling gains in Pass@k versus token budget. Our analysis reveals how to allocate computation across these dimensions to maximize performance, paving the way for more efficient and scalable LLM reasoning. Code is available at https://github.com/BaohaoLiao/frac-cot.
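
The abstract reports results as Pass@k against token budget but does not say how Pass@k is estimated. A widely used choice is the unbiased estimator $1 - \binom{n-c}{k} / \binom{n}{k}$ for $n$ samples of which $c$ are correct; the sketch below implements that estimator as an assumption, not as the authors' confirmed evaluation code.

```python
from math import comb


def pass_at_k(num_samples, num_correct, k):
    """Probability that at least one of k samples, drawn without replacement
    from num_samples generations, is correct."""
    if not 0 <= num_correct <= num_samples:
        raise ValueError("num_correct must lie in [0, num_samples]")
    if not 1 <= k <= num_samples:
        raise ValueError("k must lie in [1, num_samples]")
    if num_samples - num_correct < k:
        return 1.0  # every size-k subset contains a correct sample
    return 1.0 - comb(num_samples - num_correct, k) / comb(num_samples, k)


if __name__ == "__main__":
    # 16 generations for one problem, 4 of them correct.
    for k in (1, 4, 8):
        print(k, pass_at_k(16, 4, k))
```

Under Fractured Sampling, the pool of generations for a problem would come from varying the three axes listed above (trajectories, solutions per trajectory, truncation depth); how those samples are pooled is not specified in the abstract.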

Article 1710

Title@2025-05-25 (7): Lorentzian Graph Isomorphic Network

Title: Lorentzian Graph Isomorphic Network

Lorentzian Graph Isomorphic Network

Lorentzian 图形异形网络 2504.00142v4

Authors: Srinitish Srinivasan, Omkumar CU

While hyperbolic GNNs show promise for hierarchical data, they often have limited discriminative power compared to Euclidean counterparts or the WL test, due to non-injective aggregation. To address this expressivity gap, we propose the Lorentzian Graph Isomorphic Network (LGIN), a novel HGNN designed for enhanced discrimination within the Lorentzian model. LGIN introduces a new update rule that preserves the Lorentzian metric while effectively capturing richer structural information. This marks a significant step towards more expressive GNNs on Riemannian manifolds. Extensive evaluations across nine benchmark datasets demonstrate LGIN’s superior performance, consistently outperforming or matching state-of-the-art hyperbolic and Euclidean baselines, showcasing its ability to capture complex graph structures. LGIN is the first to adapt principles of powerful, highly discriminative GNN architectures to a Riemannian manifold. The code for our paper can be found at https://github.com/Deceptrax123/LGIN

Article 1711

Title@2025-05-25 (7): Querying Kernel Methods Suffices for Reconstructing their Training Data

Title: Querying Kernel Methods Suffices for Reconstructing their Training Data

Abfrage von Kernel-Methoden Möglichkeiten zur Wiederherstellung ihrer Trainingsdaten

查询重新构建其培训数据所需的核心内核方法 2505.19019v1

Authors: Daniel Barzilai, Yuval Margalit, Eitan Gronich, Gilad Yehudai, Meirav Galun, Ronen Basri

Over-parameterized models have raised concerns about their potential to memorize training data, even when achieving strong generalization. The privacy implications of such memorization are generally unclear, particularly in scenarios where only model outputs are accessible. We study this question in the context of kernel methods, and demonstrate both empirically and theoretically that querying kernel models at various points suffices to reconstruct their training data, even without access to model parameters. Our results hold for a range of kernel methods, including kernel regression, support vector machines, and kernel density estimation. Our hope is that this work can illuminate potential privacy concerns for such models.

Article 1712

Title@2025-05-25 (7): Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering

Title: Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering

Genaue und effiziente Multivariate Zeitreihenprognose über Offline-Clustering

通过离线群集预测准确而高效的多变量时间序列 2505.05738v2

Authors: Yiming Niu, Jinliang Deng, Lulu Zhang, Zimu Zhou, Yongxin Tong

Accurate and efficient multivariate time series (MTS) forecasting is essential for applications such as traffic management and weather prediction, which depend on capturing long-range temporal dependencies and interactions between entities. Existing methods, particularly those based on Transformer architectures, compute pairwise dependencies across all time steps, leading to a computational complexity that scales quadratically with the length of the input. To overcome these challenges, we introduce the Forecaster with Offline Clustering Using Segments (FOCUS), a novel approach to MTS forecasting that simplifies long-range dependency modeling through the use of prototypes extracted via offline clustering. These prototypes encapsulate high-level events in the real-world system underlying the data, summarizing the key characteristics of similar time segments. In the online phase, FOCUS dynamically adapts these patterns to the current input and captures dependencies between the input segment and high-level events, enabling both accurate and efficient forecasting. By identifying prototypes during the offline clustering phase, FOCUS reduces the computational complexity of modeling long-range dependencies in the online phase to linear scaling. Extensive experiments across diverse benchmarks demonstrate that FOCUS achieves state-of-the-art accuracy while significantly reducing computational costs.

Article 1713

Title@2025-05-25 (7): Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis

Title: Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis

Ausbildung nichtlinearer Transformer für den Schlussfolgerungsketten-of-Thought: Eine theoretische Generalisierungsanalyse

培训非线性非线性变换器,用于研究链推论:理论一般分析 2410.02167v3

Authors: Hongkang Li, Songtao Lu, Pin-Yu Chen, Xiaodong Cui, Meng Wang

Chain-of-Thought (CoT) is an efficient prompting method that enables the reasoning ability of large language models by augmenting the query using multiple examples with multiple intermediate steps. Despite the empirical success, the theoretical understanding of how to train a Transformer to achieve the CoT ability remains less explored. This is primarily due to the technical challenges involved in analyzing the nonconvex optimization on nonlinear attention models. To the best of our knowledge, this work provides the first theoretical study of training Transformers with nonlinear attention to obtain the CoT generalization capability, so that the resulting model can perform inference on unseen tasks when the input is augmented by examples of the new task. We first quantify the required training samples and iterations to train a Transformer model towards CoT ability. We then prove the success of its CoT generalization on unseen tasks with distribution-shifted testing data. Moreover, we theoretically characterize the conditions for an accurate reasoning output by CoT even when the provided reasoning examples contain noise and are not always accurate. In contrast, in-context learning (ICL), which can be viewed as one-step CoT without intermediate steps, may fail to provide an accurate output when CoT does. These theoretical findings are justified through experiments.

Article 1714

Title@2025-05-25 (7): Understanding the Robustness of Graph Neural Networks against Adversarial Attacks

Title: Understanding the Robustness of Graph Neural Networks against Adversarial Attacks

Verständnis der Robustheit von Graphen-Neuralen Netzwerken gegen feindliche Angriffe

理解反对反向攻击的平面神经网络的强大力 2406.13920v2

Authors: Tao Wu, Canyixing Cui, Xingping Xian, Shaojie Qiao, Chao Wang, Lin Yuan, Shui Yu

Recent studies have shown that graph neural networks (GNNs) are vulnerable to adversarial attacks, posing significant challenges to their deployment in safety-critical scenarios. This vulnerability has spurred a growing focus on designing robust GNNs. Despite this interest, current advancements have predominantly relied on empirical trial and error, resulting in a limited understanding of the robustness of GNNs against adversarial attacks. To address this issue, we conduct the first large-scale systematic study on the adversarial robustness of GNNs by considering the patterns of input graphs, the architecture of GNNs, and their model capacity, along with discussions on sensitive neurons and adversarial transferability. This work proposes a comprehensive empirical framework for analyzing the adversarial robustness of GNNs. To support the analysis of adversarial robustness in GNNs, we introduce two evaluation metrics: the confidence-based decision surface and the accuracy-based adversarial transferability rate. Through experimental analysis, we derive 11 actionable guidelines for designing robust GNNs, enabling model developers to gain deeper insights. The code of this study is available at https://github.com/star4455/GraphRE.

Article 1715

Title@2025-05-25 (7): WorldEval: World Model as Real-World Robot Policies Evaluator

Title: WorldEval: World Model as Real-World Robot Policies Evaluator

WorldEval: Weltmodell als Real-World-Roboterpolitik Evaluator

WorldEval:世界作为真实世界机器人政策评价人的世界模式 2505.19017v1

Authors: Yaxuan Li, Yichen Zhu, Junjie Wen, Chaomin Shen, Yi Xu

The field of robotics has made significant strides toward developing generalist robot manipulation policies. However, evaluating these policies in real-world scenarios remains time-consuming and challenging, particularly as the number of tasks grows and environmental conditions change. In this work, we demonstrate that world models can serve as a scalable, reproducible, and reliable proxy for real-world robot policy evaluation. A key challenge is generating accurate policy videos from world models that faithfully reflect the robot actions. We observe that directly inputting robot actions or using high-dimensional encoding methods often fails to generate action-following videos. To address this, we propose Policy2Vec, a simple yet effective approach to turn a video generation model into a world simulator that follows latent actions to generate the robot video. We then introduce WorldEval, an automated pipeline designed to evaluate real-world robot policies entirely online. WorldEval effectively ranks various robot policies and individual checkpoints within a single policy, and functions as a safety detector to prevent dangerous actions by newly developed robot models. Through comprehensive paired evaluations of manipulation policies in real-world environments, we demonstrate a strong correlation between policy performance in WorldEval and real-world scenarios. Furthermore, our method significantly outperforms popular methods such as the real-to-sim approach.

Article 1716

Title@2025-05-25 (7): Tokenizing Electron Cloud in Protein-Ligand Interaction Learning

Title: Tokenizing Electron Cloud in Protein-Ligand Interaction Learning

Tokenizing Electron Cloud in Protein-Ligand Interaktion Lernen

将电云投入蛋白碱的相互作用学习 2505.19014v1

Authors: Haitao Lin, Odin Zhang, Jia Xu, Yunfan Liu, Zheng Cheng, Lirong Wu, Yufei Huang, Zhifeng Gao, Stan Z. Li

The affinity and specificity of protein-molecule binding directly impact functional outcomes, uncovering the mechanisms underlying biological regulation and signal transduction. Most deep-learning-based prediction approaches focus on structures of atoms or fragments. However, quantum chemical properties, such as electronic structures, are the key to unveiling interaction patterns but remain largely underexplored. To bridge this gap, we propose ECBind, a method for tokenizing electron cloud signals into quantized embeddings, enabling their integration into downstream tasks such as binding affinity prediction. By incorporating electron densities, ECBind helps uncover binding modes that cannot be fully represented by atom-level models. Specifically, to remove the redundancy inherent in electron cloud signals, a structure-aware transformer and hierarchical codebooks encode 3D binding sites enriched with electron structures into tokens. These tokenized codes are then used for specific tasks with labels. To extend its applicability to a wider range of scenarios, we utilize knowledge distillation to develop an electron-cloud-agnostic prediction model. Experimentally, ECBind demonstrates state-of-the-art performance across multiple tasks, achieving improvements of 6.42% and 15.58% in per-structure Pearson and Spearman correlation coefficients, respectively.

Article 1717

Title@2025-05-25 (7): Faithful Group Shapley Value

Title: Faithful Group Shapley Value

Treue Gruppe Shapley Wert

忠实的群群形状值 2505.19013v1

Authors: Kiljae Lee, Ziqi Liu, Weijing Tang, Yuan Zhang

Data Shapley is an important tool for data valuation, which quantifies the contribution of individual data points to machine learning models. In practice, group-level data valuation is desirable when data providers contribute data in batches. However, we identify that existing group-level extensions of Data Shapley are vulnerable to shell company attacks, where strategic group splitting can unfairly inflate valuations. We propose the Faithful Group Shapley Value (FGSV), which uniquely defends against such attacks. Building on original mathematical insights, we develop a provably fast and accurate approximation algorithm for computing FGSV. Empirical experiments demonstrate that our algorithm significantly outperforms state-of-the-art methods in computational efficiency and approximation accuracy, while ensuring faithful group-level valuation.
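
The splitting vulnerability described above can be reproduced on a toy game. The sketch computes exact Shapley values by enumerating orderings, treating each group as one player, which is a simple group-level extension of the kind the abstract criticizes; it does not implement FGSV. The utility $\sqrt{|S|}$ over the pooled data points and all names below are assumptions chosen for illustration.

```python
from itertools import permutations
from math import sqrt


def shapley_values(players, utility):
    """Exact Shapley values: average marginal contribution over all orderings.

    Feasible only for a handful of players, since it enumerates n! orderings.
    """
    totals = {player: 0.0 for player in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            extended = coalition | {player}
            totals[player] += utility(extended) - utility(coalition)
            coalition = extended
    return {player: total / len(orderings) for player, total in totals.items()}


def pooled_utility(groups):
    """Toy utility (an assumption): sqrt of the number of pooled data points."""
    def utility(coalition):
        points = set()
        for name in coalition:
            points |= groups[name]
        return sqrt(len(points))
    return utility


if __name__ == "__main__":
    honest = {"A": {"a", "b"}, "B": {"c"}}
    # Provider A re-registers its two points as two separate "shell" groups.
    split = {"A1": {"a"}, "A2": {"b"}, "B": {"c"}}

    honest_vals = shapley_values(list(honest), pooled_utility(honest))
    split_vals = shapley_values(list(split), pooled_utility(split))
    print("A as one group:", honest_vals["A"])
    print("A after splitting:", split_vals["A1"] + split_vals["A2"])
```

In this toy game the same two data points earn provider A a larger total once they are split across two groups, even though the underlying data is unchanged.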

Article 1718

Title@2025-05-25 (7): Alberta Wells Dataset: Pinpointing Oil and Gas Wells from Satellite Imagery

Title: Alberta Wells Dataset: Pinpointing Oil and Gas Wells from Satellite Imagery

Alberta Wells Datensatz: Pinpointing Öl- und Gasquellen aus Satellitenbildern

艾伯塔·韦尔斯数据集:从卫星图象中点出石油和天然气井 2410.09032v3

Authors: Pratinav Seth, Michelle Lin, Brefo Dwamena Yaw, Jade Boutot, Mary Kang, David Rolnick

Millions of abandoned oil and gas wells are scattered across the world, leaching methane into the atmosphere and toxic compounds into the groundwater. Many of these locations are unknown, preventing the wells from being plugged and their polluting effects averted. Remote sensing is a relatively unexplored tool for pinpointing abandoned wells at scale. We introduce the first large-scale benchmark dataset for this problem, leveraging medium-resolution multi-spectral satellite imagery from Planet Labs. Our curated dataset comprises over 213,000 wells (abandoned, suspended, and active) from Alberta, a region with especially high well density, sourced from the Alberta Energy Regulator and verified by domain experts. We evaluate baseline algorithms for well detection and segmentation, showing the promise of computer vision approaches but also significant room for improvement.

Article 1719

Title@2025-05-25 (7): FERGI: Automatic Scoring of User Preferences for Text-to-Image Generation from Spontaneous Facial Expression Reaction

Title: FERGI: Automatic Scoring of User Preferences for Text-to-Image Generation from Spontaneous Facial Expression Reaction

FERGI: Automatische Bewertung von Benutzereinstellungen für die Text-zu-Bild-Erzeugung aus spontaner Gesichtsausdrucksreaktion

FERGI: 自动自发面性表达反应生成文本到图像的用户首选项自动排序 2312.03187v4

Authors: Shuangquan Feng, Junhua Ma, Virginia R. de Sa

Researchers have proposed using human preference feedback data to fine-tune text-to-image generative models. However, the scalability of human feedback collection has been limited by its reliance on manual annotation. Therefore, we develop and test a method to automatically score user preferences from their spontaneous facial expression reaction to the generated images. We collect a dataset of Facial Expression Reaction to Generated Images (FERGI) and show that the activations of multiple facial action units (AUs) are highly correlated with user evaluations of the generated images. We develop an FAU-Net (Facial Action Units Neural Network), which receives inputs from an AU estimation model, to automatically score user preferences for text-to-image generation based on their facial expression reactions. This score is complementary to the pre-trained scoring models based on the input text prompts and generated images. Integrating our FAU-Net valence score with the pre-trained scoring models improves their consistency with human preferences. This method of automatic annotation with facial expression analysis can potentially be generalized to other generation tasks. The code is available at https://github.com/ShuangquanFeng/FERGI, and the dataset is also available at the same link for research purposes.

Article 1720

Title@2025-05-25 (7): Handling Label Noise via Instance-Level Difficulty Modeling and Dynamic Optimization

Title: Handling Label Noise via Instance-Level Difficulty Modeling and Dynamic Optimization

Handhabung von Etikettengeräuschen über Instance-Level-Schwierigkeitsmodellierung und dynamische Optimierung

通过实度难度建模和动态优化处理标签噪音 2505.00812v2

Authors: Kuan Zhang, Chengliang Chai, Jingzhe Xu, Chi Zhang, Ye Yuan, Guoren Wang, Lei Cao

Recent studies indicate that deep neural networks degrade in generalization performance under noisy supervision. Existing methods focus on isolating clean subsets or correcting noisy labels, facing limitations such as high computational costs, heavy hyperparameter tuning, and coarse-grained optimization. To address these challenges, we propose a novel two-stage noisy learning framework that enables instance-level optimization through a dynamically weighted loss function, avoiding hyperparameter tuning. To obtain stable and accurate information about noise modeling, we introduce a simple yet effective metric, termed wrong event, which dynamically models the cleanliness and difficulty of individual samples while maintaining computational costs. Our framework first collects wrong event information and builds a strong base model. Then we perform noise-robust training on the base model, using a probabilistic model to handle the wrong event information of samples. Experiments on five synthetic and real-world LNL benchmarks demonstrate that our method surpasses state-of-the-art methods in performance, achieves a nearly 75% reduction in computational time, and improves model scalability.

Article 1721

Title@2025-05-25 (7): Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding

Title: Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding

Galaxy Walker: Geometry-aware VLMs für Galaxy-Skala Verständnis

Galaxy Walker: 用于银河系统系统理解的几何觉测甚低LMS 2503.18578v3

Authors: Tianyu Chen, Xingcheng Fu, Yisen Gao, Haodong Qian, Yuecen Wei, Kun Yan, Haoyi Zhou, Jianxin Li

Modern vision-language models (VLMs) build their patch embeddings and convolutional backbones in vector spaces, typically Euclidean ones, from the ground up. When expanding VLMs to a galaxy scale for understanding astronomical phenomena, the integration of spherical space for planetary orbits and hyperbolic spaces for black holes raises two formidable challenges. a) The current pre-training model is confined to Euclidean space rather than a comprehensive geometric embedding. b) The predominant architecture lacks suitable backbones for anisotropic physical geometries. In this paper, we introduce Galaxy-Walker, a geometry-aware VLM for universe-level vision understanding tasks. We propose a geometry prompt that generates geometry tokens by random walks across diverse spaces on a multi-scale physical graph, along with a geometry adapter that compresses and reshapes the space anisotropy in a mixture-of-experts manner. Extensive experiments demonstrate the effectiveness of our approach, with Galaxy-Walker achieving state-of-the-art performance in both galaxy property estimation ($R^2$ scores up to $0.91$) and morphology classification tasks (up to $+0.17$ F1 improvement in challenging features), significantly outperforming both domain-specific models and general-purpose VLMs.

Article 1722

Title@2025-05-25 (7): Inductive Gradient Adjustment For Spectral Bias In Implicit Neural Representations

Title: Inductive Gradient Adjustment For Spectral Bias In Implicit Neural Representations

Induktive Gradientenanpassung für Spektralbien in impliziten Neuraldarstellungen

隐含神经表层旁观生物的感应梯度调整 2410.13271v2

Authors: Kexuan Shi, Hai Chen, Leheng Zhang, Shuhang Gu

Implicit Neural Representations (INRs), as a versatile representation paradigm, have achieved success in various computer vision tasks. Due to the spectral bias of vanilla multi-layer perceptrons (MLPs), existing methods focus on designing MLPs with sophisticated architectures or repurposing training techniques for highly accurate INRs. In this paper, we delve into the linear dynamics model of MLPs and theoretically identify the empirical Neural Tangent Kernel (eNTK) matrix as a reliable link between spectral bias and training dynamics. Based on this insight, we propose a practical Inductive Gradient Adjustment (IGA) method, which can purposefully improve the spectral bias via inductive generalization of the eNTK-based gradient transformation matrix. Theoretical and empirical analyses validate the impact of IGA on spectral bias. Further, we evaluate our method on different INR tasks with various INR architectures and compare it to existing training techniques. The superior and consistent improvements clearly validate the advantage of our IGA. Armed with our gradient adjustment method, better INRs with more enhanced texture details and sharpened edges can be learned from data by tailored impacts on spectral bias.

Article 1723

Title@2025-05-25 (7): Semi-pessimistic Reinforcement Learning

Title: Semi-pessimistic Reinforcement Learning

Halbpessimistisches Erlernen der Verstärkung

半悲观强化学习 2505.19002v1

Authors: Jin Zhu, Xin Zhou, Jiaang Yao, Gholamali Aminian, Omar Rivasplata, Simon Little, Lexin Li, Chengchun Shi

Offline reinforcement learning (RL) aims to learn an optimal policy from pre-collected data. However, it faces challenges of distributional shift, where the learned policy may encounter unseen scenarios not covered in the offline data. Additionally, numerous applications suffer from a scarcity of labeled reward data. Relying on labeled data alone often leads to a narrow state-action distribution, further amplifying the distributional shift and resulting in suboptimal policy learning. To address these issues, we first recognize that the volume of unlabeled data is typically substantially larger than that of labeled data. We then propose a semi-pessimistic RL method to effectively leverage abundant unlabeled data. Our approach offers several advantages. It considerably simplifies the learning process, as it seeks a lower bound of the reward function, rather than that of the Q-function or state transition function. It is highly flexible, and can be integrated with a range of model-free and model-based RL algorithms. It enjoys guaranteed improvement when utilizing vast unlabeled data, yet requires much less restrictive conditions. We compare our method with a number of alternative solutions, both analytically and numerically, and demonstrate its clear competitiveness. We further illustrate with an application to adaptive deep brain stimulation for Parkinson’s disease.

Article 1724

Title@2025-05-25 (7): Automatic and Structure-Aware Sparsification of Hybrid Neural ODEs

Title: Automatic and Structure-Aware Sparsification of Hybrid Neural ODEs

Automatische und strukturschonende Sparsifikation von Hybrid-Neural-ODEs

混合神经代码的自动和结构软件分离 2505.18996v1

Authors: Bob Junyi Zou, Lu Tian

Hybrid neural ordinary differential equations (neural ODEs) integrate mechanistic models with neural ODEs, offering strong inductive bias and flexibility, and are particularly advantageous in data-scarce healthcare settings. However, excessive latent states and interactions from mechanistic models can lead to training inefficiency and over-fitting, limiting practical effectiveness of hybrid neural ODEs. In response, we propose a new hybrid pipeline for automatic state selection and structure optimization in mechanistic neural ODEs, combining domain-informed graph modifications with data-driven regularization to sparsify the model for improving predictive performance and stability while retaining mechanistic plausibility. Experiments on synthetic and real-world data show improved predictive performance and robustness with desired sparsity, establishing an effective solution for hybrid model reduction in healthcare applications.

Article 1725

Title@2025-05-25 (7): Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Title: Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Verstärktes Lernen zur Vernunft in großen Sprachmodellen mit einem Trainingsbeispiel

采用 “ 一个培训实例 “ 采用大语言模式强化学习 2504.20571v2

Authors: Yiping Wang, Qing Yang, Zhiyuan Zeng, Liliang Ren, Liyuan Liu, Baolin Peng, Hao Cheng, Xuehai He, Kuan Wang, Jianfeng Gao, Weizhu Chen, Shuohang Wang, Simon Shaolei Du, Yelong Shen

We show that reinforcement learning with verifiable reward using one training example (1-shot RLVR) is effective in incentivizing the mathematical reasoning capabilities of large language models (LLMs). Applying RLVR to the base model Qwen2.5-Math-1.5B, we identify a single example that elevates model performance on MATH500 from 36.0% to 73.6%, and improves the average performance across six common mathematical reasoning benchmarks from 17.6% to 35.7%. This result matches the performance obtained using the 1.2k DeepScaleR subset (MATH500: 73.6%, average: 35.9%), which includes the aforementioned example. Furthermore, RLVR with only two examples even slightly exceeds these results (MATH500: 74.8%, average: 36.6%). Similar substantial improvements are observed across various models (Qwen2.5-Math-7B, Llama3.2-3B-Instruct, DeepSeek-R1-Distill-Qwen-1.5B), RL algorithms (GRPO and PPO), and different math examples (when employed as a single training example). In addition, we identify some interesting phenomena during 1-shot RLVR, including cross-domain generalization, increased frequency of self-reflection, and sustained test performance improvement even after the training accuracy has saturated, a phenomenon we term post-saturation generalization. Moreover, we verify that the effectiveness of 1-shot RLVR primarily arises from the policy gradient loss, distinguishing it from the “grokking” phenomenon. We also show the critical role of promoting exploration (e.g., by incorporating entropy loss with an appropriate coefficient) in 1-shot RLVR training. We further discuss related observations about format correction, label robustness, and prompt modification. These findings can inspire future work on RLVR efficiency and encourage a re-examination of recent progress and the underlying mechanisms in RLVR. Our code, model, and data are open source at https://github.com/ypwang61/One-Shot-RLVR.
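
To make the setup concrete, a verifiable reward and a group-normalized advantage of the kind used in GRPO-style training can be sketched as below. This is a minimal illustration and not the authors' implementation (their code is in the linked repository): the function names, the exact-match check, the sample answers, and the use of the population standard deviation are all assumptions.

```python
import statistics


def verifiable_reward(predicted_answer: str, reference_answer: str) -> float:
    """Binary reward: 1.0 if the final answer matches the reference exactly.

    Exact string matching is an assumption; practical checkers usually
    normalize mathematical expressions before comparing.
    """
    return 1.0 if predicted_answer.strip() == reference_answer.strip() else 0.0


def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one group of samples drawn for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std, chosen here for simplicity
    return [(r - mean) / (std + eps) for r in rewards]


# Four hypothetical sampled answers to a single training example.
samples = ["12.8", "13", "12.8", "7"]
rewards = [verifiable_reward(s, "12.8") for s in samples]
advantages = group_advantages(rewards)
```

In this sketch, a group whose samples all receive the same reward yields all-zero advantages, so it contributes no policy-gradient signal.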

Article 1726

Title@2025-05-25 (7): PDFBench: A Benchmark for De novo Protein Design from Function

Title: PDFBench: A Benchmark for De novo Protein Design from Function

PDFBench: Ein Benchmark für De novo Protein Design von der Funktion

PDFBench:从函数调出新蛋白设计基准 2505.20346v1

Authors: Jiahao Kuang, Nuowei Liu, Changzhi Sun, Tao Ji, Yuanbin Wu

In recent years, while natural language processing and multimodal learning have seen rapid advancements, the field of de novo protein design has also experienced significant growth. However, most current methods rely on proprietary datasets and evaluation rubrics, making fair comparisons between different approaches challenging. Moreover, these methods often employ evaluation metrics that capture only a subset of the desired properties of designed proteins, lacking a comprehensive assessment framework. To address these issues, we introduce PDFBench, the first comprehensive benchmark for evaluating de novo protein design from function. PDFBench supports two tasks: description-guided design and keyword-guided design. To ensure fair and multifaceted evaluation, we compile 22 metrics covering sequence plausibility, structural fidelity, and language-protein alignment, along with measures of novelty and diversity. We evaluate five state-of-the-art baselines, revealing their respective strengths and weaknesses across tasks. Finally, we analyze inter-metric correlations, exploring the relationships between four categories of metrics, and offering guidelines for metric selection. PDFBench establishes a unified framework to drive future advances in function-driven de novo protein design.

Article 1727

Title@2025-05-25 (7): STRICT: Stress Test of Rendering Images Containing Text

Title: STRICT: Stress Test of Rendering Images Containing Text

STRICT: Stresstest von Rendering-Bildern mit Text

STRICT: 含有文字的图像的显示压力测试 2505.18985v1

Authors: Tianyu Zhang, Xinyu Wang, Zhenghan Tai, Lu Li, Jijun Chi, Jingrui Tian, Hailin He, Suyuchen Wang

While diffusion models have revolutionized text-to-image generation with their ability to synthesize realistic and diverse scenes, they continue to struggle to generate consistent and legible text within images. This shortcoming is commonly attributed to the locality bias inherent in diffusion-based generation, which limits their ability to model long-range spatial dependencies. In this paper, we introduce $\textbf{STRICT}$, a benchmark designed to systematically stress-test the ability of diffusion models to render coherent and instruction-aligned text in images. Our benchmark evaluates models across multiple dimensions: (1) the maximum length of readable text that can be generated; (2) the correctness and legibility of the generated text; and (3) the rate at which instructions for generating text are not followed. We evaluate several state-of-the-art models, including proprietary and open-source variants, and reveal persistent limitations in long-range consistency and instruction-following capabilities. Our findings provide insights into architectural bottlenecks and motivate future research directions in multimodal generative modeling. We release our entire evaluation pipeline at https://github.com/tianyu-z/STRICT-Bench.

Article 1728

Title@2025-05-25 (7): AmorLIP: Efficient Language-Image Pretraining via Amortization

Title: AmorLIP: Efficient Language-Image Pretraining via Amortization

AmorLIP: Effizientes Sprach-Bild-Vortraining über Amortisation

AmorLIP:通过摊销进行高效的语文图像预培训 2505.18983v1

Authors: Haotian Sun, Yitong Li, Yuchen Zhuang, Niao He, Hanjun Dai, Bo Dai

Contrastive Language-Image Pretraining (CLIP) has demonstrated strong zero-shot performance across diverse downstream text-image tasks. Existing CLIP methods typically optimize a contrastive objective using negative samples drawn from each minibatch. To achieve robust representation learning, these methods require extremely large batch sizes and escalate computational demands to hundreds or even thousands of GPUs. Prior approaches to mitigate this issue often compromise downstream performance, prolong training duration, or face scalability challenges with very large datasets. To overcome these limitations, we propose AmorLIP, an efficient CLIP pretraining framework that amortizes expensive computations involved in contrastive learning through lightweight neural networks, which substantially improves training efficiency and performance. Leveraging insights from a spectral factorization of energy-based models, we introduce novel amortization objectives along with practical techniques to improve training stability. Extensive experiments across 38 downstream tasks demonstrate the superior zero-shot classification and retrieval capabilities of AmorLIP, consistently outperforming standard CLIP baselines with substantial relative improvements of up to 12.24%.
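
For context, the in-batch contrastive objective that standard CLIP training optimizes can be sketched as follows. This is the generic symmetric InfoNCE loss, not AmorLIP's amortized objective; the function names, the temperature value, and the toy embeddings are illustrative assumptions. The log-sum-exp normalizer runs over every other item in the minibatch, which is why the quality of the estimate depends on batch size.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _normalize(v):
    norm = math.sqrt(_dot(v, v))
    return [a / norm for a in v]


def _mean_diagonal_nll(logits):
    """Mean of -log softmax(row_i)[i]: item i should match partner i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss with negatives drawn from the same batch."""
    imgs = [_normalize(v) for v in image_embs]
    txts = [_normalize(v) for v in text_embs]
    logits = [[_dot(i, t) / temperature for t in txts] for i in imgs]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (_mean_diagonal_nll(logits) + _mean_diagonal_nll(logits_t))


# Toy batch of two pairs: correctly paired versus deliberately swapped.
aligned = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The correctly paired batch yields a loss near zero, while the swapped batch is penalized heavily.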

Article 1729

Title@2025-05-25 (7): Learning Mamba as a Continual Learner: Meta-learning Selective State Space Models for Efficient Continual Learning

Title: Learning Mamba as a Continual Learner: Meta-learning Selective State Space Models for Efficient Continual Learning

Mamba als Continual Learner lernen: Meta-Learning Selective State Space Models für effizientes Continual Learning

Mamba作为不断学习者学习Mamba:高效持续学习的元学习选择性国家空间模型 2412.00776v4

Authors: Chongyang Zhao, Dong Gong

Continual learning (CL) aims to efficiently learn from a non-stationary data stream, without storing or recomputing all seen samples. CL enables prediction on new tasks by incorporating sequential training samples. Building on this connection between CL and sequential modeling, meta-continual learning (MCL) aims to meta-learn an efficient continual learner as a sequence prediction model, with advanced sequence models like Transformers being natural choices. However, despite decent performance, Transformers rely on a linearly growing cache to store all past representations, conflicting with CL’s objective of not storing all seen samples and limiting efficiency. In this paper, we focus on meta-learning sequence-prediction-based continual learners without retaining all past representations. While attention-free models with fixed-size hidden states (e.g., Linear Transformers) align with CL’s essential goal and efficiency needs, they have shown limited effectiveness in MCL in previous literature. Given Mamba’s strong sequence modeling performance and attention-free nature, we explore a key question: Can attention-free models like Mamba perform well on MCL? By formulating Mamba and the SSM for MCL tasks, we propose MambaCL, a meta-learned continual learner. To enhance MambaCL’s training, we introduce selectivity regularization, leveraging the connection between Mamba and Transformers to guide its behavior over sequences. Furthermore, we study how Mamba and other models perform across various MCL scenarios through extensive and well-designed experiments. Our results highlight the promising performance and strong generalization of Mamba and attention-free models in MCL, demonstrating their potential for efficient continual learning and adaptation.

Article 1730

Title@2025-05-25 (7): LLMScan: Causal Scan for LLM Misbehavior Detection

Title: LLMScan: Causal Scan for LLM Misbehavior Detection

LLMScan: Kausalscan zur Erkennung von LLM-Missverhalten

LLMScan:用于LLM Misbehavior探测的成因扫描 2410.16638v4

Authors: Mengdi Zhang, Kai Kiat Goh, Peixin Zhang, Jun Sun, Rose Lin Xin, Hongyu Zhang

Despite the success of Large Language Models (LLMs) across various fields, their potential to generate untruthful, biased and harmful responses poses significant risks, particularly in critical applications. This highlights the urgent need for systematic methods to detect and prevent such misbehavior. While existing approaches target specific issues such as harmful responses, this work introduces LLMScan, an innovative LLM monitoring technique based on causality analysis, offering a comprehensive solution. LLMScan systematically monitors the inner workings of an LLM through the lens of causal inference, operating on the premise that the LLM’s ‘brain’ behaves differently when misbehaving. By analyzing the causal contributions of the LLM’s input tokens and transformer layers, LLMScan effectively detects misbehavior. Extensive experiments across various tasks and models reveal clear distinctions in the causal distributions between normal behavior and misbehavior, enabling the development of accurate, lightweight detectors for a variety of misbehavior detection tasks.

Article 1731

Title@2025-05-25 (7): FedSKC: Federated Learning with Non-IID Data via Structural Knowledge Collaboration

Title: FedSKC: Federated Learning with Non-IID Data via Structural Knowledge Collaboration

FedSKC: Föderiertes Lernen mit Nicht-IID-Daten über strukturelle Wissenskooperation

FedSKC:通过结构性知识协作,采用非IID数据的联邦学习 2505.18981v1

Authors: Huan Wang, Haoran Li, Huaming Chen, Jun Yan, Lijuan Wang, Jiahua Shi, Shiping Chen, Jun Shen

With the advancement of edge computing, federated learning (FL) shows great promise as a privacy-preserving collaborative learning paradigm. However, one major challenge for FL is the data heterogeneity issue, which refers to the biased labeling preferences among multiple clients, negatively impacting convergence and model performance. Most previous FL methods attempt to tackle the data heterogeneity issue locally or globally, neglecting underlying class-wise structure information contained in each client. In this paper, we first study how data heterogeneity affects the divergence of the model and decompose it into local, global, and sampling drift sub-problems. To explore the potential of using intra-client class-wise structural knowledge in handling these drifts, we propose Federated Learning with Structural Knowledge Collaboration (FedSKC). The key idea of FedSKC is to extract and transfer domain preferences from inter-client data distributions, offering diverse class-relevant knowledge and a fair convergent signal. FedSKC comprises three components: i) local contrastive learning, to prevent weight divergence resulting from local training; ii) global discrepancy aggregation, which addresses the parameter deviation between the server and clients; iii) global period review, correcting for the sampling drift introduced by the server randomly selecting devices. We have theoretically analyzed FedSKC under non-convex objectives and empirically validated its superiority through extensive experimental results.

Article 1732

Title@2025-05-25 (7): GhostPrompt: Jailbreaking Text-to-image Generative Models based on Dynamic Optimization

Title: GhostPrompt: Jailbreaking Text-to-image Generative Models based on Dynamic Optimization

GhostPrompt: Jailbreaking Text-to-image Generative Modelle basierend auf dynamischer Optimierung

GhostPrompt:基于动态最佳化的破狱用文字到图像生成模型 2505.18979v1

Authors: Zixuan Chen, Hao Lin, Ke Xu, Xinghao Jiang, Tanfeng Sun

Text-to-image (T2I) generation models can inadvertently produce not-safe-for-work (NSFW) content, prompting the integration of text and image safety filters. Recent advances employ large language models (LLMs) for semantic-level detection, rendering traditional token-level perturbation attacks largely ineffective. Indeed, our evaluation shows that existing jailbreak methods are ineffective against these modern filters. We introduce GhostPrompt, the first automated jailbreak framework that combines dynamic prompt optimization with multimodal feedback. It consists of two key components: (i) Dynamic Optimization, an iterative process that guides a large language model (LLM) using feedback from text safety filters and CLIP similarity scores to generate semantically aligned adversarial prompts; and (ii) Adaptive Safety Indicator Injection, which formulates the injection of benign visual cues as a reinforcement learning problem to bypass image-level filters. GhostPrompt achieves state-of-the-art performance, increasing the ShieldLM-7B bypass rate from 12.5% (SneakyPrompt) to 99.0%, improving CLIP score from 0.2637 to 0.2762, and reducing the time cost by $4.2 \times$. Moreover, it generalizes to unseen filters including GPT-4.1 and successfully jailbreaks DALL·E 3 to generate NSFW images in our evaluation, revealing systemic vulnerabilities in current multimodal defenses. To support further research on AI safety and red-teaming, we will release code and adversarial prompts under a controlled-access protocol.

Article 1733

Title@2025-05-25 (7): ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting

Title: ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting

ScaleBiO: Skalierbare Bilevel-Optimierung für LLM-Datenumgewichtung

缩放 BIO: LLM 数据重新加权的可缩放双级优化 2406.19976v2

Authors: Rui Pan, Dylan Zhang, Hanning Zhang, Xingyuan Pan, Minrui Xu, Jipeng Zhang, Renjie Pi, Xiaoyu Wang, Tong Zhang

Bilevel optimization has shown its utility across various machine learning settings, yet most algorithms in practice require second-order information, making it challenging to scale them up. Only recently, a paradigm of first-order algorithms has emerged in the theoretical literature, capable of effectively addressing bilevel optimization problems. Nevertheless, the practical efficiency of this paradigm remains unverified, particularly in the context of large language models (LLMs). This paper introduces the first scalable instantiation of this paradigm called ScaleBiO, focusing on bilevel optimization for large-scale LLM data reweighting. By combining with a recently proposed memory-efficient training technique called LISA, our novel algorithm allows the paradigm to scale to $\sim$30B-sized LLMs on $8\times$H100 GPUs, marking the first successful application of bilevel optimization under practical scenarios for large-sized LLMs. Empirically, extensive experiments on data reweighting verify the effectiveness of ScaleBiO for different-scaled models, including Llama-3-8B, Gemma-2-9B, Qwen-2-7B, and Qwen-2.5-32B, where bilevel optimization succeeds in instruction-following and math reasoning tasks, outperforming several popular baselines, including uniform sampling, influence-aware data filtering, and reference-model-based sampling methods. Theoretically, ScaleBiO ensures the optimality of the learned data weights, along with a convergence guarantee matching the conventional first-order bilevel optimization paradigm on smooth and strongly convex objectives.

Article 1734

Title@2025-05-25 (7): GraSS: Scalable Influence Function with Sparse Gradient Compression

Title: GraSS: Scalable Influence Function with Sparse Gradient Compression

GraSS: Skalierbare Einflussfunktion mit Sparse Gradient Compression

GraSS: 带有微缩梯度压缩的可缩放影响函数 2505.18976v1

Authors: Pingbang Hu, Joseph Melkonian, Weijing Tang, Han Zhao, Jiaqi W. Ma

Gradient-based data attribution methods, such as influence functions, are critical for understanding the impact of individual training samples without requiring repeated model retraining. However, their scalability is often limited by the high computational and memory costs associated with per-sample gradient computation. In this work, we propose GraSS, a novel gradient compression algorithm, and its variant FactGraSS, designed specifically for linear layers, both of which explicitly leverage the inherent sparsity of per-sample gradients to achieve sub-linear space and time complexity. Extensive experiments demonstrate the effectiveness of our approach, achieving substantial speedups while preserving data influence fidelity. In particular, FactGraSS achieves up to 165% faster throughput on billion-scale models compared to the previous state-of-the-art baselines. Our code is publicly available at https://github.com/TRAIS-Lab/GraSS.
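
As background on why linear layers admit special treatment: for a linear layer y = Wx, the per-sample gradient with respect to W is the outer product of the output gradient and the layer input. It can therefore be stored as two vectors, and a zero in either factor zeroes a whole row or column of the full gradient. The sketch below illustrates only this standard identity; it is not the GraSS or FactGraSS algorithm, and all names and values are hypothetical.

```python
def outer(output_grad, layer_input):
    """Per-sample weight gradient of y = W x, i.e. output_grad * layer_input^T."""
    return [[g * x for x in layer_input] for g in output_grad]


def count_nonzero(values):
    return sum(1 for v in values if v != 0.0)


# Hypothetical sparse factors for one training sample.
output_grad = [0.0, 2.0, 0.0]       # dL/dy, length d_out = 3
layer_input = [1.0, 0.0, 3.0, 0.0]  # x, length d_in = 4

full_grad = outer(output_grad, layer_input)

dense_size = len(output_grad) * len(layer_input)      # entries in the full matrix
factored_size = len(output_grad) + len(layer_input)   # entries in the two factors
nonzeros_full = sum(count_nonzero(row) for row in full_grad)
nonzeros_product = count_nonzero(output_grad) * count_nonzero(layer_input)
```

Here the full gradient has 12 entries but only 2 nonzeros, which equals the product of the nonzero counts of the two factors.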

Article 1735

Title@2025-05-25 (7): The Final Layer Holds the Key: A Unified and Efficient GNN Calibration Framework

Title: The Final Layer Holds the Key: A Unified and Efficient GNN Calibration Framework

Die letzte Ebene hält den Schlüssel: Ein einheitliches und effizientes GNN-Kalibrierungssystem

最后层掌握着关键:统一有效的GNN校准框架 2505.11335v2

Authors: Jincheng Huang, Jie Xu, Xiaoshuang Shi, Ping Hu, Lei Feng, Xiaofeng Zhu

Graph Neural Networks (GNNs) have demonstrated remarkable effectiveness on graph-based tasks. However, their predictive confidence is often miscalibrated, typically exhibiting under-confidence, which harms the reliability of their decisions. Existing calibration methods for GNNs normally introduce additional calibration components, which fail to capture the intrinsic relationship between the model and the prediction confidence, resulting in limited theoretical guarantees and increased computational overhead. To address this issue, we propose a simple yet efficient graph calibration method. We establish a unified theoretical framework revealing that model confidence is jointly governed by class-centroid-level and node-level calibration at the final layer. Based on this insight, we theoretically show that reducing the weight decay of the final-layer parameters alleviates GNN under-confidence by acting on the class-centroid level, while node-level calibration acts as a finer-grained complement to class-centroid-level calibration, which encourages each test node to be closer to its predicted class centroid in the final-layer representations. Extensive experiments validate the superiority of our method.

Article 1736

Title@2025-05-25 (7): MoLAE: Mixture of Latent Experts for Parameter-Efficient Language Models

Title: MoLAE: Mixture of Latent Experts for Parameter-Efficient Language Models

MoLAE: Mischung aus latenten Experten für Parameter-Effiziente Sprachmodelle

MoLAE:参数有效语言模型原始专家混合 2503.23100v2

Authors: Zehua Liu, Han Wu, Ruifeng She, Xiaojin Fu, Xiongwei Han, Tao Zhong, Mingxuan Yuan

Mixture of Experts (MoE) has become a key architectural paradigm for efficiently scaling Large Language Models (LLMs) by selectively activating a subset of parameters for each input token. However, standard MoE architectures face significant challenges, including high memory consumption and communication overhead during distributed training. In this paper, we introduce Mixture of Latent Experts (MoLAE), a novel parameterization that addresses these limitations by reformulating expert operations through a shared projection into a lower-dimensional latent space, followed by expert-specific transformations. This factorized approach substantially reduces parameter count and computational requirements, particularly in existing LLMs where hidden dimensions significantly exceed MoE intermediate dimensions. We provide a rigorous mathematical framework for transforming pre-trained MoE models into MoLAE architecture, characterizing conditions for optimal factorization, and developing a systematic two-step algorithm for this conversion. Our comprehensive theoretical analysis demonstrates that MoLAE significantly improves efficiency across multiple dimensions while preserving model capabilities. Experimental results confirm that MoLAE achieves comparable performance to standard MoE with substantially reduced resource requirements.
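
The parameter saving from a shared latent projection can be illustrated with simple counting. The sketch below assumes one plausible reading of the factorization: each expert's projection W_i (inter_dim x hidden_dim) is replaced by an expert-specific matrix B_i (inter_dim x latent_dim) applied after a shared matrix A (latent_dim x hidden_dim). The shapes, names, and example sizes are assumptions for illustration, not the paper's exact parameterization.

```python
def standard_expert_params(num_experts, hidden_dim, inter_dim):
    """Each expert owns a full (inter_dim x hidden_dim) projection."""
    return num_experts * inter_dim * hidden_dim


def latent_expert_params(num_experts, hidden_dim, inter_dim, latent_dim):
    """One shared (latent_dim x hidden_dim) map plus per-expert (inter_dim x latent_dim) maps."""
    shared = latent_dim * hidden_dim
    per_expert = num_experts * inter_dim * latent_dim
    return shared + per_expert


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def latent_expert_forward(shared_proj, expert_proj, x):
    """Compute B_i (A x): project into the shared latent space, then apply the expert."""
    return matvec(expert_proj, matvec(shared_proj, x))


# Hypothetical sizes where the hidden dimension exceeds the expert intermediate dimension.
dense = standard_expert_params(num_experts=8, hidden_dim=4096, inter_dim=1024)
factored = latent_expert_params(num_experts=8, hidden_dim=4096, inter_dim=1024, latent_dim=512)
ratio = factored / dense
```

With these illustrative sizes the factored form keeps 18.75% of the original projection parameters; the achievable ratio depends entirely on the latent dimension chosen.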

Article 1737

Title@2025-05-25 (7): Multi-Step Consistency Models: Fast Generation with Theoretical Guarantees

Title: Multi-Step Consistency Models: Fast Generation with Theoretical Guarantees

Multi-Step-Konsistenzmodelle: Schnelle Generation mit theoretischen Garantien

多层次一致性模式:有理论保障的快速一代 2505.01049v2

Authors: Nishant Jain, Xunpeng Huang, Yian Ma, Tong Zhang

Consistency models have recently emerged as a compelling alternative to traditional SDE-based diffusion models. They offer a significant acceleration in generation by producing high-quality samples in very few steps. Despite their empirical success, a proper theoretical justification for their speed-up is still lacking. In this work, we address the gap by providing a theoretical analysis of consistency models capable of mapping inputs at a given time to arbitrary points along the reverse trajectory. We show that one can achieve a KL divergence of order $ O(\varepsilon^2) $ using only $ O\left(\log\left(\frac{d}{\varepsilon}\right)\right) $ iterations with a constant step size. Additionally, under minimal assumptions on the data distribution (the non-smooth case), an increasingly common setting in recent diffusion model analyses, we show that a similar KL convergence guarantee can be obtained, with the number of steps scaling as $ O\left(d \log\left(\frac{d}{\varepsilon}\right)\right) $. Going further, we also provide a theoretical analysis for estimation of such consistency models, concluding that accurate learning is feasible using small discretization steps, both in smooth and non-smooth settings. Notably, our results for the non-smooth case yield best-in-class convergence rates compared to existing SDE- or ODE-based analyses under minimal assumptions.
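
To get a feel for the gap between the two stated rates, the growth terms can be evaluated numerically. This is only an order-of-magnitude illustration: the hidden constants in the O(·) bounds are unknown and are dropped here, and the dimension and tolerance below are hypothetical values, not taken from the paper.

```python
import math


def smooth_case_growth(dim, eps):
    """Growth term log(d / eps) of the smooth-case iteration bound (constant dropped)."""
    return math.log(dim / eps)


def nonsmooth_case_growth(dim, eps):
    """Growth term d * log(d / eps) of the non-smooth-case bound (constant dropped)."""
    return dim * math.log(dim / eps)


# Hypothetical setting: a 32 x 32 x 3 image (d = 3072) and eps = 0.01.
dim, eps = 3072, 0.01
smooth = smooth_case_growth(dim, eps)
nonsmooth = nonsmooth_case_growth(dim, eps)
```

Up to constants, the two terms differ by exactly a factor of d, so the logarithmic rate in the smooth case matters most in high dimensions.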

Article 1738

Title@2025-05-25 (7): Genetic Influences on Brain Aging: Analyzing Sex Differences in the UK Biobank using Structural MRI

Title: Genetic Influences on Brain Aging: Analyzing Sex Differences in the UK Biobank using Structural MRI

Genetische Einflüsse auf das Altern des Gehirns: Analyse von Geschlechtsunterschieden in der britischen Biobank mittels struktureller MRT

对大脑老龄化的遗传基因影响:利用结构MRI分析联合王国生物库中的性别差异 2505.20344v1

Authors: Karen Ardila, Aashka Mohite, Abdoljalil Addeh, Amanda V. Tyndall, Cindy K. Barha, Quan Long, M. Ethan MacDonald

Brain aging trajectories differ between males and females, yet the genetic factors underlying these differences remain underexplored. Using structural MRI and genotyping data from 40,940 UK Biobank participants (aged 45-83), we computed Brain Age Gap Estimates (BrainAGE) for total brain, hippocampal, and ventricular volumes. We conducted sex-stratified genome-wide association studies (GWAS) and post-GWAS analyses to identify genetic variants associated with accelerated brain aging. Distinct gene sets emerged by sex: in females, neurotransmitter transport and mitochondrial stress response genes were implicated; in males, immune and inflammation-related genes dominated. Shared genes, including GMNC and OSTN, were consistently linked to brain volumes across sexes, suggesting core roles in neurostructural maintenance. Tissue expression analyses revealed sex-specific enrichment in pathways tied to neurodegeneration. These findings highlight the importance of sex-stratified approaches in aging research and suggest genetic targets for personalized interventions against age-related cognitive decline.

Article 1739

Title@2025-05-25 (7): Protein Design with Dynamic Protein Vocabulary

Title: Protein Design with Dynamic Protein Vocabulary

Protein Design mit dynamischem Protein Vokabular

配有动态蛋白质词汇词典的蛋白因设计 2505.18966v1

Authors: Nuowei Liu, Jiahao Kuang, Yanting Liu, Changzhi Sun, Tao Ji, Yuanbin Wu, Man Lan

Protein design is a fundamental challenge in biotechnology, aiming to design novel sequences with specific functions within the vast space of possible proteins. Recent advances in deep generative models have enabled function-based protein design from textual descriptions, yet these models struggle with structural plausibility. Inspired by classical protein design methods that leverage natural protein structures, we explore whether incorporating fragments from natural proteins can enhance foldability in generative models. Our empirical results show that even random incorporation of fragments improves foldability. Building on this insight, we introduce ProDVa, a novel protein design approach that integrates a text encoder for functional descriptions, a protein language model for designing proteins, and a fragment encoder to dynamically retrieve protein fragments based on textual functional descriptions. Experimental results demonstrate that our approach effectively designs protein sequences that are both functionally aligned and structurally plausible. Compared to state-of-the-art models, ProDVa achieves comparable function alignment using less than 0.04% of the training data, while designing significantly more well-folded proteins, with the proportion of proteins having pLDDT above 70 increasing by 7.38% and those with PAE below 10 increasing by 9.6%.

Article 1740

Title@2025-05-25 (7): Expansion Span: Combining Fading Memory and Retrieval in Hybrid State Space Models

Title: Expansion Span: Combining Fading Memory and Retrieval in Hybrid State Space Models

Expansion Span: Kombinieren von Fading Memory und Retrieval in Hybrid State Space Models

扩展空间:在混合国家空间模型中将平缓内存和检索合并 2412.13328v2

Authors: Elvis Nunez, Luca Zancato, Benjamin Bowman, Aditya Golatkar, Wei Xia, Stefano Soatto

The “state” of State Space Models (SSMs) represents their memory, which fades exponentially over an unbounded span. By contrast, Attention-based models have “eidetic” (i.e., verbatim, or photographic) memory over a finite span (context size). Hybrid architectures combine State Space layers with Attention, but still cannot recall the distant past and can access only the most recent tokens eidetically. Unlike current methods of combining SSM and Attention layers, we allow the state to be allocated based on relevancy rather than recency. In this way, for every new set of query tokens, our models can “eidetically” access tokens from beyond the Attention span of current Hybrid SSMs without requiring extra hardware resources. We introduce a method to expand the memory span of the hybrid state by “reserving” a fraction of the Attention context for tokens retrieved from arbitrarily far in the past, thus expanding the eidetic memory span of the overall state. We call this reserved fraction of tokens the “expansion span,” and the mechanism to retrieve and aggregate it “Span-Expanded Attention” (SE-Attn). To adapt Hybrid models to using SE-Attn, we propose a novel fine-tuning method that extends LoRA to Hybrid models (HyLoRA) and allows efficient adaptation on long spans of tokens. We show that SE-Attn enables us to efficiently adapt pre-trained Hybrid models on sequences of tokens up to 8 times longer than the ones used for pre-training. We show that HyLoRA with SE-Attn is cheaper and more performant than alternatives like LongLoRA when applied to Hybrid models on natural language benchmarks with long-range dependencies, such as PG-19, RULER, and other common natural language downstream tasks.
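
The idea of reserving part of the Attention context for retrieved tokens can be sketched at the level of token positions. This is a minimal illustration under stated assumptions, not the SE-Attn mechanism itself: the per-token relevance scores, the top-k selection rule, and the function name are hypothetical, and the abstract does not specify how retrieval or aggregation is actually performed.

```python
def build_attention_context(relevance, context_size, reserved_fraction):
    """Pick which past token positions the next query chunk attends to.

    relevance[i] is a hypothetical relevance score of past token i for the
    current query; how such scores are computed is left unspecified here.
    """
    num_tokens = len(relevance)
    reserved = int(context_size * reserved_fraction)  # size of the "expansion span"
    recent_count = context_size - reserved

    # Most recent tokens: the usual recency-based part of the context.
    recent_start = max(0, num_tokens - recent_count)
    recent = list(range(recent_start, num_tokens))

    # Older tokens compete for the reserved slots by relevance, not recency.
    older = range(recent_start)
    top_older = sorted(older, key=lambda i: relevance[i], reverse=True)[:reserved]
    return sorted(top_older) + recent


# Ten past tokens, a context of four slots, half of them reserved for retrieval.
scores = [0.1, 0.9, 0.2, 0.8, 0.0, 0.3, 0.1, 0.2, 0.5, 0.4]
positions = build_attention_context(scores, context_size=4, reserved_fraction=0.5)
recency_only = build_attention_context(scores, context_size=4, reserved_fraction=0.0)
```

With half the context reserved, the two most relevant old positions (1 and 3) are attended to alongside the two most recent ones, at the same total context size as the recency-only baseline.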

Article 1741

Title: How Do Images Align and Complement LiDAR? Towards a Harmonized Multi-modal 3D Panoptic Segmentation

Wie richten und ergänzen Bilder LiDAR? Auf dem Weg zu einer harmonisierten multimodalen 3D-Panoptischen Segmentierung

图像如何对齐和补充 LiDAR ? 2505.18956v1

Authors: Yining Pan, Qiongjie Cui, Xulei Yang, Na Zhao

LiDAR-based 3D panoptic segmentation often struggles with the inherent sparsity of data from LiDAR sensors, which makes it challenging to accurately recognize distant or small objects. Recently, a few studies have sought to overcome this challenge by integrating LiDAR inputs with camera images, leveraging the rich and dense texture information provided by the latter. While these approaches have shown promising results, they still face challenges, such as misalignment during data augmentation and the reliance on post-processing steps. To address these issues, we propose Image-Assists-LiDAR (IAL), a novel multi-modal 3D panoptic segmentation framework. In IAL, we first introduce a modality-synchronized data augmentation strategy, PieAug, to ensure alignment between LiDAR and image inputs from the start. Next, we adopt a transformer decoder to directly predict panoptic segmentation results. To effectively fuse LiDAR and image features into tokens for the decoder, we design a Geometric-guided Token Fusion (GTF) module. Additionally, we leverage the complementary strengths of each modality as priors for query initialization through a Prior-based Query Generation (PQG) module, enhancing the decoder’s ability to generate accurate instance masks. Our IAL framework achieves state-of-the-art performance compared to previous multi-modal 3D panoptic segmentation methods on two widely used benchmarks. Code and models are publicly available at https://github.com/IMPL-Lab/IAL.git.

Article 1742

Title@2025-05-25 (7): Online Knowledge Distillation with Reward Guidance

Title: Online Knowledge Distillation with Reward Guidance

Online-Wissensdestillation mit lohnender Anleitung

网上知识蒸馏与奖励指导 2505.18952v1

Authors: Chen Jia

This work studies knowledge distillation (KD) for large language models (LLMs) through preference optimization. We propose a reward-guided imitation learning framework for sequential KD, formulating a min-max optimization problem between the policy and reward model (RM) to minimize the performance gap between the student and teacher policies. Specifically, the reward optimization is constrained to achieve near-optimality within a confidence set for preference alignment. For preference data construction, we explore both offline and online preference-based KD. Additionally, we reformulate the RM using the $Q$-value function and extend the framework to white-box KD, where the teacher policy’s predicted probabilities are accessible. Theoretical analysis and empirical results demonstrate the effectiveness of the proposed framework.

Article 1743

Title@2025-05-25 (7): The Price of Format: Diversity Collapse in LLMs

Title: The Price of Format: Diversity Collapse in LLMs

Der Preis des Formats: Diversity Collapse in LLMs

格式价格:多样化在LLMM中崩溃 2505.18949v1

Authors: Longfei Yun, Chenyang An, Zilong Wang, Letian Peng, Jingbo Shang

Instruction-tuned large language models (LLMs) employ structured templates, such as role markers and special tokens, to enforce format consistency during inference. However, we identify a critical limitation of such formatting: it induces a phenomenon we term diversity collapse, where the model generates semantically similar outputs for open-ended inputs, undermining creativity and variability. We systematically evaluate this effect across tasks like story completion and free-form generation, finding that (1) diversity collapse persists even under high-temperature sampling, and (2) structural tokens in templates significantly constrain the model’s output space. To contextualize these findings, we fine-tune the same model using a range of structured prompts and then evaluate them across three axes: downstream task performance, alignment behavior, and output diversity. Our analysis shows that format consistency between fine-tuning and inference is crucial for structure-sensitive tasks (e.g., GSM8K, IFEval), but has marginal influence on knowledge-heavy tasks (e.g., MMLU, WebQuestions). In contrast, output diversity is primarily governed by the presence or absence of structural tokens, with minimal formatting yielding the most diverse outputs. These findings reveal that current prompting conventions, while beneficial for alignment, may inadvertently suppress output diversity, underscoring the need for diversity-aware prompt design and instruction tuning.

Article 1744

Title@2025-05-25 (7): Exact Expressive Power of Transformers with Padding

Title: Exact Expressive Power of Transformers with Padding

Exakte Expressive Kraft von Transformatoren mit Padding

带有斜面的变形器的精确表达力 2505.18948v1

Authors: William Merrill, Ashish Sabharwal

Chain of thought is a natural inference-time method for increasing the computational power of transformer-based large language models (LLMs), but comes at the cost of sequential decoding. Are there more efficient alternatives to expand a transformer’s expressive power without adding parameters? We consider transformers with padding tokens as a form of parallelizable test-time compute. We show that averaging-hard-attention, masked-pre-norm transformers with polynomial padding converge to precisely the class $\mathsf{TC}^0$ of extremely parallelizable problems. While the $\mathsf{TC}^0$ upper bound was known, proving a matching lower bound had been elusive. Further, our novel analysis reveals the precise expanded power of padded transformers when coupled with another form of inference-time compute, namely dynamically increasing depth via looping. Our core technical contribution is to show how padding helps bring the notions of complete problems and reductions, which have been a cornerstone of classical complexity theory, to the formal study of transformers. Armed with this new tool, we prove that padded transformers with $O(\log^d n)$ looping on inputs of length $n$ recognize exactly the class $\mathsf{TC}^d$ of moderately parallelizable problems. Thus, padding and looping together systematically expand transformers’ expressive power: with polylogarithmic looping, padded transformers converge to the class $\mathsf{NC}$, the best that could be expected without losing parallelism (unless $\mathsf{NC} = \mathsf{P}$). Our results thus motivate further exploration of padding and looping as parallelizable alternatives to chain of thought.

Article 1745

Title@2025-05-25 (7): Minimax Optimal Reinforcement Learning with Quasi-Optimism

Title: Minimax Optimal Reinforcement Learning with Quasi-Optimism

Minimax Optimales Stärkungslernen mit Quasi-Optimismus

以准适应主义进行最优化强化学习 2503.00810v2

Authors: Harin Lee, Min-hwan Oh

In our quest for a reinforcement learning (RL) algorithm that is both practical and provably optimal, we introduce EQO (Exploration via Quasi-Optimism). Unlike existing minimax optimal approaches, EQO avoids reliance on empirical variances and employs a simple bonus term proportional to the inverse of the state-action visit count. Central to EQO is the concept of quasi-optimism, where estimated values need not be fully optimistic, allowing for a simpler yet effective exploration strategy. The algorithm achieves the sharpest known regret bound for tabular RL under the mildest assumptions, proving that fast convergence can be attained with a practical and computationally efficient approach. Empirical evaluations demonstrate that EQO consistently outperforms existing algorithms in both regret performance and computational efficiency, providing the best of both theoretical soundness and practical effectiveness.
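
The bonus described above can be illustrated with a minimal sketch. The abstract only states that the bonus is proportional to the inverse of the state-action visit count; the constant `c`, the treatment of unvisited pairs as having count 1, and the names `VisitCounter`, `inverse_count_bonus`, and `sqrt_count_bonus` are assumptions made for illustration, not the authors' implementation. The square-root form is included only as the familiar count-based bonus for comparison.

```python
import math
from collections import defaultdict


class VisitCounter:
    """Tracks N(s, a) and returns an exploration bonus of the form c / N(s, a)."""

    def __init__(self):
        self.counts = defaultdict(int)

    def update(self, state, action):
        # Record one visit to the (state, action) pair.
        self.counts[(state, action)] += 1

    def inverse_count_bonus(self, state, action, c=1.0):
        # Assumption: an unvisited pair is treated as if it had count 1.
        n = max(1, self.counts.get((state, action), 0))
        return c / n


def sqrt_count_bonus(n, c=1.0):
    # Familiar c / sqrt(N) bonus, shown only for comparison.
    return c / math.sqrt(max(1, n))
```

With the same constant, the inverse-count bonus shrinks faster than the square-root bonus as a pair is revisited, which is consistent with the abstract's remark that estimated values need not be fully optimistic.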

Article 1746

Title@2025-05-25 (7): Efficient Pauli channel estimation with logarithmic quantum memory

Title: Efficient Pauli channel estimation with logarithmic quantum memory

Effiziente Pauli-Kanalschätzung mit logarithmischem Quantenspeicher

具有对数量内存的高效保利频道估计 2309.14326v4

Authors: Sitan Chen, Weiyuan Gong

Here we revisit one of the prototypical tasks for characterizing the structure of noise in quantum devices: estimating every eigenvalue of an $n$-qubit Pauli noise channel to error $\epsilon$. Prior work [14] proved no-go theorems for this task in the practical regime where one has a limited amount of quantum memory, e.g. any protocol with $\le 0.99n$ ancilla qubits of quantum memory must make exponentially many measurements, provided it is non-concatenating. Such protocols can only interact with the channel by repeatedly preparing a state, passing it through the channel, and measuring immediately afterward. This left open a natural question: does the lower bound hold even for general protocols, i.e. ones which chain together many queries to the channel, interleaved with arbitrary data-processing channels, before measuring? Surprisingly, in this work we show the opposite: there is a protocol that can estimate the eigenvalues of a Pauli channel to error $\epsilon$ using only $O(\log n/\epsilon^2)$ ancilla and $\tilde{O}(n^2/\epsilon^2)$ measurements. In contrast, we show that any protocol with zero ancilla, even a concatenating one, must make $\Omega(2^n/\epsilon^2)$ measurements, which is tight. Our results imply, to our knowledge, the first quantum learning task where logarithmically many qubits of quantum memory suffice for an exponential statistical advantage. Our protocol can be naturally extended to a protocol that learns the eigenvalues of Pauli terms within any subset $A$ of a Pauli channel with $O(\log\log(|A|)/\epsilon^2)$ ancilla and $\tilde{O}(n^2/\epsilon^2)$ measurements.

Article 1747

Title@2025-05-25 (7): Structural Alignment Improves Graph Test-Time Adaptation

Title: Structural Alignment Improves Graph Test-Time Adaptation

Struktural Alignment verbessert Graph Test-Time Anpassung

结构调整改进图示测试时间适应 2502.18334v2

Authors: Hans Hao-Hsun Hsu, Shikun Liu, Han Zhao, Pan Li

Graph-based learning excels at capturing interaction patterns in diverse domains like recommendation, fraud detection, and particle physics. However, its performance often degrades under distribution shifts, especially those altering network connectivity. Current methods to address these shifts typically require retraining with the source dataset, which is often infeasible due to computational or privacy limitations. We introduce Test-Time Structural Alignment (TSA), a novel algorithm for Graph Test-Time Adaptation (GTTA) that aligns graph structures during inference without accessing the source data. Grounded in a theoretical understanding of graph data distribution shifts, TSA employs three synergistic strategies: uncertainty-aware neighborhood weighting to accommodate neighbor label distribution shifts, adaptive balancing of self-node and aggregated neighborhood representations based on their signal-to-noise ratio, and decision boundary refinement to correct residual label and feature shifts. Extensive experiments on synthetic and real-world datasets demonstrate TSA’s consistent outperformance of both non-graph TTA methods and state-of-the-art GTTA baselines.

Article 1748

Title@2025-05-25 (7): Chi-Square Wavelet Graph Neural Networks for Heterogeneous Graph Anomaly Detection

Title: Chi-Square Wavelet Graph Neural Networks for Heterogeneous Graph Anomaly Detection

Chi-Square Wavelet Graph Neural Networks für Heterogene Graph Anomalie Detection

用于异源图异常异常图探测的千平方波浪图神经网络 2505.18934v1

Authors: Xiping Li, Xiangyu Dong, Xingyi Zhang, Kun Xie, Yuanhao Feng, Bo Wang, Guilin Li, Wuxiong Zeng, Xiujun Shu, Sibo Wang

Graph Anomaly Detection (GAD) in heterogeneous networks presents unique challenges due to node and edge heterogeneity. Existing Graph Neural Network (GNN) methods primarily focus on homogeneous GAD and thus fail to address three key issues: (C1) Capturing abnormal signal and rich semantics across diverse meta-paths; (C2) Retaining high-frequency content in HIN dimension alignment; and (C3) Learning effectively from difficult anomaly samples with class imbalance. To overcome these, we propose ChiGAD, a spectral GNN framework based on a novel Chi-Square filter, inspired by the wavelet effectiveness in diverse domains. Specifically, ChiGAD consists of: (1) Multi-Graph Chi-Square Filter, which captures anomalous information via applying dedicated Chi-Square filters to each meta-path graph; (2) Interactive Meta-Graph Convolution, which aligns features while preserving high-frequency information and incorporates heterogeneous messages by a unified Chi-Square Filter; and (3) Contribution-Informed Cross-Entropy Loss, which prioritizes difficult anomalies to address class imbalance. Extensive experiments on public and industrial datasets show that ChiGAD outperforms state-of-the-art models on multiple metrics. Additionally, its homogeneous variant, ChiGNN, excels on seven GAD datasets, validating the effectiveness of Chi-Square filters. Our code is available at https://github.com/HsipingLi/ChiGAD.

Article 1749

Title@2025-05-25 (7): Can Large Language Models Infer Causal Relationships from Real-World Text?

Title: Can Large Language Models Infer Causal Relationships from Real-World Text?

Können große Sprachmodelle Kausalbeziehungen aus Real-World Text ableiten?

大语言模型能否从真实世界文本中推断出因果关系? 2505.18931v1

Authors: Ryan Saklad, Aman Chadha, Oleg Pavlov, Raha Moraffah

Understanding and inferring causal relationships from texts is a core aspect of human cognition and is essential for advancing large language models (LLMs) towards artificial general intelligence. Existing work primarily focuses on synthetically generated texts which involve simple causal relationships explicitly mentioned in the text. This fails to reflect the complexities of real-world tasks. In this paper, we investigate whether LLMs are capable of inferring causal relationships from real-world texts. We develop a benchmark drawn from real-world academic literature which includes diverse texts with respect to length, complexity of relationships (different levels of explicitness, number of events, and causal relationships), and domains and sub-domains. To the best of our knowledge, our benchmark is the first-ever real-world dataset for this task. Our experiments on state-of-the-art LLMs evaluated on our proposed benchmark demonstrate significant challenges, with the best-performing model achieving an average F1 score of only 0.477. Analysis reveals common pitfalls: difficulty with implicitly stated information, in distinguishing relevant causal factors from surrounding contextual details, and with connecting causally relevant information spread across lengthy textual passages. By systematically characterizing these deficiencies, our benchmark offers targeted insights for further research into advancing LLM causal reasoning.

Article 1750

Title@2025-05-25 (7): Hybrid Neural-MPM for Interactive Fluid Simulations in Real-Time

Title: Hybrid Neural-MPM for Interactive Fluid Simulations in Real-Time

Hybrid-Neural-MPM für interaktive Fluidsimulationen in Echtzeit

用于实时交互流力模拟的神经-MPM混合神经-MPM 2505.18926v1

Authors: Jingxuan Xu, Hong Huang, Chuhang Zou, Manolis Savva, Yunchao Wei, Wuyang Chen

We propose a neural physics system for real-time, interactive fluid simulations. Traditional physics-based methods, while accurate, are computationally intensive and suffer from latency issues. Recent machine-learning methods reduce computational costs while preserving fidelity; yet most still fail to satisfy the latency constraints for real-time use and lack support for interactive applications. To bridge this gap, we introduce a novel hybrid method that integrates numerical simulation, neural physics, and generative control. Our neural physics jointly pursues low-latency simulation and high physical fidelity by employing a fallback safeguard to classical numerical solvers. Furthermore, we develop a diffusion-based controller that is trained using a reverse modeling strategy to generate external dynamic force fields for fluid manipulation. Our system demonstrates robust performance across diverse 2D/3D scenarios, material types, and obstacle interactions, achieving real-time simulations at high frame rates (11~29% latency) while enabling fluid control guided by user-friendly freehand sketches. We present a significant step towards practical, controllable, and physically plausible fluid simulations for real-time interactive applications. We promise to release both models and data upon acceptance.

Article 1751

Title@2025-05-25 (7): Graph-Based Operator Learning from Limited Data on Irregular Domains

Title: Graph-Based Operator Learning from Limited Data on Irregular Domains

Graph-based Operator Lernen von begrenzten Daten über irreguläre Domains

以图图为基础的操作员学习关于非常规域域的有限数据 2505.18923v1

Authors: Yile Li, Shandian Zhe

Operator learning seeks to approximate mappings from input functions to output solutions, particularly in the context of partial differential equations (PDEs). While recent advances such as DeepONet and Fourier Neural Operator (FNO) have demonstrated strong performance, they often rely on regular grid discretizations, limiting their applicability to complex or irregular domains. In this work, we propose a Graph-based Operator Learning with Attention (GOLA) framework that addresses this limitation by constructing graphs from irregularly sampled spatial points and leveraging attention-enhanced Graph Neural Networks (GNNs) to model spatial dependencies with global information. To improve the expressive capacity, we introduce a Fourier-based encoder that projects input functions into a frequency space using learnable complex coefficients, allowing for flexible embeddings even with sparse or nonuniform samples. We evaluated our approach across a range of 2D PDEs, including Darcy Flow, Advection, Eikonal, and Nonlinear Diffusion, under varying sampling densities. Our method consistently outperforms baselines, particularly in data-scarce regimes, demonstrating strong generalization and efficiency on irregular domains.

Article 1752

Title@2025-05-25 (7): ALPCAHUS: Subspace Clustering for Heteroscedastic Data

Title: ALPCAHUS: Subspace Clustering for Heteroscedastic Data

ALPCAHUS: Subraum-Clustering für heteroskedastische Daten

ALPCAHUS: 用于异方差数据的子空间聚类 2505.18918v1

Authors: Javier Salazar Cavazos, Jeffrey A Fessler, Laura Balzano

Principal component analysis (PCA) is a key tool in the field of data dimensionality reduction. Various methods have been proposed to extend PCA to the union of subspace (UoS) setting for clustering data that come from multiple subspaces like K-Subspaces (KSS). However, some applications involve heterogeneous data that vary in quality due to noise characteristics associated with each data sample. Heteroscedastic methods aim to deal with such mixed data quality. This paper develops a heteroscedastic-focused subspace clustering method, named ALPCAHUS, that can estimate the sample-wise noise variances and use this information to improve the estimate of the subspace bases associated with the low-rank structure of the data. This clustering algorithm builds on K-Subspaces (KSS) principles by extending the recently proposed heteroscedastic PCA method, named LR-ALPCAH, for clusters with heteroscedastic noise in the UoS setting. Simulations and real-data experiments show the effectiveness of accounting for data heteroscedasticity compared to existing clustering algorithms. Code available at https://github.com/javiersc1/ALPCAHUS.

Article 1753

Title@2025-05-25 (7): Behavior Injection: Preparing Language Models for Reinforcement Learning

Title: Behavior Injection: Preparing Language Models for Reinforcement Learning

Verhaltensinjektion: Vorbereitung von Sprachmodellen für verstärktes Lernen

行为注射:为强化学习准备语言模式 2505.18917v1

Authors: Zhepeng Cen, Yihang Yao, William Han, Zuxin Liu, Ding Zhao

Reinforcement fine-tuning (RFT) has emerged as a powerful post-training technique to incentivize the reasoning ability of large language models (LLMs). However, LLMs can respond very inconsistently to RFT: some show substantial performance gains, while others plateau or even degrade. To understand this divergence, we analyze the per-step influence of the RL objective and identify two key conditions for effective post-training: (1) RL-informative rollout accuracy, and (2) strong data co-influence, which quantifies how much the training data affects performance on other samples. Guided by these insights, we propose behavior injection, a task-agnostic data-augmentation scheme applied prior to RL. Behavior injection enriches the supervised finetuning (SFT) data by seeding exploratory and exploitative behaviors, effectively making the model more RL-ready. We evaluate our method across two reasoning benchmarks with multiple base models. The results demonstrate that our theoretically motivated augmentation can significantly increase the performance gain from RFT over the pre-RL model.

Article 1754

Title@2025-05-25 (7): PySAD: A Streaming Anomaly Detection Framework in Python

Title: PySAD: A Streaming Anomaly Detection Framework in Python

PySAD: Ein Streaming-Anomaly Detection-Framework in Python

PySAD: Python 流动异常检测框架 2009.02572v2

Authors: Selim F. Yilmaz, Suleyman S. Kozat

Streaming anomaly detection requires algorithms that operate under strict constraints: bounded memory, single-pass processing, and constant-time complexity. We present PySAD, a comprehensive Python framework addressing these challenges through a unified architecture. The framework implements 17+ streaming algorithms (LODA, Half-Space Trees, xStream) with specialized components including projectors, probability calibrators, and postprocessors. Unlike existing batch-focused frameworks, PySAD enables efficient real-time processing with bounded memory while maintaining compatibility with PyOD and scikit-learn. Supporting all learning paradigms for univariate and multivariate streams, PySAD provides the most comprehensive streaming anomaly detection toolkit in Python. The source code is publicly available at github.com/selimfirat/pysad.
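
The three constraints named above (bounded memory, single-pass processing, constant-time updates) can be made concrete with a small detector. The sketch below is not PySAD's API and is not one of the algorithms the framework ships; `StreamingZScore` is a hypothetical class that uses Welford's online mean and variance update to score each point before learning from it.

```python
import math


class StreamingZScore:
    """Scores each point by |x - mean| / std, keeping only three numbers of state."""

    def __init__(self):
        self.n = 0        # number of points seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def score(self, x):
        # Fewer than two points gives no variance estimate, so return 0.
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(x - self.mean) / std if std > 0 else 0.0

    def update(self, x):
        # Welford's update: O(1) time and O(1) memory per point.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def score_then_update(self, x):
        # Score against the past only, then absorb the point (single pass).
        s = self.score(x)
        self.update(x)
        return s
```

Scoring before updating keeps the current point from masking its own anomaly, which is the usual convention for streaming detectors of this kind.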

Article 1755

Title@2025-05-25 (7): Understanding Multimodal LLMs Under Distribution Shifts: An Information-Theoretic Approach

Title: Understanding Multimodal LLMs Under Distribution Shifts: An Information-Theoretic Approach

Multimodale LLMs unter Verteilungsverschiebungen verstehen: Ein informationstheoretischer Ansatz

在分销变更下理解多式LLMs:信息理论方法 2502.00577v2

Authors: Changdae Oh, Zhen Fang, Shawn Im, Xuefeng Du, Yixuan Li

Multimodal large language models (MLLMs) have shown promising capabilities but struggle under distribution shifts, where evaluation data differ from instruction tuning distributions. Although previous works have provided empirical evaluations, we argue that establishing a formal framework that can characterize and quantify the risk of MLLMs is necessary to ensure the safe and reliable application of MLLMs in the real world. By taking an information-theoretic perspective, we propose the first theoretical framework that enables the quantification of the maximum risk of MLLMs under distribution shifts. Central to our framework is the introduction of Effective Mutual Information (EMI), a principled metric that quantifies the relevance between input queries and model responses. We derive an upper bound for the EMI difference between in-distribution (ID) and out-of-distribution (OOD) data, connecting it to visual and textual distributional discrepancies. Extensive experiments on real benchmark datasets, spanning 61 shift scenarios, empirically validate our theoretical insights.

Article 1756

Title@2025-05-25 (7): On the Role of Label Noise in the Feature Learning Process

Title: On the Role of Label Noise in the Feature Learning Process

Über die Rolle von Etikettengeräuschen im Feature-Learning-Prozess

关于标签噪音在专题学习过程中的作用 2505.18909v1

Authors: Andi Han, Wei Huang, Zhanpeng Zhou, Gang Niu, Wuyang Chen, Junchi Yan, Akiko Takeda, Taiji Suzuki

Deep learning with noisy labels presents significant challenges. In this work, we theoretically characterize the role of label noise from a feature learning perspective. Specifically, we consider a signal-noise data distribution, where each sample comprises a label-dependent signal and label-independent noise, and rigorously analyze the training dynamics of a two-layer convolutional neural network under this data setup, along with the presence of label noise. Our analysis identifies two key stages. In Stage I, the model perfectly fits all the clean samples (i.e., samples without label noise) while ignoring the noisy ones (i.e., samples with noisy labels). During this stage, the model learns the signal from the clean samples, which generalizes well on unseen data. In Stage II, as the training loss converges, the gradient in the direction of noise surpasses that of the signal, leading to overfitting on noisy samples. Eventually, the model memorizes the noise present in the noisy samples and degrades its generalization ability. Furthermore, our analysis provides a theoretical basis for two widely used techniques for tackling label noise: early stopping and sample selection. Experiments on both synthetic and real-world setups validate our theory.

Article 1757

Title@2025-05-25 (7): Stronger Enforcement of Instruction Hierarchy via Augmented Intermediate Representations

Title: Stronger Enforcement of Instruction Hierarchy via Augmented Intermediate Representations

Stärkere Durchsetzung der Instruktionshierarchie durch Augmented Intermediate Representations

通过扩大中级代表,加强执行指示分级制度 2505.18907v1

Authors: Sanjay Kariyappa, G. Edward Suh

Prompt injection attacks are a critical security vulnerability in large language models (LLMs), allowing attackers to hijack model behavior by injecting malicious instructions within the input context. Recent defense mechanisms have leveraged an Instruction Hierarchy (IH) Signal, often implemented through special delimiter tokens or additive embeddings to denote the privilege level of input tokens. However, these prior works typically inject the IH signal exclusively at the initial input layer, which we hypothesize limits its ability to effectively distinguish the privilege levels of tokens as it propagates through the different layers of the model. To overcome this limitation, we introduce a novel approach that injects the IH signal into the intermediate token representations within the network. Our method augments these representations with layer-specific trainable embeddings that encode the privilege information. Our evaluations across multiple models and training methods reveal that our proposal yields between $1.6\times$ and $9.2\times$ reduction in attack success rate on gradient-based prompt injection attacks compared to state-of-the-art methods, without significantly degrading the model’s utility.

Article 1758

Title@2025-05-24 (6): Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services

Title: Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services

Pre-trained Encoder-Schlussfolgerung: Enthüllen Upstream-Encoder in Downstream Machine Learning Services

培训前编码器推断:在下游机器学习服务中向上游编码器 2408.02814v2

Authors: Shaopeng Fu, Xuexue Sun, Ke Qing, Tianhang Zheng, Di Wang

Pre-trained encoders available online have been widely adopted to build downstream machine learning (ML) services, but various attacks against these encoders also pose security and privacy threats to this downstream ML service paradigm. We unveil a new vulnerability: the Pre-trained Encoder Inference (PEI) attack, which can extract sensitive encoder information from a targeted downstream ML service that can then be used to promote other ML attacks against the targeted service. Given only API access to a targeted downstream service and a set of candidate encoders, the PEI attack can successfully infer which of the candidate encoders is secretly used by the targeted service. Compared with existing encoder attacks, which mainly target encoders on the upstream side, the PEI attack can compromise encoders even after they have been deployed and hidden in downstream ML services, which makes it a more realistic threat. We empirically verify the effectiveness of the PEI attack on vision encoders. We first conduct PEI attacks against two downstream services (i.e., image classification and multimodal generation), and then show how PEI attacks can facilitate other ML attacks (i.e., model stealing attacks vs. image classification models and adversarial attacks vs. multimodal generative models). Our results call for new security and privacy considerations when deploying encoders in downstream services. The code is available at https://github.com/fshp971/encoder-inference.

Article 1759

Title@2025-05-24 (6): PromptWise: Online Learning for Cost-Aware Prompt Assignment in Generative Models

Title: PromptWise: Online Learning for Cost-Aware Prompt Assignment in Generative Models

PromptWise: Online-Lernen für kostenbewusste Prompt-Zuweisung in generativen Modellen

快速Wise:在创用模型中进行成本-软件快速指派在线学习 2505.18901v1

Authors: Xiaoyan Hu, Lauren Pick, Ho-fung Leung, Farzan Farnia

The rapid advancement of generative AI models has provided users with numerous options to address their prompts. When selecting a generative AI model for a given prompt, users should consider not only the performance of the chosen model but also its associated service cost. The principle guiding such consideration is to select the least expensive model among the available satisfactory options. However, existing model-selection approaches typically prioritize performance, overlooking pricing differences between models. In this paper, we introduce PromptWise, an online learning framework designed to assign a sequence of prompts to a group of large language models (LLMs) in a cost-effective manner. PromptWise strategically queries cheaper models first, progressing to more expensive options only if the lower-cost models fail to adequately address a given prompt. Through numerical experiments, we demonstrate PromptWise’s effectiveness across various tasks, including puzzles of varying complexity and code generation/translation tasks. The results highlight that PromptWise consistently outperforms cost-unaware baseline methods, emphasizing that directly assigning prompts to the most expensive models can lead to higher costs and potentially lower average performance.
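
The escalation rule described above (query cheaper models first, move to a more expensive one only on failure) can be sketched as follows. This shows only the fixed cheapest-first cascade; PromptWise itself is an online learning framework, and the learning component is not reproduced here. The function `cheapest_first`, the `(name, cost, generate)` tuple layout, and the `is_satisfactory` callback are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Optional, Tuple

# (model name, cost per query, function mapping a prompt to an answer)
Model = Tuple[str, float, Callable[[str], str]]


def cheapest_first(
    prompt: str,
    models: List[Model],
    is_satisfactory: Callable[[str, str], bool],
) -> Tuple[Optional[str], Optional[str], float]:
    """Try models in increasing cost order; stop at the first satisfactory answer.

    Returns (model name, answer, total cost paid). Name and answer are None
    if no model produced a satisfactory answer.
    """
    total_cost = 0.0
    for name, cost, generate in sorted(models, key=lambda m: m[1]):
        answer = generate(prompt)
        total_cost += cost  # every attempted query is paid for
        if is_satisfactory(prompt, answer):
            return name, answer, total_cost
    return None, None, total_cost
```

Note that the total cost includes failed attempts, so escalation pays off only when the cheaper models succeed often enough on the prompts at hand.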

Article 1760

Title@2025-05-24 (6): Beyond Domain Randomization: Event-Inspired Perception for Visually Robust Adversarial Imitation from Videos

Title: Beyond Domain Randomization: Event-Inspired Perception for Visually Robust Adversarial Imitation from Videos

Beyond Domain Randomization: Event-inspirierte Wahrnehmung für visuell robuste Adversarial Imitation aus Videos

超出域随机化: 视频中视觉强力反逆模仿受事件启发的感知 2505.18899v1

Authors: Andrea Ramazzina, Vittorio Giammarino, Matteo El-Hariry, Mario Bijelic

Imitation from videos often fails when expert demonstrations and learner environments exhibit domain shifts, such as discrepancies in lighting, color, or texture. While visual randomization partially addresses this problem by augmenting training data, it remains computationally intensive and inherently reactive, struggling with unseen scenarios. We propose a different approach: instead of randomizing appearances, we eliminate their influence entirely by rethinking the sensory representation itself. Inspired by biological vision systems that prioritize temporal transients (e.g., retinal ganglion cells) and by recent sensor advancements, we introduce event-inspired perception for visually robust imitation. Our method converts standard RGB videos into a sparse, event-based representation that encodes temporal intensity gradients, discarding static appearance features. This biologically grounded approach disentangles motion dynamics from visual style, enabling robust visual imitation from observations even in the presence of visual mismatches between expert and agent environments. By training policies on event streams, we achieve invariance to appearance-based distractors without requiring computationally expensive and environment-specific data augmentation techniques. Experiments across the DeepMind Control Suite and the Adroit platform for dynamic dexterous manipulation show the efficacy of our method. Our code is publicly available at Eb-LAIfO.
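
A rough sketch of the conversion idea follows. The abstract says the representation encodes temporal intensity gradients and discards static appearance; the specific rule used here (thresholding the change in log intensity between consecutive frames, as in the standard idealized event-camera model) is an assumption, as are the function name `frames_to_events` and the parameters `threshold` and `eps`. Frames are assumed to be already converted to grayscale values in [0, 1].

```python
import math


def frames_to_events(frames, threshold=0.2, eps=1e-3):
    """Convert grayscale frames (lists of rows of floats) into event frames.

    Each output pixel is +1 (brighter), -1 (darker), or 0 (no event),
    based on the log-intensity change between consecutive frames.
    """
    events = []
    for prev, curr in zip(frames, frames[1:]):
        event_frame = []
        for prev_row, curr_row in zip(prev, curr):
            row = []
            for p, c in zip(prev_row, curr_row):
                # eps avoids log(0) for fully dark pixels.
                diff = math.log(c + eps) - math.log(p + eps)
                if diff > threshold:
                    row.append(1)
                elif diff < -threshold:
                    row.append(-1)
                else:
                    row.append(0)
            event_frame.append(row)
        events.append(event_frame)
    return events
```

A static scene produces no events regardless of its brightness or texture, which is the sense in which such a representation discards appearance and keeps motion.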

Article 1761

Title@2025-05-24 (6): Marginal Fairness: Fair Decision-Making under Risk Measures

Title: Marginal Fairness: Fair Decision-Making under Risk Measures

Marginal Fairness: Faire Entscheidungsfindung im Rahmen von Risikomaßnahmen

边际公平:风险措施下的公平决策 2505.18895v1

Authors: Fei Huang, Silvana M. Pesenti

This paper introduces marginal fairness, a new individual fairness notion for equitable decision-making in the presence of protected attributes such as gender, race, and religion. This criterion ensures that decisions based on generalized distortion risk measures are insensitive to distributional perturbations in protected attributes, regardless of whether these attributes are continuous, discrete, categorical, univariate, or multivariate. To operationalize this notion and reflect real-world regulatory environments (such as the EU gender-neutral pricing regulation), we model business decision-making in highly regulated industries (such as insurance and finance) as a two-step process: (i) a predictive modeling stage, in which a prediction function for the target variable (e.g., insurance losses) is estimated based on both protected and non-protected covariates; and (ii) a decision-making stage, in which a generalized distortion risk measure is applied to the target variable, conditional only on non-protected covariates, to determine the decision. In this second step, we modify the risk measure such that the decision becomes insensitive to the protected attribute, thus enforcing fairness to ensure equitable outcomes under risk-sensitive, regulatory constraints. Furthermore, by utilizing the concept of cascade sensitivity, we extend the marginal fairness framework to capture how dependencies between covariates propagate the influence of protected attributes through the modeling pipeline. A numerical study and an empirical implementation using an auto insurance dataset demonstrate how the framework can be applied in practice.

Article 1762

Title@2025-05-24 (6): Conformal Prediction for Uncertainty Estimation in Drug-Target Interaction Prediction

Title: Conformal Prediction for Uncertainty Estimation in Drug-Target Interaction Prediction

Konforme Vorhersage für Unsicherheitsschätzungen in der Drogen-Ziel-Interaktionsvorhersage

药物-目标相互作用预测中不确定性估计的非正式预测 2505.18890v1

Authors: Morteza Rakhshaninejad, Mira Jurgens, Nicolas Dewolf, Willem Waegeman

Accurate drug-target interaction (DTI) prediction with machine learning models is essential for drug discovery. Such models should also provide a credible representation of their uncertainty, but applying classical marginal conformal prediction (CP) in DTI prediction often overlooks variability across drug and protein subgroups. In this work, we analyze three cluster-conditioned CP methods for DTI prediction, and compare them with marginal and group-conditioned CP. Clusterings are obtained via nonconformity scores, feature similarity, and nearest neighbors, respectively. Experiments on the KIBA dataset using four data-splitting strategies show that nonconformity-based clustering yields the tightest intervals and most reliable subgroup coverage, especially in random and fully unseen drug-protein splits. Group-conditioned CP works well when one entity is familiar, but residual-driven clustering provides robust uncertainty estimates even in sparse or novel scenarios. These results highlight the potential of cluster-based CP for improving DTI prediction under uncertainty.
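
The difference between marginal and group-conditioned conformal prediction can be illustrated with a minimal split-conformal sketch. It assumes absolute residuals as the nonconformity score and uses synthetic residuals with two hypothetical groups; it does not reproduce the paper's clustering methods or the KIBA experiments.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    # Rank of the order statistic used by split conformal prediction.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return np.sort(scores)[k - 1]

def marginal_cp(cal_residuals, alpha):
    """One interval half-width shared by every test point."""
    return conformal_quantile(np.abs(cal_residuals), alpha)

def group_cp(cal_residuals, cal_groups, alpha):
    """One half-width per group (e.g. per cluster of drugs or proteins)."""
    abs_res = np.abs(cal_residuals)
    return {int(g): conformal_quantile(abs_res[cal_groups == g], alpha)
            for g in np.unique(cal_groups)}

# Synthetic calibration data (assumption): group 1 is three times noisier.
rng = np.random.default_rng(0)
groups = rng.integers(0, 2, size=2000)
residuals = rng.normal(0.0, np.where(groups == 0, 1.0, 3.0))

q_marginal = marginal_cp(residuals, alpha=0.1)
q_by_group = group_cp(residuals, groups, alpha=0.1)
```

A prediction interval is then `prediction ± q`. The shared marginal width sits between the two group-specific widths, so it is too wide for the quiet group and too narrow for the noisy one, which is the subgroup variability the abstract refers to.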

Article 1763

Title@2025-05-24 (6): Enabling Unstructured Sparse Acceleration on Structured Sparse Accelerators

Title: Enabling Unstructured Sparse Acceleration on Structured Sparse Accelerators

Ermöglichung unstrukturierter Spars-Beschleunigung bei strukturierten Spars-Beschleunigern

启用结构散开加速器, 启用无结构的分散加速器 2403.07953v3

Authors: Geonhwa Jeong, Po-An Tsai, Abhimanyu R. Bambhaniya, Stephen W. Keckler, Tushar Krishna

Exploiting sparsity in deep neural networks (DNNs) has been a promising area for meeting the growing computation requirements. To minimize the overhead of sparse acceleration, hardware designers have proposed structured sparsity support, but it provides limited flexibility and requires extra model fine-tuning. Moreover, any sparse model fine-tuned for certain structured sparse HW cannot be accelerated by other structured hardware. To enable acceleration using unstructured sparsity of DNNs on structured sparse hardware, we propose an approximation method leveraging the distributive property in linear algebra to turn any sparse tensor into a series of structured sparse tensors. We also develop a software framework, TASDER, to apply high-quality structured approximation on weights and activations of DNNs. Our method accelerates dense and sparse DNNs without fine-tuning and improves energy-delay-product (EDP) by up to 83% and 74%. It achieves up to 39% speed-up on a real system.
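
The idea of rewriting a sparse tensor as a sum of structured sparse tensors can be sketched as follows. This is a greedy, magnitude-based illustration under assumed 2:4 sparsity, not TASDER's actual algorithm; the function names are hypothetical. The distributive property then gives `W @ X ≈ sum(W_i @ X)`, with every `W_i` matching the structured pattern.

```python
import numpy as np

def nm_prune(w, n=2, m=4):
    """Keep the n largest-magnitude entries in every block of m along the last axis."""
    rows, cols = w.shape
    assert cols % m == 0
    blocks = w.reshape(rows, cols // m, m)
    # Indices of the (m - n) smallest-magnitude entries in each block.
    drop = np.argsort(np.abs(blocks), axis=-1)[..., : m - n]
    mask = np.ones_like(blocks, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=-1)
    return (blocks * mask).reshape(rows, cols)

def structured_series(w, terms, n=2, m=4):
    """Greedy decomposition w ~ w_1 + ... + w_k where every w_i is n:m sparse."""
    residual, parts = w.copy(), []
    for _ in range(terms):
        part = nm_prune(residual, n, m)
        parts.append(part)
        residual = residual - part
    return parts, residual

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
X = rng.standard_normal((16, 4))
parts, residual = structured_series(W, terms=2)

# Distributive property: (W_1 + W_2) X = W_1 X + W_2 X.
approx = sum(p @ X for p in parts)
```

With fewer terms than needed to cover every entry, `residual` is nonzero and the sum is only an approximation of `W @ X`.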

Article 1764

Title@2025-05-24 (6): Neural Encoding and Decoding at Scale

Title: Neural Encoding and Decoding at Scale

Neurale Enkodierung und Dekodierung auf Scale

缩放时神经编码和解码 2504.08201v4

Authors: Yizi Zhang, Yanchen Wang, Mehdi Azabou, Alexandre Andre, Zixuan Wang, Hanrui Lyu, The International Brain Laboratory, Eva Dyer, Liam Paninski, Cole Hurwitz

Recent work has demonstrated that large-scale, multi-animal models are powerful tools for characterizing the relationship between neural activity and behavior. Current large-scale approaches, however, focus exclusively on either predicting neural activity from behavior (encoding) or predicting behavior from neural activity (decoding), limiting their ability to capture the bidirectional relationship between neural activity and behavior. To bridge this gap, we introduce a multimodal, multi-task model that enables simultaneous Neural Encoding and Decoding at Scale (NEDS). Central to our approach is a novel multi-task-masking strategy, which alternates between neural, behavioral, within-modality, and cross-modality masking. We pretrain our method on the International Brain Laboratory (IBL) repeated site dataset, which includes recordings from 83 animals performing the same visual decision-making task. In comparison to other large-scale models, we demonstrate that NEDS achieves state-of-the-art performance for both encoding and decoding when pretrained on multi-animal data and then fine-tuned on new animals. Surprisingly, NEDS’s learned embeddings exhibit emergent properties: even without explicit training, they are highly predictive of the brain regions in each recording. Altogether, our approach is a step towards a foundation model of the brain that enables seamless translation between neural activity and behavior.

Article 1765

Title@2025-05-24 (6): Data Augmentation for Time-Series Classification: An Extensive Empirical Study and Comprehensive Survey

Title: Data Augmentation for Time-Series Classification: An Extensive Empirical Study and Comprehensive Survey

Datenvergrößerung für die Zeitreihenklassifikation: Eine umfangreiche empirische Studie und umfassende Umfrage

时间-系列分类数据扩充:广泛经验研究和全面调查 2310.10060v6

Authors: Zijun Gao, Haibao Liu, Lingbo Li

Data Augmentation (DA) has become a critical approach in Time Series Classification (TSC), primarily for its capacity to expand training datasets, enhance model robustness, introduce diversity, and reduce overfitting. However, the current landscape of DA in TSC is plagued with fragmented literature reviews, nebulous methodological taxonomies, inadequate evaluative measures, and a dearth of accessible and user-oriented tools. This study addresses these challenges through a comprehensive examination of DA methodologies within the TSC domain. Our research began with an extensive literature review spanning a decade, revealing significant gaps in existing surveys and necessitating a detailed analysis of over 100 scholarly articles to identify more than 60 distinct DA techniques. This rigorous review led to the development of a novel taxonomy tailored to the specific needs of DA in TSC, categorizing techniques into five primary categories: Transformation-Based, Pattern-Based, Generative, Decomposition-Based, and Automated Data Augmentation. This taxonomy is intended to guide researchers in selecting appropriate methods with greater clarity. In response to the lack of comprehensive evaluations of foundational DA techniques, we conducted a thorough empirical study, testing nearly 20 DA strategies across 15 diverse datasets representing all types within the UCR time-series repository. Using ResNet and LSTM architectures, we employed a multifaceted evaluation approach, including metrics such as Accuracy, Method Ranking, and Residual Analysis, resulting in a benchmark accuracy of 84.98 ± 16.41% with ResNet and 82.41 ± 18.71% with LSTM. Our investigation underscored the inconsistent efficacy of DA techniques: for instance, methods like RGWs and Random Permutation significantly improved model performance, whereas others, like EMD, were less effective.

Article 1766

Title@2025-05-24 (6): KerZOO: Kernel Function Informed Zeroth-Order Optimization for Accurate and Accelerated LLM Fine-Tuning

Title: KerZOO: Kernel Function Informed Zeroth-Order Optimization for Accurate and Accelerated LLM Fine-Tuning

KerZOO: Kernel-Funktion informierte Zeroth-Order-Optimierung für präzise und beschleunigte LLM-Feinsteuerung

KerZOO:为准确和加速 LLM 精密推荐而优化使用核心(KerZOO): 2505.18886v1

Authors: Zhendong Mi, Qitao Tan, Xiaodong Yu, Zining Zhu, Geng Yuan, Shaoyi Huang

Large language models (LLMs) have demonstrated impressive capabilities across numerous NLP tasks. Nevertheless, conventional first-order fine-tuning techniques impose heavy memory demands, creating practical obstacles to real-world applications. Zeroth-order (ZO) optimization has recently emerged as a promising memory-efficient alternative, as it circumvents the need for backpropagation by estimating gradients solely through forward passes, making it particularly suitable for resource-limited environments. Despite its efficiency, ZO optimization suffers from gradient estimation bias, which significantly hinders convergence speed. To address this, we analytically identify and characterize the lower-order bias introduced during ZO-based gradient estimation in LLM fine-tuning. Motivated by tools in mathematical physics, we introduce a kernel-function-based ZO framework aimed at mitigating this bias and improving optimization stability. KerZOO achieves comparable or superior performance to existing ZO baselines in both full-parameter and parameter-efficient fine-tuning settings of LLMs, while significantly reducing the number of iterations required to reach convergence. For example, KerZOO reduces total GPU training hours by as much as 74% and 44% on the WSC and MultiRC datasets when fine-tuning the OPT-2.7B model and can exceed the MeZO baseline by 2.9% and 2.6% in accuracy. We show that the kernel function is an effective avenue for reducing estimation bias in ZO methods.
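
For readers unfamiliar with zeroth-order optimization, the forward-pass-only gradient estimate can be sketched with a standard two-point random-direction estimator. This is the generic baseline idea, not KerZOO's kernel-function correction; the toy quadratic loss and all names are assumptions made for illustration.

```python
import numpy as np

def zo_gradient(loss_fn, theta, eps=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate from two forward evaluations."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(theta.shape)  # random probe direction
    # Directional finite difference along z; no backpropagation involved.
    slope = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    return slope * z

def loss(theta):
    # Toy quadratic whose true gradient is theta itself.
    return 0.5 * np.sum(theta ** 2)

rng = np.random.default_rng(0)
theta = np.array([1.0, -2.0, 3.0])
estimates = np.stack([zo_gradient(loss, theta, rng=rng) for _ in range(20000)])
mean_estimate = estimates.mean(axis=0)
```

On this quadratic the estimator is unbiased and only noisy, so averaging many probes recovers the gradient. For general losses the finite-difference step introduces perturbation-size-dependent bias terms, which is the kind of error the abstract sets out to reduce.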

Article 1767

Title@2025-05-24 (6): LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders

Title: LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders

LORE: Lagrangian-optimierte robuste Einbettungen für visuelle Encoder

Lagrangian- 优化的视觉编码器强力嵌入器 2505.18884v1

Authors: Borna Khodabandeh, Amirabbas Afzali, Amirhossein Afsharrad, Seyed Shahabeddin Mousavi, Sanjay Lall, Sajjad Amini, Seyed-Mohsen Moosavi-Dezfooli

Visual encoders have become fundamental components in modern computer vision pipelines. However, ensuring robustness against adversarial perturbations remains a critical challenge. Recent efforts have explored both supervised and unsupervised adversarial fine-tuning strategies. We identify two key limitations in these approaches: (i) they often suffer from instability, especially during the early stages of fine-tuning, resulting in suboptimal convergence and degraded performance on clean data, and (ii) they exhibit a suboptimal trade-off between robustness and clean data accuracy, hindering the simultaneous optimization of both objectives. To overcome these challenges, we propose Lagrangian-Optimized Robust Embeddings (LORE), a novel unsupervised adversarial fine-tuning framework. LORE utilizes constrained optimization, which offers a principled approach to balancing competing goals, such as improving robustness while preserving nominal performance. By enforcing embedding-space proximity constraints, LORE effectively maintains clean data performance throughout adversarial fine-tuning. Extensive experiments show that LORE significantly improves zero-shot adversarial robustness with minimal degradation in clean data accuracy. Furthermore, we demonstrate the effectiveness of the adversarially fine-tuned CLIP image encoder in out-of-distribution generalization and enhancing the interpretability of image embeddings.

Article 1768

Title@2025-05-24 (6): LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity

Title: LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity

LinGen: Auf dem Weg zur High-Resolution Minute-Length Text-to-Video-Generation mit linearer Computational Complexity

LinGen:迈向具有线性比较复杂度的高分辨率分钟-语言文本到视频的生成 2412.09856v2

Authors: Hongjie Wang, Chih-Yao Ma, Yen-Cheng Liu, Ji Hou, Tao Xu, Jialiang Wang, Felix Juefei-Xu, Yaqiao Luo, Peizhao Zhang, Tingbo Hou, Peter Vajda, Niraj K. Jha, Xiaoliang Dai

Text-to-video generation enhances content creation but is highly computationally intensive: The computational cost of Diffusion Transformers (DiTs) scales quadratically in the number of pixels. This makes minute-length video generation extremely expensive, limiting most existing models to generating videos of only 10-20 seconds in length. We propose a Linear-complexity text-to-video Generation (LinGen) framework whose cost scales linearly in the number of pixels. For the first time, LinGen enables high-resolution minute-length video generation on a single GPU without compromising quality. It replaces the computationally-dominant and quadratic-complexity block, self-attention, with a linear-complexity block called MATE, which consists of an MA-branch and a TE-branch. The MA-branch targets short-to-long-range correlations, combining a bidirectional Mamba2 block with our token rearrangement method, Rotary Major Scan, and our review tokens developed for long video generation. The TE-branch is a novel TEmporal Swin Attention block that focuses on temporal correlations between adjacent tokens and medium-range tokens. The MATE block addresses the adjacency preservation issue of Mamba and improves the consistency of generated videos significantly. Experimental results show that LinGen outperforms DiT (with a 75.6% win rate) in video quality with up to 15$\times$ (11.5$\times$) FLOPs (latency) reduction. Furthermore, both automatic metrics and human evaluation demonstrate our LinGen-4B yields comparable video quality to state-of-the-art models (with a 50.5%, 52.1%, 49.1% win rate with respect to Gen-3, LumaLabs, and Kling, respectively). This paves the way to hour-length movie generation and real-time interactive video generation. We provide 68s video generation results and more examples on our project website: https://lineargen.github.io/.

Article 1769

Title@2025-05-24 (6): Partition Generative Modeling: Masked Modeling Without Masks

Title: Partition Generative Modeling: Masked Modeling Without Masks

Partition Generative Modellierung: Maskenmodellierung ohne Masken

生成建模:没有遮罩的蒙面建模 2505.18883v1

Authors: Justin Deschenaux, Lan Tran, Caglar Gulcehre

We introduce "Partition Generative Models" (PGMs), a novel approach to masked generative modeling (MGMs), particularly effective for masked diffusion language modeling (MDLMs). PGM divides tokens into two distinct groups and employs sparse attention patterns to prevent cross-group information exchange. Hence, the model is trained to predict tokens in one group based solely on information from the other group. This partitioning strategy eliminates the need for MASK tokens entirely. While traditional MGMs inefficiently process MASK tokens during generation, PGMs achieve greater computational efficiency by operating exclusively on unmasked tokens. Our experiments on OpenWebText with a context length of 1024 tokens demonstrate that PGMs deliver at least 5x improvements in both latency and throughput compared to MDLM when using the same number of sampling steps, while generating samples with better generative perplexity than MDLM. Finally, we show that PGMs can be distilled with Self-Distillation Through Time (SDTT), a method originally devised for MDLM, in order to achieve further inference gains.
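
The "no cross-group information exchange" constraint can be sketched as a block-structured attention mask. This is a single toy self-attention layer with identity projections, written as an assumption-laden illustration; how PGMs read out predictions for the other group is not covered here.

```python
import numpy as np

def partition_mask(group_ids):
    """allowed[i, j] is True only when tokens i and j share a group."""
    g = np.asarray(group_ids)
    return g[:, None] == g[None, :]

def masked_self_attention(x, allowed):
    """Softmax self-attention (identity projections) restricted to allowed pairs."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    scores = np.where(allowed, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

rng = np.random.default_rng(0)
group_ids = np.array([0, 1, 0, 1, 1, 0])
x = rng.standard_normal((6, 8))

# Perturb only the tokens in group 1.
x_changed = x.copy()
x_changed[group_ids == 1] += 5.0

mask = partition_mask(group_ids)
out = masked_self_attention(x, mask)
out_changed = masked_self_attention(x_changed, mask)
```

The outputs at group-0 positions are identical in both runs, confirming that nothing from group 1 leaks into them through attention.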

Article 1770

Title@2025-05-24 (6): RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models

Title: RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models

RefLoRA: Refactored Low-Rank-Anpassung für effizientes Feintuning großer Modelle

RefLORA:为对大型模型进行高效微调而进行重构的低Rank适应 2505.18877v1

Authors: Yilang Zhang, Bingcong Li, Georgios B. Giannakis

Low-Rank Adaptation (LoRA) lowers the computational and memory overhead of fine-tuning large models by updating a low-dimensional subspace of the pre-trained weight matrix. Albeit efficient, LoRA exhibits suboptimal convergence and noticeable performance degradation, due to inconsistent and imbalanced weight updates induced by its nonunique low-rank factorizations. To overcome these limitations, this article identifies the optimal low-rank factorization per step that minimizes an upper bound on the loss. The resultant refactored low-rank adaptation (RefLoRA) method promotes a flatter loss landscape, along with consistent and balanced weight updates, thus speeding up stable convergence. Extensive experiments evaluate RefLoRA on natural language understanding and commonsense reasoning tasks with popular large language models including DeBERTaV3, LLaMA-7B, LLaMA2-7B and LLaMA3-8B. The numerical tests corroborate that RefLoRA converges faster, outperforms various benchmarks, and enjoys negligible computational overhead compared to state-of-the-art LoRA variants.
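
The nonuniqueness mentioned above can be made concrete: for any invertible matrix P, the factor pairs (B, A) and (BP, P⁻¹A) represent the same low-rank update BA, yet one gradient step on the factors moves the product differently. The sketch below uses a toy least-squares loss and an arbitrary P, both assumptions; it does not implement RefLoRA's choice of factorization.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 5, 2
B = rng.standard_normal((d_out, r))
A = rng.standard_normal((r, d_in))

# Any invertible r x r matrix yields an equivalent factorization.
P = np.array([[10.0, 0.0], [0.0, 0.1]])
B2, A2 = B @ P, np.linalg.inv(P) @ A
same_product = np.allclose(B @ A, B2 @ A2)

# Toy loss 0.5 * ||B A - W||_F^2 with a random target W.
W = rng.standard_normal((d_out, d_in))

def updated_product(B, A, lr=1e-2):
    """Product of the factors after one gradient step on each factor."""
    G = B @ A - W  # gradient of the loss with respect to the product B A
    return (B - lr * G @ A.T) @ (A - lr * B.T @ G)

# Same starting product, different product after one step.
update_gap = np.linalg.norm(updated_product(B, A) - updated_product(B2, A2))
```

Because the step depends on the scaling of the factors, a badly balanced factorization yields imbalanced updates, which motivates picking the factorization deliberately at each step.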

Article 1771

Title@2025-05-24 (6): Non-Stationary Lipschitz Bandits

Title: Non-Stationary Lipschitz Bandits

Nicht-stationäre Lipschitz Banditen

非固定的利普施奇茨猛匪 2505.18871v1

Authors: Nicolas Nguyen, Solenne Gaucher, Claire Vernade

We study the problem of non-stationary Lipschitz bandits, where the number of actions is infinite and the reward function, satisfying a Lipschitz assumption, can change arbitrarily over time. We design an algorithm that adaptively tracks the recently introduced notion of significant shifts, defined by large deviations of the cumulative reward function. To detect such reward changes, our algorithm leverages a hierarchical discretization of the action space. Without requiring any prior knowledge of the non-stationarity, our algorithm achieves a minimax-optimal dynamic regret bound of $\widetilde{\mathcal{O}}(\tilde{L}^{1/3}T^{2/3})$, where $\tilde{L}$ is the number of significant shifts and $T$ the horizon. This result provides the first optimal guarantee in this setting.

Article 1772

Title@2025-05-24 (6): Sci-LoRA: Mixture of Scientific LoRAs for Cross-Domain Lay Paraphrasing

Title: Sci-LoRA: Mixture of Scientific LoRAs for Cross-Domain Lay Paraphrasing

Sci-LoRA: Mischung aus wissenschaftlichen LoRAs für Cross-Domain Lay Paraphrasing

Sci-LORA:将科学LORA混合起来,用于跨域地谱图谱绘制 2505.18867v1

Authors: Ming Cheng, Jiaying Gong, Hoda Eldardiry

Lay paraphrasing aims to make scientific information accessible to audiences without technical backgrounds. However, most existing studies focus on a single domain, such as biomedicine. With the rise of interdisciplinary research, it is increasingly necessary to comprehend knowledge spanning multiple technical fields. To address this, we propose Sci-LoRA, a model that leverages a mixture of LoRAs fine-tuned on multiple scientific domains. In particular, Sci-LoRA dynamically generates and applies weights for each LoRA, enabling it to adjust the impact of different domains based on the input text, without requiring explicit domain labels. To balance domain-specific knowledge and generalization across various domains, Sci-LoRA integrates information at both the data and model levels. This dynamic fusion enhances the adaptability and performance across various domains. Experimental results across twelve domains on five public datasets show that Sci-LoRA significantly outperforms state-of-the-art large language models and demonstrates flexible generalization and adaptability in cross-domain lay paraphrasing.
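
One way to picture input-dependent weighting of several LoRA adapters is a softmax gate over the adapters' low-rank updates. The linear gate, the shapes, and the function names below are hypothetical choices for illustration; the abstract does not specify how Sci-LoRA computes its weights.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def mixture_of_loras(x, W0, adapters, gate):
    """y = W0 x + sum_k w_k(x) * B_k A_k x with input-dependent weights w_k."""
    weights = softmax(gate @ x)  # one weight per adapter (assumed linear gate)
    y = W0 @ x
    for w, (B, A) in zip(weights, adapters):
        y = y + w * (B @ (A @ x))
    return y, weights

rng = np.random.default_rng(0)
d, r, k = 4, 2, 3  # hidden size, LoRA rank, number of domain adapters
W0 = rng.standard_normal((d, d))
adapters = [(rng.standard_normal((d, r)), rng.standard_normal((r, d)))
            for _ in range(k)]
gate = rng.standard_normal((k, d))
x = rng.standard_normal(d)

y, weights = mixture_of_loras(x, W0, adapters, gate)
```

Since the weights depend only on the input `x`, no explicit domain label is needed at inference time, in line with the description above.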

Article 1773

Title@2025-05-24 (6): Distribution-Aware Mobility-Assisted Decentralized Federated Learning

Title: Distribution-Aware Mobility-Assisted Decentralized Federated Learning

Distribution-Aware Mobility-Assisted Dezentrales Federated Learning

分发通知 – – 流动协助 – – 分权力下放的联邦学习 2505.18866v1

Authors: Md Farhamdur Reza, Reza Jahani, Richeng Jin, Huaiyu Dai

Decentralized federated learning (DFL) has attracted significant attention due to its scalability and independence from a central server. In practice, some participating clients can be mobile, yet the impact of user mobility on DFL performance remains largely unexplored, despite its potential to facilitate communication and model convergence. In this work, we demonstrate that introducing a small fraction of mobile clients, even with random movement, can significantly improve the accuracy of DFL by facilitating information flow. To further enhance performance, we propose novel distribution-aware mobility patterns, where mobile clients strategically navigate the network, leveraging knowledge of data distributions and static client locations. The proposed moving strategies mitigate the impact of data heterogeneity and boost learning convergence. Extensive experiments validate the effectiveness of induced mobility in DFL and demonstrate the superiority of our proposed mobility patterns over random movement.

Article 1774

Title@2025-05-24 (6): Guided by Guardrails: Control Barrier Functions as Safety Instructors for Robotic Learning

Title: Guided by Guardrails: Control Barrier Functions as Safety Instructors for Robotic Learning

Geführt von Guardrails: Steuerungsbarrierenfunktionen als Sicherheitsinstruktoren für das Roboterlernen

由警卫队指导:作为机器人学习安全教官的控制障碍功能 2505.18858v1

Authors: Maeva Guerrier, Karthik Soma, Hassan Fouad, Giovanni Beltrame

Safety stands as the primary obstacle preventing the widespread adoption of learning-based robotic systems in our daily lives. While reinforcement learning (RL) shows promise as an effective robot learning paradigm, conventional RL frameworks often model safety by using single scalar negative rewards with immediate episode termination, failing to capture the temporal consequences of unsafe actions (e.g., sustained collision damage). In this work, we introduce a novel approach that simulates these temporal effects by applying continuous negative rewards without episode termination. Our experiments reveal that standard RL methods struggle with this model, as the accumulated negative values in unsafe zones create learning barriers. To address this challenge, we demonstrate how Control Barrier Functions (CBFs), with their proven safety guarantees, effectively help robots avoid catastrophic regions while enhancing learning outcomes. We present three CBF-based approaches, each integrating traditional RL methods with Control Barrier Functions, guiding the agent to learn safe behavior. Our empirical analysis, conducted in both simulated environments and real-world settings using a four-wheel differential drive robot, explores the possibilities of employing these approaches for safe robotic learning.
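
As background on how a Control Barrier Function constrains actions, consider a one-dimensional single integrator x' = u with safe set h(x) = x_max - x ≥ 0. The CBF condition dh/dt ≥ -α h(x) reduces to u ≤ α (x_max - x). The sketch below applies this as a simple action filter; it is a textbook-style toy, not one of the three CBF-based approaches from the paper, and all parameter values are assumptions.

```python
def cbf_filter(x, u_nominal, x_max=1.0, alpha=2.0):
    """Clip the nominal velocity so that h(x) = x_max - x stays non-negative."""
    # For x' = u, the condition dh/dt >= -alpha * h reads u <= alpha * (x_max - x).
    return min(u_nominal, alpha * (x_max - x))

def rollout(steps=200, dt=0.05, u_nominal=1.5):
    """Euler simulation of an agent that always wants to move right."""
    x, trajectory = 0.0, []
    for _ in range(steps):
        x += dt * cbf_filter(x, u_nominal)
        trajectory.append(x)
    return trajectory

trajectory = rollout()
```

The agent keeps its nominal action while far from the boundary and is slowed smoothly as it approaches `x_max`. With Euler integration the guarantee carries over as long as `dt * alpha <= 1`.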

Article 1775

Title@2025-05-24 (6): USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations

Title: USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations

USDC: Ein Datensatz von $\underline{U}$ser $\underline{S}$tance und $\underline{D}$ogmatism in langen $\underline{C}$onversations

USDC:长对话($\underline{C}$onversations)中 $\underline{U}$ser $\underline{S}$tance 和 $\underline{D}$ogmatism 的数据集 2406.16833v2

Authors: Mounika Marreddy, Subba Reddy Oota, Venkata Charan Chinni, Manish Gupta, Lucie Flek

Analyzing user opinion changes in long conversation threads is extremely critical for applications like enhanced personalization, market research, political campaigns, customer service, targeted advertising, and content moderation. Unfortunately, previous studies on stance and dogmatism in user conversations have focused on training models using datasets annotated at the post level, treating each post as independent and randomly sampling posts from conversation threads. Hence, first, we build a dataset for studying user opinion fluctuations in 764 long multi-user Reddit conversation threads, called USDC. USDC contains annotations for 2 tasks: i) User Stance classification, which involves labeling a user’s stance in a post within a conversation on a five-point scale; ii) User Dogmatism classification, which involves labeling a user’s overall opinion in the conversation on a four-point scale. Besides being time-consuming and costly, manual annotations for USDC are challenging because: 1) Conversation threads could be very long, increasing the chances of noisy annotations; and 2) Interpreting instances where a user changes their opinion within a conversation is difficult because often such transitions are subtle and not expressed explicitly. Hence, we leverage majority voting on zero-shot, one-shot, and few-shot annotations from Mistral Large and GPT-4 to automate the annotation process. Human annotations on 200 test conversations achieved inter-annotator agreement scores of 0.49 for stance and 0.50 for dogmatism with these LLM annotations, indicating a reasonable level of consistency between human and LLM annotations. USDC is then used to finetune and instruction-tune multiple deployable small language models like LLaMA, Falcon and Vicuna for the stance and dogmatism classification tasks. We make the code and dataset publicly available [https://github.com/mounikamarreddy/USDC].

Article 1776

Title@2025-05-24 (6): Toward Malicious Clients Detection in Federated Learning

Title: Toward Malicious Clients Detection in Federated Learning

Auf dem Weg zu bösartigen Kunden Erkennung im Föderierten Lernen

争取在联邦学习中发现恶意客户 2505.09110v2

Authors: Zhihao Dou, Jiaqi Wang, Wei Sun, Zhuqing Liu, Minghong Fang

Federated learning (FL) enables multiple clients to collaboratively train a global machine learning model without sharing their raw data. However, the decentralized nature of FL introduces vulnerabilities, particularly to poisoning attacks, where malicious clients manipulate their local models to disrupt the training process. While Byzantine-robust aggregation rules have been developed to mitigate such attacks, they remain inadequate against more advanced threats. In response, recent advancements have focused on FL detection techniques to identify potentially malicious participants. Unfortunately, these methods often misclassify numerous benign clients as threats or rely on unrealistic assumptions about the server’s capabilities. In this paper, we propose a novel algorithm, SafeFL, specifically designed to accurately identify malicious clients in FL. The SafeFL approach involves the server collecting a series of global models to generate a synthetic dataset, which is then used to distinguish between malicious and benign models based on their behavior. Extensive testing demonstrates that SafeFL outperforms existing methods, offering superior efficiency and accuracy in detecting malicious clients.

Article 1777

Title@2025-05-24 (6): Corruption-Aware Training of Latent Video Diffusion Models for Robust Text-to-Video Generation

Title: Corruption-Aware Training of Latent Video Diffusion Models for Robust Text-to-Video Generation

Korruption-Bewusst Training von latenten Video-Diffusions-Modellen für robuste Text-zu-Video-Generation

原始视频视频传播模型的反腐败知识培训 2505.21545v1

Authors: Chika Maduabuchi, Hao Chen, Yujin Han, Jindong Wang

Latent Video Diffusion Models (LVDMs) achieve high-quality generation but are sensitive to imperfect conditioning, which causes semantic drift and temporal incoherence on noisy, web-scale video-text datasets. We introduce CAT-LVDM, the first corruption-aware training framework for LVDMs that improves robustness through structured, data-aligned noise injection. Our method includes Batch-Centered Noise Injection (BCNI), which perturbs embeddings along intra-batch semantic directions to preserve temporal consistency. BCNI is especially effective on caption-rich datasets like WebVid-2M, MSR-VTT, and MSVD. We also propose Spectrum-Aware Contextual Noise (SACN), which injects noise along dominant spectral directions to improve low-frequency smoothness, showing strong results on UCF-101. On average, BCNI reduces FVD by 31.9% across WebVid-2M, MSR-VTT, and MSVD, while SACN yields a 12.3% improvement on UCF-101. Ablation studies confirm the benefit of low-rank, data-aligned noise. Our theoretical analysis further explains how such perturbations tighten entropy, Wasserstein, score-drift, mixing-time, and generalization bounds. CAT-LVDM establishes a principled, scalable training approach for robust video diffusion under multimodal noise. Code and models: https://github.com/chikap421/catlvdm

Article 1778

Title@2025-05-24 (6): On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization

Title: On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization

Auf die Wirkung des negativen Gradienten in der Gruppe Relative Tiefenverstärkung Optimierung

对群体相对深强化优化中的负梯度效应的影响 2505.18830v1

Authors: Wenlong Deng, Yi Ren, Muchen Li, Danica J. Sutherland, Xiaoxiao Li, Christos Thrampoulidis

Reinforcement learning (RL) has become popular in enhancing the reasoning capabilities of large language models (LLMs), with Group Relative Policy Optimization (GRPO) emerging as a widely used algorithm in recent systems. Despite GRPO’s widespread adoption, we identify a previously unrecognized phenomenon we term Lazy Likelihood Displacement (LLD), wherein the likelihood of correct responses marginally increases or even decreases during training. This behavior mirrors a recently discovered misalignment issue in Direct Preference Optimization (DPO), attributed to the influence of negative gradients. We provide a theoretical analysis of GRPO’s learning dynamics, identifying the source of LLD as the naive penalization of all tokens in incorrect responses with the same strength. To address this, we develop a method called NTHR, which downweights penalties on tokens contributing to the LLD. Unlike prior DPO-based approaches, NTHR takes advantage of GRPO’s group-based structure, using correct responses as anchors to identify influential tokens. Experiments on math reasoning benchmarks demonstrate that NTHR effectively mitigates LLD, yielding consistent performance gains across models ranging from 0.5B to 3B parameters.
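
The "same strength" penalization can be seen in how group-relative advantages are typically computed: rewards are standardized within the group of responses to one prompt, and each response's advantage is applied to all of its tokens. The sketch below shows only this step with made-up rewards and lengths; the NTHR reweighting is not implemented.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of responses to the same prompt."""
    rewards = np.asarray(rewards, dtype=float)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

def token_advantages(rewards, lengths):
    """Broadcast each response-level advantage to every token of that response."""
    adv = group_relative_advantages(rewards)
    return [np.full(n, a) for a, n in zip(adv, lengths)]

# Hypothetical group of four sampled responses: 1 = correct, 0 = incorrect.
rewards = [1.0, 0.0, 0.0, 1.0]
per_token = token_advantages(rewards, lengths=[3, 5, 2, 4])
```

Every token of an incorrect response receives the same negative advantage, including tokens that also appear in correct reasoning, which is the behaviour the abstract identifies as the source of LLD.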

Article 1779

Title@2025-05-24 (6): Multi-Agent Best Arm Identification in Stochastic Linear Bandits

Title: Multi-Agent Best Arm Identification in Stochastic Linear Bandits

Multi-Agent Best Arm Identification in stochastische Linear Banditen

斯托切斯定线强盗中多代理最佳武器识别 2411.13690v2

Authors: Sanjana Agrawal, Saúl A. Blanco

We study the problem of collaborative best-arm identification in stochastic linear bandits under a fixed-budget scenario. In our learning model, we first consider multiple agents connected through a star network, interacting with a linear bandit instance in parallel. We then extend our analysis to arbitrary network topologies. The objective of the agents is to collaboratively identify the best arm of the given bandit instance with the help of a central server while minimizing the probability of error in best arm estimation. To this end, we propose two algorithms, MaLinBAI-Star and MaLinBAI-Gen, for star networks and networks with arbitrary structure, respectively. Both algorithms utilize the technique of G-optimal design along with a successive-elimination-based strategy in which agents share their knowledge through a central server at each communication round. We demonstrate, both theoretically and empirically, that our algorithms achieve exponentially decaying probability of error in the allocated time budget. Furthermore, experimental results on both synthetic and real-world data validate the effectiveness of our algorithms over existing state-of-the-art multi-agent algorithms.

Article 1780

Title@2025-05-24 (6): Improved Regret and Contextual Linear Extension for Pandora’s Box and Prophet Inequality

Title: Improved Regret and Contextual Linear Extension for Pandora’s Box and Prophet Inequality

Verbesserte regret und kontextuelle lineare Erweiterung für Pandora’s Box und Prophet Inequality

改进潘多拉盒子和先知不平等的遗憾和背景扩展线性扩展 2505.18828v1

Authors: Junyan Liu, Ziyun Chen, Kun Wang, Haipeng Luo, Lillian J. Ratliff

We study the Pandora’s Box problem in an online learning setting with semi-bandit feedback. In each round, the learner sequentially pays to open up to $n$ boxes with unknown reward distributions, observes rewards upon opening, and decides when to stop. The utility of the learner is the maximum observed reward minus the cumulative cost of opened boxes, and the goal is to minimize regret defined as the gap between the cumulative expected utility and that of the optimal policy. We propose a new algorithm that achieves $\widetilde{O}(\sqrt{nT})$ regret after $T$ rounds, which improves the $\widetilde{O}(n\sqrt{T})$ bound of Agarwal et al. [2024] and matches the known lower bound up to logarithmic factors. To better capture real-life applications, we then extend our results to a natural but challenging contextual linear setting, where each box’s expected reward is linear in some known but time-varying $d$-dimensional context and the noise distribution is fixed over time. We design an algorithm that learns both the linear function and the noise distributions, achieving $\widetilde{O}(nd\sqrt{T})$ regret. Finally, we show that our techniques also apply to the online Prophet Inequality problem, where the learner must decide immediately whether or not to accept a revealed reward. In both non-contextual and contextual settings, our approach achieves similar improvements and regret bounds.
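
The per-round utility defined above (maximum observed reward minus the cumulative cost of opened boxes) can be written out for a simple stopping rule. The fixed opening order and the threshold policy are assumptions chosen for illustration; they are not the paper's algorithm.

```python
def pandora_utility(rewards, costs, stop_threshold):
    """Open boxes in order and stop once the best reward seen reaches the threshold."""
    best, paid = 0.0, 0.0  # rewards are assumed non-negative
    for reward, cost in zip(rewards, costs):
        paid += cost                 # pay to open the next box
        best = max(best, reward)     # observe its reward
        if best >= stop_threshold:
            break
    return best - paid

# Hypothetical round with three boxes.
u_stop_early = pandora_utility([0.2, 0.9, 0.5], [0.1, 0.1, 0.1], stop_threshold=0.8)
u_open_all = pandora_utility([0.2, 0.9, 0.5], [0.1, 0.1, 0.1], stop_threshold=1.0)
```

Stopping after the second box avoids paying for the third, so `u_stop_early` exceeds `u_open_all`. Regret then measures how far the learner's cumulative expected utility falls short of the optimal policy's.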

Article 1781

Title@2025-05-24 (6): A Real-World Energy Management Dataset from a Smart Company Building for Optimization and Machine Learning

Title: A Real-World Energy Management Dataset from a Smart Company Building for Optimization and Machine Learning

Ein Echtzeit-Energiemanagement-Datensatz aus einem Smart Company Building für Optimierung und maschinelles Lernen

最佳优化和机器学习智能公司大楼的 “ 现实世界能源管理数据集 “ 2503.11469v2

Authors: Jens Engel, Andrea Castellani, Patricia Wollstadt, Felix Lanfermann, Thomas Schmitt, Sebastian Schmitt, Lydia Fischer, Steffen Limmer, David Luttropp, Florian Jomrich, René Unger, Tobias Rodemann

We present a large real-world dataset obtained from monitoring a smart company facility over the course of six years, from 2018 to 2023. The dataset includes energy consumption data from various facility areas and components, energy production data from a photovoltaic system and a combined heat and power plant, operational data from heating and cooling systems, and weather data from an on-site weather station. The measurement sensors installed throughout the facility are organized in a hierarchical metering structure with multiple sub-metering levels, which is reflected in the dataset. The dataset contains measurement data from 72 energy meters, 9 heat meters and a weather station. Both raw and processed data at different processing levels, including labeled issues, are available. In this paper, we describe the data acquisition and post-processing employed to create the dataset. The dataset enables the application of a wide range of methods in the domain of energy management, including optimization, modeling, and machine learning to optimize building operations and reduce costs and carbon emissions.

Article 1782

Title@2025-05-24 (6): How to build a consistency model: Learning flow maps via self-distillation

Title: How to build a consistency model: Learning flow maps via self-distillation

Wie man ein Konsistenzmodell baut: Flusskarten über Selbstdestillation lernen

如何建立一致性模式:通过自我蒸馏学习流程图 2505.18825v1

Authors: Nicholas M. Boffi, Michael S. Albergo, Eric Vanden-Eijnden

Building on the framework proposed in Boffi et al. (2024), we present a systematic approach for learning flow maps associated with flow and diffusion models. Flow map-based models, commonly known as consistency models, encompass recent efforts to improve the efficiency of generative models based on solutions to differential equations. By exploiting a relationship between the velocity field underlying a continuous-time flow and the instantaneous rate of change of the flow map, we show how to convert existing distillation schemes into direct training algorithms via self-distillation, eliminating the need for pre-trained models. We empirically evaluate several instantiations of our framework, finding that high-dimensional tasks like image synthesis benefit from objective functions that avoid temporal and spatial derivatives of the flow map, while lower-dimensional tasks can benefit from objectives incorporating higher-order derivatives to capture sharp features.

Article 1783

Title@2025-05-24 (6): Robust multi-coil MRI reconstruction via self-supervised denoising

Title: Robust multi-coil MRI reconstruction via self-supervised denoising

Robuste Multi-Coil-MRT-Rekonstruktion durch selbstüberwachte Denoisierung

通过自我监督的自监管的去注水进行强有力的多石油MRI重建 2411.12919v4

Authors: Asad Aali, Marius Arvinte, Sidharth Kumar, Yamin I. Arefeen, Jonathan I. Tamir

We study the effect of incorporating self-supervised denoising as a pre-processing step for training deep learning (DL) based reconstruction methods on data corrupted by Gaussian noise. K-space data employed for training are typically multi-coil and inherently noisy. Although DL-based reconstruction methods trained on fully sampled data can enable high reconstruction quality, obtaining large, noise-free datasets is impractical. We leverage Generalized Stein’s Unbiased Risk Estimate (GSURE) for denoising. We evaluate two DL-based reconstruction methods: Diffusion Probabilistic Models (DPMs) and Model-Based Deep Learning (MoDL). We evaluate the impact of denoising on the performance of these DL-based methods in solving accelerated multi-coil magnetic resonance imaging (MRI) reconstruction. The experiments were carried out on T2-weighted brain and fat-suppressed proton-density knee scans. We observed that self-supervised denoising enhances the quality and efficiency of MRI reconstructions across various scenarios. Specifically, employing denoised images rather than noisy counterparts when training DL networks results in lower normalized root mean squared error (NRMSE), higher structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR) across different SNR levels, including 32dB, 22dB, and 12dB for T2-weighted brain data, and 24dB, 14dB, and 4dB for fat-suppressed knee data. Overall, we showed that denoising is an essential pre-processing technique capable of improving the efficacy of DL-based MRI reconstruction methods under diverse conditions. By refining the quality of input data, denoising enables training more effective DL networks, potentially bypassing the need for noise-free reference MRI scans.
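
Two of the reported metrics, NRMSE and PSNR, can be sketched as follows. Normalization conventions differ between papers; this sketch assumes NRMSE is normalized by the reference norm and PSNR uses the peak magnitude of the reference, which may not match the authors' exact definitions. Images are taken as flat lists of real values for simplicity.

```python
import math
from typing import Sequence


def nrmse(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Root of squared error energy divided by reference energy (assumed convention)."""
    err = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    ref = sum(r ** 2 for r in reference)
    return math.sqrt(err / ref)


def psnr(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Peak signal-to-noise ratio in dB, with the peak taken from the reference."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no noise to measure
    peak = max(abs(r) for r in reference)
    return 10.0 * math.log10(peak ** 2 / mse)
```

Lower NRMSE and higher PSNR both indicate a reconstruction closer to the reference.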

Article 1784

Title@2025-05-24 (6): Fully tensorial approach to hypercomplex neural networks

Title: Fully tensorial approach to hypercomplex neural networks

Voller Tensoransatz für hyperkomplexe neuronale Netzwerke

对超复合性神经神经网络采取完全强制的全方位方法 2407.00449v3

Authors: Agnieszka Niemczynowicz, Radosław Antoni Kycia

A fully tensorial theory of hypercomplex neural networks is given. It allows neural networks to use arithmetic based on arbitrary algebras. The key observation is that algebra multiplication can be represented as a rank-three tensor, which can then be used in every algebraic operation. This approach is attractive for neural network libraries that support effective tensorial operations. It agrees with previous implementations for four-dimensional algebras.
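
The idea of encoding algebra multiplication as a rank-three tensor can be illustrated with the quaternions, a standard four-dimensional algebra. The sketch below is an illustration written for this digest, not code from the paper: it builds the structure-constant tensor T, with e_i e_j = sum_k T[i][j][k] e_k, from the quaternion multiplication table and multiplies two elements by contracting with T.

```python
# Quaternion basis order: (1, i, j, k) -> indices (0, 1, 2, 3).
# TABLE[a][b] = (sign, index) such that e_a * e_b = sign * e_index.
TABLE = [
    [(1, 0), (1, 1), (1, 2), (1, 3)],    # 1*1=1,  1*i=i,   1*j=j,   1*k=k
    [(1, 1), (-1, 0), (1, 3), (-1, 2)],  # i*1=i,  i*i=-1,  i*j=k,   i*k=-j
    [(1, 2), (-1, 3), (-1, 0), (1, 1)],  # j*1=j,  j*i=-k,  j*j=-1,  j*k=i
    [(1, 3), (1, 2), (-1, 1), (-1, 0)],  # k*1=k,  k*i=j,   k*j=-i,  k*k=-1
]

DIM = 4

# Rank-three structure tensor: T[a][b][c] is the coefficient of e_c in e_a * e_b.
T = [[[0] * DIM for _ in range(DIM)] for _ in range(DIM)]
for a in range(DIM):
    for b in range(DIM):
        sign, c = TABLE[a][b]
        T[a][b][c] = sign


def multiply(x, y):
    """Algebra product via contraction: (x*y)_c = sum_{a,b} x_a * y_b * T[a][b][c]."""
    return [
        sum(x[a] * y[b] * T[a][b][c] for a in range(DIM) for b in range(DIM))
        for c in range(DIM)
    ]
```

Swapping in a different tensor T changes the algebra without touching `multiply`, which is the property that makes the tensor formulation convenient for libraries built around tensor contractions.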

Article 1785

Title@2025-05-24 (6): Stealing Training Graphs from Graph Neural Networks

Title: Stealing Training Graphs from Graph Neural Networks

Stealing Training Graphen aus Graph Neural Networks

图表神经网络中的偷窃培训图 2411.11197v2

Authors: Minhua Lin, Enyan Dai, Junjie Xu, Jinyuan Jia, Xiang Zhang, Suhang Wang

Graph Neural Networks (GNNs) have shown promising results in modeling graphs in various tasks. The training of GNNs, especially on specialized tasks such as bioinformatics, demands extensive expert annotations, which are expensive and usually contain sensitive information of data providers. The trained GNN models are often shared for deployment in the real world. As neural networks can memorize the training samples, the model parameters of GNNs have a high risk of leaking private training data. Our theoretical analysis shows the strong connections between trained GNN parameters and the training graphs used, confirming the training graph leakage issue. However, explorations into training data leakage from trained GNNs are rather limited. Therefore, we investigate a novel problem of stealing graphs from trained GNNs. To obtain high-quality graphs that resemble the target training set, a graph diffusion model with diffusion noise optimization is deployed as a graph generator. Furthermore, we propose a selection method that effectively leverages GNN model parameters to identify training graphs from samples generated by the graph diffusion model. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed framework in stealing training graphs from the trained GNN.

Article 1786

Title@2025-05-24 (6): GRoQ-LoCO: Generalist and Robot-agnostic Quadruped Locomotion Control using Offline Datasets

Title: GRoQ-LoCO: Generalist and Robot-agnostic Quadruped Locomotion Control using Offline Datasets

GRoQ-LoCO: Generalist und Roboter-agnostische Quadruped Locomotion Control mit Offline-Datensätzen

GROQ-LoCO:使用离线数据集的通用和机器人-不可知性四分流移动控制 2505.10973v3

Authors: Narayanan PP, Sarvesh Prasanth Venkatesan, Srinivas Kantha Reddy, Shishir Kolathaya

Recent advancements in large-scale offline training have demonstrated the potential of generalist policy learning for complex robotic tasks. However, applying these principles to legged locomotion remains a challenge due to continuous dynamics and the need for real-time adaptation across diverse terrains and robot morphologies. In this work, we propose GRoQ-LoCO, a scalable, attention-based framework that learns a single generalist locomotion policy across multiple quadruped robots and terrains, relying solely on offline datasets. Our approach leverages expert demonstrations from two distinct locomotion behaviors - stair traversal (non-periodic gaits) and flat terrain traversal (periodic gaits) - collected across multiple quadruped robots, to train a generalist model that enables behavior fusion. Crucially, our framework operates solely on proprioceptive data from all robots without incorporating any robot-specific encodings. The policy is directly deployable on an Intel i7 NUC, producing low-latency control outputs without any test-time optimization. Our extensive experiments demonstrate zero-shot transfer across highly diverse quadruped robots and terrains, including hardware deployment on the Unitree Go1, a commercially available 12kg robot. Notably, we evaluate challenging cross-robot training setups where different locomotion skills are unevenly distributed across robots, yet observe successful transfer of both flat walking and stair traversal behaviors to all robots at test time. We also show preliminary walking on Stoch 5, a 70kg quadruped, on flat and outdoor terrains without requiring any fine-tuning. These results demonstrate the potential of offline, data-driven learning to generalize locomotion across diverse quadruped morphologies and behaviors.

Article 1787

Title@2025-05-24 (6): Preference Leakage: A Contamination Problem in LLM-as-a-judge

Title: Preference Leakage: A Contamination Problem in LLM-as-a-judge

Bevorzugte Leckage: Ein Kontaminierungsproblem im LLM-as-a-Richter

优先渗漏:LLM-作为法官的LLM中的污染问题 2502.01534v2

Authors: Dawei Li, Renliang Sun, Yue Huang, Ming Zhong, Bohan Jiang, Jiawei Han, Xiangliang Zhang, Wei Wang, Huan Liu

Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods in model development. While their combination significantly enhances the efficiency of model training and evaluation, little attention has been given to the potential contamination brought by this new model development paradigm. In this work, we expose preference leakage, a contamination problem in LLM-as-a-judge caused by the relatedness between the synthetic data generators and LLM-based evaluators. To study this issue, we first define three common types of relatedness between the data generator LLM and the judge LLM: being the same model, having an inheritance relationship, and belonging to the same model family. Through extensive experiments, we empirically confirm the bias of judges towards their related student models caused by preference leakage across multiple LLM baselines and benchmarks. Further analysis suggests that preference leakage is a pervasive and real-world problem that is harder to detect compared to previously identified biases in LLM-as-a-judge scenarios. All of these findings imply that preference leakage is a widespread and challenging problem in the area of LLM-as-a-judge. We release all code and data at: https://github.com/David-Li0406/Preference-Leakage.

Article 1788

Title@2025-05-24 (6): Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis

Title: Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis

Erforschung der QUIC-Dynamik: Ein großformatiger Datensatz für verschlüsselte Verkehrsanalyse

探索 QUIC 动态动态:加密流量分析的大型数据集 2410.03728v6

Authors: Barak Gahtan, Robert J. Shahla, Alex M. Bronstein, Reuven Cohen

The increasing adoption of the QUIC transport protocol has transformed encrypted web traffic, necessitating new methodologies for network analysis. However, existing datasets lack the scope, metadata, and decryption capabilities required for robust benchmarking in encrypted traffic research. We introduce VisQUIC, a large-scale dataset of 100,000 labeled QUIC traces from over 44,000 websites, collected over four months. Unlike prior datasets, VisQUIC provides SSL keys for controlled decryption, supports multiple QUIC implementations (Chromium QUIC, Facebook’s mvfst, Cloudflare’s quiche), and introduces a novel image-based representation that enables machine learning-driven encrypted traffic analysis. The dataset includes standardized benchmarking tools, ensuring reproducibility. To demonstrate VisQUIC’s utility, we present a benchmarking task for estimating HTTP/3 responses in encrypted QUIC traffic, achieving 97% accuracy using only observable packet features. By publicly releasing VisQUIC, we provide an open foundation for advancing encrypted traffic analysis, QUIC security research, and network monitoring.

Article 1789

Title@2025-05-24 (6): DiSCo: Device-Server Collaborative LLM-Based Text Streaming Services

Title: DiSCo: Device-Server Collaborative LLM-Based Text Streaming Services

DiSCo: Geräte-Server Kollaborative LLM-basierte Text-Streaming-Dienste

DisCo: 设备-服务器协作协作LLM基于LLM的文本流服务 2502.11417v2

Authors: Ting Sun, Penghan Wang, Fan Lai

The rapid rise of large language models (LLMs) in text streaming services has introduced significant cost and Quality of Experience (QoE) challenges in serving millions of daily requests, especially in meeting Time-To-First-Token (TTFT) and Time-Between-Token (TBT) requirements for real-time interactions. Our real-world measurements show that both server-based and on-device deployments struggle to meet diverse QoE demands: server deployments face high costs and last-hop issues (e.g., Internet latency and dynamics), while on-device LLM inference is constrained by resources. We introduce DiSCo, a device-server cooperative scheduler designed to optimize users’ QoE by adaptively routing requests and migrating response generation between endpoints while maintaining cost constraints. DiSCo employs cost-aware scheduling, leveraging the predictable speed of on-device LLM inference with the flexible capacity of server-based inference to dispatch requests on the fly, and introduces a token-level migration mechanism to ensure consistent token delivery during migration. Evaluations on real-world workloads – including commercial services like OpenAI GPT and DeepSeek, and open-source deployments such as LLaMA3 – show that DiSCo can improve users’ QoE by reducing tail TTFT (11-52%) and mean TTFT (6-78%) across different model-device configurations, and that its migration mechanism reduces serving costs by up to 84% while maintaining comparable QoE levels.
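
The two latency metrics named in this abstract can be computed from token arrival timestamps. The sketch below is illustrative only: the function names are hypothetical, timestamps are assumed to be in seconds, and the nearest-rank percentile used for the tail is an assumption, since the abstract does not say which tail percentile is reported.

```python
import math
from statistics import fmean
from typing import List, Sequence


def time_to_first_token(request_time: float, token_times: Sequence[float]) -> float:
    """TTFT: delay between sending the request and receiving the first token."""
    if not token_times:
        raise ValueError("no tokens were received")
    return token_times[0] - request_time


def time_between_tokens(token_times: Sequence[float]) -> List[float]:
    """TBT: gaps between consecutive token arrivals."""
    return [later - earlier for earlier, later in zip(token_times, token_times[1:])]


def tail(values: Sequence[float], q: float = 0.99) -> float:
    """Nearest-rank q-quantile (assumed definition of the 'tail' statistic)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def mean_tbt(token_times: Sequence[float]) -> float:
    """Average gap between tokens for one response (needs at least two tokens)."""
    return fmean(time_between_tokens(token_times))
```

Collecting `time_to_first_token` over many requests and passing the results to `tail` and `fmean` gives the tail and mean TTFT figures that such evaluations compare.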

Article 1790

Title@2025-05-24 (6): Operator-Informed Score Matching for Markov Diffusion Models

Title: Operator-Informed Score Matching for Markov Diffusion Models

Operator-Informed Score Matching für Markov Diffusion Modelle

Markov 扩散模型的操作员不完善的评分匹配 2406.09084v2

Authors: Zheyang Shen, Huihui Wang, Marina Riabiz, Chris J. Oates

Diffusion models are typically trained using score matching, a learning objective agnostic to the underlying noising process that guides the model. This paper argues that Markov noising processes enjoy an advantage over alternatives, as the Markov operators that govern the noising process are well-understood. Specifically, by leveraging the spectral decomposition of the infinitesimal generator of the Markov noising process, we obtain parametric estimates of the score functions simultaneously for all marginal distributions, using only sample averages with respect to the data distribution. The resulting operator-informed score matching provides both a standalone approach to sample generation for low-dimensional distributions, as well as a recipe for better informed neural score estimators in high-dimensional settings.

Article 1791

Title@2025-05-24 (6): Expert-Agnostic Learning to Defer

Title: Expert-Agnostic Learning to Defer

Experten-Agnostisches Lernen zur Abwehr

专家 – – 无法无天学习 2502.10533v2

Authors: Joshua Strong, Pramit Saha, Yasin Ibrahim, Cheng Ouyang, Alison Noble

Learning to Defer (L2D) trains autonomous systems to handle straightforward cases while deferring uncertain ones to human experts. Recent advancements in this field have introduced methods that offer flexibility to unseen experts at test time. However, we find these approaches struggle to generalise to experts with behaviours not seen during training, require extensive human annotation, and lack mechanisms for incorporating prior knowledge of expert capabilities. To address these challenges, we introduce Expert-Agnostic Learning to Defer (EA-L2D), a novel L2D framework that employs a Bayesian approach to model expert behaviour in an expert-agnostic fashion. Across benchmark medical imaging datasets (HAM10000, Blood Cells, Retinal OCT, and Liver Tumours), EA-L2D significantly outperforms prior methods on unseen experts, achieving up to a 28% relative improvement, while also matching or exceeding state-of-the-art performance on seen experts.

Article 1792

Title@2025-05-24 (6): Partial Distribution Matching via Partial Wasserstein Adversarial Networks

Title: Partial Distribution Matching via Partial Wasserstein Adversarial Networks

Teilverteilung Passend über Teilwasserstein Adversarial Networks

通过部分瓦森斯坦对冲网络进行部分配配 2409.10499v2

Authors: Zi-Ming Wang, Nan Xue, Ling Lei, Rebecka Jörnsten, Gui-Song Xia

This paper studies the problem of distribution matching (DM), which is a fundamental machine learning problem seeking to robustly align two probability distributions. Our approach is established on a relaxed formulation, called partial distribution matching (PDM), which seeks to match a fraction of the distributions instead of matching them completely. We theoretically derive the Kantorovich-Rubinstein duality for the partial Wasserstein-1 (PW) discrepancy, and develop a partial Wasserstein adversarial network (PWAN) that efficiently approximates the PW discrepancy based on this dual form. Partial matching can then be achieved by optimizing the network using gradient descent. Two practical tasks, point set registration and partial domain adaptation, are investigated, where the goals are to partially match distributions in 3D space and high-dimensional feature space, respectively. The experimental results confirm that the proposed PWAN effectively produces highly robust matching results, performing better than or on par with the state-of-the-art methods.

Article 1793

Title@2025-05-24 (6): MAPLE: Enhancing Review Generation with Multi-Aspect Prompt LEarning in Explainable Recommendation

Title: MAPLE: Enhancing Review Generation with Multi-Aspect Prompt LEarning in Explainable Recommendation

MAPLE: Verbesserung der Review Generation mit Multi-Aspect Prompt Learning in erklärbarer Empfehlung

MMALE: 在可解释建议中以多角度迅速和迅速的分解方式加强审查的产生 2408.09865v2

Authors: Ching-Wen Yang, Zhi-Quan Feng, Ying-Jia Lin, Che-Wei Chen, Kun-da Wu, Hao Xu, Jui-Feng Yao, Hung-Yu Kao

The Explainable Recommendation task is designed to receive a user-item pair and output explanations that justify why the item is recommended to the user. Many models approach review generation as a proxy for explainable recommendations. While these models can produce fluent and grammatically correct sentences, they often lack precision and fail to provide personalized, informative recommendations. To address this issue, we propose a personalized, aspect-controlled model called Multi-Aspect Prompt LEarner (MAPLE), which integrates aspect category as another input dimension to facilitate memorizing fine-grained aspect terms. Experiments conducted on two real-world review datasets in the restaurant domain demonstrate that MAPLE significantly outperforms baseline review-generation models. MAPLE excels in both text and feature diversity, ensuring that the generated content covers a wide range of aspects. Additionally, MAPLE delivers good generation quality while maintaining strong coherence and factual relevance. The code and dataset used in this paper are available at https://github.com/Nana2929/MAPLE.git.

Article 1794

Title@2025-05-24 (6): Governing Equation Discovery from Data Based on Differential Invariants

Title: Governing Equation Discovery from Data Based on Differential Invariants

Regulierende Gleichungs-Entdeckung aus Daten basierend auf unterschiedlichen Invarianten

从基于差异内在变量的数据中分离出来的数据 2505.18798v1

Authors: Lexiang Hu, Yikang Li, Zhouchen Lin

The explicit governing equation is one of the simplest and most intuitive forms for characterizing physical laws. However, directly discovering partial differential equations (PDEs) from data poses significant challenges, primarily in determining relevant terms from a vast search space. Symmetry, as crucial prior knowledge in scientific fields, has been widely applied in tasks such as designing equivariant networks and guiding neural PDE solvers. In this paper, we propose a pipeline for governing equation discovery based on differential invariants, which can losslessly reduce the search space of existing equation discovery methods while strictly adhering to symmetry. Specifically, we compute the set of differential invariants corresponding to the infinitesimal generators of the symmetry group and select them as the relevant terms for equation discovery. Taking DI-SINDy (SINDy based on Differential Invariants) as an example, we demonstrate that its success rate and accuracy in PDE discovery surpass those of other symmetry-informed governing equation discovery methods across a series of PDEs.

Article 1795

Title@2025-05-24 (6): Guarding Graph Neural Networks for Unsupervised Graph Anomaly Detection

Title: Guarding Graph Neural Networks for Unsupervised Graph Anomaly Detection

Überwachung von Graphen-Neuralnetzwerken für unbeaufsichtigte Graphenanomalienerkennung

用于不受监督的异常图图探测的保护图形神经网络 2404.16366v2

Authors: Yuanchen Bei, Sheng Zhou, Jinke Shi, Yao Ma, Haishuai Wang, Jiajun Bu

Unsupervised graph anomaly detection aims at identifying rare patterns that deviate from the majority in a graph without the aid of labels, which is important for a variety of real-world applications. Recent advances have utilized Graph Neural Networks (GNNs) to learn effective node representations by aggregating information from neighborhoods. This is motivated by the hypothesis that nodes in the graph tend to exhibit consistent behaviors with their neighborhoods. However, such consistency can be disrupted by graph anomalies in multiple ways. Most existing methods directly employ GNNs to learn representations, disregarding the negative impact of graph anomalies on GNNs, resulting in sub-optimal node representations and anomaly detection performance. While a few recent approaches have redesigned GNNs for graph anomaly detection under semi-supervised label guidance, how to address the adverse effects of graph anomalies on GNNs in unsupervised scenarios and learn effective representations for anomaly detection are still under-explored. To bridge this gap, in this paper, we propose a simple yet effective framework for Guarding Graph Neural Networks for Unsupervised Graph Anomaly Detection (G3AD). Specifically, G3AD first introduces two auxiliary networks along with correlation constraints to guard the GNNs against inconsistent information encoding. Furthermore, G3AD introduces an adaptive caching module to guard the GNNs from directly reconstructing the observed graph data that contains anomalies. Extensive experiments demonstrate that our G3AD can outperform twenty state-of-the-art methods on both synthetic and real-world graph anomaly datasets, with flexible generalization ability in different GNN backbones.

Article 1796

Title@2025-05-24 (6): Leveraging Per-Instance Privacy for Machine Unlearning

Title: Leveraging Per-Instance Privacy for Machine Unlearning

Per-Instance-Leveraging-Privatsphäre für das maschinelle Lernen

利用个人隐私促进机器脱学 2505.18786v1

Authors: Nazanin Mohammadi Sepahvand, Anvith Thudi, Berivan Isik, Ashmita Bhattacharyya, Nicolas Papernot, Eleni Triantafillou, Daniel M. Roy, Gintare Karolina Dziugaite

We present a principled, per-instance approach to quantifying the difficulty of unlearning via fine-tuning. We begin by sharpening an analysis of noisy gradient descent for unlearning (Chien et al., 2024), obtaining a better utility-unlearning tradeoff by replacing worst-case privacy loss bounds with per-instance privacy losses (Thudi et al., 2024), each of which bounds the (Renyi) divergence to retraining without an individual data point. To demonstrate the practical applicability of our theory, we present empirical results showing that our theoretical predictions are borne out both for Stochastic Gradient Langevin Dynamics (SGLD) and for standard fine-tuning without explicit noise. We further demonstrate that per-instance privacy losses correlate well with several existing data difficulty metrics, while also identifying harder groups of data points, and introduce novel evaluation methods based on loss barriers. Altogether, our findings provide a foundation for more efficient and adaptive unlearning strategies tailored to the unique properties of individual data points.

Article 1797

Title@2025-05-24 (6): A physics-guided smoothing method for material modeling with digital image correlation (DIC) measurements

Title: A physics-guided smoothing method for material modeling with digital image correlation (DIC) measurements

Ein physikgeführtes Glättverfahren für die Materialmodellierung mit Messungen der digitalen Bildkorrelation (DIC)

采用物理制导平滑法进行数字图像相关测量材料建模 2505.18784v1

Authors: Jihong Wang, Chung-Hao Lee, William Richardson, Yue Yu

In this work, we present a novel approach to process the DIC measurements of multiple biaxial stretching protocols. In particular, we develop an optimization-based approach, which calculates the smoothed nodal displacements using a moving least-squares algorithm subject to positive strain constraints. As such, physically consistent displacement and strain fields are obtained. Then, we further apply a data-driven workflow for heterogeneous material modeling to these physically consistent DIC measurements, estimating a nonlocal constitutive law together with the material microstructure. To demonstrate the applicability of our approach, we apply it to learning a material model and fiber orientation field from DIC measurements of a porcine tricuspid valve anterior leaflet. Our results demonstrate that the proposed DIC data processing approach can significantly improve the accuracy of modeling biological materials.

Article 1798

Title@2025-05-24 (6): Soft Weighted Machine Unlearning

Title: Soft Weighted Machine Unlearning

Weichgewichtete Maschine nicht lernen

软加权机器脱学 2505.18783v1

Authors: Xinbao Qiao, Ningning Ding, Yushi Cheng, Meng Zhang

Machine unlearning, as a post-hoc processing technique, has gained widespread adoption in addressing challenges like bias mitigation and robustness enhancement, colloquially, machine unlearning for fairness and robustness. However, existing non-privacy unlearning-based solutions persist in using the binary data removal framework designed for privacy-driven motivations, leading to significant information loss, a phenomenon known as over-unlearning. While over-unlearning has been described in many studies as primarily causing utility degradation, we investigate its fundamental causes and provide deeper insights through counterfactual leave-one-out analysis. In this paper, we introduce a weighted influence function that assigns tailored weights to each sample by analytically solving a convex quadratic programming problem. Building on this, we propose a soft-weighted framework enabling fine-grained model adjustments to address the over-unlearning challenge. We demonstrate that the proposed soft-weighted scheme is versatile and can be seamlessly integrated into most existing unlearning algorithms. Extensive experiments show that in fairness- and robustness-driven tasks, the soft-weighted scheme significantly outperforms hard-weighted schemes on fairness/robustness metrics and alleviates the decline in utility, thereby making machine unlearning a more effective correction solution.

Article 1799

Title@2025-05-24 (6): One Policy but Many Worlds: A Scalable Unified Policy for Versatile Humanoid Locomotion

Title: One Policy but Many Worlds: A Scalable Unified Policy for Versatile Humanoid Locomotion

Eine Politik, aber viele Welten: Eine skalierbare, einheitliche Politik für vielseitige humanoide Lokomotion

一个政策,但许多世界:一个可扩展的统一政策,促进有生命力的人类活动 2505.18780v1

Authors: Yahao Fan, Tianxiang Gui, Kaiyang Ji, Shutong Ding, Chixuan Zhang, Jiayuan Gu, Jingyi Yu, Jingya Wang, Ye Shi

Humanoid locomotion faces a critical scalability challenge: traditional reinforcement learning (RL) methods require task-specific rewards and struggle to leverage growing datasets, even as more training terrains are introduced. We propose DreamPolicy, a unified framework that enables a single policy to master diverse terrains and generalize zero-shot to unseen scenarios by systematically integrating offline data and diffusion-driven motion synthesis. At its core, DreamPolicy introduces Humanoid Motion Imagery (HMI) - future state predictions synthesized through an autoregressive terrain-aware diffusion planner curated by aggregating rollouts from specialized policies across various distinct terrains. Unlike human motion datasets requiring laborious retargeting, our data directly captures humanoid kinematics, enabling the diffusion planner to synthesize “dreamed” trajectories that encode terrain-specific physical constraints. These trajectories act as dynamic objectives for our HMI-conditioned policy, bypassing manual reward engineering and enabling cross-terrain generalization. DreamPolicy addresses the scalability limitations of prior methods: while traditional RL fails to exploit growing datasets, our framework scales seamlessly with more offline data. As the dataset expands, the diffusion prior learns richer locomotion skills, which the policy leverages to master new terrains without retraining. Experiments demonstrate that DreamPolicy achieves an average success rate of 90% in training environments and, on average, 20% higher success on unseen terrains than the prevalent method. It also generalizes to perturbed and composite scenarios where prior approaches collapse. By unifying offline data, diffusion-based trajectory synthesis, and policy optimization, DreamPolicy overcomes the “one task, one policy” bottleneck, establishing a paradigm for scalable, data-driven humanoid control.

Article 1800

Title@2025-05-24 (6): HD-PiSSA: High-Rank Distributed Orthogonal Adaptation

Title: HD-PiSSA: High-Rank Distributed Orthogonal Adaptation

HD-PiSSA: High-Rank verteilte Orthogonalanpassung

HD-PiSSA: 高射分散的正心调整适应 2505.18777v1

Authors: Yiding Wang, Fauxu Meng, Xuefeng Zhang, Fan Jiang, Pingzhi Tang, Muhan Zhang

Existing parameter-efficient fine-tuning (PEFT) methods for large language models (LLMs), such as LoRA and PiSSA, constrain model updates to low-rank subspaces, limiting their expressiveness and leading to suboptimal performance on complex tasks. To address this, we introduce High-rank Distributed PiSSA (HD-PiSSA), a distributed PEFT approach that initializes orthogonal adapters across different devices and aggregates their delta updates collectively on W for fine-tuning. Unlike Data Parallel LoRA or PiSSA, which maintain identical adapters across all devices, HD-PiSSA assigns different principal components of the pre-trained weights to each GPU, significantly expanding the range of update directions. This results in over 16x higher effective updated ranks than data-parallel LoRA or PiSSA when fine-tuning on 8 GPUs with the same per-device adapter rank. Empirically, we evaluate HD-PiSSA across various challenging downstream tasks, including mathematics, code generation, and multi-task learning. In the multi-task setting, HD-PiSSA achieves average gains of 10.0 absolute points (14.63%) over LoRA and 4.98 points (6.60%) over PiSSA across 12 benchmarks, demonstrating its benefits from the extra optimization flexibility.

Article 1801

Title@2025-05-24 (6): Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models

Title: Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models

Starke Mitgliedschafts-Inferenzangriffe auf massive Datensätze und (Moderate) große Sprachmodelle

对大规模数据集和(口头)大语言模型的强烈成员推论攻击 2505.18773v1

Authors: Jamie Hayes, Ilia Shumailov, Christopher A. Choquette-Choo, Matthew Jagielski, George Kaissis, Katherine Lee, Milad Nasr, Sahra Ghalebikesabi, Niloofar Mireshghallah, Meenatchi Sundaram Mutu Selva Annamalai, Igor Shilov, Matthieu Meeus, Yves-Alexandre de Montjoye, Franziska Boenisch, Adam Dziedzic, A. Feder Cooper

State-of-the-art membership inference attacks (MIAs) typically require training many reference models, making it difficult to scale these attacks to large pre-trained language models (LLMs). As a result, prior research has either relied on weaker attacks that avoid training reference models (e.g., fine-tuning attacks), or on stronger attacks applied to small-scale models and datasets. However, weaker attacks have been shown to be brittle - achieving close-to-arbitrary success - and insights from strong attacks in simplified settings do not translate to today’s LLMs. These challenges have prompted an important question: are the limitations observed in prior work due to attack design choices, or are MIAs fundamentally ineffective on LLMs? We address this question by scaling LiRA - one of the strongest MIAs - to GPT-2 architectures ranging from 10M to 1B parameters, training reference models on over 20B tokens from the C4 dataset. Our results advance the understanding of MIAs on LLMs in three key ways: (1) strong MIAs can succeed on pre-trained LLMs; (2) their effectiveness, however, remains limited (e.g., AUC<0.7) in practical settings; and, (3) the relationship between MIA success and related privacy metrics is not as straightforward as prior work has suggested.
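
For readers unfamiliar with LiRA, the per-example test can be sketched as a Gaussian likelihood ratio. This is a minimal sketch under stated assumptions, not the pipeline used in the paper: the statistic is assumed to be a scalar score of the target model on the example (for instance a logit-scaled confidence), `in_stats` and `out_stats` are hypothetical scores from reference models trained with and without the example, and all numbers are made up for illustration.

```python
import math
import statistics


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution N(mean, std^2) at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def lira_score(target_stat, in_stats, out_stats, min_std=1e-6):
    """Log-likelihood ratio of 'member' versus 'non-member' for one example.

    Positive values favor membership. Each argument list holds the same
    statistic computed on reference models (an assumption of this sketch).
    """
    mu_in = statistics.fmean(in_stats)
    mu_out = statistics.fmean(out_stats)
    sd_in = max(statistics.stdev(in_stats), min_std)
    sd_out = max(statistics.stdev(out_stats), min_std)
    return gaussian_logpdf(target_stat, mu_in, sd_in) - gaussian_logpdf(
        target_stat, mu_out, sd_out
    )


# Hypothetical reference-model statistics for a single example.
in_stats = [2.0, 2.2, 1.8, 2.1]
out_stats = [0.4, 0.6, 0.5, 0.3]

score_member_like = lira_score(1.9, in_stats, out_stats)
score_nonmember_like = lira_score(0.5, in_stats, out_stats)
print(score_member_like > 0, score_nonmember_like < 0)
```

The cost the abstract refers to comes from the reference models: each entry in `in_stats` and `out_stats` requires a separately trained model.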

Article 1802

Title@2025-05-24 (6): CageNet: A Meta-Framework for Learning on Wild Meshes

Title: CageNet: A Meta-Framework for Learning on Wild Meshes

CageNet: Ein Meta-Rahmen für das Lernen auf Wild Meshes

CageNet:野生动物类学习的元框架 2505.18772v1

Authors: Michal Edelstein, Hsueh-Ti Derek Liu, Mirela Ben-Chen

Learning on triangle meshes has recently proven to be instrumental to a myriad of tasks, from shape classification, to segmentation, to deformation and animation, to mention just a few. While some of these applications are tackled through neural network architectures which are tailored to the application at hand, many others use generic frameworks for triangle meshes where the only customization required is the modification of the input features and the loss function. Our goal in this paper is to broaden the applicability of these generic frameworks to “wild” meshes, i.e. meshes in-the-wild which often have multiple components, non-manifold elements, disrupted connectivity, or a combination of these. We propose a configurable meta-framework based on the concept of caged geometry: given a mesh, a cage is a single-component manifold triangle mesh that envelops it closely. Generalized barycentric coordinates map between functions on the cage and functions on the mesh, allowing us to learn and test on a variety of data, in different applications. We demonstrate this concept by learning segmentation and skinning weights on difficult data, achieving better performance than state-of-the-art techniques on wild meshes.
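
The mapping between cage functions and mesh functions can be sketched in the simplest possible case. The example below is an illustration only: it uses a single 2D triangle as the "cage" and ordinary barycentric coordinates, whereas the paper works with enclosing triangle-mesh cages and generalized coordinates. All point values are hypothetical.

```python
def barycentric_weights(point, triangle):
    """Barycentric coordinates of a 2D point with respect to a triangle."""
    (ax, ay), (bx, by), (cx, cy) = triangle
    px, py = point
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return [w_a, w_b, 1.0 - w_a - w_b]


def cage_to_mesh(weights, cage_values):
    """Apply the weight matrix: one interpolated value per mesh point."""
    return [sum(w * f for w, f in zip(row, cage_values)) for row in weights]


# Hypothetical cage (one triangle) and enclosed mesh points.
cage = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
mesh_points = [(1.0, 1.0), (2.0, 1.0), (0.5, 3.0)]

# Row i holds the weights of mesh point i with respect to the cage vertices.
weights = [barycentric_weights(p, cage) for p in mesh_points]

# A scalar function defined on cage vertices, carried over to the mesh points.
cage_feature = [0.0, 1.0, 2.0]
mesh_feature = cage_to_mesh(weights, cage_feature)

# The same weights reproduce the mesh positions from the cage positions.
recovered_x = cage_to_mesh(weights, [v[0] for v in cage])
print(mesh_feature, recovered_x)
```

Because the weights depend only on geometry, a network can operate on the well-behaved cage while predictions are transferred to a mesh with arbitrary connectivity.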

Article 1803

Title@2025-05-24 (6): Dual-Path Stable Soft Prompt Generation for Domain Generalization

Title: Dual-Path Stable Soft Prompt Generation for Domain Generalization

Dual-Path stabile Soft Prompt Generation für Domain-Verallgemeinerung

两平面稳定软软生成域通用化快速生成 2505.18770v1

Authors: Yuedi Zhang, Shuanghao Bai, Wanqi Zhou, Zhirong Luan, Badong Chen

Domain generalization (DG) aims to learn a model using data from one or multiple related but distinct source domains that can generalize well to unseen out-of-distribution target domains. Inspired by the success of large pre-trained vision-language models (VLMs), prompt tuning has emerged as an effective generalization strategy. However, it often struggles to capture domain-specific features due to its reliance on manually designed or fixed prompt inputs. Recently, some prompt generation methods have addressed this limitation by dynamically generating instance-specific and domain-specific prompts for each input, enriching domain information and demonstrating potential for enhanced generalization. Through further investigation, we identify a notable issue in existing prompt generation methods: the same input often yields significantly different and suboptimal prompts across different random seeds, a phenomenon we term Prompt Variability. To address this, we introduce negative learning into the prompt generation process and propose Dual-Path Stable Soft Prompt Generation (DPSPG), a transformer-based framework designed to improve both the stability and generalization of prompts. Specifically, DPSPG incorporates a complementary prompt generator to produce negative prompts, thereby reducing the risk of introducing misleading information. Both theoretical and empirical analyses demonstrate that negative learning leads to more robust and effective prompts by increasing the effective margin and reducing the upper bound of the gradient norm. Extensive experiments on five DG benchmark datasets show that DPSPG consistently outperforms state-of-the-art methods while maintaining prompt stability.

Article 1804

Title@2025-05-24 (6): Multiple Wasserstein Gradient Descent Algorithm for Multi-Objective Distributional Optimization

Title: Multiple Wasserstein Gradient Descent Algorithm for Multi-Objective Distributional Optimization

Vielfacher Wasserstein Gradient Descent Algorithmus für Multi-Objective Distributional Optimization

多目标分布优化多瓦森斯坦梯度底源值 2505.18765v1

Authors: Dai Hai Nguyen, Hiroshi Mamitsuka, Atsuyoshi Nakamura

We address the optimization problem of simultaneously minimizing multiple objective functionals over a family of probability distributions. This type of Multi-Objective Distributional Optimization commonly arises in machine learning and statistics, with applications in areas such as multiple target sampling, multi-task learning, and multi-objective generative modeling. To solve this problem, we propose an iterative particle-based algorithm, which we call Multiple Wasserstein Gradient Descent (MWGraD). It constructs a flow of intermediate empirical distributions, each represented by a set of particles, which gradually minimize the multiple objective functionals simultaneously. Specifically, MWGraD consists of two key steps at each iteration. First, it estimates the Wasserstein gradient for each objective functional based on the current particles. Then, it aggregates these gradients into a single Wasserstein gradient using dynamically adjusted weights and updates the particles accordingly. In addition, we provide theoretical analysis and present experimental results on both synthetic and real-world datasets, demonstrating the effectiveness of MWGraD.
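
The two steps per iteration can be sketched on a toy problem. The code below is a sketch under strong assumptions and is not the paper's algorithm: each objective is taken to be a potential energy, the expectation of (x - c_k)^2 / 2 over the particles, whose Wasserstein gradient at a particle x is simply x - c_k; the "dynamically adjusted weights" are replaced by the closed-form minimum-norm weight for two gradients, a choice borrowed from multi-objective gradient descent. The names `centers`, `min_norm_weight`, and `mwgrad_step` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def min_norm_weight(g1, g2):
    """Weight w in [0, 1] minimizing the norm of w*g1 + (1-w)*g2.

    Inner products are averages over particles (empirical L2 norm).
    """
    diff_sq = mean([(a - b) ** 2 for a, b in zip(g1, g2)])
    if diff_sq == 0.0:
        return 0.5
    w = mean([(b - a) * b for a, b in zip(g1, g2)]) / diff_sq
    return min(1.0, max(0.0, w))


def mwgrad_step(particles, centers, lr):
    """One iteration: per-objective gradients, aggregation, particle update."""
    # Step 1: Wasserstein gradient of each potential-energy objective.
    g1 = [x - centers[0] for x in particles]
    g2 = [x - centers[1] for x in particles]
    # Step 2: aggregate with a data-dependent weight and move the particles.
    w = min_norm_weight(g1, g2)
    return [x - lr * (w * a + (1.0 - w) * b) for x, a, b in zip(particles, g1, g2)]


centers = (-1.0, 1.0)           # minimizers of the two toy objectives
particles = [2.0, 3.0, 4.0]     # initial empirical distribution
for _ in range(200):
    particles = mwgrad_step(particles, centers, lr=0.1)
print(particles)
```

In this one-dimensional toy problem every point between the two centers is Pareto optimal, so the particles are expected to end up inside that interval rather than at either single-objective minimizer.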

Article 1805

Title@2025-05-24 (6): Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model

Title: Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model

Textgeführte Multi-Property-Molekularoptimierung mit einem Diffusions-Sprachenmodell

带有传播语言模型的文本引导多财产分子优化 2410.13597v2

Authors: Yida Xiong, Kun Li, Jiameng Chen, Hongzhi Zhang, Di Lin, Yan Che, Wenbin Hu

Molecular optimization (MO) is a crucial stage in drug discovery in which task-oriented generated molecules are optimized to meet practical industrial requirements. Existing mainstream MO approaches primarily utilize external property predictors to guide iterative property optimization. However, learning all molecular samples in the vast chemical space is unrealistic for predictors. As a result, errors and noise are inevitably introduced during property prediction due to the nature of approximation. This leads to discrepancy accumulation, generalization reduction and suboptimal molecular candidates. In this paper, we propose a text-guided multi-property molecular optimization method utilizing a transformer-based diffusion language model (TransDLM). TransDLM leverages standardized chemical nomenclature as semantic representations of molecules and implicitly embeds property requirements into textual descriptions, thereby mitigating error propagation during the diffusion process. By fusing physically and chemically detailed textual semantics with specialized molecular representations, TransDLM effectively integrates diverse information sources to guide precise optimization, which enhances the model’s ability to balance structural retention and property enhancement. Additionally, the success of a case study further demonstrates TransDLM’s ability to solve practical problems. Experimentally, our approach surpasses state-of-the-art methods in maintaining molecular structural similarity and enhancing chemical properties on the benchmark dataset. The code is available at: https://github.com/Cello2195/TransDLM.

Article 1806

Title@2025-05-24 (6): How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark

Title: How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark

Wie wird LLM-Reasoning vom irrelevanten Kontext abgelenkt? Eine Analyse mit einem kontrollierten Benchmark

LLM 为何被不相关背景所忽略? 2505.18761v1

Authors: Minglai Yang, Ethan Huang, Liang Zhang, Mihai Surdeanu, William Wang, Liangming Pan

We introduce Grade School Math with Distracting Context (GSM-DC), a synthetic benchmark to evaluate Large Language Models’ (LLMs) reasoning robustness against systematically controlled irrelevant context (IC). GSM-DC constructs symbolic reasoning graphs with precise distractor injections, enabling rigorous, reproducible evaluation. Our experiments demonstrate that LLMs are significantly sensitive to IC, affecting both reasoning path selection and arithmetic accuracy. Additionally, training models with strong distractors improves performance in both in-distribution and out-of-distribution scenarios. We further propose a stepwise tree search guided by a process reward model, which notably enhances robustness in out-of-distribution conditions.

Article 1807

Title@2025-05-24 (6): The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation

Title: The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation

Die Suche nach einer effizienten Begründung: Ein datenzentrischer Benchmark zur CoT-Destillation

有效合理理由的查询:COT蒸馏的数据中心基准 2505.18759v1

Authors: Ruichen Zhang, Rana Muhammad Shahroz Khan, Zhen Tan, Dawei Li, Song Wang, Tianlong Chen

Data-centric distillation, including data augmentation, selection, and mixing, offers a promising path to creating smaller, more efficient student Large Language Models (LLMs) that retain strong reasoning abilities. However, a comprehensive benchmark for systematically assessing the effect of each distillation approach is still lacking. This paper introduces DC-CoT, the first data-centric benchmark that investigates data manipulation in chain-of-thought (CoT) distillation from method, model and data perspectives. Utilizing various teacher models (e.g., o4-mini, Gemini-Pro, Claude-3.5) and student architectures (e.g., 3B, 7B parameters), we rigorously evaluate the impact of these data manipulations on student model performance across multiple reasoning datasets, with a focus on in-distribution (IID) and out-of-distribution (OOD) generalization, and cross-domain transfer. Our findings aim to provide actionable insights and establish best practices for optimizing CoT distillation through data-centric techniques, ultimately facilitating the development of more accessible and capable reasoning models. The dataset can be found at https://huggingface.co/datasets/rana-shahroz/DC-COT, while our code is shared in https://anonymous.4open.science/r/DC-COT-FF4C/.

Article 1808

Title@2025-05-24 (6): Lean and Mean Adaptive Optimization via Subset-Norm and Subspace-Momentum with Convergence Guarantees

Title: Lean and Mean Adaptive Optimization via Subset-Norm and Subspace-Momentum with Convergence Guarantees

Lean and Mean Adaptive Optimization via Subset-Norm und Subspace-Momentum mit Konvergenzgarantien

通过具有聚合担保的子元和子空间动力及子空间动力进行皮和平均适应性优化 2411.07120v2

Authors: Thien Hang Nguyen, Huy Le Nguyen

We introduce two complementary techniques for efficient optimization that reduce memory requirements while accelerating training of large-scale neural networks. The first technique, Subset-Norm step size, generalizes AdaGrad-Norm and AdaGrad(-Coordinate) through step-size sharing. Subset-Norm (SN) reduces AdaGrad’s memory footprint from $O(d)$ to $O(\sqrt{d})$, where $d$ is the model size. For non-convex smooth objectives under coordinate-wise sub-gaussian noise, we show a noise-adapted high-probability convergence guarantee with improved dimensional dependence of SN over existing methods. Our second technique, Subspace-Momentum, reduces the momentum state’s memory footprint by restricting momentum to a low-dimensional subspace while performing SGD in the orthogonal complement. We prove a high-probability convergence result for Subspace-Momentum under standard assumptions. Empirical evaluation on pre-training and fine-tuning LLMs demonstrates the effectiveness of our methods. For instance, combining Subset-Norm with Subspace-Momentum achieves Adam’s validation perplexity for LLaMA 1B in approximately half the training tokens (6.8B vs 13.1B) while reducing Adam’s optimizer-states memory footprint by more than 80% with minimal additional hyperparameter tuning.
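
Step-size sharing can be sketched as follows. This is a minimal sketch, not the authors' code: it assumes the coordinates are split into contiguous subsets of equal size and that each subset keeps one accumulator of squared gradient norms, so a subset size near the square root of d gives roughly the square root of d accumulators. The partition scheme, the function name `subset_norm_step`, and the quadratic test objective are assumptions made for illustration.

```python
import math


def subset_norm_step(params, grads, accum, subset_size, lr=0.1, eps=1e-8):
    """One AdaGrad-style update with a shared step size per coordinate subset.

    accum[s] stores the running sum of squared gradient norms for subset s.
    subset_size = len(params) recovers a single shared step size, and
    subset_size = 1 recovers a per-coordinate step size.
    """
    for s in range(len(accum)):
        lo = s * subset_size
        hi = min(lo + subset_size, len(params))
        accum[s] += sum(g * g for g in grads[lo:hi])
        scale = lr / (math.sqrt(accum[s]) + eps)
        for i in range(lo, hi):
            params[i] -= scale * grads[i]


def loss(params):
    """Toy objective: 0.5 * ||x||^2, whose gradient is x itself."""
    return 0.5 * sum(x * x for x in params)


d = 16
subset_size = math.isqrt(d)               # 4 coordinates per subset
num_subsets = math.ceil(d / subset_size)  # 4 accumulators instead of 16
params = [1.0] * d
accum = [0.0] * num_subsets

initial_loss = loss(params)
for _ in range(50):
    grads = list(params)                  # gradient of the toy objective
    subset_norm_step(params, grads, accum, subset_size)
final_loss = loss(params)
print(num_subsets, initial_loss, final_loss)
```

The optimizer state here is `accum`, which has 4 entries for 16 parameters; the memory saving grows with d.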

Article 1809

Title@2025-05-24 (6): Reducing Storage of Pretrained Neural Networks by Rate-Constrained Quantization and Entropy Coding

Title: Reducing Storage of Pretrained Neural Networks by Rate-Constrained Quantization and Entropy Coding

Reduzierung der Speicherung vortrainierter neuraler Netzwerke durch ratenkontrainierte Quantisierung und Entropiecodierung

通过受费率限制的量化和元件编码减少储存预培训神经网络 2505.18758v1

Authors: Alexander Conzelmann, Robert Bamler

The ever-growing size of neural networks poses serious challenges on resource-constrained devices, such as embedded sensors. Compression algorithms that reduce their size can mitigate these problems, provided that model performance stays close to the original. We propose a novel post-training compression framework that combines rate-aware quantization with entropy coding by (1) extending the well-known layer-wise loss by a quadratic rate estimation, and (2) providing locally exact solutions to this modified objective following the Optimal Brain Surgeon (OBS) method. Our method allows for very fast decoding and is compatible with arbitrary quantization grids. We verify our results empirically by testing on various computer-vision networks, achieving a 20-40% decrease in bit rate at the same performance as the popular compression algorithm NNCodec. Our code is available at https://github.com/Conzel/cerwu.

Article 1810

Title@2025-05-24 (6): Smart Energy Guardian: A Hybrid Deep Learning Model for Detecting Fraudulent PV Generation

Title: Smart Energy Guardian: A Hybrid Deep Learning Model for Detecting Fraudulent PV Generation

Smart Energy Guardian: Ein hybrides Deep-Learning-Modell zur Erkennung betrügerischer PV-Generation

智能能源守护者:发现欺诈性光电池发电的混合深学习模式 2505.18755v1

Authors: Xiaolu Chen, Chenghao Huang, Yanru Zhang, Hao Wang

With the proliferation of smart grids, smart cities face growing challenges due to cyber-attacks and sophisticated electricity theft behaviors, particularly in residential photovoltaic (PV) generation systems. Traditional Electricity Theft Detection (ETD) methods often struggle to capture complex temporal dependencies and integrate multi-source data, limiting their effectiveness. In this work, we propose an efficient ETD method that accurately identifies fraudulent behaviors in residential PV generation, thus ensuring the supply-demand balance in smart cities. Our hybrid deep learning model, combining multi-scale Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Transformer, excels in capturing both short-term and long-term temporal dependencies. Additionally, we introduce a data embedding technique that seamlessly integrates time-series data with discrete temperature variables, enhancing detection robustness. Extensive simulation experiments using real-world data validate the effectiveness of our approach, demonstrating significant improvements in the accuracy of detecting sophisticated energy theft activities, thereby contributing to the stability and fairness of energy systems in smart cities.

Article 1811

Title@2025-05-24 (6): HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting

Title: HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting

HiMoE: Heterogenitäts-informierte Mixture-of-Experts für faire räumlich-zeitliche Vorhersagen

HiMoE:公平空间-时空预报专家的异异质性异构混合 2412.00316v3

Authors: Shaohan Yu, Pan Deng, Yu Zhao, Junting Liu, Zi’ang Wang

Achieving both accurate and consistent predictive performance across spatial nodes is crucial for ensuring the validity and reliability of outcomes in fair spatial-temporal forecasting tasks. However, existing training methods treat heterogeneous nodes with a fully averaged perspective, resulting in inherently biased prediction targets. Balancing accuracy and consistency is particularly challenging due to the multi-objective nature of spatial-temporal forecasting. To address this issue, we propose a novel Heterogeneity-Informed Mixture-of-Experts (HiMoE) framework that delivers both uniform and precise spatial-temporal predictions. From a model architecture perspective, we design the Heterogeneity-Informed Graph Convolutional Network (HiGCN) to address trend heterogeneity, and we introduce the Node-wise Mixture-of-Experts (NMoE) module to handle cardinality heterogeneity across nodes. From an evaluation perspective, we propose STFairBench, a benchmark that handles fairness in spatial-temporal prediction from both training and evaluation stages. Extensive experiments on four real-world datasets demonstrate that HiMoE achieves state-of-the-art performance, outperforming the best baseline by at least 9.22% across all evaluation metrics.

Article 1812

Title@2025-05-24 (6): Season-Independent PV Disaggregation Using Multi-Scale Net Load Temporal Feature Extraction and Weather Factor Fusion

Title: Season-Independent PV Disaggregation Using Multi-Scale Net Load Temporal Feature Extraction and Weather Factor Fusion

Saisonunabhängige PV-Disaggregation mittels Multi-Scale Net Load Temporal Feature Extraktion und Wetterfaktor Fusion

使用多种规模净负荷时间特征抽取和天气因素融合的季节独立光电池拆分 2505.18747v1

Authors: Xiaolu Chen, Chenghao Huang, Yanru Zhang, Hao Wang

With the advancement of energy Internet and energy system integration, the increasing adoption of distributed photovoltaic (PV) systems presents new challenges for smart monitoring and measurement by utility companies, particularly in separating PV generation from net electricity load. Existing methods struggle with extracting features from net load and capturing the relevance between weather factors. This paper proposes a PV disaggregation method that integrates Hierarchical Interpolation (HI) and multi-head self-attention mechanisms. By using HI to extract net load features and multi-head self-attention to capture the complex dependencies between weather factors, the method achieves precise PV generation predictions. Simulation experiments demonstrate the effectiveness of the proposed method on real-world data, supporting improved monitoring and management of distributed energy systems.

Article 1813

Title@2025-05-24 (6): C3R: Channel Conditioned Cell Representations for unified evaluation in microscopy imaging

Title: C3R: Channel Conditioned Cell Representations for unified evaluation in microscopy imaging

C3R: Kanalkonditionierte Zelldarstellungen zur einheitlichen Auswertung in der Mikroskopie-Bildgebung

C3R:用于对显微镜成像进行统一评价的有条件细胞代表的频道 2505.18745v1

Authors: Umar Marikkar, Syed Sameed Husain, Muhammad Awais, Sara Atito

Immunohistochemical (IHC) images reveal detailed information about structures and functions at the subcellular level. However, unlike natural images, IHC datasets pose challenges for deep learning models due to their inconsistencies in channel count and configuration, stemming from varying staining protocols across laboratories and studies. Existing approaches build channel-adaptive models, which unfortunately fail to support out-of-distribution (OOD) evaluation across IHC datasets and cannot be applied in a true zero-shot setting with mismatched channel counts. To address this, we introduce a structured view of cellular image channels by grouping them into either context or concept, where we treat the context channels as a reference to the concept channels in the image. We leverage this context-concept principle to develop Channel Conditioned Cell Representations (C3R), a framework designed for unified evaluation on in-distribution (ID) and OOD datasets. C3R is a two-fold framework comprising a channel-adaptive encoder architecture and a masked knowledge distillation training strategy, both built around the context-concept principle. We find that C3R outperforms existing benchmarks on both ID and OOD tasks, while a trivial implementation of our core idea also outperforms the channel-adaptive methods reported on the CHAMMI benchmark. Our method opens a new pathway for cross-dataset generalization between IHC datasets, without requiring dataset-specific adaptation or retraining.

Article 1814

Title@2025-05-24 (6): Interpretable Company Similarity with Sparse Autoencoders

Title: Interpretable Company Similarity with Sparse Autoencoders

Interpretierbare Firmenähnlichkeit mit Sparse Autoencodern

与Sparse Autoencoders 相似 2412.02605v3

Authors: Marco Molinari, Victor Shao, Luca Imeneo, Mateusz Mikolajczak, Vladimir Tregubiak, Abhimanyu Pandey, Sebastian Kuznetsov Ryder Torres Pereira

Determining company similarity is a vital task in finance, underpinning risk management, hedging, and portfolio diversification. Practitioners often rely on sector and industry classifications such as SIC and GICS codes to gauge similarity, the former being used by the U.S. Securities and Exchange Commission (SEC), and the latter widely used by the investment community. Since these classifications lack granularity and need regular updating, using clusters of embeddings of company descriptions has been proposed as a potential alternative, but the lack of interpretability in token embeddings poses a significant barrier to adoption in high-stakes contexts. Sparse Autoencoders (SAEs) have shown promise in enhancing the interpretability of Large Language Models (LLMs) by decomposing LLM activations into interpretable features. Moreover, SAEs capture an LLM’s internal representation of a company description, as opposed to semantic similarity alone, as is the case with embeddings. We apply SAEs to company descriptions, and obtain meaningful clusters of equities. We benchmark SAE features against SIC codes, industry codes, and embeddings. Our results demonstrate that SAE features surpass sector classifications and embeddings in capturing fundamental company characteristics. This is evidenced by their superior performance in correlating logged monthly returns - a proxy for similarity - and generating higher Sharpe ratios in co-integration trading strategies, which underscores deeper fundamental similarities among companies. Finally, we verify the interpretability of our clusters, and demonstrate that sparse features form simple and interpretable explanations for our clusters.

Article 1815

Title@2025-05-24 (6): Feature Extraction and Steering for Enhanced Chain-of-Thought Reasoning in Language Models

Title: Feature Extraction and Steering for Enhanced Chain-of-Thought Reasoning in Language Models

Feature-Extraktion und -Lenkung für eine verbesserte Kettenbildung in Sprachmodellen

语言模型中强化研究链理由的特征采掘和指南 2505.15634v2

Authors: Zihao Li, Xu Wang, Yuzhe Yang, Ziyu Yao, Haoyi Xiong, Mengnan Du

Large Language Models (LLMs) demonstrate the ability to solve reasoning and mathematical problems using the Chain-of-Thought (CoT) technique. Expanding CoT length, as seen in models such as DeepSeek-R1, significantly enhances this reasoning for complex problems, but requires costly and high-quality long CoT data and fine-tuning. This work, inspired by the deep thinking paradigm of DeepSeek-R1, utilizes a steering technique to enhance the reasoning ability of an LLM without external datasets. Our method first employs Sparse Autoencoders (SAEs) to extract interpretable features from vanilla CoT. These features are then used to steer the LLM’s internal states during generation. Recognizing that many LLMs do not have corresponding pre-trained SAEs, we further introduce a novel SAE-free steering algorithm, which directly computes steering directions from the residual activations of an LLM, obviating the need for an explicit SAE. Experimental results demonstrate that both our SAE-based and subsequent SAE-free steering algorithms significantly enhance the reasoning capabilities of LLMs.

Article 1816

Title@2025-05-24 (6): An Interpretable Deep-Learning Framework for Predicting Hospital Readmissions From Electronic Health Records

Title: An Interpretable Deep-Learning Framework for Predicting Hospital Readmissions From Electronic Health Records

Ein interpretierbarer Deep-Learning-Rahmen für die Vorhersage von Krankenhausrückübernahmen aus elektronischen Gesundheitsakten

预测医院从电子健康记录中读取的医院可解释的深学习框架 2310.10187v2

Authors: Fabio Azzalini, Tommaso Dolci, Marco Vagaggini

With the increasing availability of patient data, modern medicine is shifting towards prospective healthcare. Electronic health records offer a variety of information useful for clinical patient characterization and the development of predictive models, given that similar medical histories often lead to analogous health progressions. One application is the prediction of unplanned hospital readmissions, an essential task for reducing healthcare costs and improving patient outcomes. While predictive models demonstrate strong performances especially with deep learning approaches, they are often criticized for their lack of interpretability, a critical requirement in the medical domain where incorrect predictions may have severe consequences for patient safety. In this paper, we propose a novel and interpretable deep learning framework for predicting unplanned hospital readmissions, supported by NLP findings on word embeddings and by ConvLSTM neural networks for better handling temporal data. We validate the framework on two predictive tasks for hospital readmission within 30 and 180 days, using real-world data. Additionally, we introduce and evaluate a model-dependent technique designed to enhance result interpretability for medical professionals. Our solution outperforms traditional machine learning models in prediction accuracy while simultaneously providing more interpretable results.

Article 1817

Title@2025-05-24 (6): AuroRA: Breaking Low-Rank Bottleneck of LoRA with Nonlinear Mapping

Title: AuroRA: Breaking Low-Rank Bottleneck of LoRA with Nonlinear Mapping

AuroRA: Breaking Low-Rank Engpass von LoRA mit nichtlinearer Kartierung

AuroRA:用非线性绘图法打破LORA的低兰克瓶尾裂 2505.18738v1

Authors: Haonan Dong, Wenhao Zhu, Guojie Song, Liang Wang

Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning (PEFT) method validated across NLP and CV domains. However, LoRA faces an inherent low-rank bottleneck: narrowing its performance gap with full finetuning requires increasing the rank of its parameter matrix, resulting in significant parameter overhead. Recent linear LoRA variants have attempted to enhance expressiveness by introducing additional linear mappings; however, their composition remains inherently linear and fails to fundamentally improve LoRA’s representational capacity. To address this limitation, we propose AuroRA, which incorporates an Adaptive Nonlinear Layer (ANL) between two linear projectors to capture fixed and learnable nonlinearities. This combination forms an MLP-like structure with a compressed rank, enabling flexible and precise approximation of diverse target functions while theoretically guaranteeing lower approximation errors and bounded gradients. Extensive experiments on 22 datasets and 6 pretrained models demonstrate that AuroRA: (I) not only matches or surpasses full fine-tuning performance with only 6.18% ~ 25% of LoRA’s parameters but also (II) outperforms state-of-the-art PEFT methods by up to 10.88% in both NLP and CV tasks, and (III) exhibits robust performance across various rank configurations.
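
The point about linear compositions can be checked numerically. The sketch below is illustrative only and does not implement the paper's Adaptive Nonlinear Layer: it places a fixed `tanh` between a down-projection and an up-projection (the learnable nonlinearity mentioned in the abstract is omitted), and the matrices `down` and `up` are hypothetical. It shows that a purely linear adapter is additive, so it collapses to a single low-rank matrix, while the adapter with a nonlinearity is not.

```python
import math


def matvec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def linear_adapter(x, down, up):
    """LoRA-style update: up @ (down @ x), a single rank-limited linear map."""
    return matvec(up, matvec(down, x))


def nonlinear_adapter(x, down, up):
    """Same projectors with a fixed tanh in between (MLP-like structure)."""
    hidden = [math.tanh(h) for h in matvec(down, x)]
    return matvec(up, hidden)


# Hypothetical rank-2 projectors for a 3-dimensional layer.
down = [[1.0, 0.5, -0.5], [0.0, 1.0, 1.0]]
up = [[1.0, 0.0], [0.5, -1.0], [0.0, 2.0]]

x = [1.0, 2.0, 0.5]
y = [-0.5, 1.0, 1.5]
x_plus_y = [a + b for a, b in zip(x, y)]


def additivity_gap(adapter):
    """Largest deviation between f(x + y) and f(x) + f(y)."""
    lhs = adapter(x_plus_y, down, up)
    rhs = [a + b for a, b in zip(adapter(x, down, up), adapter(y, down, up))]
    return max(abs(a - b) for a, b in zip(lhs, rhs))


linear_gap = additivity_gap(linear_adapter)
nonlinear_gap = additivity_gap(nonlinear_adapter)
print(linear_gap, nonlinear_gap)
```

Stacking further linear maps between `down` and `up` would leave `linear_gap` at zero, which is the sense in which such variants remain inherently linear.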

Article 1818

Title@2025-05-24 (6): Graph Neural Networks for Knowledge Enhanced Visual Representation of Paintings

Title: Graph Neural Networks for Knowledge Enhanced Visual Representation of Paintings

Graph Neural Networks for Knowledge Enhanced Visual Representation of Paintings

知识强化画画视觉表现神经网络 2105.08190v2

Authors: Athanasios Efthymiou, Stevan Rudinac, Monika Kackovic, Marcel Worring, Nachoem Wijnberg

We propose ArtSAGENet, a novel multimodal architecture that integrates Graph Neural Networks (GNNs) and Convolutional Neural Networks (CNNs), to jointly learn visual and semantic-based artistic representations. First, we illustrate the significant advantages of multi-task learning for fine art analysis and argue that it is conceptually a much more appropriate setting in the fine art domain than the single-task alternatives. We further demonstrate that several GNN architectures can outperform strong CNN baselines in a range of fine art analysis tasks, such as style classification, artist attribution, creation period estimation, and tag prediction, while training them requires an order of magnitude less computational time and only a small amount of labeled data. Finally, through extensive experimentation we show that our proposed ArtSAGENet captures and encodes valuable relational dependencies between the artists and the artworks, surpassing the performance of traditional methods that rely solely on the analysis of visual content. Our findings underline a great potential of integrating visual content and semantics for fine art analysis and curation.

Article 1819

Title@2025-05-24 (6): MADCAT: Combating Malware Detection Under Concept Drift with Test-Time Adaptation

Title: MADCAT: Combating Malware Detection Under Concept Drift with Test-Time Adaptation

MADCAT: Bekämpfung der Malware-Erkennung unter Konzept Drift mit Test-Zeit-Anpassung

MADCAT: 在 “ 漂流 “ 概念下,通过测试-时间适应来打击 “ 恶意探测 “ 2505.18734v1

Authors: Eunjin Roh, Yigitcan Kaya, Christopher Kruegel, Giovanni Vigna, Sanghyun Hong

We present MADCAT, a self-supervised approach designed to address the concept drift problem in malware detection. MADCAT employs an encoder-decoder architecture and works by test-time training of the encoder on a small, balanced subset of the test-time data using a self-supervised objective. During test-time training, the model learns features that are useful for detecting both previously seen (old) data and newly arriving samples. We demonstrate the effectiveness of MADCAT in continuous Android malware detection settings. MADCAT consistently outperforms baseline methods in detection performance at test time. We also show the synergy between MADCAT and prior approaches in addressing concept drift in malware detection.

Article 1820

Title@2025-05-24 (6): ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search

ReGUIDE: Dateneffizientes GUI Grounding über räumliche Vernunft und Suche

数据高效界面:通过空间理性和搜索进行数据高效界面定位 2505.15259v2

Authors: Hyunseok Lee, Jeonghoon Kim, Beomjun Kim, Jihoon Tack, Chansong Jo, Jaehong Lee, Cheonbok Park, Sookyo In, Jinwoo Shin, Kang Min Yoo

Recent advances in Multimodal Large Language Models (MLLMs) have enabled autonomous agents to interact with computers via Graphical User Interfaces (GUIs), where accurately localizing the coordinates of interface elements (e.g., buttons) is often required for fine-grained actions. However, this remains significantly challenging, leading prior works to rely on large-scale web datasets to improve grounding accuracy. In this work, we propose Reasoning Graphical User Interface Grounding for Data Efficiency (ReGUIDE), a novel and effective framework for web grounding that enables MLLMs to learn data-efficiently through self-generated reasoning and spatial-aware criticism. More specifically, ReGUIDE learns to (i) self-generate a language reasoning process for the localization via online reinforcement learning, and (ii) criticize the prediction using spatial priors that enforce equivariance under input transformations. At inference time, ReGUIDE further boosts performance through a test-time scaling strategy, which combines spatial search with coordinate aggregation. Our experiments demonstrate that ReGUIDE significantly advances web grounding performance across multiple benchmarks, outperforming baselines with substantially fewer training data points (e.g., only 0.2% of the samples used by the best open-source baselines).

Article 1821

Title@2025-05-24 (6): Reward-Driven Interaction: Enhancing Proactive Dialogue Agents through User Satisfaction Prediction

Reward-Driven Interaction: Verbesserung proaktiver Dialog-Agenten durch Nutzerzufriedenheitsvorhersage

回报率互动:通过用户满意度预测加强积极主动的对话机构 2505.18731v1

Authors: Wei Shen, Xiaonan He, Chuheng Zhang, Xuyun Zhang, Xiaolong Xu, Wanchun Dou

Reward-driven proactive dialogue agents require precise estimation of user satisfaction as an intrinsic reward signal to determine optimal interaction strategies. Specifically, this framework triggers clarification questions when it detects potential user dissatisfaction during interactions in an industrial dialogue system. Traditional approaches typically train a neural network on weak labels generated by a simple model that is itself trained on user actions after the current turn. However, existing methods suffer from two critical limitations in real-world scenarios: (1) Noisy Reward Supervision: dependence on weak labels derived from post-hoc user actions introduces bias and, in particular, fails to capture satisfaction signals in ASR-error-induced utterances; (2) Long-Tail Feedback Sparsity: the power-law distribution of user queries causes reward prediction accuracy to drop in low-frequency domains. Together, the noise in the weak labels and the power-law distribution of user utterances make it hard for the model to learn good representations of user utterances and sessions. To address these limitations, we propose two auxiliary tasks that improve the representation learning of user utterances and sessions and thereby enhance user satisfaction prediction. The first is a contrastive self-supervised learning task, which helps the model learn the representation of rare user utterances and identify ASR errors. The second is a domain-intent classification task, which helps the model learn the representation of user sessions from long-tailed domains and improves its performance on such domains. The proposed method is evaluated on DuerOS, demonstrating significant improvements in the accuracy of error recognition on rare user utterances and long-tailed domains.

Article 1822

Title@2025-05-24 (6): Influence Functions for Scalable Data Attribution in Diffusion Models

Einflussfunktionen für skalierbare Datenzuweisungen in Diffusionsmodellen

扩散模型中可缩放数据归属的影响函数 2410.13850v5

Authors: Bruno Mlodozeniec, Runa Eschenhagen, Juhan Bae, Alexander Immer, David Krueger, Richard Turner

Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. In this paper, we aim to help address such challenges in diffusion models by developing an influence functions framework. Influence function-based data attribution methods approximate how a model’s output would have changed if some training data were removed. In supervised learning, this is usually used for predicting how the loss on a particular example would change. For diffusion models, we focus on predicting the change in the probability of generating a particular example via several proxy measurements. We show how to formulate influence functions for such quantities and how previously proposed methods can be interpreted as particular design choices in our framework. To ensure scalability of the Hessian computations in influence functions, we systematically develop K-FAC approximations based on generalised Gauss-Newton matrices specifically tailored to diffusion models. We show that our recommended method outperforms previous data attribution approaches on common evaluations, such as the Linear Data-modelling Score (LDS) or retraining without top influences, without the need for method-specific hyperparameter tuning.
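
For readers unfamiliar with influence functions, the removal approximation mentioned in the abstract can be written in its classical form as follows. This is the standard textbook formulation with illustrative notation (the symbols $\ell$, $m$, $H$ and $z_j$ are assumptions introduced here), not the paper's diffusion-specific derivation; in the setting described above, $H$ would be replaced by a K-FAC approximation of a generalised Gauss-Newton matrix.

```latex
% \hat\theta minimises the average training loss over n examples;
% m(\theta) is the measurement of interest (e.g. a proxy for the
% probability of generating a particular sample).
\hat\theta = \arg\min_\theta \frac{1}{n}\sum_{i=1}^{n} \ell(z_i, \theta),
\qquad
H = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^2\, \ell(z_i, \hat\theta)

% First-order effect of removing training example z_j:
\hat\theta_{-j} - \hat\theta \approx \frac{1}{n}\, H^{-1} \nabla_\theta \ell(z_j, \hat\theta),
\qquad
m(\hat\theta_{-j}) - m(\hat\theta) \approx \frac{1}{n}\,
  \nabla_\theta m(\hat\theta)^{\top} H^{-1} \nabla_\theta \ell(z_j, \hat\theta)
```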

Article 1823

Title@2025-05-24 (6): Message-Passing State-Space Models: Improving Graph Learning with Modern Sequence Modeling

Message-Passing State-Space-Modelle: Verbesserung des Graphen-Lernens mit moderner Sequenzmodellierung

传递信息的国家空间模型:利用现代序列模型改进图表学习 2505.18728v1

Authors: Andrea Ceni, Alessio Gravina, Claudio Gallicchio, Davide Bacciu, Carola-Bibiane Schonlieb, Moshe Eliasof

The recent success of State-Space Models (SSMs) in sequence modeling has motivated their adaptation to graph learning, giving rise to Graph State-Space Models (GSSMs). However, existing GSSMs operate by applying SSM modules to sequences extracted from graphs, often compromising core properties such as permutation equivariance, message-passing compatibility, and computational efficiency. In this paper, we introduce a new perspective by embedding the key principles of modern SSM computation directly into the Message-Passing Neural Network framework, resulting in a unified methodology for both static and temporal graphs. Our approach, MP-SSM, enables efficient, permutation-equivariant, and long-range information propagation while preserving the architectural simplicity of message passing. Crucially, MP-SSM enables an exact sensitivity analysis, which we use to theoretically characterize information flow and evaluate issues like vanishing gradients and over-squashing in the deep regime. Furthermore, our design choices allow for a highly optimized parallel implementation akin to modern SSMs. We validate MP-SSM across a wide range of tasks, including node classification, graph property prediction, long-range benchmarks, and spatiotemporal forecasting, demonstrating both its versatility and strong empirical performance.

Article 1824

Title@2025-05-24 (6): Length independent generalization bounds for deep SSM architectures via Rademacher contraction and stability constraints

Längenunabhängige Verallgemeinerungsgrenzen für tiefe SSM-Architekturen über Rademacher Kontraktion und Stabilitätsbeschränkungen

通过雷德马赫公司收缩和稳定制约因素对深层的SMS结构进行长度独立概括的界限 2405.20278v3

Authors: Dániel Rácz, Mihály Petreczky, Bálint Daróczy

Many state-of-the-art models trained on long-range sequences, for example S4, S5 or LRU, are made of sequential blocks combining State-Space Models (SSMs) with neural networks. In this paper we provide a PAC bound that holds for this kind of architecture with stable SSM blocks and does not depend on the length of the input sequence. Imposing stability of the SSM blocks is standard practice in the literature, and it is known to help performance. Our results provide a theoretical justification for the use of stable SSM blocks, as the proposed PAC bound decreases as the degree of stability of the SSM blocks increases.
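
To give some intuition for why stability matters for length independence, the following minimal sketch runs a scalar linear SSM on a constant input. It only illustrates that a stable recurrence (|a| < 1) keeps outputs bounded regardless of sequence length; it is not the PAC bound from the paper, and the function names and parameter values are assumptions chosen for illustration.

```python
def run_ssm(a, b, c, inputs):
    """Scalar linear SSM: x_{t+1} = a * x_t + b * u_t, y_t = c * x_{t+1}."""
    x = 0.0
    outputs = []
    for u in inputs:
        x = a * x + b * u
        outputs.append(c * x)
    return outputs


def peak_output(a, length):
    """Largest |y_t| when the SSM is driven by a constant input of 1."""
    return max(abs(y) for y in run_ssm(a, b=1.0, c=1.0, inputs=[1.0] * length))


# Stable block (|a| < 1): the peak is capped by |b * c| / (1 - |a|)
# no matter how long the sequence is.
stable_short = peak_output(0.9, 100)
stable_long = peak_output(0.9, 10_000)
bound = 1.0 / (1.0 - 0.9)

# Unstable block (|a| > 1): the peak keeps growing with the length.
unstable = peak_output(1.01, 1_000)
```

In this toy setting the cap 1 / (1 - |a|) shrinks as |a| moves away from 1, which mirrors the abstract's statement that the bound improves as the degree of stability increases.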

Article 1825

Title@2025-05-24 (6): Audio Geolocation: A Natural Sounds Benchmark

Audio Geolocation: Ein natürlicher Klang Benchmark

音频地理定位:自然声音基准 2505.18726v1

Authors: Mustafa Chasmai, Wuao Liu, Subhransu Maji, Grant Van Horn

Can we determine someone’s geographic location purely from the sounds they hear? Are acoustic signals enough to localize within a country, state, or even city? We tackle the challenge of global-scale audio geolocation, formalize the problem, and conduct an in-depth analysis with wildlife audio from the iNatSounds dataset. Adopting a vision-inspired approach, we convert audio recordings to spectrograms and benchmark existing image geolocation techniques. We hypothesize that species vocalizations offer strong geolocation cues due to their defined geographic ranges and propose an approach that integrates species range prediction with retrieval-based geolocation. We further evaluate whether geolocation improves when analyzing species-rich recordings or when aggregating across spatiotemporal neighborhoods. Finally, we introduce case studies from movies to explore multimodal geolocation using both audio and visual content. Our work highlights the advantages of integrating audio and visual cues, and sets the stage for future research in audio geolocation.

Article 1826

Title@2025-05-24 (6): LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning

LoTA-QAF: Lossless Ternary Adaptation für Quantization-Aware Fine-Tuning

LoTA-QAF:量化软件微调的无损失田间适应 2505.18724v1

Authors: Junyu Chen, Junzhuo Li, Zhen Peng, Wenjie Wang, Yuxiang Ren, Long Shi, Xuming Hu

Quantization and fine-tuning are crucial for deploying large language models (LLMs) on resource-constrained edge devices. However, fine-tuning quantized models presents significant challenges, stemming primarily from three issues. First, there is a mismatch in data types between the low-precision quantized weights (e.g., 4-bit) and the high-precision adaptation weights (e.g., 16-bit). This mismatch limits the computational efficiency advantage offered by quantized weights during inference. Second, accuracy can degrade when these high-precision adaptation weights are merged into the low-precision quantized weights, as the adaptation weights often necessitate approximation or truncation. Third, as far as we know, no existing method supports the lossless merging of adaptation while adjusting all quantized weights. To address these challenges, we introduce lossless ternary adaptation for quantization-aware fine-tuning (LoTA-QAF). This is a novel fine-tuning method specifically designed for quantized LLMs, enabling the lossless merging of ternary adaptation weights into quantized weights and the adjustment of all quantized weights. LoTA-QAF combines three components: (i) a custom-designed ternary adaptation (TA) that aligns ternary weights with the quantization grid and uses these ternary weights to adjust quantized weights; (ii) a TA-based mechanism that enables the lossless merging of adaptation weights; and (iii) ternary signed gradient descent (t-SignSGD) for updating the TA weights. We apply LoTA-QAF to the Llama-3.1/3.3 and Qwen-2.5 model families and validate its effectiveness on several downstream tasks. On the MMLU benchmark, our method effectively recovers performance for quantized models, surpassing 16-bit LoRA by up to 5.14%. For task-specific fine-tuning, 16-bit LoRA achieves superior results, but LoTA-QAF still outperforms other methods.
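
The contrast between merging a grid-aligned ternary adjustment and merging a high-precision delta can be illustrated with a toy example. This is a minimal sketch under strong assumptions: a 4-bit unsigned integer grid with a single scale and zero point, ternary values applied as integer steps, and clipping at the grid boundaries. All function names and numbers are hypothetical and do not reproduce the actual LoTA-QAF mechanism.

```python
def dequantize(q, scale, zero_point):
    """Map integer codes back to real-valued weights."""
    return [scale * (v - zero_point) for v in q]


def merge_ternary(q, t, bits=4):
    """Shift each integer code by -1, 0 or +1; the result stays on the grid."""
    lo, hi = 0, 2 ** bits - 1
    assert all(d in (-1, 0, 1) for d in t)
    return [min(hi, max(lo, v + d)) for v, d in zip(q, t)]


def requantize(w, scale, zero_point, bits=4):
    """Round real-valued weights back to the nearest integer code."""
    lo, hi = 0, 2 ** bits - 1
    return [min(hi, max(lo, round(x / scale) + zero_point)) for x in w]


scale, zero_point = 0.25, 8
q = [3, 8, 15, 0, 7]            # 4-bit integer codes
t = [1, -1, 0, 0, 1]            # ternary adjustment (illustrative values)

# Merging the ternary adjustment needs no rounding: the codes stay integers.
merged = merge_ternary(q, t)

# Merging a high-precision delta requires re-quantization, which rounds.
delta = [0.10, -0.30, 0.05, 0.0, 0.20]
target = [w + d for w, d in zip(dequantize(q, scale, zero_point), delta)]
rounded = requantize(target, scale, zero_point)
rounding_error = [
    abs(a - b) for a, b in zip(dequantize(rounded, scale, zero_point), target)
]
```

Here `merged` is again a list of valid 4-bit codes, whereas `rounding_error` is nonzero for several entries, which is the kind of merge loss the abstract attributes to high-precision adapters.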

Article 1827

Title@2025-05-24 (6): Optimal Transport-Based Token Weighting scheme for Enhanced Preference Optimization

Optimales Transport-basiertes Token-Gewichtungssystem für verbesserte Preference-Optimierung

增强优惠优化的优化运输托肯加权计划 2505.18720v1

Authors: Meng Li, Guangda Huzhang, Haibo Zhang, Xiting Wang, Anxiang Zeng

Direct Preference Optimization (DPO) has emerged as a promising framework for aligning Large Language Models (LLMs) with human preferences by directly optimizing the log-likelihood difference between chosen and rejected responses. However, existing methods assign equal importance to all tokens in the response, while humans focus on more meaningful parts. This leads to suboptimal preference optimization, as irrelevant or noisy tokens disproportionately influence DPO loss. To address this limitation, we propose an Optimal Transport-based token weighting scheme for enhancing direct Preference Optimization (OTPO). By emphasizing semantically meaningful token pairs and de-emphasizing less relevant ones, our method introduces a context-aware token weighting scheme that yields a more contrastive reward difference estimate. This adaptive weighting enhances reward stability, improves interpretability, and ensures that preference optimization focuses on meaningful differences between responses. Extensive experiments have validated OTPO’s effectiveness in improving instruction-following ability across various settings. Code is available at https://github.com/Mimasss2/OTPO.
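
The idea of weighting tokens inside the DPO loss can be sketched as follows. The loss form is the standard DPO objective with per-token weights added; the weights below are hand-picked for illustration and are not derived from optimal transport, so this is an assumption-laden sketch and not the OTPO implementation. Function names and numeric values are hypothetical.

```python
import math


def weighted_logratio(policy_logps, ref_logps, weights):
    """Weighted sum over tokens of log pi(y_t) - log pi_ref(y_t)."""
    return sum(w * (p - r) for w, p, r in zip(weights, policy_logps, ref_logps))


def dpo_loss(chosen, rejected, beta=0.1):
    """DPO loss -log sigmoid(beta * margin).

    `chosen` and `rejected` are (policy_logps, ref_logps, weights) triples.
    With all weights equal to 1 this reduces to the standard DPO loss.
    """
    margin = weighted_logratio(*chosen) - weighted_logratio(*rejected)
    return math.log1p(math.exp(-beta * margin))


# Per-token log-probabilities under the policy and the reference model.
chosen_policy, chosen_ref = [-1.0, -0.5, -2.0], [-1.2, -0.5, -2.5]
rejected_policy, rejected_ref = [-1.5, -0.7], [-1.0, -0.7]

# Standard DPO: every token counts equally.
uniform_loss = dpo_loss(
    (chosen_policy, chosen_ref, [1.0, 1.0, 1.0]),
    (rejected_policy, rejected_ref, [1.0, 1.0]),
)

# Token-weighted variant: emphasise tokens assumed to be meaningful
# and zero out a token assumed to be irrelevant.
weighted_loss = dpo_loss(
    (chosen_policy, chosen_ref, [0.5, 0.0, 2.5]),
    (rejected_policy, rejected_ref, [2.0, 0.0]),
)
```

In this example the weighted margin is larger than the uniform one, so the loss is lower, which illustrates what a "more contrastive reward difference estimate" means in practice.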

Article 1828

Title@2025-05-24 (6): Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer

Neurale Parameter Suche nach schlankeren Modellen und besserer Übertragung

搜索细微精制模型和更好传输的神经参数 2505.18713v1

Authors: Guodong Du, Zitao Fang, Jing Li, Junlin Li, Runhua Jiang, Shuyang Yu, Yifei Guo, Yangneng Chen, Sim Kuan Goh, Ho-Kin Tang, Daojing He, Honghai Liu, Min Zhang

Foundation models and their checkpoints have significantly advanced deep learning, boosting performance across various applications. However, fine-tuned models often struggle outside their specific domains and exhibit considerable redundancy. Recent studies suggest that combining a pruned fine-tuned model with the original pre-trained model can mitigate forgetting, reduce interference when merging model parameters across tasks, and improve compression efficiency. In this context, developing an effective pruning strategy for fine-tuned models is crucial. Leveraging the advantages of the task vector mechanism, we preprocess fine-tuned models by calculating the differences between them and the original model. Recognizing that different task vector subspaces contribute variably to model performance, we introduce a novel method called Neural Parameter Search (NPS-Pruning) for slimming down fine-tuned models. This method enhances pruning efficiency by searching through neural parameters of task vectors within low-rank subspaces. Our method has three key applications: enhancing knowledge transfer through pairwise model interpolation, facilitating effective knowledge fusion via model merging, and enabling the deployment of compressed models that retain near-original performance while significantly reducing storage costs. Extensive experiments across vision, NLP, and multi-modal benchmarks demonstrate the effectiveness and robustness of our approach, resulting in substantial performance gains. The code is publicly available at: https://github.com/duguodong7/NPS-Pruning.
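
The task-vector preprocessing described above can be sketched in a few lines. Note that the pruning step here is plain magnitude pruning, swapped in as a simple stand-in; the paper's NPS-Pruning searches within low-rank subspaces, which this sketch does not attempt. Weights are flat lists with illustrative values, and all function names are assumptions.

```python
def task_vector(finetuned, pretrained):
    """Difference between fine-tuned and pre-trained weights."""
    return [f - p for f, p in zip(finetuned, pretrained)]


def prune_by_magnitude(tau, keep_ratio):
    """Keep the largest-magnitude entries of the task vector (ties are kept)."""
    k = max(1, round(len(tau) * keep_ratio))
    threshold = sorted((abs(t) for t in tau), reverse=True)[k - 1]
    return [t if abs(t) >= threshold else 0.0 for t in tau]


def reconstruct(pretrained, pruned_tau, scale=1.0):
    """Add the pruned task vector back onto the pre-trained weights."""
    return [p + scale * t for p, t in zip(pretrained, pruned_tau)]


pretrained = [0.10, -0.40, 0.25, 0.80, -0.05, 0.30]
finetuned = [0.12, -0.90, 0.26, 0.20, -0.05, 0.31]

tau = task_vector(finetuned, pretrained)
pruned = prune_by_magnitude(tau, keep_ratio=1 / 3)
slim = reconstruct(pretrained, pruned)
```

Only the two largest changes survive pruning, so `slim` matches the fine-tuned model in those coordinates and the pre-trained model everywhere else. Storing only the sparse `pruned` vector alongside the shared pre-trained weights is one way the storage savings mentioned in the abstract could arise.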

Article 1829

Title@2025-05-24 (6): Learning on LLM Output Signatures for gray-box Behavior Analysis

Lernen auf LLM-Ausgangssignaturen für graue Verhaltensanalyse

学习用于灰箱行为分析的 LLM 输出签名 2503.14043v2

Authors: Guy Bar-Shalom, Fabrizio Frasca, Derek Lim, Yoav Gelberg, Yftah Ziser, Ran El-Yaniv, Gal Chechik, Haggai Maron

Large Language Models (LLMs) have achieved widespread adoption, yet our understanding of their behavior remains limited, particularly in detecting data contamination and hallucinations. While recently proposed probing techniques provide insights through activation analysis, they require "white-box" access to model internals, often unavailable. Current "gray-box" approaches typically analyze only the probability of the actual tokens in the sequence with simple task-specific heuristics. Importantly, these methods overlook the rich information contained in the full token distribution at each processing step. To address these limitations, we propose that gray-box analysis should leverage the complete observable output of LLMs, consisting of both the previously used token probabilities as well as the complete token distribution sequences - a unified data type we term LOS (LLM Output Signature). To this end, we develop a transformer-based approach to process LOS that theoretically guarantees approximation of existing techniques while enabling more nuanced analysis. Our approach achieves superior performance on hallucination and data contamination detection in gray-box settings, significantly outperforming existing baselines. Furthermore, it demonstrates strong transfer capabilities across datasets and LLMs, suggesting that LOS captures fundamental patterns in LLM behavior. Our code is available at: https://github.com/BarSGuy/LLM-Output-Signatures-Network.

Article 1830

Title@2025-05-24 (6): Steering LLM Reasoning Through Bias-Only Adaptation

Steuerung der LLM-Vernunft durch Bias-Only-Anpassung

仅有的偏差调整导致的偏差调整 2505.18706v1

Authors: Viacheslav Sinii, Alexey Gorbatovski, Artem Cherepanov, Boris Shaposhnikov, Nikita Balagansky, Daniil Gavrilov

Recent work on reasoning-oriented language models, exemplified by o1-like systems, suggests that reinforcement-learning (RL) finetuning does not create new capabilities but instead strengthens reasoning patterns already latent in the pretrained network. We test this claim by training steering vectors: layer-wise biases that additively amplify selected hidden features while leaving all original weights unchanged. Experiments on four base models across the GSM8K and MATH benchmarks show that steering vectors recover, and in several cases exceed, the accuracy of fully-tuned counterparts. This result supports the view that the required reasoning skills pre-exist in the base model. Further, logit-lens analysis reveals that the trained vectors consistently boost token groups linked to structured languages and logical connectors, providing an interpretable account that aligns with the demands of quantitative reasoning tasks.
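
The notion of a steering vector as a layer-wise additive bias on top of frozen weights can be sketched with a tiny two-layer network. This is a minimal illustration under assumptions: the network, its weights, and the steering values are made up, and no training loop is shown; it is not the authors' implementation.

```python
import math


def linear(x, W, b):
    """y = W x + b for a small dense layer given as nested lists."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(W, b)
    ]


def forward(x, layers, steering=None):
    """Run a tiny MLP; optionally add one steering vector per layer.

    The original weights in `layers` are never modified; only the
    additive `steering` vectors would be trained.
    """
    h = x
    for i, (W, b) in enumerate(layers):
        h = linear(h, W, b)
        if steering is not None:
            h = [h_k + s_k for h_k, s_k in zip(h, steering[i])]
        h = [math.tanh(v) for v in h]
    return h


# Frozen "pretrained" weights (illustrative values).
layers = [
    ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.0]),
    ([[1.0, 0.4], [-0.6, 0.2]], [0.0, 0.0]),
]

x = [1.0, 2.0]
base = forward(x, layers)

# Zero-initialised steering vectors leave the model unchanged.
steering = [[0.0, 0.0], [0.0, 0.0]]
steered_zero = forward(x, layers, steering)

# A non-zero entry amplifies one hidden feature in the second layer.
steering[1] = [0.5, 0.0]
steered = forward(x, layers, steering)
```

In this toy model the steering vectors add 4 trainable values next to 12 frozen weights and biases, which hints at how small the trainable footprint of bias-only adaptation can be.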

Article 1831

Title@2025-05-24 (6): (Implicit) Ensembles of Ensembles: Epistemic Uncertainty Collapse in Large Models

(Implizit) Ensembles von Ensembles: Epistemische Ungewissheit bricht in großen Modellen zusammen

群集集合:大型模型中的不确定性粒子折叠 2409.02628v2

Authors: Andreas Kirsch

Epistemic uncertainty is crucial for safety-critical applications and data acquisition tasks. Yet, we find an important phenomenon in deep learning models: an epistemic uncertainty collapse as model complexity increases, challenging the assumption that larger models invariably offer better uncertainty quantification. We introduce implicit ensembling as a possible explanation for this phenomenon. To investigate this hypothesis, we provide theoretical analysis and experiments that demonstrate uncertainty collapse in explicit ensembles of ensembles and show experimental evidence of similar collapse in wider models across various architectures, from simple MLPs to state-of-the-art vision models including ResNets and Vision Transformers. We further develop implicit ensemble extraction techniques to decompose larger models into diverse sub-models, showing we can thus recover epistemic uncertainty. We explore the implications of these findings for uncertainty estimation.
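
A small numerical example can make the collapse in explicit ensembles of ensembles concrete. The sketch below measures epistemic uncertainty as the mutual information between prediction and ensemble member (entropy of the mean prediction minus the mean member entropy), a common choice that is assumed here and not taken from the paper. The member predictions are invented for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def mean_prediction(members):
    """Average the class probabilities of several ensemble members."""
    n = len(members)
    return [sum(m[c] for m in members) / n for c in range(len(members[0]))]


def epistemic_uncertainty(members):
    """Mutual information: H[mean prediction] - mean of member entropies."""
    avg_entropy = sum(entropy(m) for m in members) / len(members)
    return entropy(mean_prediction(members)) - avg_entropy


# Four members that disagree strongly on a binary problem.
members = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
flat = epistemic_uncertainty(members)

# Ensemble of ensembles: average within two sub-ensembles first,
# then treat each sub-ensemble as a single member.
sub_ensembles = [mean_prediction(members[:2]), mean_prediction(members[2:])]
nested = epistemic_uncertainty(sub_ensembles)
```

Both sub-ensembles average to a near-uniform prediction, so the disagreement between them vanishes and `nested` drops to zero while `flat` stays clearly positive. Because entropy is concave, averaging within groups can only lower this measure, never raise it.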

Article 1832

Title@2025-05-24 (6): Data Overvaluation Attack and Truthful Data Valuation in Federated Learning

Datenüberbewertung Angriff und Truthful Data Bewertung im Föderierten Lernen

联邦学习联盟的数据评价高估攻击和真实数据估值 2502.00494v3

Authors: Shuyuan Zheng, Sudong Cai, Chuan Xiao, Yang Cao, Jianbin Qin, Masatoshi Yoshikawa, Makoto Onizuka

In collaborative machine learning (CML), data valuation, i.e., evaluating the contribution of each client’s data to the machine learning model, has become a critical task for incentivizing and selecting positive data contributions. However, existing studies often assume that clients engage in data valuation truthfully, overlooking the practical motivation for clients to exaggerate their contributions. To expose this threat, this paper introduces the data overvaluation attack, which enables strategic clients to have their data significantly overvalued in federated learning, a widely adopted paradigm for decentralized CML. Furthermore, we propose a Bayesian truthful data valuation metric, named Truth-Shapley. Truth-Shapley is the unique metric that guarantees some promising axioms for data valuation while ensuring that clients’ optimal strategy is to perform truthful data valuation under certain conditions. Our experiments demonstrate the vulnerability of existing data valuation metrics to the proposed attack and validate the robustness and effectiveness of Truth-Shapley.

Article 1833

Title@2025-05-24 (6): MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention

MonarchAchtung: Null-Schuss-Umwandlung zu schneller, Hardware-Bewusst strukturierter Aufmerksamkeit

MonarchAttention: 零热转换为快速硬件软件 2505.18698v1

Authors: Can Yaras, Alec S. Xu, Pierre Abillama, Changwoo Lee, Laura Balzano

Transformers have achieved state-of-the-art performance across various tasks, but suffer from a notable quadratic complexity in sequence length due to the attention mechanism. In this work, we propose MonarchAttention – a novel approach to sub-quadratic attention approximation via Monarch matrices, an expressive class of structured matrices. Based on the variational form of softmax, we describe an efficient optimization-based algorithm to compute an approximate projection of softmax attention onto the class of Monarch matrices with $\Theta(N\sqrt{N} d)$ computational complexity and $\Theta(Nd)$ memory/IO complexity. Unlike previous approaches, MonarchAttention is both (1) transferable, yielding minimal performance loss with no additional training, even when replacing every attention layer of the transformer, and (2) hardware-efficient, utilizing the highest-throughput tensor core units on modern GPUs. With optimized kernels, MonarchAttention achieves substantial speed-ups in wall-time over FlashAttention-2: $1.4\times$ for shorter sequences $(N=256)$, $4.5\times$ for medium-length sequences $(N=4K)$, and $8.2\times$ for longer sequences $(N=16K)$. We demonstrate the quality of MonarchAttention on diverse tasks and architectures in vision and language problems, showing that it flexibly and accurately approximates softmax attention in a variety of contexts. Our code is available at https://github.com/cjyaras/monarch-attention.
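
The asymptotic figures quoted above can be made concrete with a quick operation count. The sketch below ignores constant factors and assumes that N=4K and N=16K mean 4096 and 16384; the head dimension of 64 and the function names are illustrative assumptions, not taken from the MonarchAttention code base.

```python
import math


def softmax_attention_ops(n, d):
    """Leading-order cost of standard attention: N^2 * d."""
    return n * n * d


def monarch_attention_ops(n, d):
    """Leading-order cost quoted in the abstract: N * sqrt(N) * d."""
    return n * math.isqrt(n) * d


d = 64  # head dimension (assumed for illustration)
ratios = {
    n: softmax_attention_ops(n, d) // monarch_attention_ops(n, d)
    for n in (256, 4096, 16384)
}
```

The operation-count ratio equals sqrt(N), i.e. 16, 64 and 128 for these lengths, while the reported wall-time speed-ups over FlashAttention-2 are 1.4x, 4.5x and 8.2x. The gap suggests that constant factors and memory traffic account for a large share of the practical runtime.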

Article 1834

Title@2025-05-24 (6): Can LLMs Alleviate Catastrophic Forgetting in Graph Continual Learning? A Systematic Study

Kann LLMs in Graph Continual Learning Katastrophisches Vergessen lindern? Eine systematische Studie

LLMs LLM 能够减轻图持续学习中的灾难性遗忘吗?系统研究 2505.18697v1

Authors: Ziyang Cheng, Zhixun Li, Yuhan Li, Yixin Song, Kangyi Zhao, Dawei Cheng, Jia Li, Jeffrey Xu Yu

Nowadays, real-world data, including graph-structured data, often arrives in a streaming manner, which means that learning systems need to continuously acquire new knowledge without forgetting previously learned information. Although many existing works attempt to address catastrophic forgetting in graph machine learning, they are all based on training from scratch with streaming data. With the rise of pretrained models, an increasing number of studies have leveraged their strong generalization ability for continual learning. Therefore, in this work, we attempt to answer whether large language models (LLMs) can mitigate catastrophic forgetting in Graph Continual Learning (GCL). We first point out that current experimental setups for GCL have significant flaws, as the evaluation stage may lead to task ID leakage. Then, we evaluate the performance of LLMs in more realistic scenarios and find that even minor modifications can lead to outstanding results. Finally, based on extensive experiments, we propose a simple-yet-effective method, Simple Graph Continual Learning (SimGCL), that surpasses the previous state-of-the-art GNN-based baseline by around 20% under the rehearsal-free constraint. To facilitate reproducibility, we have developed an easy-to-use benchmark, LLM4GCL, for training and evaluating existing GCL methods. The code is available at: https://github.com/ZhixunLEE/LLM4GCL.

Article 1835

Title@2025-05-24 (6): Revisiting Model Inversion Evaluation: From Misleading Standards to Reliable Privacy Assessment

Revisiting Model Inversion Evaluation: Von irreführenden Standards zur zuverlässigen Datenschutzbewertung

重新审视示范反向评价:从错误领导标准到可靠隐私评估 2505.03519v3

Authors: Sy-Tuyen Ho, Koh Jun Hao, Ngoc-Bao Nguyen, Alexander Binder, Ngai-Man Cheung

Model Inversion (MI) attacks aim to reconstruct information from private training data by exploiting access to a machine learning model T. The standard evaluation framework for such attacks relies on an evaluation model E, trained under the same task design as T. This framework has become the de facto standard for assessing progress in MI research, used across nearly all recent MI attacks and defenses without question. In this paper, we present the first in-depth study of this MI evaluation framework. In particular, we identify a critical issue of this standard MI evaluation framework: Type-I adversarial examples. These are reconstructions that do not capture the visual features of private training data, yet are still deemed successful by the target model T and ultimately transfer to E. Such false positives undermine the reliability of the standard MI evaluation framework. To address this issue, we introduce a new MI evaluation framework that replaces the evaluation model E with advanced Multimodal Large Language Models (MLLMs). By leveraging their general-purpose visual understanding, our MLLM-based framework does not depend on training under the same task design as T, thus reducing Type-I transferability and providing more faithful assessments of reconstruction success. Using our MLLM-based evaluation framework, we reevaluate 26 diverse MI attack setups and empirically reveal consistently high false positive rates under the standard evaluation framework. Importantly, we demonstrate that many state-of-the-art (SOTA) MI methods report inflated attack accuracy, indicating that actual privacy leakage is significantly lower than previously believed. By uncovering this critical issue and proposing a robust solution, our work enables a reassessment of progress in MI research and sets a new standard for reliable and robust evaluation.

Article 1836

Title@2025-05-24 (6): Simultaneous Optimization of Efficiency and Degradation in Tunable HTL-Free Perovskite Solar Cells with MWCNT-Integrated Back Contact Using a Machine Learning-Derived Polynomial Regressor

Gleichzeitige Optimierung von Effizienz und Degradation in Tunablen HTL-freien Perovskite-Solarzellen mit MWCNT-Integriert Zurück Kontakt mit einem maschinenlernenden Polynom-Regressor

利用机械学习多面制反转器,与MWCNT综合后退联系,同时优化金枪鱼可HTL-无 Perovskite的无Perovskite太阳能电池的效率和退化 2505.18693v1

Authors: Ihtesham Ibn Malek, Hafiz Imtiaz, Samia Subrina

Perovskite solar cells (PSCs) without a hole transport layer (HTL) offer a cost-effective and stable alternative to conventional architectures, utilizing only an absorber layer and an electron transport layer (ETL). This study presents a machine learning (ML)-driven framework to optimize the efficiency and stability of HTL-free PSCs by integrating experimental validation with numerical simulations. Excellent agreement is achieved between a fabricated device and its simulated counterpart at a molar fraction $x = 68.7\%$ in $\mathrm{MAPb}_{1-x}\mathrm{Sb}_{2x/3}\mathrm{I}_3$, where MA is methylammonium. A dataset of 1650 samples is generated by varying molar fraction, absorber defect density, thickness, and ETL doping, with corresponding efficiency and 50-hour degradation as targets. A fourth-degree polynomial regressor (PR-4) shows the best performance, achieving RMSEs of 0.0179 and 0.0117, and $R^2$ scores of 1 and 0.999 for efficiency and degradation, respectively. The derived model generalizes beyond the training range and is used in an L-BFGS-B optimization algorithm with a weighted objective function to maximize efficiency and minimize degradation. This improves device efficiency from 13.7% to 16.84% and reduces degradation from 6.61% to 2.39% over 1000 hours. Finally, the dataset is labeled into superior and inferior classes, and a multilayer perceptron (MLP) classifier achieves 100% accuracy, successfully identifying optimal configurations.

Article 1837

Title@2025-05-24 (6): Variational Schrödinger Diffusion Models

Variationelle Schrödinger-Diffusionsmodelle

挥发模型 2405.04795v5

Authors: Wei Deng, Weijian Luo, Yixin Tan, Marin Biloš, Yu Chen, Yuriy Nevmyvaka, Ricky T. Q. Chen

The Schrödinger bridge (SB) has emerged as the go-to method for optimizing transportation plans in diffusion models. However, SB requires estimating the intractable forward score functions, inevitably resulting in a costly implicit training loss based on simulated trajectories. To improve scalability while preserving efficient transportation plans, we leverage variational inference to linearize the forward score functions (variational scores) of SB and restore simulation-free properties in training backward scores. We propose the variational Schrödinger diffusion model (VSDM), where the forward process is a multivariate diffusion and the variational scores are adaptively optimized for efficient transport. Theoretically, we use stochastic approximation to prove the convergence of the variational scores and show the convergence of the adaptively generated samples based on the optimal variational scores. Empirically, we test the algorithm in simulated examples and observe that VSDM is efficient at generating anisotropic shapes and yields straighter sample trajectories than the single-variate diffusion. We also verify the scalability of the algorithm on real-world data and achieve competitive unconditional generation performance on CIFAR10 and conditional generation in time series modeling. Notably, VSDM no longer depends on warm-up initializations and has become tuning-friendly in training large-scale experiments.

Article 1838

Title@2025-05-24 (6): Large Language Models in the Task of Automatic Validation of Text Classifier Predictions

Title: Large Language Models in the Task of Automatic Validation of Text Classifier Predictions

Große Sprachmodelle in der Aufgabe der automatischen Validierung von Textklassifikatoren Vorhersagen

文本分类自动验证任务中的大语言模型 2505.18688v1

Authors: Aleksandr Tsymbalov

Machine learning models for text classification are trained to predict a class for a given text. To do this, training and validation samples must be prepared: a set of texts is collected, and each text is assigned a class. These classes are usually assigned by human annotators with different expertise levels, depending on the specific classification task. Collecting such samples from scratch is labor-intensive because it requires finding specialists and compensating them for their work; moreover, the number of available specialists is limited, and their productivity is constrained by human factors. While it may not be too resource-intensive to collect samples once, the ongoing need to retrain models (especially in incremental learning pipelines) to address data drift (also called model drift) makes the data collection process crucial and costly over the model’s entire lifecycle. This paper proposes several approaches to replace human annotators with Large Language Models (LLMs) to test classifier predictions for correctness, helping ensure model quality and support high-quality incremental learning.

Article 1839

Title@2025-05-24 (6): Predictive Performance of Deep Quantum Data Re-uploading Models

Title: Predictive Performance of Deep Quantum Data Re-uploading Models

Predictive Performance von Deep Quantum Data Re-Uploading-Modellen

深量量数据数据重新加载模型的预测性性能 2505.20337v1

Authors: Xin Wang, Han-Xiao Tao, Re-Bing Wu

Quantum machine learning models incorporating data re-uploading circuits have garnered significant attention due to their exceptional expressivity and trainability. However, their ability to generate accurate predictions on unseen data, referred to as the predictive performance, remains insufficiently investigated. This study reveals a fundamental limitation in predictive performance when deep encoding layers are employed within the data re-uploading model. Concretely, we theoretically demonstrate that when processing high-dimensional data with limited-qubit data re-uploading models, their predictive performance progressively degenerates to near random-guessing levels as the number of encoding layers increases. In this context, the repeated data uploading cannot mitigate the performance degradation. These findings are validated through experiments on both synthetic linearly separable datasets and real-world datasets. Our results demonstrate that when processing high-dimensional data, the quantum data re-uploading models should be designed with wider circuit architectures rather than deeper and narrower ones.

Article 1840

Title@2025-05-24 (6): A fast algorithm to minimize prediction loss of the optimal solution in inverse optimization problem of MILP

Title: A fast algorithm to minimize prediction loss of the optimal solution in inverse optimization problem of MILP

Ein schneller Algorithmus zur Minimierung des Vorhersageverlusts der optimalen Lösung im inversen Optimierungsproblem von MILP

快速算法,以尽量减少MILP反优化问题最佳解决办法的预测损失 2405.14273v3

Authors: Akira Kitaoka

We consider the inverse optimization problem of estimating the weights of the objective function such that the given solution is an optimal solution for a mixed integer linear program (MILP). In this inverse optimization problem, the known methods exhibit inefficient convergence. Specifically, if $d$ denotes the dimension of the weights and $k$ the number of iterations, then the error of the weights is bounded by $O(k^{-1/(d-1)})$, leading to slow convergence as $d$ increases. We propose a projected subgradient method with a step size of $k^{-1/2}$ based on suboptimality loss. We theoretically show and demonstrate that the proposed method efficiently learns the weights. In particular, we show that there exists a constant $\gamma > 0$ such that the distance between the learned and true weights is bounded by $ O\left(k^{-1/(1+\gamma)} \exp\left(-\frac{\gamma k^{1/2}}{2+\gamma}\right)\right), $ or the optimal solution is exactly recovered. Furthermore, experiments demonstrate that the proposed method solves the inverse optimization problems of MILP using fewer than $1/7$ the number of MILP calls required by known methods, and converges within a finite number of iterations.
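
The projected subgradient scheme described above can be sketched on a toy problem. The Python sketch below minimizes the suboptimality loss $\ell(w) = \max_{x} w^\top x - w^\top x_{\mathrm{obs}}$ with step size $k^{-1/2}$ and projects onto the probability simplex after each step. The enumerated feasible set, the brute-force `solve_forward` helper, the choice of the simplex as the weight domain, and all names are assumptions made for illustration; this is not the authors' implementation, and a real MILP would call a solver instead of enumerating points.

```python
# Hypothetical toy feasible set: integer points with x1 + x2 <= 2 (stands in for a MILP).
FEASIBLE = [(a, b) for a in range(3) for b in range(3) if a + b <= 2]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def solve_forward(w):
    """Brute-force stand-in for a MILP solver: maximize w.x over FEASIBLE."""
    return max(FEASIBLE, key=lambda x: dot(w, x))


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        candidate = (cumulative - 1.0) / j
        if uj - candidate > 0:
            theta = candidate  # keep the threshold of the largest valid j
    return [max(vi - theta, 0.0) for vi in v]


def learn_weights(x_obs, w0, iterations=100, tol=1e-12):
    """Projected subgradient descent on the suboptimality loss."""
    w = list(w0)
    for k in range(1, iterations + 1):
        x_star = solve_forward(w)
        loss = dot(w, x_star) - dot(w, x_obs)  # >= 0 because x_obs is feasible
        if loss <= tol:
            break  # x_obs is already optimal under the current weights
        subgrad = [xs - xo for xs, xo in zip(x_star, x_obs)]
        step = k ** -0.5  # step size k^{-1/2}
        w = project_to_simplex([wi - step * gi for wi, gi in zip(w, subgrad)])
    return w


x_obs = (2, 0)  # solution observed under unknown true weights
w_hat = learn_weights(x_obs, [0.2, 0.8])
print(w_hat, solve_forward(w_hat))
```

The loop stops as soon as the observed solution is optimal under the current weights, which mirrors the "optimal solution is exactly recovered" case in the abstract; each iteration costs one forward solve.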

Article 1841

Title@2025-05-24 (6): Thinking like a CHEMIST: Combined Heterogeneous Embedding Model Integrating Structure and Tokens

Title: Thinking like a CHEMIST: Combined Heterogeneous Embedding Model Integrating Structure and Tokens

Wie ein CHEMIST denken: Kombiniertes Heterogenes Einbetten von Modellintegrationsstrukturen und Tokens

思考像CHEMIST: 混合异基因嵌入模型集成结构和调子 2502.17986v2

Authors: Nikolai Rekut, Alexey Orlov, Klea Ziu, Elizaveta Starykh, Martin Takac, Aleksandr Beznosikov

Representing molecular structures effectively in chemistry remains a challenging task. Language models and graph-based models are extensively utilized within this domain, consistently achieving state-of-the-art results across an array of tasks. However, the prevailing practice of representing chemical compounds in the SMILES format - used by most data sets and many language models - presents notable limitations as a training data format. In this study, we present a novel approach that decomposes molecules into substructures and computes descriptor-based representations for these fragments, providing more detailed and chemically relevant input for model training. We use this substructure and descriptor data as input for a language model and also propose a bimodal architecture that integrates this language model with graph-based models. We use RoBERTa as the language model, and Graph Isomorphism Networks (GIN), Graph Convolutional Networks (GCN), and Graphormer as the graph-based models. Our framework shows notable improvements over traditional methods in various tasks such as Quantitative Structure-Activity Relationship (QSAR) prediction.

Article 1842

Title@2025-05-24 (6): Augmenting the action space with conventions to improve multi-agent cooperation in Hanabi

Title: Augmenting the action space with conventions to improve multi-agent cooperation in Hanabi

Erweiterung des Aktionsraums mit Konventionen zur Verbesserung der Multi-Agenten-Kooperation in Hanabi

与公约扩大行动空间,以改进哈纳比多剂合作 2412.06333v3

Authors: F. Bredell, H. A. Engelbrecht, J. C. Schoeman

The card game Hanabi is considered a strong medium for the testing and development of multi-agent reinforcement learning (MARL) algorithms, due to its cooperative nature, partial observability, limited communication and remarkable complexity. Previous research efforts have explored the capabilities of MARL algorithms within Hanabi, focusing largely on advanced architecture design and algorithmic manipulations to achieve state-of-the-art performance for various numbers of cooperators. However, this often leads to complex solution strategies that incur high computational cost and require large amounts of training data. For humans to solve the Hanabi game effectively, they require the use of conventions, which provide a means to implicitly convey ideas or knowledge based on a predefined, and mutually agreed upon, set of “rules” or principles. Multi-agent problems containing partial observability, especially when limited communication is present, can benefit greatly from the use of implicit knowledge sharing. In this paper, we propose a novel approach to augmenting an agent’s action space using conventions, which act as a sequence of special cooperative actions that span over and include multiple time steps and multiple agents, requiring agents to actively opt in for them to reach fruition. These conventions are based on existing human conventions, and result in a significant improvement on the performance of existing techniques for self-play and cross-play for various numbers of cooperators within Hanabi.

Article 1843

Title@2025-05-24 (6): COPA: Comparing the incomparable in multi-objective model evaluation

Title: COPA: Comparing the incomparable in multi-objective model evaluation

COPA: Vergleich des Unvergleichbaren in der multiobjektiven Modellauswertung

CCOPA: 比较在多目标模式评价中无法比较的模型评价 2503.14321v2

Authors: Adrián Javaloy, Antonio Vergari, Isabel Valera

As machine learning (ML) practitioners, we often have hundreds of (trained) ML models at hand from which we need to choose one, based on various objectives such as accuracy, robustness, fairness, scalability, etc. However, how to compare, aggregate and, ultimately, trade off these objectives is usually a time-consuming task that requires expert knowledge, as they may be measured in different units or scales. In this work, we investigate how objectives can be automatically normalized and aggregated to systematically navigate their Pareto front. To do so, we make incomparable objectives comparable using their CDFs, approximated by their relative rankings. As a result, we can aggregate them while matching user-specific preferences, allowing practitioners to meaningfully navigate and search for models in the Pareto front. We demonstrate the potential impact of our approach, COPA, in both model selection and benchmarking tasks across diverse ML areas such as fair ML, domain generalization, AutoML and foundation models, where classical ways to normalize and aggregate objectives fall short.
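
To make the rank-based normalization concrete, here is a minimal Python sketch. It maps each objective to its empirical CDF (the fraction of models with a value at or below it), so that objectives in different units land on a common $[0, 1]$ scale, and then combines them with user weights. The weighted-sum aggregation, the toy data, and all function names are assumptions for illustration; the paper's actual aggregation criterion may differ.

```python
def ecdf_normalize(values):
    """Map each value to its empirical CDF: the fraction of values <= it."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]


def aggregate(objectives, weights):
    """Weighted sum of rank-normalized objectives (all treated as costs)."""
    normalized = [ecdf_normalize(column) for column in objectives]
    n_models = len(objectives[0])
    return [
        sum(w * column[i] for w, column in zip(weights, normalized))
        for i in range(n_models)
    ]


# Hypothetical data: three models, two objectives on very different scales.
error_rate = [0.10, 0.20, 0.15]     # lower is better
latency_ms = [300.0, 50.0, 120.0]   # lower is better


def best_model(weights):
    """Index of the model with the lowest aggregated cost."""
    scores = aggregate([error_rate, latency_ms], weights)
    return min(range(len(scores)), key=scores.__getitem__)


# Different preferences select different models on the same Pareto front.
print(best_model((0.8, 0.2)), best_model((0.2, 0.8)))
```

Because only relative rankings enter the scores, rescaling an objective (for example, reporting latency in seconds instead of milliseconds) leaves the selection unchanged.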

Article 1844

Title@2025-05-24 (6): End-to-End Framework for Predicting the Remaining Useful Life of Lithium-Ion Batteries

Title: End-to-End Framework for Predicting the Remaining Useful Life of Lithium-Ion Batteries

End-to-End-Framework zur Vorhersage der verbleibenden Nutzungsdauer von Lithium-Ionen-Batterien

预测锂-碘电池剩余使用寿命的端至端框架 2505.16664v2

Authors: Khoa Tran, Tri Le, Bao Huynh, Hung-Cuong Trinh, Vy-Rin Nguyen

Accurate prediction of the Remaining Useful Life (RUL) is essential for enabling timely maintenance of lithium-ion batteries, impacting the operational efficiency of electric applications that rely on them. This paper proposes a RUL prediction approach that leverages data from recent charge-discharge cycles to estimate the number of remaining usable cycles. The approach introduces both a novel signal processing pipeline and a deep learning prediction model. In the signal preprocessing pipeline, a derived capacity feature $\dot{Q}(I, Q)$ is computed based on current and capacity signals. Alongside original capacity, voltage and current, these features are denoised and enhanced using statistical metrics and a delta-based method to capture differences between the current and previous cycles. In the prediction model, the processed features are then fed into a hybrid deep learning architecture composed of 1D Convolutional Neural Networks (CNN), Attentional Long Short-Term Memory (A-LSTM), and Ordinary Differential Equation-based LSTM (ODE-LSTM) blocks. This architecture is designed to capture both local signal characteristics and long-range temporal dependencies while modeling the continuous-time dynamics of battery degradation. The model is further evaluated using transfer learning across different learning strategies and target data partitioning scenarios. Results indicate that the model maintains robust performance, even when fine-tuned on limited target data. Experimental results on two publicly available large-scale datasets demonstrate that the proposed method outperforms a baseline deep learning approach and machine learning techniques, achieving an RMSE of 101.59, highlighting its strong potential for real-world RUL prediction applications.

Article 1845

Title@2025-05-24 (6): A Quantum Approximation Scheme for k-Means

Title: A Quantum Approximation Scheme for k-Means

Ein Quantenannäherungsprogramm für k-Means

k- Means 的量接近量计划 2308.08167v3

Authors: Ragesh Jaiswal

We give a quantum approximation scheme (i.e., $(1 + \varepsilon)$-approximation for every $\varepsilon > 0$) for the classical $k$-means clustering problem in the QRAM model with a running time that has only polylogarithmic dependence on the number of data points. More specifically, given a dataset $V$ with $N$ points in $\mathbb{R}^d$ stored in QRAM data structure, our quantum algorithm runs in time $\tilde{O} \left( 2^{\tilde{O}(\frac{k}{\varepsilon})} \eta^2 d\right)$ and with high probability outputs a set $C$ of $k$ centers such that $cost(V, C) \leq (1+\varepsilon) \cdot cost(V, C_{OPT})$. Here $C_{OPT}$ denotes the optimal $k$-centers, $cost(.)$ denotes the standard $k$-means cost function (i.e., the sum of the squared distance of points to the closest center), and $\eta$ is the aspect ratio (i.e., the ratio of maximum distance to minimum distance). This is the first quantum algorithm with a polylogarithmic running time that gives a provable approximation guarantee of $(1+\varepsilon)$ for the $k$-means problem. Also, unlike previous works on unsupervised learning, our quantum algorithm does not require quantum linear algebra subroutines and has a running time independent of parameters (e.g., condition number) that appear in such procedures.
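
The quantities in this guarantee are purely classical and can be written down directly. The Python sketch below computes the $k$-means cost, the aspect ratio $\eta$, and checks the $(1+\varepsilon)$ condition against a reference set of centers. It illustrates the definitions only, not the quantum algorithm; the function names and the toy points are assumptions.

```python
import math
from itertools import combinations


def kmeans_cost(points, centers):
    """Sum over points of the squared distance to the closest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)


def aspect_ratio(points):
    """Ratio of the largest to the smallest pairwise distance (distinct points)."""
    distances = [math.dist(p, q) for p, q in combinations(points, 2)]
    return max(distances) / min(distances)


def is_approximation(points, centers, reference_centers, eps):
    """Check cost(V, C) <= (1 + eps) * cost(V, C_ref)."""
    return kmeans_cost(points, centers) <= (1 + eps) * kmeans_cost(
        points, reference_centers
    )


# Hypothetical toy instance with k = 2.
V = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
C_ref = [(0.0, 1.0), (10.0, 0.0)]      # reference centers
C_candidate = [(0.0, 0.0), (10.0, 0.0)]

print(kmeans_cost(V, C_ref), kmeans_cost(V, C_candidate), aspect_ratio(V))
print(is_approximation(V, C_candidate, C_ref, eps=1.0))
```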

Article 1846

Title@2025-05-24 (6): Generating Full-field Evolution of Physical Dynamics from Irregular Sparse Observations

Title: Generating Full-field Evolution of Physical Dynamics from Irregular Sparse Observations

Erzeugen der Vollfeld-Evolution der physikalischen Dynamik aus irregulären Sparse-Beobachtungen

从不定期的偏差观测中生成物理动态全场演变 2505.09284v2

Authors: Panqi Chen, Yifan Sun, Lei Cheng, Yang Yang, Weichang Li, Yang Liu, Weiqing Liu, Jiang Bian, Shikai Fang

Modeling and reconstructing multidimensional physical dynamics from sparse and off-grid observations presents a fundamental challenge in scientific research. Recently, diffusion-based generative modeling has shown promising potential for physical simulation. However, current approaches typically operate on on-grid data with preset spatiotemporal resolution and struggle with the sparsely observed and continuous nature of real-world physical dynamics. To fill the gaps, we present SDIFT, Sequential DIffusion in Functional Tucker space, a novel framework that generates full-field evolution of physical dynamics from irregular sparse observations. SDIFT leverages the functional Tucker model as the latent space representer with proven universal approximation property, and represents observations as latent functions and Tucker core sequences. We then construct a sequential diffusion model with temporally augmented UNet in the functional Tucker space, denoising noise drawn from a Gaussian process to generate the sequence of core tensors. At the posterior sampling stage, we propose a Message-Passing Posterior Sampling mechanism, enabling conditional generation of the entire sequence guided by observations at limited time steps. We validate SDIFT on three physical systems spanning astronomical (supernova explosions, light-year scale), environmental (ocean sound speed fields, kilometer scale), and molecular (organic liquid, millimeter scale) domains, demonstrating significant improvements in both reconstruction accuracy and computational efficiency compared to state-of-the-art approaches.

Article 1847

Title@2025-05-24 (6): Does Representation Intervention Really Identify Desired Concepts and Elicit Alignment?

Title: Does Representation Intervention Really Identify Desired Concepts and Elicit Alignment?

Findet Repräsentationsintervention wirklich Wunschvorstellungen und Ausgeglichenheit wieder?

代表权干预是否真正确定了理想概念和目的一致? 2505.18672v1

Authors: Hongzheng Yang, Yongqiang Chen, Zeyu Qin, Tongliang Liu, Chaowei Xiao, Kun Zhang, Bo Han

Representation intervention aims to locate and modify the representations that encode the underlying concepts in Large Language Models (LLMs) to elicit the aligned and expected behaviors. Despite the empirical success, it has never been examined whether one could locate the faithful concepts for intervention. In this work, we explore the question in safety alignment. If the interventions are faithful, the intervened LLMs should erase the harmful concepts and be robust to both in-distribution adversarial prompts and the out-of-distribution (OOD) jailbreaks. While it is feasible to erase harmful concepts without degrading the benign functionalities of LLMs in linear settings, we show that it is infeasible in the general non-linear setting. To tackle the issue, we propose Concept Concentration (COCA). Instead of identifying the faithful locations to intervene, COCA refactors the training data with an explicit reasoning process, which first identifies the potential unsafe concepts and then decides the responses. Essentially, COCA simplifies the decision boundary between harmful and benign representations, enabling more effective linear erasure. Extensive experiments with multiple representation intervention methods and model architectures demonstrate that COCA significantly reduces both in-distribution and OOD jailbreak success rates while maintaining strong performance on regular tasks such as math and code generation.

Article 1848

Title@2025-05-24 (6): Flat-LoRA: Low-Rank Adaptation over a Flat Loss Landscape

Title: Flat-LoRA: Low-Rank Adaptation over a Flat Loss Landscape

Flat-LoRA: Low-Rank Anpassung über eine flache verlorene Landschaft

Flat-LORA: 适应平坦损失地貌的低Rank适应 2409.14396v2

Authors: Tao Li, Zhengbao He, Yujun Li, Yasheng Wang, Lifeng Shang, Xiaolin Huang

Fine-tuning large-scale pre-trained models is prohibitively expensive in terms of computation and memory costs. Low-Rank Adaptation (LoRA), a popular Parameter-Efficient Fine-Tuning (PEFT) method, offers an efficient solution by optimizing only low-rank matrices. Despite recent progress in improving LoRA’s performance, the relationship between the LoRA optimization space and the full parameter space is often overlooked. A solution that appears flat in the loss landscape of the LoRA space may still exhibit sharp directions in the full parameter space, potentially compromising generalization. We introduce Flat-LoRA, which aims to identify a low-rank adaptation situated in a flat region of the full parameter space. Instead of adopting the well-established sharpness-aware minimization approach, which incurs significant computation and memory overheads, we employ a Bayesian expectation loss objective to preserve training efficiency. Further, we design a refined random perturbation generation strategy for improved performance and carefully manage memory overhead using random seeds. Experiments across diverse tasks (including mathematical reasoning, coding abilities, dialogue generation, instruction following, and text-to-image generation) demonstrate that Flat-LoRA improves both in-domain and out-of-domain generalization. Code is available at https://github.com/nblt/Flat-LoRA.

Article 1849

Title@2025-05-24 (6): DeCaFlow: A Deconfounding Causal Generative Model

Title: DeCaFlow: A Deconfounding Causal Generative Model

DeCaFlow: Ein entkonfoundierendes Kausalgeneratives Modell

DeCaFlow:一个破碎的因果创造模型 2503.15114v2

Authors: Alejandro Almodóvar, Adrián Javaloy, Juan Parras, Santiago Zazo, Isabel Valera

We introduce DeCaFlow, a deconfounding causal generative model. Training once per dataset using just observational data and the underlying causal graph, DeCaFlow enables accurate causal inference on continuous variables in the presence of hidden confounders. Specifically, we extend previous results on causal estimation under hidden confounding to show that a single instance of DeCaFlow provides correct estimates for all causal queries identifiable with do-calculus, leveraging proxy variables to adjust for the causal effects when do-calculus alone is insufficient. Moreover, we show that counterfactual queries are identifiable as long as their interventional counterparts are identifiable, and thus are also correctly estimated by DeCaFlow. Our empirical results on diverse settings (including the Ecoli70 dataset, with 3 independent hidden confounders, tens of observed variables and hundreds of causal queries) show that DeCaFlow outperforms existing approaches, while demonstrating its out-of-the-box applicability to any given causal graph. An implementation can be found at https://github.com/aalmodovares/DeCaFlow

Article 1850

Title@2025-05-24 (6): Self-Supervised Evolution Operator Learning for High-Dimensional Dynamical Systems

Title: Self-Supervised Evolution Operator Learning for High-Dimensional Dynamical Systems

Selbstüberwachtes Evolutionsoperator-Lernen für hochdimensionelle dynamische Systeme

高多元动态系统学习 2505.18671v1

Authors: Giacomo Turri, Luigi Bonati, Kai Zhu, Massimiliano Pontil, Pietro Novelli

We introduce an encoder-only approach to learn the evolution operators of large-scale non-linear dynamical systems, such as those describing complex natural phenomena. Evolution operators are particularly well-suited for analyzing systems that exhibit complex spatio-temporal patterns and have become a key analytical tool across various scientific communities. As terabyte-scale weather datasets and simulation tools capable of running millions of molecular dynamics steps per day are becoming commodities, our approach provides an effective tool to make sense of them from a data-driven perspective. The core of it lies in a remarkable connection between self-supervised representation learning methods and the recently established learning theory of evolution operators. To show the usefulness of the proposed method, we test it across multiple scientific domains: explaining the folding dynamics of small proteins, the binding process of drug-like molecules in host sites, and autonomously finding patterns in climate data. Code and data to reproduce the experiments are made available open source.

Article 1851

Title@2025-05-24 (6): Memory-Efficient Super-Resolution of 3D Micro-CT Images Using Octree-Based GANs: Enhancing Resolution and Segmentation Accuracy

Title: Memory-Efficient Super-Resolution of 3D Micro-CT Images Using Octree-Based GANs: Enhancing Resolution and Segmentation Accuracy

Speichereffiziente Super-Resolution von 3D-Mikro-CT-Bildern mit oktree-basierten GANs: Verbesserung der Auflösung und Segmentierung Genauigkeit

使用以屋底为主的GANs:加强分辨率和分解准确度 2505.18664v1

Authors: Evgeny Ugolkov, Xupeng He, Hyung Kwak, Hussein Hoteit

We present a memory-efficient algorithm for significantly enhancing the quality of segmented 3D micro-Computed Tomography (micro-CT) images of rocks using a generative model. The proposed model achieves a 16x increase in resolution and corrects inaccuracies in segmentation caused by the overlapping X-ray attenuation in micro-CT measurements across different minerals. The generative model employed is a 3D Octree-based convolutional Wasserstein generative adversarial network with gradient penalty. To address the challenge of high memory consumption inherent in standard 3D convolutional layers, we implemented an Octree structure within the 3D progressive growing generator model. This enabled the use of memory-efficient 3D Octree-based convolutional layers. The approach is pivotal in overcoming the long-standing memory bottleneck in volumetric deep learning, making it possible to reach 16x super-resolution in 3D, a scale that is challenging to attain due to cubic memory scaling. For training, we utilized segmented 3D low-resolution micro-CT images along with unpaired segmented complementary 2D high-resolution laser scanning microscope images. Post-training, resolution improved from 7 to 0.44 µm/voxel with accurate segmentation of constituent minerals. Validated on Berea sandstone, this framework demonstrates substantial improvements in pore characterization and mineral differentiation, offering a robust solution to one of the primary computational limitations in modern geoscientific imaging.

Article 1852

Title@2025-05-24 (6): Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees

Title: Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees

Adaptive Vorhersage-Powered AutoEval mit Zuverlässigkeit und Effizienzgarantien

具有可靠性和效率保障的适应性预测力自动评估 2505.18659v1

Authors: Sangwoo Park, Matteo Zecchin, Osvaldo Simeone

Selecting artificial intelligence (AI) models, such as large language models (LLMs), from multiple candidates requires accurate performance estimation. This is ideally achieved through empirical evaluations involving abundant real-world data. However, such evaluations are costly and impractical at scale. To address this challenge, autoevaluation methods leverage synthetic data produced by automated evaluators, such as LLMs-as-judges, reducing variance but potentially introducing bias. Recent approaches have employed semi-supervised prediction-powered inference (PPI) to correct for the bias of autoevaluators. However, the use of autoevaluators may lead in practice to a degradation in sample efficiency compared to conventional methods using only real-world data. In this paper, we propose R-AutoEval+, a novel framework that provides finite-sample reliability guarantees on the model evaluation, while also ensuring an enhanced (or at least no worse) sample efficiency compared to conventional methods. The key innovation of R-AutoEval+ is an adaptive construction of the model evaluation variable, which dynamically tunes its reliance on synthetic data, reverting to conventional methods when the autoevaluator is insufficiently accurate. Experiments on the use of LLMs-as-judges for the optimization of quantization settings for the weights of an LLM, and for prompt design in LLMs confirm the reliability and efficiency of R-AutoEval+.
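
For readers unfamiliar with prediction-powered inference, the basic bias-corrected mean estimate can be sketched as follows. The estimator starts from the average of the real labels and adds a correction equal to the gap between the autoevaluator's average score on unlabeled data and its average score on the labeled data. The fixed weight `lam` is an assumption standing in for the paper's adaptive reliance on synthetic data: `lam = 0` falls back to the conventional real-data-only estimate, and `lam = 1` gives the standard PPI estimate. This does not reproduce the R-AutoEval+ construction or its finite-sample guarantees, and all names and numbers are illustrative.

```python
from statistics import fmean


def ppi_mean(labeled_y, labeled_pred, unlabeled_pred, lam=1.0):
    """Prediction-powered estimate of the mean outcome.

    labeled_y      : real outcomes on the small labeled set
    labeled_pred   : autoevaluator scores on the same labeled inputs
    unlabeled_pred : autoevaluator scores on the large unlabeled set
    lam            : reliance on synthetic scores (0 = ignore them)
    """
    correction = fmean(unlabeled_pred) - fmean(labeled_pred)
    return fmean(labeled_y) + lam * correction


# Hypothetical scores for illustration only.
y_labeled = [1.0, 0.0, 1.0, 1.0]
judge_labeled = [0.9, 0.2, 0.8, 0.7]
judge_unlabeled = [0.5, 0.6, 0.7, 0.8, 0.9, 0.5]

conventional = ppi_mean(y_labeled, judge_labeled, judge_unlabeled, lam=0.0)
prediction_powered = ppi_mean(y_labeled, judge_labeled, judge_unlabeled, lam=1.0)
print(conventional, prediction_powered)
```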

Article 1853

Title@2025-05-24 (6): Robustness in Large Language Models: A Survey of Mitigation Strategies and Evaluation Metrics

Title: Robustness in Large Language Models: A Survey of Mitigation Strategies and Evaluation Metrics

Robustheit in großen Sprachmodellen: Eine Umfrage zu Mitigationsstrategien und Evaluationsmetrics

大语言模式的强强力:减轻战略调查和评价 2505.18658v1

Authors: Pankaj Kumar, Subhankar Mishra

Large Language Models (LLMs) have emerged as a promising cornerstone for the development of natural language processing (NLP) and artificial intelligence (AI). However, ensuring the robustness of LLMs remains a critical challenge. To address these challenges and advance the field, this survey provides a comprehensive overview of current studies in this area. First, we systematically examine the nature of robustness in LLMs, including its conceptual foundations, the importance of consistent performance across diverse inputs, and the implications of failure modes in real-world applications. Next, we analyze the sources of non-robustness, categorizing intrinsic model limitations, data-driven vulnerabilities, and external adversarial factors that compromise reliability. Following this, we review state-of-the-art mitigation strategies, and then we discuss widely adopted benchmarks, emerging metrics, and persistent gaps in assessing real-world reliability. Finally, we synthesize findings from existing surveys and interdisciplinary studies to highlight trends, unresolved issues, and pathways for future research.

Article 1854

Title@2025-05-24 (6): LLM-QFL: Distilling Large Language Model for Quantum Federated Learning

Title: LLM-QFL: Distilling Large Language Model for Quantum Federated Learning

LLM-QFL: Destillieren eines großen Sprachmodells für Quantum-Federated Learning

LLM-QFL:为量子联邦学习保留大语言模式 2505.18656v1

Authors: Dev Gurung, Shiva Raj Pokhrel

Inspired by the power of large language models (LLMs), our research adapts them to quantum federated learning (QFL) to boost efficiency and performance. We propose a federated fine-tuning method that distills an LLM within QFL, allowing each client to locally adapt the model to its own data while preserving privacy and reducing unnecessary global updates. The fine-tuned LLM also acts as a reinforcement agent, optimizing QFL by adjusting optimizer steps, cutting down communication rounds, and intelligently selecting clients. Experiments show significant efficiency gains. We pioneer a synergy between LLM and QFL, offering: i) practical efficiency: Reduced communication costs and faster convergence. ii) theoretical rigor: Provable guarantees for adaptive federated optimization. iii) scalability: PEFT methods (LoRA, QLoRA) enable deployment on resource-constrained quantum devices. Code implementation is available here 1.

Article 1855

Title@2025-05-24 (6): On the Emergence of Linear Analogies in Word Embeddings

Title: On the Emergence of Linear Analogies in Word Embeddings

Zur Entstehung linearer Analogien in Word-Embeddings

单线模拟在文字嵌入中的出现 2505.18651v1

Authors: Daniel J. Korchinski, Dhruva Karkada, Yasaman Bahri, Matthieu Wyart

Models such as Word2Vec and GloVe construct word embeddings based on the co-occurrence probability $P(i,j)$ of words $i$ and $j$ in text corpora. The resulting vectors $W_i$ not only group semantically similar words but also exhibit a striking linear analogy structure – for example, $W_{\text{king}} - W_{\text{man}} + W_{\text{woman}} \approx W_{\text{queen}}$ – whose theoretical origin remains unclear. Previous observations indicate that this analogy structure: (i) already emerges in the top eigenvectors of the matrix $M(i,j) = P(i,j)/P(i)P(j)$, (ii) strengthens and then saturates as more eigenvectors of $M (i, j)$, which controls the dimension of the embeddings, are included, (iii) is enhanced when using $\log M(i,j)$ rather than $M(i,j)$, and (iv) persists even when all word pairs involved in a specific analogy relation (e.g., king-queen, man-woman) are removed from the corpus. To explain these phenomena, we introduce a theoretical generative model in which words are defined by binary semantic attributes, and co-occurrence probabilities are derived from attribute-based interactions. This model analytically reproduces the emergence of linear analogy structure and naturally accounts for properties (i)-(iv). It can be viewed as giving fine-grained resolution into the role of each additional embedding dimension. It is robust to various forms of noise and agrees well with co-occurrence statistics measured on Wikipedia and the analogy benchmark introduced by Mikolov et al.
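
The matrix in item (i) is the co-occurrence probability divided by the product of the marginals, $M(i,j) = P(i,j) / (P(i)\,P(j))$. A minimal Python sketch of how it could be computed from a symmetric table of co-occurrence counts follows; the toy counts and function name are assumptions, and the eigenvector analysis itself is not reproduced here.

```python
def cooccurrence_ratio(counts):
    """Return M with M[i][j] = P(i, j) / (P(i) * P(j)) from a count matrix."""
    total = sum(sum(row) for row in counts)
    joint = [[c / total for c in row] for row in counts]
    marginal = [sum(row) for row in joint]  # P(i) as row sums of P(i, j)
    n = len(counts)
    return [
        [joint[i][j] / (marginal[i] * marginal[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical counts: words that co-occur independently give M = 1 everywhere.
independent = [[9, 3], [3, 1]]
# Hypothetical counts with structure: each word co-occurs more with itself.
structured = [[4, 2], [2, 2]]

print(cooccurrence_ratio(independent))
print(cooccurrence_ratio(structured))
```

Entries above 1 mark pairs that co-occur more often than chance and entries below 1 mark pairs that co-occur less often; taking the logarithm, as in item (iii), turns this ratio into pointwise mutual information.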

Article 1856

Title@2025-05-24 (6): Flow Matching for Geometric Trajectory Simulation

Title: Flow Matching for Geometric Trajectory Simulation

Flow Matching für geometrische Trajektoriensimulation

几何轨迹模拟流程匹配 2505.18647v1

Authors: Kiet Bennema ten Brinke, Koen Minartz, Vlado Menkovski

The simulation of N-body systems is a fundamental problem with applications in a wide range of fields, such as molecular dynamics, biochemistry, and pedestrian dynamics. Machine learning has become an invaluable tool for scaling physics-based simulators and developing models directly from experimental data. In particular, recent advances based on deep generative modeling and geometric deep learning have enabled probabilistic simulation by modeling complex distributions over trajectories while respecting the permutation symmetry that is fundamental to N-body systems. However, to generate realistic trajectories, existing methods must learn complex transformations starting from uninformed noise and do not allow for the exploitation of domain-informed priors. In this work, we propose STFlow to address this limitation. By leveraging flow matching and data-dependent couplings, STFlow facilitates physics-informed simulation of geometric trajectories without sacrificing model expressivity or scalability. Our evaluation on N-body dynamical systems, molecular dynamics, and pedestrian dynamics benchmarks shows that STFlow produces significantly lower prediction errors while enabling more efficient inference, highlighting the benefits of employing physics-informed prior distributions in probabilistic geometric trajectory modeling.

Article 1857

Title@2025-05-24 (6): Randomized Midpoint Method for Log-Concave Sampling under Constraints

Title: Randomized Midpoint Method for Log-Concave Sampling under Constraints

Randomisierte Midpoint-Methode für Log-Concave-Sampling unter Einschränkungen

制约下对日志集点取样的随机中点方法 2405.15379v2

Authors: Yifeng Yu, Lu Yu

In this paper, we study the problem of sampling from log-concave distributions supported on convex, compact sets, with a particular focus on the randomized midpoint discretization of both vanilla and kinetic Langevin diffusions in this constrained setting. We propose a unified proximal framework for handling constraints via a broad class of projection operators, including Euclidean, Bregman, and Gauge projections. Within this framework, we establish non-asymptotic bounds in both $\mathcal{W}_1$ and $\mathcal{W}_2$ distances, providing precise complexity guarantees and performance comparisons. In addition, our analysis leads to sharper convergence guarantees for both vanilla and kinetic Langevin Monte Carlo under constraints, improving upon existing theoretical results.
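
As a rough illustration of the kind of scheme discussed above, the following sketch applies a randomized midpoint discretization of the vanilla (overdamped) Langevin diffusion to a one-dimensional target restricted to the interval $[-1, 1]$, using a Euclidean projection after each update. The potential `grad_f`, the step size, and the choice to project both the midpoint and the final iterate are assumptions made for the example; the paper's proximal framework and its Bregman and Gauge projections are not reproduced here.

```python
import math
import random


def project_interval(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the convex, compact set [lo, hi]."""
    return max(lo, min(hi, x))


def grad_f(x):
    """Gradient of the assumed potential f(x) = x^2 / 2 (log-concave target)."""
    return x


def randomized_midpoint_step(x, h, rng):
    """One projected randomized-midpoint step for dX = -grad f(X) dt + sqrt(2) dB."""
    alpha = rng.random()  # random fraction of the step at which the gradient is evaluated
    # Brownian increments over [0, alpha*h] and [alpha*h, h]; they are independent.
    b_mid = math.sqrt(alpha * h) * rng.gauss(0.0, 1.0)
    b_rest = math.sqrt((1.0 - alpha) * h) * rng.gauss(0.0, 1.0)
    # Estimate the state at time alpha*h, then use its gradient for the full step.
    x_mid = project_interval(x - alpha * h * grad_f(x) + math.sqrt(2.0) * b_mid)
    x_next = x - h * grad_f(x_mid) + math.sqrt(2.0) * (b_mid + b_rest)
    return project_interval(x_next)


rng = random.Random(0)
x = 0.0
samples = []
for _ in range(1000):
    x = randomized_midpoint_step(x, h=0.05, rng=rng)
    samples.append(x)
```

The same Brownian path is reused for the midpoint and the full step, which is what distinguishes this scheme from simply evaluating the gradient at an independent random point.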

Article 1858

Title@2025-05-24 (6): STaRFormer: Semi-Supervised Task-Informed Representation Learning via Dynamic Attention-Based Regional Masking for Sequential Data

Title: STaRFormer: Semi-Supervised Task-Informed Representation Learning via Dynamic Attention-Based Regional Masking for Sequential Data

StaRFormer: Halbüberwachtes Task-Informiertes Representation-Lernen über dynamisches, aufmerksamkeitsbasiertes regionales Masking für sequentielle Daten

STARFormer:通过动态关注-基于关注的区域按顺序数据区域掩码,进行半超常任务化代表性学习 2504.10097v2

Authors: Maximilian Forstenhäusler, Daniel Külzer, Christos Anagnostopoulos, Shameem Puthiya Parambath, Natascha Weber

Accurate predictions using sequential spatiotemporal data are crucial for various applications. Utilizing real-world data, we aim to learn the intent of a smart device user within confined areas of a vehicle’s surroundings. However, in real-world scenarios, environmental factors and sensor limitations result in non-stationary and irregularly sampled data, posing significant challenges. To address these issues, we developed a Transformer-based approach, STaRFormer, which serves as a universal framework for sequential modeling. STaRFormer employs a novel, dynamic attention-based regional masking scheme combined with semi-supervised contrastive learning to enhance task-specific latent representations. Comprehensive experiments on 15 datasets varying in types (including non-stationary and irregularly sampled), domains, sequence lengths, training samples, and applications, demonstrate the efficacy and practicality of STaRFormer. We achieve notable improvements over state-of-the-art approaches. Code and data will be made available.

Article 1859

Title@2025-05-24 (6): ThanoRA: Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation

Title: ThanoRA: Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation

ThanoRA: Aufgabe Heterogenität bewusst Multi-Task Low-Rank-Anpassung

塔诺拉:任务差异性-软件多功能、多任务、低风险适应 2505.18640v1

Authors: Jian Liang, Wenke Huang, Xianda Guo, Guancheng Wan, Bo Du, Mang Ye

Low-Rank Adaptation (LoRA) is widely adopted for downstream fine-tuning of foundation models due to its efficiency and zero additional inference cost. Many real-world applications require foundation models to specialize in multiple tasks simultaneously, motivating the need for efficient multi-task adaptation. While recent approaches integrate LoRA with mixture-of-experts (MoE) to address this, the use of routers prevents parameter mergeability, which increases inference overhead and hinders unified multi-task adaptation, thereby limiting deployment practicality. In this work, we propose ThanoRA, a Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation framework that enables multi-task adaptation while preserving the inference efficiency of LoRA. ThanoRA jointly models task heterogeneity and mitigates subspace interference throughout training. Specifically, motivated by inherent differences in complexity and heterogeneity across tasks, ThanoRA constructs task-specific LoRA subspaces at initialization, enabling fine-grained knowledge injection aligned with task heterogeneity. Furthermore, to prevent task interference and subspace collapse during multi-task training, ThanoRA introduces a subspace-preserving regularization that maintains the independence of task-specific representations. With the synergy of both components, ThanoRA enables efficient and unified multi-task adaptation. Extensive experiments across multimodal and text-only benchmarks under varying multi-task mixtures demonstrate that ThanoRA consistently achieves robust and superior performance over strong baselines without introducing additional inference overhead. Our code is publicly available at: https://github.com/LiangJian24/ThanoRA.
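
The mergeability property mentioned above can be sketched in a few lines: a standard LoRA update $BA$ can be folded into the frozen weight $W$ once, so the adapted layer costs the same as the original at inference time. The matrices below are made-up toy values, and the sketch shows plain LoRA merging only, not ThanoRA's subspace construction or regularization.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Toy values (assumptions for illustration): a 2x3 frozen weight and a rank-1 update.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # frozen pretrained weight
B = [[0.5], [-0.5]]                      # LoRA "up" factor, shape 2x1
A = [[1.0, 2.0, 3.0]]                    # LoRA "down" factor, shape 1x3
x = [[1.0], [1.0], [1.0]]                # input column vector

# Unmerged forward pass: two extra matrix products per layer.
unmerged = add(matmul(W, x), matmul(B, matmul(A, x)))

# Merged forward pass: fold B @ A into W once, then a single product.
W_merged = add(W, matmul(B, A))
merged = matmul(W_merged, x)
```

A router that picks different adapters per input, as in MoE-style LoRA, prevents this one-time folding because the effective update then depends on the input.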

Article 1860

Title@2025-05-24 (6): Graph-Supported Dynamic Algorithm Configuration for Multi-Objective Combinatorial Optimization

Title: Graph-Supported Dynamic Algorithm Configuration for Multi-Objective Combinatorial Optimization

Graphunterstützte dynamische Algorithmenkonfiguration für multi-objektive Kombinator-Optimierung

多目标组合优化多目标组合优化支持的图形支持动态算法配置 2505.16471v2

Authors: Robbert Reijnen, Yaoxin Wu, Zaharah Bukhsh, Yingqian Zhang

Deep reinforcement learning (DRL) has been widely used for dynamic algorithm configuration, particularly in evolutionary computation, which benefits from the adaptive update of parameters during the algorithmic execution. However, applying DRL to algorithm configuration for multi-objective combinatorial optimization (MOCO) problems remains relatively unexplored. This paper presents a novel graph neural network (GNN) based DRL to configure multi-objective evolutionary algorithms. We model the dynamic algorithm configuration as a Markov decision process, representing the convergence of solutions in the objective space by a graph, with their embeddings learned by a GNN to enhance the state representation. Experiments on diverse MOCO challenges indicate that our method outperforms traditional and DRL-based algorithm configuration methods in terms of efficacy and adaptability. It also exhibits advantageous generalizability across objective types and problem sizes, and applicability to different evolutionary computation methods.

Article 1861

Title@2025-05-24 (6): DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection

Title: DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection

DitHub: Modulares Framework zur inkrementellen Open-Vocabulary-Objekterkennung

DitHub: 递增开放词汇物体探测模块框架 2503.09271v2

Authors: Chiara Cappellino, Gianluca Mancusi, Matteo Mosconi, Angelo Porrello, Simone Calderara, Rita Cucchiara

Open-Vocabulary object detectors can generalize to an unrestricted set of categories through simple textual prompting. However, adapting these models to rare classes or reinforcing their abilities on multiple specialized domains remains essential. While recent methods rely on monolithic adaptation strategies with a single set of weights, we embrace modular deep learning. We introduce DitHub, a framework designed to build and maintain a library of efficient adaptation modules. Inspired by Version Control Systems, DitHub manages expert modules as branches that can be fetched and merged as needed. This modular approach allows us to conduct an in-depth exploration of the compositional properties of adaptation modules, marking the first such study in Object Detection. Our method achieves state-of-the-art performance on the ODinW-13 benchmark and ODinW-O, a newly introduced benchmark designed to assess class reappearance. For more details, visit our project page: https://aimagelab.github.io/DitHub/

Article 1862

Title@2025-05-24 (6): Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees

Title: Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees

Multi-Step Alignment als Markov Games: Ein optimaler Online-Gradient-Abstieg mit Konvergenzgarantien

作为Markov运动会的多步对齐:带有一致保障的乐观的在线逐渐递增人种方法 2502.12678v2

Authors: Yongtao Wu, Luca Viano, Yihang Chen, Zhenyu Zhu, Kimon Antonakopoulos, Quanquan Gu, Volkan Cevher

Reinforcement Learning from Human Feedback (RLHF) has been highly successful in aligning large language models with human preferences. While prevalent methods like DPO have demonstrated strong performance, they frame interactions with the language model as a bandit problem, which limits their applicability in real-world scenarios where multi-turn conversations are common. Additionally, DPO relies on the Bradley-Terry model assumption, which does not adequately capture the non-transitive nature of human preferences. In this paper, we address these challenges by modeling the alignment problem as a two-player constant-sum Markov game, where each player seeks to maximize their winning rate against the other across all steps of the conversation. Our approach, Optimistic Multi-step Preference Optimization (OMPO), is built upon the optimistic online mirror descent algorithm \citep{rakhlin2013online,joulani17a}. Theoretically, we provide a rigorous analysis of the convergence of OMPO and show that OMPO requires $\mathcal{O}(\epsilon^{-1})$ policy updates to converge to an $\epsilon$-approximate Nash equilibrium. We also validate the effectiveness of our method on a multi-turn conversation dataset and a math reasoning dataset.
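
For intuition about the optimistic update underlying this family of methods, the sketch below runs optimistic multiplicative weights (optimistic mirror descent with an entropy regularizer) on rock-paper-scissors, a small constant-sum game with non-transitive preferences. The payoff matrix, starting strategies, step size, and iteration count are assumptions chosen for the example; this is a single-step matrix game, not OMPO or its multi-step Markov game formulation.

```python
import math

# Row player's payoff in rock-paper-scissors: no action beats all others,
# which mirrors the non-transitive preferences discussed above.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def optimistic_mwu(steps=5000, eta=0.2):
    """Run optimistic multiplicative weights for both players; return final strategies."""
    x = normalize([0.6, 0.3, 0.1])  # row player's mixed strategy (assumed start)
    y = normalize([0.2, 0.2, 0.6])  # column player's mixed strategy (assumed start)
    gx_prev = [0.0, 0.0, 0.0]
    gy_prev = [0.0, 0.0, 0.0]
    for _ in range(steps):
        # Utility gradients: the row player maximizes x^T P y, the column player minimizes it.
        gx = [sum(PAYOFF[i][j] * y[j] for j in range(3)) for i in range(3)]
        gy = [-sum(PAYOFF[i][j] * x[i] for i in range(3)) for j in range(3)]
        # Optimistic step: 2*g_t - g_{t-1} uses the latest gradient as a prediction
        # of the next one and corrects the previous prediction.
        x = normalize([x[i] * math.exp(eta * (2 * gx[i] - gx_prev[i])) for i in range(3)])
        y = normalize([y[j] * math.exp(eta * (2 * gy[j] - gy_prev[j])) for j in range(3)])
        gx_prev, gy_prev = gx, gy
    return x, y


x_final, y_final = optimistic_mwu()
```

With these settings both strategies should drift toward the uniform mixture, the unique Nash equilibrium of this game; plain multiplicative weights without the correction term tends to cycle instead.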

Article 1863

Title@2025-05-24 (6): Leveraging Structural Knowledge in Diffusion Models for Source Localization in Data-Limited Graph Scenarios

Title: Leveraging Structural Knowledge in Diffusion Models for Source Localization in Data-Limited Graph Scenarios

Nutzung struktureller Kenntnisse in Diffusionsmodellen für die Quellenlokalisierung in datenbeschränkten Graphenszenarien

利用传播模型中的结构性知识,在数据限制的图表假设情景中实现源本地化 2502.17928v2

Authors: Hongyi Chen, Jingtao Ding, Xiaojun Liang, Yong Li, Xiao-Ping Zhang

The source localization problem in graph information propagation is crucial for managing various network disruptions, from misinformation spread to infrastructure failures. While recent deep generative approaches have shown promise in this domain, their effectiveness is limited by the scarcity of real-world propagation data. This paper introduces SIDSL (Structure-prior Informed Diffusion model for Source Localization), a novel framework that addresses three key challenges in limited-data scenarios: unknown propagation patterns, complex topology-propagation relationships, and class imbalance between source and non-source nodes. SIDSL incorporates topology-aware priors through graph label propagation and employs a propagation-enhanced conditional denoiser with a GNN-parameterized label propagation module (GNN-LP). Additionally, we propose a structure-prior biased denoising scheme that initializes from structure-based source estimations rather than random noise, effectively countering class imbalance issues. Experimental results across four real-world datasets demonstrate SIDSL’s superior performance, achieving 7.5-13.3% improvements in F1 scores compared to state-of-the-art methods. Notably, when pretrained with simulation data of synthetic patterns, SIDSL maintains robust performance with only 10% of training data, surpassing baselines by more than 18.8%. These results highlight SIDSL’s effectiveness in real-world applications where labeled data is scarce.

Article 1864

Title@2025-05-24 (6): Asymmetric Duos: Sidekicks Improve Uncertainty

Title: Asymmetric Duos: Sidekicks Improve Uncertainty

Asymmetrische Duos: Sidekicks verbessern Unsicherheit

非对称 Duos: 侧边icks 改善不确定性 2505.18636v1

Authors: Tim G. Zhou, Evan Shelhamer, Geoff Pleiss

The go-to strategy for applying deep networks in settings where uncertainty informs decisions, ensembling multiple training runs with random initializations, is ill-suited for the extremely large-scale models and practical fine-tuning workflows of today. We introduce a new cost-effective strategy for improving the uncertainty quantification and downstream decisions of a large model (e.g. a fine-tuned ViT-B): coupling it with a less accurate but much smaller “sidekick” (e.g. a fine-tuned ResNet-34) with a fraction of the computational cost. We propose aggregating the predictions of this Asymmetric Duo by simple learned weighted averaging. Surprisingly, despite their inherent asymmetry, the sidekick model almost never harms the performance of the larger model. In fact, across five image classification benchmarks and a variety of model architectures and training schemes (including soups), Asymmetric Duos significantly improve accuracy, uncertainty quantification, and selective classification metrics with only about 10-20% more computation.

Article 1865

Title@2025-05-24 (6): You Can Wash Hands Better: Accurate Daily Handwashing Assessment with a Smartwatch

Title: You Can Wash Hands Better: Accurate Daily Handwashing Assessment with a Smartwatch

Sie können Hände besser waschen: Genaue tägliche Handwäsche Bewertung mit einer Smartwatch

你可以更好地洗手:用智能观察准确进行每日洗手评估 2112.06657v5

Authors: Fei Wang, Tingting Zhang, Xilei Wu, Pengcheng Wang, Xin Wang, Han Ding, Jingang Shi, Jinsong Han, Dong Huang

Hand hygiene is among the most effective daily practices for preventing infectious diseases such as influenza, malaria, and skin infections. While professional guidelines emphasize proper handwashing to reduce the risk of viral infections, surveys reveal that adherence to these recommendations remains low. To address this gap, we propose UWash, a wearable solution leveraging smartwatches to evaluate handwashing procedures, aiming to raise awareness and cultivate high-quality handwashing habits. We frame the task of handwashing assessment as an action segmentation problem, similar to those in computer vision, and introduce a simple yet efficient two-stream UNet-like network to achieve this goal. Experiments involving 51 subjects demonstrate that UWash achieves 92.27% accuracy in handwashing gesture recognition, an error of <0.5 seconds in onset/offset detection, and an error of <5 points in gesture scoring under user-dependent settings. The system also performs robustly in user-independent and user-independent-location-independent evaluations. Remarkably, UWash maintains high performance in real-world tests, including evaluations with 10 random passersby at a hospital 9 months later and 10 passersby in an in-the-wild test conducted 2 years later. UWash is the first system to score handwashing quality based on gesture sequences, offering actionable guidance for improving daily hand hygiene. The code and dataset are publicly available at https://github.com/aiotgroup/UWash

Article 1866

Title@2025-05-24 (6): Think Before You Accept: Semantic Reflective Verification for Faster Speculative Decoding

Title: Think Before You Accept: Semantic Reflective Verification for Faster Speculative Decoding

Denken Sie, bevor Sie akzeptieren: Semantische Reflektierende Verifizierung für schnellere spekulative Dekodierung

在你接受之前先想想: 快速投机代号的语义反省校验 2505.18629v1

Authors: Yixuan Wang, Yijun Liu, Shiyu Ji, Yuzhuang Xu, Yang Xu, Qingfu Zhu, Wanxiang Che

Large language models (LLMs) suffer from high inference latency due to the auto-regressive decoding process. Speculative decoding accelerates inference by generating multiple draft tokens using a lightweight model and verifying them in parallel. However, existing verification methods rely heavily on distributional consistency while overlooking semantic correctness, thereby limiting the potential speedup of speculative decoding. While some methods employ additional models for relaxed verification of draft tokens, they often fail to generalize effectively to more diverse or open-domain settings. In this work, we propose Reflective Verification, a training-free and semantics-aware approach that achieves a better trade-off between correctness and efficiency. Specifically, we leverage the inherent reflective capacity of LLMs to semantically assess the correctness of draft tokens in parallel during verification. Using prompt-based probing, we obtain both the original and reflective distributions of draft tokens in a single forward pass. The fusion of these distributions enables semantic-level verification of draft tokens that incorporates both consistency and correctness. Experiments across multiple domain benchmarks and model scales demonstrate that our method significantly increases the acceptance length of draft tokens without compromising model performance. Furthermore, we find that the proposed Reflective Verification is orthogonal to existing statistical verification methods, and their combination yields an additional 5-15% improvement in decoding speed.
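
For context, the distribution-based verification that the abstract contrasts itself with can be sketched as the usual speculative sampling acceptance test, where a draft token is accepted with probability $\min(1, p/q)$ for target probability $p$ and draft probability $q$. The function names and the per-position probability lists are assumptions for this example, the resampling step that follows a rejection is omitted, and nothing here reproduces the reflective distribution or the fusion rule of the proposed method.

```python
def acceptance_probability(p_target, q_draft):
    """Probability of accepting a draft token under standard speculative sampling."""
    return min(1.0, p_target / q_draft)


def accepted_prefix_length(p_target, q_draft, uniforms):
    """Count how many leading draft tokens pass verification.

    p_target[k] and q_draft[k] are the target- and draft-model probabilities of the
    k-th draft token; uniforms[k] is a pre-drawn Uniform(0, 1) sample, so the
    function is deterministic and easy to test.
    """
    accepted = 0
    for p, q, u in zip(p_target, q_draft, uniforms):
        if u < acceptance_probability(p, q):
            accepted += 1
        else:
            break  # the first rejection discards the rest of the draft
    return accepted


# Hypothetical probabilities for three draft tokens.
p_target = [0.5, 0.1, 0.4]
q_draft = [0.4, 0.5, 0.4]
all_accepted = accepted_prefix_length(p_target, q_draft, uniforms=[0.9, 0.1, 0.5])
one_accepted = accepted_prefix_length(p_target, q_draft, uniforms=[0.9, 0.3, 0.5])
```

The second token is the bottleneck here: the target model gives it only a fifth of the draft model's probability, so it survives only when its uniform draw falls below 0.2.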

Article 1867

Title@2025-05-24 (6): HARP: Hesitation-Aware Reframing in Transformer Inference Pass

Title: HARP: Hesitation-Aware Reframing in Transformer Inference Pass

HARP: Hezitation-Aware Reframing in Transformer Inferenz Pass

HARP: 变压器推断通过中的偏移-软件重新配置 2412.07282v2

Authors: Romain Storaï, Seung-won Hwang

This paper aims to improve the performance of large language models by addressing the variable computational demands in inference steps, where some tokens require more computational resources than others. We present HARP, a simple modification to the “off-the-shelf” Transformer forward pass. Drawing from hesitation and the framing effect in decision-making, HARP selectively applies additional computation when the model encounters uncertainty during token generation. Our method mimics human cognitive processes by pausing at difficult decision points and reframing inputs for a different perspective. Unlike other approaches, HARP is model-agnostic, training-free, and easy to implement. We evaluate our method across various downstream tasks and model sizes, demonstrating performance improvements of up to +5.16%. Notably, HARP achieves these gains while keeping inference twice as fast as beam search. Simple yet effective, HARP provides insights into the potential of adaptive computation for enhancing the performance of Transformer-based language models.

Article 1868

Title@2025-05-24 (6): QUCE: The Minimisation and Quantification of Path-Based Uncertainty for Generative Counterfactual Explanations

Title: QUCE: The Minimisation and Quantification of Path-Based Uncertainty for Generative Counterfactual Explanations

QUCE: Die Minimierung und Quantifizierung pfadbasierter Unsicherheiten für generative gegenfaktische Erklärungen

QUCE: 产生反事实解释的路径不确定性的最小化和量化 2402.17516v5

Authors: Jamie Duell, Monika Seisenberger, Hsuan Fu, Xiuyi Fan

Deep Neural Networks (DNNs) stand out as one of the most prominent approaches within the Machine Learning (ML) domain. The efficacy of DNNs has surged alongside recent increases in computational capacity, allowing these approaches to scale to significant complexities for addressing predictive challenges in big data. However, as the complexity of DNN models rises, interpretability diminishes. In response to this challenge, explainable models such as Adversarial Gradient Integration (AGI) leverage path-based gradients provided by DNNs to elucidate their decisions. Yet the performance of path-based explainers can be compromised when gradients exhibit irregularities during out-of-distribution path traversal. In this context, we introduce Quantified Uncertainty Counterfactual Explanations (QUCE), a method designed to mitigate out-of-distribution traversal by minimizing path uncertainty. QUCE not only quantifies uncertainty when presenting explanations but also generates more certain counterfactual examples. We showcase the performance of the QUCE method by comparing it with competing methods for both path-based explanations and generative counterfactual examples.

Article 1869

Title@2025-05-24 (6): Mind The Gap: Deep Learning Doesn’t Learn Deeply

Title: Mind The Gap: Deep Learning Doesn’t Learn Deeply

Mind The Gap: Deep Learning lernt nicht tief

思想差距:深学习不深入学习 2505.18623v1

Authors: Lucas Saldyt, Subbarao Kambhampati

This paper aims to understand how neural networks learn algorithmic reasoning by addressing two questions: How faithful are learned algorithms when they are effective, and why do neural networks fail to learn effective algorithms otherwise? To answer these questions, we use neural compilation, a technique that directly encodes a source algorithm into neural network parameters, enabling the network to compute the algorithm exactly. This enables comparison between compiled and conventionally learned parameters, intermediate vectors, and behaviors. This investigation is crucial for developing neural networks that robustly learn complex algorithms from data. Our analysis focuses on graph neural networks (GNNs), which are naturally aligned with algorithmic reasoning tasks, specifically our choices of BFS, DFS, and Bellman-Ford, which cover the spectrum of effective, faithful, and ineffective learned algorithms. Commonly, learning algorithmic reasoning is framed as induction over synthetic data, where a parameterized model is trained on inputs, traces, and outputs produced by an underlying ground truth algorithm. In contrast, we introduce a neural compilation method for GNNs, which sets network parameters analytically, bypassing training. Focusing on GNNs leverages their alignment with algorithmic reasoning, extensive algorithmic induction literature, and the novel application of neural compilation to GNNs. Overall, this paper aims to characterize expressability-trainability gaps, a fundamental shortcoming in learning algorithmic reasoning. We hypothesize that inductive learning is most effective for parallel algorithms contained within the computational class NC.

Article 1870

Title@2025-05-24 (6): Trust, or Don’t Predict: Introducing the CWSA Family for Confidence-Aware Model Evaluation

Title: Trust, or Don’t Predict: Introducing the CWSA Family for Confidence-Aware Model Evaluation

Vertrauen oder nicht voraussagen: Einführung der CWSA-Familie für vertrauensbewusste Modellbewertung

信任或不要预测:介绍CWSA家庭促进信任-了解模型评价 2505.18622v1

Authors: Kourosh Shahnazari, Seyed Moein Ayyoubzadeh, Mohammadali Keshtparvar, Pegah Ghaffari

In recent machine learning systems, confidence scores are increasingly used to manage selective prediction, whereby a model can abstain from making a prediction when it is unconfident. Yet, conventional metrics like accuracy, expected calibration error (ECE), and area under the risk-coverage curve (AURC) do not capture the actual reliability of predictions. These metrics either disregard confidence entirely, dilute valuable localized information through averaging, or neglect to suitably penalize overconfident misclassifications, which can be particularly detrimental in real-world systems. We introduce two new metrics, Confidence-Weighted Selective Accuracy (CWSA) and its normalized variant CWSA+, that offer a principled and interpretable way to evaluate predictive models under confidence thresholds. Unlike existing methods, our metrics explicitly reward confident accuracy and penalize overconfident mistakes. They are threshold-local, decomposable, and usable in both evaluation and deployment settings where trust and risk must be quantified. Through exhaustive experiments on both real-world data sets (MNIST, CIFAR-10) and artificial model variants (calibrated, overconfident, underconfident, random, perfect), we show that CWSA and CWSA+ both effectively detect nuanced failure modes and outperform classical metrics in trust-sensitive tests. Our results confirm that CWSA is a sound basis for developing and assessing selective prediction systems for safety-critical domains.
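
The selective prediction setting described above can be illustrated with the plain selective accuracy and coverage at a confidence threshold. The abstract does not give the CWSA or CWSA+ formulas, so the sketch below deliberately shows only this conventional baseline; the function name and the sample data are assumptions for the example.

```python
def selective_accuracy(confidences, correct, threshold):
    """Accuracy on predictions kept at the threshold, plus the fraction kept (coverage).

    This is the conventional selective metric, not CWSA: every kept prediction
    counts equally, regardless of how confident the model was.
    """
    kept = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
    coverage = len(kept) / len(confidences)
    accuracy = sum(kept) / len(kept) if kept else None  # undefined if the model always abstains
    return accuracy, coverage


# Hypothetical predictions: the last one is wrong despite 0.99 confidence.
confidences = [0.95, 0.90, 0.60, 0.55, 0.99]
correct = [True, True, False, True, False]
accuracy, coverage = selective_accuracy(confidences, correct, threshold=0.8)
```

In this example the mistake made at 0.99 confidence costs exactly as much as a mistake made at 0.80 would, which is the kind of insensitivity to overconfidence that the abstract argues against.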

Article 1871

Title@2025-05-24 (6): Neural Solver Selection for Combinatorial Optimization

Title: Neural Solver Selection for Combinatorial Optimization

Neural Solver Selection zur kombinatorischen Optimierung

组合优化的神经溶剂选择 2410.09693v2

Authors: Chengrui Gao, Haopu Shang, Ke Xue, Chao Qian

Machine learning has increasingly been employed to solve NP-hard combinatorial optimization problems, resulting in the emergence of neural solvers that demonstrate remarkable performance, even with minimal domain-specific knowledge. To date, the community has created numerous open-source neural solvers with distinct motivations and inductive biases. While considerable efforts are devoted to designing powerful single solvers, our findings reveal that existing solvers typically demonstrate complementary performance across different problem instances. This suggests that significant improvements could be achieved through effective coordination of neural solvers at the instance level. In this work, we propose the first general framework to coordinate the neural solvers, which involves feature extraction, selection model, and selection strategy, aiming to allocate each instance to the most suitable solvers. To instantiate, we collect several typical neural solvers with state-of-the-art performance as alternatives, and explore various methods for each component of the framework. We evaluated our framework on two extensively studied combinatorial optimization problems, Traveling Salesman Problem (TSP) and Capacitated Vehicle Routing Problem (CVRP). Experimental results show that the proposed framework can effectively distribute instances and the resulting composite solver can achieve significantly better performance (e.g., reduce the optimality gap by 0.88% on TSPLIB and 0.71% on CVRPLIB) than the best individual neural solver with little extra time cost.

Article 1872

Title@2025-05-24 (6): Federated Class-Incremental Learning with Hierarchical Generative Prototypes

Title: Federated Class-Incremental Learning with Hierarchical Generative Prototypes

Föderiertes Klassen-Inkrementelles Lernen mit Hierarchischen Generativen Prototypen

具有等级制起源原型的联邦高级高等程度学习 2406.02447v4

Authors: Riccardo Salami, Pietro Buzzega, Matteo Mosconi, Mattia Verasani, Simone Calderara

Federated Learning (FL) aims at unburdening the training of deep models by distributing computation across multiple devices (clients) while safeguarding data privacy. On top of that, Federated Continual Learning (FCL) also accounts for data distribution evolving over time, mirroring the dynamic nature of real-world environments. While previous studies have identified Catastrophic Forgetting and Client Drift as primary causes of performance degradation in FCL, we shed light on the importance of Incremental Bias and Federated Bias, which cause models to prioritize classes that are recently introduced or locally predominant, respectively. Our proposal constrains both biases in the last layer by efficiently finetuning a pre-trained backbone using learnable prompts, resulting in clients that produce less biased representations and more biased classifiers. Therefore, instead of solely relying on parameter aggregation, we leverage generative prototypes to effectively balance the predictions of the global model. Our method significantly improves the current State Of The Art, providing an average increase of +7.8% in accuracy.

Article 1873

Title@2025-05-24 (6): MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation

Title: MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation

MAVL: Ein mehrsprachiger Audio-Video-Text Datensatz für animierte Song-Übersetzung

MAVL: 动动歌曲翻译多语种视听歌词数据集 2505.18614v1

Authors: Woohyun Cho, Youngmin Kim, Sunghyun Lee, Youngjae Yu

Lyrics translation requires both accurate semantic transfer and preservation of musical rhythm, syllabic structure, and poetic style. In animated musicals, the challenge intensifies due to alignment with visual and auditory cues. We introduce Multilingual Audio-Video Lyrics Benchmark for Animated Song Translation (MAVL), the first multilingual, multimodal benchmark for singable lyrics translation. By integrating text, audio, and video, MAVL enables richer and more expressive translations than text-only approaches. Building on this, we propose Syllable-Constrained Audio-Video LLM with Chain-of-Thought (SylAVL-CoT), which leverages audio-video cues and enforces syllabic constraints to produce natural-sounding lyrics. Experimental results demonstrate that SylAVL-CoT significantly outperforms text-based models in singability and contextual accuracy, emphasizing the value of multimodal, multilingual approaches for lyrics translation.

Article 1874

Title@2025-05-24 (6): MLRan: A Behavioural Dataset for Ransomware Analysis and Detection

Title: MLRan: A Behavioural Dataset for Ransomware Analysis and Detection

MLRan: Ein Verhaltensdatensatz für Ransomware Analyse und Erkennung

MLran:用于分析和探测Ransomware 分析和探测的行为数据集 2505.18613v1

Authors: Faithful Chiagoziem Onwuegbuche, Adelodun Olaoluwa, Anca Delia Jurcut, Liliana Pasquale

Ransomware remains a critical threat to cybersecurity, yet publicly available datasets for training machine learning-based ransomware detection models are scarce and often have limited sample size, diversity, and reproducibility. In this paper, we introduce MLRan, a behavioural ransomware dataset, comprising over 4,800 samples across 64 ransomware families and a balanced set of goodware samples. The samples span from 2006 to 2024 and encompass the four major types of ransomware: locker, crypto, ransomware-as-a-service, and modern variants. We also propose guidelines (GUIDE-MLRan), inspired by previous work, for constructing high-quality behavioural ransomware datasets, which informed the curation of our dataset. We evaluated the ransomware detection performance of several machine learning (ML) models using MLRan. For this purpose, we performed feature selection by conducting mutual information filtering to reduce the initial 6.4 million features to 24,162, followed by recursive feature elimination, yielding 483 highly informative features. The ML models achieved an accuracy, precision and recall of up to 98.7%, 98.9%, 98.5%, respectively. Using SHAP and LIME, we identified critical indicators of malicious behaviour, including registry tampering, strings, and API misuse. The dataset and source code for feature extraction, selection, ML training, and evaluation are available publicly to support replicability and encourage future research, which can be found at https://github.com/faithfulco/mlran.

Article 1875

Title@2025-05-24 (6): An Artificial Intelligence Model for Early Stage Breast Cancer Detection from Biopsy Images

Title: An Artificial Intelligence Model for Early Stage Breast Cancer Detection from Biopsy Images

Ein Modell der Künstlichen Intelligenz zur Früherkennung von Brustkrebs aus Biopsiebildern

早期从生物心理图像中检测乳腺癌的人工智能模型 2505.20332v1

Authors: Neil Chaudhary, Zaynah Dhunny

Accurate identification of breast cancer types plays a critical role in guiding treatment decisions and improving patient outcomes. This paper presents an artificial intelligence enabled tool designed to aid in the identification of breast cancer types using histopathological biopsy images. Traditionally, women diagnosed with breast cancer must undergo additional tests to determine the type of cancer before the appropriate treatment can be chosen. Those tests are not only invasive but also delay the initiation of treatment and increase patient burden. The proposed model utilizes a convolutional neural network (CNN) architecture to distinguish between benign and malignant tissues and to accurately subclassify breast cancer types. By preprocessing the images to reduce noise and enhance features, the model achieves reliable levels of classification performance. Experimental results on histopathological image datasets demonstrate the model’s effectiveness, outperforming several existing solutions in terms of accuracy, precision, recall, and F1-score. The study emphasizes the potential of deep learning techniques in clinical diagnostics and offers a promising tool to assist pathologists in breast cancer classification.


Article 1876

Title@2025-05-24 (6): Exemplar-Free Continual Learning for State Space Models

Title: Exemplar-Free Continual Learning for State Space Models

Beispielfreies kontinuierliches Lernen für Staatsraummodelle

国家空间模型免税免费持续学习 2505.18604v1

Authors: Isaac Ning Lee, Leila Mahmoodi, Trung Le, Mehrtash Harandi

State-Space Models (SSMs) excel at capturing long-range dependencies with structured recurrence, making them well-suited for sequence modeling. However, their evolving internal states pose challenges in adapting them under Continual Learning (CL). This is particularly difficult in exemplar-free settings, where the absence of prior data leaves updates to the dynamic SSM states unconstrained, resulting in catastrophic forgetting. To address this, we propose Inf-SSM, a novel and simple geometry-aware regularization method that utilizes the geometry of the infinite-dimensional Grassmannian to constrain state evolution during CL. Unlike classical continual learning methods that constrain weight updates, Inf-SSM regularizes the infinite-horizon evolution of SSMs encoded in their extended observability subspace. We show that enforcing this regularization requires solving a matrix equation known as the Sylvester equation, which typically incurs $\mathcal{O}(n^3)$ complexity. We develop an $\mathcal{O}(n^2)$ solution by exploiting the structure and properties of SSMs. This leads to an efficient regularization mechanism that can be seamlessly integrated into existing CL methods. Comprehensive experiments on challenging benchmarks, including ImageNet-R and Caltech-256, demonstrate a significant reduction in forgetting while improving accuracy across sequential tasks.
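
To give a feel for how structure can lower the cost of a Sylvester solve, the sketch below handles the special case A X + X B = C with diagonal A and B. The diagonal assumption is ours, made for illustration; the abstract does not say which SSM structure Inf-SSM exploits, so this should not be read as the paper's derivation.

```python
def solve_sylvester_diagonal(a, b, C):
    """Solve A X + X B = C where A = diag(a) and B = diag(b).

    Entry (i, j) of A X + X B equals (a[i] + b[j]) * X[i][j], so every
    entry can be solved on its own. For n x n matrices this is O(n^2)
    work, versus O(n^3) for a general dense solver.
    """
    X = []
    for i, ai in enumerate(a):
        row = []
        for j, bj in enumerate(b):
            denom = ai + bj
            if denom == 0:
                raise ValueError("a[i] + b[j] must be nonzero for a unique solution")
            row.append(C[i][j] / denom)
        X.append(row)
    return X


# Hypothetical 2 x 2 example.
a = [1.0, 2.0]
b = [3.0, 4.0]
C = [[4.0, 10.0], [10.0, 18.0]]
X = solve_sylvester_diagonal(a, b, C)

# Check the residual of A X + X B - C entry by entry.
residual = max(
    abs((a[i] + b[j]) * X[i][j] - C[i][j])
    for i in range(len(a))
    for j in range(len(b))
)
print(X, residual)
```

The point of the example is only that the equation decouples entrywise once both coefficient matrices are diagonal, which removes the cubic factorization step.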


Article 1877

Title@2025-05-24 (6): LLM-Meta-SR: Learning to Evolve Selection Operators for Symbolic Regression

Title: LLM-Meta-SR: Learning to Evolve Selection Operators for Symbolic Regression

LLM-Meta-SR: Lernen, Auswahloperatoren für symbolische Regression zu entwickeln

LLM-Meta-SR:学习如何向演进中的反射反射选择操作员学习 2505.18602v1

Authors: Hengzhe Zhang, Qi Chen, Bing Xue, Mengjie Zhang

Large language models (LLMs) have revolutionized algorithm development, yet their application in symbolic regression, where algorithms automatically discover symbolic expressions from data, remains limited, and the algorithms themselves are still typically designed manually by human experts. In this paper, we propose a learning-to-evolve framework that enables LLMs to automatically design selection operators for evolutionary symbolic regression algorithms. We first identify two key limitations in existing LLM-based algorithm evolution techniques: code bloat and a lack of semantic guidance. Bloat results in unnecessarily complex components, and the absence of semantic awareness can lead to ineffective exchange of useful code components; both can reduce the interpretability of the designed algorithm or hinder evolutionary learning progress. To address these issues, we enhance the LLM-based evolution framework for meta symbolic regression with two key innovations: bloat control and a complementary, semantics-aware selection operator. Additionally, we embed domain knowledge into the prompt, enabling the LLM to generate more effective and contextually relevant selection operators. Our experimental results on symbolic regression benchmarks show that LLMs can devise selection operators that outperform nine expert-designed baselines, achieving state-of-the-art performance. This demonstrates that LLMs can exceed expert-level algorithm design for symbolic regression.


Article 1878

Title@2025-05-24 (6): Learning to Program Quantum Measurements for Machine Learning

Title: Learning to Program Quantum Measurements for Machine Learning

Lernen, Quantenmessungen für maschinelles Lernen zu programmieren

学习机器学习量度方案 2505.13525v2

Authors: Samuel Yen-Chi Chen, Huan-Hsin Tseng, Hsin-Yi Lin, Shinjae Yoo

The rapid advancements in quantum computing (QC) and machine learning (ML) have sparked significant interest, driving extensive exploration of quantum machine learning (QML) algorithms to address a wide range of complex challenges. Developing high-performance QML models requires expert-level knowledge, which is a key obstacle to the widespread adoption of QML. Critical difficulties include the design of effective data encoding strategies and parameterized quantum circuits, both of which are vital for the performance of QML models. Furthermore, the measurement process is often neglected: most existing QML models employ predefined measurement schemes that may not align with the specific requirements of the targeted problem. We propose an innovative framework that renders the observable of a quantum system (specifically, the Hermitian matrix) trainable. This approach employs an end-to-end differentiable learning framework, enabling simultaneous optimization of the neural network used to program the parameterized observables and the standard quantum circuit parameters. Notably, the quantum observable parameters are dynamically programmed by the neural network, allowing the observables to adapt in real time based on the input data stream. Through numerical simulations, we demonstrate that the proposed method effectively programs observables dynamically within variational quantum circuits, achieving superior results compared to existing approaches. In particular, it delivers higher classification accuracy, thereby significantly improving the overall effectiveness of QML models.


Article 1879

Title@2025-05-24 (6): Sum of Squares Circuits

Title: Sum of Squares Circuits

Summe der Quadrate Schaltungen

平方电路总和 2408.11778v3

Authors: Lorenzo Loconte, Stefan Mengel, Antonio Vergari

Designing expressive generative models that support exact and efficient inference is a core question in probabilistic ML. Probabilistic circuits (PCs) offer a framework where this tractability-vs-expressiveness trade-off can be analyzed theoretically. Recently, squared PCs encoding subtractive mixtures via negative parameters have emerged as tractable models that can be exponentially more expressive than monotonic PCs, i.e., PCs with positive parameters only. In this paper, we provide a more precise theoretical characterization of the expressiveness relationships among these models. First, we prove that squared PCs can be less expressive than monotonic ones. Second, we formalize a novel class of PCs – sum of squares PCs – that can be exponentially more expressive than both squared and monotonic PCs. Around sum of squares PCs, we build an expressiveness hierarchy that allows us to precisely unify and separate different tractable model classes such as Born Machines and PSD models, and other recently introduced tractable probabilistic models by using complex parameters. Finally, we empirically show the effectiveness of sum of squares circuits in performing distribution estimation.


Article 1880

Title@2025-05-24 (6): LLMs for Supply Chain Management

Title: LLMs for Supply Chain Management

LLMs für Supply Chain Management

供应链管理LLMs 2505.18597v1

Authors: Haojie Wang, Jiuyun Jiang, L. Jeff Hong, Guangxin Jiang

The development of large language models (LLMs) has provided new tools for research in supply chain management (SCM). In this paper, we introduce a retrieval-augmented generation (RAG) framework that dynamically integrates external knowledge into the inference process, and develop a domain-specialized SCM LLM, which demonstrates expert-level competence by passing standardized SCM examinations and beer game tests. We further use LLMs to conduct horizontal and vertical supply chain games in order to analyze competition and cooperation within supply chains. Our experiments show that RAG significantly improves performance on SCM tasks. Moreover, game-theoretic analysis reveals that the LLM can reproduce insights from the classical SCM literature, while also uncovering novel behaviors and offering fresh perspectives on phenomena such as the bullwhip effect. This paper opens the door to exploring cooperation and competition in complex supply chain networks through the lens of LLMs.


Article 1881

Title@2025-05-24 (6): MisoDICE: Multi-Agent Imitation from Unlabeled Mixed-Quality Demonstrations

Title: MisoDICE: Multi-Agent Imitation from Unlabeled Mixed-Quality Demonstrations

MisoDICE: Multi-Agent-Imitation aus nicht gekennzeichneten Mixed-Quality-Demonstrationen

MisoDICE:从未贴标签的混合质量示范中多机构吸收 2505.18595v1

Authors: The Viet Bui, Tien Mai, Hong Thanh Nguyen

We study offline imitation learning (IL) in cooperative multi-agent settings, where demonstrations are of unlabeled, mixed quality, containing both expert and suboptimal trajectories. Our proposed solution is structured in two stages: trajectory labeling and multi-agent imitation learning, designed jointly to enable effective learning from heterogeneous, unlabeled data. In the first stage, we combine advances in large language models and preference-based reinforcement learning to construct a progressive labeling pipeline that distinguishes expert-quality trajectories. In the second stage, we introduce MisoDICE, a novel multi-agent IL algorithm that leverages these labels to learn robust policies while addressing the computational complexity of large joint state-action spaces. By extending the popular single-agent DICE framework to multi-agent settings with a new value decomposition and mixing architecture, our method yields a convex policy optimization objective and ensures consistency between global and local policies. We evaluate MisoDICE on multiple standard multi-agent RL benchmarks and demonstrate superior performance, especially when expert data is scarce.


Article 1882

Title@2025-05-24 (6): Bayesian Meta-Reinforcement Learning with Laplace Variational Recurrent Networks

Title: Bayesian Meta-Reinforcement Learning with Laplace Variational Recurrent Networks

Bayesian Meta-Reinforcement Learning mit Laplace Variational Recurrent Networks

采用拉位变换经常网络加强Bayesian Met-加强学习 2505.18591v1

Authors: Joery A. de Vries, Jinke He, Mathijs M. de Weerdt, Matthijs T. J. Spaan

Meta-reinforcement learning trains a single reinforcement learning agent on a distribution of tasks to quickly generalize to new tasks outside of the training set at test time. From a Bayesian perspective, one can interpret this as performing amortized variational inference on the posterior distribution over training tasks. Among the various meta-reinforcement learning approaches, a common method is to represent this distribution with a point estimate using a recurrent neural network. We show how one can augment this point estimate to give full distributions through the Laplace approximation, either at the start of, during, or after learning, without modifying the base model architecture. With our approximation, we are able to estimate distribution statistics (e.g., the entropy) of non-Bayesian agents and observe that point-estimate-based methods produce overconfident estimators while not satisfying consistency. Furthermore, when comparing our approach to full-distribution-based learning of the task posterior, our method performs on par with variational baselines while having far fewer parameters.


Article 1883

Title@2025-05-24 (6): CiRL: Open-Source Environments for Reinforcement Learning in Circular Economy and Net Zero

Title: CiRL: Open-Source Environments for Reinforcement Learning in Circular Economy and Net Zero

CiRL: Open-Source-Umgebungen für verstärktes Lernen in der Kreislaufwirtschaft und Net Zero

CiRL: 在循环经济和净零中加强学习的开放源环境 2505.21536v1

Authors: Federico Zocco, Andrea Corti, Monica Malvezzi

Demand for finite raw materials will keep increasing as they fuel modern society. At the same time, solutions for stopping carbon emissions in the short term are not available, making the net zero target extremely challenging to achieve at scale. The circular economy (CE) paradigm is gaining attention as a way to address climate change and the uncertain supply of critical materials. In this paper, we therefore introduce CiRL, a deep reinforcement learning (DRL) library of environments focused on the circularity of both solid and fluid materials. The integration of DRL into the design of material circularity is possible thanks to the formalism of thermodynamical material networks, which is underpinned by compartmental dynamical thermodynamics. Along with the focus on circularity, this library has three more features: the new CE-oriented environments are in state-space form, which is typically used in dynamical systems analysis and control design; it is based on a state-of-the-art Python library of DRL algorithms, namely Stable-Baselines3; and it is developed in Google Colaboratory to be accessible to researchers from different disciplines and backgrounds, as is often the case for circular economy researchers and engineers. CiRL is publicly available.


Article 1884

Title@2025-05-24 (6): Model Extrapolation Expedites Alignment

Title: Model Extrapolation Expedites Alignment

Modell Extrapolation Expeditionen Ausrichtung

模型外推快速调整 2404.16792v4

Authors: Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng

Given the high computational cost of preference alignment training of large language models (LLMs), exploring efficient methods to reduce the training overhead remains an important and compelling research problem. Motivated by the observation that alignment training typically involves only small parameter changes without injecting new knowledge into models, we propose a straightforward method called ExPO (model extrapolation) to expedite LLMs’ alignment with human preferences. Given a partially-trained model and its initial SFT checkpoint, ExPO improves the implicit optimization objective of alignment training by simply amplifying the parameter change based on a first-order approximation, without any additional training overhead. Through controlled experiments, we demonstrate that ExPO boosts a DPO model trained with only 20% of the steps to outperform the fully-trained one. Moreover, we show that ExPO notably improves existing open-source LLMs (ranging from 1.8B to 70B parameters) on the leading AlpacaEval 2.0 and MT-Bench benchmarks, which highlights ExPO’s broader utility in efficiently enhancing LLM alignment.
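
One natural reading of "amplifying the parameter change" is a linear extrapolation from the SFT checkpoint through the partially trained model. The sketch below assumes that reading; the coefficient name `alpha`, its value, and the toy parameter dictionaries are illustrative choices, not values from the paper.

```python
def extrapolate(sft_params, aligned_params, alpha):
    """Amplify the change from the SFT checkpoint to the aligned checkpoint.

    For every parameter:
        theta_expo = theta_aligned + alpha * (theta_aligned - theta_sft)
    alpha = 0 returns the aligned model unchanged.
    """
    return {
        name: [
            a + alpha * (a - s)
            for a, s in zip(aligned_params[name], sft_params[name])
        ]
        for name in aligned_params
    }


# Toy checkpoints: one flattened weight tensor each (hypothetical values).
sft = {"layer.weight": [0.0, 1.0, -2.0]}
aligned = {"layer.weight": [0.5, 1.0, -1.0]}

expo = extrapolate(sft, aligned, alpha=0.5)
print(expo["layer.weight"])
```

No gradient step is involved: the operation is a single pass over two stored checkpoints, which is why it adds no training overhead.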


Article 1885

Title@2025-05-24 (6): Continuous Multi-Task Pre-training for Malicious URL Detection and Webpage Classification

Title: Continuous Multi-Task Pre-training for Malicious URL Detection and Webpage Classification

Kontinuierliches Multi-Task-Vortraining für bösartige URL-Erkennung und Webpage-Klassifikation

恶意URL探测和网页分类连续多任务连续培训 2402.11495v2

Authors: Yujie Li, Yiwei Liu, Peiyue Li, Yifan Jia, Yanbin Wang

Malicious URL detection and webpage classification are critical tasks in cybersecurity and information management. In recent years, extensive research has explored using BERT or similar language models to replace traditional machine learning methods for detecting malicious URLs and classifying webpages. While previous studies show promising results, they often apply existing language models to these tasks without accounting for the inherent differences in domain data (e.g., URLs being loosely structured and semantically sparse compared to text), leaving room for performance improvement. Furthermore, current approaches focus on single tasks and have not been tested in multi-task scenarios. To address these challenges, we propose urlBERT, a pre-trained URL encoder that leverages the Transformer to encode foundational knowledge from billions of unlabeled URLs. To this end, we use five unsupervised pretraining tasks to capture multi-level lexical, syntactic, and semantic information in URLs and to generate contrastive and adversarial representations. Furthermore, to avoid competition and interference among pre-training tasks, we propose a grouped sequential learning method to ensure effective training across multiple tasks. Finally, we leverage a two-stage fine-tuning approach to improve the training stability and efficiency of the task model. To assess the multitasking potential of urlBERT, we fine-tune the task model in both single-task and multi-task modes. The former creates a classification model for a single task, while the latter builds a classification model capable of handling multiple tasks. We evaluate urlBERT on three downstream tasks: phishing URL detection, advertising URL detection, and webpage classification. The results demonstrate that urlBERT outperforms standard pre-trained models, and its multi-task mode is capable of addressing the real-world demands of multitasking.


Article 1886

Title@2025-05-24 (6): REAL: Representation Enhanced Analytic Learning for Exemplar-free Class-incremental Learning

Title: REAL: Representation Enhanced Analytic Learning for Exemplar-free Class-incremental Learning

REAL: Darstellungsverstärktes analytisches Lernen für exemplarisch-freies Klassen-inkrementelles Lernen

实际:为免世禁初级入门学习加强代表性分析学习 2403.13522v2

Authors: Run He, Di Fang, Yizhu Chen, Kai Tong, Cen Chen, Yi Wang, Lap-pui Chau, Huiping Zhuang

Exemplar-free class-incremental learning (EFCIL) aims to mitigate catastrophic forgetting in class-incremental learning (CIL) without available historical training samples as exemplars. Compared with its exemplar-based CIL counterpart that stores exemplars, EFCIL suffers more from forgetting issues. Recently, a new EFCIL branch named Analytic Continual Learning (ACL) introduced a gradient-free paradigm via Recursive Least-Square, achieving forgetting-resistant classifier training with a frozen backbone during CIL. However, existing ACL suffers from ineffective representations and insufficient utilization of backbone knowledge. In this paper, we propose representation-enhanced analytic learning (REAL) to address these problems. To enhance the representation, REAL constructs a dual-stream base pretraining followed by a representation-enhancing distillation process. The dual-stream base pretraining combines self-supervised contrastive learning for general features and supervised learning for class-specific knowledge; the representation-enhancing distillation then merges both streams, enhancing representations for the subsequent CIL paradigm. To utilize more knowledge from the backbone, REAL applies a feature fusion buffer to multi-layer backbone features, providing informative features for the subsequent classifier training. Our method can be incorporated into existing ACL techniques and provides more competitive performance. Empirical results demonstrate that REAL achieves state-of-the-art performance on CIFAR-100, ImageNet-100 and ImageNet-1k benchmarks, outperforming exemplar-free methods and rivaling exemplar-based approaches.


Article 1887

Title@2025-05-24 (6): AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

Title: AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

AFL: Ein eingleisiger analytischer Ansatz für das Federated Learning mit vortrainierten Modellen

AFL: 采用培训前模式的联邦学习单一分析方法 2405.16240v2

Authors: Run He, Kai Tong, Di Fang, Han Sun, Haoran Li, Tianyi Chen, Ziqian Zeng, Huiping Zhuang

In this paper, we introduce analytic federated learning (AFL), a new training paradigm that brings analytical (i.e., closed-form) solutions to federated learning (FL) with pre-trained models. AFL draws inspiration from analytic learning, a gradient-free technique that trains neural networks with analytical solutions in one epoch. In the local client training stage, AFL enables one-epoch training, eliminating the need for multi-epoch updates. In the aggregation stage, we derive an absolute aggregation (AA) law. This AA law allows single-round aggregation, reducing heavy communication overhead and achieving fast convergence by removing the need for multiple aggregation rounds. More importantly, AFL exhibits $\textit{invariance to data partitioning}$, meaning that regardless of how the full dataset is distributed among clients, the aggregated result remains identical. This property brings further benefits, such as invariance to data heterogeneity and to the number of clients. We conduct experiments across various FL settings, including extremely non-IID ones and scenarios with a large number of clients (e.g., $\ge 1000$). In all these settings, AFL consistently performs competitively while existing FL techniques encounter various obstacles. Our code is available at https://github.com/ZHUANGHP/Analytic-federated-learning.
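
Why a closed-form solution can be invariant to data partitioning is easiest to see with ridge regression, where each client only needs to send additive statistics. The sketch below uses that standard sufficient-statistics view as a stand-in; it is not the AA law derived in the paper, and the function names, the 2-D features, and the regularizer `gamma` are assumptions made for illustration.

```python
def local_statistics(samples):
    """Client side: accumulate X^T X and X^T y over local 2-D samples."""
    g = [[0.0, 0.0], [0.0, 0.0]]
    h = [0.0, 0.0]
    for x, y in samples:
        for i in range(2):
            h[i] += x[i] * y
            for j in range(2):
                g[i][j] += x[i] * x[j]
    return g, h


def aggregate_and_solve(client_stats, gamma=1.0):
    """Server side: sum client statistics once, then solve (G + gamma*I) w = h."""
    g = [[0.0, 0.0], [0.0, 0.0]]
    h = [0.0, 0.0]
    for cg, ch in client_stats:
        for i in range(2):
            h[i] += ch[i]
            for j in range(2):
                g[i][j] += cg[i][j]
    # Closed-form 2 x 2 solve (Cramer's rule); gamma > 0 keeps det positive.
    a, b = g[0][0] + gamma, g[0][1]
    c, d = g[1][0], g[1][1] + gamma
    det = a * d - b * c
    return [(d * h[0] - b * h[1]) / det, (a * h[1] - c * h[0]) / det]


# Toy dataset of ((x0, x1), y) pairs with small integer values.
data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0), ((2.0, 1.0), 3.0)]

w_central = aggregate_and_solve([local_statistics(data)])
w_split_a = aggregate_and_solve([local_statistics(data[:1]), local_statistics(data[1:])])
w_split_b = aggregate_and_solve([local_statistics(data[:3]), local_statistics(data[3:])])
print(w_central, w_split_a, w_split_b)
```

Because the statistics are sums over samples, any split of the data across clients yields the same totals and therefore the same weights after a single aggregation round.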


Article 1888

Title@2025-05-24 (6): Mechanical in-sensor computing: a programmable meta-sensor for structural damage classification without external electronic power

Title: Mechanical in-sensor computing: a programmable meta-sensor for structural damage classification without external electronic power

Mechanische In-Sensor-Computing: ein programmierbarer Meta-Sensor für die Klassifizierung von Strukturschäden ohne externe elektronische Leistung

传感器中的机械内传感器计算:可编程的元传感器,用于结构损害分类,无外部电子电源 2505.18579v1

Authors: Tingpeng Zhang, Xuzhang Peng, Mingyuan Zhou, Guobiao Hu, Zhilu Lai

Structural health monitoring (SHM) involves sensor deployment, data acquisition, and data interpretation, commonly implemented via a tedious wired system. In current practice, information processing depends largely on electronic computers, which, although universally applicable, bring challenges such as high energy consumption and low throughput due to the nature of digital units. In recent years, there has been renewed interest in shifting computation from electronic computing units to real physical systems, a concept known as physical computation. This approach makes it possible to rethink SHM by integrating sensing and computing into a purely physical entity that does not rely on external electronic power supplies, making it well suited to resource-restricted scenarios. The latest advances in metamaterials (MM) hold great promise for this idea. In this paper, we introduce a programmable metamaterial-based sensor (termed the MM-sensor) that physically processes structural vibration information to perform specific SHM tasks, such as structural damage warning (binary classification) in this initial study, without further information processing or resource consumption; that is, data collection and analysis are completed in situ at the sensor level. We adopt the configuration of a locally resonant metamaterial plate (LRMP) for the first fabrication of the MM-sensor. We take advantage of the bandgap properties of the LRMP to physically differentiate the dynamic behavior of structures before and after damage. By inversely designing the geometric parameters, our current approach allows the bandgap features to be adjusted. This is effective for engineering systems with a first natural frequency ranging from 9.54 Hz to 81.86 Hz.


Article 1889

Title@2025-05-24 (6): Trust-Region Twisted Policy Improvement

Title: Trust-Region Twisted Policy Improvement

Vertrauensregion verdrehte politische Verbesserung

改变政策改进 2504.06048v3

Authors: Joery A. de Vries, Jinke He, Yaniv Oren, Matthijs T. J. Spaan

Monte-Carlo tree search (MCTS) has driven many recent breakthroughs in deep reinforcement learning (RL). However, scaling MCTS to parallel compute has proven challenging in practice, which has motivated alternative planners like sequential Monte-Carlo (SMC). Many of these SMC methods adopt particle filters for smoothing through a reformulation of RL as a policy inference problem. Yet, persistent design choices in these particle filters often conflict with the aim of online planning in RL, which is to obtain a policy improvement at the start of planning. Drawing inspiration from MCTS, we tailor SMC planners specifically for RL by improving data generation within the planner through constrained action sampling and explicit terminal state handling, as well as improving policy and value target estimation. This leads to our Trust-Region Twisted SMC (TRT-SMC), which shows improved runtime and sample-efficiency over baseline MCTS and SMC methods in both discrete and continuous domains.


Article 1890

Title@2025-05-24 (6): TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

Title: TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

TabICL: Ein tabellarisches Grundlagenmodell für das In-Context-Lernen mit großen Datenmengen

TabICL: 大型数据内部知识学习表示基础模型 2502.05564v2

Authors: Jingang Qu, David Holzmüller, Gaël Varoquaux, Marine Le Morvan

The long-standing dominance of gradient-boosted decision trees on tabular data is currently challenged by tabular foundation models using In-Context Learning (ICL): setting the training data as context for the test data and predicting in a single forward pass without parameter updates. While the TabPFNv2 foundation model excels on tables with up to 10K samples, its alternating column- and row-wise attentions make handling large training sets computationally prohibitive. So, can ICL be effectively scaled and deliver a benefit for larger tables? We introduce TabICL, a tabular foundation model for classification, pretrained on synthetic datasets with up to 60K samples and capable of handling 500K samples on affordable resources. This is enabled by a novel two-stage architecture: a column-then-row attention mechanism to build fixed-dimensional embeddings of rows, followed by a transformer for efficient ICL. Across 200 classification datasets from the TALENT benchmark, TabICL is on par with TabPFNv2 while being systematically faster (up to 10 times), and significantly outperforms all other approaches. On 53 datasets with over 10K samples, TabICL surpasses both TabPFNv2 and CatBoost, demonstrating the potential of ICL for large data. Pretraining code, inference code, and pre-trained models are available at https://github.com/soda-inria/tabicl.


Article 1891

Title@2025-05-24 (6): DAL: A Practical Prior-Free Black-Box Framework for Non-Stationary Bandit Environments

Title: DAL: A Practical Prior-Free Black-Box Framework for Non-Stationary Bandit Environments

DAL: Ein praktisches Prior-Free Black-Box Framework für nicht-stationäre Bandit-Umgebungen

DAL:非高度强盗环境实际的、事先免费的黑盒框架 2501.19401v2

Authors: Argyrios Gerogiannis, Yu-Han Huang, Subhonmesh Bose, Venugopal V. Veeravalli

We introduce a practical, black-box framework termed Detection Augmenting Learning (DAL) for the problem of non-stationary bandits without prior knowledge of the underlying non-stationarity. DAL is modular, accepting any stationary bandit algorithm as input and augmenting it with a change detector. Our approach is applicable to all common parametric and non-parametric bandit variants. Extensive experimentation demonstrates that DAL consistently surpasses current state-of-the-art methods across diverse non-stationary scenarios, including synthetic benchmarks and real-world datasets, underscoring its versatility and scalability. We provide theoretical insights into DAL’s strong empirical performance on piecewise stationary and drift settings, complemented by thorough experimental validation.
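
The modular idea described above, wrapping any stationary bandit with a change detector and restarting it when a change is flagged, can be sketched as below. Everything here is a simplified stand-in: the greedy bandit, the windowed mean-shift detector, its threshold, and the deterministic toy rewards are our assumptions and not the algorithms or detectors used in DAL.

```python
class GreedyBandit:
    """A deliberately simple stationary bandit: try each arm once, then exploit."""

    def __init__(self, n_arms):
        self.n_arms = n_arms
        self.reset()

    def reset(self):
        self.counts = [0] * self.n_arms
        self.sums = [0.0] * self.n_arms

    def select(self):
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        return max(range(self.n_arms), key=lambda a: self.sums[a] / self.counts[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward


class MeanShiftDetector:
    """Naive per-arm detector: compare the mean of the last `window` rewards
    with the mean of the `window` rewards before them."""

    def __init__(self, n_arms, window=3, threshold=0.5):
        self.n_arms = n_arms
        self.window = window
        self.threshold = threshold
        self.reset()

    def reset(self):
        self.history = [[] for _ in range(self.n_arms)]

    def observe(self, arm, reward):
        h = self.history[arm]
        h.append(reward)
        w = self.window
        if len(h) < 2 * w:
            return False
        older = sum(h[-2 * w:-w]) / w
        newer = sum(h[-w:]) / w
        return abs(newer - older) > self.threshold


def run(bandit, detector, reward_fn, horizon):
    """Play the bandit; on a detected change, restart bandit and detector."""
    choices, restarts = [], []
    for t in range(horizon):
        arm = bandit.select()
        reward = reward_fn(t, arm)
        bandit.update(arm, reward)
        choices.append(arm)
        if detector.observe(arm, reward):
            bandit.reset()
            detector.reset()
            restarts.append(t)
    return choices, restarts


def reward_fn(t, arm):
    # Deterministic toy environment: arm 0 pays 1.0 until step 20, then 0.0;
    # arm 1 always pays 0.5.
    if arm == 0:
        return 1.0 if t < 20 else 0.0
    return 0.5


choices, restarts = run(GreedyBandit(2), MeanShiftDetector(2), reward_fn, horizon=40)
print(restarts, choices[-1])
```

The wrapper never looks inside the bandit beyond `select`, `update`, and `reset`, which is the sense in which such a framework can treat the stationary algorithm as a black box.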


Article 1892

Title@2025-05-24 (6): Convergence Analysis of Natural Gradient Descent for Over-parameterized Physics-Informed Neural Networks

Title: Convergence Analysis of Natural Gradient Descent for Over-parameterized Physics-Informed Neural Networks

Konvergenzanalyse des natürlichen Gradientenabstiegs für überparameterisierte physikinformierte neurale Netzwerke

超参数物理内成形神经神经网络的自然梯分源相趋同分析 2408.00573v3

Authors: Xianliang Xu, Ting Du, Wang Kong, Bin Shan, Ye Li, Zhongyi Huang

In the context of over-parameterization, there is a line of work demonstrating that randomly initialized (stochastic) gradient descent (GD) converges to a globally optimal solution at a linear convergence rate for the quadratic loss function. However, the learning rate of GD for training two-layer neural networks exhibits poor dependence on the sample size and the Gram matrix, leading to a slow training process. In this paper, we show that for training two-layer $\text{ReLU}^3$ Physics-Informed Neural Networks (PINNs), the learning rate can be improved from $\mathcal{O}(\lambda_0)$ to $\mathcal{O}(1/\|\bm{H}^{\infty}\|_2)$, implying that GD actually enjoys a faster convergence rate. Despite such improvements, the convergence rate is still tied to the least eigenvalue of the Gram matrix, leading to slow convergence. We then establish the positive definiteness of Gram matrices with general smooth activation functions and provide the convergence analysis of natural gradient descent (NGD) in training two-layer PINNs, demonstrating that the learning rate can be $\mathcal{O}(1)$ and that, at this rate, the convergence rate is independent of the Gram matrix. In particular, for smooth activation functions, the convergence rate of NGD is quadratic. Numerical experiments are conducted to verify our theoretical results.


Article 1893

Title@2025-05-24 (6): Autocomp: LLM-Driven Code Optimization for Tensor Accelerators

Title: Autocomp: LLM-Driven Code Optimization for Tensor Accelerators

Autocomp: LLM-gesteuerte Code-Optimierung für Tensor-Beschleuniger

自动comp: LLM- Driven 代码对 Tensor 加速器的优化 2505.18574v1

Authors: Charles Hong, Sahil Bhatia, Alvin Cheung, Yakun Sophia Shao

Hardware accelerators, especially those designed for tensor processing, have become ubiquitous in today’s computing landscape. However, even with significant efforts in building compilers, programming these tensor accelerators remains challenging, leaving much of their potential underutilized. Recently, large language models (LLMs), trained on large amounts of code, have shown significant promise in code generation and optimization tasks, but generating code in low-resource languages, such as those used for specialized tensor accelerators, still poses a significant challenge. We tackle this challenge with Autocomp, an approach that empowers accelerator programmers to leverage domain knowledge and hardware feedback to optimize code via an automated LLM-driven search. We accomplish this by: 1) formulating each optimization pass as a structured two-phase prompt, divided into planning and code generation phases, 2) inserting domain knowledge during planning via a concise and adaptable optimization menu, and 3) integrating correctness and performance metrics from hardware as feedback at each search iteration. Across three categories of representative workloads and two different accelerators, we demonstrate that Autocomp-optimized code runs 5.6x (GEMM) and 2.7x (convolution) faster than the vendor-provided library, and outperforms expert-level hand-tuned code by 1.4x (GEMM), 1.1x (convolution), and 1.3x (fine-grained linear algebra). Additionally, we demonstrate that optimization schedules generated from Autocomp can be reused across similar tensor operations, improving speedups by up to 24% under a fixed sample budget.


Article 1894

Title@2025-05-24 (6): Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs

Title: Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs

Steigerung der Effizienz und Exploration bei der Stärkung des Lernens für LLMs

提高LLM强化学习的效率和探索 2505.18573v1

Authors: Mengqi Liao, Xiangyu Xi, Ruinian Chen, Jia Leng, Yangen Hu, Ke Zeng, Shuai Liu, Huaiyu Wan

Reasoning large language models (LLMs) excel in complex tasks, which has drawn significant attention to reinforcement learning (RL) for LLMs. However, existing approaches allocate an equal number of rollouts to all questions during the RL process, which is inefficient. This inefficiency stems from the fact that training on simple questions yields limited gains, whereas more rollouts are needed for challenging questions to sample correct answers. Furthermore, while RL improves response precision, it limits the model’s exploration ability, potentially resulting in a performance cap below that of the base model prior to RL. To address these issues, we propose a mechanism for dynamically allocating rollout budgets based on the difficulty of the problems, enabling more efficient RL training. Additionally, we introduce an adaptive dynamic temperature adjustment strategy to maintain the entropy at a stable level, thereby encouraging sufficient exploration. This enables LLMs to improve response precision while preserving their exploratory ability to uncover potential correct pathways. The code and data are available at https://github.com/LiaoMengqi/E3-RL4LLMs
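
The two mechanisms described above can be sketched in a few lines. The abstract does not give the allocation rule or the temperature controller, so both functions below are hypothetical: difficulty is taken to be one minus an estimated pass rate, budgets are split proportionally to it, and temperature follows a simple proportional update toward a target entropy.

```python
def allocate_rollouts(pass_rates, total_budget, min_rollouts=1):
    """Split a rollout budget across questions, favouring harder ones.

    Difficulty is assumed to be 1 - pass_rate. Every question receives
    at least `min_rollouts`; the rest is shared in proportion to difficulty.
    """
    n = len(pass_rates)
    spare = total_budget - n * min_rollouts
    if spare < 0:
        raise ValueError("total_budget is too small for min_rollouts per question")
    difficulty = [1.0 - p for p in pass_rates]
    total = sum(difficulty)
    weights = [d / total for d in difficulty] if total > 0 else [1.0 / n] * n
    raw = [w * spare for w in weights]
    alloc = [min_rollouts + int(r) for r in raw]
    # Hand any rollouts lost to rounding to the largest fractional parts.
    leftover = total_budget - sum(alloc)
    order = sorted(range(n), key=lambda i: raw[i] - int(raw[i]), reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc


def adjust_temperature(temperature, entropy, target_entropy,
                       step=0.1, t_min=0.1, t_max=2.0):
    """Raise the sampling temperature when entropy is below target, lower it otherwise."""
    new_temperature = temperature + step * (target_entropy - entropy)
    return max(t_min, min(t_max, new_temperature))


# Three questions, from easy to hard (hypothetical pass-rate estimates).
budget = allocate_rollouts([0.75, 0.5, 0.25], total_budget=15)
warmer = adjust_temperature(1.0, entropy=0.5, target_entropy=1.0)
cooler = adjust_temperature(1.0, entropy=1.5, target_entropy=1.0)
print(budget, warmer, cooler)
```

The largest-remainder step keeps the allocation integral while spending exactly the stated budget; the temperature clamp prevents the controller from drifting to degenerate sampling settings.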

Article 1895

Title@2025-05-24 (6): VISTA: Vision-Language Inference for Training-Free Stock Time-Series Analysis

Title: VISTA: Vision-Language Inference for Training-Free Stock Time-Series Analysis

VISTA: Vision-Language-Schlussfolgerung für eine trainingsfreie Analyse der Stock-Zeitreihen

VISTA:无培训-库存无培训-时间-系列分析的远景-语言推断 2505.18570v1

Authors: Tina Khezresmaeilzadeh, Parsa Razmara, Seyedarmin Azizi, Mohammad Erfan Sadeghi, Erfan Baghaei Portaghloo

Stock price prediction remains a complex and high-stakes task in financial analysis, traditionally addressed using statistical models or, more recently, language models. In this work, we introduce VISTA (Vision-Language Inference for Stock Time-series Analysis), a novel, training-free framework that leverages Vision-Language Models (VLMs) for multi-modal stock forecasting. VISTA prompts a VLM with both textual representations of historical stock prices and their corresponding line charts to predict future price values. By combining numerical and visual modalities in a zero-shot setting and using carefully designed chain-of-thought prompts, VISTA captures complementary patterns that unimodal approaches often miss. We benchmark VISTA against standard baselines, including ARIMA and text-only LLM-based prompting methods. Experimental results show that VISTA outperforms these baselines by up to 89.83%, demonstrating the effectiveness of multi-modal inference for stock time-series analysis and highlighting the potential of VLMs in financial forecasting tasks without requiring task-specific training.

Article 1896

Title@2025-05-24 (6): Learning without Isolation: Pathway Protection for Continual Learning

Title: Learning without Isolation: Pathway Protection for Continual Learning

Lernen ohne Isolation: Pfadschutz für kontinuierliches Lernen

无孤立的学习:持续学习的路径保护 2505.18568v1

Authors: Zhikang Chen, Abudukelimu Wuerkaixi, Sen Cui, Haoxuan Li, Ding Li, Jingfeng Zhang, Bo Han, Gang Niu, Houfang Liu, Yi Yang, Sifan Yang, Changshui Zhang, Tianling Ren

Deep networks are prone to catastrophic forgetting during sequential task learning, i.e., losing the knowledge about old tasks upon learning new tasks. Continual learning (CL) has emerged to address this, and existing methods focus mostly on regulating or protecting the parameters associated with previous tasks. However, parameter protection is often impractical: either the number of parameters storing old-task knowledge grows linearly with the number of tasks, or it becomes hard to preserve the parameters related to old-task knowledge. In this work, we bring a dual perspective from neuroscience and physics to CL: across the whole network, the pathways matter more than the parameters for the knowledge acquired from old tasks. Following this perspective, we propose a novel CL framework, learning without isolation (LwI), where model fusion is formulated as graph matching and the pathways occupied by the old tasks are protected without being isolated. Thanks to the sparsity of activation channels in a deep network, LwI can adaptively allocate available pathways for a new task, realizing pathway protection and addressing catastrophic forgetting in a parameter-efficient manner. Experiments on popular benchmark datasets demonstrate the superiority of the proposed LwI.

Article 1897

Title@2025-05-24 (6): ReflectDiffu:Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework

Title: ReflectDiffu:Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework

ReflectDiffu: Reflect zwischen emotional-intent Ansteckung und Mimicry für Empathetic Response Generation über ein RL-Diffusion Framework

反省:通过RL-扩散框架,对情感-情感内聚变和Mmimimicry之间的反射,以便产生同情性反应 2409.10289v3

Authors: Jiahao Yuan, Zixiang Di, Zhiqing Cui, Guisong Yang, Usman Naseem

Empathetic response generation necessitates the integration of emotional and intentional dynamics to foster meaningful interactions. Existing research either neglects the intricate interplay between emotion and intent, leading to suboptimal controllability of empathy, or resorts to large language models (LLMs), which incur significant computational overhead. In this paper, we introduce ReflectDiffu, a lightweight and comprehensive framework for empathetic response generation. This framework incorporates emotion contagion to augment emotional expressiveness and employs an emotion-reasoning mask to pinpoint critical emotional elements. Additionally, it integrates intent mimicry within reinforcement learning for refinement during diffusion. By harnessing an intent twice reflect mechanism of Exploring-Sampling-Correcting, ReflectDiffu adeptly translates emotional decision-making into precise intent actions, thereby addressing empathetic response misalignments stemming from emotional misrecognition. Through reflection, the framework maps emotional states to intents, markedly enhancing both response empathy and flexibility. Comprehensive experiments reveal that ReflectDiffu outperforms existing models regarding relevance, controllability, and informativeness, achieving state-of-the-art results in both automatic and human evaluations.

Article 1898

Title@2025-05-24 (6): Learning Fluid-Structure Interaction Dynamics with Physics-Informed Neural Networks and Immersed Boundary Methods

Title: Learning Fluid-Structure Interaction Dynamics with Physics-Informed Neural Networks and Immersed Boundary Methods

Learning Fluid-Struktur-Interaktion Dynamik mit physikinformierten Neuronalen Netzwerken und eingetauchten Grenzmethoden

与物理内成形神经网络和混合边界方法的互动动态 2505.18565v1

Authors: Afrah Farea, Saiful Khan, Reza Daryani, Emre Cenk Ersan, Mustafa Serdar Celebi

We introduce neural network architectures that combine physics-informed neural networks (PINNs) with the immersed boundary method (IBM) to solve fluid-structure interaction (FSI) problems. Our approach features two distinct architectures: a Single-FSI network with a unified parameter space, and an innovative Eulerian-Lagrangian network that maintains separate parameter spaces for fluid and structure domains. We study each architecture using standard Tanh and adaptive B-spline activation functions. Empirical studies on a 2D cavity flow problem involving a moving solid structure show that the Eulerian-Lagrangian architecture performs significantly better. The adaptive B-spline activation further enhances accuracy by providing locality-aware representation near boundaries. While our methodology shows promising results in predicting the velocity field, pressure recovery remains challenging due to the absence of explicit force-coupling constraints in the current formulation. Our findings underscore the importance of domain-specific architectural design and adaptive activation functions for modeling FSI problems within the PINN framework.

Article 1899

Title@2025-05-24 (6): Joint-stochastic-approximation Random Fields with Application to Semi-supervised Learning

Title: Joint-stochastic-approximation Random Fields with Application to Semi-supervised Learning

Gelenk-Stochastische-Annäherung Random Fields mit Anwendung auf semi-überwachtes Lernen

应用到半监督学习的混合随机场 2505.20330v1

Authors: Yunfu Song, Zhijian Ou

Our examination of deep generative models (DGMs) developed for semi-supervised learning (SSL), mainly GANs and VAEs, reveals two problems. First, mode missing and mode covering phenomena are observed in generation with GANs and VAEs. Second, there exists an awkward conflict between good classification and good generation in SSL when employing directed generative models. To address these problems, we formally present joint-stochastic-approximation random fields (JRFs) – a new family of algorithms for building deep undirected generative models, with application to SSL. Synthetic experiments show that JRFs work well in balancing mode covering and mode missing, and match the empirical data distribution well. Empirically, JRFs achieve good classification results comparable to the state-of-the-art methods on widely adopted datasets – MNIST, SVHN, and CIFAR-10 in SSL, and simultaneously perform good generation.

Article 1900

Title@2025-05-24 (6): Joint-stochastic-approximation Autoencoders with Application to Semi-supervised Learning

Title: Joint-stochastic-approximation Autoencoders with Application to Semi-supervised Learning

Gelenkstochastische Approximation Autoencoder mit Anwendung auf semi-überwachtes Lernen

应用到半监督学习的联合研究- 接近自动校方 2505.18558v1

Authors: Wenbo He, Zhijian Ou

Our examination of existing deep generative models (DGMs), including VAEs and GANs, reveals two problems. First, their capability in handling discrete observations and latent codes is unsatisfactory, though there are interesting efforts. Second, both VAEs and GANs optimize some criteria that are indirectly related to the data likelihood. To address these problems, we formally present Joint-stochastic-approximation (JSA) autoencoders - a new family of algorithms for building deep directed generative models, with application to semi-supervised learning. The JSA learning algorithm directly maximizes the data log-likelihood and simultaneously minimizes the inclusive KL divergence between the posterior and the inference model. We provide theoretical results and conduct a series of experiments to show its superiority, such as robustness to structure mismatch between encoder and decoder and consistent handling of both discrete and continuous variables. In particular, we empirically show that JSA autoencoders with discrete latent space achieve comparable performance to other state-of-the-art DGMs with continuous latent space in semi-supervised tasks over the widely adopted datasets - MNIST and SVHN. To the best of our knowledge, this is the first demonstration that discrete latent variable models are successfully applied in the challenging semi-supervised tasks.

Article 1901

Title@2025-05-24 (6): LAMDA: A Longitudinal Android Malware Benchmark for Concept Drift Analysis

Title: LAMDA: A Longitudinal Android Malware Benchmark for Concept Drift Analysis

LAMDA: Ein Longitudinal Android Malware Benchmark für Konzept Drift Analyse

LAMDA: 关于概念漂流分析的纵向和机器人毛毛虫基准 2505.18551v1

Authors: Md Ahsanul Haque, Ismail Hossain, Md Mahmuduzzaman Kamol, Md Jahangir Alam, Suresh Kumar Amalapuram, Sajedul Talukder, Mohammad Saidur Rahman

Machine learning (ML)-based malware detection systems often fail to account for the dynamic nature of real-world training and test data distributions. In practice, these distributions evolve due to frequent changes in the Android ecosystem, adversarial development of new malware families, and the continuous emergence of both benign and malicious applications. Prior studies have shown that such concept drift (distributional shifts in benign and malicious samples) leads to significant degradation in detection performance over time. Despite the practical importance of this issue, existing datasets are often outdated and limited in temporal scope, diversity of malware families, and sample scale, making them insufficient for the systematic evaluation of concept drift in malware detection. To address this gap, we present LAMDA, the largest and most temporally diverse Android malware benchmark to date, designed specifically for concept drift analysis. LAMDA spans 12 years (2013-2025, excluding 2015), includes over 1 million samples (approximately 37% labeled as malware), and covers 1,380 malware families and 150,000 singleton samples, reflecting the natural distribution and evolution of real-world Android applications. We empirically demonstrate LAMDA’s utility by quantifying the performance degradation of standard ML models over time and analyzing feature stability across years. As the most comprehensive Android malware dataset to date, LAMDA enables in-depth research into temporal drift, generalization, explainability, and evolving detection challenges. The dataset and code are available at: https://iqsec-lab.github.io/LAMDA/.

Article 1902

Title@2025-05-24 (6): ReflectGAN: Modeling Vegetation Effects for Soil Carbon Estimation from Satellite Imagery

Title: ReflectGAN: Modeling Vegetation Effects for Soil Carbon Estimation from Satellite Imagery

ReflectGAN: Modellierung von Vegetationseffekten für Bodenkohlenstoffschätzungen aus Satellitenbildern

反射GAN:从卫星图像中模拟土壤碳估计的植被效应 2505.18546v1

Authors: Dristi Datta, Manoranjan Paul, Manzur Murshed, Shyh Wei Teng, Leigh M. Schmidtke

Soil organic carbon (SOC) is a critical indicator of soil health, but its accurate estimation from satellite imagery is hindered in vegetated regions due to spectral contamination from plant cover, which obscures soil reflectance and reduces model reliability. This study proposes the Reflectance Transformation Generative Adversarial Network (ReflectGAN), a novel paired GAN-based framework designed to reconstruct accurate bare soil reflectance from vegetated soil satellite observations. By learning the spectral transformation between vegetated and bare soil reflectance, ReflectGAN facilitates more precise SOC estimation under mixed land cover conditions. Using the LUCAS 2018 dataset and corresponding Landsat 8 imagery, we trained multiple learning-based models on both original and ReflectGAN-reconstructed reflectance inputs. Models trained on ReflectGAN outputs consistently outperformed those using existing vegetation correction methods. For example, the best-performing model (RF) achieved an $R^2$ of 0.54, RMSE of 3.95, and RPD of 2.07 when applied to the ReflectGAN-generated signals, representing a 35% increase in $R^2$, a 43% reduction in RMSE, and a 43% improvement in RPD compared to the best existing method (PMM-SU). The performance of the models with ReflectGAN is also better compared to their counterparts when applied to another dataset, i.e., Sentinel-2 imagery. These findings demonstrate the potential of ReflectGAN to improve SOC estimation accuracy in vegetated landscapes, supporting more reliable soil monitoring.
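For readers unfamiliar with the three scores quoted above, here is a minimal sketch of how they are commonly computed. The abstract does not give its formulas, so these are assumptions: $R^2$ is taken as one minus the ratio of residual to total sum of squares, RMSE as the root of the mean squared residual, and RPD as the sample standard deviation of the observations divided by RMSE. RPD conventions vary between papers, and the function name `regression_metrics` is hypothetical.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and RPD for paired observations and predictions.

    Assumed definitions (not taken from the paper):
      R^2  = 1 - SS_res / SS_tot
      RMSE = sqrt(mean squared residual)
      RPD  = sample standard deviation of y_true / RMSE
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "rpd": statistics.stdev(y_true) / rmse,
    }


if __name__ == "__main__":
    # Toy numbers, not SOC data.
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.0, 3.5]))
```

Higher $R^2$ and RPD and lower RMSE all indicate a better fit, which is why the abstract reports an increase for the first two and a reduction for the third.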

Article 1903

Title@2025-05-24 (6): B-score: Detecting biases in large language models using response history

Title: B-score: Detecting biases in large language models using response history

B-Score: Voreingenommenheit in großen Sprachmodellen anhand der Antworthistorie erkennen

B-序号:利用回应历史在大型语言模型中发现偏见 2505.18545v1

Authors: An Vo, Mohammad Reza Taesiri, Daeyoung Kim, Anh Totti Nguyen

Large language models (LLMs) often exhibit strong biases, e.g., against women or in favor of the number 7. We investigate whether LLMs would be able to output less biased answers when allowed to observe their prior answers to the same question in a multi-turn conversation. To understand which types of questions invite more biased answers, we test LLMs on our proposed set of questions that span 9 topics and belong to three types: (1) Subjective; (2) Random; and (3) Objective. Interestingly, LLMs are able to “de-bias” themselves in a multi-turn conversation in response to questions that seek a Random, unbiased answer. Furthermore, we propose B-score, a novel metric that is effective in detecting biases to Subjective, Random, Easy, and Hard questions. On MMLU, HLE, and CSQA, leveraging B-score substantially improves the verification accuracy of LLM answers (i.e., accepting LLM correct answers and rejecting incorrect ones) compared to using verbalized confidence scores or the frequency of single-turn answers alone. Code and data are available at: https://b-score.github.io.

Article 1904

Title@2025-05-24 (6): Benchmarking Poisoning Attacks against Retrieval-Augmented Generation

Title: Benchmarking Poisoning Attacks against Retrieval-Augmented Generation

Benchmarking von Giftangriffen gegen retrieval-angereicherte Generation

制定基准,确定对回收一代人进行中毒袭击的基准 2505.18543v1

Authors: Baolei Zhang, Haoran Xin, Jiatong Li, Dongzhe Zhang, Minghong Fang, Zhuqing Liu, Lihai Nie, Zheli Liu

Retrieval-Augmented Generation (RAG) has proven effective in mitigating hallucinations in large language models by incorporating external knowledge during inference. However, this integration introduces new security vulnerabilities, particularly to poisoning attacks. Although prior work has explored various poisoning strategies, a thorough assessment of their practical threat to RAG systems remains missing. To address this gap, we propose the first comprehensive benchmark framework for evaluating poisoning attacks on RAG. Our benchmark covers 5 standard question answering (QA) datasets and 10 expanded variants, along with 13 poisoning attack methods and 7 defense mechanisms, representing a broad spectrum of existing techniques. Using this benchmark, we conduct a comprehensive evaluation of all included attacks and defenses across the full dataset spectrum. Our findings show that while existing attacks perform well on standard QA datasets, their effectiveness drops significantly on the expanded versions. Moreover, our results demonstrate that various advanced RAG architectures, such as sequential, branching, conditional, and loop RAG, as well as multi-turn conversational RAG, multimodal RAG systems, and RAG-based LLM agent systems, remain susceptible to poisoning attacks. Notably, current defense techniques fail to provide robust protection, underscoring the pressing need for more resilient and generalizable defense strategies.

Article 1905

Title@2025-05-24 (6): Mind Your Vision: Multimodal Estimation of Refractive Disorders Using Electrooculography and Eye Tracking

Title: Mind Your Vision: Multimodal Estimation of Refractive Disorders Using Electrooculography and Eye Tracking

Denken Sie an Ihre Vision: Multimodale Abschätzung refraaktiver Störungen mittels Elektrookulographie und Eye Tracking

思考你的愿景:利用电光学和眼视跟踪对折发性失常进行多模式估计 2505.18538v1

Authors: Xin Wei, Huakun Liu, Yutaro Hirao, Monica Perusquia-Hernandez, Katsutoshi Masai, Hideaki Uchiyama, Kiyoshi Kiyokawa

Refractive errors are among the most common visual impairments globally, yet their diagnosis often relies on active user participation and clinical oversight. This study explores a passive method for estimating refractive power using two eye movement recording techniques: electrooculography (EOG) and video-based eye tracking. Using a publicly available dataset recorded under varying diopter conditions, we trained Long Short-Term Memory (LSTM) models to classify refractive power from unimodal (EOG or eye tracking) and multimodal configurations. We assess performance in both subject-dependent and subject-independent settings to evaluate model personalization and generalizability across individuals. Results show that the multimodal model consistently outperforms unimodal models, achieving the highest average accuracy in both settings: 96.207% in the subject-dependent scenario and 8.882% in the subject-independent scenario. However, generalization remains limited, with classification accuracy only marginally above chance in the subject-independent evaluations. Statistical comparisons in the subject-dependent setting confirmed that the multimodal model significantly outperformed the EOG and eye-tracking models. However, no statistically significant differences were found in the subject-independent setting. Our findings demonstrate both the potential and current limitations of eye movement data-based refractive error estimation, contributing to the development of continuous, non-invasive screening methods using EOG signals and eye-tracking data.

Article 1906

Title@2025-05-24 (6): Convergence, Sticking and Escape: Stochastic Dynamics Near Critical Points in SGD

Title: Convergence, Sticking and Escape: Stochastic Dynamics Near Critical Points in SGD

Konvergenz, Haft und Flucht: Stochastische Dynamik in der Nähe kritischer Punkte in SGD

聚合、粘合和逃离:SGD中近临界点的斯托卡动态 2505.18535v1

Authors: Dmitry Dudukalov, Artem Logachov, Vladimir Lotov, Timofei Prasolov, Evgeny Prokopenko, Anton Tarasenko

We study the convergence properties and escape dynamics of Stochastic Gradient Descent (SGD) in one-dimensional landscapes, separately considering infinite- and finite-variance noise. Our main focus is to identify the time scales on which SGD reliably moves from an initial point to the local minimum in the same “basin”. Under suitable conditions on the noise distribution, we prove that SGD converges to the basin’s minimum unless the initial point lies too close to a local maximum. In that near-maximum scenario, we show that SGD can linger for a long time in its neighborhood. For initial points near a “sharp” maximum, we show that SGD does not remain stuck there, and we provide results to estimate the probability that it will reach each of the two neighboring minima. Overall, our findings present a nuanced view of SGD’s transitions between local maxima and minima, influenced by both noise characteristics and the underlying function geometry.
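The basin behavior described above can be tried out with a small simulation. The sketch below is only a toy under stated assumptions, not the paper's setting: the double-well function $f(x) = x^4/4 - x^2/2$ (minima at $\pm 1$, a local maximum at $0$), Gaussian and therefore finite-variance gradient noise, and the step size, noise level and step count are all choices made here for illustration.

```python
import random


def grad(x):
    # Derivative of the assumed double-well f(x) = x**4 / 4 - x**2 / 2.
    return x ** 3 - x


def run_sgd(x0, lr=0.01, noise_std=0.5, steps=5000, seed=0):
    """Run 1-D SGD with additive Gaussian gradient noise from x0."""
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        x -= lr * (grad(x) + rng.gauss(0.0, noise_std))
    return x


if __name__ == "__main__":
    # Starts well inside each basin end near that basin's minimum.
    print(run_sgd(0.5), run_sgd(-0.5))
    # Starting exactly at the maximum, the final basin depends on the noise.
    print([round(run_sgd(0.0, seed=s), 2) for s in range(5)])
```

Starting from $x_0 = 0$ with different seeds gives a feel for the near-maximum case, where the noise alone decides which neighboring minimum is reached.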

Article 1907

Title@2025-05-24 (6): CMoE: Converting Mixture-of-Experts from Dense to Accelerate LLM Inference

Title: CMoE: Converting Mixture-of-Experts from Dense to Accelerate LLM Inference

CMoE: Konvertieren von Mischungen von Experten aus Dense zu beschleunigter LLM-Inferenz

CMoE: 将混合专家从高能转换为加速LLM推理 2502.04416v2

Authors: Zehua Pei, Lancheng Zou, Hui-Ling Zhen, Xianzhi Yu, Wulong Liu, Sinno Jialin Pan, Mingxuan Yuan, Bei Yu

Scaling large language models (LLMs) improves performance but dramatically increases inference costs. The feed-forward network (FFN), consuming approximately 70% of inference compute, represents a critical bottleneck, particularly in large batch size scenarios. While mixture-of-experts (MoE) architectures leverage activation sparsity for efficiency, converting existing dense models to MoEs traditionally requires resource-intensive continual pre-training. We present CMoE, a framework that rapidly transforms dense LLMs into MoEs without training. The key innovation lies in analyzing FFN neuron activations to partition them into shared (always active) and routed experts. Routed neurons are clustered using a balanced assignment algorithm, and a differentiable router is constructed analytically from activation statistics, enabling immediate deployment or optional lightweight fine-tuning. Experiments demonstrate that, with an activation ratio of 75%, it achieves remarkable results, delivering lossless precision in terms of perplexity while still maintaining a 5% acceleration. Further experiments reveal that a CMoE configuration activating just 25% of parameters reduces end-to-end latency by 1.5x while preserving usable perplexity without additional training. Moreover, a brief LoRA fine-tuning process (requiring only 1 hour and 2,000 samples) successfully recovers over 76% of the dense model’s downstream accuracy. By effectively balancing performance and efficiency, CMoE offers a viable path forward for deploying LLMs in real-world scenarios where computational resources are limited. We make our code publicly available at https://github.com/JarvisPei/CMoE.
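The shared/routed split described above can be pictured with a deliberately simplified sketch. It is not the CMoE algorithm: the abstract mentions a balanced assignment algorithm and an analytically built router, neither of which is reproduced here. As a stand-in, the hypothetical `partition_neurons` below marks the most frequently active neurons as shared and deals the rest round-robin into equal-size experts; the per-neuron activation rates are assumed to have been measured on calibration data.

```python
def partition_neurons(activation_rates, n_shared, n_experts):
    """Split FFN neuron indices into a shared group and equal-size experts.

    Simplified stand-in: the n_shared most frequently active neurons become
    shared; the rest are dealt round-robin (in order of activation rate) so
    every routed expert has the same size.
    """
    order = sorted(
        range(len(activation_rates)),
        key=lambda i: activation_rates[i],
        reverse=True,
    )
    shared = sorted(order[:n_shared])
    routed = order[n_shared:]
    if len(routed) % n_experts != 0:
        raise ValueError("routed neurons must divide evenly among experts")
    experts = [sorted(routed[e::n_experts]) for e in range(n_experts)]
    return shared, experts


def activation_ratio(n_shared, expert_size, n_active_experts, n_total):
    """Fraction of FFN neurons used per token under this layout."""
    return (n_shared + expert_size * n_active_experts) / n_total


if __name__ == "__main__":
    rates = [0.9, 0.1, 0.8, 0.2, 0.3, 0.4]
    shared, experts = partition_neurons(rates, n_shared=2, n_experts=2)
    print(shared, experts)
    print(activation_ratio(len(shared), len(experts[0]), 1, len(rates)))
```

Equal expert sizes keep the per-token cost independent of which expert the router picks, which is the practical reason for a balanced split.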

Article 1908

Title@2025-05-24 (6): Preserving AUC Fairness in Learning with Noisy Protected Groups

Title: Preserving AUC Fairness in Learning with Noisy Protected Groups

AUC Fairness beim Lernen mit geräuschgeschützten Gruppen bewahren

维护AUC在与噪音保护群体学习中的公平公平 2505.18532v1

Authors: Mingyang Wu, Li Lin, Wenbin Zhang, Xin Wang, Zhenhuan Yang, Shu Hu

The Area Under the ROC Curve (AUC) is a key metric for classification, especially under class imbalance, with growing research focus on optimizing AUC over accuracy in applications like medical image analysis and deepfake detection. Fairness in AUC optimization therefore becomes crucial, as biases can impact protected groups. While various fairness mitigation techniques exist, fairness considerations in AUC optimization remain in their early stages, with most research focusing on improving AUC fairness under the assumption of clean protected groups. However, these studies often overlook the impact of noisy protected groups, leading to fairness violations in practice. To address this, we propose the first robust AUC fairness approach under noisy protected groups, with theoretical fairness guarantees, using distributionally robust optimization. Extensive experiments on tabular and image datasets show that our method outperforms state-of-the-art approaches in preserving AUC fairness. The code is available at https://github.com/Purdue-M2/AUC_Fairness_with_Noisy_Groups.

Article 1909

Title@2025-05-24 (6): SMART: Self-Aware Agent for Tool Overuse Mitigation

Title: SMART: Self-Aware Agent for Tool Overuse Mitigation

SMART: Self-Aware Agent für Tool Overuse Mitigation

SMART: 减少工具过度使用自智能剂 2502.11435v2

Authors: Cheng Qian, Emre Can Acikgoz, Hongru Wang, Xiusi Chen, Avirup Sil, Dilek Hakkani-Tür, Gokhan Tur, Heng Ji

Current Large Language Model (LLM) agents demonstrate strong reasoning and tool use capabilities, but often lack self-awareness, failing to balance these approaches effectively. This imbalance leads to Tool Overuse, where models unnecessarily rely on external tools for tasks solvable with parametric knowledge, increasing computational overhead. Inspired by human metacognition, we introduce SMART (Strategic Model-Aware Reasoning with Tools), a paradigm that enhances an agent’s self-awareness to optimize task handling and reduce tool overuse. To support this paradigm, we introduce SMART-ER, a dataset spanning three domains, where reasoning alternates between parametric knowledge and tool-dependent steps, with each step enriched by rationales explaining when tools are necessary. Through supervised training, we develop SMARTAgent, a family of models that dynamically balance parametric knowledge and tool use. Evaluations show that SMARTAgent reduces tool use by 24% while improving performance by over 37%, enabling 7B-scale models to match their 70B counterpart and GPT-4o. Additionally, SMARTAgent generalizes to out-of-distribution test data like GSM8K and MINTQA, maintaining accuracy with just one-fifth the tool calls. These results highlight the potential of strategic tool use to enhance reasoning, mitigate overuse, and bridge the gap between model size and performance, advancing intelligent and resource-efficient agent designs.

Article 1910

Title@2025-05-24 (6): Compositional Generalization via Forced Rendering of Disentangled Latents

Title: Compositional Generalization via Forced Rendering of Disentangled Latents

Zusammensetzungelle Verallgemeinerung durch Zwangsverleumdung entwirrter Latente

通过强迫拆散的内流流流体 2501.18797v2

Authors: Qiyao Liang, Daoyuan Qian, Liu Ziyin, Ila Fiete

Composition, the ability to generate myriad variations from finite means, is believed to underlie powerful generalization. However, compositional generalization remains a key challenge for deep learning. A widely held assumption is that learning disentangled (factorized) representations naturally supports this kind of extrapolation. Yet, empirical results are mixed, with many generative models failing to recognize and compose factors to generate out-of-distribution (OOD) samples. In this work, we investigate a controlled 2D Gaussian “bump” generation task with fully disentangled (x,y) inputs, demonstrating that standard generative architectures still fail in OOD regions when training with partial data, by re-entangling latent representations in subsequent layers. By examining the model’s learned kernels and manifold geometry, we show that this failure reflects a “memorization” strategy for generation via data superposition rather than via composition of the true factorized features. We show that when models are forced (through architectural modifications with regularization, or through curated training data) to render the disentangled latents into the full-dimensional representational (pixel) space, they can be highly data-efficient and effective at composing in OOD regions. These findings underscore that disentangled latents in an abstract representation are insufficient and show that if models can represent disentangled factors directly in the output representational space, they can achieve robust compositional generalization.

Article 1911

Title@2025-05-24 (6): CLaDMoP: Learning Transferrable Models from Successful Clinical Trials via LLMs

Title: CLaDMoP: Learning Transferrable Models from Successful Clinical Trials via LLMs

CLaDMoP: Übertragbare Modelle aus erfolgreichen klinischen Studien über LLMs lernen

CLADMOP:通过LLMs成功临床试验学习可转让模型 2505.18527v1

Authors: Yiqing Zhang, Xiaozhong Liu, Fabricio Murai

Many existing models for clinical trial outcome prediction are optimized using task-specific loss functions on trial phase-specific data. While this scheme may boost prediction for common diseases and drugs, it can hinder learning of generalizable representations, leading to more false positives/negatives. To address this limitation, we introduce CLaDMoP, a new pre-training approach for clinical trial outcome prediction, alongside the Successful Clinical Trials (SCT) dataset, specifically designed for this task. CLaDMoP leverages a Large Language Model (to encode trials’ eligibility criteria) linked to a lightweight Drug-Molecule branch through a novel multi-level fusion technique. To efficiently fuse long embeddings across levels, we incorporate a grouping block, drastically reducing computational overhead. CLaDMoP avoids reliance on task-specific objectives by pre-training on a “pair matching” proxy task. Compared to established zero-shot and few-shot baselines, our method significantly improves both PR-AUC and ROC-AUC, especially for phase I and phase II trials. We further evaluate and perform ablation on CLaDMoP after Parameter-Efficient Fine-Tuning, comparing it to state-of-the-art supervised baselines, including MEXA-CTP, on the Trial Outcome Prediction (TOP) benchmark. CLaDMoP achieves up to 10.5% improvement in PR-AUC and 3.6% in ROC-AUC, while attaining comparable F1 score to MEXA-CTP, highlighting its potential for clinical trial outcome prediction. Code and SCT dataset can be downloaded from https://github.com/murai-lab/CLaDMoP.

Article 1912

Title@2025-05-24 (6): Scalable Gaussian Processes with Low-Rank Deep Kernel Decomposition

Title: Scalable Gaussian Processes with Low-Rank Deep Kernel Decomposition

Skalierbare Gauß-Prozesse mit niederrassiger Tiefenkernzersetzung

可缩放高斯进程,且低射深内核内核分解 2505.18526v1

Authors: Yunqin Zhu, Henry Shaowu Yuchi, Yao Xie

Kernels are key to encoding prior beliefs and data structures in Gaussian process (GP) models. The design of expressive and scalable kernels has garnered significant research attention. Deep kernel learning enhances kernel flexibility by feeding inputs through a neural network before applying a standard parametric form. However, this approach remains limited by the choice of base kernels, inherits high inference costs, and often demands sparse approximations. Drawing on Mercer’s theorem, we introduce a fully data-driven, scalable deep kernel representation where a neural network directly represents a low-rank kernel through a small set of basis functions. This construction enables highly efficient exact GP inference in linear time and memory without invoking inducing points. It also supports scalable mini-batch training based on a principled variational inference framework. We further propose a simple variance correction procedure to guard against overconfidence in uncertainty estimates. Experiments on synthetic and real-world data demonstrate the advantages of our deep kernel GP in terms of predictive accuracy, uncertainty quantification, and computational efficiency.
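To see why a low-rank kernel permits exact inference at linear cost in the number of data points, consider a kernel of the form $k(x, x') = \phi(x)^\top \phi(x')$ with $r$ basis functions. The posterior mean then needs only an $r \times r$ solve, $\phi(x_*)^\top (\Phi^\top \Phi + \sigma^2 I)^{-1} \Phi^\top y$, which by the push-through identity equals the usual $k_*^\top (K + \sigma^2 I)^{-1} y$. The sketch below checks this numerically. It is a generic illustration, not the paper's model: the fixed `features` function is a hypothetical stand-in for learned neural basis functions, and the small `solve` helper, the toy data and the noise level are assumptions made here.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def features(x):
    # Hypothetical fixed basis; the paper would use a neural network here.
    return [1.0, x, math.sin(x)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lowrank_gp_mean(xs, ys, x_star, noise=0.1):
    """Posterior mean via an r x r system; cost grows linearly in len(xs)."""
    Phi = [features(x) for x in xs]
    r = len(Phi[0])
    A = [[sum(p[i] * p[j] for p in Phi) + (noise if i == j else 0.0)
          for j in range(r)] for i in range(r)]
    b = [sum(p[i] * y for p, y in zip(Phi, ys)) for i in range(r)]
    return dot(solve(A, b), features(x_star))


def full_gp_mean(xs, ys, x_star, noise=0.1):
    """Same posterior mean via the n x n kernel matrix, for comparison."""
    Phi = [features(x) for x in xs]
    n = len(xs)
    K = [[dot(Phi[i], Phi[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, ys)
    k_star = [dot(p, features(x_star)) for p in Phi]
    return dot(k_star, alpha)


if __name__ == "__main__":
    xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
    ys = [0.1, 0.6, 0.9, 1.1, 0.8, 0.7]
    print(lowrank_gp_mean(xs, ys, 1.2), full_gp_mean(xs, ys, 1.2))
```

Building the $r \times r$ system touches each data point once, so time and memory scale with $n r^2$ and $r^2$ respectively instead of the cubic cost of the full kernel matrix.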

Article 1913

Title@2025-05-24 (6): LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs

Title: LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs

LiSTEN: Soft Token-Embeddings für neurale Audio-LLMs lernen

LISTEN: 神经音频LMS学习软软制嵌入器 2505.18517v1

Authors: Pooneh Mousavi, Shubham Gupta, Cem Subakan, Mirco Ravanelli

Foundation models based on large language models (LLMs) have shown great success in handling various tasks and modalities. However, adapting these models for general-purpose audio-language tasks is challenging due to differences in acoustic environments and task variations. In this work, we introduce LiSTEN (Learning Soft Token Embeddings for Neural Audio LLMs), a framework for adapting LLMs to speech and audio tasks. LiSTEN uses a dynamic prompt selection strategy with learnable key-value pairs, allowing the model to balance general and task-specific knowledge while avoiding overfitting in a multitask setting. Our approach reduces dependence on large-scale ASR or captioning datasets, achieves competitive performance with fewer trainable parameters, and simplifies training by using a single-stage process. Additionally, LiSTEN enhances interpretability by analyzing the diversity and overlap of selected prompts across different tasks.

Article 1914

Title@2025-05-24 (6): Test-Time Adaptation with Binary Feedback

Title: Test-Time Adaptation with Binary Feedback

Test-Zeit-Anpassung mit Binär-Feedback

带有二进制反馈的测试时间适应 2505.18514v1

Authors: Taeckyung Lee, Sorn Chottananurak, Junsu Kim, Jinwoo Shin, Taesik Gong, Sung-Ju Lee

Deep learning models perform poorly when domain shifts exist between training and test data. Test-time adaptation (TTA) is a paradigm to mitigate this issue by adapting pre-trained models using only unlabeled test samples. However, existing TTA methods can fail under severe domain shifts, while recent active TTA approaches requiring full-class labels are impractical due to high labeling costs. To address this issue, we introduce a new setting of TTA with binary feedback. This setting uses a few binary feedback inputs from annotators to indicate whether model predictions are correct, thereby significantly reducing the labeling burden of annotators. Under this setting, we propose BiTTA, a novel dual-path optimization framework that leverages reinforcement learning to balance binary feedback-guided adaptation on uncertain samples with agreement-based self-adaptation on confident predictions. Experiments show BiTTA achieves accuracy improvements of 13.3 percentage points over state-of-the-art baselines, demonstrating its effectiveness in handling severe distribution shifts with minimal labeling effort. The source code is available at https://github.com/taeckyung/BiTTA.

Article 1915

Title@2025-05-24 (6): Enhancing Training Data Attribution with Representational Optimization

Title: Enhancing Training Data Attribution with Representational Optimization

Verbesserung der Schulungsdatenzuweisung mit repräsentativer Optimierung

提高培训数据分配,优化代表性 2505.18513v1

Authors: Weiwei Sun, Haokun Liu, Nikhil Kandpal, Colin Raffel, Yiming Yang

Training data attribution (TDA) methods aim to measure how training data impacts a model’s predictions. While gradient-based attribution methods, such as influence functions, offer theoretical grounding, their computational costs make them impractical for large-scale applications. Representation-based approaches are far more scalable, but typically rely on heuristic embeddings that are not optimized for attribution, limiting their fidelity. To address these challenges, we propose AirRep, a scalable, representation-based approach that closes this gap by learning task-specific and model-aligned representations optimized explicitly for TDA. AirRep introduces two key innovations: a trainable encoder tuned for attribution quality, and an attention-based pooling mechanism that enables accurate estimation of group-wise influence. We train AirRep using a ranking objective over automatically constructed training subsets labeled by their empirical effect on target predictions. Experiments on instruction-tuned LLMs demonstrate that AirRep achieves performance on par with state-of-the-art gradient-based approaches while being nearly two orders of magnitude more efficient at inference time. Further analysis highlights its robustness and generalization across tasks and models. Our code is available at https://github.com/sunnweiwei/AirRep.

Article 1916

Title@2025-05-24 (6): AcuRank: Uncertainty-Aware Adaptive Computation for Listwise Reranking

Title: AcuRank: Uncertainty-Aware Adaptive Computation for Listwise Reranking

AcuRank: Ungewissheits-Bewusst-Adaptive-Computation für Listwise-Reranking

AcuRank: 列表排序的不确定性- 软件适应性计算 2505.18512v1

Authors: Soyoung Yoon, Gyuwan Kim, Gyu-Hwung Cho, Seung-won Hwang

Listwise reranking with large language models (LLMs) enhances top-ranked results in retrieval-based applications. Because of limited context size and the high inference cost of long contexts, reranking is typically performed over small subsets of fixed size, with the final ranking aggregated from these partial results. This fixed computation disregards query difficulty and document distribution, leading to inefficiencies. We propose AcuRank, an adaptive reranking framework that dynamically adjusts both the amount and target of computation based on uncertainty estimates over document relevance. Using a Bayesian TrueSkill model, we iteratively refine relevance estimates until reaching sufficient confidence levels, and our explicit modeling of ranking uncertainty enables principled control over reranking behavior and avoids unnecessary updates to confident predictions. Results on the TREC-DL and BEIR benchmarks show that our method consistently achieves a superior accuracy-efficiency trade-off and scales better with compute than fixed-computation baselines. These results highlight the effectiveness and generalizability of our method across diverse retrieval tasks and LLM-based reranking models.

Article 1917

Title@2025-05-24 (6): SPDEBench: An Extensive Benchmark for Learning Regular and Singular Stochastic PDEs

Title: SPDEBench: An Extensive Benchmark for Learning Regular and Singular Stochastic PDEs

SPDEBench: Ein umfassender Benchmark für das Lernen regelmäßiger und singulärer stochastischer PDEs

SPDEBENCH: 定期学习和单声速学项目的广泛基准 2505.18511v1

Authors: Zheyan Li, Yuantu Zhu, Hao Ni, Siran Li, Bingguang Chen, Qi Meng

Stochastic Partial Differential Equations (SPDEs) driven by random noise play a central role in modelling physical processes whose spatio-temporal dynamics can be rough, such as turbulent flows, superconductors, and quantum dynamics. To efficiently model these processes and make predictions, machine learning (ML)-based surrogate models have been proposed, with network architectures that incorporate the spatio-temporal roughness in their design. However, the field lacks extensive and unified datasets for SPDE learning; in particular, existing datasets do not account for the computational error introduced by noise sampling or for the renormalization required to handle singular SPDEs. We thus introduce SPDEBench, which is designed to solve typical SPDEs of physical significance (e.g., the $\Phi^4_d$, wave, incompressible Navier–Stokes, and KdV equations) on 1D or 2D tori driven by white noise via ML methods. New datasets for singular SPDEs based on the renormalization process have been constructed, and novel ML models achieving the best results to date have been proposed. In particular, we investigate the impact of computational error introduced by noise sampling and renormalization on the performance comparison of ML models and highlight the importance of selecting high-quality test data for accurate evaluation. Results are benchmarked against traditional numerical solvers and ML-based models, including FNO, NSPDE, and DLR-Net. It is shown that, for singular SPDEs, naively applying ML models on data without specifying the numerical schemes can lead to significant errors and misleading conclusions. Our SPDEBench provides an open-source codebase that ensures full reproducibility of benchmarking across a variety of SPDE datasets while offering the flexibility to incorporate new datasets and machine learning baselines, making it a valuable resource for the community.

Article 1918

Title@2025-05-24 (6): How Particle System Theory Enhances Hypergraph Message Passing

Title: How Particle System Theory Enhances Hypergraph Message Passing

Wie Partikelsystemtheorie die Hypergraph-Nachricht verbessert

粒子系统理论如何增强超光速消息传递 2505.18505v1

Authors: Yixuan Ma, Kai Yi, Pietro Lio, Shi Jin, Yu Guang Wang

Hypergraphs effectively model higher-order relationships in natural phenomena, capturing complex interactions beyond pairwise connections. We introduce a novel hypergraph message passing framework inspired by interacting particle systems, where hyperedges act as fields inducing shared node dynamics. By incorporating attraction, repulsion, and Allen-Cahn forcing terms, particles of varying classes and features achieve class-dependent equilibrium, enabling separability through the particle-driven message passing. We investigate both first-order and second-order particle system equations for modeling these dynamics, which mitigate over-smoothing and the effects of heterophily and can thus capture complete interactions. The more stable second-order system permits deeper message passing. Furthermore, we enhance deterministic message passing with a stochastic element to account for interaction uncertainties. We prove theoretically that our approach mitigates over-smoothing by maintaining a positive lower bound on the hypergraph Dirichlet energy during propagation, thus enabling hypergraph message passing to go deep. Empirically, our models demonstrate competitive performance on diverse real-world hypergraph node classification tasks, excelling on both homophilic and heterophilic datasets.

Article 1919

Title: Representation Learning with Mutual Influence of Modalities for Node Classification in Multi-Modal Heterogeneous Networks

Repräsentationslernen mit gegenseitigem Einfluss von Modalitäten für die Knotenklassifikation in multimodalen Heterogenen Netzwerken

多模式不同形式网络节点分类方式相互影响,代表学习 2505.07895v2

Authors: Jiafan Li, Jiaqi Zhu, Liang Chang, Yilin Li, Miaomiao Li, Yang Wang, Hongan Wang

Nowadays, numerous online platforms can be described as multi-modal heterogeneous networks (MMHNs), such as Douban’s movie networks and Amazon’s product review networks. Accurately categorizing nodes within these networks is crucial for analyzing the corresponding entities, which requires effective representation learning on nodes. However, existing multi-modal fusion methods often adopt either early fusion strategies, which may lose the unique characteristics of individual modalities, or late fusion approaches, which overlook the cross-modal guidance in GNN-based information propagation. In this paper, we propose a novel model for node classification in MMHNs, named Heterogeneous Graph Neural Network with Inter-Modal Attention (HGNN-IMA). It learns node representations by capturing the mutual influence of multiple modalities during the information propagation process, within the framework of the heterogeneous graph transformer. Specifically, a nested inter-modal attention mechanism is integrated into the inter-node attention to achieve adaptive multi-modal fusion, and modality alignment is also taken into account to encourage the propagation among nodes with consistent similarities across all modalities. Moreover, an attention loss is added to mitigate the impact of missing modalities. Extensive experiments validate the superiority of the model in the node classification task, providing an innovative view to handle multi-modal data, especially when accompanied by network structures.

Article 1920

Title@2025-05-24 (6): LiDAR-EDIT: LiDAR Data Generation by Editing the Object Layouts in Real-World Scenes

Title: LiDAR-EDIT: LiDAR Data Generation by Editing the Object Layouts in Real-World Scenes

LiDAR-EDIT: LiDAR-Datenerstellung durch Bearbeiten der Objektlayouts in realen Szenen

LiDAR-EDIT:通过在真实世界景点中编辑对象布局生成LIDAR数据 2412.00592v3

Authors: Shing-Hei Ho, Bao Thach, Minghan Zhu

We present LiDAR-EDIT, a novel paradigm for generating synthetic LiDAR data for autonomous driving. Our framework edits real-world LiDAR scans by introducing new object layouts while preserving the realism of the background environment. Compared to end-to-end frameworks that generate LiDAR point clouds from scratch, LiDAR-EDIT offers users full control over the object layout, including the number, type, and pose of objects, while keeping most of the original real-world background. Our method also provides object labels for the generated data. Compared to novel view synthesis techniques, our framework allows for the creation of counterfactual scenarios with object layouts significantly different from the original real-world scene. LiDAR-EDIT uses spherical voxelization to enforce correct LiDAR projective geometry in the generated point clouds by construction. During object removal and insertion, generative models are employed to fill the unseen background and object parts that were occluded in the original real LiDAR scans. Experimental results demonstrate that our framework produces realistic LiDAR scans with practical value for downstream tasks.

Article 1921

Title@2025-05-24 (6): EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents

Title: EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents

EscapeBench: Auf dem Weg zu mehr kreativer Intelligenz von Sprachmodell-Agenten

逃避:努力推进语言示范代理的创意智能 2412.13549v2

Authors: Cheng Qian, Peixuan Han, Qinyu Luo, Bingxiang He, Xiusi Chen, Yuji Zhang, Hongyi Du, Jiarui Yao, Xiaocheng Yang, Denghui Zhang, Yunzhu Li, Heng Ji

Language model agents excel in long-session planning and reasoning, but existing benchmarks primarily focus on goal-oriented tasks with explicit objectives, neglecting creative adaptation in unfamiliar environments. To address this, we introduce EscapeBench, a benchmark suite of room escape game environments designed to challenge agents with creative reasoning, unconventional tool use, and iterative problem-solving to uncover implicit goals. Our results show that current language models, despite employing working memory and Chain-of-Thought reasoning, achieve only 15% average progress without hints, highlighting their limitations in creativity. To bridge this gap, we propose EscapeAgent, a framework designed to enhance creative reasoning through Foresight (innovative tool use) and Reflection (identifying unsolved tasks). Experiments show that EscapeAgent can execute action chains over 1,000 steps while maintaining logical coherence. It navigates and completes games with up to 40% fewer steps and hints, performs robustly across difficulty levels, and achieves higher action success rates with more efficient and innovative puzzle-solving strategies.

Article 1922

Title@2025-05-24 (6): Perception-Informed Neural Networks: Beyond Physics-Informed Neural Networks

Title: Perception-Informed Neural Networks: Beyond Physics-Informed Neural Networks

Wahrnehmungs-informierte neurale Netzwerke: Jenseits physikinformierter neuraler Netzwerke

感知内化神经网络:超越物理内化神经网络 2505.03806v2

Authors: Mehran Mazandarani, Marzieh Najariyan

This article introduces Perception-Informed Neural Networks (PrINNs), a framework designed to incorporate perception-based information into neural networks, addressing both systems with known and unknown physics laws or differential equations. Moreover, PrINNs extend the concept of Physics-Informed Neural Networks (PINNs) and their variants, offering a platform for the integration of diverse forms of perception precisiation, including singular, probability distribution, possibility distribution, interval, and fuzzy graph. In fact, PrINNs allow neural networks to model dynamical systems by integrating expert knowledge and perception-based information through loss functions, enabling the creation of modern data-driven models. Some of the key contributions include Mixture of Experts Informed Neural Networks (MOEINNs), which combine heterogeneous expert knowledge into the network, and Transformed-Knowledge Informed Neural Networks (TKINNs), which facilitate the incorporation of meta-information for enhanced model performance. Additionally, Fuzzy-Informed Neural Networks (FINNs) as a modern class of fuzzy deep neural networks leverage fuzzy logic constraints within a deep learning architecture, allowing online training without pre-training and eliminating the need for defuzzification. PrINNs represent a significant step forward in bridging the gap between traditional physics-based modeling and modern data-driven approaches, enabling neural networks to learn from both structured physics laws and flexible perception-based rules. This approach empowers neural networks to operate in uncertain environments, model complex systems, and discover new forms of differential equations, making PrINNs a powerful tool for advancing computational science and engineering.

Article 1923

Title@2025-05-24 (6): Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection

Title: Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection

Gruppenadaptive Schwellenoptimierung für robuste KI-generierte Texterkennung

强力AI-发光的文本探测的集团-适应性阈值优化 2502.04528v4

Authors: Minseok Jung, Cynthia Fuertes Panizo, Liam Dugan, Yi R. Fung, Pin-Yu Chen, Paul Pu Liang

The advancement of large language models (LLMs) has made it difficult to differentiate human-written text from AI-generated text. Several AI-text detectors have been developed in response, which typically utilize a fixed global threshold (e.g., $\theta = 0.5$) to classify machine-generated text. However, one universal threshold could fail to account for distributional variations by subgroups. For example, when using a fixed threshold, detectors make more false positive errors on shorter human-written text, and more positive classifications of neurotic writing styles among long texts. These discrepancies can lead to misclassifications that disproportionately affect certain groups. We address this critical limitation by introducing FairOPT, an algorithm for group-specific threshold optimization for probabilistic AI-text detectors. We partitioned data into subgroups based on attributes (e.g., text length and writing style) and implemented FairOPT to learn decision thresholds for each group to reduce discrepancy. In experiments with nine AI text classifiers on three datasets, FairOPT decreases overall balanced error rate (BER) discrepancy by 12% while minimally sacrificing accuracy by 0.003%. Our framework paves the way for more robust classification in AI-generated content detection via post-processing.
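
To make the idea of group-specific thresholds concrete, here is a minimal sketch. It is not the FairOPT algorithm, whose objective and optimizer the abstract does not describe: it simply grid-searches one threshold per group to minimize the balanced error rate, BER = (FPR + FNR) / 2. The helper names, the grid, and the toy scores are all hypothetical.

```python
from collections import defaultdict


def balanced_error_rate(scores, labels, threshold):
    """BER = (false-positive rate + false-negative rate) / 2.

    Label 1 means AI-generated; a text is flagged when score >= threshold.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    fpr = sum(s >= threshold for s in negatives) / len(negatives) if negatives else 0.0
    fnr = sum(s < threshold for s in positives) / len(positives) if positives else 0.0
    return 0.5 * (fpr + fnr)


def fit_group_thresholds(scores, labels, groups):
    """Grid-search one threshold per group (hypothetical helper, not FairOPT itself)."""
    grid = [i / 100 for i in range(1, 100)]
    by_group = defaultdict(lambda: ([], []))
    for s, y, g in zip(scores, labels, groups):
        by_group[g][0].append(s)
        by_group[g][1].append(y)
    return {
        g: min(grid, key=lambda t: balanced_error_rate(sc, lb, t))
        for g, (sc, lb) in by_group.items()
    }


# Made-up detector scores: short human texts score high, long AI texts score low.
scores = [0.55, 0.60, 0.65, 0.80, 0.90, 0.10, 0.20, 0.40, 0.70]
labels = [0, 0, 0, 1, 1, 0, 0, 1, 1]
groups = ["short"] * 5 + ["long"] * 4

thresholds = fit_group_thresholds(scores, labels, groups)
for g in ("short", "long"):
    sc = [s for s, gg in zip(scores, groups) if gg == g]
    lb = [y for y, gg in zip(labels, groups) if gg == g]
    # Compare the fixed global threshold 0.5 with the per-group threshold.
    print(g, thresholds[g], balanced_error_rate(sc, lb, 0.5),
          balanced_error_rate(sc, lb, thresholds[g]))
```

On this toy data the fixed threshold of 0.5 flags every short human-written text, while the per-group thresholds separate both groups cleanly. Real detector scores overlap, so the per-group optimum would trade off errors instead of removing them.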

Article 1924

Title@2025-05-24 (6): Knowledge Grafting of Large Language Models

Title: Knowledge Grafting of Large Language Models

Wissen Graften von großen Sprachmodellen

大语言模式知识转让 2505.18502v1

Authors: Guodong Du, Xuanning Zhou, Junlin Li, Zhuo Li, Zesheng Shi, Wanyu Lin, Ho-Kin Tang, Xiucheng Li, Fangming Liu, Wenya Wang, Min Zhang, Jing Li

Cross-capability transfer is a key challenge in large language model (LLM) research, with applications in multi-task integration, model compression, and continual learning. Recent works like FuseLLM and FuseChat have demonstrated the potential of transferring multiple model capabilities to lightweight models, enhancing adaptability and efficiency, which motivates our investigation into more efficient cross-capability transfer methods. However, existing approaches primarily focus on small, homogeneous models, limiting their applicability. For large, heterogeneous models, knowledge distillation with full-parameter fine-tuning often overlooks the student model’s intrinsic capacity and risks catastrophic forgetting, while PEFT methods struggle to effectively absorb knowledge from source LLMs. To address these issues, we introduce GraftLLM, a novel method that stores source model capabilities in a target model in a SkillPack format. This approach preserves general capabilities, reduces parameter conflicts, and supports forget-free continual learning and model fusion. We employ a module-aware adaptive compression strategy to compress parameter updates, ensuring efficient storage while maintaining task-specific knowledge. The resulting SkillPack serves as a compact and transferable knowledge carrier, ideal for heterogeneous model fusion and continual learning. Experiments across various scenarios demonstrate that GraftLLM outperforms existing techniques in knowledge transfer, knowledge fusion, and forget-free learning, providing a scalable and efficient solution for cross-capability transfer. The code is publicly available at: https://github.com/duguodong7/GraftLLM.

Article 1925

Title@2025-05-24 (6): MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning

Title: MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning

MENTOR: Mixture-of-Experts-Netzwerk mit Task-Oriented Perturbation für visuelles Verstärkungslernen

MENTOR: 视力强化学习中以任务为导向的干扰干扰模拟专家网络 2410.14972v2

Authors: Suning Huang, Zheyu Zhang, Tianhai Liang, Yihan Xu, Zhehao Kou, Chenhao Lu, Guowei Xu, Zhengrong Xue, Huazhe Xu

Visual deep reinforcement learning (RL) enables robots to acquire skills from visual input for unstructured tasks. However, current algorithms suffer from low sample efficiency, limiting their practical applicability. In this work, we present MENTOR, a method that improves both the architecture and optimization of RL agents. Specifically, MENTOR replaces the standard multi-layer perceptron (MLP) with a mixture-of-experts (MoE) backbone and introduces a task-oriented perturbation mechanism. MENTOR outperforms state-of-the-art methods across three simulation benchmarks and achieves an average of 83% success rate on three challenging real-world robotic manipulation tasks, significantly surpassing the 32% success rate of the strongest existing model-free visual RL algorithm. These results underscore the importance of sample efficiency in advancing visual RL for real-world robotics. Experimental videos are available at https://suninghuang19.github.io/mentor_page/.

Article 1926

Title@2025-05-24 (6): G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning

Title: G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning

G1: LLMs zur Vernunft bringen bei Diagrammen mit Verstärkungslernen

G1:在加强学习的图表方面向理性者传授法学硕士 2505.18499v1

Authors: Xiaojun Guo, Ang Li, Yifei Wang, Stefanie Jegelka, Yisen Wang

Although Large Language Models (LLMs) have demonstrated remarkable progress, their proficiency in graph-related tasks remains notably limited, hindering the development of truly general-purpose models. Previous attempts, including pretraining graph foundation models or employing supervised fine-tuning, often face challenges such as the scarcity of large-scale, universally represented graph data. We introduce G1, a simple yet effective approach demonstrating that Reinforcement Learning (RL) on synthetic graph-theoretic tasks can significantly scale LLMs’ graph reasoning abilities. To enable RL training, we curate Erdős, the largest graph reasoning dataset to date, comprising 50 diverse graph-theoretic tasks of varying difficulty levels, with 100k training and 5k test examples, all derived from real-world graphs. With RL on Erdős, G1 obtains substantial improvements in graph reasoning, where our finetuned 3B model even outperforms Qwen2.5-72B-Instruct (24x size). RL-trained models also show strong zero-shot generalization to unseen tasks, domains, and graph encoding schemes, including other graph-theoretic benchmarks as well as real-world node classification and link prediction tasks, without compromising general reasoning abilities. Our findings offer an efficient, scalable path for building strong graph reasoners by finetuning LLMs with RL on graph-theoretic tasks, which combines the strengths of pretrained LLM capabilities with abundant, automatically generated synthetic data, suggesting that LLMs possess graph understanding abilities that RL can elicit successfully.

Article 1927

Title@2025-05-24 (6): Quantum Feature Space of a Qubit Coupled to an Arbitrary Bath

Title: Quantum Feature Space of a Qubit Coupled to an Arbitrary Bath

Quanten-Feature-Raum eines Qubits in Verbindung mit einem willkürlichen Bad

与任意浴室结合的Qubit夫妇的量量地貌空间 2505.03397v3

Authors: Chris Wise, Akram Youssry, Alberto Peruzzo, Jo Plested, Matt Woolley

Qubit control protocols have traditionally leveraged a characterisation of the qubit-bath coupling via its power spectral density. Previous work proposed the inference of noise operators that characterise the influence of a classical bath using a grey-box approach that combines deep neural networks with physics-encoded layers. This overall structure is complex and poses challenges in scaling and real-time operations. Here, we show that no expensive neural networks are needed and that this noise operator description admits an efficient parameterisation. We refer to the resulting parameter space as the quantum feature space of the qubit dynamics resulting from the coupled bath. We show that the Euclidean distance defined over the quantum feature space provides an effective method for classifying noise processes in the presence of a given set of controls. Using the quantum feature space as the input space for a simple machine learning algorithm (random forest, in this case), we demonstrate that it can effectively classify the stationarity and the broad class of noise processes perturbing a qubit. Finally, we explore how control pulse parameters map to the quantum feature space.

Article 1928

Title@2025-05-24 (6): FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers

Title: FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers

FuseGPT: Lernbare Ebenen Fusion generativer vortrainierter Transformer

FuseGPT: 训练前改造器的产生型先导变异器的可学习层融合 2411.14507v2

Authors: Zehua Pei, Hui-Ling Zhen, Xianzhi Yu, Sinno Jialin Pan, Mingxuan Yuan, Bei Yu

Generative Pre-trained Transformers (GPTs) have demonstrated remarkable performance across diverse domains, largely due to the extensive scaling of model parameters. Recent works have observed redundancy within transformer blocks and developed compression methods by structured pruning of less important blocks. However, such direct removal often leads to irreversible performance degradation. In this paper, we propose FuseGPT, a novel methodology designed to recycle pruned transformer blocks, thereby recovering the model’s performance. Firstly, we introduce a new importance detection metric, Macro Influence (MI), which evaluates the long-term impact of each transformer block by quantifying the information loss incurred upon its removal. Next, we propose group-level layer fusion, which leverages the parameters from layers of less important blocks and integrates them into the corresponding layers of neighboring blocks. This fusion process is not a one-time operation but is refined through iterative parameter updates by lightweight group-level fine-tuning. Specifically, the injected parameters are frozen but are weighted with learnable rank decomposition matrices to reduce the computational overhead during fine-tuning. Our approach not only works well for large language models but also for large multimodal models. Experimental results indicate that, even with modest amounts of data, FuseGPT surpasses previous methods in both perplexity and zero-shot task performance.

Article 1929

Title@2025-05-24 (6): Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking

Title: Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking

Beyond Masked and Unmasked: Diskrete Diffusion Models via Partial Masking

超越遮盖和无遮盖:通过部分遮盖分解扩散模型 2505.18495v1

Authors: Chen-Hao Chao, Wei-Fang Sun, Hanwen Liang, Chun-Yi Lee, Rahul G. Krishnan

Masked diffusion models (MDM) are powerful generative models for discrete data that generate samples by progressively unmasking tokens in a sequence. Each token can take one of two states: masked or unmasked. We observe that token sequences often remain unchanged between consecutive sampling steps; consequently, the model repeatedly processes identical inputs, leading to redundant computation. To address this inefficiency, we propose the Partial masking scheme (Prime), which augments MDM by allowing tokens to take intermediate states interpolated between the masked and unmasked states. This design enables the model to make predictions based on partially observed token information, and facilitates a fine-grained denoising process. We derive a variational training objective and introduce a simple architectural design to accommodate intermediate-state inputs. Our method demonstrates superior performance across a diverse set of generative modeling tasks. On text data, it achieves a perplexity of 15.36 on OpenWebText, outperforming previous MDM (21.52), autoregressive models (17.54), and their hybrid variants (17.58), without relying on an autoregressive formulation. On image data, it attains competitive FID scores of 3.26 on CIFAR-10 and 6.98 on ImageNet-32, comparable to leading continuous generative models.

Article 1930

Title@2025-05-24 (6): FedHL: Federated Learning for Heterogeneous Low-Rank Adaptation via Unbiased Aggregation

Title: FedHL: Federated Learning for Heterogeneous Low-Rank Adaptation via Unbiased Aggregation

FedHL: Föderiertes Lernen für heterogene Low-Rank-Anpassung durch unvoreingenommene Aggregation

FedHL:通过无偏见的聚合体进行异种性、低兰克低差异适应的联邦学习 2505.18494v1

Authors: Zihao Peng, Jiandian Zeng, Boyuan Li, Guo Li, Shengbo Chen, Tian Wang

Federated Learning (FL) facilitates the fine-tuning of Foundation Models (FMs) using distributed data sources, with Low-Rank Adaptation (LoRA) gaining popularity due to its low communication costs and strong performance. While recent work acknowledges the benefits of heterogeneous LoRA in FL and introduces flexible algorithms to support its implementation, our theoretical analysis reveals a critical gap: existing methods lack formal convergence guarantees due to parameter truncation and biased gradient updates. Specifically, adapting client-specific LoRA ranks necessitates truncating global parameters, which introduces inherent truncation errors and leads to subsequent inaccurate gradient updates that accumulate over training rounds, ultimately degrading performance. To address the above issues, we propose FedHL, a simple yet effective Federated Learning framework tailored for Heterogeneous LoRA. By leveraging the full-rank global model as a calibrated aggregation basis, FedHL eliminates the direct truncation bias from initial alignment with client-specific ranks. Furthermore, we derive the theoretically optimal aggregation weights by minimizing the gradient drift term in the convergence upper bound. Our analysis shows that FedHL guarantees $\mathcal{O}(1/\sqrt{T})$ convergence rate, and experiments on multiple real-world datasets demonstrate a 1-3% improvement over several state-of-the-art methods.
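
The truncation error mentioned above can be seen in a toy example. This is a sketch under assumptions, not FedHL's procedure: it assumes a global LoRA update of the form ΔW = B·A with rank 2, and a client that keeps only the first column of B and the first row of A (one simple way to truncate; the abstract does not say how existing methods do it). All matrices and helper names are made up.

```python
def matmul(a, b):
    """Plain-Python matrix product for small nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def truncate_lora(B, A, rank):
    """Keep the first `rank` columns of B and the first `rank` rows of A."""
    return [row[:rank] for row in B], A[:rank]


def frobenius_gap(X, Y):
    """Frobenius norm of X - Y."""
    return sum((x - y) ** 2 for rx, ry in zip(X, Y) for x, y in zip(rx, ry)) ** 0.5


# Hypothetical rank-2 global LoRA factors: B is 3x2, A is 2x3.
B = [[1.0, 0.5], [0.0, 2.0], [1.0, -1.0]]
A = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]]

full_update = matmul(B, A)                # what the global model encodes
B1, A1 = truncate_lora(B, A, rank=1)      # what a rank-1 client receives
truncated_update = matmul(B1, A1)

error = frobenius_gap(full_update, truncated_update)
print(f"truncation error (Frobenius norm): {error:.4f}")
```

The gap equals the norm of the dropped rank-one component, so the client starts each round from a model that differs from the global one. This is the kind of mismatch the abstract says accumulates over training rounds.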

Article 1931

Title@2025-05-24 (6): TextArena

Title: TextArena

TextArena

TextArena 文本 2504.11442v2

Authors: Leon Guertler, Bobby Cheng, Simon Yu, Bo Liu, Leshem Choshen, Cheston Tan

TextArena is an open-source collection of competitive text-based games for training and evaluation of agentic behavior in Large Language Models (LLMs). It spans 57+ unique environments (including single-player, two-player, and multi-player setups) and allows for easy evaluation of model capabilities via an online-play system (against humans and other submitted models) with real-time TrueSkill scores. Traditional benchmarks rarely assess dynamic social skills such as negotiation, theory of mind, and deception, creating a gap that TextArena addresses. Designed with research, community and extensibility in mind, TextArena emphasizes ease of adding new games, adapting the framework, testing models, playing against the models, and training models. Detailed documentation of environments, games, the leaderboard, and examples is available at https://github.com/LeonGuertler/TextArena and https://www.textarena.ai/.

Article 1932

Title@2025-05-24 (6): Statistical Inference under Performativity

Title: Statistical Inference under Performativity

Statistische Schlussfolgerung unter Performativität

性能下统计推断值 2505.18493v1

Authors: Xiang Li, Yunai Li, Huiying Zhong, Lihua Lei, Zhun Deng

Performativity of predictions refers to the phenomenon that prediction-informed decisions may influence the target they aim to predict, which is widely observed in policy-making in the social sciences and economics. In this paper, we initiate the study of statistical inference under performativity. Our contribution is two-fold. First, we build a central limit theorem for estimation and inference under performativity, which supports inferential tasks in policy-making such as constructing confidence intervals or testing hypotheses. Second, we leverage the derived central limit theorem to investigate prediction-powered inference (PPI) under performativity, which is based on a small labeled dataset and a much larger dataset of machine-learning predictions. This enables us to obtain more precise estimates and improved confidence regions for the model parameter (i.e., policy) of interest in performative prediction. We demonstrate the power of our framework through numerical experiments. To the best of our knowledge, this paper is the first to establish statistical inference under performativity, which brings up new challenges and inference settings that we believe will add significant value to policy-making, statistics, and machine learning.
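
The prediction-powered idea mentioned in this abstract can be illustrated with the standard PPI estimator of a mean: average the model's predictions on the large unlabeled set, then correct that average with the mean prediction error measured on the small labeled set. The sketch below covers only this classical, non-performative case; the function name `ppi_mean` and the toy numbers are assumptions for illustration and are not taken from the paper.

```python
from statistics import mean


def ppi_mean(labeled_y, labeled_pred, unlabeled_pred):
    """Prediction-powered estimate of E[Y] (classical, non-performative case)."""
    # Average model prediction on the large unlabeled dataset.
    pred_mean = mean(unlabeled_pred)
    # Rectifier: average prediction error measured on the small labeled dataset.
    rectifier = mean(y - f for y, f in zip(labeled_y, labeled_pred))
    return pred_mean + rectifier


# Hypothetical toy data: the model over-predicts by 0.5 on the labeled set.
labeled_y = [1.0, 2.0, 3.0]
labeled_pred = [1.5, 2.5, 3.5]
unlabeled_pred = [2.0, 3.0, 4.0, 5.0]

estimate = ppi_mean(labeled_y, labeled_pred, unlabeled_pred)
print(estimate)
```

The rectifier removes the systematic bias of the predictions, while the large unlabeled set keeps the variance of the first term low.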

Article 1933

Title@2025-05-24 (6): Synthesizing and Adapting Error Correction Data for Mobile Large Language Model Applications

Title: Synthesizing and Adapting Error Correction Data for Mobile Large Language Model Applications

Synchronisieren und Anpassen von Fehlerkorrekturdaten für mobile Großsprachen-Modellanwendungen

合成和调整移动大语言模型应用错误校正数据 2505.18488v1

Authors: Yanxiang Zhang, Zheng Xu, Shanshan Wu, Yuanbo Zhang, Daniel Ramage

Error correction is an important capability when applying large language models (LLMs) to facilitate user typing on mobile devices. In this paper, we use LLMs to synthesize a high-quality dataset of error correction pairs to evaluate and improve LLMs for mobile applications. We first prompt LLMs with error correction domain knowledge to build a scalable and reliable addition to the existing data synthesis pipeline. We then adapt the synthetic data distribution to match the mobile application domain by reweighting the samples. The reweighting model is learnt by predicting (a handful of) live A/B test metrics when deploying LLMs in production, given the LLM performance on offline evaluation data and scores from a small privacy-preserving on-device language model. Finally, we present best practices for mixing our synthetic data with other data sources to improve model performance on error correction in both offline evaluation and production live A/B testing.

Article 1934

Title@2025-05-24 (6): Grounding Bodily Awareness in Visual Representations for Efficient Policy Learning

Title: Grounding Bodily Awareness in Visual Representations for Efficient Policy Learning

Bodily Bewusstsein in visuellen Darstellungen für effizientes politisches Lernen geerdet

提高政策学习效率的视觉表现方面的共同认识 2505.18487v1

Authors: Junlin Wang, Zhiyun Lin

Learning effective visual representations for robotic manipulation remains a fundamental challenge due to the complex body dynamics involved in action execution. In this paper, we study how visual representations that carry body-relevant cues can enable efficient policy learning for downstream robotic manipulation tasks. We present Inter-token Contrast (ICon), a contrastive learning method applied to the token-level representations of Vision Transformers (ViTs). ICon enforces a separation in the feature space between agent-specific and environment-specific tokens, resulting in agent-centric visual representations that embed body-specific inductive biases. This framework can be seamlessly integrated into end-to-end policy learning by incorporating the contrastive loss as an auxiliary objective. Our experiments show that ICon not only improves policy performance across various manipulation tasks but also facilitates policy transfer across different robots. Project website: https://github.com/HenryWJL/icon

Article 1935

Title@2025-05-24 (6): The Prompt is Mightier than the Example

Title: The Prompt is Mightier than the Example

Die Aufforderung ist mächtiger als das Beispiel

火急比例子更强 2505.18485v1

Authors: Shengzhe Xu, Nikhil Muralidhar, Naren Ramakrishnan

Numerous recent prompt optimization approaches, such as chain-of-thought, have been shown to significantly improve the quality of content generated by large language models (LLMs). In-context learning (ICL), a recent paradigm in which a few representative examples guide content generation, has also led to strong improvements in the quality of LLM-generated content. This idea has been applied to great effect in synthetic tabular data generation, where LLMs, through effective use of ICL and prompt optimization, can generate data that approximate samples from complex, heterogeneous distributions based on representative examples. However, ensuring high-fidelity synthetic data often requires a very large number of ICL examples, which may be unavailable or costly to obtain. At the same time, as LLMs grow larger, their built-in prior knowledge becomes vast and can potentially substitute for specific data examples. In this paper, we introduce Knowledge-Guided Prompting (KGP) as a new knob in prompt optimization and explore the ability of KGP-based prompt optimization to offset the cost of ICL. Specifically, we ask "how many examples can a prompt substitute for?" and study KGP, where domain knowledge, either inferred or available, is explicitly injected into the prompt, reducing dependence on ICL examples. Our experiments systematically explore the trade-off between ICL and KGP, revealing an empirical scaling law that quantifies how the quality of generated synthetic data varies with increasing domain knowledge and decreasing example count. Our results demonstrate that knowledge-guided prompting can be a scalable alternative, or addition, to in-context examples, unlocking new approaches to synthetic data generation.

Article 1936

Title@2025-05-24 (6): DiffPuter: Empowering Diffusion Models for Missing Data Imputation

Title: DiffPuter: Empowering Diffusion Models for Missing Data Imputation

DiffPuter: Empowering Diffusion Modelle für fehlende Daten-Imputation

DiffPuter:赋予缺失数据计算传播模型权力 2405.20690v2

Authors: Hengrui Zhang, Liancheng Fang, Qitian Wu, Philip S. Yu

Generative models play an important role in missing data imputation in that they aim to learn the joint distribution of full data. However, applying advanced deep generative models (such as Diffusion models) to missing data imputation is challenging due to 1) the inherent incompleteness of the training data and 2) the difficulty in performing conditional inference from unconditional generative models. To deal with these challenges, this paper introduces DiffPuter, a tailored diffusion model combined with the Expectation-Maximization (EM) algorithm for missing data imputation. DiffPuter iteratively trains a diffusion model to learn the joint distribution of missing and observed data and performs an accurate conditional sampling to update the missing values using a tailored reversed sampling strategy. Our theoretical analysis shows that DiffPuter’s training step corresponds to the maximum likelihood estimation of data density (M-step), and its sampling step represents the Expected A Posteriori estimation of missing values (E-step). Extensive experiments across ten diverse datasets and comparisons with 17 different imputation methods demonstrate DiffPuter’s superior performance. Notably, DiffPuter achieves an average improvement of 6.94% in MAE and 4.78% in RMSE compared to the most competitive existing method.

Article 1937

Title@2025-05-24 (6): Change Point Detection in the Frequency Domain with Statistical Reliability

Title: Change Point Detection in the Frequency Domain with Statistical Reliability

Punkterkennung im Frequenzbereich mit statistischer Zuverlässigkeit ändern

具有统计可靠性的频率域的更改点探测 2502.03062v2

Authors: Akifumi Yamada, Tomohiro Shiraishi, Shuichi Nishino, Teruyuki Katsuoka, Kouichi Taji, Ichiro Takeuchi

Effective condition monitoring in complex systems requires identifying change points (CPs) in the frequency domain, as structural changes often arise across multiple frequencies. This paper extends recent advancements in statistically significant CP detection, based on Selective Inference (SI), to the frequency domain. The proposed SI method quantifies the statistical significance of detected CPs in the frequency domain using $p$-values, ensuring that the detected changes reflect genuine structural shifts in the target system. We address two major technical challenges to achieve this. First, we extend the existing SI framework to the frequency domain by appropriately utilizing the properties of the discrete Fourier transform (DFT). Second, we develop an SI method that provides valid $p$-values for CPs where changes occur across multiple frequencies. Experimental results demonstrate that the proposed method reliably identifies genuine CPs with strong statistical guarantees, enabling more accurate root-cause analysis in the frequency domain of complex systems.

Article 1938

Title@2025-05-24 (6): Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective

Title: Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective

Sigmoid-Selbstaufmerksamkeit hat eine geringere Probenkomplexität als Softmax-Selbstaufmerksamkeit: Eine Mischung aus Experten-Perspektive

与 Softmax自觉:混合专家视角相比,Sigmoid自觉的样本复杂性较低。 2502.00281v2

Authors: Fanqi Yan, Huy Nguyen, Pedram Akbarian, Nhat Ho, Alessandro Rinaldo

At the core of the popular Transformer architecture is the self-attention mechanism, which dynamically assigns softmax weights to each input token so that the model can focus on the most salient information. However, the softmax structure slows down the attention computation due to its row-wise nature, and it inherently introduces competition among tokens: as the weight assigned to one token increases, the weights of others decrease. This competitive dynamic may narrow the focus of self-attention to a limited set of features, potentially overlooking other informative characteristics. Recent experimental studies have shown that using the element-wise sigmoid function helps eliminate token competition and reduce the computational overhead. Despite these promising empirical results, a rigorous comparison between sigmoid and softmax self-attention mechanisms remains absent in the literature. This paper closes this gap by theoretically demonstrating that sigmoid self-attention is more sample-efficient than its softmax counterpart. Toward that goal, we represent the self-attention matrix as a mixture of experts and show that "experts" in sigmoid self-attention require significantly less data to achieve the same approximation error as those in softmax self-attention.
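
The token competition described in this abstract can be seen on a single row of attention scores: raising one score lowers every other softmax weight, whereas element-wise sigmoid weights for the other tokens are unchanged. The following is a minimal sketch with made-up scores; it illustrates only the normalization behavior, not the paper's sample-complexity analysis.

```python
import math


def softmax(scores):
    """Row-wise softmax: weights are coupled through the shared normalizer."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(scores):
    """Element-wise sigmoid: each weight depends only on its own score."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]


scores = [1.0, 2.0, 3.0]   # hypothetical attention scores for three tokens
boosted = [5.0, 2.0, 3.0]  # same row after raising the first token's score

soft_before, soft_after = softmax(scores), softmax(boosted)
sig_before, sig_after = sigmoid(scores), sigmoid(boosted)

# Softmax weight of token 2 shrinks; its sigmoid weight stays the same.
print(soft_before[1], soft_after[1])
print(sig_before[1], sig_after[1])
```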

Article 1939

Title@2025-05-24 (6): Provably Robust Training of Quantum Circuit Classifiers Against Parameter Noise

Title: Provably Robust Training of Quantum Circuit Classifiers Against Parameter Noise

Wahrscheinlich robustes Training von Quantum Circuit Klassifikatoren gegen Parametergeräusche

针对参数噪音的量子电路分级器的可证实的强力培训 2505.18478v1

Authors: Lucas Tecot, Di Luo, Cho-Jui Hsieh

Advancements in quantum computing have spurred significant interest in harnessing its potential for speedups over classical systems. However, noise remains a major obstacle to achieving reliable quantum algorithms. In this work, we present a provably noise-resilient training theory and algorithm to enhance the robustness of parameterized quantum circuit classifiers. Our method, with a natural connection to Evolutionary Strategies, guarantees resilience to parameter noise with minimal adjustments to commonly used optimization algorithms. Our approach is function-agnostic and adaptable to various quantum circuits, successfully demonstrated in quantum phase classification tasks. By developing provably guaranteed optimization theory with quantum circuits, our work opens new avenues for practical, robust applications of near-term quantum computers.

Article 1940

Title@2025-05-24 (6): CAPE: Covariate-Adjusted Pre-Training for Generalized Epidemic Time Series Forecasting

Title: CAPE: Covariate-Adjusted Pre-Training for Generalized Epidemic Time Series Forecasting

CAPE: Kovariat-adjustierte Vorschulung für generalisierte epidemische Zeitreihen

CAPE: 通用流行病时间序列预测共同调整前培训 2502.03393v3

Authors: Zewen Liu, Juntong Ni, Max S. Y. Lau, Wei Jin

Accurate forecasting of epidemic infection trajectories is crucial for safeguarding public health. However, limited data availability during emerging outbreaks and the complex interaction between environmental factors and disease dynamics present significant challenges for effective forecasting. In response, we introduce CAPE, a novel epidemic pre-training framework designed to harness extensive disease datasets from diverse regions and integrate environmental factors directly into the modeling process for more informed decision-making on downstream diseases. Based on a covariate adjustment framework, CAPE utilizes pre-training combined with hierarchical environment contrasting to identify universal patterns across diseases while estimating latent environmental influences. We have compiled a diverse collection of epidemic time series datasets and validated the effectiveness of CAPE under various evaluation scenarios, including full-shot, few-shot, zero-shot, cross-location, and cross-disease settings, where it outperforms the leading baseline by an average of 9.9% in full-shot and 14.3% in zero-shot settings.

Article 1941

Title@2025-05-24 (6): Using Large Language Models to Tackle Fundamental Challenges in Graph Learning: A Comprehensive Survey

Title: Using Large Language Models to Tackle Fundamental Challenges in Graph Learning: A Comprehensive Survey

Große Sprachmodelle nutzen, um grundlegende Herausforderungen im Graphenlernen zu bewältigen: Eine umfassende Umfrage

使用大语言模式应对图表学习中的基本挑战:全面调查 2505.18475v1

Authors: Mengran Li, Pengyu Zhang, Wenbin Xing, Yijia Zheng, Klim Zaporojets, Junzhou Chen, Ronghui Zhang, Yong Zhang, Siyuan Gong, Jia Hu, Xiaolei Ma, Zhiyuan Liu, Paul Groth, Marcel Worring

Graphs are a widely used paradigm for representing non-Euclidean data, with applications ranging from social network analysis to biomolecular prediction. Conventional graph learning approaches typically rely on fixed structural assumptions or fully observed data, limiting their effectiveness in more complex, noisy, or evolving settings. Real-world graph data often violates these assumptions, which leads to four fundamental challenges: (1) Incompleteness: real-world graphs have missing nodes, edges, or attributes; (2) Imbalance: the distributions of node or edge labels and of structures in real-world graphs are highly skewed; (3) Cross-domain Heterogeneity: graphs from different domains exhibit incompatible feature spaces or structural patterns; and (4) Dynamic Instability: graphs evolve over time in unpredictable ways. Recent advances in Large Language Models (LLMs) offer the potential to tackle these challenges by leveraging rich semantic reasoning and external knowledge. This survey provides a comprehensive review of how LLMs can be integrated with graph learning to address the aforementioned challenges. For each challenge, we review both traditional solutions and modern LLM-driven approaches, highlighting how LLMs contribute unique advantages. Finally, we discuss open research questions and promising future directions in this emerging interdisciplinary field. To support further exploration, we have curated a repository of recent advances on graph learning challenges: https://github.com/limengran98/Awesome-Literature-Graph-Learning-Challenges.

Article 1942

Title@2025-05-24 (6): Performance and Generalizability Impacts of Incorporating Geolocation into Deep Learning for Dynamic PM2.5 Estimation

Title: Performance and Generalizability Impacts of Incorporating Geolocation into Deep Learning for Dynamic PM2.5 Estimation

Leistung und Verallgemeinerbarkeit Auswirkungen der Einbeziehung von Geolocation in Deep Learning für dynamische PM2.5 Abschätzung

将地理定位纳入深入学习以进行动态PP2.5估算的绩效和通用性影响 2505.18461v1

Authors: Morteza Karimzadeh, Zhongying Wang, James L. Crooks

Deep learning models have demonstrated success in geospatial applications, yet quantifying the role of geolocation information in enhancing model performance and geographic generalizability remains underexplored. A new generation of location encoders has emerged with the goal of capturing attributes present at any given location for downstream use in predictive modeling. Because this is a nascent area of research, their evaluation has remained largely limited to static tasks such as species distributions or average temperature mapping. In this paper, we discuss and quantify the impact of incorporating geolocation into deep learning for a real-world application domain that is characteristically dynamic (with fast temporal change) and spatially heterogeneous at high resolutions: estimating surface-level daily PM2.5 levels using remotely sensed and ground-level data. We build on a recently published deep learning-based PM2.5 estimation model that achieves state-of-the-art performance on data observed in the contiguous United States. We examine three approaches for incorporating geolocation: excluding geolocation as a baseline, using raw geographic coordinates, and leveraging pretrained location encoders. We evaluate each approach under within-region (WR) and out-of-region (OoR) evaluation scenarios. Aggregate performance metrics indicate that while naive incorporation of raw geographic coordinates improves within-region performance by retaining the interpolative value of geographic location, it can hinder generalizability across regions. In contrast, pretrained location encoders like GeoCLIP enhance predictive performance and geographic generalizability for both WR and OoR scenarios. However, qualitative analysis reveals artifact patterns caused by high-degree basis functions and sparse upstream samples in certain areas, and ablation results indicate varying performance among location encoders…

Article 1943

Title@2025-05-24 (6): EdgeAgentX: A Novel Framework for Agentic AI at the Edge in Military Communication Networks

Title: EdgeAgentX: A Novel Framework for Agentic AI at the Edge in Military Communication Networks

EdgeAgentX: Ein neuartiges Framework für Agentische KI am Rand in militärischen Kommunikationsnetzwerken

EdgeAgengengenderX:军事通信网络边缘地带AAA剂性AI新框架 2505.18457v1

Authors: Abir Ray

This paper introduces EdgeAgentX, a novel framework integrating federated learning (FL), multi-agent reinforcement learning (MARL), and adversarial defense mechanisms, tailored for military communication networks. EdgeAgentX significantly improves autonomous decision-making, reduces latency, enhances throughput, and robustly withstands adversarial disruptions, as evidenced by comprehensive simulations.

Article 1944

Title@2025-05-24 (6): On the Limitations and Possibilities of Nash Regret Minimization in Zero-Sum Matrix Games under Noisy Feedback

Title: On the Limitations and Possibilities of Nash Regret Minimization in Zero-Sum Matrix Games under Noisy Feedback

Über die Einschränkungen und Möglichkeiten der Nash Regret Minimierung in Zero-Sum Matrix Games unter Noisy Feedback

根据噪音反馈在零-苏姆母体运动会中尽量减少纳什迟缓的限制和可能性 2306.13233v3

Authors: Arnab Maiti, Kevin Jamieson, Lillian J. Ratliff

This paper studies a variant of two-player zero-sum matrix games, where, at each timestep, the row player selects row $i$, the column player selects column $j$, and the row player receives a noisy reward with expected value $A_{i,j}$, along with noisy feedback on the input matrix $A$. The row player’s goal is to maximize their total reward against an adversarial column player. Nash regret, defined as the difference between the player’s total reward and the game’s Nash equilibrium value scaled by the time horizon $T$, is often used to evaluate algorithmic performance in zero-sum games. We begin by studying the limitations of existing algorithms for minimizing Nash regret. We show that standard algorithms (including Hedge, FTRL, and OMD), as well as the strategy of playing the Nash equilibrium of the empirical matrix, all incur $\Omega(\sqrt{T})$ Nash regret, even when the row player receives noisy feedback on the entire matrix $A$. Furthermore, we show that UCB for matrix games, a natural adaptation of the well-known bandit algorithm, also suffers $\Omega(\sqrt{T})$ Nash regret under bandit feedback. Notably, these lower bounds hold even in the simplest case of $2 \times 2$ matrix games, where the instance-dependent matrix parameters are constant. We next ask whether instance-dependent $\text{polylog}(T)$ Nash regret is achievable against adversarial opponents. We answer this affirmatively. In the full-information setting, we present the first algorithm for general $n \times m$ matrix games that achieves instance-dependent $\text{polylog}(T)$ Nash regret. In the bandit feedback setting, we design an algorithm with similar guarantees for the special case of $2 \times 2$ games, the same regime in which existing algorithms provably suffer $\Omega(\sqrt{T})$ regret despite the simplicity of the instance. Finally, we validate our theoretical results with empirical evidence.
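
To make the Nash regret definition above concrete, the sketch below computes the equilibrium value of a $2 \times 2$ zero-sum game in closed form and then the regret of a reward sequence. The sign convention (value times $T$ minus total reward), the function names, and the example matrix are assumptions chosen for illustration; the sketch is not one of the paper's algorithms.

```python
def game_value_2x2(A):
    """Nash equilibrium value of a 2x2 zero-sum game for the (maximizing) row player."""
    (a, b), (c, d) = A
    maximin = max(min(a, b), min(c, d))  # best guaranteed payoff with a pure row
    minimax = min(max(a, c), max(b, d))  # best cap the column player can enforce
    if maximin == minimax:
        # A pure-strategy saddle point exists.
        return maximin
    # Otherwise both players mix, and the value has the classical closed form.
    return (a * d - b * c) / (a + d - b - c)


def nash_regret(A, rewards):
    """Equilibrium value scaled by the horizon T, minus the total reward collected."""
    T = len(rewards)
    return T * game_value_2x2(A) - sum(rewards)


A = [[3.0, 1.0], [0.0, 2.0]]    # hypothetical payoff matrix, value 1.5
rewards = [1.0, 2.0, 1.0, 1.0]  # hypothetical rewards over T = 4 rounds
print(game_value_2x2(A), nash_regret(A, rewards))
```

In the noisy-feedback setting studied here, $A$ is not known to the player, so this calculation is something an analyst would use for evaluation rather than something the learner can run directly.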

Article 1945

Title@2025-05-24 (6): Reinforcement Learning for Stock Transactions

Title: Reinforcement Learning for Stock Transactions

Verstärkungslernen für Aktientransaktionen

证券交易强化学习 2505.16099v2

Authors: Ziyi Zhou, Nicholas Stern, Julien Laasri

Much research has been done to analyze the stock market. After all, if one can determine a pattern in the chaotic frenzy of transactions, then they could make a hefty profit from capitalizing on these insights. As such, the goal of our project was to apply reinforcement learning (RL) to determine the best time to buy a stock within a given time frame. With only a few adjustments, our model can be extended to identify the best time to sell a stock as well. In order to use the format of free, real-world data to train the model, we define our own Markov Decision Process (MDP) problem. These two papers [5] [6] helped us in formulating the state space and the reward system of our MDP problem. We train a series of agents using Q-Learning, Q-Learning with linear function approximation, and deep Q-Learning. In addition, we try to predict the stock prices using machine learning regression and classification models. We then compare our agents to see if they converge on a policy, and if so, which one learned the best policy to maximize profit on the stock market.
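
The tabular Q-Learning variant mentioned in this abstract rests on a single temporal-difference update. A minimal sketch follows; the state encoding, the `"buy"`/`"wait"` action names, and the reward are hypothetical placeholders, since the abstract does not specify the authors' MDP in detail.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q(s, a) toward the TD target."""
    # Greedy value of the next state under the current estimates.
    best_next = max(Q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
    return Q[(state, action)]


Q = defaultdict(float)     # all action values start at zero
actions = ["buy", "wait"]  # hypothetical action set

new_value = q_update(Q, state=0, action="wait", reward=1.0,
                     next_state=1, actions=actions, alpha=0.5, gamma=0.9)
print(new_value)
```

Linear function approximation and deep Q-Learning replace the table `Q` with a parameterized function but keep the same TD target.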

Article 1946

Title@2025-05-24 (6): Anchored Diffusion Language Model

Title: Anchored Diffusion Language Model

Verankertes Diffusions-Sprachenmodell

原成品的传播语言模式 2505.18456v1

Authors: Litu Rout, Constantine Caramanis, Sanjay Shakkottai

Diffusion Language Models (DLMs) promise parallel generation and bidirectional context, yet they underperform autoregressive (AR) models in both likelihood modeling and generated text quality. We identify that this performance gap arises when important tokens (e.g., key words or low-frequency words that anchor a sentence) are masked early in the forward process, limiting contextual information for accurate reconstruction. To address this, we introduce the Anchored Diffusion Language Model (ADLM), a novel two-stage framework that first predicts distributions over important tokens via an anchor network, and then predicts the likelihoods of missing tokens conditioned on the anchored predictions. ADLM significantly improves test perplexity on LM1B and OpenWebText, achieving up to 25.4% gains over prior DLMs, and narrows the gap with strong AR baselines. It also achieves state-of-the-art performance in zero-shot generalization across seven benchmarks and surpasses AR models in MAUVE score, which marks the first time a DLM generates better human-like text than an AR model. Theoretically, we derive an Anchored Negative Evidence Lower Bound (ANELBO) objective and show that anchoring improves sample complexity and likelihood modeling. Beyond diffusion, anchoring boosts performance in AR models and enhances reasoning in math and logic tasks, outperforming existing chain-of-thought approaches.

Article 1947

Title@2025-05-24 (6): On Minimax Estimation of Parameters in Softmax-Contaminated Mixture of Experts

Title: On Minimax Estimation of Parameters in Softmax-Contaminated Mixture of Experts

Zur Minimax-Abschätzung von Parametern in Softmax-kontaminierter Mischung von Experten

关于Softmax 被污染的专家混合体参数最小估计 2505.18455v1

Authors: Fanqi Yan, Huy Nguyen, Dung Le, Pedram Akbarian, Nhat Ho, Alessandro Rinaldo

The softmax-contaminated mixture of experts (MoE) model is deployed when a large-scale pre-trained model, which plays the role of a fixed expert, is fine-tuned for learning downstream tasks by including a new contamination part, or prompt, functioning as a new, trainable expert. Despite its popularity and relevance, the theoretical properties of the softmax-contaminated MoE have remained unexplored in the literature. In this paper, we study the convergence rates of the maximum likelihood estimator of gating and prompt parameters in order to gain insights into the statistical properties and potential challenges of fine-tuning with a new prompt. We find that the estimability of these parameters is compromised when the prompt acquires overlapping knowledge with the pre-trained model, in a sense that we make precise by formulating a novel analytic notion of distinguishability. Under distinguishability of the pre-trained and prompt models, we derive minimax optimal estimation rates for all the gating and prompt parameters. By contrast, when the distinguishability condition is violated, these estimation rates become significantly slower due to their dependence on the rate at which the prompt converges to the pre-trained model. Finally, we empirically corroborate our theoretical findings through several numerical experiments.

Article 1948

Title@2025-05-24 (6): $μ$-MoE: Test-Time Pruning as Micro-Grained Mixture-of-Experts

Title: $μ$-MoE: Test-Time Pruning as Micro-Grained Mixture-of-Experts

$μ$-MoE: Test-Time Pruning als Mikro-Grained Mixture-of-Experts

美元-MoE:作为微粒混合剂专家进行试验时休整 2505.18451v1

Authors: Toshiaki Koike-Akino, Jing Liu, Ye Wang

To tackle the huge computational demand of large foundation models, activation-aware compression techniques that require no retraining have been introduced. However, since these rely on calibration data, domain shift may arise for unknown downstream tasks. With a computationally efficient calibration, activation-aware pruning can be executed adaptively for every prompt while still achieving reduced complexity at inference. We formulate this as a mixture of micro-experts, called $\mu$-MoE. Several experiments demonstrate that $\mu$-MoE can dynamically adapt to task/prompt-dependent structured sparsity on the fly.
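
One common way to make pruning activation-aware is to score each weight by its magnitude times the magnitude of the input activation it multiplies, and keep only the top-scoring weights. The sketch below applies that score to one weight row for two different prompts. The scoring rule, the per-weight (unstructured) selection, and all names and numbers are assumptions for illustration; the abstract does not state the exact criterion used by $\mu$-MoE.

```python
def prune_row(weights, activations, keep):
    """Keep the `keep` weights with the largest |w_j| * |x_j| score; zero the rest."""
    scores = [abs(w) * abs(x) for w, x in zip(weights, activations)]
    ranked = sorted(range(len(weights)), key=lambda j: scores[j], reverse=True)
    kept = set(ranked[:keep])
    return [w if j in kept else 0.0 for j, w in enumerate(weights)]


weights = [0.5, -2.0, 0.1, 1.0]  # one hypothetical row of a weight matrix
prompt_a = [4.0, 0.1, 0.1, 1.0]  # hypothetical activation magnitudes for prompt A
prompt_b = [0.1, 1.0, 5.0, 0.2]  # hypothetical activation magnitudes for prompt B

pruned_a = prune_row(weights, prompt_a, keep=2)
pruned_b = prune_row(weights, prompt_b, keep=2)
print(pruned_a)
print(pruned_b)
```

Different prompts activate different weights, so the surviving subset changes per prompt, which is the sense in which each prompt selects its own "micro-experts".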

Article 1949

Title@2025-05-24 (6): Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting

Title: Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting

Breaking Silos: Adaptive Modellfusion löst bessere Zeitreihen voraus

破碎硅:适应性模型融合解锁更好的时间序列预测 2505.18442v1

Authors: Zhining Liu, Ze Yang, Xiao Lin, Ruizhong Qiu, Tianxin Wei, Yada Zhu, Hendrik Hamann, Jingrui He, Hanghang Tong

Time-series forecasting plays a critical role in many real-world applications. Although increasingly powerful models have been developed and achieved superior results on benchmark datasets, through a fine-grained sample-level inspection, we find that (i) no single model consistently outperforms others across different test samples, but instead (ii) each model excels in specific cases. These findings prompt us to explore how to adaptively leverage the distinct strengths of various forecasting models for different samples. We introduce TimeFuse, a framework for collective time-series forecasting with sample-level adaptive fusion of heterogeneous models. TimeFuse utilizes meta-features to characterize input time series and trains a learnable fusor to predict optimal model fusion weights for any given input. The fusor can leverage samples from diverse datasets for joint training, allowing it to adapt to a wide variety of temporal patterns and thus generalize to new inputs, even from unseen datasets. Extensive experiments demonstrate the effectiveness of TimeFuse in various long-/short-term forecasting tasks, achieving near-universal improvement over the state-of-the-art individual models. Code is available at https://github.com/ZhiningLiu1998/TimeFuse.
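
Sample-level fusion as described in this abstract amounts to a per-input weighted combination of the base models' forecasts. The sketch below uses softmax-normalized weights as a stand-in for the output of a learned fusor; the softmax normalization, the function names, and the numbers are assumptions for illustration and do not reflect the TimeFuse codebase.

```python
import math


def softmax(logits):
    """Turn raw fusor outputs into non-negative weights that sum to one."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(forecasts, logits):
    """Weighted average of per-model forecasts, computed per horizon step."""
    weights = softmax(logits)
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(horizon)]


# Two hypothetical base models forecasting a horizon of two steps.
forecasts = [[1.0, 2.0], [3.0, 4.0]]

equal = fuse(forecasts, [0.0, 0.0])      # equal weights: plain average
skewed = fuse(forecasts, [10.0, -10.0])  # fusor strongly prefers the first model
print(equal, skewed)
```

In the setting of the abstract, the logits would be predicted from meta-features of each input series, so the weights differ from sample to sample.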

Article 1950

Title@2025-05-24 (6): DB-KSVD: Scalable Alternating Optimization for Disentangling High-Dimensional Embedding Spaces

Title: DB-KSVD: Scalable Alternating Optimization for Disentangling High-Dimensional Embedding Spaces

DB-KSVD: Skalierbare alternierende Optimierung für das Entwirren hochdimensionaler Einbettungsräume

DB-KSVD: 拆分高多元嵌入空间的可缩放变换最佳优化 2505.18441v1

Authors: Romeo Valentin, Sydney M. Katz, Vincent Vanhoucke, Mykel J. Kochenderfer

Dictionary learning has recently emerged as a promising approach for mechanistic interpretability of large transformer models. Disentangling high-dimensional transformer embeddings, however, requires algorithms that scale to high-dimensional data with large sample sizes. Recent work has explored sparse autoencoders (SAEs) for this problem. However, SAEs use a simple linear encoder to solve the sparse encoding subproblem, which is known to be NP-hard. It is therefore interesting to understand whether this structure is sufficient to find good solutions to the dictionary learning problem or whether a more sophisticated algorithm could find better solutions. In this work, we propose Double-Batch KSVD (DB-KSVD), a scalable dictionary learning algorithm that adapts the classic KSVD algorithm. DB-KSVD is informed by the rich theoretical foundations of KSVD but scales to datasets with millions of samples and thousands of dimensions. We demonstrate the efficacy of DB-KSVD by disentangling embeddings of the Gemma-2-2B model and evaluating on six metrics from the SAEBench benchmark, where we achieve competitive results compared to established approaches based on SAEs. By matching SAE performance with an entirely different optimization approach, our results suggest that (i) SAEs do find strong solutions to the dictionary learning problem and (ii) traditional optimization approaches can be scaled to the required problem sizes, offering a promising avenue for further research. We provide an implementation of DB-KSVD at https://github.com/RomeoV/KSVD.jl.

Article 1951

Title@2025-05-24 (6): Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methods for Decentralized Multi-Agent Reinforcement Learning

Title: Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methods for Decentralized Multi-Agent Reinforcement Learning

Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methoden für dezentralisiertes Mehr-Agenten-Verstärkungs-Lernen

分散式多机构强化学习的深神经立体-集中式多机构强化学习方法中全球最佳程度趋同 2505.18433v1

Authors: Zhiyao Zhang, Myeung Suk Oh, FNU Hairi, Ziyue Luo, Alvaro Velasquez, Jia Liu

Actor-critic methods for decentralized multi-agent reinforcement learning (MARL) facilitate collaborative optimal decision making without centralized coordination, thus enabling a wide range of applications in practice. To date, however, most theoretical convergence studies of existing actor-critic decentralized MARL methods are limited to guaranteeing convergence to a stationary solution under linear function approximation. This leaves a significant gap between the highly successful use of deep neural actor-critic methods for decentralized MARL in practice and the current theoretical understanding. To bridge this gap, we make the first attempt to develop a deep neural actor-critic method for decentralized MARL in which both the actor and critic components are inherently non-linear. We show that our proposed method enjoys a global optimality guarantee with a finite-time convergence rate of O(1/T), where T is the total number of iterations. This marks the first global convergence result for deep neural actor-critic methods in the MARL literature. We also conduct extensive numerical experiments, which verify our theoretical results.

Article 0

Title@2025-05-29 (4): From Chat Logs to Collective Insights: Aggregative Question Answering

Article 1

Title@2025-05-29 (4): Differential Information: An Information-Theoretic Perspective on Preference Optimization

Article 2

Title@2025-05-29 (4): Model Immunization from a Condition Number Perspective

Article 3

Title@2025-05-29 (4): Puzzled by Puzzles: When Vision-Language Models Can’t Take a Hint

Article 4

Title@2025-05-29 (4): REOrdering Patches Improves Vision Models

Article 5

Title@2025-05-29 (4): Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences?

Article 6

Title@2025-05-29 (4): Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Article 7

Title@2025-05-29 (4): To Trust Or Not To Trust Your Vision-Language Model’s Prediction

Article 8

Title@2025-05-29 (4): On the Convergence Analysis of Muon

Article 9

Title@2025-05-29 (4): EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast

Article 10

Title@2025-05-29 (4): Keep Everyone Happy: Online Fair Division of Numerous Items with Few Copies

Article 11

Title@2025-05-29 (4): MuLoCo: Muon is a practical inner optimizer for DiLoCo

Article 12

Title@2025-05-29 (4): SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA

Article 13

Title@2025-05-29 (4): ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering

Article 14

Title@2025-05-29 (4): Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields

Article 15

Title@2025-05-29 (4): DiffER: Categorical Diffusion for Chemical Retrosynthesis

Article 16

Title@2025-05-29 (4): COBRA: Contextual Bandit Algorithm for Ensuring Truthful Strategic Agents

Article 17

Title@2025-05-29 (4): FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control

Article 18

Title@2025-05-29 (4): TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning

Article 19

Title@2025-05-29 (4): Foundation Model Hidden Representations for Heart Rate Estimation from Auscultation

Article 20

Title@2025-05-29 (4): Skin Lesion Phenotyping via Nested Multi-modal Contrastive Learning

Article 21

Title@2025-05-29 (4): Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better

Article 22

Title@2025-05-29 (4): (U)NFV: Supervised and Unsupervised Neural Finite Volume Methods for Solving Hyperbolic PDEs

Article 23

Title@2025-05-29 (4): DiCoFlex: Model-agnostic diverse counterfactuals with flexible control

Article 24

Title@2025-05-29 (4): Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms

Article 25

Title@2025-05-29 (4): On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures

Article 26

Title@2025-05-29 (4): From Individual Experience to Collective Evidence: A Reporting-Based Framework for Identifying Systemic Harms

Article 27

Title@2025-05-29 (4): Mobi-$π$: Mobilizing Your Robot Learning Policy

Article 28

Title@2025-05-29 (4): Unifying Perspectives: Plausible Counterfactual Explanations on Global, Group-wise, and Local Levels

Article 29

Title@2025-05-29 (4): Learning Compositional Functions with Transformers from Easy-to-Hard Data

Article 30

Title@2025-05-29 (4): Understanding Mode Connectivity via Parameter Space Symmetry

Article 31

Title@2025-05-29 (4): SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem

Article 32

Title@2025-05-29 (4): Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds

Article 33

Title@2025-05-29 (4): GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents

Article 34

Title@2025-05-29 (4): Maximizing Confidence Alone Improves Reasoning

Article 35

Title@2025-05-29 (4): SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression

Article 36

Title@2025-05-29 (4): LoLA: Low-Rank Linear Attention With Sparse Caching

Article 37

Title@2025-05-29 (4): AMBER: Adaptive Mesh Generation by Iterative Mesh Resolution Prediction

Article 38

Title@2025-05-29 (4): Bayesian Perspective on Memorization and Reconstruction

Article 39

Title@2025-05-29 (4): Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation