• 00 06-12 (4) Choreographic Quick Changes: First-Class Location (Set) Polymorphism Choreographische schnelle Änderungen: Standort der ersten Klasse (Set) Polymorphismus 舞蹈快速变化:第一类位置(Set)多形态 2506.10913v1
  • 01 06-12 Solving Package Management via Hypergraph Dependency Resolution Lösung des Paketmanagements über Hypergraph Dependency Resolution 通过电报依赖决议解决软件包管理 2506.10803v1
  • 02 06-12 Hazel Deriver: A Live Editor for Constructing Rule-Based Derivations Hazel Deriver: Ein Live-Editor für die Konstruktion regelbasierter Ableitungen Hazel Deriver: 建筑基于规则的衍生物现场编辑 2506.10781v1
  • 03 06-12 Weaver: A Retargetable Compiler Framework for FPQA Quantum Architectures Weaver: Ein Retargetable Compiler Framework für FPQA Quantenarchitekturen Weaver:FPQA量度结构的可重新瞄准的汇编者框架 2409.07870v2
  • 04 06-12 CompilerDream: Learning a Compiler World Model for General Code Optimization CompilerDream: Lernen eines Compiler-Weltmodells für die allgemeine Code-Optimierung 汇编者:学习编纂者世界通用守则优化模式 2404.16077v3
  • 05 06-11 (3) Reward Models Enable Scalable Code Verification by Trading Accuracy for Throughput Reward-Modelle ermöglichen eine skalierbare Code-Überprüfung durch den Handel mit Genauigkeit für Durchsatz 通过交易准确性对交易流量的可缩放代码校验 2506.10056v1
  • 06 06-10 (2) ClassInvGen: Class Invariant Synthesis using Large Language Models ClassInvGen: Class Invariant Synthesis mit großen Sprachmodellen 类 InvGen: 使用大语言模型的分类变量合成 2502.18917v2
  • 07 06-10 Gradual Metaprogramming Stufenweise Metaprogrammierung 渐进元元方案 2506.09043v1
  • 08 06-10 Program Synthesis from Partial Traces Programmsynthese aus partiellen Spuren 部分跟踪程序合成 2504.14480v3
  • 09 06-10 Outcome Logic: A Unified Approach to the Metatheory of Program Logics with Branching Effects Ergebnis-Logik: Ein einheitlicher Ansatz zur Metatheorie von Programm-Logik mit Verzweigungseffekten 结果逻辑:对具有分流效应的方案逻辑比喻的统一方法 2401.04594v3
  • 10 06-10 Linguine: A Natural-Language Programming Language with Formal Semantics and a Clean Compiler Pipeline Linguine: Eine natursprachliche Programmiersprache mit formaler Semantik und einer sauberen Compiler-Pipeline 语言:一种自然语言-语言编程语言,有正式的语义和清洁编译管道 2506.08396v1
  • 11 06-10 A Language-Agnostic Logical Relation for Message-Passing Protocols Eine sprach-agnostische Logische Beziehung für Message-Passing-Protokolle 发送信件协议的语言不可接受逻辑关系 2506.10026v1
  • 12 06-09 (1) Verification of the Release-Acquire Semantics Überprüfung des Release-Acquire Semantics 释放-获取语义学的核查 2506.08238v1
  • 13 06-09 Execution-Aware Program Reduction for WebAssembly via Record and Replay Execution-Aware Programmreduktion für WebAssembly über Aufzeichnung und Wiedergabe 通过录制和重放减少网络摄像头的 执行软件程序 2506.07834v1
  • 14 06-09 Pel, A Programming Language for Orchestrating AI Agents Pel, eine Programmiersprache für die Orchestrierung von KI-Agenten Pel, 用于指挥AI代理的编程语言 2505.13453v2
  • 15 06-08 (7) From Tool Calling to Symbolic Thinking: LLMs in a Persistent Lisp Metaprogramming Loop Vom Tool Calling zum Symbolischen Denken: LLMs in einem persistenten Lisp Metaprogramming Loop 从工具呼叫到符号思维:在持久性 Lisp 元方案化循环中的LLMs 2506.10021v1
  • 16 06-08 From Informal to Formal – Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs Vom informellen zum formalen – Einbinden und Bewerten von LLMs über natürliche Sprachanforderungen bis hin zu überprüfbaren Formalproofs 从非正式到正式 – – 纳入和评价关于自然语言要求与可核实的正式证明之间的LLMs 2501.16207v4
  • 17 06-08 Two-sorted algebraic decompositions of Brookes’s shared-state denotational semantics Zwei-sortierte algebraische Zersetzungen von Brookes’ shared-state denotational semantics 布鲁克斯的 共同状态分解语义学的 双组代数分解 2501.15104v3
  • 18 06-07 (6) VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification VeriThoughts: Automatisierte Generierung von Verilog-Codes mittels Begründung und formaler Überprüfung Verithoughts: 利用理由说明和正式核查,使自动生成Verilog码 2505.20302v2
  • 19 06-07 Validating Quantum State Preparation Programs Validierung von Quantenzustandsvorbereitungsprogrammen 验证量子州编制方案 2501.05616v3
  • 20 06-07 Object-Spatial Programming Objekträumliche Programmierung 物体空间方案拟订 2503.15812v6
  • 21 06-07 Hadamard-$Π$: Equational Quantum Programming Hadamard-$Π$: Gleichungsbasierte Quantenprogrammierung Hadamard-$Π$: 等式量子编程 2506.06835v1
  • 22 06-07 Denotational Semantics for Probabilistic and Concurrent Programs Denotationelle Semantik für probabilistische und gleichzeitige Programme 概率和同时方案的说明性代记性语义学 2503.02768v2
  • 23 06-06 (5) Simplifying explicit subtyping coercions in a polymorphic calculus with effects Vereinfachung von expliziten Subtyping-Zwangen in einem polymorphen Kalkül mit Effekten 在具有效果的多形态微积分中简化显性亚型强制 2404.04218v2
  • 24 06-06 Reasoning about External Calls Begründung externer Anrufe 外部呼叫理由 2506.06544v1
  • 25 06-06 CompilerGPT: Leveraging Large Language Models for Analyzing and Acting on Compiler Optimization Reports CompilerGPT: Nutzung großer Sprachmodelle zur Analyse und Umsetzung von Compiler-Optimierungsberichten 汇编者最佳化报告:利用大语言模型进行分析并采取行动 2506.06227v1
  • 26 06-06 HEC: Equivalence Verification Checking for Code Transformation via Equality Saturation HEC: Überprüfung der Gleichwertigkeit auf Code-Transformation durch Gleichstellungssättigung HEC: 通过平等饱和对代码转换进行等同核查 2506.02290v2
  • 27 06-06 A Sound and Complete Characterization of Fair Asynchronous Session Subtyping Eine Klang- und vollständige Charakterisierung von Fair Asynchron Session Subtyping 公平非同步届会的健全和完整特点 2506.06078v1
  • 28 06-06 An Execution Model for RICE Ein Ausführungsmodell für RICE RICE 执行模型 2506.05839v1
  • 29 06-06 Mirage: A Multi-Level Superoptimizer for Tensor Programs Mirage: Ein Multi-Level-Superoptimizer für Tensor-Programme 幻影:向导方案多层次超强激励器 2405.05751v3
  • 30 06-06 CoopetitiveV: Leveraging LLM-powered Coopetitive Multi-Agent Prompting for High-quality Verilog Generation CoopetitiveV: LLM-powered Coopetitive Multi-Agent für hochwertige Verilog-Generation 协作V:利用LLM-动力协同协作的多方协作促进高品质活性一代 2412.11014v2
  • 31 06-06 Autocomp: LLM-Driven Code Optimization for Tensor Accelerators Autocomp: LLM-gesteuerte Code-Optimierung für Tensor-Beschleuniger 自动comp: LLM- Driven 代码对 Tensor 加速器的优化 2505.18574v2
  • 32 06-05 (4) Classical notions of computation and the Hasegawa-Thielecke theorem Klassische Begriffe der Berechnung und das Hasegawa-Thielecke-Theorem 经典的计算概念和长谷川-希列克定理 2502.13033v2
  • 33 06-05 Efficient Formal Verification of Quantum Error Correcting Programs Effiziente formale Überprüfung von Quantenfehler-Korrekturprogrammen 量化错误纠正程序的有效正式核实 2504.07732v2
  • 34 06-05 hdl2v: A Code Translation Dataset for Enhanced LLM Verilog Generation hdl2v: Ein Code-Übersetzungsdatensatz für verbesserte LLM Verilog-Generierung hdl2v: 用于强化LLM Verilog 生成的代码翻译数据集 2506.04544v1

Article 0

Title@2025-06-12 (4): Choreographic Quick Changes: First-Class Location (Set) Polymorphism

Title: Choreographic Quick Changes: First-Class Location (Set) Polymorphism Choreographische schnelle Änderungen: Standort der ersten Klasse (Set) Polymorphismus 舞蹈快速变化:第一类位置(Set)多形态 2506.10913v1

Authors (3): Ashley Samuelson, Andrew K. Hirsch, Ethan Cecchetti

Choreographic programming is a promising new paradigm for programming concurrent systems where a developer writes a single centralized program that compiles to individual programs for each node. Existing choreographic languages, however, lack critical features integral to modern systems, like the ability of one node to dynamically compute who should perform a computation and send that decision to others. This work addresses this gap with $\lambda_{QC}$, the first typed choreographic language with first-class process names and polymorphism over both types and (sets of) locations. $\lambda_{QC}$ also improves expressive power over previous work by supporting algebraic and recursive data types as well as multiply-located values. We formalize and mechanically verify our results in Rocq, including the standard choreographic guarantee of deadlock freedom.

舞蹈编程是编程并行系统的一个有希望的新范例,在编程中,一位开发者为每个节点的单个程序编译一个单一的集中程序。然而,现有的舞蹈语言缺乏现代系统所不可或缺的关键特征,例如一个节点能够动态计算谁应进行计算并将决定发送给其他人。这项工作用美元($)解决了这一差距,这是首种带有 emph{ 一级进程名称的舞蹈语言,以及两种类型和(类)地点的多元形态主义。$(lambda}C})也通过支持代数和循环数据类型以及倍数定位值,提高了对以往工作的表达力。我们正式和机械地核查了我们在罗克(Rocq)中的结果,包括僵局自由的标准舞蹈保障。


Article 1

Title@2025-06-12 (4): Solving Package Management via Hypergraph Dependency Resolution

Title: Solving Package Management via Hypergraph Dependency Resolution Lösung des Paketmanagements über Hypergraph Dependency Resolution 通过电报依赖决议解决软件包管理 2506.10803v1

Authors (10): Ryan Gibb, Patrick Ferris, David Allsopp, Michael Winston Dales, Mark Elvers, Thomas Gazagnaire, Sadiq Jaffer, Thomas Leonard, Jon Ludlam, Anil Madhavapeddy

Package managers are everywhere, with seemingly every language and operating system implementing their own solution. The lack of interoperability between these systems means that multi-lingual projects are unable to express precise dependencies across language ecosystems, and external system and hardware dependencies are typically implicit and unversioned. We define HyperRes, a formal system for describing versioned dependency resolution using a hypergraph that is expressive enough to model many ecosystems and solve dependency constraints across them. We define translations from dozens of existing package managers to HyperRes and comprehensively demonstrate that dependency resolution can work across ecosystems that are currently distinct. This does not require users to shift their choice of package managers; instead, HyperRes allows for the translation of packaging metadata between ecosystems, and for solving to be precisely specialised to a particular deployment environment.

软件包管理员无处不在,看似每一种语言和业务系统都有自己的解决方案。这些系统之间缺乏互操作性,这意味着多种语文项目无法表达不同语言生态系统的精确依赖性,外部系统和硬件依赖性通常是隐含和没有反向的。我们定义了HyperRes,这是一个正式的系统,用于用高压图描述已版本的依赖性分辨率,该高压图足以模拟许多生态系统并解决它们之间的依赖性制约。我们定义了从几十个现有软件包管理员到HyperRes的翻译,并全面表明依赖性解决方案可在目前不同生态系统之间发挥作用。这不要求用户改变对软件包管理员的选择;相反,HyperRes允许在生态系统之间翻译包装元数据,并精确地专门解决特定部署环境的问题。
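
As a rough illustration of the versioned dependency resolution this abstract describes, the sketch below runs a backtracking search over a toy package index in which each dependency is an edge from one (package, version) pair to a set of acceptable versions of another package. It is not the HyperRes formalism or its solver; the index layout, the package names, and the `resolve` function are all assumptions made up for this example.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy index: (package, version) -> list of dependency edges,
# each edge being (dependency name, set of acceptable versions).
Edge = Tuple[str, Set[int]]
Index = Dict[Tuple[str, int], List[Edge]]

INDEX: Index = {
    ("app", 1): [("libfoo", {1, 2}), ("libbar", {2})],
    ("libfoo", 1): [],
    ("libfoo", 2): [("libbar", {1})],
    ("libbar", 1): [],
    ("libbar", 2): [],
}


def resolve(index: Index, root: Tuple[str, int]) -> Optional[Dict[str, int]]:
    """Backtracking search for one version per package satisfying every edge."""

    def go(todo: List[Edge], chosen: Dict[str, int]) -> Optional[Dict[str, int]]:
        if not todo:
            return chosen
        (name, allowed), rest = todo[0], todo[1:]
        if name in chosen:
            # Already picked: the existing choice must satisfy this edge too.
            return go(rest, chosen) if chosen[name] in allowed else None
        # Prefer newer versions, fall back to older ones on conflict.
        for version in sorted(allowed, reverse=True):
            if (name, version) not in index:
                continue
            result = go(rest + index[(name, version)], {**chosen, name: version})
            if result is not None:
                return result
        return None

    name, version = root
    return go(list(index[root]), {name: version})


solution = resolve(INDEX, ("app", 1))
```

Here the newest `libfoo` is rejected because it needs `libbar` 1 while `app` needs `libbar` 2, so the search falls back to `libfoo` 1. A real resolver would use a SAT or similar solver rather than exhaustive backtracking.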


Article 2

Title@2025-06-12 (4): Hazel Deriver: A Live Editor for Constructing Rule-Based Derivations

Title: Hazel Deriver: A Live Editor for Constructing Rule-Based Derivations Hazel Deriver: Ein Live-Editor für die Konstruktion regelbasierter Ableitungen Hazel Deriver: 建筑基于规则的衍生物现场编辑 2506.10781v1

Authors (2): Zhiyao Zhong, Cyrus Omar

Students in programming languages and formal logic courses often struggle with constructing rule-based derivation trees due to the complexity of applying inference rules, the lack of immediate feedback, and the manual effort required for handwritten proofs. We present Hazel Deriver, a live, web-based editor designed to scaffold derivation construction through multiple layers of support. Built on the Hazel live programming environment, it provides a structured, interactive experience that encourages iterative exploration and real-time feedback. A preliminary user study with former students suggests that Hazel Deriver reduces the perceived difficulty of derivation tasks while improving conceptual understanding and engagement. We discuss the design of its layered scaffolding features and raise questions about balancing system guidance with learner autonomy.

编程语言和正规逻辑课程的学生往往因为应用推论规则的复杂性、缺乏即时反馈和手写证据所需的人工努力而难以建造有章可循的衍生树。我们介绍Hazel Deriver,一个现场的网络编辑,旨在通过多层支持来搭建衍生工具。在Hazel现场编程环境中,它提供了结构化的互动经验,鼓励反复探索和实时反馈。与前学生进行的一项初步用户研究表明,Hazel Deriver在改进概念理解和参与的同时,减少了已知的衍生任务的困难。我们讨论了其分层脚架特征的设计,并提出了如何平衡系统指导与学习者自主性的问题。


Article 3

Title@2025-06-12 (4): Weaver: A Retargetable Compiler Framework for FPQA Quantum Architectures

Title: Weaver: A Retargetable Compiler Framework for FPQA Quantum Architectures Weaver: Ein Retargetable Compiler Framework für FPQA Quantenarchitekturen Weaver:FPQA量度结构的可重新瞄准的汇编者框架 2409.07870v2

Authors (4): Oğuzcan Kırmemiş, Francisco Romão, Emmanouil Giortamis, Pramod Bhatotia

While the prominent quantum computing architectures are based on superconducting technology, new quantum hardware technologies are emerging, such as Trapped Ions, Neutral Atoms (or FPQAs), and Silicon Spin Qubits. This diverse set of technologies presents fundamental trade-offs in terms of scalability, performance, manufacturing, and operating expenses. To manage these diverse quantum technologies, there is a growing need for a retargetable compiler that can efficiently adapt existing code to these emerging hardware platforms. Such a retargetable compiler must be extensible to support new and rapidly evolving technologies, performant with fast compilation times and high-fidelity execution, and verifiable through rigorous equivalence checking to ensure the functional equivalence of the retargeted code. To this end, we present Weaver, the first extensible, performant, and verifiable retargetable quantum compiler framework with a focus on FPQAs due to their unique, promising features. Weaver introduces WQASM, the first formal extension of the standard OpenQASM quantum assembly with FPQA-specific instructions to support their distinct capabilities. Next, Weaver implements the WOptimizer, an extensible set of FPQA-specific optimization passes to improve execution quality. Last, the WChecker automatically checks for equivalence between the original and the retargeted code. Our evaluation shows that Weaver improves compilation times by $10^3\times$, execution times by $4.4\times$, and execution fidelity by $10\%$, on average, compared to superconducting and state-of-the-art (non-retargetable) FPQA compilers.

虽然著名的量子计算架构以超导技术为基础,但新的量子硬件技术正在出现,例如陷阱化的Ions、中立原子(或FPQAs)、硅旋转二次曲线等。这组不同的技术在可缩放性、性能、制造和运行费用方面提出了根本性的权衡。为了管理这些不同的量子技术,越来越需要一个可重新瞄准的编译器,能够有效地将现有代码调整到这些新兴硬件平台。这样的可重新瞄准的编译器必须能够扩展,以支持新的和迅速发展的技术,能够快速的编译时间和高纤维化执行,并通过严格的等值检查进行核查,以确保目标代码的功能等同性。为此,我们提供了美元,这是第一个可扩展的、可执行的和可核实的量子汇编框架,其重点是FPQA,因其独特、有希望的特性, $WAFSM, 将标准国质标的首次正式扩展,以PQA专用指令来运行,支持其明确的执行能力。


Article 4

Title@2025-06-12 (4): CompilerDream: Learning a Compiler World Model for General Code Optimization

Title: CompilerDream: Learning a Compiler World Model for General Code Optimization CompilerDream: Lernen eines Compiler-Weltmodells für die allgemeine Code-Optimierung 汇编者:学习编纂者世界通用守则优化模式 2404.16077v3

Authors (5): Chaoyi Deng, Jialong Wu, Ningya Feng, Jianmin Wang, Mingsheng Long

Effective code optimization in compilers is crucial for computer and software engineering. The success of these optimizations primarily depends on the selection and ordering of the optimization passes applied to the code. While most compilers rely on a fixed sequence of optimization passes, current methods to find the optimal sequence employ either impractically slow search algorithms or learning methods that struggle to generalize to code unseen during training. We introduce CompilerDream, a model-based reinforcement learning approach to general code optimization. CompilerDream comprises a compiler world model that accurately simulates the intrinsic properties of optimization passes and an agent trained on this model to produce effective optimization strategies. By training on a large-scale program dataset, CompilerDream is equipped to serve as a general code optimizer across various application scenarios and source-code languages. Our extensive experiments first highlight CompilerDream’s strong optimization capabilities for autotuning, where it leads the CompilerGym leaderboard. More importantly, the zero-shot generalization ability of the large-scale trained compiler world model and agent excels across diverse datasets, surpassing LLVM’s built-in optimizations and other state-of-the-art methods in both settings of value prediction and end-to-end code optimization.

编译器的有效代码优化对于计算机和软件工程至关重要。 这些优化的成功主要取决于对代码应用的优化通行证的选择和顺序。 虽然大多数编译器依赖固定的优化通行证序列, 但当前寻找最佳序列的方法要么采用不切实际的慢速搜索算法, 要么采用在培训期间难以普及到无法理解的代码的学习方法。 我们引入了基于模型的强化学习方法CapilrDream, 这是一种基于模型的强化学习方法, 用于一般代码优化。 编译器Dream 包括一个编译器世界模型, 准确模拟优化通行证的内在特性, 以及一个经过培训的代理人, 以制作有效的优化战略。 虽然大多数编译器依赖一个大型程序数据集的培训, 但通过对编译器的数据集进行培训, 将编译器Dream 的强大优化能力 首次凸显了 , 并引导了编译器Gym 领导板。 更重要的是, 大规模经过培训的编译器世界模型和代理器的零光化能力, 超越了各种数据集, 超越了LLVM的终极值和最优化预测设置中的其他状态。


Article 5

Title@2025-06-11 (3): Reward Models Enable Scalable Code Verification by Trading Accuracy for Throughput

Title: Reward Models Enable Scalable Code Verification by Trading Accuracy for Throughput Reward-Modelle ermöglichen eine skalierbare Code-Überprüfung durch den Handel mit Genauigkeit für Durchsatz 通过交易准确性对交易流量的可缩放代码校验 2506.10056v1

Authors (4): Gabriel Orlanski, Nicholas Roberts, Aws Albarghouthi, Frederic Sala

The standard paradigm for solving coding tasks via large language models (LLMs) is to generate-then-rank programs, where the latter step uses a verifier in the ranking process. The growing consensus is that a comprehensive verifier (e.g., a full test suite) should be prioritized over an outcome reward model (ORM) whenever possible, with little consideration given to the trade-offs involved. We aim to challenge this assumption by systematically exploring the trade-off between speed and accuracy. We find that ORMs play a crucial role in scaling verification through trading accuracy for speed, even when a comprehensive verifier is available. Their value becomes especially apparent when used in a generate-prune-then-rank approach, where a faster but less accurate verifier removes incorrect solutions prior to ranking – leading to a system that is 11.65x faster while only being 8.33% less accurate than the full test suite. We analyze the generate-prune-then-rank approach and show that it works by filtering out incorrect but highly ranked solutions. These findings enable the design of scalable and accurate program ranking systems.

通过大语言模型(LLMS)解决编码任务的标准范式是生成当值程序,让后一步在排名过程中使用核查员。日益形成的共识是,尽可能将综合核查员(如完整的测试套件)置于结果奖励模式(ORM)之上,而很少考虑到所涉及的权衡。我们的目标是通过系统地探索速度和准确性之间的权衡来挑战这一假设。我们发现ORM公司在通过快速交易精确度来扩大核查规模方面发挥着关键作用,即使有一个全面的核查员。在采用生产-生产-生产-当值-排位方法时,其价值尤其明显,在这种方法中,快速但不准确的核查员在排位前消除了不正确的解决方案 – – 导致一个系统比整个测试套件更快11.65x,而仅差8.33%的准确率。我们分析了生成-生产-当值-排位方法,并表明它通过筛选不正确但排位高的解决方案来发挥作用。这些发现,能够设计可扩展和准确的程序排位系统。
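
The generate-prune-then-rank idea in this abstract can be sketched in a few lines: a cheap scorer discards most candidates, and the expensive verifier ranks only the survivors. The task, the candidate set, the two-test `cheap_scorer` (a crude stand-in for a learned reward model), and the `keep_fraction` parameter are hypothetical choices for illustration, not the paper's setup.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical task: "double the input". The full test suite is the slow verifier.
TESTS = [(0, 0), (1, 2), (2, 4), (5, 10)]

CANDIDATES: Dict[str, Callable[[int], int]] = {
    "x * 2": lambda x: x * 2,
    "x + x": lambda x: x + x,
    "x ** 2": lambda x: x ** 2,
    "x + 1": lambda x: x + 1,
    "x": lambda x: x,
    "2": lambda x: 2,
    "0": lambda x: 0,
    "x - 1": lambda x: x - 1,
}

full_calls = 0  # counts how often the expensive verifier runs


def full_verifier(name: str) -> float:
    """Slow but accurate: fraction of the whole test suite passed."""
    global full_calls
    full_calls += 1
    fn = CANDIDATES[name]
    return sum(fn(x) == y for x, y in TESTS) / len(TESTS)


def cheap_scorer(name: str) -> float:
    """Fast but rough stand-in for a reward model: looks at two tests only."""
    fn = CANDIDATES[name]
    return sum(fn(x) == y for x, y in TESTS[:2]) / 2


def prune_then_rank(
    names: Sequence[str],
    cheap: Callable[[str], float],
    full: Callable[[str], float],
    keep_fraction: float = 0.25,
) -> List[str]:
    """Keep the top fraction by cheap score, then rank survivors by full score."""
    if not names:
        return []
    keep = max(1, int(len(names) * keep_fraction))
    survivors = sorted(names, key=cheap, reverse=True)[:keep]
    # The expensive verifier only runs on the survivors.
    return sorted(survivors, key=full, reverse=True)


ranked = prune_then_rank(list(CANDIDATES), cheap_scorer, full_verifier)
```

With eight candidates and `keep_fraction=0.25`, the full verifier runs on two programs instead of eight. The price is that a correct program the cheap scorer misjudges is lost before ranking.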


Article 6

Title@2025-06-10 (2): ClassInvGen: Class Invariant Synthesis using Large Language Models

Title: ClassInvGen: Class Invariant Synthesis using Large Language Models ClassInvGen: Class Invariant Synthesis mit großen Sprachmodellen 类 InvGen: 使用大语言模型的分类变量合成 2502.18917v2

Authors (8): Chuyue Sun, Viraj Agashe, Saikat Chakraborty, Jubi Taneja, Clark Barrett, David Dill, Xiaokang Qiu, Shuvendu K. Lahiri

Formal program specifications in the form of preconditions, postconditions, and class invariants have several benefits for the construction and maintenance of programs. They not only aid in program understanding due to their unambiguous semantics but can also be enforced dynamically (or even statically when the language supports a formal verifier). However, synthesizing high-quality specifications in an underlying programming language is limited by the expressivity of the specifications or the need to express them in a declarative manner. Prior work has demonstrated the potential of large language models (LLMs) for synthesizing high-quality method pre/postconditions for Python and Java, but does not consider class invariants. In this work, we describe ClassInvGen, a method for co-generating executable class invariants and test inputs to produce high-quality class invariants for a mainstream language such as C++, leveraging LLMs’ ability to synthesize pure functions. We show that ClassInvGen outperforms a pure LLM-based technique to generate specifications (from code) as well as prior data-driven invariant inference techniques such as Daikon. We contribute a benchmark of standard C++ data structures along with a harness that can help measure both the correctness and completeness of generated specifications using tests and mutants. We also demonstrate its applicability to real-world code by performing a case study on several classes within a widely used and high-integrity C++ codebase.

以前置条件、后置条件和类不变量形式给出的形式化程序规约,对程序的构建和维护有若干好处。它们不仅因语义明确而有助于程序理解,而且可以动态地执行(当语言支持形式化验证器时,甚至可以静态地执行)。然而,在底层编程语言中合成高质量规约,受限于规约的表达能力或以声明方式表达规约的需要。先前的工作表明,大语言模型(LLMs)有潜力为Python和Java合成高质量的方法前置/后置条件,但并未考虑类不变量。在这项工作中,我们描述了ClassInvGen,这是一种协同生成可执行类不变量和测试输入的方法,利用LLMs合成纯函数的能力,为C++等主流语言生成高质量的类不变量。我们表明,ClassInvGen优于(从代码)生成规约的纯LLM技术,也优于Daikon等先前的数据驱动不变量推断技术。我们贡献了一个标准C++数据结构基准以及一个测试框架,该框架可借助测试和变异体衡量所生成规约的正确性和完备性。我们还通过对一个广泛使用的高完整性C++代码库中的若干类进行案例研究,展示了它对真实代码的适用性。
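
To make "executable class invariants checked with tests and mutants" concrete, here is a minimal sketch. It is written in Python for brevity, although the abstract targets C++. The `BoundedStack` class, its mutant, the `invariant` function, and the `holds_on` harness are hypothetical and are not ClassInvGen's output or tooling.

```python
from typing import List, Sequence, Tuple, Type


class BoundedStack:
    """Reference implementation: a stack holding at most `capacity` items."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: List[int] = []

    def push(self, value: int) -> None:
        if len(self.items) < self.capacity:
            self.items.append(value)

    def pop(self) -> None:
        if self.items:
            self.items.pop()


class MutantStack(BoundedStack):
    """Mutant: push drops the capacity check."""

    def push(self, value: int) -> None:
        self.items.append(value)


def invariant(stack: BoundedStack) -> bool:
    """Candidate class invariant, written as a pure function of the object."""
    return 0 <= len(stack.items) <= stack.capacity


Call = Tuple[str, Tuple[int, ...]]


def holds_on(cls: Type[BoundedStack], calls: Sequence[Call]) -> bool:
    """Run a call sequence, checking the invariant after construction and each call."""
    obj = cls(2)
    if not invariant(obj):
        return False
    for method, args in calls:
        getattr(obj, method)(*args)
        if not invariant(obj):
            return False
    return True


TEST_INPUT: List[Call] = [("push", (1,)), ("push", (2,)), ("push", (3,)), ("pop", ())]

correct_ok = holds_on(BoundedStack, TEST_INPUT)    # invariant should hold
mutant_killed = not holds_on(MutantStack, TEST_INPUT)  # invariant should fail
```

An invariant that holds on the reference class is evidence of correctness, and one that also fails on mutants is evidence that it is strong enough to be useful. A trivially true invariant would pass the first check but kill no mutants.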


Article 7

Title@2025-06-10 (2): Gradual Metaprogramming

Title: Gradual Metaprogramming Stufenweise Metaprogrammierung 渐进元元方案 2506.09043v1

Authors (7): Tianyu Chen, Darshal Shetty, Jeremy G. Siek, Chao-Hong Chen, Weixi Ma, Arnaud Venet, Rocky Liu

Data engineers increasingly use domain-specific languages (DSLs) to generate the code for data pipelines. Such DSLs are often embedded in Python. Unfortunately, there are challenges in debugging the generation of data pipelines: an error in a Python DSL script is often detected too late, after the execution of the script, and the source code location that triggers the error is hard to pinpoint. In this paper, we focus on the F3 DSL of Meta (Facebook), which is a DSL embedded in Python (so it is dynamically-typed) to generate data pipeline description code that is statically-typed. We propose gradual metaprogramming to (1) provide a migration path toward statically typed DSLs, (2) immediately provide earlier detection of code generation type errors, and (3) report the source code location responsible for the type error. Gradual metaprogramming accomplishes this by type checking code fragments and incrementally performing runtime checks as they are spliced together. We define MetaGTLC, a metaprogramming calculus in which a gradually-typed metalanguage manipulates a statically-typed object language, and give semantics to it by translation to the cast calculus MetaCC. We prove that successful metaevaluation always generates a well-typed object program and mechanize the proof in Agda.

数据工程师越来越多地使用特定域语言( DSLs) 生成数据管道代码。 这种 DSL 通常嵌入 Python 中。 不幸的是, 在调试数据管道生成中存在挑战: 执行脚本后, Python DSL 脚本中的错误往往被检测得太晚, 触发错误的源代码位置很难定位。 在本文中, 我们关注Meta ( F3 DSL (Facebook) 的 F3 DSL (Facebook) , 这是嵌入 Python (因此是动态型式的) 的 DSL, 以生成静态型数据管道描述代码。 我们建议渐进式元式元式元式编程图解解密:(1) 向静态类型打入 DSLSL, (2) 立即提供对代码生成型错误的早期检测, (3) 报告类型错误的源代码位置。 渐渐变的元程序完成此任务, 我们定义了MetGTLC, 一种元式的标定式的标本, 最终将一个固定式的代谢式的代谢程序转化为, 将它的立成一个固定式的代式的代谢式的代谢式的代谢。


Article 8

Title@2025-06-10 (2): Program Synthesis from Partial Traces

Title: Program Synthesis from Partial Traces Programmsynthese aus partiellen Spuren 部分跟踪程序合成 2504.14480v3

Authors (4): Margarida Ferreira, Victor Nicolet, Joey Dodds, Daniel Kroening

We present the first technique to synthesize programs that compose side-effecting functions, pure functions, and control flow, from partial traces containing records of only the side-effecting functions. This technique can be applied to synthesize API composing scripts from logs of calls made to those APIs, or a script from traces of system calls made by a workload, for example. All of the provided traces are positive examples, meaning that they describe desired behavior. Our approach does not require negative examples. Instead, it generalizes over the examples and uses cost metrics to prevent over-generalization. Because the problem is too complex for traditional monolithic program synthesis techniques, we propose a new combination of optimizing rewrites and syntax-guided program synthesis. The resulting program is correct by construction, so its output will always be able to reproduce the input traces. We evaluate the quality of the programs synthesized when considering various optimization metrics and the synthesizer’s efficiency on real-world benchmarks. The results show that our approach can generate useful real-world programs.

我们展示了第一种综合程序的技术,这种程序构成副作用功能、纯功能和控制流程,其内容来自仅包含副作用功能记录的部分痕迹。这种技术可用于合成API,其中含有向API通话的记录中的脚本,或由工作量产生的系统电话记录中的脚本。我们提供的所有痕迹都是正面的例子,这意味着它们描述想要的行为。我们的方法并不需要负面的例子。相反,它概括了实例,并使用成本衡量尺度来防止过度普及。由于这个问题对于传统的单项程序合成技术来说太复杂,我们建议采用一种新的组合,优化重写器和合成合成组合程序。由此产生的程序是正确的,因此其输出将始终能够复制输入痕迹。我们在考虑各种优化度量和合成器在现实世界基准上的效率时,我们评估了合成程序的质量。结果显示,我们的方法可以产生有用的真实世界程序。


Article 9

Title@2025-06-10 (2): Outcome Logic: A Unified Approach to the Metatheory of Program Logics with Branching Effects

Title: Outcome Logic: A Unified Approach to the Metatheory of Program Logics with Branching Effects Ergebnis-Logik: Ein einheitlicher Ansatz zur Metatheorie von Programm-Logik mit Verzweigungseffekten 结果逻辑:对具有分流效应的方案逻辑比喻的统一方法 2401.04594v3

Authors (1): Noam Zilberstein

Starting with Hoare Logic over 50 years ago, numerous program logics have been devised to reason about the diverse programs encountered in the real world. This includes reasoning about computational effects, particularly those effects that cause the program execution to branch into multiple paths due to, e.g., nondeterministic or probabilistic choice. The recently introduced Outcome Logic reimagines Hoare Logic with branching at its core, using an algebraic representation of choice to capture programs that branch into many outcomes. In this article, we expand on prior Outcome Logic papers in order to give a more authoritative and comprehensive account of the metatheory. This includes a relatively complete proof system for Outcome Logic with the ability to reason about general purpose looping. We also show that this proof system applies to programs with various types of branching and that it facilitates the reuse of proof fragments across different kinds of specifications.

从50多年前的Hoare逻辑学开始,许多程序逻辑设计就是为了解释现实世界中遇到的不同程序。其中包括关于计算效果的推理,尤其是那些导致程序执行分解成多种路径的效果,例如,由于非确定性或概率性的选择。最近推出的成果逻辑再造Hoare逻辑学,其核心是分支,使用代数代表法选择将程序分解成许多结果。在本条中,我们扩展了先前的成果逻辑文件,以便更权威和全面地说明元论。这包括结果逻辑学相对完整的验证系统,能够解释一般目的环绕。我们还表明,这一验证系统适用于各种分支的方案,并且它有助于在不同规格下再利用证据碎片。


Article 10

Title@2025-06-10 (2): Linguine: A Natural-Language Programming Language with Formal Semantics and a Clean Compiler Pipeline

Title: Linguine: A Natural-Language Programming Language with Formal Semantics and a Clean Compiler Pipeline Linguine: Eine natursprachliche Programmiersprache mit formaler Semantik und einer sauberen Compiler-Pipeline 语言:一种自然语言-语言编程语言,有正式的语义和清洁编译管道 2506.08396v1

Authors (1): Lifan Hu

Linguine is a natural-language-inspired programming language that enables users to write programs in a fluent, controlled subset of English while preserving formal semantics. The language introduces anaphoric constructs, such as pronoun variables (e.g., “it”, “them”), that are statically resolved through referent-tracking analysis combined with a Hindley-Milner-style type system. Each pronoun is guaranteed to be unambiguous and well-typed at compile time. The Linguine compiler pipeline includes lexing, parsing, clause graph construction, desugaring into a typed intermediate representation, type inference, and abstract interpretation. This enables the early detection of semantic errors, such as undefined or type-inconsistent references. A lightweight backend currently generates Python code. This paper formalizes the core language, defines its typing and operational semantics, and proves the soundness of its pronoun resolution mechanism. An initial evaluation shows that Linguine allows the expression of concise and readable programs while supporting static verification. Linguine represents a step toward programming systems that prioritize human linguistic intuition while remaining grounded in formal methods and type-theoretic rigor.

语言是一种自然语言激励的编程语言,它使用户能够在保存正式语义的同时,用流畅、受控制的英文子集写程序,同时保留正式语义。语言引入了无光变量(例如“it”、“the”),这些变量通过参考跟踪分析与Hindley-Milner-system 系统一起静态地解决。每个代名词在编译时都得到保证清晰且类型精良的保证。语言编译器管道包括编译、解析、条款图构建、淡化成打字的中间代表、输入推断和抽象解释。这样可以早期发现语义错误,例如未定义或类型不一致的参考。一个轻量级后端目前生成 Python 代码。 本文将核心语言、 定义其打字和操作的语义表达方式, 并证明其Pronoun 解机制的正确性。 初步评估显示, 语言编程允许在支持静态校验的同时表达简明和可读的程序。 Linguine代表了在正式类型中以语言直观为优先的系统。


Article 11

Title@2025-06-10 (2): A Language-Agnostic Logical Relation for Message-Passing Protocols

Title: A Language-Agnostic Logical Relation for Message-Passing Protocols Eine sprach-agnostische Logische Beziehung für Message-Passing-Protokolle 发送信件协议的语言不可接受逻辑关系 2506.10026v1

Authors (5): Tesla Zhang, Sonya Simkin, Rui Li, Yue Yao, Stephanie Balzer

Today’s computing landscape has been gradually shifting to applications targeting distributed and heterogeneous systems, such as cloud computing and Internet of Things (IoT) applications. These applications are predominantly concurrent, employ message-passing, and interface with foreign objects, ranging from externally implemented code to actual physical devices such as sensors. Verifying that the resulting systems adhere to the intended protocol of interaction is challenging – the usual assumption of a common implementation language, let alone a type system, no longer applies, ruling out any verification method based on them. This paper develops a framework for certifying protocol compliance of heterogeneous message-passing systems. It contributes the first mechanization of a language-agnostic logical relation, asserting that its inhabitants comply with the protocol specified. This definition relies entirely on a labelled transition-based semantics, accommodating arbitrary inhabitants, typed and untyped alike, including foreign objects. As a case study, the paper considers two scenarios: (1) per-instance verification of a specific application or hardware device, and (2) once-and-for-all verification of well-typed applications for a given type system. The logical relation and both scenarios are mechanized in the Coq theorem prover.

今天的计算景观逐渐转向以分布式和异质系统为对象的应用,如云计算和事物互联网(IoT)应用。这些应用主要是 convor *,使用 *message-passing * 和与 fore 物体的接口,从外部执行的代码到传感器等实际物理装置。验证由此产生的系统遵守预期的互动协议具有挑战性 – – 通常假定一种共同的执行语言,更不用说一种类型系统,不再适用,排除任何基于这些语言的核查方法。本文开发了一个框架,用以核证不同信息传输系统遵守protocol 合规 * 。这些应用程序主要是 * 语言- Agnotic 逻辑关系 的首次机械化,声称其居民遵守了规定的协议。这一定义完全依赖于一个贴有标签的基于过渡的语义,容纳任意居民、排版和非型,包括外国物体。作为案例研究,本文考虑了两种假设:(1) 具体应用程序或硬件设备/硬件装置的永久核查* 和逻辑型号的验证系统- 和型号/型号的核查关系。


Article 12

Title@2025-06-09 (1): Verification of the Release-Acquire Semantics

Title: Verification of the Release-Acquire Semantics Überprüfung des Release-Acquire Semantics 释放-获取语义学的核查 2506.08238v1

Authors (4): Parosh Abdulla, Elli Anastasiadi, Mohamed Faouzi Atig, Samuel Grahn

The Release-Acquire (RA) semantics and its variants are some of the most fundamental models of concurrent semantics for architectures, programming languages, and distributed systems. Several steps have been taken in the direction of testing such semantics, where one is interested in whether a single program execution is consistent with a memory model. The more general verification problem, i.e., checking whether all allowed program runs are consistent with a memory model, has still not been studied as much. The purpose of this work is to bridge this gap. We tackle the verification problem, where, given an implementation described as a register machine, we check if any of its runs violates the RA semantics or its Strong (SRA) and Weak (WRA) variants. We show that verifying WRA in this setup is in $O(n^5)$, while verifying RA and SRA is both NP- and coNP-hard, and we provide a PSPACE upper bound. This both answers some fundamental questions about the complexity of these problems and provides insights into the expressive power of register machines as a model.

解密(RA)语义及其变体是建筑、编程语言和分布式系统同时使用语义的一些最基本模型。在测试这种语义方面已经采取了一些步骤,人们感兴趣的是单一程序执行是否与记忆模型一致。一般的核查问题,即检查所有允许的程序运行是否都与记忆模型一致,仍未得到同样的研究。这项工作的目的是缩小这一差距。我们处理核查问题,鉴于其实施被描述为注册机器,我们检查其运行是否违反了RA语义或其强项和WAK(WA)变体。我们表明,在对RA和SRA进行核查的同时,核查是在O([ 第5号],同时在NP-和CONP-硬体进行核查,并提供PSPACE上限。这都回答了关于这些问题复杂性的一些基本问题,但也提供了对登记机器作为模型的显性能量的洞察力。


Article 13

Title@2025-06-09 (1): Execution-Aware Program Reduction for WebAssembly via Record and Replay

Title: Execution-Aware Program Reduction for WebAssembly via Record and Replay Execution-Aware Programmreduktion für WebAssembly über Aufzeichnung und Wiedergabe 通过录制和重放减少网络摄像头的 执行软件程序 2506.07834v1

Authors (5): Doehyun Baek, Daniel Lehmann, Ben L. Titzer, Sukyoung Ryu, Michael Pradel

WebAssembly (Wasm) programs may trigger bugs in their engine implementations. To aid debugging, program reduction techniques try to produce a smaller variant of the input program that still triggers the bug. However, existing execution-unaware program reduction techniques struggle with large and complex Wasm programs, because they rely on static information and apply syntactic transformations, while ignoring the valuable information offered by the input program’s execution behavior. We present RR-Reduce and Hybrid-Reduce, novel execution-aware program reduction techniques that leverage execution behaviors via record and replay. RR-Reduce identifies a bug-triggering function as the target function, isolates that function from the rest of the program, and generates a reduced program that replays only the interactions between the target function and the rest of the program. Hybrid-Reduce combines a complementary execution-unaware reduction technique with RR-Reduce to further reduce program size. We evaluate RR-Reduce and Hybrid-Reduce on 28 Wasm programs that trigger a diverse set of bugs in three engines. On average, RR-Reduce reduces the programs to 1.20 percent of their original size in 14.5 minutes, which outperforms the state of the art by 33.15 times in terms of reduction time. Hybrid-Reduce reduces the programs to 0.13 percent of their original size in 3.5 hours, which outperforms the state of the art by 3.42 times in terms of reduced program size and 2.26 times in terms of reduction time. We envision RR-Reduce as the go-to tool for rapid, on-demand debugging in minutes, and Hybrid-Reduce for scenarios where developers require the smallest possible programs.

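WebAssembly(Wasm)程序可能会触发其引擎实现中的错误。为了帮助调试,程序约简技术试图生成输入程序的一个更小的变体,且该变体仍能触发错误。但是,现有的不感知执行的程序约简技术难以处理大型复杂的Wasm程序,因为它们依赖静态信息并应用语法变换,而忽略了输入程序执行行为所提供的宝贵信息。我们提出了RR-Reduce和Hybrid-Reduce,这是通过记录和重放来利用执行行为的新型执行感知程序约简技术。RR-Reduce将一个触发错误的函数确定为目标函数,将该函数与程序其余部分隔离,并生成一个只重放目标函数与程序其余部分之间交互的约简程序。Hybrid-Reduce将一种互补的不感知执行的约简技术与RR-Reduce相结合,以进一步减小程序规模。我们在28个Wasm程序上评估了RR-Reduce和Hybrid-Reduce,这些程序在三个引擎中触发了多种错误。平均而言,RR-Reduce在14.5分钟内将程序缩减到原始大小的1.20%,在约简时间上比现有最佳方法快33.15倍。Hybrid-Reduce在3.5小时内将程序缩减到原始大小的0.13%,在约简后程序大小上比现有最佳方法好3.42倍,在约简时间上快2.26倍。我们设想RR-Reduce成为在数分钟内进行快速、按需调试的首选工具,而Hybrid-Reduce适用于开发者需要尽可能小的程序的场景。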
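The record-and-replay idea in this abstract can be illustrated with a minimal Python sketch. It is not RR-Reduce itself: the names `Recorder`, `Replayer`, `target`, and `rest_of_program` are hypothetical, and the sketch assumes the target function interacts with the rest of the program only through one callable.

```python
from typing import Callable, List, Tuple

Trace = List[Tuple[Tuple[int, ...], int]]


class Recorder:
    """Wraps the environment and logs every (args, result) interaction."""

    def __init__(self, env_fn: Callable[..., int]) -> None:
        self.env_fn = env_fn
        self.trace: Trace = []

    def __call__(self, *args: int) -> int:
        result = self.env_fn(*args)
        self.trace.append((args, result))
        return result


class Replayer:
    """Stands in for the environment by returning recorded results in order."""

    def __init__(self, trace: Trace) -> None:
        self.trace = list(trace)
        self.pos = 0

    def __call__(self, *args: int) -> int:
        expected_args, result = self.trace[self.pos]
        if args != expected_args:
            raise RuntimeError("replay diverged from the recorded trace")
        self.pos += 1
        return result


def target(env: Callable[..., int], n: int) -> int:
    """Hypothetical target function: it only talks to `env`."""
    total = 0
    for i in range(n):
        total += env(i)
    return total


def rest_of_program(i: int) -> int:
    """Stand-in for the large remainder of the original program."""
    return i * i + 1


# Record phase: run the target against the real environment.
recorder = Recorder(rest_of_program)
original = target(recorder, 4)

# Replay phase: the environment is gone, only the trace remains.
replayed = target(Replayer(recorder.trace), 4)
```

In this sketch the reduced artifact is the target function plus its trace, so the rest of the program no longer needs to be shipped to reproduce the behavior.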


Article 14

Title@2025-06-09 (1): Pel, A Programming Language for Orchestrating AI Agents

Title: Pel, A Programming Language for Orchestrating AI Agents Pel, eine Programmiersprache für die Orchestrierung von KI-Agenten Pel, 用于指挥AI代理的编程语言 2505.13453v2

Authors (1): Behnam Mohammadi

The proliferation of Large Language Models (LLMs) has opened new frontiers in computing, yet controlling and orchestrating their capabilities beyond simple text generation remains a challenge. Current methods, such as function/tool calling and direct code generation, suffer from limitations in expressiveness, scalability, cost, security, and the ability to enforce fine-grained control. This paper introduces Pel, a novel programming language specifically designed to bridge this gap. Inspired by the strengths of Lisp, Elixir, Gleam, and Haskell, Pel provides a syntactically simple, homoiconic, and semantically rich platform for LLMs to express complex actions, control flow, and inter-agent communication safely and efficiently. Pel’s design emphasizes a minimal, easily modifiable grammar suitable for constrained LLM generation, eliminating the need for complex sandboxing by enabling capability control at the syntax level. Key features include a powerful piping mechanism for linear composition, first-class closures enabling easy partial application and functional patterns, built-in support for natural language conditions evaluated by LLMs, and an advanced Read-Eval-Print-Loop (REPeL) with Common Lisp-style restarts and LLM-powered helper agents for automated error correction. Furthermore, Pel incorporates automatic parallelization of independent operations via static dependency analysis, crucial for performant agentic systems. We argue that Pel offers a more robust, secure, and expressive paradigm for LLM orchestration, paving the way for more sophisticated and reliable AI agentic frameworks.

大语言模型(LLMs)的激增在计算方面开辟了新的疆界,但在简单的文本生成之外控制和编排其能力,仍然是一个挑战。目前的方法,例如函数/工具调用和直接代码生成,在表达能力、可扩展性、成本、安全性和实施细粒度控制的能力方面受到限制。本文介绍了Pel,这是专门为弥合这一差距而设计的新型编程语言。受Lisp、Elixir、Gleam和Haskell的长处启发,Pel为LLMs提供了一个语法简单、同像(homoiconic)且语义丰富的平台,以便安全高效地表达复杂的行动、控制流和代理间通信。Pel的设计强调一种最小化、易于修改的语法,适合受约束的LLM生成,并通过在语法层面实现能力控制,消除了对复杂沙箱的需求。关键特征包括用于线性组合的强大管道机制、便于部分应用和函数式模式的一等闭包、对由LLMs评估的自然语言条件的内置支持,以及带有Common Lisp风格重启和LLM驱动辅助代理、用于自动纠错的高级读取-求值-打印循环(REPeL)。此外,Pel通过静态依赖分析对相互独立的操作进行自动并行化,这对高性能的代理系统至关重要。我们认为,Pel为LLM编排提供了一种更稳健、更安全、更具表达力的范式,为更复杂、更可靠的AI代理框架铺平了道路。
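The abstract mentions automatic parallelization of independent operations via static dependency analysis. The following Python sketch shows one generic way to derive parallel stages from a dependency map using the standard-library `graphlib`. It is not Pel's implementation; the operation names (`fetch_a`, `merge`, and so on) and the function `parallel_stages` are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def parallel_stages(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group operations into stages; operations in one stage are independent.

    `deps` maps each operation to the set of operations it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on cyclic dependencies
    stages: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        stages.append(ready)
        sorter.done(*ready)  # unblock the operations that waited on these
    return stages


# Hypothetical agent workflow: two independent fetches, then a merge.
deps = {
    "fetch_a": set(),
    "fetch_b": set(),
    "merge": {"fetch_a", "fetch_b"},
    "report": {"merge"},
}
stages = parallel_stages(deps)
```

Each inner list could be dispatched concurrently, for example with a thread pool, before moving to the next stage.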


Article 15

Title@2025-06-08 (7): From Tool Calling to Symbolic Thinking: LLMs in a Persistent Lisp Metaprogramming Loop

Title: From Tool Calling to Symbolic Thinking: LLMs in a Persistent Lisp Metaprogramming Loop Vom Tool Calling zum Symbolischen Denken: LLMs in einem persistenten Lisp Metaprogramming Loop 从工具呼叫到符号思维:在持久性 Lisp 元方案化循环中的LLMs 2506.10021v1

Authors (1): Jordi de la Torre

We propose a novel architecture for integrating large language models (LLMs) with a persistent, interactive Lisp environment. This setup enables LLMs to define, invoke, and evolve their own tools through programmatic interaction with a live REPL. By embedding Lisp expressions within generation and intercepting them via a middleware layer, the system allows for stateful external memory, reflective programming, and dynamic tool creation. We present a design framework and architectural principles to guide future implementations of interactive AI systems that integrate symbolic programming with neural language generation.

我们提出了一个将大型语言模型(LLMs)与持续、互动的Lisp环境相结合的新架构。 这一架构使LLMs能够通过与实时REPL进行程序互动来定义、引用和开发自己的工具。 通过将Lisp表达式嵌入下一代并通过中间软件层截取这些表达式,这个体系允许有声有色的外部记忆、反射编程和动态工具创建。 我们提出了一个设计框架和建筑原则来指导交互式AI系统的未来实施,这些系统将象征性的编程与神经语言生成相结合。
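The middleware idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not the architecture proposed in the paper: the class `LispMiddleware` and its helpers are hypothetical, the evaluator supports only integers, `define`, `lambda`, `+`, and `*`, and the generated prose is assumed to contain no parentheses other than Lisp code.

```python
import operator
from typing import Any, Dict, List


def extract_sexprs(text: str) -> List[str]:
    """Return every top-level balanced parenthesised span in `text`."""
    spans, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                spans.append(text[start:i + 1])
    return spans


def parse(src: str) -> Any:
    """Parse one s-expression into nested lists of ints and symbol strings."""
    tokens = src.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos: int):
        tok = tokens[pos]
        if tok == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        try:
            return int(tok), pos + 1
        except ValueError:
            return tok, pos + 1

    return read(0)[0]


def evaluate(expr: Any, env: Dict[str, Any]) -> Any:
    if isinstance(expr, str):
        return env[expr]  # symbol lookup
    if isinstance(expr, int):
        return expr
    head = expr[0]
    if head == "define":  # persists in the shared environment
        env[expr[1]] = evaluate(expr[2], env)
        return expr[1]
    if head == "lambda":
        params, body = expr[1], expr[2]
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    fn = evaluate(head, env)
    return fn(*[evaluate(arg, env) for arg in expr[1:]])


class LispMiddleware:
    """Intercepts model output and evaluates embedded code in one live env."""

    def __init__(self) -> None:
        self.env: Dict[str, Any] = {"+": operator.add, "*": operator.mul}

    def process(self, generated_text: str) -> List[Any]:
        return [evaluate(parse(s), self.env) for s in extract_sexprs(generated_text)]


mw = LispMiddleware()
mw.process("First I define a tool: (define square (lambda (x) (* x x)))")
results = mw.process("Now I reuse it: (square 7)")
```

Because the environment outlives a single call to `process`, a tool defined in one generation step stays available in later steps, which is the stateful behavior the abstract describes.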


Article 16

Title@2025-06-08 (7): From Informal to Formal – Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs

Title: From Informal to Formal – Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs Vom informellen zum formalen – Einbinden und Bewerten von LLMs über natürliche Sprachanforderungen bis hin zu überprüfbaren Formalproofs 从非正式到正式 – – 纳入和评价关于自然语言要求与可核实的正式证明之间的LLMs 2501.16207v4

Authors (12): Jialun Cao, Yaojie Lu, Meiziniu Li, Haoyang Ma, Haokun Li, Mengda He, Cheng Wen, Le Sun, Hongyu Zhang, Shengchao Qin, Shing-Chi Cheung, Cong Tian

The research in AI-based formal mathematical reasoning has shown an unstoppable growth trend. These studies have excelled in mathematical competitions like IMO and have made significant progress. This paper focuses on formal verification, an immediate application scenario of formal reasoning, and breaks it down into sub-tasks. We constructed 18k high-quality instruction-response pairs across five formal specification languages (Coq, Lean4, Dafny, ACSL, and TLA+) by distilling gpt-4o and evaluated against ten open-sourced LLMs, including recent popular DeepSeek-R1. We also fine-tuned several 7~8B small models to achieve comparable performance with DeepSeek-R1-671B. Interestingly, we observed that fine-tuning with formal data also enhances mathematics, reasoning, and coding capabilities. Fine-tuned models are released at https://huggingface.co/fm-universe.

以AI为基础的形式化数学推理的研究显示了不可阻挡的增长趋势。这些研究在IMO等数学竞赛中表现出色,并取得了显著的进展。本文侧重于形式化验证,即形式化推理的一个直接应用场景,并将其细分为子任务。我们通过蒸馏gpt-4o,在五种形式化规约语言(Coq、Lean4、Dafny、ACSL和TLA+)上构建了18k个高质量的指令-响应对,并对10个开源LLMs(包括最近流行的DeepSeek-R1)进行了评估。我们还对若干7~8B小模型进行了微调,以取得与DeepSeek-R1-671B相当的性能。有趣的是,我们发现,用形式化数据进行微调也提高了数学、推理和编码能力。微调后的模型发布于 https://huggingface.co/fm-universe。


Article 17

Title@2025-06-08 (7): Two-sorted algebraic decompositions of Brookes’s shared-state denotational semantics

Title: Two-sorted algebraic decompositions of Brookes’s shared-state denotational semantics Zwei-sortierte algebraische Zersetzungen von Brookes’ shared-state denotational semantics 布鲁克斯的 共同状态分解语义学的 双组代数分解 2501.15104v3

Authors (4): Yotam Dvir, Ohad Kammar, Ori Lahav, Gordon Plotkin

We use a two-sorted equational theory of algebraic effects to model concurrent shared state with preemptive interleaving, recovering Brookes’s seminal 1996 trace-based model precisely. The decomposition allows us to analyse Brookes’s model algebraically in terms of separate but interacting components. The multiple sorts partition terms into layers. We use two sorts: a “hold” sort for layers that disallow interleaving of environment memory accesses, analogous to holding a global lock on the memory; and a “cede” sort for the opposite. The algebraic signature comprises independent interlocking components: two new operators that switch between these sorts, delimiting the atomic layers, thought of as acquiring and releasing the global lock; non-deterministic choice; and state-accessing operators. The axioms similarly divide cleanly: the delimiters behave as a closure pair; all operators are strict, and distribute over non-empty non-deterministic choice; and non-deterministic global state obeys Plotkin and Power’s presentation of global state. Our representation theorem expresses the free algebras over a two-sorted family of variables as sets of traces with suitable closure conditions. When the held sort has no variables, we recover Brookes’s trace semantics.

我们使用两种分解的代数效应等式理论来模拟同时共享状态,先发制人间断裂, 精确地恢复布鲁克斯1996年的原始线性模型。 分解让我们能够分析布鲁克斯的模型代数, 以独立但互动的构件。 多种分化条件进入层。 我们使用两种类型的“ 控点” 类型, 不允许环境内存存存存存取的中间分解层, 类似于持有全球记忆锁; 反之的“ 封存” 类。 代数性全球状态由独立的互闭部分组成: 两种在这种类型之间转换的新操作者, 划定原子层, 被视为获取和释放全球锁; 非非定序选择; 以及 州访问操作者。 等分法相似的分解方式是: 定界器作为封闭配方; 所有操作者都是严格, 并分布在非确定性的非定式的不偏向性选择上; 和不确定性的全球状态由独立互闭式的状态构成Plotkin和权力演示。 我们的表示的定式变数是自由的固定的变数, 我们的定式的变数的变数以适当的变数形式代表着的变数, 。


Article 18

Title@2025-06-07 (6): VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification

Title: VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification VeriThoughts: Automatisierte Generierung von Verilog-Codes mittels Begründung und formaler Überprüfung Verithoughts: 利用理由说明和正式核查,使自动生成Verilog码 2505.20302v2

Authors (7): Patrick Yubeaton, Andre Nakkab, Weihua Xiao, Luca Collini, Ramesh Karri, Chinmay Hegde, Siddharth Garg

This paper introduces VeriThoughts, a novel dataset designed for reasoning-based Verilog code generation. We establish a new benchmark framework grounded in formal verification methods to evaluate the quality and correctness of generated hardware descriptions. Additionally, we present a suite of specialized small-scale models optimized specifically for Verilog generation. Our work addresses the growing need for automated hardware design tools that can produce verifiably correct implementations from high-level specifications, potentially accelerating the hardware development process while maintaining rigorous correctness guarantees. Our code and data are available at https://github.com/wilyub/VeriThoughts.

本文介绍VeriThoughts,这是为基于推理的Verilog代码生成设计的新数据集。我们建立了一个以形式化验证方法为基础的新的基准框架,以评估生成的硬件描述的质量和正确性。此外,我们介绍了一套专门为Verilog生成而优化的小型专用模型。我们的工作解决了对自动化硬件设计工具日益增长的需求,这些工具能够根据高层规格产生可验证的正确实现,有可能加快硬件开发过程,同时保持严格的正确性保证。我们的代码和数据可以在 https://github.com/wilyub/VeriThoughts 查阅。


Article 19

Title@2025-06-07 (6): Validating Quantum State Preparation Programs

Title: Validating Quantum State Preparation Programs Validierung von Quantenzustandsvorbereitungsprogrammen 验证量子州编制方案 2501.05616v3

Authors (5): Liyi Li, Anshu Sharma, Zoukarneini Difaizi Tagba, Sean Frett, Alex Potanin

One of the key steps in quantum algorithms is to prepare an initial quantum superposition state with different kinds of features. These so-called state preparation algorithms are essential to the behavior of quantum algorithms, and complicated state preparation algorithms are difficult to develop correctly and effectively. This paper presents Pqasm: a high-assurance framework implemented with the Coq proof assistant, allowing us to certify that our Pqasm tool correctly reflects quantum program behaviors. The key in the framework is to reduce the program correctness assurance of a program containing a quantum superposition state to the program correctness assurance for the program state without superposition. The reduction allows the development of an effective testing framework for testing quantum state preparation algorithm implementations on a classical computer, which was considered a hard problem with no clear solution until this point. We utilize the QuickChick property-based testing framework to test state preparation programs. We evaluated the effectiveness of our approach over 5 case studies implemented using Pqasm; such cases are not even simulatable in the current quantum simulators.

量子算法的关键步骤之一是准备一个具有不同特性的初始量子叠加状态。 这些所谓的国家准备算法对于量子算法的行为至关重要, 而复杂的国家准备算法很难正确有效地发展。 本文展示了 Pqasm : 一个与 Coq 验证助理一起实施的高度保障框架, 允许我们验证我们的Pqasm 工具, 以正确反映量子程序行为。 框架的关键在于降低一个含有量子叠加状态的程序对程序状态的正确性保证的程式的准确性保证程序。 这种递减允许开发一个有效的测试框架, 用于测试古典计算机的量子编制算法实施情况。 据认为, 在这一点之前,这是一个难以解决的难题。 我们使用Quick Chick 地产测试框架来测试国家准备程序。 我们评估了我们使用 Pqasm 实施的超过5个案例研究的实效; 这些案例在目前的量子模拟器中甚至无法模拟。


Article 20

Title@2025-06-07 (6): Object-Spatial Programming

Title: Object-Spatial Programming Objekträumliche Programmierung 物体空间方案拟订 2503.15812v6

Authors (1): Jason Mars

The evolution of programming languages from low-level assembly to high-level abstractions demonstrates a fundamental principle: by constraining how programmers express computation and enriching semantic information at the language level, we can make previously undecidable program properties tractable for optimization. Building on the insight of this undecidability-lessening effect, we introduce Object-Spatial Programming (OSP), a novel programming model that extends Object-Oriented Programming by introducing topologically-aware class constructs called archetypes. OSP fundamentally inverts the traditional relationship between data and computation, enabling computation to move to data through four specialized archetypes: object classes, node classes (discrete data locations), edge classes (first-class relationships), and walker classes (mobile computational entities). By making topological relationships and traversal patterns explicit at the language level, OSP transforms previously opaque program behaviors into observable, optimizable patterns. This semantic enhancement enables runtime systems to make informed decisions about data locality, parallel execution, and distribution strategies based on explicit topology, while providing programmers with intuitive abstractions for modeling complex systems where connection topology is central to the computational model. The paradigm addresses fundamental limitations in traditional programming models when representing agent-based systems, social networks, neural networks, distributed systems, finite state machines, and other spatially-oriented computational problems, demonstrating how thoughtful abstraction design can simultaneously enhance programmer expressiveness and enable sophisticated system-level optimizations across the computing stack.

编程语言从低级汇编到高级抽象的演进体现了一项根本原则:通过约束程序员表达计算的方式并在语言层面丰富语义信息,我们可以使先前不可判定的程序性质变得可用于优化。基于这种降低不可判定性效应的洞察,我们引入了对象-空间编程(OSP),这是一种新的编程模型,它通过引入称为原型(archetype)的拓扑感知类构造来扩展面向对象编程。OSP从根本上反转了数据与计算的传统关系,使计算能够通过四种专门的原型向数据移动:对象类、节点类(离散的数据位置)、边类(一等关系)和行走者类(可移动的计算实体)。通过在语言层面显式表达拓扑关系和遍历模式,OSP将先前不透明的程序行为转化为可观察、可优化的模式。这种语义增强使运行时系统能够基于显式拓扑,对数据局部性、并行执行和分布策略做出知情决定,同时为程序员提供直观的抽象,用于对以连接拓扑为计算模型核心的复杂系统建模。该范式解决了传统编程模型在表示基于代理的系统、社交网络、神经网络、分布式系统、有限状态机及其他面向空间的计算问题时的根本局限,展示了深思熟虑的抽象设计如何同时增强程序员的表达能力并在整个计算栈上实现复杂的系统级优化。
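The node, edge, and walker archetypes can be mimicked in plain Python to show how computation moves to data. This is only an analogy under stated assumptions: `Node`, `Edge`, `connect`, and `SumWalker` are hypothetical names, and the sketch does not reproduce OSP's language-level constructs or semantics.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A discrete data location."""
    name: str
    value: int
    edges: List["Edge"] = field(default_factory=list)


@dataclass
class Edge:
    """A first-class, labelled relationship between two nodes."""
    source: Node
    target: Node
    label: str


def connect(source: Node, target: Node, label: str) -> Edge:
    edge = Edge(source, target, label)
    source.edges.append(edge)
    return edge


class SumWalker:
    """A mobile computation: it travels along edges and acts at each node."""

    def __init__(self) -> None:
        self.total = 0
        self.visited: List[str] = []

    def visit(self, node: Node) -> None:
        self.total += node.value
        self.visited.append(node.name)

    def walk(self, start: Node, label: str) -> None:
        current: Optional[Node] = start
        seen = set()  # node identities, so cycles terminate
        while current is not None and id(current) not in seen:
            seen.add(id(current))
            self.visit(current)
            successors = [e.target for e in current.edges if e.label == label]
            current = successors[0] if successors else None


a, b, c = Node("a", 1), Node("b", 2), Node("c", 3)
connect(a, b, "next")
connect(b, c, "next")
connect(c, a, "next")  # a cycle, to show the walker still stops

walker = SumWalker()
walker.walk(a, "next")
```

The traversal pattern lives in the walker rather than in the data structure, which is the inversion the abstract points to.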


Article 21

Title@2025-06-07 (6): Hadamard-$Π$: Equational Quantum Programming

Title: Hadamard-$Π$: Equational Quantum Programming Hadamard-$Π$: Äquatorielle Quantenprogrammierung Hadamard-$Π$: 等量量方案编制 2506.06835v1

Authors (3): Wang Fang, Chris Heunen, Robin Kaarsgaard

Quantum computing offers advantages over classical computation, yet the precise features that set the two apart remain unclear. In the standard quantum circuit model, adding a 1-qubit basis-changing gate – commonly chosen to be the Hadamard gate – to a universal set of classical reversible gates yields computationally universal quantum computation. However, the computational behaviours enabled by this addition are not fully characterised. We give such a characterisation by introducing a small quantum programming language extending the universal classical reversible programming language $\Pi$ with a single primitive corresponding to the Hadamard gate. The language comes equipped with a sound and complete categorical semantics that is specified by a purely equational theory, enabling reasoning about the equivalence of quantum programs in a way that can be automated. Completeness is shown by means of a novel finite presentation, and corresponding synthesis algorithm, for the groups of orthogonal matrices with entries in the ring $\mathbb{Z}[\tfrac{1}{\sqrt{2}}]$.

量子计算比经典计算有优势,但将两者区分开来的确切特征仍不清楚。在标准量子电路模型中,向一组通用的经典可逆门添加一个单量子比特的基变换门(通常选为Hadamard门),即可得到计算上通用的量子计算。然而,由此添加而带来的计算行为尚未得到完整刻画。我们通过引入一种小型量子编程语言给出了这样的刻画,该语言用一个对应于Hadamard门的原语扩展了通用的经典可逆编程语言 $\Pi$。该语言配有由纯等式理论给出的可靠且完备的范畴语义,使关于量子程序等价性的推理能够自动化。完备性是通过元素取自环 $\mathbb{Z}[\tfrac{1}{\sqrt{2}}]$ 的正交矩阵群的一种新的有限表示及相应的合成算法来证明的。
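As a small worked check of why the Hadamard gate fits the matrix groups named in the abstract, the standard definition of the gate gives a real matrix whose entries lie in $\mathbb{Z}[\tfrac{1}{\sqrt{2}}]$ and which is orthogonal:

```latex
H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
\qquad
H^{\mathsf{T}} H = \frac{1}{2}\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = I .
```

Since $H$ is also symmetric, this gives $H^2 = I$, so the gate is its own inverse.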


Article 22

Title@2025-06-07 (6): Denotational Semantics for Probabilistic and Concurrent Programs

Title: Denotational Semantics for Probabilistic and Concurrent Programs Denotationelle Semantik für probabilistische und gleichzeitige Programme 概率和同时方案的说明性代记性语义学 2503.02768v2

Authors (3): Noam Zilberstein, Daniele Gorla, Alexandra Silva

We develop a denotational model for probabilistic and concurrent imperative programs, a class of programs with standard control flow via conditionals and while-loops, as well as probabilistic actions and parallel composition. Whereas semantics for concurrent or randomized programs in isolation is well studied, their combination has not been thoroughly explored and presents unique challenges. The crux of the problem is that interactions between control flow, probabilistic actions, and concurrent execution cannot be captured by straightforward generalizations of prior work on pomsets and convex languages, prominent models for those effects, individually. Our model has good domain theoretic properties, important for semantics of unbounded loops. We also prove two adequacy theorems, showing that the model subsumes typical powerdomain semantics for concurrency and convex powerdomain semantics for probabilistic nondeterminism.

我们开发了一个概率和同时需要的程序的分解模型,这是一组通过有条件和时空操作以及概率动作和平行构成进行标准控制流程的程序。虽然对同时或随机的孤立程序的语义进行了很好的研究,但它们的组合还没有进行彻底的探讨,并提出了独特的挑战。 问题的症结在于控制流动、概率行动和同时执行之间的相互作用无法通过直接概括先前关于孔塞和混凝土语言的工作,这些效应的突出模型,即个别的显著模型来捕捉。我们的模型具有良好的域域域论特性,对于无界环的语义十分重要。 我们还证明了两种适当的理论,表明模型的子数是典型的等同货币和共通电力的精度。


Article 23

Title@2025-06-06 (5): Simplifying explicit subtyping coercions in a polymorphic calculus with effects

Title: Simplifying explicit subtyping coercions in a polymorphic calculus with effects Vereinfachung von expliziten Subtyping-Zwangen in einem polymorphen Kalkül mit Effekten 在具有效果的多形态微积分中简化显性亚型强制 2404.04218v2

Authors (2): Filip Koprivec, Matija Pretnar

Algebraic effect handlers are becoming an increasingly popular way of structuring effectful computations, and their performance is often a concern. One of the proposed approaches towards efficient compilation is tracking effect information through explicit subtyping coercions. However, in the presence of polymorphism, these coercions are compiled into additional arguments of compiled functions, incurring significant overhead. In this paper, we present a polymorphic effectful calculus, identify simplification phases needed to reduce the number of unnecessary constraints, and prove that they preserve semantics. In addition, we implement the simplification algorithm in the Eff language and evaluate its performance on a number of benchmarks. Though we do not prove the optimality of the presented simplifications, the results show that the algorithm eliminates all coercions, resulting in code as efficient as manually monomorphised code.

变形效应处理器正日益成为一种日益流行的、结构化有效计算方法,其性能经常引起人们的关注。高效汇编的拟议方法之一是通过明确的分型强制手段跟踪效果信息。然而,在多种形态的情况下,这些强制手段被汇编成编集功能的更多论据,引起大量间接费用。在本文中,我们提出了一个多变效应计算法,确定减少不必要限制数量所需的简化阶段,并证明它们保存语义。此外,我们用埃夫语言实施简化算法,并根据一些基准评估其绩效。尽管我们没有证明所提出的简化方法的最佳性,但结果显示算法消除了所有胁迫,从而产生了与手动单式法一样有效的代码。


Article 24

Title@2025-06-06 (5): Reasoning about External Calls

Title: Reasoning about External Calls Begründung externer Anrufe 外部呼叫理由 2506.06544v1

Authors (4): Sophia Drossopoulou, Julian Mackay, Susan Eisenbach, James Noble

In today’s complex software, internal trusted code is tightly intertwined with external untrusted code. To reason about internal code, programmers must reason about the potential effects of calls to external code, even though that code is not trusted and may not even be available. The effects of external calls can be limited if internal code is programmed defensively, limiting potential effects by limiting access to the capabilities necessary to cause those effects. This paper addresses the specification and verification of internal code that relies on encapsulation and object capabilities to limit the effects of external calls. We propose new assertions for access to capabilities, new specifications for limiting effects, and a Hoare logic to verify that a module satisfies its specification, even while making external calls. We illustrate the approach through a running example with mechanised proofs, and prove soundness of the Hoare logic.

在当今复杂的软件中,内部信任的代码与外部不受信任的代码紧密相连。为了解释内部代码,程序员必须说明呼叫外部代码的潜在影响,即使该代码不可信,甚至可能无法提供。如果内部代码的编程是防御性的,外部代码的影响可能有限,限制潜在影响,限制产生这些影响所需的能力。本文涉及内部代码的规格和核查,该代码依赖封装和物体能力来限制外部通话的影响。我们提出了获取能力的新主张、限制效应的新规格和Hoare逻辑,以核实模块是否满足其规格,即使发出外部电话。我们通过机械化的证明来说明这一方法,并证明Hoare逻辑的正确性。


Article 25

Title@2025-06-06 (5): CompilerGPT: Leveraging Large Language Models for Analyzing and Acting on Compiler Optimization Reports

Title: CompilerGPT: Leveraging Large Language Models for Analyzing and Acting on Compiler Optimization Reports CompilerGPT: Nutzung großer Sprachmodelle zur Analyse und Umsetzung von Compiler-Optimierungsberichten 汇编者最佳化报告:利用大语言模型进行分析并采取行动 2506.06227v1

Authors (1): Peter Pirkelbauer

Current compiler optimization reports often present complex, technical information that is difficult for programmers to interpret and act upon effectively. This paper assesses the capability of large language models (LLMs) to understand compiler optimization reports and automatically rewrite the code accordingly. To this end, the paper introduces CompilerGPT, a novel framework that automates the interaction between compilers, LLMs, and a user-defined test and evaluation harness. CompilerGPT’s workflow runs several iterations and reports on the obtained results. Experiments with two leading LLM models (GPT-4o and Claude Sonnet), optimization reports from two compilers (Clang and GCC), and five benchmark codes demonstrate the potential of this approach. Speedups of up to 6.5x were obtained, though not consistently in every test. This method holds promise for improving compiler usability and streamlining the software optimization process.

目前的汇编优化报告往往提供复杂的技术信息,程序员难以解释和有效采取行动。本文评估了大型语言模型(LLM)理解汇编优化报告并自动相应重写代码的能力。为此,本文件介绍了《汇编GPT》,这是一个使汇编者、LLMS和用户定义的测试和评价工具之间互动自动化的新框架。《汇编GPT》的工作流程运行了若干次迭代,并报告了取得的成果。与两个主要LLM模型(GPT-4o和Claude Sonnet)的实验,两个汇编者(Claude Sonnet)的优化报告以及五个基准代码显示了这一方法的潜力。获得了多达6.5x的加速,但并不是每次测试都一致。这种方法有望改进汇编的可用性和简化软件优化进程。


Article 26

Title@2025-06-06 (5): HEC: Equivalence Verification Checking for Code Transformation via Equality Saturation

Title: HEC: Equivalence Verification Checking for Code Transformation via Equality Saturation HEC: Überprüfung der Gleichwertigkeit auf Code-Transformation durch Gleichstellungssättigung HEC: 通过平等饱和对代码转换进行等同核查 2506.02290v2

Authors (5): Jiaqi Yin, Zhan Song, Nicolas Bohm Agostini, Antonino Tumeo, Cunxi Yu

In modern computing systems, compilation employs numerous optimization techniques to enhance code performance. Source-to-source code transformations, which include control flow and datapath transformations, have been widely used in High-Level Synthesis (HLS) and compiler optimization. While researchers actively investigate methods to improve performance with source-to-source code transformations, they often overlook the significance of verifying their correctness. Current tools cannot provide a holistic verification of these transformations. This paper introduces HEC, a framework for equivalence checking that leverages the e-graph data structure to comprehensively verify functional equivalence between programs. HEC utilizes MLIR as its frontend and integrates MLIR into the e-graph framework. Through the combination of dynamic and static e-graph rewriting, HEC facilitates the validation of comprehensive code transformations. We demonstrate the effectiveness of HEC on PolyBenchC benchmarks, successfully verifying loop unrolling, tiling, and fusion transformations. HEC processes over 100,000 lines of MLIR code in 40 minutes with predictable runtime scaling. Importantly, HEC identified two critical compilation errors in mlir-opt: loop boundary check errors causing unintended executions during unrolling, and memory read-after-write violations in loop fusion that alter program semantics. These findings demonstrate HEC's practical value in detecting real-world compiler bugs and highlight the importance of formal verification in optimization pipelines.

在现代计算系统中,编译采用多种优化技术来提高代码性能。源到源代码转换,包括控制流程和数据路径转换,在高级合成(HLS)和编译器优化中被广泛使用。研究人员积极调查改进源到源代码转换的性能的方法,但往往忽视了核实其正确性的重要性。当前工具无法对这些转换进行整体核查。本文件介绍了HEC,这是一个对等检查框架,它利用电子制图数据结构全面核查程序之间的功能等同性。 HEC利用MLIR作为其前端,并将MLIR纳入电子版图框架。通过动态和静态电子制图的结合,HEC为全面代码转换的验证提供了便利。我们展示了HC在多边代码转换基准上的有效性,成功地核查了循环松动、节动和聚变。HEC在40分钟内处理超过100 000行的MLIR代码,并有可预测的运行时间缩放量。YIC在 ml-opt中发现了两个关键的编译错误:通过动态和静态电子文字重写来验证导致循环处决的准确性程序,这些修正的校正校验的校正的校验结果。


Article 27

Title@2025-06-06 (5): A Sound and Complete Characterization of Fair Asynchronous Session Subtyping

Title: A Sound and Complete Characterization of Fair Asynchronous Session Subtyping Eine Klang- und vollständige Charakterisierung von Fair Asynchron Session Subtyping 公平非同步届会的健全和完整特点 2506.06078v1

Authors (3): Mario Bravetti, Luca Padovani, Gianluigi Zavattaro

Session types are abstractions of communication protocols enabling the static analysis of message-passing processes. Refinement notions for session types are key to support safe forms of process substitution while preserving their compatibility with the rest of the system. Recently, a fair refinement relation for asynchronous session types has been defined allowing the anticipation of message outputs with respect to an unbounded number of message inputs. This refinement is useful to capture common patterns in communication protocols that take advantage of asynchrony. However, while the semantic (à la testing) definition of such refinement is straightforward, its characterization has proved to be quite challenging. In fact, only a sound but not complete characterization is known so far. In this paper we close this open problem by presenting a sound and complete characterization of asynchronous fair refinement for session types. We relate this characterization to those given in the literature for synchronous session types by leveraging a novel labelled transition system of session types that embeds their asynchronous semantics.

会话类型是通信协议的抽象性,能够静态分析电文传递过程。 会话类型的完善概念是支持安全的程序替代形式,同时保持其与系统其他部分兼容性的关键。 最近,对不同步的会话类型规定了公平的完善关系,允许对无限制的电文输入量预期电文输出。这种完善有助于捕捉通信协议中利用无同步式的通用模式。然而,虽然这种改进的语义(à la testing)定义是直截了当的,但其定性证明是相当具有挑战性的。事实上,迄今为止,只有一种合理但不完整的描述才为人所知。在本文中,我们通过对会话类型提出对无同步式的公平改进的可靠且完备的描述来结束这一开放问题。我们将这种定性与文献中为同步式会话类型而给出的特征联系起来,办法是利用一个新颖的、贴有标签的会话类型过渡系统,嵌入其非同步式的语义。


Article 28

Title@2025-06-06 (5): An Execution Model for RICE

Title: An Execution Model for RICE Ein Ausführungsmodell für RICE RICE 执行模型 2506.05839v1

Authors (1): Steven Libby

In this paper, we build on the previous work of the RICE compiler by giving its execution model. We show the restrictions to the FlatCurry language that were made to produce executable code, and present the execution model using operational semantics similar to Launchbury. Finally, we show that the execution model conforms with the standard operational semantics for Curry.

在本文中,我们以RICE编译者先前的工作为基础,给出了执行模型。我们展示了对FlatCurry语言的限制,这些语言是用来制作可执行代码的,我们用类似于Launchbury的操作语义来展示执行模型。 最后,我们展示了执行模型符合Curry的标准操作语义。


Article 29

Title@2025-06-06 (5): Mirage: A Multi-Level Superoptimizer for Tensor Programs

Title: Mirage: A Multi-Level Superoptimizer for Tensor Programs Mirage: Ein Multi-Level-Superoptimizer für Tensor-Programme 幻影:向导方案多层次超强激励器 2405.05751v3

Authors (10): Mengdi Wu, Xinhao Cheng, Shengyu Liu, Chunan Shi, Jianan Ji, Kit Ao, Praveen Velliengiri, Xupeng Miao, Oded Padon, Zhihao Jia

We introduce Mirage, the first multi-level superoptimizer for tensor programs. A key idea in Mirage is $\mu$Graphs, a uniform representation of tensor programs at the kernel, thread block, and thread levels of the GPU compute hierarchy. $\mu$Graphs enable Mirage to discover novel optimizations that combine algebraic transformations, schedule transformations, and generation of new custom kernels. To navigate the large search space, Mirage introduces a pruning technique based on abstraction that significantly reduces the search space and provides a certain optimality guarantee. To ensure that the optimized $\mu$Graph is equivalent to the input program, Mirage introduces a probabilistic equivalence verification procedure with strong theoretical guarantees. Our evaluation shows that Mirage outperforms existing approaches by up to 3.3$\times$ even for DNNs that are widely used and heavily optimized. Mirage is publicly available at https://github.com/mirage-project/mirage.

我们引入了Mirage,这是首个面向张量程序的多层次超优化器。Mirage中的一个关键想法是 $\mu$Graphs,即在GPU计算层次的内核、线程块和线程层级上对张量程序的统一表示。$\mu$Graphs 使Mirage能够发现将代数变换、调度变换和新的定制内核生成相结合的新型优化。为了在庞大的搜索空间中导航,Mirage引入了一种基于抽象的剪枝技术,大大缩小了搜索空间,并提供一定的最优性保证。为了确保优化后的 $\mu$Graph 等价于输入程序,Mirage引入了一种具有强理论保证的概率等价性验证程序。我们的评估表明,即使对于被广泛使用且经过高度优化的DNN,Mirage的性能也超过现有方法至多3.3$\times$。Mirage在 https://github.com/mirage-project/mirage 上公开提供。
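To give a feel for probabilistic equivalence checking, the Python sketch below compares two matrix programs on random inputs with arithmetic done modulo a prime, in the spirit of randomized identity testing. It is a simplified illustration and not Mirage's verification procedure; the modulus `P`, the helper names, and the trial count are assumptions of this sketch.

```python
import random
from typing import Callable, List

Matrix = List[List[int]]
P = 2_147_483_647  # prime modulus (2**31 - 1); an assumption of this sketch


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Matrix product with every entry reduced modulo P."""
    return [[sum(x * y for x, y in zip(row, col)) % P for col in zip(*b)]
            for row in a]


def random_matrix(rng: random.Random, n: int) -> Matrix:
    return [[rng.randrange(P) for _ in range(n)] for _ in range(n)]


def probably_equivalent(f: Callable[..., Matrix], g: Callable[..., Matrix],
                        n_inputs: int, size: int = 3, trials: int = 20,
                        seed: int = 0) -> bool:
    """Return False on the first disagreement, True if all trials agree.

    A True result is evidence of equivalence, not a proof.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        inputs = [random_matrix(rng, size) for _ in range(n_inputs)]
        if f(*inputs) != g(*inputs):
            return False
    return True


# Re-associating a chain of products is a valid algebraic transformation.
left = lambda a, b, c: matmul(matmul(a, b), c)
right = lambda a, b, c: matmul(a, matmul(b, c))
assoc_ok = probably_equivalent(left, right, n_inputs=3)

# Swapping operands is not, and random inputs expose it almost surely.
swap_ok = probably_equivalent(lambda a, b: matmul(a, b),
                              lambda a, b: matmul(b, a), n_inputs=2)
```

Working modulo a prime keeps the arithmetic exact, so a mismatch can never be blamed on floating-point rounding.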


Article 30

Title@2025-06-06 (5): CoopetitiveV: Leveraging LLM-powered Coopetitive Multi-Agent Prompting for High-quality Verilog Generation

Title: CoopetitiveV: Leveraging LLM-powered Coopetitive Multi-Agent Prompting for High-quality Verilog Generation CoopetitiveV: LLM-powered Coopetitive Multi-Agent für hochwertige Verilog-Generation 协作V:利用LLM-动力协同协作的多方协作促进高品质活性一代 2412.11014v2

Authors (9): Zhendong Mi, Renming Zheng, Haowen Zhong, Yue Sun, Seth Kneeland, Sayan Moitra, Ken Kutzer, Zhaozhuo Xu, Shaoyi Huang

Recent advances in agentic LLMs have demonstrated great capabilities in Verilog code generation. However, existing approaches either use LLM-assisted single-agent prompting or cooperation-only multi-agent learning, which will lead to: (i) Degeneration issue for single-agent learning: characterized by diminished error detection and correction capabilities; (ii) Error propagation in cooperation-only multi-agent learning: erroneous information from the former agent will be propagated to the latter through prompts, which can make the latter agents generate buggy code. In this paper, we propose an LLM-based coopetitive multi-agent prompting framework, in which the agents can not only collaborate with each other to form the generation pipeline, but also create a healthy competitive mechanism to improve the generating quality. Our experimental results show that the coopetitive multi-agent framework can effectively mitigate the degeneration risk and reduce the error propagation while improving code error correction capabilities, resulting in higher quality Verilog code generation. The effectiveness of our approach is validated through extensive experiments. On the VerilogEval Machine and Human datasets, CoopetitiveV+GPT-4 achieves 99.2% and 99.1% pass@10 scores, respectively. On RTLLM, CoopetitiveV+GPT-4 obtains 100% syntax and 99.9% functionality pass@5 scores.

智能体化 LLM 的最新进展在 Verilog 代码生成方面展现出强大的能力。然而,现有方法要么使用 LLM 辅助的单智能体提示,要么使用仅合作式的多智能体学习,这会导致:(一)单智能体学习的退化问题:表现为错误检测与纠正能力下降;(二)仅合作式多智能体学习中的错误传播:前一个智能体的错误信息会通过提示传递给后一个智能体,从而使后续智能体生成有缺陷的代码。在本文中,我们提出一种基于 LLM 的竞合(coopetitive)多智能体提示框架,其中各智能体不仅可以相互协作构成生成流水线,还能形成良性竞争机制以提高生成质量。我们的实验结果表明,该竞合多智能体框架能够有效缓解退化风险并减少错误传播,同时提高代码纠错能力,从而生成更高质量的 Verilog 代码。我们方法的有效性通过大量实验得到验证。在 VerilogEval Machine 和 Human 数据集上,CoopetitiveV+GPT-4 分别取得 99.2% 和 99.1% 的 pass@10 分数;在 RTLLM 上,CoopetitiveV+GPT-4 取得 100% 的语法 pass@5 分数和 99.9% 的功能 pass@5 分数。
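The pass@10 and pass@5 figures above are pass@k scores. The abstract does not say how they were computed; the sketch below assumes the commonly used unbiased estimator, which takes `n` generated samples per problem, of which `c` pass the tests, and estimates the probability that at least one of `k` drawn samples passes: $1 - \binom{n-c}{k} / \binom{n}{k}$. The function name `pass_at_k` is chosen here for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of samples generated, c: number of samples that pass,
    k: number of samples the metric is allowed to draw (1 <= k <= n).
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples exist, so any k draws include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Benchmark-level score: average the per-problem estimates.
# The (n, c) pairs below are made-up illustration data, not paper results.
per_problem = [(10, 10), (10, 1), (10, 0)]
score = sum(pass_at_k(n, c, 5) for n, c in per_problem) / len(per_problem)
```

Averaging per-problem estimates, rather than pooling all samples, keeps each problem weighted equally regardless of how many of its samples pass.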


Article 31

Title@2025-06-06 (5): Autocomp: LLM-Driven Code Optimization for Tensor Accelerators

Title: Autocomp: LLM-Driven Code Optimization for Tensor Accelerators Autocomp: LLM-gesteuerte Code-Optimierung für Tensor-Beschleuniger 自动comp: LLM- Driven 代码对 Tensor 加速器的优化 2505.18574v2

Authors (4): Charles Hong, Sahil Bhatia, Alvin Cheung, Yakun Sophia Shao

Hardware accelerators, especially those designed for tensor processing, have become ubiquitous in today’s computing landscape. However, even with significant efforts in building compilers, programming these tensor accelerators remains challenging, leaving much of their potential underutilized. Recently, large language models (LLMs), trained on large amounts of code, have shown significant promise in code generation and optimization tasks, but generating low-resource languages like specialized tensor accelerator code still poses a significant challenge. We tackle this challenge with Autocomp, an approach that empowers accelerator programmers to leverage domain knowledge and hardware feedback to optimize code via an automated LLM-driven search. We accomplish this by: 1) formulating each optimization pass as a structured two-phase prompt, divided into planning and code generation phases, 2) inserting domain knowledge during planning via a concise and adaptable optimization menu, and 3) integrating correctness and performance metrics from hardware as feedback at each search iteration. Across three categories of representative workloads and two different accelerators, we demonstrate that Autocomp-optimized code runs 5.6x (GEMM) and 2.7x (convolution) faster than the vendor-provided library, and outperforms expert-level hand-tuned code by 1.4x (GEMM), 1.1x (convolution), and 1.3x (fine-grained linear algebra). Additionally, we demonstrate that optimization schedules generated from Autocomp can be reused across similar tensor operations, improving speedups by up to 24% under a fixed sample budget.

硬件加速器,尤其是为张量处理而设计的加速器,在当今的计算领域已无处不在。然而,即使在编译器构建方面投入了大量精力,为这些张量加速器编程仍然具有挑战性,其大部分潜力未得到充分利用。最近,在大量代码上训练的大型语言模型(LLM)在代码生成和优化任务中展现出巨大潜力,但生成专用张量加速器代码这类低资源语言仍是一项重大挑战。我们用 Autocomp 应对这一挑战,该方法使加速器程序员能够借助领域知识和硬件反馈,通过自动化的 LLM 驱动搜索来优化代码。我们通过以下方式实现这一点:1)将每个优化轮次(pass)表述为结构化的两阶段提示,分为规划阶段和代码生成阶段;2)在规划阶段通过简洁且可调整的优化菜单引入领域知识;3)在每次搜索迭代中将来自硬件的正确性和性能指标作为反馈。在三类代表性工作负载和两种不同的加速器上,我们证明 Autocomp 优化的代码比厂商提供的库快 5.6 倍(GEMM)和 2.7 倍(卷积),并且比专家级手工调优代码快 1.4 倍(GEMM)、1.1 倍(卷积)和 1.3 倍(细粒度线性代数)。此外,我们还证明 Autocomp 生成的优化调度可以在相似的张量运算之间复用,在固定采样预算下将加速比最多提高 24%。
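The three ingredients listed in the Autocomp abstract (a plan phase, a code-generation phase, and hardware feedback at each iteration) can be arranged into a search loop roughly as follows. This is a toy sketch, not Autocomp's implementation: the beam-search strategy, the menu entries, and the functions `propose_plan`, `implement_plan`, and `measure` are all hypothetical stand-ins, with the two LLM calls and the hardware run replaced by deterministic stubs so that the example runs on its own.

```python
import random
from dataclasses import dataclass

# Hypothetical optimization menu supplying domain knowledge to the planner.
MENU = ["tile loops", "hoist loads", "unroll inner loop"]


@dataclass(frozen=True)
class Candidate:
    code: str
    latency: float


def propose_plan(code, menu, rng):
    """Stand-in for the planning-phase LLM call: pick one menu entry."""
    return rng.choice(menu)


def implement_plan(code, plan):
    """Stand-in for the code-generation LLM call: record the applied plan."""
    return f"{code} | {plan}"


def measure(code):
    """Stand-in for a hardware run: returns (is_correct, latency).

    Toy model: applying the same optimization twice breaks the code, and
    each distinct optimization lowers latency.
    """
    applied = code.split(" | ")[1:]
    distinct = set(applied)
    return len(applied) == len(distinct), 100.0 / (1 + len(distinct))


def search(start_code, iterations=4, beam_width=2, samples=4, seed=0):
    rng = random.Random(seed)
    beam = [Candidate(start_code, measure(start_code)[1])]
    for _ in range(iterations):
        pool = list(beam)
        for cand in beam:
            for _ in range(samples):
                plan = propose_plan(cand.code, MENU, rng)       # phase 1
                new_code = implement_plan(cand.code, plan)      # phase 2
                correct, latency = measure(new_code)            # feedback
                if correct:                                     # drop wrong code
                    pool.append(Candidate(new_code, latency))
        unique = list(dict.fromkeys(pool))                      # stable dedupe
        beam = sorted(unique, key=lambda c: c.latency)[:beam_width]
    return beam[0]


best = search("baseline_kernel")
```

Separating planning from implementation means the menu only has to influence which transformation is attempted, while correctness and latency measurements decide which attempts survive to the next iteration.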


Article 32

Title@2025-06-05 (4): Classical notions of computation and the Hasegawa-Thielecke theorem

Title: Classical notions of computation and the Hasegawa-Thielecke theorem Klassische Begriffe der Berechnung und das Hasegawa-Thielecke-Theorem 经典的计算概念和长谷川-长谷川-希列克定理 2502.13033v2

Authors (3): Éléonore Mangel, Paul-André Melliès, Guillaume Munch-Maccagnoni

In the spirit of the Curry-Howard correspondence between proofs and programs, we define and study a syntax and semantics for classical logic equipped with a computationally involutive negation, using a polarised effect calculus. A main challenge in designing a denotational semantics is to accommodate both call-by-value and call-by-name evaluation strategies, which leads to a failure of associativity of composition. Building on the work of the third author, we devise the notion of dialogue duploid, which provides a non-associative and effectful counterpart to the notion of dialogue category introduced by the second author in his 2-categorical account, based on adjunctions, of logical polarities and continuations. We show that the syntax of the polarised calculus can be interpreted in any dialogue duploid, and that it defines in fact a syntactic dialogue duploid. As an application, we establish, by semantic as well as syntactic means, the Hasegawa-Thielecke theorem, which states that the notions of central map and of thunkable map coincide in any dialogue duploid (in particular, for any double negation monad on a symmetric monoidal category).

本着证明与程序之间的 Curry-Howard 对应的精神,我们使用一种极化效应演算,为带有计算上对合(involutive)否定的经典逻辑定义并研究了语法和语义。设计指称语义的一个主要挑战是同时容纳按值调用和按名调用两种求值策略,而这会导致复合的结合律失效。基于第三作者的工作,我们提出了对话 duploid(dialogue duploid)的概念;第二作者曾在其基于伴随的、关于逻辑极性与续延(continuation)的 2-范畴论述中引入对话范畴的概念,而对话 duploid 为它提供了一个非结合且带效应的对应物。我们证明,极化演算的语法可以在任意对话 duploid 中得到解释,并且它实际上定义了一个语法对话 duploid。作为应用,我们通过语义和语法两种手段建立了 Hasegawa-Thielecke 定理,该定理指出,在任意对话 duploid 中(特别地,对于对称幺半范畴上的任意双重否定单子),中心映射(central map)与可 thunk 映射(thunkable map)的概念是一致的。


Article 33

Title@2025-06-05 (4): Efficient Formal Verification of Quantum Error Correcting Programs

Title: Efficient Formal Verification of Quantum Error Correcting Programs Effiziente formale Überprüfung von Quantenfehler-Korrekturprogrammen 量化错误纠正程序的有效正式核实 2504.07732v2

Authors (5): Qifan Huang, Li Zhou, Wang Fang, Mengyu Zhao, Mingsheng Ying

Quantum error correction (QEC) is fundamental for suppressing noise in quantum hardware and enabling fault-tolerant quantum computation. In this paper, we propose an efficient verification framework for QEC programs. We define an assertion logic and a program logic specifically crafted for QEC programs and establish a sound proof system. We then develop an efficient method for handling verification conditions (VCs) of QEC programs: for Pauli errors, the VCs are reduced to classical assertions that can be solved by SMT solvers, and for non-Pauli errors, we provide a heuristic algorithm. We formalize the proposed program logic in the Coq proof assistant, making it a verified QEC verifier. Additionally, we implement an automated QEC verifier, Veri-QEC, for verifying various fault-tolerant scenarios. We demonstrate the efficiency and broad functionality of the framework by performing different verification tasks across various scenarios. Finally, we present a benchmark of 14 verified stabilizer codes.

量子纠错(QEC)是抑制量子硬件噪声、实现容错量子计算的基础。在本文中,我们提出了一个面向 QEC 程序的高效验证框架。我们定义了专门为 QEC 程序设计的断言逻辑和程序逻辑,并建立了一个可靠的(sound)证明系统。随后,我们提出了一种处理 QEC 程序验证条件(VC)的高效方法:对于 Pauli 错误,VC 被归约为可由 SMT 求解器求解的经典断言;对于非 Pauli 错误,我们给出了一种启发式算法。我们在 Coq 证明助手中形式化了所提出的程序逻辑,使其成为一个经过验证的 QEC 验证器。此外,我们实现了一个自动化的 QEC 验证器 Veri-QEC,用于验证各种容错场景。我们通过在不同场景下执行不同的验证任务,展示了该框架的效率和广泛的功能。最后,我们给出了一个包含 14 个已验证稳定子码的基准。
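To see why Pauli errors lead to purely classical verification conditions, consider the 3-qubit bit-flip repetition code: an X-error pattern is just a bit vector, the syndrome is a pair of parities, and "the decoder corrects every error of weight at most t" becomes a Boolean statement. The sketch below checks that statement by brute-force enumeration instead of an SMT solver, and it is not Veri-QEC's encoding; the names `CHECKS`, `DECODER`, and `verify` are assumptions made for this illustration.

```python
from itertools import product

# Z0Z1 and Z1Z2 stabilizers of the 3-qubit bit-flip code, given as the
# pairs of qubits whose parity each check measures.
CHECKS = [(0, 1), (1, 2)]

# Lookup decoder: syndrome -> X correction to apply (one bit per qubit).
DECODER = {
    (0, 0): (0, 0, 0),
    (1, 0): (1, 0, 0),
    (1, 1): (0, 1, 0),
    (0, 1): (0, 0, 1),
}


def syndrome(error):
    """Parity of the X-error bits on each stabilizer's support."""
    return tuple(error[i] ^ error[j] for i, j in CHECKS)


def corrected(error):
    """True if error plus decoder correction leaves no residual X error."""
    fix = DECODER[syndrome(error)]
    residual = tuple(e ^ f for e, f in zip(error, fix))
    return residual == (0, 0, 0)


def verify(max_weight):
    """Classical assertion: every X error of weight <= max_weight is fixed."""
    return all(
        corrected(error)
        for error in product((0, 1), repeat=3)
        if sum(error) <= max_weight
    )


single_ok = verify(1)   # the code corrects any single bit flip
double_ok = verify(2)   # weight-2 errors are miscorrected into a logical X
```

Enumeration works here because there are only eight error patterns; for realistic code sizes the same assertion would be handed to a solver.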


Article 34

Title@2025-06-05 (4): hdl2v: A Code Translation Dataset for Enhanced LLM Verilog Generation

Title: hdl2v: A Code Translation Dataset for Enhanced LLM Verilog Generation hdl2v: Ein Code-Übersetzungsdatensatz für verbesserte LLM Verilog-Generierung hdl2v: 用于强化LLM Verilog 生成的代码翻译数据集 2506.04544v1

Authors (6): Charles Hong, Brendan Roberts, Huijae An, Alex Um, Advay Ratan, Yakun Sophia Shao

Large language models (LLMs) are playing an increasingly large role in domains such as code generation, including hardware code generation, where Verilog is the key language. However, the amount of publicly available Verilog code pales in comparison to the amount of code available for software languages like Python. In this work, we present hdl2v (“HDL-to-Verilog”), a dataset which seeks to increase the amount of available human-written Verilog data by translating or compiling three other hardware description languages - VHDL, Chisel, and PyMTL3 - to Verilog. Furthermore, we demonstrate the value of hdl2v in enhancing LLM Verilog generation by improving performance of a 32 billion-parameter open-weight model by up to 23% (pass@10) in VerilogEvalV2, without utilizing any data augmentation or knowledge distillation from larger models. We also show hdl2v’s ability to boost the performance of a data augmentation-based fine-tuning approach by 63%. Finally, we characterize and analyze our dataset to better understand which characteristics of HDL-to-Verilog datasets can be expanded upon in future work for even better performance.

大型语言模型(LLM)在代码生成等领域发挥着越来越大的作用,其中也包括硬件代码生成,而 Verilog 是该领域的关键语言。然而,公开可用的 Verilog 代码数量与 Python 等软件语言的可用代码数量相比相形见绌。在这项工作中,我们提出 hdl2v(“HDL-to-Verilog”),这是一个数据集,旨在通过将另外三种硬件描述语言(VHDL、Chisel 和 PyMTL3)翻译或编译为 Verilog,来增加可用的人工编写的 Verilog 数据量。此外,我们展示了 hdl2v 在增强 LLM Verilog 生成方面的价值:在不使用任何数据增强或来自更大模型的知识蒸馏的情况下,将一个 320 亿参数开放权重模型在 VerilogEvalV2 上的性能最多提高 23%(pass@10)。我们还展示了 hdl2v 能够将一种基于数据增强的微调方法的性能提升 63%。最后,我们对数据集进行了刻画和分析,以更好地了解 HDL-to-Verilog 数据集的哪些特性可以在未来工作中进一步扩展,从而获得更好的性能。