Agent Skills '26

The First Workshop on Agent Skills — Design, Evaluation, and Optimization of Procedural Knowledge for LLM Agents
ACM CAIS May 26, 2026 · San Jose, CA · Full day

Overview

LLM-based agents are rapidly moving from research demos to production systems—but their effectiveness hinges on the procedural knowledge they can access at inference time. Agent Skills are an emerging answer: structured packages of instructions, scripts, and references that augment agents without model modification.

A Skill is built around a SKILL.md file, a format adopted across Claude Code, Gemini CLI, OpenClaw, Codex, and 30+ agent platforms. Eight papers focused on Skills appeared in February–March 2026. An audit of 3,984 Skills found that 37% contained security flaws. SkillsBench revealed that curated Skills improve pass rates by 16.2 points on average, yet self-generated Skills can hurt performance in nearly a third of tasks.

The field is moving fast and needs a dedicated venue. Agent Skills '26 brings together researchers and practitioners working on Skills design, benchmarking, optimization, security, and ecosystem infrastructure.

Program update: Agent Skills '26 received 103 submissions and accepted 45 posters and 6 oral presentations.

Accepted Program

The workshop program spans skill applications, safety, evaluation, and learning, with accepted papers presented across poster and oral sessions.

103 submissions received
45 accepted posters
6 oral presentations

Accepted papers are available now.

Topics of Interest

We invite contributions across the full lifecycle of Agent Skills. The topics below are not exhaustive; we welcome submissions in any related area.

  1. Agent Skills & Applications: Skills for agent harnesses (OpenClaw, NanoClaw, MetaClaw, Pi); design & authoring principles; structure–efficacy relationships; domain-specific patterns; composition & scaling
  2. Evaluation & Benchmarking: Frameworks treating Skills as first-class artifacts; metrics beyond pass rate; benchmark design; Skill impact measurement
  3. Safety & Supply Chain: Malicious Skill detection; formal verification; supply-chain attacks; prompt injection via Skills; ecosystems & infrastructure security
  4. Improvement & Learning: LLM-based Skill generation; RL refinement; evolutionary approaches; feedback-based improvement; trajectory mining for Skill discovery; continual learning

Author Information

Accepted papers were reviewed across two tracks:

All submissions must be in PDF format, conform to the ACM SIGPLAN proceedings template, and use the following LaTeX document class:

\documentclass[sigplan,review,anonymous]{acmart}

Submissions are double-blind (author identities are hidden from reviewers). Please ensure your manuscript does not reveal author identities. We encourage releasing reusable artifacts (code, Skills, datasets) after acceptance.
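As a rough starting point, the document class above can be dropped into a minimal manuscript skeleton like the one below. This is only a sketch: the title, author, affiliation, email, and section text are placeholders rather than anything prescribed by the workshop, and the official ACM template remains the reference for the full set of required fields. With the anonymous option set, acmart replaces the author block with an anonymized one in the compiled PDF, but the body text, file metadata, and any linked artifacts still need to be checked by hand.

```latex
% Minimal skeleton using the document class required above.
% All names and text below are placeholders (assumptions for illustration).
\documentclass[sigplan,review,anonymous]{acmart}

\begin{document}

\title{Placeholder Title}

% Author details are kept in the source; the anonymous option
% hides them in the compiled output.
\author{Placeholder Author}
\affiliation{%
  \institution{Placeholder Institution}
  \country{Placeholder Country}}
\email{author@example.org}

% In acmart the abstract must appear before \maketitle.
\begin{abstract}
Placeholder abstract.
\end{abstract}

\maketitle

\section{Introduction}
Placeholder text.

\end{document}
```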

This is a non-archival venue — accepted papers will not be included in the proceedings. Authors retain full rights to submit extended versions to archival conferences and journals.

Submission deadline: May 4, 2026 (AoE)
Author notification: May 13, 2026
Camera-ready: May 24, 2026 (AoE)
Workshop day: May 26, 2026

Camera-ready deadline: Accepted authors should submit final versions by May 24, 2026 (AoE).

OpenReview →

Schedule

08:30 – 08:40  Opening Remarks
08:40 – 09:25  Invited Talk 1 (Graham Neubig)
09:25 – 10:10  Invited Talk 2 (Yushun Dong)
10:10 – 10:30  Coffee Break
10:30 – 11:15  Invited Talk 3 (Yu Su)
11:15 – 12:00  Invited Talk 4 (Kanav Garg)
12:00 – 12:30  Lightning Oral Session
12:30 – 13:30  Conference Lunch
13:30 – 14:15  Invited Talk 5 (Dawn Song)
14:15 – 15:00  Invited Talk 6 (Manling Li)
15:00 – 15:30  Conference Coffee Break / Poster Session Setup
15:30 – 16:50  Poster Session
16:50 – 17:20  Panel Discussion
17:20 – 17:30  Closing (Award)

After Party

Continue the conversation after the workshop at the SkillsBench 1.0 Launch Party, presented by Google DeepMind.

Hosted with Google DeepMind

SkillsBench 1.0 Launch Party

BenchFlow, Google DeepMind, Kaggle, and Kernel Labs are bringing together researchers and practitioners working on skill design, benchmarking, optimization, security, and ecosystem infrastructure.

May 27, 2026
7:00–9:00 PM PT
San Francisco, CA
Approval-required registration
  • Live demos and talks on creating new benchmarks and RL environments with the BenchFlow SDK
  • Takeaways from building Kaggle's new agent benchmarks for open model evaluations
  • Additional speakers and community sessions
Register on Luma →

Invited Speakers

Dawn Song, UC Berkeley
Manling Li, Northwestern University
Yushun Dong, Florida State
Kanav Garg, Core Automation
Yu Su, Ohio State University
Rob Ennals, New Public
Jesse Vincent, Prime Radiant

Organizers

Xiangyi Li, BenchFlow
Wenbo Chen, Amazon
Xuandong Zhao, UC Berkeley
Kyoung Whan Choe, RLWRLD
Yimin Liu, Ohio State
Xiaokun Chen, Stanford
Shenghan Zheng, Dartmouth
Yifeng He, UC Davis
Hao Chen, UC Davis
Yushun Dong, Florida State
Yan Liu, USC
Han-chung Lee, Moody's
Yue Zhao, USC
Emilio Ferrara, USC
Dawn Song, UC Berkeley

Sponsors

Interested in supporting Agent Skills '26? We welcome sponsorship at all levels. Please reach out at public@agentskills-workshop.org for details.

Contact

Email public@agentskills-workshop.org for submissions, logistics, or sponsorship.