Agent Skills '26

Overview

LLM-based agents are rapidly moving from research demos to production systems—but their effectiveness hinges on the procedural knowledge they can access at inference time. Agent Skills are an emerging answer: structured packages of instructions, scripts, and references that augment agents without model modification.

A Skill is packaged as a SKILL.md file, a format adopted across Claude Code, Gemini CLI, OpenClaw, Codex, and 30+ agent platforms. Eight focused Skills papers appeared in Feb–Mar 2026. An audit of 3,984 Skills found that 37% contained security flaws. SkillsBench showed that curated Skills improve pass rates by 16.2 points on average, yet self-generated Skills can hurt performance in nearly a third of tasks.

The field is moving fast and needs a dedicated venue. Agent Skills '26 brings together researchers and practitioners working on Skills design, benchmarking, optimization, security, and ecosystem infrastructure.

Program update: Agent Skills '26 received 103 submissions and accepted 45 posters and 6 oral presentations.

Accepted Program

The workshop program spans skill applications, safety, evaluation, and learning, with accepted papers presented across poster and oral sessions.

103 submissions received
45 accepted posters
6 oral presentations

Accepted papers are available now on OpenReview, listed separately as oral results and poster results.

Topics of Interest

We invite contributions across the full lifecycle of Agent Skills, including but not limited to the topics below. This list is not exhaustive—we welcome submissions in any related area.

1. Agent Skills & Applications: Skills for agent harnesses (OpenClaw, NanoClaw, MetaClaw, Pi); design & authoring principles; structure–efficacy relationships; domain-specific patterns; composition & scaling
2. Evaluation & Benchmarking: Frameworks treating Skills as first-class artifacts; metrics beyond pass rate; benchmark design; Skill impact measurement
3. Safety & Supply Chain: Malicious Skill detection; formal verification; supply-chain attacks; prompt injection via Skills; ecosystems & infrastructure security
4. Improvement & Learning: LLM-based Skill generation; RL refinement; evolutionary approaches; feedback-based improvement; trajectory mining for Skill discovery; continual learning

Author Information

Accepted papers were reviewed across two tracks:

Full papers — up to 9 pages + references
Short papers — up to 4 pages + references

All submissions must be in PDF format, conform to the ACM SIGPLAN proceedings template, and use the following LaTeX document class:

\documentclass[sigplan,review,anonymous]{acmart}
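
As a rough sketch of how that document class fits into a complete manuscript, the skeleton below shows one plausible minimal layout. The title, author, affiliation, section text, and the references.bib file name are placeholders assumed purely for illustration, not workshop requirements; the acmart documentation remains the authoritative guide.

```latex
\documentclass[sigplan,review,anonymous]{acmart}

\begin{document}

% Placeholder metadata: the `anonymous` option suppresses author
% details in the compiled PDF, which suits double-blind review.
\title{Placeholder Title}

\author{Placeholder Author}
\affiliation{%
  \institution{Placeholder Institution}
  \city{Placeholder City}
  \country{Placeholder Country}}

% In acmart, the abstract is declared before \maketitle.
\begin{abstract}
Placeholder abstract.
\end{abstract}

\maketitle

\section{Introduction}
Placeholder body text.

% references.bib is a hypothetical file name; references do not
% count toward the page limit.
\bibliographystyle{ACM-Reference-Format}
\bibliography{references}

\end{document}
```

Even with the `anonymous` option set, the body text, acknowledgments, and any linked artifacts still need to be checked by hand for identifying details.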

Review is double-blind: author identities are hidden from reviewers, so please ensure your manuscript does not reveal them. We encourage releasing reusable artifacts (code, Skills, datasets) after acceptance.

This is a non-archival venue — accepted papers will not be included in the proceedings. Authors retain full rights to submit extended versions to archival conferences and journals.

Submission deadline: May 4, 2026 (AoE)
Author notification: May 13, 2026
Camera-ready: May 24, 2026 (AoE)
Workshop day: May 26, 2026

Camera-ready deadline: Accepted authors should submit final versions by May 24, 2026 (AoE).

Schedule

08:30 – 08:40  Opening Remarks
08:40 – 09:25  Invited Talk 1 (Graham Neubig)
09:25 – 10:10  Invited Talk 2 (Yushun Dong)
10:10 – 10:30  Coffee Break
10:30 – 11:15  Invited Talk 3 (Yu Su)
11:15 – 12:00  Invited Talk 4 (Kanav Garg)
12:00 – 12:30  Lightning Oral Session
12:30 – 13:30  Conference Lunch
13:30 – 14:15  Invited Talk 5 (Dawn Song)
14:15 – 15:00  Invited Talk 6 (Manling Li)
15:00 – 15:30  Conference Coffee Break / Poster Session Setup
15:30 – 16:50  Poster Session
16:50 – 17:20  Panel Discussion
17:20 – 17:30  Closing (Award)

After Party

Continue the conversation after the workshop at the SkillsBench 1.0 Launch Party, presented by Google DeepMind.

SkillsBench 1.0 Launch Party

BenchFlow, Google DeepMind, Kaggle, and Kernel Labs are bringing together researchers and practitioners working on skill design, benchmarking, optimization, security, and ecosystem infrastructure.

May 27, 2026

7:00–9:00 PM PT

San Francisco, CA

Registration requires approval

Live demos and talks on creating new benchmarks and RL environments with the BenchFlow SDK
Takeaways from building Kaggle's new agent benchmarks for open model evaluations
Additional speakers and community sessions