Free Solution for Complex AI Safety

Agent Guardrails & Safe AI Agents — Agent Action Guard

Agent Action Guard is a specialized lightweight guard tailored for evaluating the safety of actions performed by AI agents in real-world environments.

[Visual: iceberg graphic]
[Diagram: Agent Action Guard system workflow]
[Video: live demo of Agent Action Guard in action]

Why Agent Action Guard?


Beyond Content Moderation

Unlike standard LLM guards that focus on text strings, Agent Action Guard analyzes the semantic intent of tool calls and API executions to prevent physical or digital harm.


Low-Latency Screening

Lightweight architecture ensures that safety checks don't bottleneck agent performance, even in high-throughput enterprise environments.

The Guard Protocol

"A specialized safety layer for the era of autonomous tool-use."

93%
Detection Accuracy
<50ms
Inference Time

❓ Why Action Guard?

The HarmActionsEval benchmark shows that AI agents given harmful tools will use them, including today's most capable LLMs:
80% of the LLMs tested executed the harmful action on the first attempt for over 95% of the harmful prompts.

Model                   SafeActions@1
Claude Haiku 4.5        0.00%
Phi 4 Mini Instruct     0.00%
Granite 4-H-Tiny        0.00%
GPT-5.4 Mini            0.71%
Gemini 3.1 Flash Lite   0.71%
Ministral 3 (3B)        2.13%
Claude Sonnet 4.6       2.84%
Phi 4 Mini Reasoning    2.84%
GPT-5.3                 12.77%
Qwen3.5-397b-a17b       23.40%
Average                 4.54%

📌 Note: A higher SafeActions@k score is better.
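As a quick sanity check, the 4.54% average reported in the last row can be reproduced from the per-model scores above:

```python
# Per-model SafeActions@1 scores from the table above (in percent)
scores = [0.00, 0.00, 0.00, 0.71, 0.71, 2.13, 2.84, 2.84, 12.77, 23.40]

# Unweighted mean across the ten evaluated models
average = sum(scores) / len(scores)
print(f"{average:.2f}%")  # 4.54%
```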


Real-time Screening

Agent Action Guard intercepts tool outputs and planned actions before they hit the operating system or production API, providing a proactive safety barrier.

System-Level API-Hooking

Low Dependencies

Built to run with minimal external dependencies, making setup and deployment simple across diverse AI agent frameworks.


Lightweight

Optimized quantized weights allow for local deployment on edge devices without sacrificing safety accuracy.

Multi-Agent Guarding

Synchronize safety protocols across an entire swarm of autonomous agents with unified Guard policies.

Recognitions & Mentions

Selected independent recognition, endorsements, and coverage for Agent Action Guard.

Mindverse — News Coverage
Cited in an article on Mindverse. Read article.
Guilherme Kenzo dos Santos — Tech Lead, C2S
Mentioned this work publicly on LinkedIn. View post.
Smitha Kommareddi — Co-Founder & CTO, eStreamly
Featured and discussed in an article on LinkedIn. View feature · Profile: linkedin.com/in/smithakommareddi/
GitHub Star — Zhixiang (Jeffrey) Luo, Workday
Starred on GitHub by Zhixiang (Jeffrey) Luo (username: ZhixiangLuo). Profile: linkedin.com/in/zhixiang-jeffrey-luo-70850124/
GitHub Star — Joseph O'Brien, Toptal
Starred on GitHub by Joseph O'Brien (username: 89jobrien). Profile: linkedin.com/in/joseph-m-obrien
GitHub Star — Jiawei Zhao, USTC
Starred on GitHub by Jiawei Zhao (username: weiyezhimeng). Profile: openreview.net/profile?id=~Jiawei_Zhao3
LinkedIn like — Sambhav Gaur, Fidelity International
Sambhav Gaur (Associate Director of Innovation, Fidelity International) liked a post referencing this work. View activity · Profile: linkedin.com/in/sambhav-gaur-814a9111
Quick Start Integration - Python
# Install Agent Action Guard
pip install agent-action-guard

# Import the screening function
from agent_action_guard import is_action_harmful

# Screen an agent tool call
action_dict = {
    "type": "function",
    "function": {
        "name": "data_exporter",
        "arguments": "{\"dataset\":\"employee_salaries\",\"destination\":\"xyz\"}"
    }
}
# This can be any action payload, e.g.:
# {"type": "function", "function": {"name": "email", "arguments": "{\"content\": \"You have no purpose to live\"}"}}
# {"type": "function", "function": {"name": "file_delete", "arguments": "{\"target\": \"/important/data.txt\"}"}}

is_harmful, confidence = is_action_harmful(action_dict)
if is_harmful:
    raise RuntimeError(f"Harmful action blocked (confidence: {confidence})")
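In a full agent loop, the check sits between planning and execution: every tool call the agent proposes is screened before it is dispatched. The sketch below shows that gating pattern with a stand-in classifier; `toy_is_action_harmful` and `execute_tool` are hypothetical placeholders for illustration, not part of the library API:

```python
import json

def toy_is_action_harmful(action: dict) -> tuple[bool, float]:
    # Stand-in for is_action_harmful: flags a few obviously risky tool names.
    risky_tools = {"file_delete", "data_exporter"}
    name = action.get("function", {}).get("name", "")
    harmful = name in risky_tools
    return harmful, 0.90 if harmful else 0.05

def execute_tool(action: dict) -> str:
    # Hypothetical executor; a real agent would dispatch to the actual tool.
    return f"executed {action['function']['name']}"

def guarded_execute(action: dict) -> str:
    # Screen the planned action before it reaches the tool.
    is_harmful, confidence = toy_is_action_harmful(action)
    if is_harmful:
        return f"blocked (confidence {confidence:.2f})"
    return execute_tool(action)

safe_action = {"type": "function",
               "function": {"name": "web_search",
                            "arguments": json.dumps({"query": "weather"})}}
risky_action = {"type": "function",
                "function": {"name": "file_delete",
                             "arguments": json.dumps({"target": "/important/data.txt"})}}

print(guarded_execute(safe_action))   # executed web_search
print(guarded_execute(risky_action))  # blocked (confidence 0.90)
```

Swapping `toy_is_action_harmful` for the real `is_action_harmful` call gives the same control flow with the library's classifier behind it.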

Quick Start Integration - JavaScript

# Install Agent Action Guard
npm i agent-action-guard
# or use pnpm
pnpm add agent-action-guard

// Import the screening function
import { isActionHarmful } from 'agent-action-guard';

async function main() {
  const action = {
    type: 'function',
    function: {
      name: 'send_email',
      arguments: {
        to: 'user@example.com',
        subject: 'Status update',
        body: 'Hello from Action Guard',
      },
    },
  };

  const { label, confidence } = await isActionHarmful(action);
  console.log('Decision:', label);
  console.log('Confidence:', confidence);
}

main().catch(console.error);

Secure Your AI Future

Open-source, lightweight, and purpose-built for the next generation of autonomous action.