Tag: prompt-injection

8 articles available

Why Tool-Calling Agent Security Is a Structural Problem

agi

Source · Jul 8, 2026

Why text-driven tool calls make AI agent delegation a structural security issue, backed by refusal-rate evidence.

Why Agent Configs Need Deterministic Control Planes

llm

Source · Jun 27, 2026

Why reused coding agent config files can become an unmanaged control layer with security and operational risks.

Alignment and Safety Guardrails Shape Model Behavior

hardware

Community · Jun 18, 2026

Uses public metrics to show that alignment and guardrails involve trade-offs among instruction following, harmful output, and hallucination.

DistractionIF Exposes Hidden Instruction Risks In RAG Systems

llm

Source · May 30, 2026

DistractionIF shows how RAG systems misread instruction-like noise in documents and why pipeline design matters.

RAG Security Risks From Combined Injection And Poisoning

hardware

Source · Mar 27, 2026

Examines security risks in RAG when prompt injection and database poisoning combine across retrieval and indexing.

Execution Provenance Defines Real Agent Security Boundaries

llm

Source · Mar 26, 2026

Agent security depends less on benchmark scores than on tracing execution provenance across generation, handoffs, and permissions.

Designing Agent Defenses Against Prompt Injection Attacks

hardware

Guide · Feb 14, 2026

How prompt injection rides untrusted content into tool calls, and how to mitigate it with least privilege, sandboxing, fixed schemas, and output validation.

Addressing Steganography Threats and Security Risks in Language Models

llm

Community · Feb 2, 2026

Analyzes AI steganography threats, in which hidden data manipulates models, and explores defense strategies such as RepreGuard.