AI Research Scientist

Adversarial ML, detection, and safety

$250,000 + breakthrough bonuses

Ideal for: PhD researchers, postdocs, graduate specialists

AI adversarial research and security engineering skunkworks

If you want permissionless research, radical creativity, and the freedom to break things the big labs would never let you touch, you'll feel at home here. If you'd rather discover new failure modes than sit in meetings discussing old ones, you'll fit right in.

OpenAI has guardrails. Anthropic has constraints. Google has committees.

We are the lab built for:

  • investigating adversarial vulnerabilities in LLMs
  • exploring agentic failure modes without restrictions
  • inventing new classes of jailbreaks, exploits, and failure patterns
  • designing the next security layer for autonomous AI
  • rewarding people who discover what others cannot

If you want a safe, structured, corporate research environment, this is not it. If you want to uncover failure modes no one has documented yet, this is the place.

Culture: why choose us over Anthropic, OpenAI, or Google

Most labs optimize for safety optics and controlled research. We optimize for discovery, creativity, and speed.

  • growth based on contribution, not tenure
  • lead meaningful projects from your first month
  • no hierarchy, no committees, no approval chains
  • full ownership of your research direction and methodology
  • freedom to explore unsafe, emerging, or unconventional failure modes
  • access to tools, models, and environments that would be restricted anywhere else
  • flexible work - output matters, not hours; work whenever you do your best thinking

How we work

  • weekly micro-hackathons for new attack ideas
  • rapid experimentation - if it works, it ships
  • zero bureaucracy, no waiting for permission
  • build your own exploit frameworks, fuzzers, and tooling
  • use any language, workflow, or stack you prefer

Our philosophy

  • curiosity-driven research is encouraged
  • unconventional and weird ideas welcome
  • boundary-pushing is the expectation, not the exception

This is a place for people who want to push the edge of what AI systems can do, and what they can break.

And most importantly: The hacker bonus program

Breakthroughs are rewarded here. If you uncover:

  • a new jailbreak class
  • an exploit that bypasses all known defenses
  • a failure mode affecting memory, reasoning, or agent behavior
  • an adversarial pattern that generalizes across models

You receive a direct cash bonus. We reward ingenuity, curiosity, and technical creativity. This model does not exist at other labs.

Why join Agent Hacker now

Because the people who secure autonomous agents will:

  • write the rules everyone else follows
  • lead the category before it even has a name
  • define the research others cite for decades
  • invent the defenses the world depends on
  • build the next wave of AI security companies

If you want the chance to build something the world will rely on, this is it. Early builders don't follow the field - they define it.

You will work on:

  • Semantic jailbreak detection
  • Agentic drift and behavior deviation
  • Multi-step adversarial reasoning
  • Misaligned or unsafe tool usage
  • Chain-of-thought extraction attempts
  • Embedding-space manipulation and hidden attack surfaces

What you'll do:

  • Publish and lead foundational research in adversarial AI and safety
  • Define the defensive architecture of a new AI security category
  • Pursue the research questions people leave MAANG and the big labs to work on

Ready to apply?

Send your resume, your publications, and a note on the research questions in adversarial AI that drive you.

jobs@agenthacker.ai