AI Research Scientist

Adversarial ML, detection, and safety

$250,000 + breakthrough bonuses

Ideal for: PhD researchers, postdocs, graduate specialists

AI adversarial research and security engineering skunkworks

If you want permissionless research, radical creativity, and the freedom to break things the big labs would never let you touch, you'll feel at home here. If you'd rather discover new failure modes than sit in meetings discussing old ones, you'll fit right in.

OpenAI has guardrails. Anthropic has constraints. Google has committees.

We are the lab built for:

  • investigating adversarial vulnerabilities in LLMs
  • exploring agentic failure modes without restrictions
  • inventing new classes of jailbreaks, exploits, and failure patterns
  • designing the next security layer for autonomous AI
  • rewarding people who discover what others cannot

If you want a safe, structured, corporate research environment, this is not it. If you want to uncover failure modes no one has documented yet, this is the place.

Culture: why choose us over Anthropic, OpenAI, or Google

Most labs optimize for safety optics and controlled research. We optimize for discovery, creativity, and speed.

  • growth based on contribution, not tenure
  • lead meaningful projects from your first month
  • no hierarchy, no committees, no approval chains
  • full ownership of your research direction and methodology
  • freedom to explore unsafe, emerging, or unconventional failure modes
  • access to tools, models, and environments that would be restricted anywhere else
  • flexible work - output matters, not hours; work whenever you do your best thinking

How we work

  • weekly micro-hackathons for new attack ideas
  • rapid experimentation - if it works, it ships
  • zero bureaucracy, no waiting for permission
  • build your own exploit frameworks, fuzzers, and tooling
  • use any language, workflow, or stack you prefer

Our philosophy

  • curiosity-driven research is encouraged
  • unconventional and weird ideas welcome
  • boundary-pushing is the expectation, not the exception

This is a place for people who want to push the edge of what AI systems can do, and what they can break.

And most importantly: The hacker bonus program

Breakthroughs are rewarded here. If you uncover:

  • a new jailbreak class
  • an exploit that bypasses all known defenses
  • a failure mode affecting memory, reasoning, or agent behavior
  • an adversarial pattern that generalizes across models

You receive a direct cash bonus. We reward ingenuity, curiosity, and technical creativity. This model does not exist at other labs.

Why join Agent Hacker now

Because the people who secure autonomous agents will:

  • write the rules everyone else follows
  • lead the category before it even has a name
  • define the research others cite for decades
  • invent the defenses the world depends on
  • build the next wave of AI security companies

If you want the chance to build something the world will rely on, this is it. Early builders don't follow the field - they define it.

You will work on:

  • Semantic jailbreak detection
  • Agentic drift and behavior deviation
  • Multi-step adversarial reasoning
  • Misaligned or unsafe tool usage
  • Chain-of-thought extraction attempts
  • Embedding-space manipulation and hidden attack surfaces

What you'll do:

  • Publish and lead foundational research in adversarial AI and safety
  • Define the defensive architecture of a new AI security category
  • Pursue the research questions people leave MAANG and the big labs to work on

Ready to apply?

Send your resume, your publications, and a note on the research questions in adversarial AI that drive you.

jobs@agenthacker.ai