Research | Remote | Full-time

AI Safety Researcher

Build the ethics and safety infrastructure that makes Primordia trustworthy enough to deploy in high-stakes operational environments.

About the role

Astraea is our constitutional ethics gate — the subsystem that evaluates every output against a codified set of principles before it reaches the operator. This isn't a content filter or a fine-tuned refusal mechanism. It's a governance architecture designed to hold up under adversarial pressure in real operational environments.

We treat AI safety as a core engineering constraint, not a PR exercise. If Primordia is going to be deployed in logistics, defense, and critical infrastructure, it has to be provably safe to the standards those environments require. That's what you'll build.

What you'll do

Advance the design and implementation of Astraea's constitutional ethics architecture
Research formal methods for specifying and verifying AI behavioral constraints
Develop adversarial testing frameworks to probe system behavior under edge cases
Collaborate with the Aletheia team on the relationship between truth verification and ethical constraints
Maintain and evolve the constitutional principles framework as deployment contexts expand
Engage with the broader AI safety community and represent Primordia's approach publicly

What we need

Strong research background in AI safety, alignment, or AI governance
Familiarity with interpretability methods and mechanistic understanding of model behavior
Experience with formal specification of behavioral constraints
Ability to combine theoretical research with practical system implementation
Track record of published work or equivalent research contributions

Nice to have

Background in philosophy of ethics or moral philosophy
Experience with red-teaming or adversarial ML
Familiarity with regulatory and compliance frameworks for AI in high-stakes industries
Prior work at an AI safety organization (ARC, Redwood, Anthropic safety team, etc.)

Primordia will only be as trustworthy as the people building its safety layer. If this is the work you want to do, reach out directly.
