Harness Engineer (AI Agent Systems)

Truewind

Date listed

1 week ago

Employment Type

Full time

Keywords: ai github agents llm

What this role is

We build AI agents that do real work.

Not assistants. Not demos.
Agents that execute workflows end-to-end and produce correct outcomes.

Your job is to:

  • build those agents
  • build the systems that make them reliable

This is not prompt engineering.
This is making AI work in production.


What you’ll do

  • Build agents that execute multi-step workflows
  • Design systems for validation, retry, and failure handling
  • Define constraints (schemas, invariants, contracts)
  • Add feedback loops (detect → debug → improve)
  • Turn failures into reusable systems
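The validation, retry, and fallback pattern in the list above can be sketched in a few lines. This is a minimal illustration, not our stack: `run_with_validation`, `StepResult`, and the flaky step below are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class StepResult:
    ok: bool
    value: Any = None
    error: str = ""

def run_with_validation(
    step: Callable[[], Any],
    validate: Callable[[Any], bool],
    max_retries: int = 3,
    fallback: Optional[Callable[[], Any]] = None,
) -> StepResult:
    """Run one workflow step, validate its output, retry on failure,
    and fall back if every attempt fails."""
    failures: list[str] = []
    for attempt in range(max_retries):
        try:
            value = step()
        except Exception as exc:
            failures.append(f"attempt {attempt}: raised {exc!r}")
            continue
        if validate(value):
            return StepResult(ok=True, value=value)
        failures.append(f"attempt {attempt}: invalid output {value!r}")
    if fallback is not None:
        return StepResult(ok=True, value=fallback())
    # Failure log feeds the detect → debug → improve loop.
    return StepResult(ok=False, error="; ".join(failures))

# A step that produces garbage on its first call, then recovers.
calls = {"n": 0}
def flaky_step() -> str:
    calls["n"] += 1
    return "ok" if calls["n"] >= 2 else "garbage"

result = run_with_validation(flaky_step, validate=lambda v: v == "ok")
# result.ok is True; the retry absorbed the bad first attempt
```

The point is that correctness lives in the validator and the retry policy, not in the step itself.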

What this role is NOT

  • Not prompt engineering
  • Not one-shot demos
  • Not feature-heavy product work

You are building agents that do the work, and the systems that ensure they do it correctly.

Note: This is different from “vibe coding.” You won’t just prompt and accept outputs. You’ll build systems so results are reliable and repeatable.


What we’re looking for

  • Strong systems thinking
  • Background in:
    • infrastructure, backend, or data systems
    • developer tools or internal platforms
  • Experience building reliable systems (not just features)
  • Comfortable debugging complex, ambiguous problems

Important:
LLM experience alone is not enough.
We care about how you make systems reliable.


Good fit if you:

  • Think in constraints, invariants, and feedback loops
  • Care about correctness, not just output quality
  • Have automated real workflows end-to-end
  • Prefer building systems over features

Not a fit if you:

  • Mostly prompt models and accept outputs
  • Have only built demos or prototypes
  • Avoid debugging or failure handling

Application (required)

1. Project (GitHub)
An agent system that:

  • performs a multi-step task
  • includes validation
  • handles failures (retry, fallback, etc.)
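As a rough picture of the shape such a project might take (the pipeline, step names, and validators below are invented for illustration, not a required design): each step's output is checked against a simple contract before the next step runs, so a bad intermediate result fails loudly instead of propagating.

```python
from typing import Any, Callable

# Each step: (name, transform, validator). The validator is the contract.
Step = tuple[str, Callable[[Any], Any], Callable[[Any], bool]]

def run_pipeline(steps: list[Step], data: Any) -> Any:
    """Run steps in order; every output must pass its validator
    before it is handed to the next step."""
    for name, fn, is_valid in steps:
        data = fn(data)
        if not is_valid(data):
            raise ValueError(f"step {name!r} produced invalid output: {data!r}")
    return data

# Hypothetical three-step workflow: parse → enrich → summarize.
steps: list[Step] = [
    ("parse", lambda raw: {"items": raw.split(",")},
     lambda d: isinstance(d.get("items"), list)),
    ("enrich", lambda d: {**d, "count": len(d["items"])},
     lambda d: d["count"] == len(d["items"])),
    ("summarize", lambda d: f"{d['count']} items",
     lambda s: isinstance(s, str)),
]

summary = run_pipeline(steps, "a,b,c")
# summary == "3 items"
```

A real submission would wrap each step with retries and a fallback; the invariant checks are what make the failure handling meaningful.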

2. Short answer (5–10 sentences)
Describe a system where an AI agent failed.
What caused it, and how would you fix it?


How we measure success

  • Agents complete real workflows with minimal human input
  • Outputs are correct by construction
  • Failures decrease over time
  • New capabilities come from improving the system, not patching outputs

Why this matters

AI models are already powerful.

The bottleneck is making them:

  • reliable
  • structured
  • production-ready

The teams that win will not have better prompts.
They will have systems where agents actually work.


Before you apply

Most engineers won’t enjoy this role.

It requires:

  • thinking in systems instead of code
  • caring about correctness instead of speed
  • debugging behavior instead of writing features

But if this clicks for you,
you’ll be working on the actual frontier of software engineering.

Findwork Copyright © 2023
