Most test automation frameworks die within a year.
Not because they don't work. Because nobody wants to use them.
They're too brittle. Too slow. Too cryptic. Or they require the person who built them to be in the room any time something breaks.
Here's what I've learned about building automation that actually survives contact with a real team.
Start With the Failure Mode
Before you write a single line of test code, ask: what's the most common reason automation projects fail at this company?
Brittle selectors? Flaky network calls? Tests that take 40 minutes to run so nobody runs them? A single framework author who left?
Your architecture should be designed around your specific failure mode, not around what looks good in a conference talk.
Make It Easy to Write Tests, Not Easy to Write Your Tests
The most common mistake I see is framework authors who write a beautiful abstraction layer that only they can work with intuitively.
Good frameworks optimize for the person who's never used it. The test should read like documentation, not implementation. A developer writing their first automated test in your framework shouldn't need to pair with you.
// Opaque — what is step() doing internally?
await step("login flow");
// Clear — anyone can read this
await page.goto("/login");
await page.fill("[data-testid=email]", user.email);
await page.fill("[data-testid=password]", user.password);
await page.click("[data-testid=submit]");
await expect(page).toHaveURL("/dashboard");
The second version is more verbose. It's also more maintainable, more debuggable, and understandable to a developer looking at a CI failure at 10pm.
Fast Feedback Is Non-Negotiable
If your test suite takes more than 15 minutes to run, developers will stop running it before pushing. And if they stop running it locally, your CI will be the only thing catching bugs — and only after the commit.
Parallelize early. Use tags to run subsets. Invest in fast selectors. Delete tests that test implementation details instead of behavior.
A 5-minute suite that runs 200 meaningful tests is worth more than a 45-minute suite that runs 800 redundant ones.
Document the "Why" Next to the Code
When a test fails 6 months after you wrote it, you need to know why it exists, not just what it's testing.
// IMPORTANT: This test validates the race condition found in GH-1234.
// The 500ms delay is intentional — remove it and the test becomes meaningless.
await page.waitForTimeout(500);
That comment has saved hours of debugging time. It'll save more. Write it.
The Real Metric
The metric that matters isn't test count or coverage percentage. It's: did the team catch a production bug with automation before it shipped?
Track that number. Celebrate it. Make it visible.
When the team sees that the framework caught three regressions last sprint, they start treating it like infrastructure instead of a chore.
That's when you know it'll survive.