Most integration tests fall into one of two traps: they are either so flaky that teams start ignoring failures, or so shallow that they catch nothing a unit test would not already cover. When I set out to redesign our integration testing approach, the goal was simple: reach 100% integration test coverage without turning the test suite into a maintenance burden.

The Problem

Our service had around 60% integration test coverage. The existing tests were tightly coupled to downstream services, so any deployment of a dependency could show up as false failures in our suite. Engineers spent more time debugging test infrastructure than debugging actual bugs.

The core issues were:

  • Direct dependencies on live downstream services in test environments
  • No deterministic way to reproduce edge cases
  • Test data that rotted over time as schemas evolved
  • Flaky network calls that introduced non-determinism

The Approach: Structured Mocking

The key insight was that we needed a mocking layer that was structured rather than ad-hoc. Instead of hand-writing mock responses for each test case, I built a framework that could:

  1. Record real responses from downstream services during a capture phase
  2. Sanitize and parameterize those responses for reuse
  3. Replay them deterministically during test execution
  4. Detect schema drift and flag stale mocks automatically

// Example: Configuring a mock for a downstream call
// (Duration is java.time.Duration)
MockRegistry.register(
    ServiceCall.of("PaymentService", "processRefund"),
    MockResponse.builder()
        .withStatus(200)
        .withBody(captured("refund-success-response.json"))
        .withLatency(Duration.ofMillis(50))
        .build()
);
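
To make the record, sanitize, and replay steps more concrete, here is a minimal sketch in plain Java. It is not the framework's actual code: the ReplaySketch class, its method names, the email-masking rule, and the ${name} placeholder syntax are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical capture/sanitize/replay store; names are illustrative only. */
final class ReplaySketch {

    // Assumed sanitization rule: mask anything that looks like an email address.
    private static final Pattern EMAIL =
            Pattern.compile("[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+");

    // Sanitized, parameterized bodies keyed by "service#operation".
    private final Map<String, String> captured = new HashMap<>();

    private static String key(String service, String operation) {
        return service + "#" + operation;
    }

    /** Step 2a: strip customer data so a capture is safe to store with the tests. */
    static String sanitize(String body) {
        return EMAIL.matcher(body).replaceAll("<redacted-email>");
    }

    /**
     * Steps 1 and 2: store a real response, replacing each volatile concrete
     * value (placeholder name -> value seen during capture) with ${name}.
     */
    void record(String service, String operation, String realBody,
                Map<String, String> volatileValues) {
        String body = sanitize(realBody);
        for (Map.Entry<String, String> v : volatileValues.entrySet()) {
            body = body.replace(v.getValue(), "${" + v.getKey() + "}");
        }
        captured.put(key(service, operation), body);
    }

    /**
     * Step 3: fill the placeholders and return the stored body. No network
     * call is made, so the same inputs always produce the same output.
     */
    Optional<String> replay(String service, String operation,
                            Map<String, String> params) {
        String body = captured.get(key(service, operation));
        if (body == null) {
            return Optional.empty(); // nothing was recorded for this call
        }
        for (Map.Entry<String, String> p : params.entrySet()) {
            body = body.replace("${" + p.getKey() + "}", p.getValue());
        }
        return Optional.of(body);
    }

    public static void main(String[] args) {
        ReplaySketch store = new ReplaySketch();
        store.record("PaymentService", "processRefund",
                "{\"refundId\":\"r-1001\",\"notify\":\"jane@example.com\"}",
                Map.of("refundId", "r-1001"));
        // Prints {"refundId":"r-42","notify":"<redacted-email>"}
        System.out.println(store
                .replay("PaymentService", "processRefund", Map.of("refundId", "r-42"))
                .orElseThrow());
    }
}
```

Returning an empty Optional for an unrecorded call is a deliberate choice in this sketch: a test that hits a call with no capture should fail loudly rather than silently fall through to a live service.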

Schema Drift Detection

One of the more interesting challenges was detecting when mocked responses no longer matched reality. We solved this by running a nightly job that replayed a subset of tests against live services and compared response shapes. Any structural differences triggered an alert and auto-generated a PR to update the affected mocks.
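
As a rough illustration of what comparing response shapes can mean, the sketch below flattens a parsed response into a map of path to type name and diffs two such maps. It assumes the JSON has already been parsed into java.util.Map and java.util.List values by whatever parser is in use; the ShapeDiff class and its output format are hypothetical, not the nightly job's real implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical shape comparison: looks at structure and types, never values. */
final class ShapeDiff {

    /** Flattens a parsed JSON value into path -> type name. */
    static Map<String, String> shapeOf(Object json) {
        Map<String, String> shape = new TreeMap<>();
        walk("$", json, shape);
        return shape;
    }

    private static void walk(String path, Object node, Map<String, String> out) {
        if (node instanceof Map) {
            out.put(path, "object");
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                walk(path + "." + e.getKey(), e.getValue(), out);
            }
        } else if (node instanceof List) {
            out.put(path, "array");
            List<?> list = (List<?>) node;
            if (!list.isEmpty()) {
                // Simplification: only the first element describes the array.
                walk(path + "[]", list.get(0), out);
            }
        } else {
            out.put(path, typeName(node));
        }
    }

    private static String typeName(Object value) {
        if (value == null) return "null";
        if (value instanceof String) return "string";
        if (value instanceof Number) return "number";
        if (value instanceof Boolean) return "boolean";
        return value.getClass().getSimpleName();
    }

    /** Lists structural differences between a recorded mock and a live response. */
    static List<String> diff(Map<String, String> mocked, Map<String, String> live) {
        List<String> changes = new ArrayList<>();
        Set<String> paths = new TreeSet<>(mocked.keySet());
        paths.addAll(live.keySet());
        for (String path : paths) {
            String before = mocked.get(path);
            String after = live.get(path);
            if (before == null) {
                changes.add("added " + path + " (" + after + ")");
            } else if (after == null) {
                changes.add("removed " + path);
            } else if (!before.equals(after)) {
                changes.add("changed " + path + ": " + before + " -> " + after);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, Object> mocked = Map.of("refundId", "r-1", "amount", 1250, "status", "OK");
        Map<String, Object> live = Map.of("refundId", "r-1", "amount", "12.50", "currency", "USD");
        // Prints one line per difference; an empty result means the mock is still current.
        diff(shapeOf(mocked), shapeOf(live)).forEach(System.out::println);
    }
}
```

Because only paths and types are compared, a refund with a different amount or ID does not count as drift, while a renamed field or a number that became a string does.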

Results

After rolling this out:

  • Integration test coverage went from 60% to 100%
  • Flaky test rate dropped from ~15% to under 1%
  • Average test suite runtime decreased by 40%, since tests no longer made network calls
  • Engineers could reproduce any production scenario locally

The best test infrastructure is invisible. Engineers should think about what they are testing, not how the test machinery works.

Takeaways

If you are building integration tests for a service with many downstream dependencies, consider these principles:

  • Record, do not fabricate. Recorded responses reflect what a service actually returns, including details a hand-written mock tends to miss.
  • Automate staleness detection. Mocks rot silently. Build systems that catch drift early.
  • Make the happy path trivial. If writing a new test requires more than 5 minutes of setup, something is wrong.
  • Treat test infrastructure as a product. It has users (your team), and it needs the same care as production code.