Testing Strategy: Unit vs Widget vs Integration
Building a testing philosophy from scratch — cost, coverage, and confidence
Real-world analogy
Think of tests like security checks at an airport. Unit tests are the X-ray scanner — fast, catches most issues. Widget tests are the body scanner — slower but catches more. Integration tests are the full manual pat-down — thorough but you can't do it for every passenger. Smart airports use all three at the right gates.
What is it?
A testing strategy defines which types of tests to write, how many of each, and in what order. The pyramid model (many unit, some widget, few integration) optimises for fast feedback and low maintenance cost while still catching the bugs that matter.
Real-world relevance
On a fintech claims app handling BankID JWT auth and refund processing, the team wrote unit tests for every use case, widget tests for the claim submission form, and one integration test for the full login-to-submit flow. This caught a JWT expiry edge case in CI before it reached the QA environment.
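An expiry edge case like the one mentioned above is usually cheap to pin down with a unit test once the current time is passed in as a parameter instead of being read from the system clock. The sketch below is only an illustration: `isTokenExpired`, its `leeway` parameter, and the 30-second default are hypothetical names and values, not taken from the app described here.

```dart
import 'package:test/test.dart';

// Hypothetical helper (an assumption for illustration): a token counts as
// expired once `now` reaches the expiry instant minus a small leeway, so the
// app refreshes slightly early instead of sending a token that is about to die.
bool isTokenExpired({
  required DateTime expiresAt,
  required DateTime now,
  Duration leeway = const Duration(seconds: 30),
}) {
  return !now.isBefore(expiresAt.subtract(leeway));
}

void main() {
  // Fixed instants keep the tests deterministic; no real clock is involved.
  final expiresAt = DateTime.utc(2024, 1, 1, 12, 0, 0);

  test('token is valid well before expiry', () {
    final now = DateTime.utc(2024, 1, 1, 11, 0, 0);
    expect(isTokenExpired(expiresAt: expiresAt, now: now), isFalse);
  });

  test('token counts as expired inside the leeway window', () {
    final now = DateTime.utc(2024, 1, 1, 11, 59, 45);
    expect(isTokenExpired(expiresAt: expiresAt, now: now), isTrue);
  });

  test('token is expired exactly at the expiry instant', () {
    expect(isTokenExpired(expiresAt: expiresAt, now: expiresAt), isTrue);
  });
}
```

Passing `now` explicitly is the design choice that makes the edge case testable: the test decides what time it is, so the boundary can be hit exactly.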
Key points
- Why test at all? — Tests are executable documentation that prevent regressions. On a real-time SaaS chat app, a refactoring of the message model broke 6 screens silently — tests would have caught it in seconds, not days.
- The testing pyramid — Unit tests form the wide base (fast, cheap, many), widget tests form the middle (moderate speed/cost), integration tests sit at the top (slow, expensive, few). Invert this pyramid and your CI takes 45 minutes instead of 4.
- Unit tests: pure logic — Test Dart classes, use cases, repositories, and state machines in isolation. No Flutter framework, no device, runs on the host machine in milliseconds. Best ROI for business logic.
- Widget tests: UI behaviour — Test individual widgets and small subtrees with WidgetTester. Runs in a simulated Flutter environment — no real device needed. Catches layout errors, missing providers, and navigation mistakes.
- Integration tests: full flows — Run on a real device or emulator. Test complete user journeys — login → dashboard → submit claim → logout. Slowest, most brittle, but the only way to catch platform-specific bugs.
- What to test FIRST — Start with the code that changes most and costs most when broken: use cases, repositories, BLoC/Cubit state transitions. These give 80% of the value at 20% of the effort.
- Test cost vs confidence — Unit: 1x cost, catches logic bugs. Widget: 3x cost, catches UI wiring bugs. Integration: 10x cost, catches flow and platform bugs. Always ask: what is the cheapest test that gives me enough confidence?
- Testing in production apps — On an offline-first survey app, unit tests on the sync engine caught a data-loss bug before release. The test took 2 hours to write and saved a day of production debugging and a potential client complaint.
- Regression safety net — Every bug you fix should get a test. This is the minimum viable testing practice. It costs 15 minutes and prevents the same bug from returning in every future sprint.
- Coverage alone is a vanity metric — 100% line coverage does not mean 100% correctness. A test that executes every line but asserts nothing is worthless. Coverage is useful as a floor rather than a target: below 60% on business logic is a red flag in interviews.
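The widget-test layer described in the key points can be sketched as follows. This is a minimal illustration, assuming a hypothetical `ClaimForm` widget and made-up button and confirmation labels; it is defined inline so the file is self-contained and is not the claim form from any real app.

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

// Hypothetical widget, defined here only so the sketch is self-contained.
class ClaimForm extends StatefulWidget {
  const ClaimForm({super.key});

  @override
  State<ClaimForm> createState() => _ClaimFormState();
}

class _ClaimFormState extends State<ClaimForm> {
  bool _submitted = false;

  @override
  Widget build(BuildContext context) {
    return Column(
      children: [
        ElevatedButton(
          onPressed: () => setState(() => _submitted = true),
          child: const Text('Submit claim'),
        ),
        if (_submitted) const Text('Claim submitted'),
      ],
    );
  }
}

void main() {
  testWidgets('shows confirmation after tapping submit', (tester) async {
    // MaterialApp and Scaffold supply the Directionality, theme, and
    // Material ancestor that the button and text widgets expect.
    await tester.pumpWidget(
      const MaterialApp(home: Scaffold(body: ClaimForm())),
    );

    // Before the tap, the confirmation text must not be in the tree.
    expect(find.text('Claim submitted'), findsNothing);

    await tester.tap(find.text('Submit claim'));
    await tester.pump(); // rebuild once after setState

    expect(find.text('Claim submitted'), findsOneWidget);
  });
}
```

The test runs with `flutter test` on the host machine, without a device or emulator, which is why this layer costs less than an integration test while still exercising real widget code.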
Code example
// pubspec.yaml dev_dependencies:
//   test: ^1.24.0
//   flutter_test:
//     sdk: flutter
//   bloc_test: ^9.1.0
//   mocktail: ^1.0.0
// Unit test: pure Dart, no Flutter
import 'package:test/test.dart';
import 'package:my_app/domain/use_cases/calculate_refund.dart';

void main() {
  group('CalculateRefundUseCase', () {
    late CalculateRefundUseCase useCase;

    setUp(() {
      useCase = CalculateRefundUseCase();
    });

    test('returns full refund when claim is within 24h', () {
      final result = useCase.execute(
        claimAmount: 100.0,
        hoursSincePurchase: 12,
      );
      expect(result, equals(100.0));
    });

    test('returns 50% refund when claim is 24–72h', () {
      final result = useCase.execute(
        claimAmount: 100.0,
        hoursSincePurchase: 48,
      );
      expect(result, equals(50.0));
    });

    test('returns zero refund after 72h', () {
      final result = useCase.execute(
        claimAmount: 100.0,
        hoursSincePurchase: 96,
      );
      expect(result, equals(0.0));
    });
  });
}
Line-by-line walkthrough
- 1. group('CalculateRefundUseCase') — organises related tests under a named suite, shown in test output
- 2. setUp(() { useCase = CalculateRefundUseCase(); }) — creates a fresh instance before each test, preventing state leakage between tests
- 3. test('returns full refund when claim is within 24h') — names the exact scenario; a good test name IS the documentation
- 4. useCase.execute(claimAmount: 100.0, hoursSincePurchase: 12) — calls the real production code with controlled inputs
- 5. expect(result, equals(100.0)) — asserts the exact expected output; if this fails, CI blocks the merge
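- 6. Three separate test cases cover the three branches of the refund logic, with one representative input per branch. Strictly speaking this is equivalence partitioning; boundary value testing, a key interview concept, would also probe the edges at 24h and 72h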
- 7. No Flutter imports needed — this runs on the Dart VM in milliseconds, with no simulator overhead
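The three tests above each pick a value from the middle of a branch. A sketch of tests at the edges of each branch could look like the following. The class body is a hypothetical stand-in for the production code, and treating exactly 24h and exactly 72h as inclusive is an assumption, since the lesson does not say on which side the boundaries fall.

```dart
import 'package:test/test.dart';

// Hypothetical stand-in for the production class. The inclusive boundaries
// (<= 24h full refund, <= 72h half refund) are assumptions for illustration.
class CalculateRefundUseCase {
  double execute({
    required double claimAmount,
    required int hoursSincePurchase,
  }) {
    if (hoursSincePurchase <= 24) return claimAmount;
    if (hoursSincePurchase <= 72) return claimAmount * 0.5;
    return 0.0;
  }
}

void main() {
  final useCase = CalculateRefundUseCase();

  // hoursSincePurchase -> expected refund on a 100.0 claim.
  // Each pair of neighbouring hours straddles one boundary.
  const cases = <int, double>{
    24: 100.0,
    25: 50.0,
    72: 50.0,
    73: 0.0,
  };

  cases.forEach((hours, expected) {
    test('refund at ${hours}h is $expected', () {
      final result = useCase.execute(
        claimAmount: 100.0,
        hoursSincePurchase: hours,
      );
      expect(result, equals(expected));
    });
  });
}
```

Driving the tests from a table keeps each boundary pair visible at a glance, and an off-by-one mistake such as `<` instead of `<=` fails exactly one named case.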
Spot the bug
import 'package:test/test.dart';
import 'package:my_app/domain/use_cases/calculate_refund.dart';

void main() {
  group('RefundTests', () {
    test('full refund test', () {
      final uc = CalculateRefundUseCase();
      final result = uc.execute(claimAmount: 200.0, hoursSincePurchase: 6);
      expect(result, equals(100.0));
    });
  });
}
Need a hint?
The test will fail — but not because the production code is wrong. Look at the assertion value.
Show answer
Bug: The test asserts equals(100.0) but the input claimAmount is 200.0 and hoursSincePurchase is 6 (within 24h), so the correct full refund should be 200.0. The assertion is wrong — this is a false-failing test, which is as dangerous as no test because it erodes trust in the test suite. Fix: expect(result, equals(200.0)).
Explain like I'm 5
Tests are like a checklist before a flight. Unit tests check each individual part — engine, wheels, wings — one at a time, very fast. Widget tests check that the dashboard lights up correctly. Integration tests do a full taxi run. You do lots of part checks, some dashboard checks, and only a few full taxi runs — otherwise you would never take off.
Fun fact
A figure often attributed to Google's internal research is that code with tests is 50% less likely to cause production incidents; treat it as a rough indication rather than a measured guarantee. The Flutter framework itself is reported to have over 30,000 tests.
Hands-on challenge
Audit a feature you have built: identify one piece of business logic, one UI component, and one user flow. Write down which test type applies to each and justify the cost vs confidence tradeoff.
More resources
- Flutter Testing Overview (Flutter Docs)
- test package — Dart (pub.dev)
- flutter_test library (Flutter API)
- The Practical Test Pyramid (Martin Fowler)