Why E2E Tests Have a Bad Reputation
End-to-end tests fail for reasons that have nothing to do with bugs in your code: network timeouts, timing issues, test data left over from a previous run, a third-party service being slow, a CI runner being under load. This flakiness erodes trust in the test suite. When tests fail randomly, engineers start ignoring failures. When engineers ignore failures, the tests stop providing value.
The solution is not to avoid E2E tests. It's to write them in ways that avoid the common failure modes.
Pick the Right Tool
Choose your E2E framework based on what you're testing.
Playwright is the current best-in-class for browser automation. It handles modern JavaScript apps well, has excellent auto-waiting built in (eliminating most timing issues), supports multiple browsers, and has a powerful selector engine.
Cypress is excellent for JavaScript-heavy single-page applications. Its architecture (runs in the browser alongside your app) makes it very fast and reliable for UI-heavy workflows.
Laravel Dusk is the right choice if your app is server-rendered (Blade templates) and you want tight integration with Laravel's testing utilities. Dusk wraps ChromeDriver with a Laravel-aware API.
For a Laravel application with some JavaScript interactivity, Dusk is the pragmatic choice. For a Laravel API backed by a React/Vue SPA, Playwright on the frontend is the better fit.
The Page Object Pattern
The most important pattern for maintainable E2E tests is the Page Object. A Page Object encapsulates the interactions with a single page or component, so when the UI changes, you update one class instead of hunting through dozens of tests.
// tests/Browser/Pages/LoginPage.php
&lt;?php

namespace Tests\Browser\Pages;

use Laravel\Dusk\Browser;
use Laravel\Dusk\Page;

class LoginPage extends Page
{
    public function url(): string
    {
        return '/login';
    }

    public function assert(Browser $browser): void
    {
        $browser->assertPathIs($this->url());
    }

    public function elements(): array
    {
        return [
            '@email' => '[data-testid="email-input"]',
            '@password' => '[data-testid="password-input"]',
            '@submit' => '[data-testid="login-button"]',
            '@error' => '[data-testid="login-error"]',
        ];
    }

    // Named signIn rather than loginAs: Dusk's Browser already defines a
    // loginAs() method for direct authentication, and a concrete Browser
    // method always wins over a page method with the same name.
    public function signIn(Browser $browser, string $email, string $password): void
    {
        $browser->type('@email', $email)
            ->type('@password', $password)
            ->click('@submit');
    }
}

Using the page object in a test:

public function test_user_can_log_in(): void
{
    $user = User::factory()->create();

    $this->browse(function (Browser $browser) use ($user) {
        $browser->visit(new LoginPage())
            ->signIn($user->email, 'password')
            ->assertPathIs('/dashboard')
            ->assertSee('Welcome back');
    });
}
If you later rename the login button or restructure the form, you update LoginPage::elements() and the test code stays the same.
Stable Selectors: The Foundation of Non-Brittle Tests
E2E tests break most often because selectors are fragile. CSS classes change for styling reasons. DOM structure gets reorganized. Text content is updated.
Use data-testid attributes exclusively for test selectors. Add them to elements your tests interact with:
&lt;button
  type="submit"
  class="btn btn-primary"
  data-testid="create-project-button"
&gt;
  Create Project
&lt;/button&gt;
These attributes are stable because they exist only for tests. A designer can change the button's classes, color, and surrounding markup without breaking your tests.
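If you centralize selectors outside of Dusk's page-object elements() map (for example in plain PHPUnit helpers), the same idea fits in a tiny function. A sketch; the testId helper is an assumption, not a Dusk API:

```php
<?php

// Hypothetical helper (not part of Dusk): builds the CSS selector for a
// data-testid value, so raw selector strings live in exactly one place.
function testId(string $id): string
{
    return sprintf('[data-testid="%s"]', $id);
}

// e.g. $browser->click(testId('create-project-button'));
$selector = testId('create-project-button');
```

Within Dusk itself, the elements() map shown earlier serves the same purpose.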
Strip data-testid attributes in production builds if you're concerned about exposing test structure. A webpack/Vite plugin can do this automatically.
Handling Asynchrony Without Sleep
The most common cause of flaky E2E tests is sleep() calls used to wait for async operations. Don't do this:
// FRAGILE: Assumes the modal appears within 1 second
$browser->click('@open-modal');
sleep(1);
$browser->assertVisible('@modal-content');
Instead, wait for specific conditions:
// STABLE: waits until the element is visible, up to 5 seconds
$browser->click('@open-modal')
    ->waitFor('@modal-content', 5)
    ->assertVisible('@modal-content');
Laravel Dusk's waitFor, waitUntilMissing, waitForText, and waitUntil methods poll until a condition is true, making your tests robust to varying load times.
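Under the hood, these helpers all follow the same strategy: re-check a condition at short intervals until it holds or a deadline passes. A plain-PHP sketch of the idea (not Dusk's actual implementation, which delegates to WebDriver waits):

```php
<?php

// Polls a condition at fixed intervals until it returns true or the
// timeout elapses. Returns immediately on success, so a fast page costs
// almost nothing and a slow page gets the full grace period.
function waitUntilTrue(callable $condition, float $timeoutSeconds = 5.0, int $intervalMs = 100): bool
{
    $deadline = microtime(true) + $timeoutSeconds;

    while (microtime(true) < $deadline) {
        if ($condition()) {
            return true; // condition met: stop polling, no wasted sleep
        }
        usleep($intervalMs * 1000);
    }

    return false; // timed out: the caller should fail with context
}

// Example: the condition becomes true on the third poll.
$calls = 0;
$result = waitUntilTrue(function () use (&$calls): bool {
    return ++$calls >= 3;
});
```

This is why waitFor succeeds as soon as the element appears instead of always paying the full timeout, which a sleep() call would.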
For API-driven operations, wait for the UI to reflect the completed state:
$browser->click('@save-invoice')
    ->waitForText('Invoice saved successfully')
    ->assertSee('Invoice #1042');
Test Data Isolation
Each E2E test must create its own data and must not depend on data created by other tests. Shared test data is the second-most-common cause of test failures.
In Dusk, use the DatabaseMigrations or DatabaseTruncation trait to reset state between tests. RefreshDatabase's transaction-based reset does not work with Dusk: the browser hits your app through a real HTTP server running in a separate process, which cannot see data inside an uncommitted transaction.
class InvoiceTest extends DuskTestCase
{
    use DatabaseMigrations;

    public function test_admin_can_create_invoice(): void
    {
        // Create all data needed for THIS test
        $admin = User::factory()->admin()->create();
        $client = Client::factory()->create(['name' => 'Acme Corp']);

        $this->browse(function (Browser $browser) use ($admin, $client) {
            $browser->loginAs($admin)
                ->visit('/admin/invoices/create')
                ->select('@client-select', $client->id)
                ->type('@amount', '2500.00')
                ->click('@save-draft')
                ->waitForText('Invoice created')
                ->assertSee('$2,500.00');
        });
    }
}
Categorizing and Prioritizing Tests
Not all E2E tests are equally important. Categorize them:
Critical path tests (run on every PR):
- User registration and email verification
- Login and logout
- Core feature workflows (creating a project, generating an invoice)
- Payment flows
Regression tests (run nightly):
- Edge cases discovered from production bugs
- Secondary workflows (profile editing, notification preferences)
- Admin tools
Smoke tests (run after deployment):
- Is the app loading?
- Can a user log in?
- Is the API responding?
Running all E2E tests on every commit is usually not practical. Running only the critical path tests (5-10 tests, 2-3 minutes) on every PR is realistic and high-value.
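One way to wire this categorization into the test runner is PHPUnit's @group annotation, which php artisan dusk passes through. A sketch; the group names are assumptions:

```php
/**
 * Critical-path test: runs on every PR.
 *
 * @group critical
 */
public function test_user_can_log_in(): void
{
    // ...
}

// On every PR, run only the critical-path group:
//   php artisan dusk --group=critical
//
// Nightly, run the whole suite:
//   php artisan dusk
```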
Parallel Execution in CI
E2E test suites that would take 20 minutes sequentially can run in 4-5 minutes with parallelism. In GitHub Actions:
strategy:
  matrix:
    shard: [1, 2, 3, 4]

steps:
  - name: Run E2E tests (shard ${{ matrix.shard }}/4)
    run: |
      php artisan dusk --group=shard-${{ matrix.shard }}
Divide tests into groups by feature area or by estimated runtime. The matrix runs all groups simultaneously, reducing total wall-clock time.
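One way to divide tests without maintaining shard lists by hand is to hash each test class name into a shard number. A plain-PHP sketch (the function name is an assumption; the 4-shard count matches the matrix above):

```php
<?php

// Deterministically assigns a test class to one of N shards by hashing
// its name. Stable across runs, so each CI job always gets the same
// tests and no test is skipped or run twice.
function shardFor(string $testClass, int $shards = 4): int
{
    return (crc32($testClass) % $shards) + 1; // shard numbers 1..N
}

$shard = shardFor('Tests\\Browser\\InvoiceTest');
```

Hash-based assignment is stable but ignores runtime; if a few tests dominate the clock, grouping by measured duration balances the shards better.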
Debugging Failures
E2E test failures are harder to debug than unit test failures. Set up your tests to capture evidence on failure:
// DuskTestCase.php
protected function tearDown(): void
{
    if ($this->hasFailed()) {
        // $browsers is a static property on Dusk's ProvidesBrowser trait
        $this->captureFailuresFor(collect(static::$browsers));
    }

    parent::tearDown();
}
Dusk captures screenshots and browser console logs on failure automatically when configured correctly. Store these as CI artifacts so you can download and inspect them after a run.
For harder debugging, run Dusk in non-headless mode locally to watch what the browser actually does:
// .env.dusk.local
DUSK_HEADLESS_DISABLED=true
When to Delete E2E Tests
E2E tests that are permanently flaky, that test behavior already covered by reliable integration tests, or that cover deprecated features should be deleted. Dead weight in a test suite makes everything slower and erodes confidence.
Review your E2E suite quarterly:
- Delete tests with more than 5% flakiness rate
- Delete tests that duplicate integration test coverage
- Delete tests for features that no longer exist
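The first criterion can be computed from CI history rather than eyeballed. A sketch, assuming you have per-test pass/fail counts (the data shape and function name are assumptions):

```php
<?php

// Given per-test run counts, returns the tests whose failure rate
// exceeds the threshold (5% by default): the deletion or quarantine
// candidates for the quarterly review.
function flakyTests(array $history, float $threshold = 0.05): array
{
    $flaky = [];

    foreach ($history as $test => $counts) {
        $runs = $counts['passes'] + $counts['failures'];
        if ($runs > 0 && ($counts['failures'] / $runs) > $threshold) {
            $flaky[] = $test;
        }
    }

    return $flaky;
}

$history = [
    'test_user_can_log_in'       => ['passes' => 198, 'failures' => 2],  // 1%: fine
    'test_invoice_pdf_downloads' => ['passes' => 170, 'failures' => 30], // 15%: flaky
];
$result = flakyTests($history);
```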
Practical Takeaways
- Use data-testid attributes for all E2E selectors; never use CSS classes or DOM position
- Use Page Objects to encapsulate page interactions; update one class when the UI changes
- Replace all sleep() calls with explicit waitFor conditions
- Each test creates its own data; never share state between tests
- Run only critical path tests on every PR; full suite nightly
- Capture screenshots and console logs on failure for easier debugging
Need help building reliable systems? We help teams architect software that scales. scopeforged.com