Platform Engineering: Building an Internal Developer Platform

Platform engineering has emerged as one of the most impactful investments an engineering organization can make. But most platforms fail the same way: they get built, announced, and then quietly ignored because they solve the wrong problems or create more friction than they remove.

An Internal Developer Platform (IDP) that succeeds does so because developers choose to use it — not because they're forced to.

What Is Platform Engineering?

Platform engineering is the discipline of building and running self-service infrastructure and tooling that application developers use to deliver software. The platform team's customers are other engineers inside your organization.

The key insight is that a platform is a product. It has users, workflows, pain points, and adoption metrics. Treat it like one.

The Core Promise

A well-built IDP gives developers:

A paved road for the 80% of tasks they do repeatedly
Self-service capabilities without waiting on ops tickets
Guardrails that prevent common mistakes without being restrictive
Consistent environments from dev through production

What It Is Not

A collection of scripts stitched together with duct tape
A mandate from management that forces workflow changes
A single team's side project that nobody maintains
A one-size-fits-all system that ignores team-specific needs

The Golden Path

The concept of a "golden path" is central to platform engineering. It's the recommended, well-supported way to build and deploy services at your organization.

The golden path should cover:

Code → Build → Test → Deploy → Operate

For each stage, your platform provides:

Templates: Starter projects with best practices baked in
Pipelines: Pre-built CI/CD workflows that just work
Tooling: CLI commands that abstract complexity
Documentation: Runbooks and architecture decision records

Example: Service Bootstrapping

Instead of developers spending two days setting up a new service from scratch, your platform's CLI does it in minutes:

# Before: days of copy-paste and misconfiguration
mkdir my-service && cd my-service
# ... manually copy configs from an old service ...
# ... forget to update 12 references ...
# ... CI fails for 3 hours ...

# After: platform CLI scaffolds everything
platform service create my-service \
  --type api \
  --language go \
  --database postgres

# Platform creates:
# - Repository with standard structure
# - Dockerfile and docker-compose
# - CI/CD pipeline pre-configured
# - Kubernetes manifests with resource limits
# - Monitoring dashboards
# - Alert rules
# - README from template

This is the ROI that justifies a platform team.
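The scaffolding step above can be sketched in a few lines. This is a minimal illustration, not how any particular platform CLI works: the `TEMPLATES` dictionary, the file names, and the `scaffold_service` function are all hypothetical, and a real platform would likely pull templates from a versioned repository instead of inlining them.

```python
from pathlib import Path
from string import Template

# Hypothetical inline templates; each key is a path relative to the new service directory.
TEMPLATES = {
    "README.md": "# $name\n\nType: $service_type\nLanguage: $language\n",
    "Dockerfile": "# Placeholder Dockerfile for $name ($language)\n",
    "ci/pipeline.yaml": "service: $name\nstages: [build, test, deploy]\n",
}


def scaffold_service(root: Path, name: str, service_type: str, language: str) -> list[Path]:
    """Render every template into root/name and return the files that were created."""
    service_dir = root / name
    created = []
    for relative_path, body in TEMPLATES.items():
        target = service_dir / relative_path
        # Create intermediate directories such as ci/ before writing the file.
        target.parent.mkdir(parents=True, exist_ok=True)
        rendered = Template(body).substitute(
            name=name, service_type=service_type, language=language
        )
        target.write_text(rendered)
        created.append(target)
    return created
```

Calling `scaffold_service(Path("."), "my-service", "api", "go")` would write the three files under `./my-service`. Keeping the templates as data makes it cheap to add a new file to every future service.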

Choosing Your Platform's Scope

Start by understanding where developers spend time that isn't writing product code. The fastest way to find those friction points is to ask.

Discovery: Talk to Your Users

Before building anything, run structured interviews with 10-15 developers:

Questions to ask:
1. Walk me through your last deployment. What took longest?
2. What do you have to ask the ops team for?
3. What did you Google last week that you feel you shouldn't have to?
4. What are you afraid to touch in production? Why?
5. If you could remove one tool from your workflow, what would it be?

You'll find patterns quickly. Common answers: "setting up local environments", "understanding what's deployed where", "figuring out why something is failing in prod", "waiting for environment provisioning".

Prioritize by Impact and Effort

Plot each pain point on a 2x2:

High Impact, Low Effort  → Do these first (quick wins)
High Impact, High Effort → Plan these carefully
Low Impact, Low Effort   → Do if you have capacity
Low Impact, High Effort  → Skip or defer indefinitely

In most organizations, environment provisioning and deployment automation land in the high-impact half of the grid, which makes them natural early candidates.
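If the interview notes are already scored, the 2x2 can be applied mechanically. The sketch below assumes a 1 to 5 scale for both impact and effort and a threshold of 3; both are illustrative choices, and the `PainPoint` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PainPoint:
    name: str
    impact: int  # 1 (low) to 5 (high), e.g. how often it came up in interviews
    effort: int  # 1 (low) to 5 (high), a rough engineering estimate


def quadrant(point: PainPoint, threshold: int = 3) -> str:
    """Map a scored pain point onto one of the four quadrants."""
    high_impact = point.impact >= threshold
    high_effort = point.effort >= threshold
    if high_impact and not high_effort:
        return "do first"
    if high_impact and high_effort:
        return "plan carefully"
    if not high_effort:
        return "do if capacity"
    return "skip or defer"


pain_points = [
    PainPoint("local environment setup", impact=5, effort=2),
    PainPoint("multi-region failover", impact=4, effort=5),
    PainPoint("rename CI job labels", impact=1, effort=1),
]
for point in pain_points:
    print(f"{point.name}: {quadrant(point)}")
```

The scores matter less than the conversation they force: if two people disagree on whether something is a 2 or a 4, that disagreement is worth resolving before building.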

Core Components to Build

1. Service Catalog

A service catalog gives every team visibility into what services exist, who owns them, and how they interconnect. Backstage (from Spotify, now a CNCF project) is the de facto standard here.

# catalog-info.yaml — lives in every service repository
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-service
  description: Handles payment processing and refunds
  annotations:
    github.com/project-slug: myorg/payment-service
    pagerduty.com/service-id: P12345
    datadog/dashboard-url: https://app.datadoghq.com/dashboard/abc
  tags:
    - billing
    - critical
spec:
  type: service
  lifecycle: production
  owner: payments-team
  dependsOn:
    - component:user-service
    - resource:payments-db

When every service has this file, developers can answer "what services depend on X?" without asking anyone.
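The "what depends on X?" question is a reverse lookup over the `dependsOn` entries. Backstage answers it through its own catalog relations; the sketch below is not Backstage's API, only an illustration of the idea over data that has already been parsed. The `CATALOG` contents (including `checkout-service` and `users-db`) are made up for the example.

```python
from collections import defaultdict

# Hypothetical parsed catalog: component name -> its spec.dependsOn entries.
CATALOG = {
    "payment-service": ["component:user-service", "resource:payments-db"],
    "checkout-service": ["component:payment-service", "component:user-service"],
    "user-service": ["resource:users-db"],
}


def build_reverse_index(catalog: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert the dependsOn edges: entity reference -> components that depend on it."""
    dependents = defaultdict(set)
    for component, depends_on in catalog.items():
        for ref in depends_on:
            dependents[ref].add(component)
    return dependents


def dependents_of(catalog: dict[str, list[str]], ref: str) -> list[str]:
    """Return the sorted names of components that directly depend on ref."""
    return sorted(build_reverse_index(catalog).get(ref, set()))


print(dependents_of(CATALOG, "component:user-service"))
# ['checkout-service', 'payment-service']
```

This only covers direct dependents. Finding everything affected by a change would need a graph traversal over the same index.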

2. Self-Service Environment Provisioning

Enable teams to spin up isolated environments on demand. This removes the ops bottleneck for testing and QA.

# platform/environments/preview.yaml
apiVersion: platform.myorg.com/v1
kind: Environment
metadata:
  name: pr-1234
spec:
  type: preview
  ttl: 7d
  services:
    - name: api
      image: myorg/api:pr-1234
    - name: frontend
      image: myorg/frontend:pr-1234
  databases:
    - type: postgres
      seed: fixtures/demo-data.sql

The platform controller reads this manifest, provisions the environment in Kubernetes, seeds the database, and posts the URL back to the pull request. When the PR closes or the TTL expires, the controller tears the environment down automatically.
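The teardown decision is simple enough to sketch. The example assumes the `ttl` field uses a number followed by `m`, `h`, or `d` (as in `7d` above) and that the controller knows when the environment was created; the function names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed ttl grammar: an integer followed by m (minutes), h (hours), or d (days).
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_ttl(ttl: str) -> timedelta:
    """Convert a ttl string such as '7d' or '12h' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", ttl)
    if match is None:
        raise ValueError(f"unsupported ttl: {ttl!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def should_tear_down(created_at: datetime, ttl: str, pr_closed: bool, now: datetime) -> bool:
    """An environment goes away when its PR closes or its ttl has elapsed."""
    return pr_closed or now >= created_at + parse_ttl(ttl)


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(should_tear_down(created, "7d", pr_closed=False, now=created + timedelta(days=8)))
# True
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the expiry logic easy to test.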

3. Deployment Abstractions

Most developers shouldn't need to understand Kubernetes to deploy their service. Provide a simplified deployment interface:

# Developer-facing interface
platform deploy my-service --env staging --version 1.2.3
platform rollback my-service --env production
platform status my-service --env production

# Under the hood, the platform handles:
# - Image existence checks
# - Health check configuration
# - Traffic shifting (canary/blue-green)
# - Rollback conditions
# - Slack notifications
# - Audit logging

The platform encapsulates the complexity so developers get safety without understanding every knob.
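One of the knobs the platform hides is the rollback condition for a canary. As a rough sketch, the check below compares the canary's error rate against the stable version's rate plus a tolerance. The thresholds, the minimum-traffic guard, and the function name are assumptions for illustration; real rollout tools typically evaluate several metrics over a time window.

```python
def should_roll_back(
    canary_errors: int,
    canary_requests: int,
    baseline_error_rate: float,
    tolerance: float = 0.01,
    min_requests: int = 100,
) -> bool:
    """Roll back when the canary's error rate exceeds the baseline by more than tolerance."""
    if canary_requests < min_requests:
        # Too little traffic to judge; keep observing instead of reacting to noise.
        return False
    canary_error_rate = canary_errors / canary_requests
    return canary_error_rate > baseline_error_rate + tolerance


# 5% errors on the canary against a 1% baseline triggers a rollback.
print(should_roll_back(canary_errors=5, canary_requests=100, baseline_error_rate=0.01))
# True
```

Encoding the condition once in the platform means every team gets the same safety net without having to tune it themselves.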

4. Secrets Management

Provide a single, secure way to manage secrets:

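# Developer creates a secret
# (values are inline here for brevity; reading them from a prompt or stdin keeps them out of shell history)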
platform secrets set my-service/DATABASE_URL "postgres://..."
platform secrets set my-service/API_KEY "sk-..."

# Secret is automatically:
# - Stored in Vault with appropriate policies
# - Synced to Kubernetes secrets in the right namespaces
# - Rotated according to policy
# - Audited on every access

When secrets management is self-service and easy, developers stop storing secrets in environment files that get committed to git.

Measuring Platform Success

Your platform team should track metrics just like a product team:

Developer Experience Metrics

Deployment Frequency    - How often can teams ship?
Lead Time for Changes   - Commit to production time
Time to Restore Service - How fast can teams recover?
Change Failure Rate     - % of deploys causing incidents

These are the DORA metrics, and a well-built platform should measurably improve all four.
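Three of the four metrics can be computed from deployment records alone. The sketch below assumes each record carries a commit time, a deploy time, and a flag for whether it caused an incident; the `Deploy` type is hypothetical. Time to restore is left out because it needs incident open and close times, which this record shape does not include.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deploy:
    committed_at: datetime
    deployed_at: datetime
    caused_incident: bool


def deployment_frequency(deploys: list[Deploy], period_days: int) -> float:
    """Average number of deployments per day over the period."""
    return len(deploys) / period_days


def median_lead_time(deploys: list[Deploy]) -> timedelta:
    """Median time from commit to running in production."""
    return median(d.deployed_at - d.committed_at for d in deploys)


def change_failure_rate(deploys: list[Deploy]) -> float:
    """Fraction of deployments that caused an incident."""
    return sum(d.caused_incident for d in deploys) / len(deploys)


base = datetime(2024, 1, 1)
deploys = [Deploy(base, base + timedelta(hours=h), h == 4) for h in (1, 2, 3, 4)]
print(deployment_frequency(deploys, period_days=2))  # 2.0
print(median_lead_time(deploys))                     # 2:30:00
print(change_failure_rate(deploys))                  # 0.25
```

The median is used for lead time so that one unusually slow change does not dominate the number.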

Platform Adoption Metrics

Service onboarding rate   - % of services using the platform
Golden path adoption      - % of deployments via platform
Ticket deflection         - Ops tickets before vs after
Developer NPS             - Quarterly survey score
Time to first deployment  - For a new service from zero

If adoption is low, don't mandate usage — investigate why. Low adoption is a signal that the platform is solving the wrong problems or adding friction.
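The adoption numbers above are simple ratios. The sketch below shows one way to compute three of them; the function names and sample figures are illustrative. The NPS calculation follows the standard definition: the percentage of promoters (scores of 9 or 10) minus the percentage of detractors (scores of 0 to 6).

```python
def adoption_rate(on_platform: int, total: int) -> float:
    """Percentage of services (or deployments) that go through the platform."""
    return 100 * on_platform / total


def ticket_deflection(tickets_before: int, tickets_after: int) -> float:
    """Percentage drop in ops tickets compared with the pre-platform baseline."""
    return 100 * (tickets_before - tickets_after) / tickets_before


def nps(scores: list[int]) -> float:
    """Net Promoter Score from 0-10 survey answers."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


print(adoption_rate(on_platform=45, total=60))               # 75.0
print(ticket_deflection(tickets_before=120, tickets_after=90))  # 25.0
print(nps([10, 9, 8, 7, 6, 3, 10, 9, 5, 10]))                # 20.0
```

Track these per team as well as in aggregate. A healthy overall number can hide one team that has quietly opted out.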

Common Anti-Patterns to Avoid

Building Before Validating

The most common mistake: spending six months building a platform based on assumptions, then discovering teams don't use it. Build small, release early, iterate.

Ignoring Escapability

Your platform should make the right thing easy, not the wrong thing impossible. If a team has a legitimate reason to deviate from the golden path, they should be able to — with appropriate documentation of why.

Neglecting Documentation

A platform without docs is a puzzle. Every self-service action should have:

A quick-start guide for the common case
Reference documentation for all options
Runbooks for troubleshooting common failures
Architecture docs explaining why decisions were made

Building Everything In-House

Use open source tools where they solve the problem well: Backstage for the service catalog, Argo CD for GitOps deployments, Vault for secrets, Crossplane for cloud resource provisioning. You add value by integrating and configuring these tools, not reinventing them.

Team Structure and Funding

Platform teams need long-term investment to succeed. A common model:

1 platform engineer per 8-10 application developers is a reasonable starting ratio
Platform teams should have a product manager or designated product-thinking engineer
Treat platform work as product investment, not a cost center
Define SLAs for platform services — if the platform is down, teams can't ship

Getting Started: The Minimum Viable Platform

Don't try to build everything at once. A minimum viable platform might just be:

A standardized Dockerfile and CI pipeline template
A script that creates a new service repository with all the boilerplate
A deployment script that handles the common cases
A Slack channel where developers can ask platform questions

This is unglamorous but valuable. Measure adoption, gather feedback, and iterate. The best platforms grow organically from real developer pain rather than architectural ambition.

The goal isn't to build a beautiful platform — it's to help your developers ship better software faster, with less pain. Keep that north star in mind with every decision.
