How Koalr Works

Architecture overview — how Koalr collects, processes, and surfaces engineering data.

Koalr connects to your existing tools (GitHub, Jira, PagerDuty, Codecov, etc.) via OAuth and webhooks, then normalizes the data into a unified engineering intelligence layer.

Data collection

Webhooks (real-time)

When you connect GitHub, Koalr registers a webhook on your organization. Every pull request event, deployment event, and check run is then delivered in real time, and Koalr processes each one within seconds (under 5 seconds for PR events).
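As a rough illustration of the receiving side, the sketch below turns one webhook delivery into a job that could be queued. The event names and the `X-GitHub-Event` / `X-GitHub-Delivery` headers are GitHub's; the `IngestJob` shape and the `toIngestJob` helper are hypothetical and are not Koalr's actual code.

```typescript
// GitHub event names this sketch ingests (matching the three kinds listed above).
const INGESTED_EVENTS = ["pull_request", "deployment", "check_run"] as const;
type IngestedEvent = (typeof INGESTED_EVENTS)[number];

// Hypothetical job shape, for illustration only.
type IngestJob = {
  kind: IngestedEvent;
  deliveryId: string; // value of the X-GitHub-Delivery header
  receivedAt: string; // ISO 8601 timestamp
  payload: unknown;   // raw webhook body, parsed later by a worker
};

function isIngestedEvent(eventName: string): eventName is IngestedEvent {
  return (INGESTED_EVENTS as readonly string[]).indexOf(eventName) !== -1;
}

// Returns a queueable job, or null when the event type is not one we ingest.
function toIngestJob(
  eventName: string, // value of the X-GitHub-Event header
  deliveryId: string,
  payload: unknown,
  now: Date = new Date(),
): IngestJob | null {
  if (!isIngestedEvent(eventName)) return null;
  return { kind: eventName, deliveryId, receivedAt: now.toISOString(), payload };
}
```

Keeping the handler this thin (classify, stamp, hand off) is one way to acknowledge a delivery quickly and leave the heavier work to an asynchronous worker.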

REST API sync (historical)

On first connection, Koalr fetches up to 1 year of historical data: merged PRs, deployment history, CODEOWNERS files, repository metadata. This populates your initial DORA metrics and deploy risk model.
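A minimal sketch of the one-year backfill window follows. It assumes a hypothetical `fetchPage` callback that returns merged PRs newest first and an empty array once the history is exhausted; the real sync may page and sort differently.

```typescript
type MergedPr = { number: number; mergedAt: string }; // mergedAt is ISO 8601

// Hypothetical page fetcher: 1-based page index, newest PRs first, [] when done.
type PageFetcher = (page: number) => MergedPr[];

// Earliest timestamp the backfill should include (default: one year back).
function backfillCutoff(now: Date, years: number = 1): Date {
  const cutoff = new Date(now.getTime());
  cutoff.setUTCFullYear(cutoff.getUTCFullYear() - years);
  return cutoff;
}

// Walk pages until the history runs out or a PR falls outside the window.
function collectHistory(fetchPage: PageFetcher, now: Date): MergedPr[] {
  const cutoffMs = backfillCutoff(now).getTime();
  const kept: MergedPr[] = [];
  for (let page = 1; ; page++) {
    const batch = fetchPage(page);
    if (batch.length === 0) return kept;
    for (const pr of batch) {
      // Results are newest first, so the first too-old PR ends the backfill.
      if (Date.parse(pr.mergedAt) < cutoffMs) return kept;
      kept.push(pr);
    }
  }
}
```

Stopping at the first PR older than the cutoff avoids fetching pages that would be discarded anyway, which only works under the newest-first assumption stated above.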

Scheduled syncs

For data that doesn't emit webhooks (CODEOWNERS drift detection, coverage snapshots, on-call schedules), Koalr runs scheduled syncs at intervals from 15 minutes to 24 hours, depending on the integration.
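The interval check behind a scheduled sync can be sketched as below. The sync names and the `dueSyncs` helper are hypothetical, and the two intervals are example values consistent with the Data freshness table on this page, not confirmed settings.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical sync specs; intervals are illustrative examples.
type SyncSpec = { syncName: string; everyHours: number };
const SYNC_SPECS: SyncSpec[] = [
  { syncName: "codeownersDrift", everyHours: 24 },
  { syncName: "coverageSnapshots", everyHours: 6 },
];

// Returns the syncs whose interval has elapsed, or that have never run.
function dueSyncs(lastRunAt: Record<string, Date | undefined>, now: Date): string[] {
  const due: string[] = [];
  for (const spec of SYNC_SPECS) {
    const last = lastRunAt[spec.syncName];
    const elapsedMs = last === undefined ? Infinity : now.getTime() - last.getTime();
    if (elapsedMs >= spec.everyHours * HOUR_MS) due.push(spec.syncName);
  }
  return due;
}
```

Treating a sync that has never run as overdue means a newly connected integration is picked up on the next scheduler tick without special handling.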

Processing pipeline

GitHub webhook → BullMQ queue → Ingestion worker → Prisma (PostgreSQL)
                                                    ↓
                                           Risk scoring engine
                                                    ↓
                                       ClickHouse (aggregations)
                                                    ↓
                                            API → Dashboard

  1. Ingestion: raw events are enqueued in BullMQ and processed asynchronously
  2. Normalization: events are mapped to Koalr's unified schema (PR → Deployment → Incident chain)
  3. Scoring: deploy risk scores are computed per deployment using 7 weighted factors
  4. Aggregation: DORA metrics, cycle time, and review health are pre-aggregated in ClickHouse for fast queries
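Steps 2 and 3 above can be sketched in a few lines. Everything here is an assumption made for illustration: the row shapes, the join keys (commit SHA and deployment ID), the seven factor names, and their weights are hypothetical, since this page does not list the real factors.

```typescript
type PullRequestRow = { prNumber: number; mergeSha: string };
type DeploymentRow = { deploymentId: string; sha: string };
type IncidentRow = { incidentId: string; deploymentId: string };
type DeliveryChain = {
  pullRequest: PullRequestRow | null;
  deployment: DeploymentRow;
  incidents: IncidentRow[];
};

function findPrBySha(prs: PullRequestRow[], sha: string): PullRequestRow | null {
  for (const pr of prs) if (pr.mergeSha === sha) return pr;
  return null;
}

// Step 2 (normalization): build one PR → Deployment → Incident chain per deployment.
function linkChains(
  prs: PullRequestRow[],
  deployments: DeploymentRow[],
  incidents: IncidentRow[],
): DeliveryChain[] {
  return deployments.map((deployment) => ({
    pullRequest: findPrBySha(prs, deployment.sha),
    deployment,
    incidents: incidents.filter((i) => i.deploymentId === deployment.deploymentId),
  }));
}

// Step 3 (scoring): seven hypothetical factors with hypothetical weights.
const RISK_WEIGHTS = {
  changeSize: 25,
  filesTouched: 20,
  authorFamiliarity: 15,
  reviewCoverage: 15,
  testCoverageDelta: 10,
  deployTiming: 10,
  recentFailureRate: 5,
} as const;
type RiskFactor = keyof typeof RISK_WEIGHTS;

// Each factor value is expected in [0, 1]; the result is a 0-100 score.
function riskScore(values: Record<RiskFactor, number>): number {
  const factors = Object.keys(RISK_WEIGHTS) as RiskFactor[];
  let weighted = 0;
  let totalWeight = 0;
  for (const factor of factors) {
    const clamped = Math.min(1, Math.max(0, values[factor]));
    weighted += RISK_WEIGHTS[factor] * clamped;
    totalWeight += RISK_WEIGHTS[factor];
  }
  return Math.round((weighted / totalWeight) * 100);
}
```

Dividing by the total weight keeps the score on a 0-100 scale even if individual weights are later retuned.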

Data freshness

Data type               Latency
PR events               < 5 seconds (webhook)
Deployment risk score   < 30 seconds (webhook trigger)
DORA metrics            < 1 minute (aggregation job)
CODEOWNERS drift        Daily (3 AM UTC)
Coverage snapshots      Every 6–24 hours (provider-dependent)
On-call roster          Every 15 minutes (live query)
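If you want to check observed lag against these figures, one approach is to treat each row as an upper bound on the delay between a change occurring and it becoming visible. The sketch below takes that reading, which is an assumption (for the scheduled rows it uses the worst-case interval); the key names and the `lagWithinBudget` helper are hypothetical.

```typescript
const SECOND_MS = 1000;
const MINUTE_MS = 60 * SECOND_MS;
const HOUR_IN_MS = 60 * MINUTE_MS;

// Upper bounds derived from the table above (worst case for scheduled rows).
const FRESHNESS_BUDGET_MS = {
  prEvents: 5 * SECOND_MS,
  deploymentRiskScore: 30 * SECOND_MS,
  doraMetrics: 1 * MINUTE_MS,
  codeownersDrift: 24 * HOUR_IN_MS,
  coverageSnapshots: 24 * HOUR_IN_MS,
  onCallRoster: 15 * MINUTE_MS,
} as const;
type FreshnessKey = keyof typeof FRESHNESS_BUDGET_MS;

// True when the delay between the change and its visibility fits the budget.
function lagWithinBudget(kind: FreshnessKey, occurredAt: Date, visibleAt: Date): boolean {
  const lagMs = visibleAt.getTime() - occurredAt.getTime();
  return lagMs >= 0 && lagMs <= FRESHNESS_BUDGET_MS[kind];
}
```

For example, a PR event that shows up 3 seconds after it occurred is within budget, while a DORA metric that takes 2 minutes to reflect a deployment is not.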

Multi-tenancy

All data is scoped to your organization. No data is shared between organizations. Each query is automatically filtered by organizationId at the database level.
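To make the scoping idea concrete, here is a small in-memory sketch in which callers cannot read rows without the organization filter being applied. It only illustrates the principle; `OrgScopedStore` is a hypothetical stand-in, and Koalr's actual enforcement happens in the database layer as described above.

```typescript
type TenantRow = { organizationId: string };

// Hypothetical store bound to one organization at construction time.
class OrgScopedStore<T extends TenantRow> {
  private readonly rows: T[];
  private readonly organizationId: string;

  constructor(rows: T[], organizationId: string) {
    this.rows = rows;
    this.organizationId = organizationId;
  }

  // The organizationId filter is always applied; callers can only narrow further.
  findMany(predicate: (row: T) => boolean = () => true): T[] {
    return this.rows.filter(
      (row) => row.organizationId === this.organizationId && predicate(row),
    );
  }
}
```

Because the filter lives inside the store rather than in each call site, a query that forgets to mention the organization still returns only that organization's rows.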