Fetcher Specification: Dashboard.Fetcher
Status: Draft · Date: 2026-05-29
Implementation contract for Dashboard.Fetcher, the optional, separately-deployed pull-mode adapter that translates a CI/CD tool's pull API into the dashboard's push ingest. Its defining requirement is a tool-agnostic abstraction layer: the polling host knows nothing about any specific CI/CD system; all tool-specifics live behind one interface.
Sources of truth
| Source | Owns |
|---|---|
| docs/SAD.md §3, §7 | Fetcher as opt-in pull→push edge; backend stays CI-agnostic. |
| docs/api/openapi.yaml | POST /api/deployments, GET/PUT /api/fetcher/state/{adapter}, X-Progress-Reporter. |
| docs/API_SPECIFICATION.md | Wire DTO (DeploymentEventIngest), cursor + append-only semantics. |
| docs/GITHUB_EMULATOR_SPECIFICATION.md | GitHub emulator service: the test mock and demo data source the fetcher polls in demo/CI mode. |
| docs/diagrams/github-emulation.md | Visual reference for demo-mode topology and seed→backfill→poll sequence. |

CR-####/ADR-#### documents referenced elsewhere do not exist; ignore those citations.
1. Role
The fetcher is a standalone worker that, on an interval:
- loads its opaque cursor from GET /api/fetcher/state/{adapter},
- asks a CI/CD adapter for new deployment events since that cursor,
- POSTs each event to /api/deployments (same X-Api-Key, plus X-Progress-Reporter: dashboard-fetcher/<adapter>),
- persists the advanced cursor via PUT /api/fetcher/state/{adapter}.

It is just another pusher: the backend treats fetcher traffic identically to a CI notify step. No CI/CD-specific code ever enters the backend (SAD §3).
2. Decisions
| # | Decision | Rationale |
|---|---|---|
| F1 | Pull→push via the public ingest. Reuses POST /api/deployments + X-Api-Key; never a private backdoor. | Backend stays tool-agnostic (SAD §3). |
| F2 | One abstraction: ICiCdAdapter. The host/orchestrator depend only on it + the canonical DTO + an opaque cursor string. Zero tool-specifics leak out. | The headline requirement. Adding Azure DevOps / Jenkins = a new adapter, no host changes. |
| F3 | Adapter owns its cursor shape. Persisted opaquely via /api/fetcher/state/{adapter}; host never parses it. | Matches openapi opaque-cursor contract. |
| F4 | GitHub adapter sources the Deployments + Deployment Statuses REST API. AdapterId = github-actions. | Those endpoints carry environment + the status lifecycle the matrix needs (workflow-runs API lacks environment). |
| F5 | At-least-once delivery per chunk. Cursor advances after all POSTs in a chunk succeed and the cursor is persisted; a throw mid-chunk leaves the cursor at the previous chunk, so the next loop re-delivers that chunk (dupes OK, append-only). | Store is append-only / no dedup: duplicates are acceptable, dropped events are not. |
| F6 | Single replica per adapter. No leader election; the cursor is shared but unlocked. | Two replicas would double-post. The API (not the fetcher) is the horizontally-scaled tier. |
| F7 | Bounded initial backfill. On a 404 (no cursor yet) the adapter starts from now − INITIAL_LOOKBACK, not from repo genesis. | Avoids flooding the store with full history on first run. |
| F8 | Adapter handles conditional requests + rate limits. ETag / If-None-Match, X-RateLimit-*, Retry-After, backoff. | Keeps polling cheap and a good API citizen; internal to the adapter. |
| F9 | Config-driven; base URL overridable. Repos + service/version mapping + GitHub base URL from env. | Integration repoints the GitHub base URL at a mock; production points at api.github.com. |
| F10 | parent_deployments derived from workflow needs graph. The adapter fetches the workflow YAML for each run, parses the deployment-job subgraph (environment: + needs:), and resolves parent edges to deployment_id values (§5.6). Any resolution failure → parent_deployments = []; ingest is never blocked. | Reproduces the deployment graph GitHub surfaces in the Actions Run UI. explicit parent is the Swimlanes default correlation predicate; accurate population here makes it work out of the box. |
| F11 | Workflow graph cached in-memory per (repo, run_id). Bounded LRU (≤ 200 entries). Cache entry includes workflow name (used as service identity), path, head_sha, and parsed deployment-job subgraph. | Avoids re-fetching the workflow YAML for each status event that shares a run; workflow runs are immutable so no invalidation is needed. |
| F12 | Service identity = workflow YAML name: field, resolved via the run's path (e.g. .github/workflows/deploy.yml) → the active workflow with that path → its YAML name: field. run.Name (the run-name display value, overridable via run-name:) is not used for identity. GITHUB_SERVICE_MAP overrides at two levels: workflow name (key without /) or repo (key = owner/repo). Resolution order: path→workflow-name lookup → workflow-level override → repo-level override → workflow name as-is. Non-Actions deployments (no target_url) fall back to the repo's short name. | Stable across run-name: overrides; SERVICE_MAP handles edge cases without restructuring the pipeline. |
| F13 | Backfill fills the last BACKFILL_DEPTH status events per (service, environment) slot (default 2). Enumerates active workflows and environments per repo; paginates deployments newest-first. For each candidate deployment, fetches its statuses and counts the mapped ones (§5.3; inactive is skipped and does not count; waiting now maps to a real status event and counts toward depth like the other pre-run states pending/queued, consistent with the invariant that the status-event count matches what the history drawer shows). Stops scanning a slot once eventsSoFar ≥ BACKFILL_DEPTH. After collecting candidate events, trims to the BACKFILL_DEPTH latest by status.created_at per slot before posting. Stops for an environment when consecutiveNoProgress ≥ StallWindow (20); a deployment makes no progress when its service is already at depth, is unknown, or has zero mapped statuses. The YAML graph is fetched only for deployments contributing kept events; discarded deployments cost only statuses + run-metadata. BACKFILL_MAX_AGE is the hard backstop. | Controls how many history drawer entries seed each slot at startup; status-event count matches what the history drawer shows. No-progress stop and defer-YAML bound API cost as before. |
| F14 | Backfill triggers on null cursor (first run) or BACKFILL=true. After completion cursor advances to max(status.created_at) seen, preventing re-post in the subsequent normal poll. | BACKFILL=true supports the "reset data" scenario without redeploying or clearing the fetcher-state row manually. |
| F15 | Version source is type:key configurable; no fallback, no truncation except sha. Three types: attribute (deployment field; sha key → 7-char truncation, all others as-is), payload (deployment payload JSON field), artifact (Actions artifact archive: archive name = filename, content is a plain-text version string). Missing / null / unreachable source → version = null; ingest is never blocked. Default: attribute:sha. | Covers the three real-world versioning patterns without a silent fallback that would mask misconfiguration. |
| F16 | Rate-limit budget on OWN usage. Adapter self-throttles to at most GITHUB_RATE_LIMIT_BUDGET_PCT% (default 30) of its hourly request quota. Quota is read from GITHUB_RATE_LIMIT when set; otherwise discovered via GET /rate_limit on startup (failure → safe default of 5 000). The fetcher tracks its own request count since process start (not X-RateLimit-Used, which counts all consumers of the token). When own count reaches the budget, the adapter waits until X-RateLimit-Reset. Counter resets after the window rolls over. | Prevents sleeping when the token is heavily used by other consumers; the fetcher is a background process and must not monopolise a shared token. |
| F17 | Control-plane participant (gated on CONTROL_API_KEY). When CONTROL_API_KEY is set, a second long-lived task subscribes to GET /api/control/stream with exponential backoff on failures (1 s → 2 s → 4 s … capped 30 s). When CONTROL_API_KEY is empty, the subscriber is never started and a startup log message records the absence. Reacts to: drain + ack on reset-initiated, drop cursor + backfill + report running on reset-completed. Still just a consumer of the existing control-plane contract; no backend change (F1, SAD §3). | Prevents 404-looping when the API's control surface is disabled (empty key); backoff avoids hammering on transient failures. |
| F18 | Per-cycle rate-limit reporting. After every successful poll cycle, when a RateLimitSnapshot is available, the fetcher posts a rate-limit component event to POST /api/control/events. Reuses the existing ComponentEventClient transport. Skipped when snapshot is null (before the first GitHub response). Not gated on CONTROL_API_KEY: always active when API_KEY is present. Non-fatal: POST failures are logged and swallowed so reporting never breaks the poll loop. | Operators and end-users can observe CI/CD quota consumption in real time without backend change. The snapshot already exists (F16); this adds only the emit step. |
3. Solution layout
backend/
fetcher/ Dashboard.Fetcher/ # abstraction + adapters + clients + orchestrator (library)
Abstractions/ ICiCdAdapter, FetchResult
Adapters/GitHub/ GithubActionsAdapter, GithubClient, mapping, cursor
Ingest/ IngestClient, FetcherStateClient (HTTP clients to the API)
Control/ ControlStreamSubscriber, ComponentEventClient (§5.10)
Orchestration/ PollLoop / per-adapter runner
fetcher-host/ Dashboard.Fetcher.Host/ # BackgroundService worker(s) + DI + config + Dockerfile
# also hosts a minimal HTTP listener for GET /health
tests/
Dashboard.Fetcher.Tests/ # owned here, excluded from the API test run
- Control/ControlStreamSubscriber: the long-lived control-stream reader (fetch() + ReadableStream equivalent: HttpClient + HttpCompletionOption.ResponseHeadersRead streaming the body; not EventSource). Parses SSE frames, tracks Last-Event-ID, honours the ": ping" heartbeat, dispatches reset events to the poll-loop runner.
- Control/ComponentEventClient: HTTP client for POST /api/control/events (the reset-ack and status posts). Distinct from Ingest/IngestClient; both target the API but carry different headers (X-Component-Id vs X-Progress-Reporter).
- The host runs two concurrent tasks: the existing per-adapter poll loop (§4) and the ControlStreamSubscriber. The subscriber signals the runner to pause/resume; it never fetches or posts deployment events itself.
- GET /health: host-level liveness endpoint served by the ASP.NET web listener in Dashboard.Fetcher.Host. Returns 200 OK while the host process is running (no body required). This is host-level observability only; the ICiCdAdapter/ingest/control-plane logic is unchanged (F1, G2). The web listener uses the standard ASP.NET ASPNETCORE_URLS / port mechanism; no adapter or library change.
- GET /readyz: functional readiness endpoint. Reflects actual GitHub poll-cycle health via IFetcherReadinessIndicator / FetcherReadinessIndicator; see §6.1.

Reuses Dashboard.Shared for the DeploymentEventIngest DTO; the fetcher emits the exact same wire type the contract defines. Stack = .NET 10 (SAD §6), packaged as a standard container.
4. The abstraction (F2)
```csharp
namespace Dashboard.Fetcher.Abstractions;

/// The ONLY surface the host knows. No GitHub/ADO/Jenkins type ever appears here.
public interface ICiCdAdapter
{
    /// Stable, lowercase-kebab id. Used as the X-Progress-Reporter suffix
    /// (dashboard-fetcher/<id>) and the /api/fetcher/state/{adapter} key.
    string AdapterId { get; }

    /// Streams chunks of events newer than `cursor` (null = first run).
    /// Each yielded FetchResult carries the events for that chunk plus the full
    /// advanced cursor as of that chunk (opaque to the host).
    /// Backfill yields one chunk per (repo, env) plus a zero-event completion
    /// marker per repo. Normal poll yields a single chunk.
    IAsyncEnumerable<FetchResult> FetchAsync(string? cursor, CancellationToken ct);
}

/// Events are the canonical wire DTO, already tool-neutral.
public sealed record FetchResult(
    IReadOnlyList<DeploymentEventIngest> Events,
    string? Cursor);
```
Orchestrator (tool-agnostic, one loop per adapter):
```csharp
var cursor = await state.GetAsync(adapter.AdapterId, ct); // GET /api/fetcher/state/{id} (404 -> null)
while (!ct.IsCancellationRequested)
{
    // Iterate chunks; persist cursor after each chunk that advances it.
    // Zero-event chunks (backfill completion markers) are also persisted when cursor changes.
    await foreach (var chunk in adapter.FetchAsync(cursor, ct))
    {
        foreach (var ev in chunk.Events)
            await ingest.PostAsync(ev, adapter.AdapterId, ct); // POST /api/deployments

        if (chunk.Cursor != cursor)
        {
            await state.PutAsync(adapter.AdapterId, chunk.Cursor!, ct); // PUT /api/fetcher/state/{id}
            cursor = chunk.Cursor;
        }
    }
    await Task.Delay(pollInterval, ct);
}
```
- Cursor is persisted after each chunk whose cursor advances (F5). A throw mid-chunk leaves the cursor at the last completed chunk; the next loop re-delivers from that point (dupes OK, append-only).
- Zero-event completion markers (backfill repo-done) ARE persisted when they carry a new cursor.
- The host references no Dashboard.Fetcher.Adapters.GitHub type; adapters are resolved via DI as IEnumerable<ICiCdAdapter>.
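
The at-least-once behaviour of the loop can be illustrated with a small simulation. This is a language-neutral sketch in Python, not the .NET orchestrator: `FlakyIngest`, `fetch`, `run_cycle`, the event names and the cursor values `c1`/`c2` are all hypothetical. It shows that a failed POST in the second chunk leaves the persisted cursor at the first chunk, so the retry re-delivers the second chunk from its start and one event is posted twice.

```python
from typing import Iterator, List, Optional, Tuple

Chunk = Tuple[List[str], Optional[str]]  # (events, advanced cursor)


class FlakyIngest:
    """Hypothetical stand-in for the ingest client: fails once on a chosen event."""

    def __init__(self, fail_on: str) -> None:
        self.fail_on = fail_on
        self.failed = False
        self.posted: List[str] = []

    def post(self, event: str) -> None:
        if event == self.fail_on and not self.failed:
            self.failed = True
            raise IOError("simulated POST failure")
        self.posted.append(event)


def fetch(cursor: Optional[str]) -> Iterator[Chunk]:
    """Toy adapter: two chunks, resumed from whatever cursor was persisted."""
    chunks = [(["a1", "a2"], "c1"), (["b1", "b2"], "c2")]
    start = {None: 0, "c1": 1, "c2": 2}[cursor]
    yield from chunks[start:]


def run_cycle(cursor: Optional[str], ingest: FlakyIngest, store: dict) -> Optional[str]:
    """One poll cycle: post every event of a chunk, then persist its cursor."""
    for events, new_cursor in fetch(cursor):
        for ev in events:
            ingest.post(ev)              # a throw here skips the persist below
        if new_cursor != cursor:
            store["cursor"] = new_cursor
            cursor = new_cursor
    return cursor


store = {"cursor": None}
ingest = FlakyIngest(fail_on="b2")
try:
    run_cycle(store["cursor"], ingest, store)
except IOError:
    pass                                 # the next interval retries from the stored cursor
run_cycle(store["cursor"], ingest, store)
```

After the two cycles, `b1` appears twice in `ingest.posted` and nothing is missing, which is the trade F5 accepts for an append-only store.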
5. GitHub implementation (GithubActionsAdapter)
AdapterId = "github-actions". Sources the GitHub REST API; everything below is encapsulated inside the adapter.
5.1 Endpoints
| Purpose | Call |
|---|---|
| List deployments per repo | GET /repos/{owner}/{repo}/deployments?environment=&per_page= |
| Status lifecycle of a deployment | GET /repos/{owner}/{repo}/deployments/{deployment_id}/statuses |
| Workflow run metadata | GET /repos/{owner}/{repo}/actions/runs/{run_id} |
| Workflow file contents | GET /repos/{owner}/{repo}/contents/{path}?ref={sha} |
| List active workflows (backfill) | GET /repos/{owner}/{repo}/actions/workflows?per_page=100 |
| List environments (backfill) | GET /repos/{owner}/{repo}/environments |
| List artifacts for a run | GET /repos/{owner}/{repo}/actions/runs/{run_id}/artifacts |
| Download artifact archive | GET /repos/{owner}/{repo}/actions/artifacts/{artifact_id}/zip |
Auth: Authorization: Bearer <token> + Accept: application/vnd.github+json + X-GitHub-Api-Version. Base URL from config (https://api.github.com default; overridable for the integration mock).
5.2 Field mapping → DeploymentEventIngest
| Contract field | GitHub source |
|---|---|
| deployment_id | gh-deploy-{deployment.id} (correlation key; all status rows of one deployment share it) |
| service | workflow YAML name: field from run metadata (§5.6.2 cache); resolved via ResolveService (§5.8.3) |
| environment | deployment.environment |
| status | mapped from status.state (§5.3) |
| happened_at | status.created_at (UTC) |
| version | resolved via §5.7; null when source yields nothing |
| sha | deployment.sha |
| ref | deployment.ref |
| actor | status.creator.login ?? deployment.creator.login |
| run_url | status.target_url (the Actions run, when present) |
| run_number | run_id extracted from status.target_url via /actions/runs/(\d+) (same extraction as §5.6.1; reuse cached value) |
| parent_deployments | derived; see §5.6 |

One GitHub deployment status → one event row (matches the append-only lifecycle: in-progress → success/failure rows sharing deployment_id).
5.3 Status mapping
| GitHub state | Contract status |
|---|---|
| pending | pending |
| queued | queued |
| in_progress | in-progress |
| waiting | waiting |
| success | success |
| failure, error | failure (but see cancelled/rejected quirk below) |
| inactive | (skipped: supersession marker, not a transition) |

Settled mapping decisions (intentional, not gaps):
- error collapses into failure. GitHub's error (the deployment couldn't be processed: a system/integration-level problem) vs failure (the deploy ran and failed) is a distinction with no operator consequence here: both are terminal "did not succeed" outcomes and the viewer's reaction is identical. error is also rare on Actions-driven deployments (mostly emitted by third-party deploy integrations). Not promoted to its own contract status; preserve the raw state in event metadata if granularity is ever needed.
- inactive is skipped. It is not a deploy outcome; it is GitHub bookkeeping marking a deployment as no longer the live one (auto-set on a prior success when a newer success supersedes it in the same environment). The dashboard's "latest deployment per environment" model already captures supersession via the newer deployment it does ingest, so emitting inactive would be redundant and semantically wrong. (Edge case: a deployment deactivated without a replacement, e.g. teardown of an ephemeral environment, would leave a stale tile; out of scope, would be a deliberate "show env as empty" feature, not a fix.)
Cancelled and rejected: derived beyond the status pipeline
GitHub's deployment_status.state enum has no cancelled or rejected value. The closed set is: error / failure / inactive / in_progress / queued / pending / success (plus waiting in webhook payloads). A cancelled run or a reviewer-denied environment gate is written by GitHub as failure; the real signal lives one level up:
- cancelled. The fetcher cross-references the associated workflow run's conclusion field. The run object is already cached per (repo, run_id) (F11). A cancelled conclusion on a deployment whose status mapped to failure is re-emitted as cancelled.
- rejected. Read from the environment pending-deployment-reviews API (state: rejected) for waiting deployments denied by a reviewer. This is the only signal that distinguishes a reviewer rejection from a cancellation.

These are derived statuses resolved after the StatusMapper step, not a change to the mapping table above.
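
The mapping table and the cancelled derivation can be sketched as a lookup followed by a post-mapping step. This is an illustrative Python sketch, not the StatusMapper itself: `STATUS_MAP`, `map_status` and `derive_status` are hypothetical names, skipping unknown states is an assumption, and the rejected derivation is left out because it depends on the reviews API response.

```python
from typing import Optional

# GitHub deployment_status.state -> contract status (None = skip the status).
STATUS_MAP = {
    "pending": "pending",
    "queued": "queued",
    "in_progress": "in-progress",
    "waiting": "waiting",
    "success": "success",
    "failure": "failure",
    "error": "failure",      # error collapses into failure
    "inactive": None,        # supersession marker, never emitted
}


def map_status(state: str) -> Optional[str]:
    """Table lookup; unknown states are skipped (an assumption of this sketch)."""
    return STATUS_MAP.get(state)


def derive_status(mapped: Optional[str], run_conclusion: Optional[str]) -> Optional[str]:
    """Post-mapping step: a failure whose run concluded 'cancelled' becomes cancelled."""
    if mapped == "failure" and run_conclusion == "cancelled":
        return "cancelled"
    return mapped
```

Keeping the derivation as a separate function mirrors the statement above that the mapping table itself does not change.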
5.4 Cursor shape (opaque to the backend)
Base64 of compact JSON, forward-only, well under the 8 KiB limit.
Normal / post-backfill shape:
```json
{ "repos": { "acme/api": { "since": "2026-05-28T10:14:02Z" }, "acme/web": { "since": "2026-05-28T09:50:00Z" } } }
```
Mid-backfill shape (backfill section present while in progress):
```json
{
  "repos": { "acme/api": { "since": "2026-05-28T10:14:02Z" } },
  "backfill": {
    "acme/web": { "anchor": "2026-05-28T12:00:00Z", "done_envs": ["dev", "staging"] }
  }
}
```
- repos[repo].since = high-water mark on status.created_at. Set only on backfill completion or normal poll advance. Never set mid-backfill.
- backfill[repo].anchor = UTC timestamp when this repo's backfill pass started. Stable across resumes (prevents scan-window drift on restart).
- backfill[repo].done_envs = list of environment names whose per-env scan is complete and emitted. Used to skip already-processed envs on resume.
- backfill key absent = no backfill in progress (old cursors decode safely with empty backfill).
- First run (cursor null): since = now − INITIAL_LOOKBACK (F7).
- ETags cached for the live poll (per-repo deployment list + per-deployment statuses) to short-circuit unchanged pages with 304 (F8); see §5.5.2.
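
A minimal sketch of the encode/decode round trip, in Python for illustration. `encode_cursor` and `decode_cursor` are hypothetical names; the standard Base64 alphabet and applying the 8 KiB check to the encoded string are assumptions, since the text above fixes only "Base64 of compact JSON".

```python
import base64
import json
from typing import Optional

MAX_CURSOR_BYTES = 8 * 1024  # the 8 KiB limit mentioned above


def encode_cursor(state: dict) -> str:
    """Serialise to compact JSON (no whitespace), then Base64."""
    raw = json.dumps(state, separators=(",", ":"), sort_keys=True).encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    if len(encoded) > MAX_CURSOR_BYTES:
        raise ValueError("cursor exceeds 8 KiB")
    return encoded


def decode_cursor(cursor: Optional[str]) -> dict:
    """None (first run) and old cursors without a backfill key both decode safely."""
    state = json.loads(base64.b64decode(cursor)) if cursor else {}
    state.setdefault("repos", {})
    state.setdefault("backfill", {})
    return state


old_style = encode_cursor({"repos": {"acme/api": {"since": "2026-05-28T10:14:02Z"}}})
decoded = decode_cursor(old_style)   # "backfill" is filled in as an empty mapping
```

Defaulting the missing backfill key on decode is what lets cursors written before a backfill existed keep working unchanged.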
5.5 Resilience (inside the adapter)
- GitHub 5xx / transport error → throw; orchestrator keeps the old cursor and retries next interval.
- 403/429 with rate-limit headers → honour Retry-After / X-RateLimit-Reset, back off.
- 304 Not Modified → no events, cursor unchanged.
- Workflow run or file fetch non-2xx, YAML parse error, or missing target_url → parent_deployments = [] for the affected events; never throw / never block ingest (F10).
- Artifact list or download non-2xx, or artifact name not found → version = null; never throw / never block ingest (F15).
5.5.1 Poll efficiency: terminal deployment skip
PollRepoAsync maintains a bounded instance cache (deploymentId → runId?, cap 2 000, LRU eviction) across poll cycles. Terminal GitHub states: success, failure, error, inactive.
| Condition | Behaviour |
|---|---|
| deployment.Id in terminal cache | Skip GET /deployments/{id}/statuses entirely. Still include the deployment in the envToDeploymentId map (§5.6.4) using the cached runId so parent edges remain resolvable. Contributes no new events. |
| Not in cache | Fetch statuses as normal. After fetch: select the status with the maximum created_at as the latest (the endpoint's array ordering is not guaranteed); if that status is terminal, record deploymentId → runId in the cache. |
| First appearance of any deployment.Id | Always fetched (id never in cache). |
| Non-terminal latest status | NOT cached; re-fetched every cycle until terminal. |

The parent-derivation map (§5.6.4) is built from freshly-fetched deployments ∪ cached-terminal deployments in the window. This preserves cross-environment parent edges: a staging deployment that went terminal in cycle N is still present in the map in cycle N+1, allowing a production deployment in the same run to resolve its parent correctly.
Scope: live poll only. Backfill is unchanged.
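
The terminal cache can be pictured as a bounded map with least-recently-used eviction plus a "latest status by created_at" check. The Python sketch below is illustrative only: `TerminalCache` and `latest_is_terminal` are hypothetical names, and comparing created_at as strings assumes uniformly formatted ISO-8601 UTC timestamps.

```python
from collections import OrderedDict
from typing import Optional

TERMINAL_STATES = {"success", "failure", "error", "inactive"}


class TerminalCache:
    """Bounded deploymentId -> runId map with least-recently-used eviction."""

    def __init__(self, capacity: int = 2000) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[int, Optional[int]]" = OrderedDict()

    def __contains__(self, deployment_id: int) -> bool:
        return deployment_id in self._items

    def run_id(self, deployment_id: int) -> Optional[int]:
        self._items.move_to_end(deployment_id)   # mark as recently used
        return self._items[deployment_id]

    def record(self, deployment_id: int, run_id: Optional[int]) -> None:
        self._items[deployment_id] = run_id
        self._items.move_to_end(deployment_id)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)      # evict the least recently used entry


def latest_is_terminal(statuses: list) -> bool:
    """Pick the latest status by created_at, not by array position."""
    if not statuses:
        return False
    latest = max(statuses, key=lambda s: s["created_at"])
    return latest["state"] in TERMINAL_STATES


cache = TerminalCache(capacity=2)
cache.record(1, 900)
cache.record(2, 901)
cache.run_id(1)          # touching 1 makes 2 the eviction candidate
cache.record(3, 902)     # over capacity: 2 is evicted
```

A deployment evicted from the cache is simply fetched again on the next cycle, so eviction costs one extra request and never affects correctness.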
5.5.2 Poll efficiency: conditional requests (ETag)
Scope: live poll only (backfill unchanged). Applies to two endpoints per repo per cycle:
- GET /repos/{owner}/{repo}/deployments (the per-repo deployment list).
- GET /repos/{owner}/{repo}/deployments/{id}/statuses (per non-terminal deployment).

Mechanism (GithubClient.GetPagedConditionalAsync<T>). If-None-Match is sent on page 1 only. A page-1 304 means the whole list is unchanged: GitHub returns items newest-first, so any new item would change page 1. A 304 is free: GitHub does not charge it against the quota (X-RateLimit-Remaining is unchanged), so it is NOT counted by the fetcher's own-request budget counter (own_used). The budget still processes the X-RateLimit-Reset / X-RateLimit-Limit / X-RateLimit-Remaining headers unconditionally so the snapshot stays current.
Early-stop at the cutoff window (deployments list only). The deployments-list fetch passes a stopBefore predicate (d.CreatedAt < cutoff) to GetPagedConditionalAsync. GitHub returns deployments newest-first, so once a deployment older than cutoff is encountered the pager stops immediately: that item and all subsequent items on the same page are excluded, and no further pages are requested. This mirrors the bounded scan behaviour of backfill (§5.8.2) and prevents a page-1 change on a large repo from paging through the full deployment history to reach the cutoff.
Instance caches (both persist across cycles; adapter is a DI singleton):
- _deploymentsListCache: per-repo (etag, windowed deployments snapshot). Capacity 64 entries, LRU eviction.
- _statusEtagCache: per-deployment (etag, runId?) for in-flight (non-terminal) deployments. Capacity 2 000 entries, LRU eviction.
Conditions and behaviour:
| Condition | Behaviour |
|---|---|
| Deployments list 304 | Reuse cached windowed snapshot. Per-deployment status checks still run normally; a list 304 never skips status re-checks. |
| Deployments list 200 | Pager stops at the cutoff (early-stop, newest-first); result is already windowed. Cache when an ETag header is present. |
| Deployment in terminal cache | Skip GET /deployments/{id}/statuses entirely (§5.5.1); terminal-skip wins, so the conditional path never runs for it. |
| Non-terminal deployment statuses 304 | Reuse cached runId for the env→deploymentId map (§5.6.4); emit no events (list is byte-identical and the cursor has advanced past every cached status's created_at). Deployment stays eligible for future conditional fetches; it is not promoted to terminal. |
| Non-terminal deployment statuses 200 | Process statuses normally; store new ETag + extracted runId in _statusEtagCache. If the status with the maximum created_at is terminal, also record in the terminal cache (§5.5.1). |

Graceful degradation. When the server omits the ETag header on a 200 response (e.g. the github-emulator), nothing is cached and every subsequent cycle is a normal unconditional fetch; correctness is unaffected.
Interplay with §5.5.1. Both terminal-skip and ETag-304 populate the same reusedRunIds map, which feeds the envToDeploymentId build in §5.6.4. Cross-cycle and cross-environment parent edges are preserved regardless of which path suppressed the status re-fetch.
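
The page-1 conditional request and the early-stop rule can be sketched together. This Python sketch is an illustration of the behaviour described above, not GithubClient.GetPagedConditionalAsync: the `fetch_page` callback shape, `get_paged_conditional`, the in-memory `PAGES` fake and the integer created_at values are all assumptions made for the demo.

```python
from typing import Callable, List, Optional, Tuple

# A page fetcher returns (status, etag, items); hypothetical shape for this sketch.
Page = Tuple[int, Optional[str], list]


def get_paged_conditional(
    fetch_page: Callable[[int, Optional[str]], Page],
    cached_etag: Optional[str],
    stop_before: Optional[Callable[[dict], bool]] = None,
) -> Tuple[bool, Optional[str], List[dict]]:
    """Returns (not_modified, etag, items). If-None-Match goes on page 1 only."""
    items: List[dict] = []
    page, etag = 1, None
    while True:
        status, page_etag, page_items = fetch_page(page, cached_etag if page == 1 else None)
        if page == 1:
            if status == 304:
                return True, cached_etag, []      # caller reuses its cached snapshot
            etag = page_etag
        if not page_items:
            return False, etag, items             # ran out of pages
        for item in page_items:
            if stop_before is not None and stop_before(item):
                return False, etag, items         # drop this item and everything after it
            items.append(item)
        page += 1


# Demo with an in-memory fake of a newest-first, paged endpoint.
PAGES = {1: [{"created_at": 5}, {"created_at": 4}],
         2: [{"created_at": 3}, {"created_at": 1}],
         3: [{"created_at": 0}]}
requested = []


def fake_fetch_page(page: int, if_none_match: Optional[str]) -> Page:
    requested.append(page)
    if page == 1 and if_none_match == "v2":
        return 304, None, []
    return 200, ("v2" if page == 1 else None), PAGES.get(page, [])


changed = get_paged_conditional(fake_fetch_page, "v1", lambda d: d["created_at"] < 3)
unchanged = get_paged_conditional(fake_fetch_page, "v2")
```

In the demo the stale ETag run stops on page 2 without ever requesting page 3, and the matching ETag run costs a single request.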
5.6 Parent deployment derivation (F10)
Populates parent_deployments by reconstructing the deployment-job subgraph from the workflow YAML. Runs inside FetchAsync before the event batch is returned; all events for the same poll window are resolved together.
5.6.1 run_id extraction
For every deployment status, extract run_id from status.target_url via pattern /actions/runs/(\d+). If absent or no match → parent_deployments = [] for that event, skip §5.6.2 to §5.6.5.
5.6.2 Workflow graph fetch and parse (F11: LRU-cached per (repo, run_id))
| Step | Call | Use |
|---|---|---|
| 1 | GET /repos/{owner}/{repo}/actions/runs/{run_id} | obtain path (e.g. .github/workflows/deploy.yml) and head_sha; name (run display name) is used only as a last-resort fallback if the YAML name: field is absent |
| 2 | GET /repos/{owner}/{repo}/contents/{path}?ref={head_sha} | Base64-decode content → workflow YAML; parse top-level name: field → service identity (F2 / F12) |

Service identity comes from the YAML name: field (the workflow's static definition name), not run.Name (which can be overridden by run-name: and changes per run). When the YAML name: field is absent, the parser falls back to run.Name; if that is also absent, the repo short name.
Parse the jobs: map. Normalise per-job fields:
| YAML field | Input form | Normalise to |
|---|---|---|
| environment | "prod" | "prod" |
| environment | {name: "prod", url: "…"} | "prod" |
| needs | "build" | ["build"] |
| needs | ["build", "test"] | ["build", "test"] |
| needs | absent | [] |

Deployment jobs = jobs where environment is non-null after normalisation.
Non-2xx on either call or YAML parse error → parent_deployments = [] for all events in this run; stop.
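
The normalisation table can be sketched as two small functions over an already-parsed jobs mapping. This is a Python illustration with hypothetical names (`normalise_environment`, `normalise_needs`, `normalise_jobs`, `deployment_jobs`); it assumes a YAML parser has already turned the workflow file into plain dicts, lists and strings.

```python
from typing import Optional


def normalise_environment(value) -> Optional[str]:
    """'prod' or {'name': 'prod', ...} -> 'prod'; anything else -> None."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        return value.get("name")
    return None


def normalise_needs(value) -> list:
    """'build' -> ['build']; a list stays a list; absent -> []."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def normalise_jobs(jobs: dict) -> dict:
    """Reduce every job to the two fields the parent derivation uses."""
    return {
        job_id: {
            "environment": normalise_environment(job.get("environment")),
            "needs": normalise_needs(job.get("needs")),
        }
        for job_id, job in jobs.items()
    }


def deployment_jobs(normalised: dict) -> set:
    """Deployment jobs are those with a non-null environment after normalisation."""
    return {job_id for job_id, job in normalised.items() if job["environment"] is not None}


jobs = normalise_jobs({
    "build": {},
    "deploy-stg": {"environment": "staging", "needs": "build"},
    "deploy-prod": {"environment": {"name": "prod", "url": "https://example.invalid"},
                    "needs": ["deploy-stg", "test"]},
})
```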
5.6.3 BFS ancestor search
For each deployment job J, find its parent deployment jobs: those reachable upward through needs that are themselves deployment jobs. Non-deployment jobs are transparent (the search continues through them):
```
FindParentDeploymentJobs(J, deploymentJobs, allJobs):
    queue   ← copy of J.needs
    visited ← {}
    parents ← []
    while queue not empty:
        id ← dequeue
        if id ∈ visited: continue
        visited.add(id)
        if id ∈ deploymentJobs:
            parents.add(id)                    // deployment ancestor: do not recurse further
        else if id ∈ allJobs:
            queue.addAll(allJobs[id].needs)    // non-deployment intermediary: look through it
    return parents
```
Not recursing through a found deployment ancestor preserves per-environment direct edges. That ancestor's own parents are derived when its event is processed.
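
A runnable rendering of the pseudocode, in Python for illustration. The job graph (`ALL_JOBS`, `DEPLOYMENT_JOBS`) is a made-up example, and the dict-of-needs representation is an assumption of this sketch.

```python
from collections import deque


def find_parent_deployment_jobs(job_id: str, deployment_jobs: set, all_jobs: dict) -> list:
    """Breadth-first walk up the needs graph, stopping at the first deployment ancestors."""
    queue = deque(all_jobs[job_id]["needs"])
    visited, parents = set(), []
    while queue:
        current = queue.popleft()
        if current in visited:
            continue
        visited.add(current)
        if current in deployment_jobs:
            parents.append(current)                   # deployment ancestor: do not recurse
        elif current in all_jobs:
            queue.extend(all_jobs[current]["needs"])  # intermediary: look through it
    return parents


# Hypothetical workflow: build -> deploy-dev -> test -> deploy-stg -> deploy-prod
ALL_JOBS = {
    "build": {"needs": []},
    "deploy-dev": {"needs": ["build"]},
    "test": {"needs": ["deploy-dev"]},
    "deploy-stg": {"needs": ["test"]},
    "deploy-prod": {"needs": ["deploy-stg"]},
}
DEPLOYMENT_JOBS = {"deploy-dev", "deploy-stg", "deploy-prod"}
```

Here deploy-stg reaches deploy-dev through the non-deployment test job, while deploy-prod reports only deploy-stg because the search does not continue past a deployment ancestor.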
5.6.4 Run-scoped deployment_id lookup
Build envToDeploymentId[run_id][environment] from all deployment objects fetched in the current poll cycle (not only those with new statuses):
- Include deployment D in the map for run_id if any of D's fetched statuses has a target_url matching that run_id.
- Collision (matrix strategy: multiple deployments share (run_id, environment)): keep the one with the latest deployment.created_at.
- Key: D.environment; value: "gh-deploy-{D.id}".

Because all deployments in a single workflow run are created within a short window, they will appear in the same or adjacent poll cycle and be present in the map.
5.6.5 Setting parent_deployments
For each event E (environment ENV, run_id R):
- Find deployment job J where J.environment == ENV. If none → E.parent_deployments = [].
- parentJobs ← FindParentDeploymentJobs(J, …).
- For each P ∈ parentJobs: resolve id ← envToDeploymentId[R][P.environment].
- Omit unresolved entries: a parent deployment not yet observed is a forward reference; the Swimlanes view tolerates dangling parent_deployments values and resolves them at render time.
- E.parent_deployments ← [resolved ids] (unique; order not significant).
5.7 Version resolution (F15)
Determines the version field for a deployment event. Returns null when the source yields nothing; there is no fallback. Only sha truncates (to 7 chars); all other keys are used as-is.
5.7.1 Source types
| Type | Reads | null conditions |
|---|---|---|
| attribute | deployment.<key>; sha key truncated to 7 chars; all other attributes used as-is | attribute absent or null on the deployment object |
| payload | deployment.payload.<key> (payload is free-form JSON) | payload absent, not a JSON object, or field absent/null |
| artifact | plain-text content of the GitHub Actions artifact archive named <key> | run_id absent (non-Actions deployment), artifact not found, list or download non-2xx |

5.7.2 Artifact resolution steps (type = artifact only)
- Extract run_id from status.target_url (§5.6.1). If absent → version = null.
- GET /repos/{owner}/{repo}/actions/runs/{run_id}/artifacts → find artifact where name == <key>.
- If not found → version = null.
- GET /repos/{owner}/{repo}/actions/artifacts/{artifact_id}/zip → download archive.
- Extract the single file; trim whitespace → version.
- Non-2xx on either call → version = null.

Artifact content is LRU-cached per (repo, run_id, artifact_name) alongside the workflow graph cache (F11, same ≤ 200-entry bound). Artifact archives are immutable once uploaded, so no invalidation is needed.
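
The three source types can be sketched without any network calls. This Python illustration uses hypothetical names (`parse_version_source`, `resolve_attribute_or_payload`, `version_from_artifact_zip`); treating a corrupt archive, an archive with more than one file, or empty content as null are assumptions in the spirit of "never block ingest", not rules stated above.

```python
import io
import zipfile
from typing import Optional, Tuple


def parse_version_source(spec: str = "attribute:sha") -> Tuple[str, str]:
    """Split a 'type:key' setting into its two parts."""
    kind, _, key = spec.partition(":")
    return kind, key


def resolve_attribute_or_payload(deployment: dict, kind: str, key: str) -> Optional[str]:
    """attribute reads deployment[key]; payload reads deployment['payload'][key]."""
    if kind == "attribute":
        value = deployment.get(key)
        if value is None:
            return None
        value = str(value)
        return value[:7] if key == "sha" else value   # only sha is truncated
    if kind == "payload":
        payload = deployment.get("payload")
        if not isinstance(payload, dict):
            return None                               # absent or not a JSON object
        value = payload.get(key)
        return None if value is None else str(value)
    return None


def version_from_artifact_zip(archive: bytes) -> Optional[str]:
    """Read the single file of a downloaded artifact archive and trim whitespace."""
    try:
        with zipfile.ZipFile(io.BytesIO(archive)) as zf:
            names = zf.namelist()
            if len(names) != 1:
                return None                           # assumption: exactly one file expected
            return zf.read(names[0]).decode("utf-8").strip() or None
    except (zipfile.BadZipFile, UnicodeDecodeError):
        return None


# Build a tiny archive in memory to exercise the artifact path.
_buf = io.BytesIO()
with zipfile.ZipFile(_buf, "w") as _zf:
    _zf.writestr("version.txt", " 1.4.2\n")
sample_archive = _buf.getvalue()
```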
5.8 Backfill (F13, F14)
Fills the store with the most recent BACKFILL_DEPTH status events per (service, environment) slot (F13) on first run or explicit reset. Chunked and resumable: the orchestrator persists the cursor after each chunk, so a mid-backfill crash restarts from the last completed env rather than from scratch.
5.8.1 Trigger and lifecycle
| Condition | Behaviour |
|---|---|
| Cursor null (adapter's GET /api/fetcher/state returned 404) | Backfill runs automatically in place of the normal first-run empty-window. |
| BACKFILL=true env var set | Backfill runs unconditionally, regardless of existing cursor. Existing cursor is overwritten on completion. |
| Cursor present AND backfill section non-empty | Resume in-progress backfill from last persisted chunk (see §5.8.2). |
| Normal run (cursor present, BACKFILL unset, no backfill section) | Backfill skipped entirely. |
5.8.2 Per-repo procedure (chunked + resumable)
Chunk granularity: one FetchResult chunk per (repo, env) that passes depth-scan, plus a zero-event completion chunk per repo that finalises the cursor.
```
depth       ← BACKFILL_DEPTH (default 2)
StallWindow ← 20   // consecutive no-progress deployments before stopping
anchor      ← incoming.BackfillFor(repo)?.anchor ?? UtcNow       // stable across resumes
cutoff      ← anchor − BACKFILL_MAX_AGE
doneEnvs    ← incoming.BackfillFor(repo)?.done_envs ?? []        // skip on resume

envs ← GET /repos/{owner}/{repo}/environments → [env.name]
       filter: env ∉ doneEnvs   // resume: skip completed envs

for each remaining env E:
    // Pass 1: collect candidates for this env (depth, no-progress, defer-YAML as before).
    // …(same per-env scan as before)…
    // Pass 2: build + trim events (same trimming logic as before).
    // Parent derivation uses the per-repo envToDeploymentId map accumulated so far
    // (deployments from envs processed earlier in this repo are already in the map).
    runningCursor ← runningCursor.WithBackfillEnvDone(repo, anchor, E)
    yield FetchResult(envEvents sorted oldest-first, runningCursor.Encode())
    // Orchestrator persists cursor here; crash safety: next run skips E

// Zero-event completion marker for this repo:
runningCursor ← runningCursor.WithBackfillComplete(repo, maxSinceForRepo)
//   - sets repos[repo].since = max(status.created_at) of all emitted events
//   - removes backfill[repo] marker
//   - if no events were emitted (empty repo), repos[repo].since is NOT set;
//     next poll falls back to now − INITIAL_LOOKBACK (safe for empty repos)
yield FetchResult([], runningCursor.Encode())
```
- repos[repo].since is set only by WithBackfillComplete (or normal poll). Never advanced mid-backfill: backfill walks newest-first, so an early since would make the next poll skip not-yet-seeded older deployments. The done_envs list is the mid-backfill progress marker.
- Parent-map choice (within-repo edges). The per-repo envToDeploymentId map is accumulated incrementally as each env's deployments are collected. Within-repo parent edges from earlier envs resolve correctly. A parent in a not-yet-processed env is a forward reference (§5.6.5); Swimlanes resolves dangling ids at render time.
- Discarded deployments cost only statuses + run-metadata; the YAML fetch is deferred until a deployment is kept (F13).
5.8.3 Service resolution
ResolveService(workflowName, repo):
    if workflowName ∈ SERVICE_MAP → return SERVICE_MAP[workflowName] // workflow-level key
    if repo ∈ SERVICE_MAP → return SERVICE_MAP[repo] // repo-level key ("owner/repo")
    if workflowName ≠ null → return workflowName // default: YAML name field
    return repo.Split("/").Last() // non-Actions fallback
- Keys without / → workflow-level; keys matching owner/repo → repo-level.
- GitHub workflow names cannot contain / → no key ambiguity.
- workflowName here is the YAML name: field, resolved via path → active-workflow lookup (F2 / F12), NOT the run's display name.
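The lookup order above, written as a small Python function for illustration. The function name and the plain-dict representation of SERVICE_MAP are assumptions; the precedence follows the ResolveService pseudocode.

```python
from typing import Optional


def resolve_service(workflow_name: Optional[str], repo: str, service_map: dict) -> str:
    # 1. Workflow-level key (no "/" in the key).
    if workflow_name is not None and workflow_name in service_map:
        return service_map[workflow_name]
    # 2. Repo-level key ("owner/repo").
    if repo in service_map:
        return service_map[repo]
    # 3. Default: the workflow's YAML name field, unchanged.
    if workflow_name is not None:
        return workflow_name
    # 4. Non-Actions fallback: repo short name.
    return repo.split("/")[-1]
```

With the example map from §6, resolve_service("Deploy Checkout API", "acme/web", m) takes the workflow-level entry, while resolve_service(None, "acme/web", m) falls through to the short name web.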
5.8.4 Rate-limit profile (5 repos × 10 workflows × 4 environments, first page covers all services)
| Call type | Count |
|---|---|
| Workflow + environment discovery | 5 + 5 = 10 |
| Deployment list pages (1 per env per repo) | ~20 |
| Status fetches (one per filled slot max) | ≤ 200 |
| Workflow graph calls (run metadata + YAML) | nearly all absorbed by F11 LRU cache |
5.9 Rate-limit budget (F16)
Discovery (startup)
- If GITHUB_RATE_LIMIT is set → total_limit = GITHUB_RATE_LIMIT.
- Else → GET /rate_limit (same auth headers as §5.1); read resources.core.limit → total_limit.
- On non-2xx or parse error → log warning; total_limit = 5000 (GitHub authenticated PAT default).
- budget = floor(total_limit × GITHUB_RATE_LIMIT_BUDGET_PCT / 100).
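A minimal Python sketch of the discovery steps above. The function names and the fetch_rate_limit callable (standing in for the GET /rate_limit call and JSON parse) are hypothetical; the fallback value and the floor formula come from the list above.

```python
from typing import Callable, Optional

DEFAULT_TOTAL_LIMIT = 5000  # fallback when discovery fails


def discover_total_limit(configured: Optional[int],
                         fetch_rate_limit: Callable[[], dict]) -> int:
    # An explicit GITHUB_RATE_LIMIT wins and skips the discovery call.
    if configured is not None:
        return configured
    try:
        return int(fetch_rate_limit()["resources"]["core"]["limit"])
    except Exception:
        # Non-2xx, transport or parse error: a real implementation would log a warning.
        return DEFAULT_TOTAL_LIMIT


def compute_budget(total_limit: int, budget_pct: int) -> int:
    # Integer floor division is exactly floor(total_limit * pct / 100).
    return total_limit * budget_pct // 100
```

For example, compute_budget(5000, 30) gives 1500, the figure quoted for the default GITHUB_RATE_LIMIT_BUDGET_PCT.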
Per-request enforcement
After every HTTP call to the GitHub API:
- Read X-RateLimit-Reset → reset_at (UTC). The window has rolled over when now ≥ previously-observed reset_at AND the new reset_at is later than the previously observed one → reset own counter to 0. (X-RateLimit-Reset always points to the end of the current window, i.e. always in the future; checking whether the new value is in the past would never fire.)
- Update _resetAt unconditionally (even for 304 responses).
- Increment the fetcher's own request counter, but only for quota-consuming responses (all except 304 Not Modified). A 304 is free (GitHub does not charge it; X-RateLimit-Remaining is unchanged), so counting it would over-report usage and over-throttle against F16's "must not monopolise a shared token" rationale.
- Capture X-RateLimit-Limit / X-RateLimit-Remaining unconditionally for the F18 snapshot.
If own_count ≥ budget:
- wait_until = reset_at + 1 s (margin to let GitHub's counter roll over).
- Log: [RateLimit] budget exhausted (own_count=N/M); sleeping until {wait_until}.
- Pause until wait_until.
- Reset own counter to 0.
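The per-request rules above can be modelled as a small state object. The sketch below is in Python and is an assumption-laden illustration: the class shape, the record / reset_counter names, epoch-second timestamps and a case-sensitive header mapping are all hypothetical simplifications. The caller is expected to sleep until the returned time and then call reset_counter.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitBudget:
    budget: int
    own_count: int = 0
    reset_at: Optional[float] = None      # epoch seconds from X-RateLimit-Reset
    ci_limit: Optional[int] = None
    ci_remaining: Optional[int] = None

    def record(self, status: int, headers: Mapping[str, str], now: float) -> Optional[float]:
        """Record one GitHub response; return wait_until if the budget is exhausted."""
        raw_reset = headers.get("X-RateLimit-Reset")
        if raw_reset is not None:
            new_reset = float(raw_reset)
            # Rollover: the old window has ended AND the header moved forward.
            if self.reset_at is not None and now >= self.reset_at and new_reset > self.reset_at:
                self.own_count = 0
            self.reset_at = new_reset     # updated unconditionally, even for 304
        if status != 304:                 # a 304 consumes no quota
            self.own_count += 1
        if "X-RateLimit-Limit" in headers:
            self.ci_limit = int(headers["X-RateLimit-Limit"])
        if "X-RateLimit-Remaining" in headers:
            self.ci_remaining = int(headers["X-RateLimit-Remaining"])
        if self.own_count >= self.budget and self.reset_at is not None:
            return self.reset_at + 1.0    # 1 s margin past the window end
        return None

    def reset_counter(self) -> None:
        # Called by the caller after it has slept until wait_until.
        self.own_count = 0
```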
Notes
- The own counter tracks this fetcher process's calls only; X-RateLimit-Used (cumulative across all token consumers) is deliberately NOT used for the budget check. A token already partly used by other consumers does not trigger an immediate pause.
- X-RateLimit-Reset is still read from response headers to determine sleep duration.
- total_limit is constant for the process lifetime; PAT limits do not change without token rotation.
- Budget enforcement applies uniformly: backfill and normal poll share the same counter.
- GET /rate_limit is called once, at startup only. GitHub documents this endpoint as not counting against the REST API rate limit.
- Existing 403 / 429 + Retry-After handling (§5.5) remains the last-resort fallback for unexpected limit hits.
5.10 Control-plane participation (F17)
The fetcher joins the reset choreography as the dashboard-fetcher participant. Visual reference: reset-choreography.md. Contract source: api-guidelines.md §11 + API_SPECIFICATION.md §5/§7. The fetcher only consumes this contract; no backend change (F1).
5.10.1 Component identity
- Component id = dashboard-fetcher (fixed; matches the API's default ExpectedComponents, so the orchestrator's ack fan-in counts this component).
- Sent as X-Component-Id: dashboard-fetcher on every POST /api/control/events.
- Configurable via COMPONENT_ID (default dashboard-fetcher); the default MUST NOT be changed without also changing the API's ExpectedComponents, or the orchestrator will time out waiting for an ack that never matches.
5.10.2 Subscriber
The subscriber is only started when CONTROL_API_KEY is non-empty. When the key is absent the listener is never registered; the poll loop (FetcherWorker) still runs as normal. A single startup log message records the absence.
A second long-lived task (alongside the poll loop) holds an open control stream:
| Property | Value |
|---|---|
| Request | GET /api/control/stream?component=dashboard-fetcher |
| Auth | X-Control-API-Key: <CONTROL_API_KEY> (distinct from API_KEY; new config key) |
| HTTP client | HttpClient streaming the response body (ResponseHeadersRead); not EventSource, because custom headers are required |
| Heartbeat | server emits : ping every 15 s → treat as liveness; reset the read-idle timer, no other action |
| Reconnect | on drop, reconnect with Last-Event-ID: <last-seen-event-id> and exponential backoff (1 s → 2 s → 4 s … capped at 30 s); backoff resets to 1 s after a successful connect |
| Unknown event: | no-op (forward-compat; new orchestration types may appear) |
| Filter scope | server delivers only component == dashboard-fetcher OR component == "*"; all three reset events are * |
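To make the heartbeat and reconnect rows concrete, here is a Python sketch of the two pure pieces: the backoff schedule and a line-level parser for the event stream. Both function names are hypothetical, and the parser covers only the id, event and data fields plus comment lines; it is not the fetcher's HttpClient-based reader.

```python
from typing import Iterable, Iterator, Optional


def backoff_delays(base: float = 1.0, cap: float = 30.0) -> Iterator[float]:
    # 1 s, 2 s, 4 s, ... capped at 30 s. Create a fresh generator after a
    # successful connect to reset the delay to the base value.
    delay = base
    while True:
        yield delay
        delay = min(delay * 2, cap)


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {"kind": "ping"} for comment lines and one dict per dispatched event."""
    last_id: Optional[str] = None   # persists across events; sent as Last-Event-ID
    event_type = "message"
    data: list = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line ends a frame; frames without data are not dispatched.
            if data:
                yield {"kind": "event", "id": last_id, "type": event_type,
                       "data": "\n".join(data)}
            event_type, data = "message", []
            continue
        if line.startswith(":"):
            yield {"kind": "ping"}  # heartbeat: reset the read-idle timer only
            continue
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if name == "id":
            last_id = value
        elif name == "event":
            event_type = value
        elif name == "data":
            data.append(value)
```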
5.10.3 Event handling
| Event | Fetcher action |
|---|---|
| reset-initiated | 1. Pause the poll loop + any in-flight ingestion (stop the FetchAsync → POST /api/deployments → cursor-PUT cycle; let the current POST finish, then hold). 2. POST /api/control/events reset-ack (§5.10.4). |
| reset-started | No action. The fetcher already paused on reset-initiated; do not add redundant handling. (The API briefly returns 503 on ingest here; the paused fetcher never sees it.) |
| reset-completed | Recover (§5.10.5): drop the in-memory cursor, resume, and report running. |
| (unknown type) | No-op (forward-compat). |
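The table above amounts to a three-way dispatch. A Python sketch under stated assumptions: the ResetHandler class, its outbox list (standing in for POST /api/control/events) and the tuple layout are hypothetical.

```python
from typing import Optional


class ResetHandler:
    def __init__(self, cursor: Optional[str] = None) -> None:
        self.paused = False
        self.cursor = cursor      # in-memory opaque cursor
        self.outbox: list = []    # (event_type, state, correlation_id) to be posted

    def handle(self, event_type: str, correlation_id: str) -> None:
        if event_type == "reset-initiated":
            # Pause first, then acknowledge.
            self.paused = True
            self.outbox.append(("reset-ack", "paused", correlation_id))
        elif event_type == "reset-completed":
            # Drop the cursor (no PUT), resume, then report running.
            self.cursor = None
            self.paused = False
            self.outbox.append(("status", "running", correlation_id))
        # reset-started and unknown types: deliberately no action.
```

Because reset-started falls through to the no-op branch, the handler cannot double-pause or emit a second ack for the same cycle.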
5.10.4 Ack on reset-initiated
POST /api/control/events:
| Part | Value |
|---|---|
| Headers | X-Api-Key: <API_KEY>, X-Component-Id: dashboard-fetcher, X-Correlation-Id: <reset-initiated event id> (required), Content-Type: application/json; charset=utf-8 |
| Body | { "event_type": "reset-ack", "state": "paused", "occurred_at": "<now UTC RFC 3339>" } |
- X-Correlation-Id = the id of the received reset-initiated event (the received frame's correlation_id, which at the origin equals its own id). This IS the ack-gate key: the orchestrator correlates the ack to the in-flight cycle by this value. There is no payload.reset_id body field. A missing/invalid X-Correlation-Id is recorded but does not count toward the gate.
- Expected response 204. Treat 4xx / 5xx / transport error as non-fatal: log, stay paused, await reset-completed (the orchestrator proceeds on AckTimeoutSeconds regardless; the reset is not blocked by a lost ack).
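For illustration, the ack request can be assembled as below. This Python sketch only builds headers and body; the function name and signature are hypothetical, and the actual send plus the non-fatal error handling are left to the caller.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_reset_ack(api_key: str, correlation_id: str,
                    component_id: str = "dashboard-fetcher",
                    now: Optional[datetime] = None):
    # occurred_at is RFC 3339 in UTC with a "Z" suffix.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    headers = {
        "X-Api-Key": api_key,
        "X-Component-Id": component_id,
        # Ack-gate key: the id of the received reset-initiated event.
        "X-Correlation-Id": correlation_id,
        "Content-Type": "application/json; charset=utf-8",
    }
    body = {
        "event_type": "reset-ack",
        "state": "paused",
        "occurred_at": moment.isoformat().replace("+00:00", "Z"),
    }
    return headers, json.dumps(body)
```

Note that the correlation value travels only in the header; the body carries no reset id field.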
5.10.5 Recovery on reset-completed
- Drop the in-memory cursor (set to null). Do not PUT a cursor.
- Resume the poll loop.
- The next iteration calls GET /api/fetcher/state/{adapter}. Because the API cleared fetcher_state during the reset window (API_SPECIFICATION §5/§7), this returns 404 → null cursor.
- A null cursor is exactly the backfill trigger (F14, §5.8.1): the runner performs the bounded backfill (F13) as the initial ingestion, advances the cursor to max(status.created_at), then normal polling continues.
- After the poll loop has resumed, POST /api/control/events a status event (reuse the existing status type, not a new type):
| Part | Value |
|---|---|
| Headers | X-Api-Key, X-Component-Id: dashboard-fetcher, X-Correlation-Id: <reset-completed correlation_id> (optional, recommended: correlates recovery to the same process), Content-Type |
| Body | { "event_type": "status", "state": "running", "occurred_at": "<now UTC>" } |
The reset linkage to backfill is implicit by design: the fetcher does not call a "backfill" API; it simply drops the cursor and lets the existing F14 null-cursor path do the work. This keeps the reset handler tiny and reuses the tested backfill flow.
5.10.6 Resilience and self-heal
| Scenario | Behaviour |
|---|---|
| Subscriber connection drops mid-cycle | Reconnect with Last-Event-ID; the server replays any missed events (including a missed reset-completed) within the 2 h window, so recovery still fires. |
| Fetcher down for the entire reset cycle | On next startup the poll loop sees an empty store + 404 cursor and backfills anyway (F14). No event is needed; the reset self-heals via the same null-cursor path. |
| Ack POST fails | Stay paused; orchestrator proceeds on AckTimeoutSeconds. Recovery still triggers on the eventual reset-completed. |
| reset-completed arrives while already running (duplicate/replay) | Idempotent: dropping an already-advanced cursor and re-checking state at worst re-backfills the most-recent slot per (service, environment); duplicates are acceptable (F5, append-only). |
5.11 Per-cycle rate-limit reporting (F18)
After each successful poll cycle, when a RateLimitSnapshot is available, the fetcher posts a rate-limit component event to the existing POST /api/control/events surface. See api/api-guidelines.md §11 "Rate-limit report payload" and diagrams/fetcher-rate-limit.md.
Multi-adapter note. With multiple adapters each adapter emits its own rate-limit event carrying a distinct payload.adapter value under the shared component_id. Consumers must key on payload.adapter, not on component_id, to distinguish per-adapter counters.
Trigger and gate
| Condition | Behaviour |
|---|---|
| Snapshot present after PollOnceAsync | Post rate-limit event immediately. |
| Snapshot null (before first GitHub response) | Skip; no all-null reports. |
| CONTROL_API_KEY absent | Not a gate: the report uses X-Api-Key (same as ingest); always active when API_KEY is present. |
Extended RateLimitSnapshot
RateLimitSnapshot carries two additional fields populated from GitHub response headers after each call (GithubClient → RateLimitBudget.RecordAndWaitIfNeededAsync):
| Field | Source | Null when |
|---|---|---|
| CiLimit | X-RateLimit-Limit | Before first GitHub response. |
| CiRemaining | X-RateLimit-Remaining | Before first GitHub response. |
Existing fields (Used, Budget, ResetAt) are unchanged.
Payload mapping
The payload object maps the snapshot to the api-guidelines rate-limit contract:
| Payload field | Source |
|---|---|
| adapter | ICiCdAdapter.AdapterId (e.g. github-actions) |
| ci_limit | snapshot.CiLimit (null when not yet received) |
| ci_remaining | snapshot.CiRemaining (null when not yet received) |
| own_budget | snapshot.Budget |
| own_used | snapshot.Used |
| reset_at | snapshot.ResetAt; serialised as RFC 3339 UTC; null when ResetAt == DateTimeOffset.MinValue |
state = "running" normally. The delegate closure in DI supplies the adapter id and state; PollLoop itself remains free of the Control namespace dependency.
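The field mapping can be illustrated with a short Python sketch. The RateLimitSnapshot dataclass here is a hypothetical stand-in for the .NET type, and MIN_VALUE is an assumed Python equivalent of DateTimeOffset.MinValue.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Assumed stand-in for DateTimeOffset.MinValue.
MIN_VALUE = datetime(1, 1, 1, tzinfo=timezone.utc)


@dataclass
class RateLimitSnapshot:
    used: int
    budget: int
    reset_at: datetime = MIN_VALUE
    ci_limit: Optional[int] = None
    ci_remaining: Optional[int] = None


def to_payload(adapter_id: str, snap: RateLimitSnapshot) -> dict:
    # reset_at is null while no reset time has been observed.
    if snap.reset_at == MIN_VALUE:
        reset_at = None
    else:
        reset_at = snap.reset_at.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
    return {
        "adapter": adapter_id,
        "ci_limit": snap.ci_limit,
        "ci_remaining": snap.ci_remaining,
        "own_budget": snap.budget,
        "own_used": snap.used,
        "reset_at": reset_at,
    }
```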
Resilience
Non-fatal. Transport errors and non-2xx responses are logged at Warning level and swallowed. The poll loop continues regardless. This mirrors PostAckAsync / PostRunningAsync behaviour (§5.10.4, §5.10.5).
Wiring (no Orchestration → Control dependency)
PollLoop accepts an optional Func<RateLimitSnapshot, CancellationToken, Task>? reportCycleAsync parameter. Program.cs DI wires it to IComponentEventClient.PostRateLimitAsync, closing over the adapter id. This preserves the existing dependency direction: Control → Orchestration, never the reverse.
6. Configuration (env)
| Var | Example | Purpose |
|---|---|---|
| DASHBOARD_API_BASE_URL | http://gateway:8080 | where to POST events + read/write state + open the control stream |
| API_KEY | (secret) | X-Api-Key for ingest + state + POST /api/control/events |
| CONTROL_API_KEY | (secret) | X-Control-API-Key for the control stream subscription (GET /api/control/stream); distinct from API_KEY (§5.10.2) |
| COMPONENT_ID | dashboard-fetcher | X-Component-Id on component-event posts; MUST match the API's ExpectedComponents (§5.10.1) |
| POLL_INTERVAL_SECONDS | 30 | loop cadence (integration uses 1) |
| INITIAL_LOOKBACK | 7.00:00:00 | normal poll first-run window (F7); also the default for BACKFILL_MAX_AGE when unset |
| BACKFILL | false | set true to force a backfill run regardless of cursor state (F14) |
| BACKFILL_MAX_AGE | 30.00:00:00 | how far back backfill scans per environment; defaults to INITIAL_LOOKBACK |
| BACKFILL_DEPTH | 2 | number of latest status events to seed per (service, environment) slot during backfill (F13); default 2 |
| GITHUB_BASE_URL | https://api.github.com | overridable for the integration mock |
| GITHUB_TOKEN | (secret) | PAT / GitHub App token |
| GITHUB_REPOS | acme/api,acme/web | repos to poll |
| GITHUB_SERVICE_MAP | Deploy Checkout API=checkout-api,acme/api=api | optional overrides; key without / = workflow-level, key with / = repo-level (§5.8.3) |
| GITHUB_VERSION_SOURCE | attribute:sha | one of attribute:<attr>, payload:<field>, artifact:<filename>; see §5.7 |
| GITHUB_RATE_LIMIT | (unset) | Total hourly request quota for the token. Unset = discovered via GET /rate_limit on startup; discovery failure → 5 000. |
| GITHUB_RATE_LIMIT_BUDGET_PCT | 30 | Percentage of the quota the fetcher may consume per hour (1 to 100). Default 30 (e.g. 1 500 of 5 000). |
Explicit-binding vars. All vars in this table are read explicitly by name, in FetcherOptionsEnv.ApplyEnvOverrides (for the fetcher vars) and GithubAdapterOptionsEnv.ApplyEnvOverrides (for the GITHUB_* vars). The appsettings GitHub section provides base values; GITHUB_* env vars override it. A missing or unparseable value leaves the property at its default without throwing.
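As an illustration of the "unparseable value keeps the default" rule, the Python sketch below parses two of the table's value formats. It is not the ApplyEnvOverrides implementation: both function names are hypothetical, the duration parser handles only the [d.]hh:mm:ss shape shown in the examples, and the key=value,key=value splitting is inferred from the GITHUB_SERVICE_MAP example.

```python
import re
from datetime import timedelta
from typing import Optional

# Matches "7.00:00:00" (days.hh:mm:ss) and "01:30:00" (hh:mm:ss).
_TIMESPAN = re.compile(r"^(?:(\d+)\.)?(\d{1,2}):(\d{2}):(\d{2})$")


def parse_timespan(raw: Optional[str], default: Optional[timedelta]) -> Optional[timedelta]:
    match = _TIMESPAN.match(raw.strip()) if raw else None
    if match is None:
        return default  # missing or unparseable: keep the default, never throw
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def parse_service_map(raw: Optional[str]) -> dict:
    result = {}
    for pair in (raw or "").split(","):
        key, sep, value = pair.partition("=")
        # Skip malformed entries instead of failing the whole map.
        if sep and key.strip() and value.strip():
            result[key.strip()] = value.strip()
    return result
```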
Health endpoint port. The GET /health listener uses the standard ASP.NET ASPNETCORE_URLS environment variable (e.g. http://+:8080). Default container port is 8080; the demo driver's FETCHER_URL (DEMO_DRIVER_SPEC §9) must match.
Demo mode. Set GITHUB_BASE_URL=http://github-emulator:3100 (the github-emulator service; see GITHUB_EMULATOR_SPECIFICATION.md) and GITHUB_TOKEN to any placeholder value (the emulator does not validate it). No other fetcher config change is needed.
6.1 Functional readiness: GET /readyz
Reflects actual GitHub poll-cycle health. Distinct from the liveness /health which is always 200.
Response shape:
{
  "status": "ready" | "degraded",
  "github": {
    "reachable": true | false,
    "last_outcome": "ok" | "auth_failed" | "rate_limited" | "error" | null,
    "last_success_at": "<RFC 3339 UTC>" | null,
    "last_error": "<string>" | null,
    "paused_for_reset": false,
    "rate_limit": {
      "used": 150,
      "budget": 1500,
      "reset_at": "<RFC 3339 UTC>",
      "ci_limit": 5000,
      "ci_remaining": 4830
    } | null
  }
}
Status codes:
| Condition | HTTP | status |
|---|---|---|
| Last outcome is ok | 200 | ready |
| Last outcome is rate_limited or never polled | 200 | degraded |
| Paused for reset (any prior outcome) | 200 | ready or degraded per outcome |
| Last outcome is auth_failed or error AND NOT paused | 503 | degraded |
Paused-for-reset is healthy. A loop paused during the reset choreography (§5.10.3) never produces a 503; paused_for_reset: true signals the expected transient state regardless of the last recorded outcome.
Rate-limit snapshot. rate_limit is populated after the first GitHub HTTP response that carries X-RateLimit-* headers. null before the first response.
Indicator. IFetcherReadinessIndicator / FetcherReadinessIndicator live in Dashboard.Fetcher.Orchestration. PollLoop calls RecordSuccess / RecordAuthFailed / RecordRateLimited / RecordError after every cycle, and SetPausedForReset(true/false) on pause / resume events.
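The status-code table reduces to a two-input decision. A Python sketch, assuming a hypothetical readiness function that takes the last outcome (None when never polled) and the pause flag:

```python
from typing import Optional, Tuple


def readiness(last_outcome: Optional[str], paused_for_reset: bool) -> Tuple[int, str]:
    # Only an "ok" outcome reports ready; everything else is degraded.
    status = "ready" if last_outcome == "ok" else "degraded"
    failing = last_outcome in ("auth_failed", "error")
    # A pause for reset suppresses the 503 regardless of the last outcome.
    http = 503 if failing and not paused_for_reset else 200
    return http, status
```

The two inputs stay orthogonal: pausing changes only the HTTP code, never the reported status string.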
7. Testing
| Layer | Project | Scope |
|---|---|---|
| Unit | Dashboard.Fetcher.Tests | §7.1 |
| Integration | cross-stack suite | §7.2 |
Dashboard.Fetcher.Tests is excluded from the API test run and exercised on the fetcher's own pipeline.
7.1 Unit test cases
Mapping: GitHub JSON fixture → DeploymentEventIngest; status table (§5.3); cursor advance / first-run lookback; orchestrator loop (mock ICiCdAdapter + mock ingest/state clients); at-least-once on mid-batch failure.
Parent derivation: linear chain (dev → staging → prod); parallel branches (two envs with shared root); non-deployment intermediary job (BFS look-through); matrix collision (two deployments same env same run → latest wins); environment as object vs string; needs as string vs array; no matching deployment job (→ []); non-Actions target_url (→ []); workflow fetch non-2xx (→ []); YAML parse error (→ []).
Service resolution: workflow-level SERVICE_MAP hit; repo-level hit; default (workflow name as-is); non-Actions fallback (repo short name).
Version resolution: attribute:sha → 7-char truncation; attribute:ref → value as-is; payload:version → field value; payload field absent → null; payload not a JSON object → null; artifact:version.txt → trimmed file content; artifact not found → null; artifact list non-2xx → null; artifact download non-2xx → null; artifact source + no run_id → null; artifact result LRU-cached for same (repo, run_id, artifact_name).
Backfill: all services covered on first page (early exit); rarely-deployed service found on page 2 (pagination); service not deployed to env within BACKFILL_MAX_AGE (skipped); BACKFILL=true overwrites existing cursor; events posted oldest-first.
Rate-limit budget: GET /rate_limit response → correct total_limit and budget; GITHUB_RATE_LIMIT set → discovery call skipped; GET /rate_limit non-2xx → total_limit = 5000; budget = floor(total_limit × pct / 100) (boundary cases: pct = 1, pct = 100); adapter pauses until reset_at + 1 s when used ≥ budget; internal counter resets to 0 after window rollover; backfill and normal poll share the same budget counter.
Conditional requests (ETag, §5.5.2):
- In-flight statuses 304 across cycles → no event emitted in cycle 2; If-None-Match was sent for the statuses URL.
- Statuses 200 after payload change → new event emitted; statuses endpoint did NOT return 304 (ETag rotated).
- Deployments-list 304 → cached snapshot reused; per-deployment status endpoint still called in cycle 2 (list 304 does not skip status checks).
- Parent edge preserved when staging statuses return 304 in cycle 2 → prod event resolves parent_deployments via the cached runId.
- No ETag from server → no If-None-Match sent on the next cycle; no 304s served (graceful degradation, behaviour identical to unconditional fetch).
- Rate-limit budget does NOT increment the own counter for 304 responses (304 consumes no quota); a 200 does. Rollover bookkeeping and header capture (X-RateLimit-Limit / X-RateLimit-Remaining) remain unconditional.
Control-plane participation (F17, §5.10):
- reset-initiated received → poll loop paused (no further FetchAsync / ingest POST) AND reset-ack posted with headers X-Api-Key + X-Component-Id: dashboard-fetcher + X-Correlation-Id = the reset-initiated event id + Content-Type, body {event_type:reset-ack, state:paused, occurred_at} (no payload.reset_id).
- reset-completed received → in-memory cursor dropped; next GET /api/fetcher/state mock returns 404 → backfill (F14) triggered; status/running event posted afterwards with X-Correlation-Id = the reset-completed correlation_id.
- reset-started received → no ack, no extra POST, poll loop stays paused (asserts no redundant handling).
- Unknown event: type → no-op (no POST, poll loop unaffected).
- Reconnect after a dropped stream sends Last-Event-ID = last seen event id.
- : ping frame → treated as heartbeat, no event dispatched.
- Ack POST returns non-2xx → subscriber stays paused, does not throw, still recovers on subsequent reset-completed.
- Component id overridden via COMPONENT_ID → header reflects the override.
Per-cycle rate-limit reporting (F18, §5.11):
- RateLimitBudget.CiLimit and CiRemaining are null before the first GitHub response; populated from X-RateLimit-Limit / X-RateLimit-Remaining on first response; updated on subsequent responses; remain null when headers are absent.
- PostRateLimitAsync emits body with event_type:"rate-limit", correct state, occurred_at; payload contains adapter, ci_limit, ci_remaining, own_budget, own_used, reset_at; reset_at is null when snapshot ResetAt == DateTimeOffset.MinValue; X-Api-Key and X-Component-Id headers present.
- PostRateLimitAsync non-2xx response → does not throw.
- PostRateLimitAsync transport error → does not throw.
- Per-cycle reportCycleAsync delegate fires once per successful cycle when snapshot is non-null.
- Per-cycle reportCycleAsync delegate NOT invoked when snapshot is null.
- reportCycleAsync throws → loop continues next cycle (non-fatal).
Functional readiness indicator (§6.1):
- Initial state → LastOutcome = null, LastSuccessAt = null, IsPausedForReset = false.
- RecordSuccess → LastOutcome = ok, LastSuccessAt set, LastErrorSummary = null.
- RecordSuccess with snapshot → RateLimit populated; without snapshot → existing snapshot retained.
- ok → auth_failed → ok transition: outcome and error summary follow latest record; success clears error.
- RecordAuthFailed → LastOutcome = auth_failed, summary populated.
- RecordRateLimited → LastOutcome = rate_limited, snapshot and summary populated.
- RecordError → LastOutcome = error, summary populated.
- SetPausedForReset(true) → IsPausedForReset = true; does NOT change LastOutcome (orthogonal flags).
- SetPausedForReset(false) → flag clears.
- Paused while auth_failed → both flags independent; handler applies its own 503 logic.
- PollLoop.Pause() → calls SetPausedForReset(true) on indicator.
- PollLoop.DropCursorAndResume() → calls SetPausedForReset(false) on indicator.
7.2 Integration test cases
The mock GitHub API referenced in this section is the github-emulator service (GITHUB_EMULATOR_SPECIFICATION.md). Integration tests seed it via POST /_github/seed {dataset:"demo"} and run the real fetcher-host against http://github-emulator:3100. See docs/diagrams/github-emulation.md for the topology.
Real fetcher-host against the github-emulator + real Dashboard.Api + Postgres. Asserts:
- Wire shape (FR-06) and opaque-cursor round-trip.
- Populated parent_deployments on a two-environment chain.
- Backfill populates (service, environment) slots correctly.
- NFR-03 latency envelope.
- Full reset cycle (F17, §5.10) against the real API + Postgres: fetcher subscribes to GET /api/control/stream; operator triggers POST /api/control/reset; assert the fetcher (a) receives reset-initiated and posts a reset-ack (paused, correct X-Correlation-Id) visible via GET /api/control/events; (b) on reset-completed drops its cursor, re-backfills against the mock GitHub API after the store + fetcher_state were cleared, and posts a status/running event. Confirms the orchestrator counts the dashboard-fetcher ack and the store is re-populated post-reset.
8. Out of scope
- Horizontal scaling of the fetcher (single replica per adapter; F6).
- Adapters other than GitHub (the abstraction is the deliverable; ADO/Jenkins are future drop-ins).
- Any backend change: the fetcher only consumes the existing public contract.
- Triggering/managing deployments (read-only, SAD §3 Non-Goals).