The dashboard that watches a school district's cloud.
A full-stack TypeScript & SQL Server platform for Azure Local / Azure Stack HCI. It pulls operational truth from several very different planes — report snapshots, live WMI faults, Windows events, Azure alerts — and turns them into one operator surface across five environments, without ever pretending a number came from somewhere it didn't.
The data was scattered across planes that don't agree.
Monitoring Azure Local (renamed from Azure Stack HCI) isn't one feed — it's many, and they have different truth models. Some data comes from scheduled health reports. Some is a live WMI fault that fires and clears in seconds. Some is a firehose of forwarded Windows events. Some is an Azure API you poll. And it all lives across five domains where non-production environments can't always reach production services directly.
The lazy version flattens everything into one table and fakes the joins. This one doesn't. Report-backed data is tied to a real report; live data stays live and reportless. That single decision is what makes the dashboard trustworthy instead of just busy.
PowerShell gathers evidence. SQL Server does the thinking.
Data shaping lives at the database and repository boundary — never as ad-hoc reads from a component.
Seven sources, each with its own honesty.
Report-backed sources carry a real report ID. Live sources are environment-scoped and intentionally reportless.
HCI Fleet Health
REPORT-BACKED: Cluster, node, storage, network, VM, update & service-health snapshots collected inside each environment's domain context.
AVHDX orphan scan
REPORT-BACKED: Finds orphaned differencing-disk files across HCI storage paths; SQL derives age, staleness, and per-server totals.
HCI Health Fault Listener
LIVE: Watches the Health Service WMI class for fault create / modify / resolve events, including synthetic stale-close handling.
Forwarded Windows Events
LIVE: WEF subscriptions feed a watcher with catch-up + live phases, SHA-256 dedupe, and a rolling retention window over 5M+ rows.
Azure Monitor Alerts
LIVE: Polls the Alerts Management API via OAuth2 and upserts history with SQL MERGE, covering active, resolved, and recent windows.
App & pipeline telemetry
LIVE: The platform monitors itself: server/client errors, Prisma slow queries, and materialization step logs feed a pipeline view.
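The SHA-256 dedupe mentioned for forwarded Windows events can be pictured roughly as follows. This is a minimal sketch, not the platform's actual code: the `ForwardedEvent` shape, the choice of fingerprint fields, and the function names `eventFingerprint` and `dedupe` are all assumptions made for illustration. Only the use of SHA-256 as the dedupe hash comes from the description above.

```typescript
import { createHash } from "node:crypto";

// Hypothetical event shape; the real column names are not given in the text.
interface ForwardedEvent {
  environmentCode: string;
  computer: string;
  channel: string;
  eventId: number;
  recordId: number;
  timeCreatedUtc: string; // ISO 8601 timestamp
}

// Build a stable SHA-256 fingerprint from the fields assumed to identify an event.
// Serializing as a JSON array keeps field boundaries unambiguous.
function eventFingerprint(e: ForwardedEvent): string {
  const canonical = JSON.stringify([
    e.environmentCode.toUpperCase(),
    e.computer.toLowerCase(),
    e.channel,
    e.eventId,
    e.recordId,
    e.timeCreatedUtc,
  ]);
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}

// Keep the first occurrence of each fingerprint. Passing the same `seen` set
// to both the catch-up and live phases stops replayed events from landing twice.
function dedupe(
  events: ForwardedEvent[],
  seen: Set<string> = new Set(),
): ForwardedEvent[] {
  const fresh: ForwardedEvent[] = [];
  for (const e of events) {
    const key = eventFingerprint(e);
    if (seen.has(key)) continue;
    seen.add(key);
    fresh.push(e);
  }
  return fresh;
}
```

In a SQL-backed pipeline the same idea would more likely be enforced with a unique index on the hash column; the in-memory set here only stands in for that constraint.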
Evidence-first: keep the two truth models apart.
The whole design hinges on never letting a live signal pretend it came from a report.
Report-backed
- Has a real ReportDocument + ReportId
- You can ask: which report produced this row?
- Which environment, was it valid, what timestamp?
- What raw collector JSON backs the evidence?
- HCI Fleet Health & AVHDX live here
Live / reportless
- Environment-scoped, never report-scoped
- No synthetic ReportId is ever invented
- Latest-report enrichment stays nullable, never identity
- WMI faults, Windows events, Azure alerts
- Prevents "this fault came from the latest report" lies
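One way to make the two truth models hard to mix up is to encode them as a discriminated union, so a live row simply has no report identity to misuse. The sketch below is illustrative only: the field names, the `EvidenceRow` type, and the `identityKey` helper are assumptions. The source states only that report-backed rows carry a real ReportId and that live rows are environment-scoped with nullable latest-report enrichment.

```typescript
// Hypothetical row shapes for the two truth models.
interface ReportBackedRow {
  kind: "report-backed";
  environmentCode: string;
  reportId: number; // always present: points at a real report
  collectedAtUtc: string;
}

interface LiveRow {
  kind: "live";
  environmentCode: string; // identity is environment-scoped
  observedAtUtc: string;
  latestReportId: number | null; // optional enrichment, never identity
}

type EvidenceRow = ReportBackedRow | LiveRow;

// Describe where a row came from without ever attributing a live row to a report.
function provenance(row: EvidenceRow): string {
  switch (row.kind) {
    case "report-backed":
      return `report ${row.reportId} in ${row.environmentCode}`;
    case "live":
      return `live signal in ${row.environmentCode}`;
  }
}

// Build an identity key. Live rows ignore latestReportId even when it is set,
// so enrichment can change without changing what the row is.
function identityKey(row: EvidenceRow, naturalKey: string): string {
  return row.kind === "report-backed"
    ? `${row.environmentCode}:${row.reportId}:${naturalKey}`
    : `${row.environmentCode}:${naturalKey}`;
}
```

With this shape, code that tries to read `reportId` from a live row fails at compile time rather than producing a misleading join.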
-- A canonical key defines alert identity; an issue key defines an occurrence.
MERGE dbo.FleetIssue AS tgt
USING staged_alerts AS src
    ON tgt.IssueKey = src.IssueKey
WHEN MATCHED AND src.MonitorCondition = 'Resolved' THEN
    UPDATE SET tgt.Status = 'Resolved',
               tgt.ResolvedAtUtc = src.SeenUtc
WHEN NOT MATCHED THEN
    INSERT (CanonicalAlertKey, IssueKey, EnvironmentCode, Status)
    VALUES (src.CanonicalAlertKey, src.IssueKey, src.EnvironmentCode, 'Active');
-- FleetEvent then records every fired / changed / resolved transition.
The Forge and an AI assistant, built in.
- The Forge: a governed automation runner inside the dashboard that lets operators launch approved PowerShell workflows, backed by SQL job queues, command manifests, runner heartbeats, artifacts, cancellation, and a full audit trail. No arbitrary shell from a browser.
- Ask ALDHI: the platform's own AI chat, on TanStack AI + OpenRouter, with 38 typed tools that surface its logging and telemetry directly: SQL health, alert triage, Windows event search and XML, materialization logs, application errors, Azure Resource Health, and artifact generation. The boundaries are hard: no raw SQL from the model, no exposed secrets, row caps, guarded writes.
- Built like a product: TypeScript end to end, Zod-validated contracts, live DB contract tests, component tests, Playwright, and lint/typecheck/coverage gates. SQL source is checked in as views, procs, functions, and triggers.
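The "no raw SQL, row caps" boundary for an assistant tool might look something like the sketch below. Everything named here is hypothetical: the `buildEventSearch` function, the procedure name, the environment codes, and the cap of 200 rows are assumptions chosen for illustration. The platform is described as using Zod-validated contracts; this sketch uses hand-written checks only to stay dependency-free.

```typescript
const MAX_ROWS = 200; // assumed cap, not taken from the source
const DEFAULT_ROWS = 50; // assumed default
const ENVIRONMENTS = new Set(["DEV", "TEST", "PROD"]); // assumed codes
const ALLOWED_FIELDS = new Set(["environmentCode", "eventId", "limit"]);

interface GuardedQuery {
  procedure: string; // fixed, named procedure: never model-supplied SQL
  params: { environmentCode: string; eventId: number | null; top: number };
}

// Validate an optional integer field, rejecting anything that is not a whole number >= min.
function optionalInt(value: unknown, name: string, min: number): number | undefined {
  if (value === undefined) return undefined;
  if (typeof value !== "number" || !Number.isInteger(value) || value < min) {
    throw new Error(`${name} must be an integer >= ${min}`);
  }
  return value;
}

// Turn untrusted tool input into a parameterized call with a clamped row count.
function buildEventSearch(raw: unknown): GuardedQuery {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("input must be an object");
  }
  const input = raw as Record<string, unknown>;

  // Reject unknown fields outright so a stray "sql" or "where" cannot sneak in.
  for (const key of Object.keys(input)) {
    if (!ALLOWED_FIELDS.has(key)) throw new Error(`unexpected field: ${key}`);
  }

  const env = input.environmentCode;
  if (typeof env !== "string" || !ENVIRONMENTS.has(env)) {
    throw new Error("unknown environment");
  }

  const eventId = optionalInt(input.eventId, "eventId", 0);
  const limit = optionalInt(input.limit, "limit", 1);

  return {
    procedure: "dbo.usp_SearchForwardedEvents", // hypothetical procedure name
    params: {
      environmentCode: env,
      eventId: eventId ?? null,
      top: Math.min(limit ?? DEFAULT_ROWS, MAX_ROWS),
    },
  };
}
```

The point of the design is that the model only ever chooses values for a fixed set of parameters; the query text itself stays in checked-in SQL.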