Screenshots
See Vigil in Action
Dashboard: service health cards with uptime percentages, latency sparklines, and an alarm banner
Features
Know Before Your Users Do
Continuous endpoint monitoring with sub-second WebSocket alerts, latency tracking, incident management, and uptime reporting. When something breaks at 3 AM, Vigil is already on it.
Health Monitoring
Endpoints, Latency, and Uptime
Automated Endpoint Polling
Configurable check intervals per site. HTTP status validation, response time measurement, and content verification. APScheduler-based polling engine.
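As a rough illustration of what a single check might involve, here is a minimal sketch. The names `CheckResult`, `run_check`, and the injected `fetch` callable are hypothetical and not taken from Vigil's codebase; the real engine schedules its checks with APScheduler, which this sketch leaves out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CheckResult:
    ok: bool
    status_code: Optional[int]
    response_ms: float
    error: Optional[str] = None


def run_check(
    fetch: Callable[[str], Tuple[int, bytes]],  # hypothetical HTTP client hook
    url: str,
    expected_status: int = 200,
    must_contain: Optional[bytes] = None,
) -> CheckResult:
    """Run one check: validate status, time the response, verify content."""
    start = time.perf_counter()
    try:
        status, body = fetch(url)
    except Exception as exc:  # timeouts and connection errors count as failures
        elapsed_ms = (time.perf_counter() - start) * 1000
        return CheckResult(False, None, elapsed_ms, str(exc))
    elapsed_ms = (time.perf_counter() - start) * 1000

    if status != expected_status:
        return CheckResult(False, status, elapsed_ms,
                           f"expected {expected_status}, got {status}")
    if must_contain is not None and must_contain not in body:
        return CheckResult(False, status, elapsed_ms, "content mismatch")
    return CheckResult(True, status, elapsed_ms)
```

Passing the HTTP call in as `fetch` keeps the sketch independent of any particular client library and makes it easy to exercise with a stub.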
Latency Tracking
Millisecond-resolution response time history. Sparkline visualizations on dashboard cards. Detailed timeseries charts in site detail view.
Uptime Statistics
Rolling uptime percentages across 24-hour, 7-day, 30-day, and 90-day windows. Per-site and aggregate fleet uptime on the dashboard.
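One plausible way to compute a rolling uptime figure is the share of successful checks inside each window. The sketch below assumes that definition (Vigil may weight by time instead), and the names `WINDOWS` and `uptime_percent` are illustrative only.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# The four windows named above.
WINDOWS = {
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
    "90d": timedelta(days=90),
}


def uptime_percent(
    checks: Iterable[Tuple[datetime, bool]],
    window: timedelta,
    now: datetime,
) -> Optional[float]:
    """Percentage of successful checks between now - window and now.

    Returns None when the window holds no checks, so callers can
    distinguish "no data" from "0% uptime".
    """
    cutoff = now - window
    recent = [ok for ts, ok in checks if cutoff <= ts <= now]
    if not recent:
        return None
    return 100.0 * sum(recent) / len(recent)
```

An aggregate fleet figure could be derived the same way by pooling the checks of every site before calling the function.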
Health Status Classification
Three-state health model: healthy (green), degraded (yellow), and down (red). Degraded status triggered by slow response times before full failure.
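The three-state model can be sketched as a small classification function. The 1000 ms default threshold and the names `Health` and `classify` are assumptions made for illustration; the actual thresholds are configured per site.

```python
from enum import Enum


class Health(str, Enum):
    HEALTHY = "healthy"    # green
    DEGRADED = "degraded"  # yellow
    DOWN = "down"          # red


def classify(ok: bool, response_ms: float,
             degraded_after_ms: float = 1000.0) -> Health:
    """Map one check outcome onto the three health states."""
    if not ok:
        return Health.DOWN
    if response_ms > degraded_after_ms:
        # The check passed, but slowly enough to warn before a full failure.
        return Health.DEGRADED
    return Health.HEALTHY
```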
Incident Management
Alerts, Incidents, and Resolution
Automatic Incident Creation
Incidents created automatically when health checks fail. Tracks start time, affected service, and current status. Auto-resolves when the service recovers.
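The lifecycle described above (open on failure, resolve on recovery) might look roughly like the following in-memory sketch. `Incident` and `IncidentTracker` are hypothetical names; Vigil itself persists incidents in PostgreSQL.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Incident:
    site: str
    started_at: datetime
    resolved_at: Optional[datetime] = None

    @property
    def status(self) -> str:
        return "resolved" if self.resolved_at else "active"

    @property
    def duration(self) -> Optional[timedelta]:
        if self.resolved_at is None:
            return None
        return self.resolved_at - self.started_at


class IncidentTracker:
    def __init__(self) -> None:
        self._active: Dict[str, Incident] = {}
        self.history: List[Incident] = []

    def record(self, site: str, ok: bool, at: datetime) -> Optional[Incident]:
        """Feed one check result; return the incident that changed, if any."""
        current = self._active.get(site)
        if not ok and current is None:
            # First failure for this site: open a new incident.
            incident = Incident(site, at)
            self._active[site] = incident
            self.history.append(incident)
            return incident
        if ok and current is not None:
            # Recovery: auto-resolve the open incident.
            current.resolved_at = at
            del self._active[site]
            return current
        return None  # no state change
```

Repeated failures while an incident is already open return `None`, so one outage produces exactly one incident.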
Alarm Banner
Red banner across the top of the dashboard when any service is down. Shows count of affected sites. Can be muted for acknowledged incidents.
WebSocket Live Updates
Real-time check results and incident broadcasts via WebSocket. The dashboard updates as events arrive, with no client-side polling or page refresh.
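The fan-out behind live updates can be approximated with one queue per connected client. This is only a sketch of the broadcasting pattern: `Broadcaster`, the event names, and the JSON message shape are assumptions, and a real handler would forward each queued message over an actual WebSocket connection.

```python
import asyncio
import json
from typing import Any, Dict, Set


class Broadcaster:
    """Fan out events to every subscriber (one queue per connected client)."""

    def __init__(self) -> None:
        self._subscribers: Set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, event_type: str, data: Dict[str, Any]) -> None:
        # Serialize once, then hand the same message to every client queue.
        message = json.dumps({"type": event_type, "data": data})
        for queue in self._subscribers:
            queue.put_nowait(message)
```

Each client connection would call `subscribe()` on connect, await messages from its queue in a loop, and call `unsubscribe()` on disconnect.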
Incident History
Full incident timeline per site with start, end, and duration. Filter by status (active, resolved). Incident table in the site detail modal.
Dashboard
Card View and List View
Card View
Grid of site cards showing name, status indicator, uptime percentage, latency sparkline, and last check time. Color-coded borders for health state.
List View
Tabular view with sortable columns for name, URL, status, uptime, latency, and last check. Toggle between card and list from the header.
Site Detail Modal
Click any site to open a detail panel with latency chart (Recharts), uptime stats across all time windows, incident table, sub-checks, and settings.
Status Summary
Aggregated counts at the top: total sites, healthy, degraded, and down. Fleet-wide uptime percentage for the last 24 hours.
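The summary counts reduce to a tally over each site's current state. A minimal sketch, assuming states are the plain strings "healthy", "degraded", and "down" and using the hypothetical name `summarize`:

```python
from collections import Counter
from typing import Dict, Iterable


def summarize(statuses: Iterable[str]) -> Dict[str, int]:
    """Count sites per health state for the dashboard header."""
    counts = Counter(statuses)
    return {
        "total": sum(counts.values()),
        "healthy": counts["healthy"],
        "degraded": counts["degraded"],
        "down": counts["down"],  # a non-zero value would raise the alarm banner
    }
```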
Accessibility
Built for Everyone
WCAG 2.1 AA Compliance
4.5:1 contrast for body text, 3:1 for large text and UI components, in both light and dark themes.
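For reference, these ratios come from the WCAG 2.1 contrast formula, which compares the relative luminance of two sRGB colors. The sketch below implements that published formula; the function names are illustrative and not part of Vigil.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _linear(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.1 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: RGB, bg: RGB) -> float:
    """Ratio of lighter to darker luminance, each offset by 0.05."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg: RGB, bg: RGB, large_text: bool = False) -> bool:
    """AA needs 4.5:1 for body text and 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)
```

Black on white gives the maximum ratio of 21:1, and the argument order does not matter because the lighter color is always placed in the numerator.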
Keyboard Navigation
Every interaction reachable via keyboard. Logical tab order, visible focus indicators, Escape-to-dismiss for modals.
Screen Reader Support
Tested with VoiceOver, NVDA, and JAWS. Semantic HTML, ARIA labels, live regions for dynamic updates.
Reduced Motion
Respects prefers-reduced-motion. Usable at 200% zoom. Touch targets meet a 44x44 CSS pixel minimum.
How It Works
From Endpoint to Alert
Register
Add endpoints to monitor with URL, expected status, check interval, and timeout thresholds.
Poll
APScheduler fires checks on schedule. Records status code, response time, and content hash for each endpoint.
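Recording a content hash lets a later check detect that a response body has changed without storing the body itself. A minimal sketch, assuming SHA-256 (the page does not name the hash algorithm) and hypothetical helper names:

```python
import hashlib
from typing import Optional


def content_hash(body: bytes) -> str:
    """Hex digest of the response body (SHA-256 is an assumption here)."""
    return hashlib.sha256(body).hexdigest()


def content_changed(previous_hash: Optional[str], body: bytes) -> bool:
    """True when a stored hash exists and the new body no longer matches it."""
    return previous_hash is not None and content_hash(body) != previous_hash
```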
Detect
Failed checks trigger incidents automatically. A site is marked degraded when its latency exceeds the configured threshold. WebSocket broadcasts each state change to connected dashboards.
Resolve
When the endpoint recovers, the incident auto-resolves. Uptime stats recalculate. The alarm banner clears.
Technical Specifications
Under the Hood
Backend
- FastAPI (Python 3.12+)
- PostgreSQL + SQLAlchemy 2.0 async
- APScheduler for check scheduling
- Alembic migrations
- avian-diagnostics integration
Frontend
- React 19 + TypeScript
- Vite build system
- Recharts for latency charts
- Framer Motion animations
- CSS Modules with AVIAN design system
- Light and dark mode
Real-Time
- WebSocket for live updates
- Check result broadcasting
- Incident state changes
- No client-side polling required
Monitoring
- HTTP status validation
- Response time measurement
- Content hash verification
- Configurable check intervals
- Automatic incident lifecycle
Development
100% Built by Claude
Vigil was built entirely by Claude (Anthropic) working alongside a single human supervisor. Every line of code, every test, every deployment: AI-authored with human direction.