We use AI coding agents for nearly every engineering task at Renkara Media Group. Claude Code writes features, fixes bugs, generates content, provisions infrastructure. But without measurement, we had no idea which tasks benefited most from AI assistance, how much time we actually saved, or when the pace of decision-making was degrading the quality of our supervisory input. Fulcrum is the system we built to answer those questions.


What Is Leverage?

The leverage factor is a simple ratio. Estimate how long a senior engineer unfamiliar with the specific subfield would take to complete a task (including research, learning, experimentation, coding, and debugging). Divide that by how long Claude actually took. A leverage factor of 50x means a task estimated at 50 human hours was completed in 60 minutes of Claude runtime.

This number is intentionally conservative. It measures AI output per minute of Claude's runtime, not per minute of your time. Since Claude runs autonomously for the majority of each task, your actual time investment is often a fraction of Claude's wall-clock time. That is where the supervisory factor comes in.

The Supervisory Factor

The supervisory factor measures engineering output per minute of your personal investment: the time you spend writing prompts, reviewing output, making corrections, and providing guidance. For a task with a 50x leverage factor where you spent 15 minutes supervising while Claude worked for 60 minutes, the supervisory factor is 200x. The remaining 45 minutes were free. You could have been reviewing another PR, supervising another Claude session, or doing something else entirely.
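Both ratios reduce to simple arithmetic. A minimal sketch using the example numbers from the text (the function names are illustrative, not Fulcrum's actual API):

```python
def leverage_factor(estimated_human_minutes: float, claude_minutes: float) -> float:
    """Estimated unassisted human time divided by Claude's wall-clock runtime."""
    return estimated_human_minutes / claude_minutes

def supervisory_factor(estimated_human_minutes: float, supervisory_minutes: float) -> float:
    """Estimated unassisted human time divided by your personal time investment."""
    return estimated_human_minutes / supervisory_minutes

# The example from the text: a 50-hour task, 60 minutes of Claude runtime,
# 15 minutes of human supervision.
human = 50 * 60                         # 3000 minutes
print(leverage_factor(human, 60))       # 50.0
print(supervisory_factor(human, 15))    # 200.0
```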

Both metrics are PostgreSQL generated columns, computed server-side from their source values. No client-side math, no stale data. The database is the single source of truth.
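In PostgreSQL, a generated column of this kind might look like the following sketch. Only the names leverage_factor and supervisory_factor are confirmed by the spec table; the table name and source columns are assumptions:

```sql
-- Sketch only: source column names are assumptions, not Fulcrum's schema.
CREATE TABLE leverage_records (
    id                      BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    estimated_human_minutes NUMERIC NOT NULL,
    claude_minutes          NUMERIC NOT NULL,
    supervisory_minutes     NUMERIC NOT NULL,
    leverage_factor         NUMERIC GENERATED ALWAYS AS
        (estimated_human_minutes / claude_minutes) STORED,
    supervisory_factor      NUMERIC GENERATED ALWAYS AS
        (estimated_human_minutes / supervisory_minutes) STORED
);
```

Because the columns are STORED and generated server-side, every reader of the row sees values derived from the same source fields, which is what makes the database the single source of truth.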

Four Interfaces, One Service Layer

Fulcrum can be accessed in four ways: a React web dashboard, a REST API, a Typer CLI, and an MCP server for Claude Code. All four share the same service layer and SQLAlchemy models. The CLI imports the service layer directly and opens its own database session, so it works without the server running. The MCP server does the same. There is zero code duplication across the four interfaces.
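The pattern is that one service module owns the logic and each interface is a thin adapter over it. A framework-free sketch of the idea (all names are hypothetical; Fulcrum's real service layer uses async SQLAlchemy sessions rather than a list):

```python
# One service function owns the business logic.
def add_record(db: list, task: str, human_min: float, claude_min: float) -> dict:
    record = {"task": task, "leverage": human_min / claude_min}
    db.append(record)          # stand-in for a SQLAlchemy session
    return record

# The HTTP handler (FastAPI in Fulcrum) and the CLI command (Typer)
# are both thin wrappers that delegate to the service function.
def api_create_record(db: list, payload: dict) -> dict:
    return add_record(db, payload["task"], payload["human_min"], payload["claude_min"])

def cli_add(db: list, task: str, human_min: float, claude_min: float) -> None:
    record = add_record(db, task, human_min, claude_min)
    print(f"Logged {record['task']}: {record['leverage']:.0f}x")
```

Because the adapters hold no logic, adding a fourth or fifth interface means writing another thin wrapper, not reimplementing anything.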

The Web Dashboard

The React dashboard shows summary cards with leverage factor, supervisory factor, hours saved, and total records. A trend chart plots daily leverage factor over time. The records table supports sorting, filtering, and pagination. Clicking any row opens a detail modal with full metrics, info tooltips explaining what each number means, and Edit, Duplicate, and Delete actions. A per-project breakdown shows which codebases yield the highest AI productivity gains.

The CLI

The leverage CLI handles the most common operations from the terminal: adding records, listing recent records, viewing summary statistics, exporting data to CSV or JSON, and running database migrations. It is the fastest way to log a record after completing a task, and it works on any machine with Python installed, with or without the server running.
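A typical terminal session might look like the following. The subcommand and flag names here are hypothetical illustrations of the operations listed above, not the CLI's documented interface:

```shell
# Hypothetical subcommand and flag names; the actual CLI may differ.
leverage add --task "Fix auth token refresh" --human-hours 6 --claude-minutes 25
leverage list --limit 10
leverage summary
leverage export --format csv > records.csv
leverage migrate
```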

The MCP Server

Eight MCP tools are exposed for Claude Code sessions: record_leverage, get_leverage_summary, get_today_stats, list_records, get_project_breakdown, get_timeseries, get_fatigue_assessment, and predict_leverage. The MCP server talks directly to the database through the shared service layer. No HTTP round-trips. Every Claude Code session automatically logs leverage records via instructions in our CLAUDE.md configuration, making tracking completely seamless.
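A record_leverage invocation from a session might carry a payload shaped like this. The tool name comes from the list above; the argument names are assumptions for illustration:

```json
{
  "tool": "record_leverage",
  "arguments": {
    "task_description": "Add pagination to the records table",
    "project": "fulcrum",
    "estimated_human_minutes": 240,
    "claude_minutes": 18,
    "supervisory_minutes": 6
  }
}
```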

Decision Fatigue Assessment

This is the feature that surprised us with how useful it turned out to be. Agentic coding compresses decision density roughly 20x: decisions that would take a human 4 hours of spread-out thinking get compressed into 12 minutes of supervisory input. Research on cognitive load suggests that this accelerated pace causes glutamate to accumulate in the lateral prefrontal cortex, progressively degrading judgment quality.

Fulcrum tracks today's activity and provides a fatigue assessment with four levels:

| Records Today | Level | Agent Behavior |
| --- | --- | --- |
| 0-3 | Low | Normal: ask clarifying questions, present options |
| 4-7 | Moderate | Reduce questions, make reasonable defaults |
| 8-12 | High | Minimize decision points, choose safe approaches |
| 13+ | Very High | Maximum autonomy, present completed work only |

The system computes a continuous score from 0 to 1, weighting record count (50%), supervisory time (25%), and session duration (25%), minus a recovery discount for idle time. It produces neuroscience-grounded recovery recommendations: break timing, session limits, mode switching. These recommendations are priority-ranked and calibrated to the current fatigue level, time of day, and work pattern.
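The weighting can be sketched as follows. The 50/25/25 split and the idle-time discount come from the description above; the normalization caps and the discount rate are assumptions, not Fulcrum's calibrated values:

```python
def fatigue_score(records_today: int, supervisory_minutes: float,
                  session_minutes: float, idle_minutes: float = 0.0) -> float:
    """Continuous fatigue score in [0, 1]: 50% record count, 25% supervisory
    time, 25% session duration, minus a recovery discount for idle time."""
    # Normalization caps are illustrative: 13 records, 2h of supervision,
    # and an 8h session each saturate their component.
    records_component = min(records_today / 13, 1.0)
    supervisory_component = min(supervisory_minutes / 120, 1.0)
    session_component = min(session_minutes / 480, 1.0)
    raw = (0.5 * records_component
           + 0.25 * supervisory_component
           + 0.25 * session_component)
    recovery = 0.1 * (idle_minutes / 60)   # assumed discount per idle hour
    return max(0.0, min(1.0, raw - recovery))
```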

The agent guidance output is prescriptive. It tells Claude Code exactly how to adjust its interaction style based on the fatigue level. By the 13th task of the day, the agent absorbs more of the decision burden, presenting completed work rather than options. This is not just a nice-to-have; it is a safety mechanism for maintaining decision quality during long agentic coding sessions.

Predictive Leverage Estimation

Before starting a new task, you can ask Fulcrum to predict its leverage factor using historical data from similar tasks. The prediction endpoint analyzes your past records and estimates how much AI assistance will help with the task at hand. This lets you prioritize high-leverage work first and allocate your supervisory time where it delivers the highest return.

What Makes It Better Than a Spreadsheet

Fulcrum started as a CSV file. For the first few months, every Claude Code session appended a row to ~/.claude/leverage_factor_log.csv. That worked until the file hit hundreds of records and we needed filtering, aggregation, trend analysis, and multi-user support. The CSV had no computed columns, no search, no API, and no way to answer "what is my average leverage factor on auth-service tasks over the last 90 days?"

Fulcrum answers that question in a single API call. It also provides full-text search across task descriptions using PostgreSQL GIN indexes, soft deletes with a 90-day retention window, bulk import from the old CSV format with duplicate detection, and data export in CSV or JSON. The CSV still exists as a fallback: if the API is unreachable, records are written locally and backfilled later.
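The full-text search described above reduces to a standard tsvector/GIN pattern. The table and column names in this sketch are assumptions:

```sql
-- Index once (column name task_description is assumed):
CREATE INDEX idx_records_fts ON leverage_records
    USING GIN (to_tsvector('english', task_description));

-- Queries of this shape can then use the index:
SELECT * FROM leverage_records
WHERE to_tsvector('english', task_description)
      @@ plainto_tsquery('english', 'auth service refactor');
```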

Key Specs

| Spec | Detail |
| --- | --- |
| Frontend | React 18, TypeScript, Vite, Recharts, CSS Modules |
| Backend | FastAPI, SQLAlchemy 2.0 async (asyncpg), PostgreSQL |
| CLI | Typer (imports service layer directly, no HTTP) |
| Auth | Argon2id-hashed API keys with prefix-based lookup |
| MCP tools | 8 (record, summary, today, list, projects, timeseries, fatigue, predict) |
| Computed columns | leverage_factor and supervisory_factor (PostgreSQL generated) |
| Search | Full-text via PostgreSQL GIN/tsvector |
| Ports | 3401 (frontend), 3411 (backend) |
| Theme | Light and dark mode |

Integration Points

Fulcrum integrates with Claude Code at the deepest level: every coding session logs leverage records automatically through CLAUDE.md instructions, writing to both the REST API and the local CSV as a fallback. The MCP server gives Claude direct access to fatigue assessments and summary statistics. The REST API serves the web dashboard and supports external integrations. And the CLI provides a no-server-required interface for scripting, cron jobs, and quick lookups from the terminal.