AI Span of Control: A Lever Long Enough to Move the Middle [Management]
A promotion metric that turns middle management into the engine of AI adoption.
Give me a lever long enough and I can move the world.
I have been thinking about what single lever I would use to move a large incumbent organization into the AI age. Not a startup. Not a highly technical engineering team. A real institution: government-agency sized, budget-constrained, process-heavy, committee-bound, and full of people whose incentives were designed for a different technological era.
The lever is middle management.
Not because middle managers are the disruptive people in the building. Usually they are not. Middle managers are the machinery that turns strategic intent into repeated organizational behavior that drives productivity, efficiency, and profit. They own the staff meetings. They own the workflow reviews. They own the budget requests. They decide whether a new tool becomes an operating habit or dies as a pilot.
If you want to change a large organization, you have to change what middle managers compete for.
Today, management careers are measured by span of control: how many people report into the organization, how large the budget is, how much revenue or mission throughput the unit owns, and whether the manager can credibly claim stewardship over a bigger machine than their peers.
In industry, this gets bragged about on the golf course. In government, it shows up in position descriptions, promotion criteria, office location, and who sits at the table versus against the wall. “I run a thousand-person organization.” “I manage a billion-dollar budget.” “I have a $100M P&L.” These are not just vanity metrics. They are career metrics.
They are also incentive systems.
Headcount is not a perfect measure of value. Budget is not a perfect measure of output. Revenue is not a perfect measure of managerial contribution. They persist because they are legible proxies for organizational scope. They are easy to compare, easy to brag about, and easy to fight over.
That is exactly why they shape behavior.
If middle managers are promoted for headcount, they will protect headcount. If they are promoted for budget, they will protect budget. If they are promoted for revenue, they will protect revenue. They will do this even when AI could improve the actual output of the organization.
This is not a character flaw. It is rational behavior inside a badly aging incentive structure designed for another era.
The AI transition will fail in large organizations if adopting AI makes a manager look smaller.
If a manager automates a workflow and reduces staffing needs, the old system may interpret that as reduced scope. If a manager collapses a process from six teams to two, the old system may interpret that as losing organizational mass. If a manager uses agents to increase throughput without increasing headcount, the old system may fail to notice the achievement at all.
That is the problem.
So the lever is obvious: we need to make AI adoption part of span of control.
We need a metric that lets a middle manager say, “I do not just manage people, dollars, and systems. I manage AI-augmented work.”
Call it AI Span of Control, or AISC if you are in government and need an acronym.
The Wrong Metric Is Token Usage
My first instinct was to use token volume, as the tech industry does.
That instinct is understandable. Token usage is simple. It is measurable. It is already visible in enterprise consoles. It creates a leaderboard. It can even produce trophies.
OpenAI’s token-award program is a useful example. It turned API consumption into a status object. Companies that crossed large token thresholds got something they could display, talk about, and use as evidence that they were participating in the AI economy.
That is clever marketing. It is also a warning.
Token usage is not value.
A manager can increase token usage by encouraging real AI adoption. But they can also increase token usage by wasting money, buying unused capacity, spamming prompts, or pushing people to use AI when it adds no value. Token volume measures consumption, not contribution.
A good metric is not merely simple. A good metric is simple and designed to be gamed in a productive direction.
That is the real test.
If the easiest way to game a metric is to buy unused seats, spam prompts, or inflate token counts, the metric is garbage.
If the easiest way to game a metric is to train the team, standardize tools, redesign workflows, ship reusable AI assistants, and normalize weekly AI use, the metric is doing its job.
That is the metric we need. A metric designed to be gamed.
AI Span of Control
I propose measuring AI Span of Control, or AISC, over a 28-day rolling window.
The formula:
This is not meant to be a high-precision engineering metric. It is a management metric.
That means it will be wrong in some cases. That is acceptable. The existing metrics are wrong too. Headcount is wrong. Budget is wrong. Revenue ownership is often wrong. But they are legible, comparable, and productive when gamed. That is why they matter.
AISC needs the same properties.
The definitions are intentionally simple.
A Light User is someone who meaningfully touched AI in the last 28 days. Maybe they used a chatbot once to write a difficult email, summarize a policy document, draft a memo, or prepare for a meeting. That counts. That’s a stepping stone into the future.
A Regular User is someone with repeated usage. They probably use AI every week, and likely most workdays. AI is becoming part of their normal execution loop. They may not know it yet, but they’ve probably taken the AGI pill.
A Power User is someone with sustained, tool-rich usage. They use AI inside real work systems: desktop agents, research agents, coding agents, copilots, workflow builders, or internal assistants. They are not just asking a chatbot for prose. They are changing how work gets done.
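To make the three tiers concrete, here is a minimal Python sketch of how one might classify a person from 28 days of usage data. The field names (`active_days`, `tool_rich_days`) and both thresholds are assumptions for illustration only. The definitions above are deliberately qualitative, so set cutoffs that match what your own consoles actually report.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    NON_USER = "non_user"
    LIGHT = "light"
    REGULAR = "regular"
    POWER = "power"


@dataclass(frozen=True)
class Usage:
    """One person's activity over the trailing 28-day window (hypothetical fields)."""
    active_days: int      # days with at least one meaningful AI interaction
    tool_rich_days: int   # days spent in agents, copilots, or workflow builders


# Illustrative thresholds; the tier definitions do not specify numeric cutoffs.
REGULAR_MIN_ACTIVE_DAYS = 4    # assumption: roughly weekly use across four weeks
POWER_MIN_TOOL_RICH_DAYS = 8   # assumption: sustained agent or copilot use


def classify(usage: Usage) -> Tier:
    """Map 28 days of usage onto the Light / Regular / Power tiers."""
    if usage.active_days <= 0:
        return Tier.NON_USER
    if usage.tool_rich_days >= POWER_MIN_TOOL_RICH_DAYS:
        return Tier.POWER
    if usage.active_days >= REGULAR_MIN_ACTIVE_DAYS:
        return Tier.REGULAR
    return Tier.LIGHT
```

Someone who touched a chatbot once lands in `Tier.LIGHT`. Someone with many tool-rich days lands in `Tier.POWER` regardless of how many plain chat days they logged.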
An AI Workstream is a shared or production workflow that actually runs. Not a demo. Not a pilot deck. Not an innovation-lab screenshot. A real workflow with recurring use.
For example: an automation that monitors news feeds for mission-relevant stories, summarizes them, checks them for organizational relevance, and sends a phone notification when something matters. That is an AI Workstream.
A reusable assistant that drafts first-pass procurement language for a contracting team is an AI Workstream.
A coding agent that organizes and summarizes pull requests so the sales team knows what new feature just got deployed is an AI Workstream.
A meeting agent that turns weekly staff meetings into task assignments, risk registers, and follow-up drafts is an AI Workstream.
The point is not that every workstream is equally valuable. They are not. The point is that every real workstream represents organizational learning. Someone had to identify repeated work. Someone had to redesign a process. Someone had to make AI useful enough that the team came back to it.
That is the behavior we want.
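With the four categories defined, a score can be rolled up in a few lines. The sketch below assumes a simple weighted sum over the 28-day counts. Both the weighted-sum form and the weights are illustrative assumptions, chosen only so that one power user plus one workstream scores 1.25. Treat them as placeholders to tune, not as a definitive specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Hypothetical per-unit weights; adjust to your organization's priorities."""
    light: float = 0.05
    regular: float = 0.10
    power: float = 0.25
    workstream: float = 1.00


def aisc(light: int, regular: int, power: int, workstreams: int,
         w: Weights = Weights()) -> float:
    """Weighted sum of user-tier counts and running AI Workstreams (28-day window)."""
    counts = (light, regular, power, workstreams)
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    return (w.light * light
            + w.regular * regular
            + w.power * power
            + w.workstream * workstreams)


# One power user who has shipped one workstream, under the assumed weights.
print(aisc(light=0, regular=0, power=1, workstreams=1))
```

Weighting workstreams far above individual users is a design choice: it makes the cheapest way to raise the number shipping something that runs, not buying seats.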
Yes, Managers Will Game It
Can this be gamed?
Of course it can.
Every management metric gets gamed. Middle managers already count interns against headcount. They count one-time hardware purchases against budget scope without regard to depreciation. They describe inherited revenue as if they personally created it. They make their domains sound as large as possible because the system rewards them for doing so.
That is normal.
The question is not whether AISC can be gamed. The question is whether gaming it creates the behaviors we want.
If a manager wants to increase their AISC, what do they have to do?
They have to get more people using AI. They have to identify recurring workflows. They have to build or sponsor assistants. They have to train their teams. They have to normalize AI use in staff rhythms. They have to create actual workstreams that survive beyond a demo.
That is exactly what leadership should want them doing.
A metric is dangerous when the cheapest way to improve it damages the organization.
A metric is powerful when the cheapest way to improve it actually improves the organization.
Give them a metric where we win when they game it.
Why This Works for Large Organizations
Large organizations do not usually fail because nobody has heard the CEO’s strategy. They fail because the strategy never becomes a weekly habit.
The CEO announces AI transformation. The CIO launches a governance board. The CTO’s office runs pilots. The legal team writes acceptable-use guidance. The training team publishes a learning path.
Then nothing much changes.
The average branch chief, program manager, division director, or office lead still has the same staff meeting, the same reporting cycle, the same clearance process, the same budget defense, the same hiring plan, and the same performance-review template.
The organization has an AI strategy, but the middle has no AI scoreboard.
That is fatal.
AISC gives the middle a scoreboard.
It says: your job is not merely to protect your existing span of control. Your job is to expand your AI span of control.
How many AI-augmented workstreams do you manage?
How many power users have you enabled?
How many regular users are you developing?
How many people have at least crossed the threshold from non-user to light user?
Those questions are crude. They are also useful.
And crucially, they are answerable.
The Anthropic enterprise console, the OpenAI enterprise console, Google Workspace with Gemini, Microsoft 365 Copilot, and GitHub Copilot already expose many of the adoption signals needed for this kind of measurement: active users, active days, prompts, feature usage, daily or weekly active users, acceptance rates, and agent adoption. The tooling will vary by stack, but the pattern is already here.
The enterprise does not need a perfect measurement system before it starts. It needs a directional measurement system that rewards the right behavior.
The First Workstream
Here is the practical version.
Open the AI agent application your organization permits. Maybe that is a coding agent. Maybe it is Anthropic’s Claude Cowork, or OpenAI’s Codex, or Microsoft Copilot, or Gemini. The brand matters less than the operating mode.
Do not open a chatbot and ask for a poem about productivity.
Open an agent and tell it to build you a recurring report.
The report should estimate your team’s AI Span of Control once a week. It should pull from whatever usage data your organization exposes. It should classify users into light, regular, and power categories. It should identify recurring AI-enabled workflows. It should send the result wherever you actually pay attention: email, Slack, Teams, your weekly staff agenda, or your operating review.
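For a sense of what the agent might produce, here is a minimal sketch of the report body in Python. It assumes the tier for each person has already been derived from your usage data. The function name, the tier labels, and the input shapes are all hypothetical, and delivery to email, Slack, or Teams is left out because it depends on your stack.

```python
from collections import Counter
from datetime import date, timedelta

WINDOW_DAYS = 28  # rolling window length used throughout


def build_weekly_report(team: str,
                        tiers_by_person: dict[str, str],
                        workstreams: list[str],
                        as_of: date) -> str:
    """Render a plain-text summary of tier counts and running workstreams.

    tiers_by_person maps a person to "non_user", "light", "regular", or "power"
    (hypothetical labels; use whatever your classification step emits).
    """
    counts = Counter(tiers_by_person.values())
    start = as_of - timedelta(days=WINDOW_DAYS - 1)  # inclusive 28-day window
    lines = [
        f"AI Span of Control report: {team}",
        f"Window: {start.isoformat()} to {as_of.isoformat()}",
        f"Power users: {counts['power']}",
        f"Regular users: {counts['regular']}",
        f"Light users: {counts['light']}",
        f"Non-users: {counts['non_user']}",
        f"AI Workstreams: {len(workstreams)}",
    ]
    # List each running workstream so the review can ask what it replaced.
    lines += [f"  - {name}" for name in sorted(workstreams)]
    return "\n".join(lines)


print(build_weekly_report(
    team="Example Division",
    tiers_by_person={"avery": "power", "blake": "light", "casey": "non_user"},
    workstreams=["weekly metric pull"],
    as_of=date(2025, 3, 28),
))
```

The report deliberately shows raw counts next to the workstream names, so a reader can check the number against work they recognize.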
Tell the agent where your team roster lives. Tell it what tools your team uses. Tell it what dashboards are available. Tell it to ask questions when it lacks access or context. Then have it build and install the workflow.
Tell it what to do as you would a senior member of your team. Congratulations. You are now a power user, and you have created your first AI Workstream.
Your AISC is already 1.25.
That probably puts you ahead of most managers in your organization.
Now keep going.
Every workflow that must be performed more than once is a candidate for an AI Workstream. Every recurring status update. Every intake review. Every meeting summary. Every compliance check. Every first-draft memo. Every research scan. Every code review. Every customer-response pattern. Every procurement artifact. Every weekly metric pull.
Not all of them should be automated. But all of them should be examined.
You’re not going to automate yourself out of a job. You’re going to automate yourself into promotions.
What Leaders Should Do
If I were running a large organization, I would start publishing AISC by division.
Not quietly. Publicly.
I would show the distribution. I would show the leaders. I would show the laggards. I would make clear that this year it is an observation metric, and next year it becomes part of compensation, promotion, budget review, and job requirements.
That is how you move the middle.
The first version does not have to be punitive. In fact, it should not be. Early measurement should expose reality, not create panic. Some teams will have structural reasons for low adoption: regulatory constraints, classified systems, poor tooling, immature data access, or legitimate mission risk.
Fine. Measure anyway.
The point is not to pretend every division should have the same number. The point is to force the right conversation.
Why does this office have twenty AI Workstreams while that one has zero?
Why does this team have a dozen power users while that one has none?
Why are these managers converting light users into regular users while those managers are still waiting for a training memo?
Why did this division reduce cycle time while that division only increased token spend?
That is where the useful management work begins.
The New Status Game
Yesterday, a manager who ran a thousand-person organization or controlled a billion-dollar budget was a big fish.
Tomorrow, that will not be enough.
The next generation of organizational status will not only ask how many people you manage or how much money you control. It will ask how much intelligent work your organization can produce, how many workflows you have automated, how many teams you have augmented, and how many intelligent systems you can operate safely.
A manager who does not know their AI Span of Control will look like a fossil.
Not because AI is magic. It is not.
Because the world has been disrupted, the organization is changing, and the scoreboard has not caught up.
The leaders who fix the scoreboard first will move faster than everyone else. They will not need every manager to become a visionary. They will only need managers to do what managers already do: compete for scope, status, promotion, and resources.
Change the metric, and you change the competition.
Change the competition, and you change the organization.

