How incident.io is building the future of resilient engineering

In software engineering, incidents are an unavoidable reality of the job. From minor glitches to worldwide outages, it’s a matter of when, not if.
But for years, the industry has relied on outdated systems built for alerting rather than solving, says Stephen Whitworth, cofounder and CEO of intelligent incident management platform, incident.io. “You were essentially left with tools that would wake you up to tell you that something had gone wrong, but after that point, you’re left on your own.”
And when an incident occurs, engineers have to parse through disconnected workflows to coordinate a response. Navigating siloed tools for paging, logging actions, and documentation, all while communicating via Slack, slows everything down when speed matters most.
AI has intensified these challenges. Software is built and shipped increasingly fast, and AI-generated code introduces new vulnerabilities. Suddenly, engineering teams find themselves responsible for systems they didn’t write, relying on manual processes to diagnose modern problems.
Turning alerts into action
Incident.io was created to unify those tools and platforms into one cohesive workflow — not to prevent failures, but to give teams the clarity and coordination to tackle incidents when they inevitably arise.
The platform plugs directly into Slack, so incidents unfold where teams already work. Users simply trigger an incident in Slack, and the platform guides them through every step of the response, automating the setup, communication, and tracking of every incident so teams don’t waste time jumping between tools.
“We think about incidents existing on a spectrum, from tiny errors that affect your customers every single day to things like CrowdStrike, where the whole world shuts down for a day,” says Whitworth. “What people ultimately use us for is to help fix these issues faster, so their customers [aren’t] impacted, and then ultimately learn from them, so they can try and stop them happening again [and] make their business more resilient.”
Engineered by experience
The company began as a side project in 2020, when Whitworth and his cofounders were working together at Monzo. He and CTO Pete Hamilton were software engineers, and Chris Evans, incident.io’s chief product officer, was a director for platform and reliability, responsible for incident management.
“My cofounder, Chris, had built internal tooling that was so much better than what I’d ever seen on the market before,” says Whitworth. “[Incident response] has always sucked at every company I’ve been at before. Chris’ thing is much better.” The opportunity was clear, he says. “We know this is a universal problem, so let’s go build a company to solve it.”
The team tested demand for an early version of the platform among their personal network of engineering leaders. A single tweet generated over 500 demo requests. And that early enthusiasm quickly translated into revenue. By the time incident.io officially launched in early 2021, it had secured 50 to 100 hand-signed customers.
A customer-centric approach has been key to the company’s success, says Whitworth. “When you are three people around a kitchen table with barely a company to your name and a product that doesn’t quite work yet, a lot of your earliest customers will be betting on you as a person…The way that you can accelerate and improve things the most is by building personal relationships with your customers.”
Breaking new ground
Incident.io’s early traction set it on a trajectory for rapid growth. Several Monzo founders participated in its first $5.5M seed round in late 2021, and the following year, the company announced a $28.7M Series A. In July 2022, with 50% of its customer base already in the United States, incident.io opened its first North American office in New York.
The team has since focused heavily on expansion and hiring. In 2022, incident.io grew from three to more than 30 employees and was supporting more than 150 engineering teams worldwide.
By April 2025, incident.io had grown into a team of 80, helped resolve more than 250,000 incidents, and tripled its customer base from the previous year. More than 600 organizations, including Netflix, Etsy, OpenAI, Airbnb, Ramp, and Intercom, rely on the platform as their incident command system.
That year, the company raised a $62M Series B led by Insight Partners, bringing the company’s total funding to over $96M and valuing the business at around $400M. The investment marked a leap forward in expanding incident.io’s U.S. presence and, crucially, scaling AI R&D.
A new era of automation
Initially, AI wasn’t part of the picture for incident.io, says Whitworth. “We started the company with really no AI in the product at all.” The goal was simply to give teams one place to run their incident process and reduce friction during failures. “There wasn’t really much need for AI.”
But as the platform became the central command center for how hundreds of companies handled outages, the team realized they were sitting on something far more powerful. With full visibility into deployments, logs, Slack conversations, timelines, and thousands of past incidents, incident response was naturally evolving beyond coordination and into intelligence.
This shift aligns with what Whitworth thinks of as the three eras of incident management. The first era was about alerting, with tools like PagerDuty waking teams up. The second was about managing and coordinating the response. The third, now beginning, is one of intelligent agents that can autonomously investigate, diagnose, and eventually repair issues.
“What that looks like in practice,” Whitworth explains, “is we have a product called AI SRE.” The product, whose name is short for AI site reliability engineering, scans recent code changes, reads log lines, analyzes Slack discussions, and correlates similar historical incidents to identify root causes and recommend actions to resolve the issue. “We’ve gone on this journey from not having any AI in the product at all, [to] investing very, very heavily in it. Our engineering capacity is going into building out that product.”
Internally, AI is also reshaping how the team works as it scales. Rather than reducing headcount, Whitworth says it allows the company to “take all of those efficiencies and reinvest them in other things.” On the go-to-market side, the team now uses AI to analyze call transcripts and automatically generate competitive insights.
As Whitworth explains, this creates an “always-on intelligent stream linked back to exactly what the customer said,” replacing hours of manual work. “It’s such a cool application of [AI to what] would literally just be too much work for any one human to do.”
“And we’re just at the beginning”
As AI accelerates the volume and complexity of failures, incident response is turning from a technical, back-office function into a competitive advantage. Incident.io supports that shift with workflow automation, deep operational insight, and advanced AI. And the impact can be felt at the level of the individual customer, says Whitworth.
“With our AI SRE product, you can wake up bleary-eyed, look at the Slack channel that we’ve created for you, and we’ve done a full investigation and basically told you…what you need to do next…You’re just saving a few hours sleep for someone, but…[it’s] a nice example of how impactful things can be on the personal lives of some of your customers.”
This small shift, at scale, “is going to save colossal amounts of time for engineering organizations,” he adds. “And we’re just at the beginning.”
*Note: Insight Partners has invested in incident.io. This article is part of our ScaleUp:AI 2025 Partner Series, highlighting insights from the companies and leaders shaping the future of AI.








