How to know what just broke when your AI-built app breaks

When your AI-built app breaks, you don't need a fix yet. You need to know what changed. Here's why that's hard, and what to do.

It's 11 PM. The dashboard isn't loading.

You can't tell if it's down for everyone or just for you. The red text in the browser console is unreadable. You ask the AI. The AI is confident. It says it knows what's wrong. It writes a fix. The dashboard loads. The signup form is now broken. You ask the AI about the signup form. It writes a fix for that too. The dashboard goes empty again.

You sit back. You don't know how many things you've changed in the last hour. You don't know what was working at 9 PM and isn't now, and you don't know what was already broken at 9 PM that you only noticed at 11. The AI is still typing. It's offering a third fix.

You don't accept it. You don't roll anything back either. You're not sure what back would mean.

That's the moment most founders meet at least once after their app meets real users. The bug isn't the thing. The thing is that the question you most want to ask, "what just changed?", has no clean answer tonight. And every minute the AI keeps moving makes the answer harder to find.

Why "what just broke" is the wrong question

When something stops working in an app that was working an hour ago, the instinct is to ask what is broken. That's a fair question, and it has a long, useful tradition. It's not the question that helps you in the moment.

The question that helps is what changed. Almost everything that breaks in a working app breaks because something is now different from what it was when it last worked. Find the difference, and you find the cause. The bug is downstream of the change.

The trouble with apps built quickly with AI is that nobody is keeping a clean record of the changes. Including the AI. Including, often, you.

Four shapes of "what changed" that you don't know

There are four kinds of change happening in your project in any given week. The fault, when something breaks, is almost always in one of them. You probably don't have a clear record of any of the four.

The change you asked for

This one is easy. You wanted the signup button moved to the top of the page. You got that. Whatever broke afterwards is probably related to it.

Probably. Unless one of the other three got mixed in.

The changes the AI made that you didn't ask for

These are the ones that matter. You asked for one thing. The AI made the change you wanted, then rewrote three other things along the way to "improve" them or "make sure it works everywhere."

Some of those rewrites were necessary. Some weren't. The AI doesn't tell you the difference. It tells you the same thing it always tells you: the implementation is clean, best practices applied. When you scroll back through the conversation, you can't tell which of the four files it touched needed touching, and which ones it touched because it was being thorough in a way you didn't ask for.

This is where most "I just changed one thing" disasters live. You did just change one thing. The AI changed four. Neither of you was tracking which.

The change you accepted on Monday and forgot about by Friday

You asked the AI to fix something on Monday. It worked. You moved on. By Wednesday, you'd forgotten which file the fix was in. By Friday, you'd forgotten the fix existed.

Now it's the next Tuesday and something is broken. You can't tell if today's break is connected to last Monday's fix, because last Monday's fix is no longer in the part of your memory that thinks about the project. It's gone.

This isn't a personal failing. The brain isn't built to keep eight days of half-understood changes in working memory. We talk to founders who can't remember what the AI did yesterday, let alone last week. The thing that's supposed to keep that memory for you is a clear record of the changes. You probably don't have one of those that you can read.

The thing that was breaking quietly all along

Some things have been wrong since the second week of the project. They didn't fail loudly. They failed in a way nobody noticed. The bug at 11 PM tonight isn't a new break. It's an old break that finally got loud, because something that changed today pushed the quiet failure across a line.

This is the worst category, because the timeline lies. The thing that broke today wasn't caused by today. It was caused by something that happened six weeks ago. Nobody is going to find that by looking at today's changes.

Why "let's just undo it" doesn't usually work

Rollback sounds like the lifeline. In a healthy project, it is. You point at a moment in the past where things worked, and you go back to it. Done.

In an app built quickly with AI, rollback is harder, for three reasons.

The first is that you might not have a back worth going to. If the AI has been editing the project directly, in chunks you didn't review carefully, the version of the project from yesterday morning might not be a version you trust either. You'd be going back to a state that was already half-broken in ways you didn't know about. That isn't safer. It's just earlier.

The second is that you've been using the app since the change. Customers have signed up. Things have been written to your database. Even if you could roll the code back cleanly, the data the new code wrote isn't going anywhere. The two halves don't match. You'd be running old code against new data, and that's a different kind of broken.

The third is that the change you'd most like to undo isn't always findable. It's not the change you remember asking for. It's one of the silent ones the AI made along the way, and you don't know which one. The back button doesn't know either.

So you sit there at 11 PM with the AI offering a fourth fix, and rollback is technically there, and technically not the same as a rescue.

What helps when you're already in it

If you're inside the moment right now, with the AI typing, here is the order that usually helps.

Stop accepting changes. Every change the AI suggests while you don't know what's broken makes the picture noisier. The first move is to freeze the picture, even if the picture is bad.

Write down, in plain English, what was working an hour ago and what isn't working now. Not for the AI. For you. The act of writing slows the panic and shows you which question you actually need to ask. Half the time, the act of writing exposes that the thing you thought broke isn't the thing that broke.

Stop asking the AI to fix it. Start asking the AI to explain what it changed. Ask for the list of files it touched in the last hour. Ask what each change did. The AI is much better at this question than you'd think, and much worse at "fix it" than it sounds.
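You can also get that same answer without trusting the AI's memory, by reading the project's own record. A minimal sketch, assuming the history lives in git (an assumption, but many AI coding tools keep a git history underneath, even when it's hidden); these commands only read, they change nothing:

```shell
# Assumes the project's history lives in git (check your tool's docs).
# Run from your project folder. These commands only read; they change nothing.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  git status --short                                    # edits not yet saved to history
  git log --since="1 hour ago" --oneline --name-status  # saved changes from the last hour, file by file
else
  echo "No git history found in this folder."
fi
```

If the AI's list of "files I touched" doesn't match this list, you've learned something more useful than any fix.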

If the explanation doesn't add up, that's the moment to stop. Maybe the AI insists it didn't touch a file you can see has changed. Maybe it can't account for a piece you remember accepting. Either way, that's the time to bring in someone who can read what the AI wrote. The cost of one bad fix layered onto another bad fix passes the cost of a phone call somewhere around the third one.

Instead of "Why is this broken?", ask "What changed since it was last working?"
Instead of "Can you fix it?", ask "Can you list every file you've touched in the last hour, and what each change did?"
Instead of "Roll back to last week," try "Show me what was different about this morning's version, before the change."
Instead of "Is this safe to put live now?", ask "What would I lose if I went back to this morning's version?"

What helps before the next 11 PM

There's no preventing the moment entirely. Software breaks. Real users do things nobody planned for. We've written about what "ready for real users" actually means, and one of the quiet truths of that list is that strangers will eventually find a thing your app does wrong.

What you can do is build a picture of your project that you can read at 11 PM, in the dark, without help. A few questions, asked once when nothing is on fire, that pay back the first time something is.

Where does the project's history live, and can you actually open it? Most AI tools save a record of changes somewhere. Knowing where, and being able to look at the list of changes from the last week, is the difference between "what just changed?" being a guess and being a question with an answer.
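If that record turns out to be git (a common setup, but an assumption worth checking against your tool's docs), reading the last week looks like this:

```shell
# A sketch of "can you actually open the history", assuming git;
# your AI tool may keep its record somewhere else, so check its docs.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  git log --since="1 week ago" --oneline --stat  # this week's changes, with the files each one touched
else
  echo "No history here. Find where your tool keeps its record before the next 11 PM."
fi
```

Try it once on a calm afternoon. If the output means nothing to you, that's worth knowing now rather than at 11 PM.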

Are the changes labeled in a way you can read? "Updated app.tsx" tells you nothing six weeks later. "Moved signup button to top of page" tells you everything. Whoever or whatever is saving the changes (you, the AI, your hosting tool) should be saving them with a sentence that means something.

What's the last version that you remember being good? Not the last version that loaded. The last version that you'd be okay handing to a new customer right now. Knowing the answer to that question is half the rescue. Most founders we talk to don't know.
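If your project is in git, you can give that known-good version a name the moment you decide on it, so that "back" points somewhere specific. A sketch, with "last-good" as an illustrative tag name rather than a convention any tool requires:

```shell
# A sketch of naming the last version you trust, assuming git.
# "last-good" is an illustrative tag name, not anything your tool requires.
# Run from your project folder, at the moment you've checked things by hand.
git tag -f -a last-good -m "Dashboard and signup both checked by hand"

# Later, when something breaks, list everything that has changed since that point:
git diff --stat last-good
```

The tag costs nothing to keep and moves only when you move it, which is the point: "the last good version" stops being a memory and becomes a name you can compare against.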

If those questions don't have answers, that's the same ceiling we wrote about in "When vibe coding stops paying off." The 11 PM moment is one of the ways the ceiling makes itself known.

What we do on a session

If you're reading this at 11 PM and things are still breaking, you can book a session for the morning. Most of these moments don't need a midnight rescue. They need a calmer second look in daylight, with someone who can read what the AI wrote and tell you what actually changed and what didn't.

You share your screen. We open the project together. We don't read every line. We look at what the AI has touched in the last week, name the changes that made it in by accident, and tell you which ones are likely behind tonight's break and which are noise. By the end, you have a picture of your own project that you can hold in your head. A picture the AI can't give you, because the AI is the thing the picture has to make sense of.

The fix you most want at 11 PM isn't another fix. It's the answer to what changed. The AI built the thing. It can't read the history of building it. That's the part somebody else has to do.

When the app breaks, the question isn't what's wrong. It's what's different.