Recently, I've been thinking a lot about failure, my daughter, risk and punishment, and the whole culture that has evolved around trying to avoid failure, pointing fingers, or putting blame elsewhere.

Simplest example: my daughter spills something over the table. What's the first reaction? Scolding or punishment of sorts. I'm guilty as charged. I read something pretty simple and wonderful recently, a very short read titled "Father Forgets".

That read got me thinking: why do we tend to punish failure immediately? It's not just something to do with our kids, it's human nature. We tend to put blame elsewhere, we tend to get defensive because people turn to us to fix a problem, when something is broken in production, for example.

Why can't we instead make failure a part of our culture? Not just at home, with our kids, but in our workplace?

As soon as people feel like they need to get defensive, or they're blamed for a problem that occurred due to a recent change of theirs, negativity hits everyone on the team. It's hard to stay calm, it's hard to stay focused on what really matters: that something is broken in production, affecting your customers.

As soon as people feel threatened or pressured, they get defensive or they feel down because some of their own code broke something. Their vision is clouded. Finding the problem's cause and implementing a solution is suddenly just a blur, something that's hard to focus on. Even though that's what really matters.

When people feel like failure is not an option, they'll stop taking risks. When people stop taking risks, your team and your company are doomed; innovation comes to a grinding halt. Most of us are in the lucky position that lives don't depend on our work. We can try new things, iterate quickly, discard or improve them.

If my daughter doesn't take any risks because I keep punishing or scolding her, she might just stop trying altogether. The analogy is an odd one, but there's a striking similarity.

If a problem comes up, you fix it, you learn your lesson, you make sure it doesn't happen again, you move on. It can be that simple. When everyone on the team feels like failure is an accepted part of running an application, fixing problems together as they occur becomes a lot easier.

In the end, it's not a question of if something breaks, but of when it breaks. And the answer is: all the time. Great teams focus on what matters in these situations: being ready when something breaks, and resolving it as well as they can.

Embrace outages, the most common failure of our craft. Take a deep breath, tune out distractions (including managers) and try to find joy in digging through data and finding what's causing a problem. Turn it from a seemingly frustrating experience into a personal challenge. You find the problem, you fix it, you make customers happy again. Rinse, repeat.

Failure is cool.