There's a prevailing idea when it comes to startups and building and running your own business.

The idea that to be successful, you need to work hard, put in long hours, and push your team to the limit as well.

Keeping up with the competition, keeping your customers happy, your investors too, and trying everything you can to turn your business venture into a success, however that is defined.

Some companies even go as far as advertising it as normal to take your work everywhere you go: to the park, to your kids' soccer game, maybe even to the pub.

I've fallen into this trap myself, putting in 10-12 hours per day, working from home with my family around. My family is understanding, but that doesn't justify these kinds of working hours.

As someone working on a product that's used around the globe and at every hour of the day, I can relate to this idea. When production is broken, it's handy to have a computer nearby to respond. When a customer is having trouble, I want to help them. I'm used to taking my computer with me, even on weekends.

Adding to that, with many customers only coming online towards the end of the business day in Europe, our support load usually ramps up in the later hours, when customers come into our live chat and expect someone to help them with their problem.

Helping customers succeed is one of the most important purposes of a business, and we're trying as best as we can to help them out.

But that thought drove me into a habit that's hard to break free from. It's the fear that there could be a new customer support issue every day, that there could be a new customer in the live chat who needs help.

This very habit has kept me on the computer from the morning to the evening hours, always waiting for someone to approach us with an issue.

It's a habit that's been having a very destructive effect on my work and my life, and the two are not the same.

A few weeks back, as our team kept growing, I came to the realization that working longer hours sets a bad example, not just for myself: it sets an implicit expectation that others on the team should work just as long. It's poison for a team when even one person, in particular a manager, works longer hours. It gives the impression that it's normal and expected to work longer than what your contract says.

It wears people out, it's worn me out, on multiple occasions.

We recently started doing support rotation, where everyone gets a dedicated day of doing customer support, escalating to others on the team where necessary. That gives the support person the freedom to only focus on customers for the entire day, and it gives the rest of the team the ability to focus on getting other work done.

It's been effective for us, and it's already improved our own tooling, our user interface and our documentation.

But I still can't shake the habit of always wanting to help. You can say that it's a noble habit to have, but its downsides are starting to show.

There's a quote from Small Giants that stuck with me:

"For all the extraordinary service and enlightened hospitality that the small giants offer, what really sets them apart is their belief that the customer comes second."

A company's team is what sets it apart. That team needs to be well rested to make for happy customers.

The only way to get them to do that is to discourage long working hours.

In our company, that starts with me.

Success doesn't depend on how much you work; it depends on focusing your time where it matters most.

Don't work too hard.

Tags: work, smallbiz

For the last two years, I've been working on Travis CI, a hosted continuous integration and deployment platform. It started out as a free service for open source projects on GitHub, but has since evolved into a hosted product for private projects as well.

The fascinating bit for me was, right from the start, that the entire platform, inarguably an infrastructure product, is built and runs on top of other infrastructure products.

Travis CI, for the most part, runs on infrastructure managed and operated by other people.

Most of the code runs on Heroku, our RabbitMQ is hosted by CloudAMQP, our database is run by Heroku Postgres, our build servers are managed by Blue Box, our logs go into Papertrail, our metrics to Librato, our alerts come from OpsGenie, our status page is hosted on StatusPage.io, even our Chef server, the one bit that we use to customize some of our servers, is hosted.

In a recent Hangops episode, we talked about buying vs. building infrastructure. I thought it was well worth elaborating on why we chose to build Travis CI on top of other infrastructure products rather than build and automate these things ourselves.

Operational Expenses vs. Time Spent Building

The most obvious reason why you'd want to buy rather than build is to save time.

In a young company trying to build out a product, anything that saves you time but adds value is worth paying for.

You can build anything, if you have the time to do it.

This is an important trade-off. You're spending money, usually on a monthly basis, to use a service rather than spending the time to build it yourself.

A status page is a classic example. Surely it should be doable to build it myself in just a few days, yes?

But then, to be really useful, your custom status page needs an API, for easy interaction with your Hubot. Maybe you also want a way to integrate metrics from external sources, and you want it to include things like scheduled maintenances.

On top of that, (hopefully) a pretty user interface.
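
Even the API portion alone is real work. Just as a rough sketch to get a feel for the scope (purely illustrative, the routes and fields are made up, and I'm picking Flask arbitrarily), a bare-bones incident API might look something like this:

```python
# A bare-bones sketch of a hypothetical status page API (Flask).
# Routes and fields are made up for illustration; a real status page
# also needs authentication, persistence, external metrics, and a UI.
from datetime import datetime, timezone

from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory store; a real implementation needs a database.
incidents = []


@app.route("/api/status")
def status():
    open_incidents = [i for i in incidents if i["resolved_at"] is None]
    return jsonify({
        "status": "degraded" if open_incidents else "operational",
        "open_incidents": open_incidents,
    })


@app.route("/api/incidents", methods=["POST"])
def create_incident():
    # This is the endpoint a chat bot like Hubot would post to.
    data = request.get_json()
    incident = {
        "id": len(incidents) + 1,
        "title": data["title"],
        "created_at": datetime.now(timezone.utc).isoformat(),
        "resolved_at": None,
    }
    incidents.append(incident)
    return jsonify(incident), 201


if __name__ == "__main__":
    app.run()
```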

That's the better part of two weeks, if not more. On top of that, you need to run it in production too. It's one more distraction from running your core product.

Other people may be able to do this a lot better than you. They help you free up time to work on things that are relevant to your products and your customers.

In return, you pay a monthly fee.

Surely, you say, building it yourself is practically free compared to a monthly fee, isn't it?

Your time is very valuable. It's more valuable spent on your own product than on building other things around it.

The problem with time spent building things is that you can't put a number on it. You're basically comparing a monthly fee to what looks like a big fat zero. Because heck, you built it yourself, it didn't cost anything.

This is the classic tradeoff of using paid services. You could very well build it yourself, but you could also spend your time on other things. You could also use and run an open source solution, but that too needs to be maintained, operated and upgraded.

If this sounds one-sided, that's unintentional. I have a history of building things myself, racking my own servers, provisioning them myself.

But there are things to keep in mind when it comes to spending time building things. There's a non-zero cost attached to this, it's just not visible as the monthly invoice you're getting from a service. That cost is hard to fathom as it's hard to put a numeric value on the time spent building it.

When you have the resources and can afford to, it makes sense to start pulling things in-house.

For us, not having to take care of a big chunk of our infrastructure ourselves is a big benefit, allowing us to focus on the more relevant bits.

But letting other folks run core parts of your infrastructure doesn't come without risks either.

Risks of Downtime and Maintenance

When you buy into the idea of someone else maintaining more or less vital parts of your infrastructure, there's a risk involved.

You're bound to any problems they might have with their infrastructure, with their code. In multi-tenant systems, any operational issues tend to ripple through the system and affect several customers rather than just one.

You're also bound to their maintenance schedules. Amazon's RDS service, to give one example, allows you to specify a maintenance window for your database instances through their API.
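
Setting that window is a small API call. A quick sketch using boto3, with the region, instance identifier, and window as placeholders:

```python
# Sketch: picking an RDS maintenance window via boto3. The region, instance
# identifier, and window are placeholders; RDS expects the window in UTC,
# in ddd:hh24:mi-ddd:hh24:mi format.
import boto3

rds = boto3.client("rds", region_name="us-east-1")

rds.modify_db_instance(
    DBInstanceIdentifier="my-db-instance",
    PreferredMaintenanceWindow="sun:05:00-sun:05:30",
)
```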

The full risk of how this affects your own application is hard, if not impossible, to calculate.

A part of your infrastructure could go down at any time, and it's mostly out of your hands to respond to it. What you can and should do is harden your code to work around such outages, if at all possible.
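
What that hardening looks like depends on the service, but the most basic building block is retrying with backoff instead of failing on the first hiccup. A minimal sketch, with made-up limits and a hypothetical publish_message call in the usage comment:

```python
# Sketch: retry a call to a flaky external service with exponential backoff.
# The retry limits are arbitrary, and the wrapped operation must be safe to
# retry (idempotent) for this to be a good idea.
import random
import time


def with_retries(operation, attempts=5, base_delay=0.5):
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of retries, let it bubble up to alerting
            # Exponential backoff with a little jitter so clients don't
            # all hammer the recovering service in lockstep.
            time.sleep(base_delay * 2 ** (attempt - 1) + random.random() * 0.1)


# Usage: wrap the call that talks to the hosted service, e.g.
# with_retries(lambda: publish_message(queue, payload))
```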

One question to ask is how vital this particular piece of infrastructure is to your application and therefore, your business.

If it's in the critical path, if it affects your application significantly when it goes down, there are options. Not every managed service is exclusively multi-tenant. Some offer the ability to have dedicated but managed setups. Some even offer high availability capabilities to reduce the impact of single nodes going down.

Both our PostgreSQL database and our RabbitMQ setup are critical parts of Travis CI. Without the database, we can't store or read any data. Without our message queue, we can't push build logs and build jobs through the system, effectively leaving the system unable to run any tests for our customers.

We started out on multi-tenant setups for both. On our PostgreSQL database, the load was eventually way too high for the small size of the database setup.

For our RabbitMQ, we were easily impacted by other clients in the system. RabbitMQ in particular can be gnarly to work with when lots of clients share the same cluster. One client producing an unusual amount of messages can bring everyone else in the system to a grinding halt.

Eventually, we ran both parts on dedicated infrastructure, but still fully managed. There's still a chance of things going down, of course. But the impact is less than if an entire multi-tenant cluster goes down.

Putting parts that were in the critical path on dedicated infrastructure has been working well for us. The costs certainly went up, but we just couldn't keep making excuses for why Travis CI was down.

When it comes to buying into other people running your infrastructure, don't be afraid to ask how they manage it. Do they have a status page that is actively used? How do they handle downtimes?

Operational openness is important when other people manage parts of your infrastructure.

It's inevitable that something bad will happen in their infrastructure that affects you. How they deal with these scenarios is what's relevant.

Security and Privacy

With multi-tenant infrastructure, you're confronted with curious challenges, and they can affect you in ways that only studying your local laws and the provider's terms of service will fully reveal.

Security and privacy are two big issues you need to think about when entrusting your data to a third party. The recent MongoHQ security incident has brought up this issue in an unprecedented way, and we've had our own issues with security in the past.

Note that these issues could come up just the same when you're running your own infrastructure. But just like outages, security and privacy breaches can have much wider-ranging ripple effects on multi-tenant infrastructure.

How can you best handle this? Encrypting your data is one way to approach the situation. Encrypt anything that's confidential, anything you want to protect with at least one small extra layer of security to reduce the attack surface on it.

We encrypt SSH keys and OAuth tokens, the most private data that's entrusted to our systems. Of course, the encryption keys aren't stored in the database.
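
As a rough illustration of the pattern, not our actual implementation: symmetric encryption where the key lives in the application's environment or a separate secrets store, here using the Python cryptography library's Fernet (the environment variable name is made up):

```python
# Sketch: encrypt a secret before it touches the database. The key comes from
# the application's environment (or a separate secrets store), never from the
# database itself. Illustrative only, not Travis CI's actual implementation.
import os

from cryptography.fernet import Fernet

# Generated once with Fernet.generate_key() and injected via the environment.
fernet = Fernet(os.environ["TOKEN_ENCRYPTION_KEY"])


def encrypt_token(token: str) -> bytes:
    return fernet.encrypt(token.encode())


def decrypt_token(ciphertext: bytes) -> str:
    return fernet.decrypt(ciphertext).decode()
```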

When buying infrastructure rather than building it, keep a good eye on what your providers do and how they handle security and your data. This is just as important as handling outages, if not even more so.

Make sure that your privacy/security statements reflect which services you're using and how you handle your customers' data with them. It may not sound like much, but transparency goes a long way.

One unfortunate downside of infrastructure services, Heroku add-ons come to mind, is the lack of fine-grained access privileges. Only some of the add-ons we use allow us to create separate user accounts with separate permissions.

It's one of the downsides of the convenience of just having a URL added to your application's environment and starting to use an add-on right away.

Judging the impact of the trade-off is, again, up to you. Sometimes convenience trumps security, but other times (most times?), security is more important than convenience.

Your users' data is important to your users, so it should be just as important to you.

Scaling Up and Out

We started out small, with just a few Heroku dynos and a small database setup, a shared RabbitMQ setup to boot.

In fact, initially Travis CI ran on just one dyno, then two, then just a few more when a second application was split out.

This worked up to a few thousand tests per day. But as we scaled up, that wasn't sufficient.

I was sceptical at first about whether we could scale up while remaining on managed infrastructure rather than building our own. Almost two years later, it's still working quite well.

Important bits have been moved to dedicated setups: the databases (we have four clusters, eight database servers in total) and our RabbitMQ service, which we needed to move to a cluster setup.

Most hosted services give you means to scale up. For Heroku apps, you add more dynos, or you increase the capacity of a single dyno.

For their databases (or Amazon RDS, for that matter), you upgrade the underlying server, simple enough to do. For RabbitMQ, you go for a bigger plan that gives you more dedicated resources, higher throughput, and the like.
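
For instance, scaling a Heroku process type up is one CLI command or one call against their Platform API. A sketch of the latter, with the app name, process type, and sizes as placeholders:

```python
# Sketch: scaling a Heroku process type via the Platform API. The app name,
# process type, quantity, and size are placeholders; in practice you'd just
# run `heroku ps:scale web=4` from the CLI.
import os

import requests

resp = requests.patch(
    "https://api.heroku.com/apps/my-app/formation/web",
    headers={
        "Accept": "application/vnd.heroku+json; version=3",
        "Authorization": f"Bearer {os.environ['HEROKU_API_TOKEN']}",
    },
    json={"quantity": 4, "size": "standard-2x"},
)
resp.raise_for_status()
```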

Figuring out the limits of hosted infrastructure services is hard. If you send a log service, even by mistake, thousands of messages per second, how do they respond? Can they handle it?

Only one way to find out: ask them!

With most of the bits that we need to scale out, we're confident that hosted services will give us the means to do so for quite some time. After that, we can still talk to them and figure out what we can do beyond their normal offerings.

Scaling up is a challenge, as Joe Ruscio put it on the aforementioned Hangops episode: "Scaling is always violent."

It was violent on occasion for us as well.

We may need more dedicated bits in the future for specialized use, things like ZooKeeper for distributed consensus. But most of our tools are still running nicely on hosted infrastructure.

Operational Insight

One thing that bugged me early on about a few of our core services was the lack of operational insight.

With infrastructure beyond your control, getting insight into what's happening can be challenging.

We had to ask Heroku support quite a few times for insight into our database host machine. For figuring out whether or not an upgrade to a larger plan or instance is required, this can be essential. It certainly was for us. This situation has been improving, and from what I've heard it will improve even more in the future.

But for an infrastructure provider, offering this kind of insight can also be challenging. Heroku Postgres has improved quite a lot, and we now get better insight into what's happening in our database thanks to datascope and their means of dumping metrics into the logs, which you can then aggregate with a service like Librato.
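
The convention itself is simple: the application prints structured key/value measurements to its log output, and the drain aggregates them. A rough sketch of what emitting such l2met-style lines might look like, with made-up metric names:

```python
# Sketch: emitting l2met-style metrics to the logs, to be aggregated by a log
# drain such as Librato's Heroku integration. The metric names are made up.
import time


def timed(name, operation):
    """Run operation and print a counter plus a timing measurement for it."""
    start = time.time()
    result = operation()
    elapsed_ms = (time.time() - start) * 1000
    # key#name=value pairs on a single log line, picked up by the drain.
    print(f"count#{name}.calls=1 measure#{name}.time={elapsed_ms:.1f}ms",
          flush=True)
    return result


# Usage, e.g.: user = timed("db.find_user", lambda: find_user(user_id))
```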

Most providers have great people working for them. When in doubt, ask them about anything that's on your mind. The services we work with are usually very helpful and insightful. The Heroku Postgres team is a knowledge goldmine in itself.

There's a curious assumption out there, and it's time to put that assumption to rest for good.

It's the assumption that things should be free while they're in beta.

That assumption is made on both ends, your customers think it should be free, and you think your product should be free too.

It's bollocks. You're hurting your business by going down this path, and, good intentions aside, so are your customers.

Free customers don't validate a business, they don't validate a product. Only when customers are willing to open their wallet for your product will you start to have validation that you might be on to something.

Offering a product for free while it's in beta implies the assumption that the product doesn't provide any value while it's in beta. If your product doesn't provide any value while it's in beta, stop what you're working on right now.

You think your product doesn't provide any value while it's in beta, and your customers don't think it does either. At least that's how the assumption goes.

Are you sure that you've verified that there's a market, an audience, for your product that needs this problem solved? Is there more research you can do? Could you talk to one of the folks trying out your product?

If you can be sure that your product provides value (and this is why serving businesses is a much better idea than serving consumers), why are you not charging for it?

Does the label "beta" keep you from charging money? Remove it.

The sooner you start charging, the sooner two things will happen:

  • You know that your product has value to the customers who are willing to pay you for it.
  • Your business has a much higher likelihood of succeeding. Early validation gives you confidence that it's worth continuing with the product.

When we started out with our product version of Travis CI, we took a few months to develop what we considered to be the minimum features we could ship and throw at potential customers.

We launched our initial private beta in July 2012. We started charging in September 2012. But, and this is where we could've done better, we didn't enforce charging. If people didn't want to sign up for a paid plan, they could continue using the product for free.

Meanwhile, we manually emailed dozens of customers, asking them to sign up for a subscription and for feedback at the same time. This was a long process, and I wish we'd automated it sooner.

But it gave us a chance to see what people liked, what they didn't like. At the end of the manual process we had 100 paying customers. More than enough to validate that we were on to something.

A beta doesn't yet make a product, let alone a business.

But when you start charging early, and you find customers paying for it, you have a tiny validation that it's valuable to some. That can be a much more powerful realization and boost than trying to crank out new features to make free users happy.

The best way to validate a product is to get a customer to pay for it, then 10 customers, 100 customers.

Tags: smallbiz