Services like Heroku, Elastic Beanstalk, and the rest are great tools for “easy” and “fast” web application deployment. While I can’t deny the benefits of these PaaS providers I think that some of their users are doing a disservice to themselves by choosing to use them. When PaaS services like Heroku get used as a developer’s first introduction to application deployment it can have a lasting, sometimes negative impact on their growth trajectory.
Learning the hard way
I’ve always learned programming the hard way and I highly recommend it. I avoided shortcuts like frameworks and PaaS providers because I wanted to understand the entire web application stack from request all the way through the network and back to the response. The thing is, you don’t need to know everything inside and out to “get shit done”. Rails experts are valuable and Heroku gurus are valuable even if they’re not able to build a simple “Hello World” Rack app or configure Nginx to sit in front of a Node.js app as a reverse proxy. I’ve learned to be okay with not being perfect. I have to admit that deep inside I used to look down my nose at developers who code Rails but not Ruby and programmers who deploy to Heroku instead of setting up a server themselves. What changed my opinion was my experience teaching these concepts as a professional programming instructor. There is so much to know that sometimes you need to use a shortcut to teach a concept so you don’t get sucked into all the details that a student just isn’t ready for yet. In fact, I think that these shortcuts, when used as training wheels, can help skyrocket a new junior developer’s skill set, taking them from 0 to 60 in weeks rather than years.
Training Wheels or TBI
By abstracting away the basic concepts of deployment you turn hosting into a black box. Git commits go in and running web applications come out. This can be as helpful to new developers as training wheels are to a kid first learning to ride a bike or it can be like giving a developer a traumatic brain injury (TBI).
Training Wheels for Deployment
To a new developer, servers are a black box in the request response cycle. Most of them think of the request/response cycle like this:
“Client sends HTTP request to a host server, the server runs Ruby|Python|Node.js files and sends back an HTTP response with HTML in the body.”
This kind of thinking perpetuates the Black Box server mentality. It’s thought of as a computer that takes HTTP in and sends HTTP back out. I call this “PHP Syndrome” as it encourages you to think of requests being routed to files when really they’re being routed to programs that have their own specialized routing functionality. I’d expect such a simple explanation from a student but not from any developer who ever even dreams of removing that “Junior” role from their title. But that’s the entire point of Heroku and services like it. They abstract away the complexities of what happens inside a server and turn them into black boxes so we don’t have to worry about what’s going on in them… until we need to know what’s going on in there because “shit, it won’t deploy without throwing cryptic errors” again.
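To make the “routed to programs, not files” point concrete, here is a minimal sketch of the kind of routing table a web app carries around inside itself. The handler names and the `<id>` placeholder syntax are made up for illustration and don’t come from any particular framework.

```python
# Hypothetical handlers; names and return shape are illustrative only.
def home(params):
    return 200, "Welcome"


def show_user(params):
    return 200, f"User {params['id']}"


# The routing table lives inside the program, not in the filesystem.
ROUTES = [
    ("GET", (), home),
    ("GET", ("users", "<id>"), show_user),
]


def match(pattern, segments):
    """Return captured params if the path segments fit the pattern, else None."""
    if len(pattern) != len(segments):
        return None
    params = {}
    for expected, actual in zip(pattern, segments):
        if expected.startswith("<") and expected.endswith(">"):
            params[expected[1:-1]] = actual  # capture a dynamic segment
        elif expected != actual:
            return None
    return params


def dispatch(method, path):
    """Pick a handler based on method and path; no file lookup is involved."""
    segments = tuple(s for s in path.split("/") if s)
    for route_method, pattern, handler in ROUTES:
        if route_method != method:
            continue
        params = match(pattern, segments)
        if params is not None:
            return handler(params)
    return 404, "Not Found"
```

Calling `dispatch("GET", "/users/42")` runs `show_user` even though there is no `users/42` file anywhere on disk. Real frameworks do the same thing with far more features.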
TBIs for Developers
When using a Heroku type service you can choose to keep using it because that’s just the way you were shown or because it’s easy. If you choose this path it’s a self-inflicted wound. You’ll never understand what’s going on inside of this black box deployment process beyond the details of what the service tells you. You don’t know how to deploy web apps, you know how to deploy to Heroku. There’s a difference. It’s the same difference between Ruby developers and Rails developers.
That said, if you’re naturally curious – as you should be if you want to write software for a living – then you’re wondering what’s happening in these black boxes. So hopefully one day you spin up a VPS and start setting up a secure server complete with Nginx or equivalent reverse proxy and you learn to configure it yourself. You understand that running web applications online means your requests get sent to a reverse proxy that sends your requests to an app running on a local port or unix socket. Then you have your glue software like Rack for Ruby that acts as the go-between for the language runtime (Node, Python, Ruby, etc.) and your web server (Apache, Nginx, lighttpd a.k.a. “Lighty”, etc.) and you need to write apps that are compatible with that spec. At this point you won’t think of servers as these black boxes. Remember the first time you understood MVC and how data flowed through your application? Well that “aha moment” is going to happen again when you deploy your first app to a server manually.
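Here is roughly what that glue looks like in Python, where WSGI plays the role Rack plays for Ruby. Treat it as a sketch: the port number and the `serve` helper are my own choices for the example, and in production you’d normally run the app under a dedicated WSGI server rather than the standard library’s reference server.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """A minimal WSGI application: the callable shape the spec requires."""
    body = b"Hello World\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(host="127.0.0.1", port=8000):
    """Listen on the loopback interface only; port 8000 is an arbitrary choice.

    A reverse proxy on the same machine would forward public traffic here.
    """
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `serve()` starts the app on a local port that nothing outside the machine can reach directly. A reverse proxy such as Nginx would then forward public requests to `http://127.0.0.1:8000` with its `proxy_pass` directive.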
Choose your choice
So will you stunt your growth or use PaaS as training wheels? The choice is up to you. Learning to build and run a web application locally is hard for a newbie. For these new devs a Heroku-style PaaS solution is perfect. It lets them focus on the new skills they’ve learned without having to understand too much about deployment all at once. Once you’ve mastered the building of web applications then it’s time to get your hands dirty with servers. Get a Linode or DigitalOcean account, spin up a new VM and start setting it up for deployment. This process is harder the first time around but it opens up that black box and exposes what’s been hidden from you with that git push heroku master command. It’s like popping the hood of your car and changing the oil for the first time. You’re looking around frantically for the right wrenches and tools. You take a stab at replacing the oil filter but damn is that thing jammed in there and hard to get a good grasp on. But then, after a while you suddenly get it and realize it’s not as hard as you thought it was. From now on you’re going to change your own oil. Why pay a mechanic a premium to do something that’s accessible and easy for someone who’s not even an expert in oil changes to do? Then maybe one day you get a raise at work and can afford to have someone work on your car for you. You know what needs fixing and how things work so you’re comfortable paying a premium and know how not to get ripped off with extra services you don’t necessarily need.
Why PaaS is a good thing
PaaS (platform as a service) providers allow you to do things like outsource your database management (Mongolab, Compose.io) or make deploying web apps written in languages like Ruby or Python as simple as running a quick git push from your deployment branch.
This means developers don’t need to worry about provisioning servers or security, or spend any time setting up deployment scripts. It truly is easy. What’s hard is justifying the cost and deciding whether making compromises in your application code to fit your hosting provider’s constraints is not only worth the effort but acceptable.
PaaS is also a bad thing
The problem with these services is the very things that make them so great. Having a server abstracted away to the point where it’s just a simple machine that accepts HTTP requests and sends back HTTP responses makes life easy for me, the lead or senior developer on a team, but it makes life a living hell for the poor junior dev who got suckered into “easy deployment”, found out there’s a package he forgot to install or some other seemingly minor error, and learned the hard way that these simple mistakes take hours to correct. Add to that the fact that deploying through PaaS is not the same as deploying to your own server. Every service has its own unique quirks and specs that you need to adhere to in order to get an app running, while maintaining your own server gives you the kind of freedom a lot of us crave. Not only that but running your own server is a standardized process. Setting up a reverse proxy to your app is something that doesn’t change from server to server. The private server model never changes but the process from PaaS provider to PaaS provider is always different.
By automatically provisioning secure servers, these PaaS providers provide a false sense of security. Sure, your Heroku dyno may be hardened so that attackers can’t get into the virtual machine itself, but that doesn’t stop you from writing insecure code. They aren’t providing SQL injection protection, preventing session hijacking, or adding CSRF tokens for you. If you don’t know what any of those three things are then please at least read the OWASP Top Ten articles.
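SQL injection is a good example of a problem no hosting provider can fix for you, because it lives in your own code. The sketch below uses Python’s built-in `sqlite3` module with a throwaway in-memory database. The `users` table and the function names are invented for the demonstration.

```python
import sqlite3


def demo_connection():
    """Build a throwaway in-memory database with two example users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: user input is pasted straight into the SQL text.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the value is passed separately from the SQL text.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```

Passing the string `x' OR '1'='1` to `find_user_unsafe` returns every row in the table, while `find_user_safe` treats the same string as a literal name and returns nothing. It makes no difference whether that code runs on a hardened dyno or on your own VPS.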
Your application’s security is still left up to you in the end. Now, if you run your own server the security is left up to you. There are plenty of reliable guides out there that’ll help you harden a server. Both the Linode and DigitalOcean docs are great resources with reliable information to help secure your server. That said, security is a whole other beast. It’s something you can’t dabble in. You can’t be a full-stack developer that’s also a security expert. My point is that although you have a chance to harden your server, there are definitely going to be things that slip through the cracks. So there’s a tradeoff in application security when choosing PaaS or your own server. With PaaS you only need to worry about your application code’s security. With a private server you have to worry about both your app code’s security and the security of the server it’s running on.
Deployment made complex
For the remainder of this article I’m going to focus on just Heroku as it’s the gold standard for PaaS.
Heroku and their ilk advertise their deployment process as simple. “You simply git push and your app is deployed” they say but that’s not the case. Part of deployment with a PaaS provider is simply shifting the deployment setup steps from what you’d normally do to something that’s equally complex but hidden behind the scenes. Let’s take Heroku’s deployment process for instance. To deploy a Ruby application on Heroku you need to add a couple of specialized gems to your project Gemfile and you’re basically forced to use Postgres as your database unless you want to pay more money and jump through some extra hoops to use MySQL or something else.
Last fall I taught a back-end web development course at General Assembly. The students had to deploy their apps to Heroku as part of their final project. At least half of the class had deployment errors. They followed the instructions to the T and still, we had to debug the Heroku deployment process anyway. So what do you do when something goes wrong in a web app? You check the logs. So how do you do that on Heroku? Well, it isn’t difficult but you have to learn a whole new way of interacting with your “server” which isn’t actually a server at all in the way we think of bare metal or VPSes. That’s not a problem by itself. The problem is that instead of learning how to deploy web applications you’re actually learning how to deploy your code to Heroku which is a totally different thing.
Learning new things and push-to-deploy solutions aren’t a bad thing by themselves. The issue is that without a background in servers, a student or user of one of these PaaS platforms is learning “the Heroku way”. It’s basically vendor lock-in.
Your code mostly runs…
On Heroku for instance, your database server and your web application are actually on separate machines. And you know all that security they provide for you? Well good luck doing basic stuff like sending AJAX requests from your server. Heroku and other services lock down their servers to the point where stuff that works like a charm locally or on a VPS will require a lot of work to get running on their service. You can tail the logs all day long too but don’t expect them to contain any actionable information. If you were running your own server you’d know that there are at least three different log files you can check when things go wrong. With Heroku, you get the terminal output you’d get for requests when running your app on localhost. That’s not very helpful in a production environment.
You want custom? That’ll be an extra $10/month each, please
Customizations require you to do things the platform’s way rather than in a standard way. SSL certificates are a great example. You can buy SSL as an add-on but then good luck setting it up and running it. You’re locked into their way of doing it and it costs more than if you were to simply buy and install an SSL certificate yourself (Namecheap has the cheapest SSL certificates out there but you can even get them for free with Let’s Encrypt or StartSSL, not to mention self-signed certificates which often have perfectly valid use cases).
Scaling is easy!
Scaling is never easy. Once again it’s the same issues we’ve been talking about before: money and vendor lock-in. Scaling with Heroku actually does seem to be easy. Apparently you just add additional “dynos” and your application will be load balanced across multiple servers automatically. But before we jump in head first, let’s ask ourselves some important questions:
- Will we realistically need this “easy” auto-scale capability?
- Really? Is your todo list app really really going to need to scale quickly?
- Do you understand how scaling actually works from an ops perspective?
- Can you afford to triple your monthly spend on servers?
Most developers who get suckered in by the auto-scaling features are living in a fantasy land thinking they’re somehow going to go viral and need to scale quickly. I can’t tell you how many delusional developers I’ve met who think they’ve got an idea for the next billion dollar company but end up with 20 monthly active users in the end.
I’ve been running a web application that serves thousands of monthly active users and has over 10,000 total users. Since 2012 it has been running alongside multiple WordPress sites, static websites, and a Sinatra API that powers my first iOS app, all on a server with 1 CPU core and 1GB of RAM. Last year that server got upgraded to 2 cores and 2GB of RAM but only because the hosting provider cut their rates in half so I got a free server upgrade. Regardless, you’d be surprised just how much traffic a small server can actually handle.
Did you know that Buffer started with just a single Linode server and one day they had a massive traffic spike which took down their server? All it took was adding one more server with load balancing to fix the issue and they did it in two hours. They documented the experience in a great blog post. The point, though, is that you can start a very popular service and run it on a server much smaller than the one you think you need. And these were a couple of full stack developers who set up their own infrastructure in the cheapest way possible, got traction, then worried about scaling. I don’t know the details of their background but from what I can gather it seems as though they could be the poster children for using PaaS as training wheels or as their go-to service after understanding how deployment infrastructure actually works.
But let’s pretend for a moment you do need to scale. Do you know how that actually works behind the scenes? Do you think you don’t need to know because you’re using Heroku? That’s bullshit. You need to understand how the internet works if you’re going to run an application on it. Horizontal scaling (adding more servers to handle load) is something that costs an arm and a leg with PaaS providers. So even if you did have a need to scale would you have the revenue or free income to pay $300 a month to run a web application? (That $300 figure is a real number some people I know pay for apps that don’t even need scaling – they do it “just in case”).
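If “load balanced across multiple servers” still sounds like magic, here is the simplest possible version of the idea: round-robin selection. This is only a sketch of the concept, not how Heroku or any specific load balancer is implemented, and the class name and backend addresses are invented for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backend addresses in turn, one per incoming request."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the list endlessly in the same order.
        self._backends = cycle(backends)

    def next_backend(self):
        """Return the address that should receive the next request."""
        return next(self._backends)
```

With two backends, four requests in a row go to the first, second, first, and second server. The part that gets hard in practice is everything this sketch leaves out: health checks, and keeping state such as sessions and uploads somewhere every server can reach.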
Everyone knows that running Lean and Agile is the only way to do things these days. Develop an MVP, launch it ASAP, get feedback, then iterate. It’s an idea and a process I don’t think anyone can argue with. PaaS services like Heroku are great for running Lean. They allow a developer to focus on the most important aspect of their project: the code.
But wait a second – we just made a giant assumption. Is the most important part of any project the code? There are so many things that go into launching an app but we choose to focus only on the code. We do this because we’re a bunch of tools. We’re hammers. We hammer things. And when all you do is hammer stuff all day long then all you see are nails. That may be why we believe that the code is the most important part of any project.
Developers do much more than code though. Often times we need to bring a project from concept to deployment. There are a lot of steps in between.
If you’re a full-stack developer (like truly full-stack, not one of the thousands of wannabes) then you’ll be conceptualizing the app, designing it, drawing ERDs, migrating database tables, writing styles, scripts, markup, and building out the back-end. That’s a lot of work that we simply gloss over.
The whole “focus on what you do best – writing code” tagline that these services use is a lie. It leaves out way too much of the process. Sometimes code is what we want to focus on most but really there are more important things to think of. I once worked on an app where there were strict legal and compliance requirements we had to keep top of mind the entire time we planned and executed the development of the project. I had to write code that could not possibly allow a user to do something against certain laws and regulations. That was a bigger responsibility than writing code itself. The planning stage was more important than the code written.
Servers are easy too
Developers use Heroku and other platforms for a variety of reasons but the most popular I’ve seen are “ease of deployment” and “scaling”. Both of these are great reasons to use Heroku or another service but I encourage you to learn to manage a VPS first.
Customization is free
Want to install SSL on your server? Want multiple sites running on the same VPS? Need to run MySQL instead of Postgres? Want to run multiple databases for different apps on the same server at the same time? No problem. Running apt-get install on a VPS is free. With a VPS you’re paying for resources. There isn’t an upcharge by some PaaS provider and you can configure it however you like. Think of your VPS as a computer that you only have terminal access to. You’re free to do whatever you want to it because you’re only paying for CPU, disk storage, and RAM.
Security is up to you
PaaS providers harden their servers well. They likely harden them better than you could yourself but sometimes you need to purposely relax some rules. If you understand the risks and know how things work then you’ll get a better result in the end.
Deployment is easy, deployment gives you options
On a VPS you can use git hooks to deploy or use a deployment solution for your favorite language like Capistrano, PM2, etc. On a VPS you have options. You can configure deployment in whichever way suits you and your team best. You won’t be fighting your provider because you’re in control of everything.
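As one illustration of the git hook option, a `post-receive` hook can be any executable, including a Python script. Git feeds it one line per pushed ref on standard input in the form `<old-rev> <new-rev> <ref-name>`. The sketch below assumes a bare repository at `/home/deploy/myapp.git` and a live directory at `/var/www/myapp`. Both paths, and the choice of `master` as the deploy branch, are hypothetical.

```python
import subprocess
import sys

DEPLOY_BRANCH = "master"            # assumption: the branch that goes live
WORK_TREE = "/var/www/myapp"        # hypothetical directory the app runs from
GIT_DIR = "/home/deploy/myapp.git"  # hypothetical bare repo you push to


def pushed_refs(lines):
    """Parse post-receive input lines of the form '<old> <new> <ref>'."""
    refs = []
    for line in lines:
        parts = line.split()
        if len(parts) == 3:
            refs.append(tuple(parts))
    return refs


def should_deploy(new_rev, ref):
    """Deploy only real updates to the deploy branch (all zeros means deleted)."""
    deleted = set(new_rev) == {"0"}
    return ref == f"refs/heads/{DEPLOY_BRANCH}" and not deleted


def checkout_command():
    """Build the git command that checks the branch out into the live directory."""
    return ["git", f"--git-dir={GIT_DIR}", f"--work-tree={WORK_TREE}",
            "checkout", "-f", DEPLOY_BRANCH]


def main(stdin=sys.stdin):
    for _old, new, ref in pushed_refs(stdin):
        if should_deploy(new, ref):
            subprocess.run(checkout_command(), check=True)
```

To use it you would add a shebang line at the top and a call to `main()` at the bottom, save it as `hooks/post-receive` in the bare repository, and make it executable. A real setup would also install dependencies and restart the app after checkout, which is exactly the kind of detail tools like Capistrano handle for you.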
Some people like to think of Heroku and other PaaS services as training wheels for when you’re ready to deploy your own custom solution. The problem is that those services might lock you into their ecosystem by encouraging you to never have a good reason to learn how to build a server unless or until you run out of money to pay for more “dynos” (resources).
Other developers may be driven by their own curiosity and need for customization to ditch their training wheels and start running their own infrastructure.
When do we use PaaS?
I’ve personally deployed a few apps to Heroku. One major reason is teaching. When you have a class full of students who are getting a crash course in programming you just don’t have time to teach them all how to deploy a VPS, secure it, proxy web requests for your app through Nginx, and set up a deploy hook or another deployment solution like PM2 or Capistrano.
Deployment is the hardest part of web development and PaaS providers do make it easy. The problem is that when something goes wrong, PaaS deployments are twice as hard to debug and work in somewhat non-standard ways compared to private servers.
What should I do? What’s the point?
I admit that this post was originally written as a hit piece. I wanted to talk shit about Heroku and PaaS services all day long but the more I researched it and the more I dwelled on my experiences with it I came to realize that there’s a place in this world for both Heroku style services and VPS providers.
In the end what you choose is going to be based on your own skill level and needs. If you’re just learning to code and want to get a web app online, go for Heroku. If you’re a seasoned developer who knows exactly what infrastructure you need and how the PaaS service counterparts work then go with a Heroku or EngineYard.
If you’ve just mastered web application development and have deployed a few apps to Heroku then it’s time to learn servers. Take the plunge into server land and set up a custom deployment strategy for your platform of choice. I recommend Linode or DigitalOcean for application deployment. Linode is an older service and I prefer them for applications that I consider “important”. DigitalOcean is also a great provider which I use for client servers and my own personal side projects. Both are comparable on price, add-ons, and features so go with your gut when exploring.
PaaS is a fork in the road. It’ll either stunt your growth or serve as training wheels for your adventures in server building. And remember, just because you’ve graduated from training wheels to server building doesn’t mean you can’t go back and launch a new production application on Heroku. As a developer you’ll hear this so often it gets annoying but here goes anyway: choose the right tool for the job. To choose the right tool, however, you need to have at least experienced each tool once.