Very short version: check that you’re not running out of RAM and that swap is actually enabled. Note that swap is completely off on the EC2 Ubuntu AMIs even if it’s on everywhere else you run Ubuntu. Longer version below.
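If you want to check this from a script, here is a minimal sketch. It assumes a Linux box where /proc/meminfo exists; the function names (parse_meminfo, memory_summary, read_meminfo) are made up for this post and are not from any library.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value}.

    Most fields are reported in kB; the unit suffix is dropped.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Key: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def memory_summary(info):
    """Return available RAM and swap in MiB, plus whether swap is on at all."""
    swap_total = info.get("SwapTotal", 0)
    return {
        "available_mib": info.get("MemAvailable", 0) // 1024,
        "swap_total_mib": swap_total // 1024,
        "swap_free_mib": info.get("SwapFree", 0) // 1024,
        "swap_enabled": swap_total > 0,
    }


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live memory stats (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling memory_summary(read_meminfo()) on the server gives you the numbers in one dict. If swap_enabled comes back False, a memory-hungry build has nowhere to spill over to.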
Notice the gap on the graph. That’s the time when my AWS micro instance went completely janky, stopped responding to everything, and I had to force-reboot it from the EC2 control panel. What did I do? Very simple: I ran the webpack build pipeline for my project’s static assets and it ate all the RAM. I suspect it would have finished the build at some point in the future (though I’m not sure whether in this decade or the next), but after 10 minutes rebooting was the only option.
This blog post is for those who might not spot the problem and end up googling for answers as to why their perfectly working deployment tool hangs the whole production server from time to time.
This happened to me once last week and now for the second time. So basically 1 hang for every 40 deployments or something like that. I had no clue. But just when I was about to start googling, it struck me: the server was running out of resources. This wasn’t obvious to me from the beginning because (a) the project is in its early stages and I hadn’t configured any server monitoring before (if you look at the picture, that point a tad over 16 hours is when I actually started collecting the stats, no joke) and (b) I had made sure that the 1 GB of RAM on the server was plenty to run the actual services I needed. It didn’t cross my mind that the ~500 MB of usage would rise so much when building.
The project in question is Nyssetutka, a Tampere area public transportation web app showing buses and stops and whatnot. It’s still in an early(-ish) development stage, so I haven’t had the time to build a nice deployment pipeline and put all the usual checks into place. Funnily enough, I was just setting up collectd monitoring when I discovered the reason for the hangs. The app consists of a Node backend and a static front end. The backend is run by PM2 because it works nicely and comes with a customizable deployment “pipeline” (not really a pipeline). I’ve configured it so that the static assets for the front end are not stored in version control, and every deployment builds the front end files, installs all new dependencies for the server and starts the backend. This basically means a 30 second downtime for the app, but that’s fine right now. This workflow gives me very fast iteration times, and now that I have server monitoring in place (using Grafana) I don’t have to worry so much.
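One way to keep a deployment like this from taking the server down is to refuse to start the build when there clearly isn’t room for it. This is only a sketch of that idea, not what my PM2 setup actually does: the function names, the 100 MiB reserve and the build’s peak memory figure are all assumptions you would have to measure and tune yourself.

```python
import subprocess


def enough_memory_for_build(available_mib, swap_free_mib, build_peak_mib,
                            reserve_mib=100):
    """Return True if the estimated build peak fits in free RAM plus swap.

    build_peak_mib is your own estimate of the build's peak usage;
    reserve_mib is headroom kept back for the services already running.
    """
    usable_mib = available_mib + swap_free_mib - reserve_mib
    return usable_mib >= build_peak_mib


def run_build_if_safe(command, available_mib, swap_free_mib, build_peak_mib,
                      runner=subprocess.run):
    """Run the build command only when the memory check passes.

    Returns True if the build ran, False if it was skipped.
    Raises subprocess.CalledProcessError if the build itself fails.
    """
    if not enough_memory_for_build(available_mib, swap_free_mib,
                                   build_peak_mib):
        return False
    runner(command, check=True)
    return True
```

For example, run_build_if_safe(["npm", "run", "build"], available_mib=500, swap_free_mib=0, build_peak_mib=700) skips the build and returns False (the npm script name and the 700 MiB figure are hypothetical). The deploy fails fast instead of hanging the box, which is a much nicer way to find out you need swap or a bigger instance.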
Hopefully this helps someone.