A few days ago, we experienced some problems with Redis. It occasionally blew up. When we examined the server at those times, we saw interesting things like thousands of open TCP connections on port 6379.
At the same time, in a seemingly unrelated universe, we experienced strange problems on the server that runs our monitoring application. We use Dashing to keep tabs on various metrics. Over time, Dashing ate up all the memory on the box and kept dying a horrible death. We implemented a “fix” by restarting the application via cron a few times each day, but that didn’t always keep the machine from dying.
Considering these two problems led me to think about the internals of Dashing. Dashing uses Thin, which is an evented application server built on top of EventMachine. Dashing also uses Rufus-scheduler to schedule its various monitoring jobs. The Rufus-scheduler gem hooks into that same EventMachine loop, and it all runs in a single Ruby process. We had a few Dashing jobs that looked like this…
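The original job listing isn’t preserved here, so what follows is only a rough sketch of the pattern, not our actual code. In a real Dashing job, the body would sit inside a `SCHEDULER.every '1m' do … end` block, call `Redis.new`, and push the value to a widget with `send_event`. To keep the sketch runnable without a Redis server, `FakeRedisClient`, `leaky_job`, and the key name are hypothetical stand-ins that simply count connections that were opened and never closed.

```ruby
# Hypothetical stand-in for a Redis client. It only counts "sockets" that
# were opened and never closed; it does not talk to a real server.
class FakeRedisClient
  @open_connections = 0

  class << self
    attr_accessor :open_connections
  end

  def get(_key)
    connect unless @connected # connect lazily, on first command
    "42"                      # canned value in place of a real lookup
  end

  def close
    return unless @connected
    @connected = false
    self.class.open_connections -= 1
  end

  private

  def connect
    @connected = true
    self.class.open_connections += 1
  end
end

# Assumed shape of a job body: a brand-new client on every run, never closed.
def leaky_job
  redis = FakeRedisClient.new
  redis.get("some:metric")
end

# Simulate the scheduler firing the job five times.
5.times { leaky_job }
puts FakeRedisClient.open_connections
```

Every run leaves one more connection open, because nothing ever calls `close` and nothing reuses the previous client.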
I did some digging in the Redis Ruby client and discovered that it does not pool connections automatically. That’s interesting. Then, I did this…
Well, that sucks. Each run of our monitoring jobs created a new TCP connection to Redis that was never closed. Sure, the connections eventually timed out, but with multiple jobs running once a minute, connections were created faster than they could time out. No wonder All The Things were breaking after an indeterminate amount of time. The fix was pretty simple…
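The original listing isn’t preserved either, so here is a minimal sketch of the idea rather than the code we deployed: create a small, fixed set of clients once, and have every job borrow one and hand it back. It uses only the standard library’s `Queue`. `SimplePool`, `FakeRedisClient`, `pooled_job`, the pool size, and the key name are all hypothetical, and the fake client stands in for `Redis.new` so the example runs without a server.

```ruby
require "thread" # Queue; a no-op on modern Rubies, needed on older ones

# Hypothetical stand-in for a Redis client that counts how many were built.
class FakeRedisClient
  @created = 0

  class << self
    attr_accessor :created
  end

  def initialize
    self.class.created += 1
  end

  def get(_key)
    "42" # canned value in place of a real lookup
  end
end

# A deliberately tiny pool: a thread-safe queue of pre-built clients.
class SimplePool
  def initialize(size:, &factory)
    @available = Queue.new
    size.times { @available << factory.call } # build every client up front
  end

  # Check a client out, yield it, and always put it back.
  def with
    client = @available.pop # blocks while all clients are checked out
    yield client
  ensure
    @available << client if client
  end
end

REDIS_POOL = SimplePool.new(size: 2) { FakeRedisClient.new }

# Assumed shape of a job body after the fix: borrow instead of build.
def pooled_job
  REDIS_POOL.with { |redis| redis.get("some:metric") }
end

# Simulate the scheduler firing the job a hundred times.
100.times { pooled_job }
puts FakeRedisClient.created
```

No matter how often the jobs fire, the number of clients never exceeds the pool size. The connection_pool gem offers the same shape with `ConnectionPool.new(size: 5, timeout: 5) { Redis.new }` and `pool.with { |redis| … }`, plus lazy creation and a checkout timeout that this sketch leaves out.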
I used the connection_pool gem out of convenience, but writing your own connection pool manager is pretty trivial. Here’s the result of the change…
The yellow blob is memory in use. Prior to deploying fixes, the only way to keep the machine from choking up all the time was a series of application restarts courtesy of cron. The resulting improvement is significant. I should note that I deployed other fixes at the same time that cut back on memory being leaked throughout the application, but the connection pool implementation was the most significant change.
The moral of the story is that we as Ruby/Rails programmers tend to take things like memory management and connection pooling for granted. Ruby is garbage collected, but it’s still very easy to leak memory through poor code. Additionally, it’s important to keep track of the network connections our applications establish. ActiveRecord manages a connection pool for us. Certain other gems (like the Mongo Ruby driver) do as well. But that doesn’t mean that every third-party client library will keep us safe. I’ll certainly consider this next time I’m opening a connection in my code.