Selecting a method for traversing a graph involves two decisions. First, choose between depth-first and breadth-first. A depth-first approach fully explores each branch before continuing on to the next one. A breadth-first method visits each node on one level of the graph before moving on to the next. The choice between the two depends on the organization and expectations of the graph data. A depth-first search is more appropriate for graphs with very deep branches containing many nodes. In this application, the graph contains ten nodes with multiple potential paths between any two nodes. I selected a depth-first algorithm, but the graph is small enough that a breadth-first search would perform just as well.
The second question when selecting a graph traversal algorithm is iteration versus recursion. The trade-off here is a question of complexity: many people find recursion harder to reason about. The order in which the nodes are traversed can also vary slightly between the two methods. For this use case, I selected an iterative approach, mostly because I can’t remember previously writing an iterative, depth-first search algorithm.
This implementation uses two data structures: A stack to keep track of the nodes to search, and an array recording the nodes that have already been visited. The search itself reduces to a short loop, roughly:

1. Pop the next node off the stack (which initially holds only the origin).
2. If that node is the destination, stop; the search is complete.
3. Otherwise, add the node to the visited array.
4. Push each adjacent node that is not already in the visited array onto the stack.
5. Go back to the first step and repeat until the stack is empty.
In plain English, that’s all it takes to traverse the graph for a desired destination. This is an iterative approach, but it’s worth noting that the only difference between this and a recursive implementation is the stack. In a recursive method, the stack is implied because it’s the call stack. Steps one and five then change slightly. Instead of looping and going back to the first step, call the same function again on the next adjacent node.
The algorithm returns the destination node, but that’s not particularly helpful for this use case. We already knew there was a path between the origin and destination because we designed the graph! The real value of this feature lies in showing a highlighted route to the player and storing it for future reference. As the player flies to subsequent systems, update the pointer to the head of the path to keep track of the current route. The full algorithm has one new step and one new data structure. The new data structure is a dictionary with keys and values that are both graph nodes.
Again, roughly:

1. Pop the next node off the stack (which initially holds only the origin).
2. If that node is the destination, stop; the search is complete.
3. Otherwise, add the node to the visited array.
4. Record the current node in the path dictionary as the parent of each unvisited adjacent node.
5. Push those adjacent nodes onto the stack.
6. Go back to the first step and repeat until the stack is empty.
Unwind the path dictionary by starting from the destination and retracing the path back to its origin. Record each node along the way in a new array, and the result will be a list of nodes in the traveled path.
In outline:

1. Start at the destination node and add it to a new array.
2. Look up that node’s parent in the path dictionary.
3. Add the parent to the array.
4. Repeat from the second step until the origin is reached.
5. Reverse the array to produce the nodes in travel order.
With an array of nodes representing the navigational path in hand, a route can be painted for the player on the screen and stored somewhere for future reference.
The final implementation uses C# (this game is built in Unity). The definition of the GalaxyMapNode class is not shown – nor is the full listing for the class containing this method – but the few parts that are relevant can be discerned from context.
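What follows is a sketch in that spirit rather than the original listing, assuming a GalaxyMapNode that exposes a list of its adjacent nodes:

```csharp
using System.Collections.Generic;

public class GalaxyMapNode
{
    // Assumed shape: each node knows its neighbors.
    public string Name;
    public List<GalaxyMapNode> AdjacentNodes = new List<GalaxyMapNode>();
}

public static class GalaxyNavigator
{
    // Iterative depth-first search that records each node's parent so the
    // traveled path can be unwound afterwards.
    public static List<GalaxyMapNode> FindPath(GalaxyMapNode origin, GalaxyMapNode destination)
    {
        var stack = new Stack<GalaxyMapNode>();
        var visited = new List<GalaxyMapNode>();
        var parents = new Dictionary<GalaxyMapNode, GalaxyMapNode>();

        stack.Push(origin);

        while (stack.Count > 0)
        {
            GalaxyMapNode current = stack.Pop();

            if (current == destination)
                return UnwindPath(origin, destination, parents);

            if (visited.Contains(current))
                continue;

            visited.Add(current);

            foreach (GalaxyMapNode adjacent in current.AdjacentNodes)
            {
                if (!visited.Contains(adjacent))
                {
                    parents[adjacent] = current;
                    stack.Push(adjacent);
                }
            }
        }

        return null; // no path exists
    }

    // Retrace the parent dictionary from destination back to origin.
    static List<GalaxyMapNode> UnwindPath(GalaxyMapNode origin, GalaxyMapNode destination,
                                          Dictionary<GalaxyMapNode, GalaxyMapNode> parents)
    {
        var path = new List<GalaxyMapNode> { destination };
        GalaxyMapNode current = destination;

        while (current != origin)
        {
            current = parents[current];
            path.Add(current);
        }

        path.Reverse();
        return path;
    }
}
```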
This implementation of an iterative depth-first search is not optimal or perfect by any means. However, the size and structure of the graph are known, and perfect optimization isn’t necessary in order to meet the requirements.
Every model in a typical Rails application talks to a single database through ActiveRecord::Base. As the application grows, it may be useful to connect to different databases for a variety of reasons. One database might be dedicated to reports. Another may be the result of an entirely different process, and now the Rails application wants to read from it. Using multiple databases helps a Rails application scale, and may be a more manageable first step toward an architecture based on microservices.
Rails needs two things in order to back specific ActiveRecord models with different databases: A connection configuration and an establish_connection directive. First, the configuration.
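In config/database.yml, add an entry for the second database alongside the defaults (the names, adapter, and credentials here are illustrative):

```yaml
development:
  adapter: mysql2
  database: myapp_development
  username: root
  password:

reporting_development:
  adapter: mysql2
  database: reporting_development
  username: root
  password:
```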
If the new database has different connection or authentication options, make those additions.
Next, instruct Rails to use a different database for a particular model.
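A sketch of the model definition:

```ruby
class ReportUser < ActiveRecord::Base
  establish_connection "reporting_#{Rails.env}".to_sym
end
```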
When the ReportUser class is loaded, Rails creates an additional connection pool for the new database. All reads and writes involving this model now use the new database.
Those are the basics, but there are a few more things to think about when working with multiple databases in the same Rails app.
The ReportUser model works great if a report_users table already exists in the new database, but what about creating one from scratch? Generated migrations need a little tweaking because the default database is the assumed target.
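One way to do the tweaking (a sketch, with illustrative columns) is to run the migration’s statements over the model’s own connection instead of the default one:

```ruby
class CreateReportUsers < ActiveRecord::Migration
  def up
    ReportUser.connection.create_table :report_users do |t|
      t.integer :user_id
      t.string  :name

      t.timestamps null: false
    end
  end

  def down
    ReportUser.connection.drop_table :report_users
  end
end
```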
This works, but there should be an easy way to create the database before running migrations.
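A small rake task can handle it; this sketch leans on ActiveRecord::Tasks::DatabaseTasks from Rails 4:

```ruby
# lib/tasks/reporting.rake
namespace :reporting do
  namespace :db do
    desc "Create the reporting database"
    task create: :environment do
      config = ActiveRecord::Base.configurations["reporting_#{Rails.env}"]
      ActiveRecord::Tasks::DatabaseTasks.create(config)
    end
  end
end
```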
Now we’re getting somewhere, but what about using this database for several additional models?
Imagine creating two more reporting models, ReportOrder and ReportProduct. They look identical to ReportUser, each with a call to establish_connection. The problem here is that each class creates its own independent connection pool, and each pool has some number of individual TCP connections to the database server. Maybe this doesn’t matter for three models, but what about ten? I previously wrote about the dangers of failing to care about TCP connections. Let’s refactor before this has an opportunity to become a problem.
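First, an abstract base class owns the connection (a sketch):

```ruby
module Reporting
  class Base < ActiveRecord::Base
    self.abstract_class = true

    establish_connection "reporting_#{Rails.env}".to_sym
  end
end
```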
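Each reporting model then inherits from it:

```ruby
module Reporting
  class User < Base
  end
end
```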
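And likewise for the rest:

```ruby
module Reporting
  class Order < Base
  end
end
```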
All subclasses of Reporting::Base now share a single connection pool. This is the same way that ActiveRecord::Base creates a connection pool used by its other subclasses. The abstract_class assignment in the Reporting::Base model means child classes look for database tables using expected Rails-isms (e.g. reporting_users, reporting_orders) instead of following single table inheritance rules.
We’ve nicely namespaced all of the reporting models. This convention can extend to include namespacing of related controllers and views. Good separation of concerns suggests that it makes sense to isolate the reporting concept. In a world where microservices are trendy, this might be the moment when someone suggests making a reporting service. That’s a heavy investment, but there is a reasonable compromise that still accomplishes many of the same design goals: A Rails engine.
An isolated Rails engine with its own database is basically a lightweight service. Generate an engine inside lib/reporting and relocate everything in the existing Reporting namespace into the engine. Make sure the engine is isolated.
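The engine class declares that isolation (the path assumes the engine lives in lib/reporting):

```ruby
# lib/reporting/lib/reporting/engine.rb
module Reporting
  class Engine < ::Rails::Engine
    isolate_namespace Reporting
  end
end
```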
It’s normal when using a Rails engine to copy the engine migrations into the enclosing application using rake reporting:install:migrations. This step is unnecessary when the engine has its own database, and is actually detrimental to the separation of concerns. Instead, add a few helper tasks alongside the earlier one for creating the database.
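For example, migrate and rollback tasks that point the migrator at the engine’s own db/migrate path (a sketch against Rails 4-era APIs):

```ruby
# lib/tasks/reporting.rake
namespace :reporting do
  namespace :db do
    desc "Migrate the reporting database"
    task migrate: :environment do
      ActiveRecord::Base.establish_connection("reporting_#{Rails.env}".to_sym)
      ActiveRecord::Migrator.migrate(Reporting::Engine.paths["db/migrate"].existent)
    end

    desc "Roll back the reporting database"
    task rollback: :environment do
      ActiveRecord::Base.establish_connection("reporting_#{Rails.env}".to_sym)
      ActiveRecord::Migrator.rollback(Reporting::Engine.paths["db/migrate"].existent)
    end
  end
end
```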
Treat the reporting engine as a different project. Develop it separately. Consider moving the code into its own repository and pulling it in as a gem. Strictly adhere to the engine’s isolation by keeping constants from unnecessarily bleeding across module boundaries.
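As a hypothetical illustration of the kind of coupling to watch for, consider models on either side reaching across that boundary:

```ruby
# In the application: a model reaching into the engine.
class User < ActiveRecord::Base
  def reports
    Reporting::User.where(user_id: id)
  end
end

# In the engine: a model reaching back out to the application.
module Reporting
  class User < Base
    def application_user
      ::User.find(user_id)
    end
  end
end
```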
Adding the above dependencies couples the engine to the application and vice versa. This is not always bad, but each additional dependency should be an explicit and careful choice.
If and when you decide to take the plunge on a reporting service, the engine is ready to convert into a standalone Rails application. In the meantime, repeat this pattern to grow an existing Rails app using multiple databases in a modularized, scalable manner.
At a high level, the WebGL rendering process breaks down into three phases: compiling and linking the shader programs, buffering vertex data into shader attributes, and drawing.
Setting up this drawing process comes with a lot of initial ceremony. It might seem overwhelming without prior OpenGL programming experience, but this is a one-time cost. An early investment in a few different concepts becomes the foundation for creating a custom rendering pipeline tailored to the individual needs of a system.
A basic program that exercises every piece of that pipeline can be assembled from the bottom up. But first, some initial setup.
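A host page with a canvas element is all the HTML required (a minimal sketch):

```html
<!DOCTYPE html>
<html>
  <head>
    <title>WebGL pipeline example</title>
  </head>
  <body>
    <canvas id="canvas" width="500" height="500"></canvas>
    <script src="main.js"></script>
  </body>
</html>
```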
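Then, in JavaScript, ask the canvas for a rendering context:

```javascript
var canvas = document.getElementById('canvas');
var gl = canvas.getContext('webgl') ||
         canvas.getContext('experimental-webgl');
```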
The gl variable contains a reference to a WebGL rendering context. This context is the main interface for the WebGL API.
Shaders are pre-compiled drawing programs that run inside the GPU. They are written in a C-like language called GLSL and provide rendering instructions to the GPU. Two types of shaders are used in this pipeline example: Vertex and fragment.
Vertex shaders describe how to draw the vertices making up one or more polygons. For the purposes of this example, that means a list of two-dimensional coordinates. However, the vertex shader does not know the actual positions of these coordinates. It knows only that they exist, and that they will be available by way of some attribute provided when the program runs.
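A vertex shader along those lines:

```glsl
attribute vec2 a_position;

void main() {
  gl_Position = vec4(a_position, 0.0, 1.0);
}
```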
At runtime, an attribute named a_position of the type vec2 (a 2-dimensional vector) contains positional data about a vertex. Convert that vector into a vec4 (4-dimensional vector) and assign it to the special WebGL global variable gl_Position. This program runs once for every pair of vertex coordinates.
Fragment shaders describe the space between vertices. While the vertex shader was called once for each vertex, the fragment shader program is called once for each pixel in the space between those vertices. In this example, the fragment shader program describes the color of each pixel.
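A fragment shader that colors every pixel white:

```glsl
precision mediump float;

void main() {
  gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
}
```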
At runtime, each time the fragment shader program executes (for each pixel), assign a new 4-dimensional vector describing a color (in RGBA form) to the special WebGL global variable gl_FragColor. In this case, the color is always white.
Hooking up shaders makes up a large chunk of the WebGL setup ceremony. The source code for the shaders must be compiled and linked together in an instance of a WebGL program.
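Something like the following, assuming the two shader sources are available as the strings vertexSource and fragmentSource:

```javascript
function createShader(gl, type, source) {
  var shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);

  if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
    throw new Error(gl.getShaderInfoLog(shader));
  }

  return shader;
}

var vertexShader = createShader(gl, gl.VERTEX_SHADER, vertexSource);
var fragmentShader = createShader(gl, gl.FRAGMENT_SHADER, fragmentSource);

var program = gl.createProgram();
gl.attachShader(program, vertexShader);
gl.attachShader(program, fragmentShader);
gl.linkProgram(program);
gl.useProgram(program);
```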
Once the shaders are compiled, the process is not repeated unless the shader source code changes.
Attributes serve as containers for the data that travels from JavaScript into the shader programs.
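Getting a handle on the attribute takes one call:

```javascript
var positionLocation = gl.getAttribLocation(program, 'a_position');
```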
Expose the a_position attribute from the vertex shader and provide a reference to it in JavaScript. Think of it as a pointer to the place in memory where the attribute data resides.
If attributes are the containers for data, then buffers are the pipes that connect JavaScript to those containers.
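A sketch, where the coordinates describe the four corners of a square that will later be drawn as two triangles:

```javascript
var positionBuffer = gl.createBuffer();

var positions = [
  -0.5, -0.5,
   0.5, -0.5,
  -0.5,  0.5,
   0.5,  0.5
];

gl.bindBuffer(gl.ARRAY_BUFFER, positionBuffer);
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(positions), gl.STATIC_DRAW);
```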
Create a buffer and an array containing positional data. Then, activate the buffer by “binding” it. Finally, declare that the data for the activated buffer is the array of positional data in the form of 32-bit floats.
The setup is finally complete. It’s time to draw.
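A sketch of the drawing calls:

```javascript
gl.clearColor(0.0, 0.0, 0.0, 1.0);
gl.clear(gl.COLOR_BUFFER_BIT);

gl.enableVertexAttribArray(positionLocation);
gl.vertexAttribPointer(positionLocation, 2, gl.FLOAT, false, 0, 0);

gl.drawArrays(gl.TRIANGLE_STRIP, 0, 4);
```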
First, declare a clear color of black (in RGBA form). Next, inform WebGL to clear the color buffer using the declared clear color. Then, declare a pointer to the WebGL attribute a_position containing the vertex data. Finally, draw the buffered vertices as a pair of triangles.
The results may be slightly underwhelming for the amount of effort, but this lays the groundwork for more advanced applications.
In a more advanced WebGL application, the drawing section might be called repeatedly in a loop as the buffered data changes. Drawing 60 times each second results in a target frame rate of 60fps. Everything else is initial setup that may expand in size (e.g. additional buffers, more complex shaders), but otherwise looks very similar to this example. The complete example is available on Github and as a JSBin.
For more in-depth learning, there are several excellent WebGL tutorials and references available online.
A middleware stack is not the traditional LIFO (last in, first out) data structure that comes to mind for many programmers when they hear the word “stack”. It’s a layered series of code modules, each of which modifies the state of an incoming data structure. After each layer has a turn, the resulting (new) structure is returned.
Here’s the situation which made me want to adapt this pattern: I had a series of transformations to run against a data structure, none of it involving HTTP, and I wanted an architecture that would be easy to communicate to my teammates.
There are only two types of pieces in this puzzle: The middleware and the builder.
A middleware is nothing more than a class that takes an “application” (more on that later) as its constructor argument, and which implements a single method, call. This method takes one argument: A hash of the current “request” environment. In Rack parlance, the request is an incoming HTTP request. There is no HTTP here, so the “request” is really just whatever is making use of the middleware stack. The only requirement is that call must return by passing the new environment (including whatever changes are made) to the next layer of the stack.
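A hypothetical middleware that stamps the environment with a timestamp:

```ruby
class TimestampMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    env[:timestamp] = Time.now

    @app.call(env)
  end
end
```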
The builder is a class that manages a middleware stack and an associated application. There may be many builder instances depending on the number of desired middleware arrangements. The application is simply an object that responds to call. In its simplest form, it is a lambda.
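A builder in the spirit of Rack::Builder might look like this (a sketch):

```ruby
class Builder
  def initialize(&block)
    @middlewares = []
    instance_eval(&block) if block
  end

  # Register a middleware class (outermost first, as in Rack).
  def use(middleware_class, *args)
    @middlewares << [middleware_class, args]
  end

  # Register the innermost application.
  def run(app)
    @app = app
  end

  def call(env = {})
    to_app.call(env)
  end

  private

  # Wrap the application in each middleware, innermost last, so that
  # the first registered middleware runs first.
  def to_app
    @middlewares.reverse.inject(@app) do |next_app, (klass, args)|
      klass.new(next_app, *args)
    end
  end
end
```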
Using the builder means defining a desired stack configuration and then calling it.
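For example (TimestampMiddleware is the hypothetical middleware from above):

```ruby
builder = Builder.new do
  use TimestampMiddleware
  run ->(env) { env }
end

result = builder.call({})
# => a hash with a :timestamp key added by the middleware
```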
The results are hashes that have been manipulated in any number of ways by the various middleware layers.
A shortcoming I’ve identified is the difficulty of parallelizing stack layers that don’t absolutely have to run in a specified order. That’s a solvable problem, but worth noting when considering this pattern. One large benefit is the ease of communicating this architecture to my teammates, which I mentioned above as a primary goal. I can start a knowledge transfer conversation with “it works like Rack middleware”, and immediately establish a shared understanding. Would I use this pattern again? Maybe. It gets the job done and it’s fairly easy to understand. At the very least, I’ve emerged with a deeper understanding of Rack internals.
I learned I would be leading a team about a month early, and I attempted to make good use of that lead time by contacting my organization’s representative early and often via email and phone. After a few conversations, I wrote up a series of user stories for her to review, comment on, and approve. Then, I turned those user stories into Github issues. I did all of this in an attempt to understand the domain as deeply as possible. At Ruby for Good, the team meets on Friday morning and delivers a project on Sunday afternoon. The faster I could transfer my knowledge of the problem space to my teammates, the more productivity we could squeeze out of those 2.5 days.
I set a goal of having a code repository ready and waiting for teammates to clone on the morning of the event. It’s easy to lose half of that first day to environment and build issues. So, after initializing a bare Rails app, I wrote a README with a list of succinct instructions for getting up and running as quickly as possible.
I prepared as best I could prior to the event, and I think it paid off. Teams were selected at approximately 10AM on Friday morning. The first commit from one of my team members occurred at 11:41AM. The first pull request was merged at 3:30PM, and that was after we took a break for lunch.
Time is the most valuable asset at any hackathon. Ruby for Good amplified that feeling because I wasn’t simply building something for myself. I made a commitment to deliver a working application, and I wanted to follow through on that promise. There were two things in particular that I didn’t want to waste time on: Technology choices and stakeholder feedback.
I normally take the time to evaluate technology options carefully by comparing the requirements to the available solutions. I consider if it’s worthwhile to experiment with pre-1.0 products. I poll my teammates to see what they’re interested in and if we can find an opportunity to learn something new. None of that applied in this case. The choices I made were meant to be traditional and obvious to Rails programmers, regardless of experience level: The latest stable versions of Ruby, Rails, and Postgres. Vanilla JavaScript without any frameworks. I set up deployment using Capistrano to an Ubuntu 14.04 machine on Digital Ocean. The team later decided to use the CSS portions of Materialize, but that was a deferred group decision that was easy to drop in during development.
I did my best to eliminate anything that resulted in waiting on stakeholder feedback. Normal, non-programmer people who aren’t participating in hackathons typically don’t respond to email very promptly on weekends. It’s a bonus if a particular stakeholder is able to be more involved, but it’s not the expectation. A sufficient amount of preparation meant that I wasn’t waiting for responses to questions and blocking development as a result. In those rare situations where something questionable came up, I took the initiative, made a decision, and acted on it. If it turns out to be the wrong choice and the resulting feature isn’t delivered just right, that’s fine. Tweaking things later in response to feedback is just iterative development. The most valuable (and limited) resource available to me in this situation was the time of my team members, and I didn’t want to waste it on indecision.
The team had a healthy variety of experience that was evenly split between senior and junior developers. My goal as a lead was to find divisions of work appropriate for those different experience levels. I wanted everyone to feel that they were making important contributions. To that end, I tried to set aside tasks for the junior developers that were more straightforward or easier to reference in documentation: Using generators to build out application scaffolding, or integrating third-party login using the OmniAuth gem. Meanwhile, I leaned on the more experienced developers for things like researching integration with various Google APIs and making higher-level architectural choices for the entire application.
One thing I didn’t expect was the sheer amount of new stuff I learned from reading pull requests from junior developers. I spend my days working on an older Rails application with a fairly stable set of gems and libraries. The people on my team at work are mostly senior developers. It was great having exposure to gems I’ve never heard of and language or framework features that I don’t normally have an opportunity to use. I pride myself on continual learning and professional growth, but this was a stark reminder of just how quickly technology moves. More importantly, it was a reminder that experience diversity is a good thing.
Our team had people comfortable working at most levels of the stack: From design, styling, and JavaScript on the front-end to Ruby and Rails on the server. One thing we were missing was someone to handle the infrastructure, operations, and deployment strategies. I initially filled that role myself, but soon started taking on a variety of increasingly diverse tasks in support of other team members. I was suddenly very thankful for the excuses I’ve had to work on all sorts of crazy projects at every level of the stack. I often worry that I’m not focusing on specific skills enough; that I’ll end up with knowledge that is wide but shallow. That’s still a concern, but this showed me the importance of being able to jump into any role that a team might need.
Ruby for Good is an opportunity to participate in a full (albeit accelerated) project lifecycle from concept to delivery. That kind of experience can be hard to come by. It’s valuable to have insight into how a project evolves beyond simply writing code, and that goes double for anyone considering freelancing or contracting. The icing on the cake is the knowledge that you’re giving a little something back to organizations who wouldn’t otherwise have the means to hire developers. We’re all immersed in technology and surrounded by brilliant people on a daily basis, and that environment breeds impostor syndrome. This conference is a great reminder that everyone’s skills are valuable and the demand for them is high, regardless of our perceived self-worth when compared to peers. Ruby for Good might not be able to save the world in a single weekend, but I think it does a pretty good job at making things a little better.
Every web developer who spends a significant amount of time with Ruby inevitably reaches a point when they want to learn more about Rack. Rack is at the heart of the most popular Ruby web frameworks, including Rails and Sinatra. There are tons of resources available for getting started with Rack applications from the ground up, but I found myself curious about the other side of the fence. How do I write a web server that knows how to talk to Rack applications, and can I get Sinatra to serve a minimal app using that server?
I started with the simplest Sinatra application possible.
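Something like this, with the server setting pointing at a server Rack doesn’t know about yet:

```ruby
require 'sinatra'

set :server, :my_server

get '/' do
  'Hello World'
end
```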
Trying to run the above application will result in an error because Sinatra is asking Rack to use a server called my_server, and Rack doesn’t know about it. So, let’s tell Rack about the new server.
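A sketch of that registration against Rack 1.x’s Rack::Handler API:

```ruby
require 'rack/handler'
require_relative 'my_server'

module Rack
  module Handler
    class MyServer
      # Rack hands us the app plus any server-specific options.
      def self.run(app, options = {})
        server = ::MyServer.new(app, options)
        server.start
      end
    end
  end
end

Rack::Handler.register('my_server', 'Rack::Handler::MyServer')
```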
Telling Rack about a server is as simple as defining a new handler that lets Rack know how to start the server. The handler has a single method, run, which receives the Rack-compliant application to be served, along with an optional hash of server-specific settings. All that’s left to do is actually implement the server, which is the most significant portion of the entire exercise.
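What follows is a minimal sketch of such a server rather than the original listing: accept TCP connections in a loop, build the Rack environment hash, call the application, and write the response back.

```ruby
require 'socket'
require 'stringio'

class MyServer
  def initialize(app, options = {})
    @app  = app
    @host = options.fetch(:Host, 'localhost')
    @port = options.fetch(:Port, 4567)
  end

  def start
    server = TCPServer.new(@host, @port)

    loop do
      socket = server.accept
      request_line = socket.gets

      if request_line.nil?
        socket.close
        next
      end

      method, full_path, _version = request_line.split
      path, query = full_path.split('?', 2)

      # Read and discard the request headers.
      while (line = socket.gets) && line != "\r\n"; end

      status, headers, body = @app.call(new_env(method, path, query))

      socket.print "HTTP/1.1 #{status}\r\n"
      headers.each { |key, value| socket.print "#{key}: #{value}\r\n" }
      socket.print "\r\n"
      body.each { |chunk| socket.print chunk }
      body.close if body.respond_to?(:close)

      socket.close
    end
  end

  private

  # The only Rack-specific piece: the environment hash handed to the app.
  def new_env(method, path, query)
    {
      'REQUEST_METHOD'    => method,
      'SCRIPT_NAME'       => '',
      'PATH_INFO'         => path,
      'QUERY_STRING'      => query || '',
      'SERVER_NAME'       => @host,
      'SERVER_PORT'       => @port.to_s,
      'rack.version'      => [1, 3],
      'rack.url_scheme'   => 'http',
      'rack.input'        => StringIO.new,
      'rack.errors'       => $stderr,
      'rack.multithread'  => false,
      'rack.multiprocess' => false,
      'rack.run_once'     => false
    }
  end
end
```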
If you’ve ever experimented with writing a basic HTTP server, most of this is boilerplate. Loop continually, waiting for TCP connections. When one is received, pass the request through to the Rack application along with all of the necessary environment settings. When the application is done, send the response back to the client along with any headers, and then close the connection. Obviously, this server has some pretty severe limitations and isn’t intended for actual real-world use.
The only Rack-specific code is the hash created in the new_env method. A Rack application is simply an object that responds to one method, call. That method takes a single argument which is a hash describing the current environment. I took some liberties here because I was only interested in getting the most basic application to work, but the Rack specification describes all of the expected environment values in detail. The takeaway is that Rack applications expect an environment hash, and it’s the job of the server to provide that hash in its initial state.
That’s literally all there is to standing up a web server that can speak the Rack language. The small Sinatra app from the beginning of this post should now serve up its Hello World page without a problem. The functionality of this web server is obviously quite limited, but it’s enough to get started on the path toward something more robust. The interesting part to me was how easy this was to put together after a little digging through the Rack source. From the perspective of a server, Rack really is designed to get out of your way while providing a very simple interface to the world of Ruby web apps.
At the same time, in a seemingly unrelated universe, we experienced strange problems on the server that runs our monitoring application. We use Dashing to keep tabs on various metrics. Over time, Dashing ate up all the memory on the box and kept dying a horrible death. We implemented a “fix” by restarting the application via cron a few times each day, but that didn’t always keep the machine from dying.
Considering these two problems led me to think about the internals of Dashing. Dashing uses Thin, which is an evented application server built on top of EventMachine. Dashing also uses Rufus-scheduler in order to schedule its various monitoring jobs. The Rufus-scheduler gem hooks into that same EventMachine loop, and it all runs in a single Ruby process. We had a few Dashing jobs that looked like this…
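Something like this, where the metric names are illustrative:

```ruby
SCHEDULER.every '1m' do
  redis = Redis.new(host: 'redis.example.com')
  send_event('queue_depth', value: redis.llen('some_queue'))
end
```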
I did some digging in the Redis Ruby client and discovered that there is no automatic connection pooling implemented. That’s interesting. Then, I did this…
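I checked how many clients were connected to Redis, let the scheduled jobs run for a while, and checked again (the numbers here are illustrative):

```
$ redis-cli info clients | grep connected_clients
connected_clients:3

# ...several job runs later...

$ redis-cli info clients | grep connected_clients
connected_clients:47
```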
Well, that sucks. Each subsequent run of our monitoring jobs created a new TCP connection to Redis that wasn’t closed. Sure, the connections eventually timed out, but multiple jobs running once a minute created connections faster than they could time out. No wonder All The Things were breaking after an indeterminate amount of time. The fix was pretty simple…
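Create a fixed pool of connections up front and check one out for each job run; a sketch using the connection_pool gem:

```ruby
require 'connection_pool'

REDIS_POOL = ConnectionPool.new(size: 5, timeout: 5) do
  Redis.new(host: 'redis.example.com')
end

SCHEDULER.every '1m' do
  REDIS_POOL.with do |redis|
    send_event('queue_depth', value: redis.llen('some_queue'))
  end
end
```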
I used the connection_pool gem out of convenience, but writing your own connection pool manager is pretty trivial. Here’s the result of the change…
In the server’s memory graph, the yellow blob is memory in use. Prior to deploying fixes, the only way to keep the machine from choking up all the time was a series of application restarts courtesy of cron. The resulting improvement is significant. I should note that I deployed other fixes at the same time which cut back on memory being leaked throughout the application, but the connection pool implementation was the most significant change.
The moral of the story is that we as Ruby/Rails programmers tend to take things like memory management and connection pooling for granted. Ruby is garbage collected, but it’s still very easy to leak memory through poor code. Additionally, it’s important to keep track of the network connections our applications are arbitrarily establishing. ActiveRecord manages a connection pool for us. Certain other gems (like the Mongo Ruby driver) do as well. But that doesn’t mean that every third-party client library will keep us safe. I’ll certainly consider this next time I’m opening a connection in my code.
An operation that can run in the background is identified. The Rails application needs to enqueue a request. The mechanism for doing this depends on the particular versions of Rails and Resque. On Rails 4.x using ActiveJob, the interface is MyJob.perform_later. On earlier versions of Rails using Resque 1.x, this is done using Resque.enqueue. Resque uses Redis behind the scenes, and adding a job to the queue actually means serializing some details about the job and inserting that information into Redis. The Rails application finishes dealing with Resque at this point, and returns to the web request-response cycle.
A piece of information now sits in a Redis data structure representing the job to execute. Something needs to consume that queue, and it comes in the form of a completely separate Ruby process. This process is typically started by running rake resque:work from the application directory. The new process waits to consume information from the Redis queue, and then uses that information to identify and execute the background job.
When the worker process starts, a completely new copy of the entire Rails application is loaded into memory. One of the most common examples of a background job is sending an email using ActionMailer. That code lives inside the Rails application, and so the entire app must be loaded. However, an important thing to consider is that you don’t actually need Rails to run the job code. Rails is potentially a huge amount of overhead. If a job is enqueued using the MyJob class from Rails, then the only requirement for running that job is a MyJob class in the consuming process that listens on the same queue.
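A sketch of such a class, following Resque 1.x conventions:

```ruby
# No Rails here: just a class Resque can find by name.
class MyJob
  # Listen on the same queue the Rails app enqueues to.
  @queue = :default

  def self.perform(*args)
    # The actual work goes here.
  end
end
```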
The code above knows nothing about Rails, but will happily consume a background job that was enqueued from a Rails application. Add a minimal Rakefile that pulls in resque/tasks, and you can run this worker with rake resque:work from a completely different application. This example uses Resque directly without the ActiveJob interface. Using ActiveJob means adding it as a dependency. In general, the more domain knowledge needed to run the background job, the more dependencies the consumer process will have in common with the Rails application. Use this as inspiration to write several small applications – only one of which uses Rails – instead of one huge monolithic Rails app. This could be an intermediary step toward some sort of microservices based solution.
The Ruby process that sits and listens for jobs in Redis is not the process that ultimately runs the job code written in the perform method. It is the “master” process, and its only responsibility is to listen for jobs. When it receives a job, it forks yet another process to run the code. This other “child” process is managed entirely by its master. The user is not responsible for starting or interacting with it using rake tasks. When the child process finishes running the job code, it exits and returns control to its master. The master now continues listening to Redis for its next job.
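That master/child relationship can be pictured as a simple fork loop; this is a simplification, not Resque’s actual source:

```ruby
loop do
  job = reserve_job_from_redis   # hypothetical helper: master blocks, listening for work

  if (child_pid = fork)
    Process.wait(child_pid)      # master waits for the child to finish
  else
    job.perform                  # child runs the job code...
    exit!                        # ...then exits, releasing all of its memory
  end
end
```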
The advantage of this master-child process organization – and the advantage of Resque processes over threads – is the isolation of job code. Resque assumes that your code is flawed, and that it contains memory leaks or other errors that will cause abnormal behavior. Any memory claimed by the child process will be released when it exits. This eliminates the possibility of unmanaged memory growth over time. It also provides the master process with the ability to recover from any error in the child, no matter how severe. For example, if the child process needs to be terminated using kill -9, it will not affect the master’s ability to continue processing jobs from the Redis queue.
In earlier versions of Ruby, Resque’s main criticism was its potential to consume a lot of memory. Creating new processes means creating a separate memory space for each one. Some of this overhead was mitigated with the release of Ruby 2.0 thanks to copy-on-write. However, Resque will always require more memory than a solution that uses threads because the master process is not forked. It’s created manually using a rake task, and therefore must load whatever it needs into memory from the start. Of course, manually managing each worker process in a production application with a potentially large number of jobs quickly becomes untenable. Thankfully, we have pool managers for that.
One of the most ubiquitous pool managers for Resque (and the one we use at Optoro) is resque-pool. This plugin provides a rake task that manages all of the workers normally started using rake resque:work. Earlier, I pointed out that each worker process requires its own copy of the application in memory. A pool can potentially alleviate these memory concerns. When the pool starts, it loads the entire application into memory. Then, it forks a process for each (master) worker. Once again, copy-on-write significantly reduces the amount of memory used by each forked process. The memory benefits combined with the convenience of process management make resque-pool (or some other pool solution) an easy win.
The other tool worthy of consideration for part of your Resque infrastructure is a scheduler. One of the most popular solutions is resque-scheduler. The scheduler is a very simple, cron-based application that inserts jobs into the Redis queue based on a configuration file. It has very few dependencies in general, and doesn’t need the Rails app or the job code in memory. As a matter of fact, it doesn’t need any constant definitions at all if the job class names are passed as string arguments.
It’s valuable to understand the tools you’re using, especially when it comes to rogue processes outside the normal scope of the Rails application. Understanding leads to better architectural decisions. The concepts that apply to Resque will certainly be applicable to other background job solutions. Implementation is details. The most important skill is learning how to think about the application in a different way. Go forth and enqueue.
It’s not just you. Delivering faulty software on a regular basis is a problem that plagues the industry. Sometimes, it feels like we’re trying to hide our own failures behind the Captain America shield of “agile” development. Bugs are part of the process, but don’t worry because we’re iterating quickly! I challenge myself not to think like that. Just because we’re able to deploy 20 times a day doesn’t excuse us from the responsibility of getting it right the first time.
Playing fast and loose like this doesn’t fly in other industries. Surgeons can’t make mistakes until they get it right. Architects can’t implement a flawed building blueprint and then correct it later. Before I started writing code for a living, I worked in an industry that has a very low tolerance for mistakes: Law enforcement. Getting it wrong the first time as a police officer can have some pretty serious repercussions on someone’s life or liberty.
That’s not to say that surgeons, architects, and cops always get it right. Watch the news on any given day and you’re sure to see the ramifications of screwing up something serious. However, those other industries still have an error rate (and a fault tolerance) that is far lower than your average software development shop. So, why shouldn’t we as programmers hold ourselves to an equally high standard? Our work may not mean the difference between life and death on a daily basis, but our mistakes could result in tens of thousands of dollars (or more) worth of economic damage to our employers. Depending on your business, that might indirectly affect more lives than you think. Let’s do it right the first time.
When I worked in law enforcement, I was a criminal investigator primarily tasked with pursuing fire and arson-based crimes. I’ve spent quite a bit of time recently thinking about techniques and practices that I used as an investigator to minimize my risk of making mistakes in all aspects of my work. Minimizing risk is a way of life for a police officer. I want to apply that mindset to my work as a developer, and I also want to encourage it among my team members.
So, without further ado, here are eight techniques for raising the bar on software deliverables, from a criminal investigation perspective.
In law enforcement, they say that your head is “on a swivel”. Be aware of your environment. Always know what’s around you. When entering a room, your first inclination is to note the locations of all the exits. When sitting down, put your back against a corner and face the entrance so that you have line of sight on everyone who comes in. Always watch the hands of people you approach on the street. Take note of identifying details in case you need to describe someone later.
In software development terms, maintain awareness of your surroundings by reading code. Read the code that other members of your team write. Read the third party library code in your application. Before charging ahead to implement a new feature, read the code that might be affected and understand the implications of the needed changes. In many cases, you may spend more time reading code than writing it. That’s a good thing.
Witnesses are always one of the most valuable sources of information when conducting a criminal investigation. The problem is that most people typically aren’t very observant. Getting good information from a witness is something of a painful extraction process that requires asking very specific questions in order to exercise their memories. Part of this process involves honing your ability to “read” people. Be observant of the subtle physiological reactions that your questions elicit, and practice associating those reactions with the emotions they represent.
Effectively communicating with your stakeholders is one of the most important parts of taking a software project from start to finish. All of those same communication skills are directly applicable. Asking specific, pointed questions reassures the stakeholder that you’re both talking about the exact same thing. Being cognizant of physiological responses will help you recognize when the other person doesn’t really understand, even though they might say otherwise. That’s your signal to re-frame the explanation, probably using less technical jargon. Not everyone speaks Tech, and developers are notorious for finding the most complicated way to explain simple concepts. It’s a natural reaction for people to “fake it until they make it” even if they don’t truly understand what you’re saying. That can spell disaster when the subject at hand is project requirements.
Talking to witnesses yields subjective observations, but physical evidence doesn’t forget and can’t lie. A thorough scene examination is the only way to get the objective information that you need as an investigator to draw conclusions based on fact. Conduct your examination by applying methodologies that are widely accepted, procedural, and repeatable, because you will be called upon to justify them in court, under oath. Courts will not qualify expert witnesses whose methods can’t stand up under rigorous vetting.
The “procedural” part refers to the development process, not the programming language. Some people swear by Test or Behavior Driven Development. I generally practice TDD, but that doesn’t mean it’s my exclusive mantra. Maybe you’re one of those folks who believe TDD is dead. The particular school of thought doesn’t matter as long as you have some sort of process that involves testing. Most like-minded developers will probably listen as long as you can justify your methods. The non-negotiable part is that there should be tests, regardless of when they were written. Those tests will be the record of truth for future development. They are proof that the proper specification was implemented, regardless of methodology.
Most investigators in agencies with enough personnel work in pairs. The reasoning is simple: Two pairs of eyes are better than one. A partner is an investigatory assistant, a sounding board for crazy theories, and a friend to watch your back all rolled into one. Working in pairs is safer and more productive than going it alone. The very presence of another person means any potential mistakes have to make it through an additional layer of protection.
Quality assurance can mean several different things depending on the work environment. Larger organizations may have a dedicated QA team. In small shops, it may be just another developer. If you’re a freelancer, QA might be you taking a fresh look after a coffee break. It’s preferable to have someone who was not involved in development QA your work, but that’s not always realistic. Even when you can hand your work to someone else, you should still be manually testing it beforehand. The existence of a QA department is not your excuse to pawn basic functionality testing off on someone else. That means not only testing the features you worked on directly, but any related systems that may have been affected as a result of your changes. There is never a reason for delivering code that hasn’t been run from a user’s perspective.
The act of sitting down and writing the report is both mandatory and exceedingly useful for the investigator. It forces the organization and presentation of thoughts. This often has the beneficial side-effect of raising new questions and revealing previously unconsidered connecting details. Furthermore, it’s an opportunity to tell the detailed story about how your conclusions were reached using sound investigatory methods. It will be the record on file that will represent you and your work in front of a judge and jury. It could be the deciding factor in someone’s guilt or innocence. Details matter. Professionalism matters. Judges don’t like spelling mistakes.
Documentation is more than just a collection of README files. It’s any written attempt to communicate the intent of your code to an audience. That could mean Github issue responses, JIRA comments, commit messages, or any number of things. Clarity of detail and the presentation of professionalism are just as important as they are in the investigator’s written report. Documentation for a new feature should explain how the deliverable meets the original requirements. If it’s a bug, describe the process used to diagnose and repair the problem in such a way that a reader could duplicate your actions. Consider your audience and write at a technical level that is appropriate for the reader. The goal is not to impress everyone with the depths of your knowledge, but rather to communicate well enough that the reader doesn’t need to ask any follow up questions.
Writing an arrest warrant is a detailed, often frustrating process. It’s a request to take away someone’s freedom. Not only must you carefully, laboriously lay out the facts of your case and the conclusions that you drew as a result, but you must do so using a very specific presentation style and format. Getting a warrant signed means bringing it in person to a judge at the courthouse. The judge reviews your application and, if he or she approves, has you swear an oath in their presence. This means that every mistake or forgotten detail in the warrant results in one more round trip to the office and back. That’s serious motivation to get it right the first time.
Code review is asking for a warrant that, once signed, will allow you to deploy. Developers don’t have to raise their right hands and swear an oath before receiving approval, but the vetting process should still be equally rigorous depending on the scope of change. The review may be a semi-formalized process depending on the organization, or it could be as simple as pinging a friend and asking them to review a pull request. The mechanics are not important as long as it means getting your code in front of someone else. The best reviewer is another developer who wasn’t involved in writing the code. A fresh perspective will often lead to architectural and functional improvements.
Making an arrest requires careful planning and coordination with an overall goal of controlling the environment where the arrest will be made. The best way to accomplish this is to maintain the element of surprise. Learn the target’s routine and choose a time and place where you will have a tactical (and preferably numerical) advantage. It is difficult and dangerous to make an arrest in a place that you are unfamiliar with, such as the suspect’s house. Regardless of location, maintain a heightened level of awareness and anticipation until the suspect is in jail and you are back at home or in the office. A good arrest is a well executed plan where conflict is kept to a minimum. When successful, it represents the culmination of days, weeks, or perhaps months of work.
A successful deployment is the reward at the end of the development cycle, but it too requires careful planning and coordination. Maintaining a tactical advantage means picking the right deployment strategy. End users should remain blissfully unaware of any update roll outs or restarting services. If there must be some kind of interruption, keep it as minimal as possible and choose a time that is convenient for the majority of users. Additionally, releasing code into the wild does not mean the deploy is done. There is a Danger Zone immediately following a release which may last anywhere from a few minutes to a few hours depending on the application and scope of change. Maintain a heightened sense of awareness during this time by using all available monitoring tools. Indications that something is wrong with new code may be buried in the middle of a long stack trace for something seemingly unrelated. If you work in an organization where someone else is deploying your code, the responsibility for knowing when it’s happening and subsequently monitoring the roll out still rests with you.
An investigator’s learning is never complete. There are minimum levels of government-mandated training that cover a wide array of concepts from firearms to law, but good investigators go above and beyond by seeking out sources of knowledge for staying at the forefront of their field. This includes looking critically at past cases for areas of possible improvement. The most useful resource in a group of investigators is their collective past experience.
Developers don’t have mandated continuing education, so it’s each person’s individual responsibility to continue honing their craft. Reading blog posts and tech news, listening to podcasts, learning new languages and frameworks, and experimenting with side projects are all forms of continuing education. When it comes to specific projects, retrospectives are a great way to examine both the positives and negatives of the development cycle after delivering a particular feature or product. Some organizations have a formal retrospective process, but as an individual there’s nothing wrong with taking a moment after finishing a deliverable to reflect on the experience. Recognize what went well and what didn’t. Come up with ideas for how to perpetuate the former while correcting the latter.
This ended up being a bit long winded, but it turned out to be a useful exercise in organizing my thoughts around a topic that was floating in my head for a while. I’ve had varying degrees of success in applying these principles to my own work, but I will continue trying. We all want to deliver bug-free code, and the very act of recognizing that we can improve is a step in the right direction.
I’ve read tons of great blog posts and watched dozens of awesome talks on this subject, but one thing that usually seems to be missing from the SOA discussion is an approach to handling the database. Carving your monolithic application into a series of lightweight services is probably going to be an incremental affair. You need to balance ongoing feature development and business needs against your desire to take a refactoring axe to your codebase.
If you’re taking the smallest possible first step, you’re trying to find the seams in your code in order to identify and break off that first service. What about the data? The SOA dream means that most services will eventually have their own databases, and other services that need that data will have to talk to the service that owns it. Getting there is going to be hard. If you’re a traditional Rails shop, you’re likely dealing with a single, massive SQL instance (MySQL in my case). Up until now, “scaling” the database has meant throwing more hardware at the problem. So, do we shard? Great idea! Only now you’ve introduced a significant amount of complexity and operations overhead into what was supposed to be a small, incremental step in your scaling journey. Scope creep, ahoy! Maybe there’s another way.
One thing you can easily do for your database if you haven’t already is set up replication. This is great for backups and durability, but it would be nice if we could take advantage of replication by distributing database reads from Rails across all of the replication nodes. This is harder than it seems at first glance, and a myriad of problems must be addressed: What happens if you attempt to read data from a slave that hasn’t been replicated yet? What happens if one or more slave nodes go down? How do you deal with reading data immediately after it’s been written?
Fortunately, the folks over at TaskRabbit have been working on this. They’re developing a library called Makara which makes it easy to distribute SQL reads across multiple slave servers while simultaneously addressing all of the previously mentioned problems. Makara is designed for any Ruby application, but comes with a very handy set of ActiveRecord adapters for plugging in to Rails.
If you’re anything like those of us over at Optoro, we tend to be picky about introducing new dependencies into our already bloated Gemfile. Additionally, the idea of blindly embracing something as critical as an ActiveRecord adapter from a pre-1.0 library across the entire application is very scary. So, we wanted a chance to evaluate Makara by choosing the parts of our application where we specifically wanted to distribute reads to our replication slave servers. Unfortunately, Makara doesn’t make this easy by default. It’s designed to either be on or off with very little flexibility in between. Typically, if you force Makara to read or write to your master SQL instance any time in the context of a request, it’s going to “stick” to that master for any subsequent reads for the remainder of the request. This is a good thing, but it also means you can’t choose the parts of the request where you want to distribute your reads.
We wanted to be able to do something like this…
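Something along these lines, where User and Order stand in for any ActiveRecord models:

```ruby
# Ordinary code: every query sticks to the master node.
user = User.find(1)

ActiveRecord::Base.execute_distributed do
  # Inside the block, reads are free to hit any slave node.
  active_users  = User.where(active: true).to_a
  recent_orders = Order.where('created_at > ?', 1.day.ago).to_a
end

# Back outside the block, reads stick to master again.
```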
Furthermore, we didn’t want to introduce the overhead of establishing, disconnecting, and reestablishing ActiveRecord connections within our requests. Just use the one already defined connection (Makara maintains separate pools for the various nodes), and within a block distribute the reads if it’s appropriate to do so. I was able to accomplish this by subclassing and extending the MySQL ActiveRecord adapter that comes with Makara.
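A sketch of that subclass against the Makara 0.3-era API (the class name and the distributed-mode plumbing are my own, and registering the adapter with ActiveRecord is omitted):

```ruby
require 'active_record/connection_adapters/makara_mysql2_adapter'

module ActiveRecord
  module ConnectionAdapters
    class DistributedMysql2Adapter < MakaraMysql2Adapter
      def distributed_mode?
        !!@distributed_mode
      end

      # Run the given block with reads allowed to hit slave nodes.
      def with_distributed_mode
        @distributed_mode = true
        yield
      ensure
        @distributed_mode = false
      end

      # Swap in a fresh Makara context for the duration of the block,
      # then restore the previous (master-stuck) context.
      def with_new_context
        previous_context = Makara::Context.get_current
        Makara::Context.set_current(Makara::Context.generate)
        yield
      ensure
        Makara::Context.set_current(previous_context)
      end

      # Makara's adapter returns false for reads; force master unless
      # we're explicitly inside a distributed block.
      def needs_master?(method_name, args)
        distributed_mode? ? super : true
      end
    end
  end
end
```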
By default, the Makara adapter returns false for #needs_master? if the pending SQL statement is interpreted as a read operation. I wanted it to return true all the time unless we’re operating in “distributed” mode. Unfortunately, there was still one problem. The recommended Makara configuration is to set your adapter to “sticky” mode, which means that once any SQL operation hits a particular node within a specific context, it will continue to use that node until the context changes. This is a good thing because replication across different nodes may happen at different times. You don’t want to read data from one node, only to find on a subsequent read (from a different node) that the data doesn’t exist. For our purposes, the downside is that every request starts out by reading (and writing) to the master node, so by the time we enter a distributed_mode block, the request is always “stuck” to the master node. Therefore, I made a #with_new_context method that swaps in a new Makara context for the duration of the given block and restores the previous one afterwards. This gives the request a chance to hit a slave node, and subsequently become “stuck” to whatever node it ends up with. When the block ends, the context is reset to what it was before the operation: the one that was originally stuck to the master node. It’s important to note that the context handling for Makara uses class methods and singletons, which essentially means the entire library is not threadsafe. This isn’t a problem for us at Optoro because we use a forking server model (Unicorn).
Finally, I needed a method that takes a block and uses the adapter’s #with_new_context method…
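Something like this sketch, where the guard clause provides the fallback behavior described below:

```ruby
module DistributedReads
  def execute_distributed
    adapter = connection

    # Fall back to normal behavior when the custom adapter isn't in use
    # (development mode, A/B testing in production, etc.).
    return yield unless adapter.respond_to?(:with_distributed_mode)

    adapter.with_new_context do
      adapter.with_distributed_mode do
        yield
      end
    end
  end
end

ActiveRecord::Base.extend(DistributedReads)
```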
I wanted application-wide access to the method, so sticking it on ActiveRecord::Base seemed like the way to go. I’m not in love with the idea of monkey patching ActiveRecord::Base, but it gets the job done for the time being until I can come up with a better implementation. The upside of this implementation is that it falls back to normal behavior even if we’re not using our custom ActiveRecord adapter. This means I can use ActiveRecord::Base.execute_distributed wherever I want, and if it just so happens that we’re not using the Makara adapter (development mode, A/B testing production, etc.), nothing will break. If you use this approach, you’ll always need to require the adapter even if you’re not using it.
That’s all there is to it! 60-something lines of code and one new gem, and now we have the ability to distribute reads across any number of MySQL slave nodes. Depending on how widely you use distributed mode, this has the potential to greatly reduce the load on your MySQL master node. Additionally, it will buy you some breathing room for a strained database. Most importantly, it’s a small, incremental step toward scalability that doesn’t require you to make huge, sweeping ops changes.
Make sure you check out the Makara documentation for details on configuring the gem for your particular needs.
A couple weekends ago I had the opportunity to attend Ruby for Good in Fairfax, VA at George Mason University. Ruby for Good is a conference/hackathon where participants split into teams and spend the weekend hacking on projects that benefit someone or something. The “someone or something” is typically a charitable organization of some sort, and the projects span the gamut. They could be (and were) anything from improving documentation on existing open-source projects, to building a greenfield app from scratch for a nonprofit.
The team I was on did the latter. We built a fresh web application from the ground up for a nonprofit. I had a great time, met some awesome people, and learned way more than I thought I would going in.
Our team built an issue tracker application for Pathway Homes, a nonprofit dedicated to providing housing to adults with mental illnesses. We started the weekend in the advantageous position of knowing who our stakeholders were, and having a fairly good understanding of what they needed. Up until now, tracking maintenance issues across their properties meant one employee listening to voicemails and taking notes in Microsoft Word. This was great from a developer perspective: Anything we produced would be better than the current system.
Team members started off by identifying our individual strengths. Few things make me happier than a well-designed data model and a great backend API, but somehow I ended up leading the frontend team. This was probably because I suggested using Angular. It just so happened that not only was I the only person with any experience using Angular, but several other people were immediately excited about the opportunity to learn something new throughout the weekend. I had my reservations initially, but those were soon replaced with excitement at the prospect of being way outside of my comfort zone. These kinds of conferences are all about doing something new. On top of that, I was in a position to lead and teach: Two areas in which I definitely want to improve.
I was initially concerned about the architecture of our application. I thought we were making hasty decisions that would come back to bite us later on. That fear soon faded away, and was replaced by the driving need to Get Shit Done. It was great. I’m typically the sort of person that will spend hours or days agonizing over the perfect system design. That doesn’t fly in a hackathon situation. Inspired by team members who were pushing changes left and right, I soon fell in line. The Github stats by Monday morning speak for themselves…
That is an absolutely insane amount of work accomplished by 10 people in less than 3 days. It’s even more insane when you consider that we typically called it quits each night by 9pm in order to go be social and do conference stuff. On Monday afternoon, we had a working, (mostly) production-ready application to demo. As far as I’m concerned, any other shortcoming pales in comparison with that accomplishment.
And there are certainly shortcomings. There are design problems. There is bad code all over the place (much of it written by me). Our git practices were terrible. The separation between frontend and backend is not really identifiable despite the fact that we were using a client-side framework. But you know what? That’s okay. I don’t think I would have been okay with it before Ruby for Good, but it took this experience to bring me to that realization. Perfect code is worthless if it doesn’t ship. Our goal was to write an app that would make life easier for a nonprofit, and that’s what we did.
We learned something. We shipped something. We had fun. Who can ask for more than that in one weekend?
If you’d like to contribute or you’re just curious, the code is available on GitHub.
If you enjoyed this post, please consider subscribing.
When I started at Optoro (which coincidentally was also my first professional programming experience), one of my first projects was an ETL task that had been sitting on the back burner for a while. The goal was to migrate a series of Mongo collections to equivalent MySQL database tables so that the company’s analysts could easily access the data from their Windows-based GUI SQL clients. In the case of embedded documents, the structure essentially had to be “flattened” into a series of SQL columns. An additional requirement was that the schema should be determined dynamically during each run: if we started adding arbitrary new fields to future Mongo documents, the program should recognize that and adjust the destination SQL schema appropriately during the next run.
My first attempt at a solution was a very crude Ruby implementation. It “worked”, but it used a ludicrous amount of memory and was horrendously slow. When I say slow, I mean it took over a day to process the entire events collection. The collection was admittedly over a terabyte in size, but that was still unacceptable. If the code had been capable of determining the most recently translated record’s timestamp and only pulling events that were created after that point in time, maybe the other limitations would have been acceptable. But that wasn’t the case.
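For illustration, that incremental approach boils down to tracking a high-water mark and filtering on it. A rough sketch (the collection, field names, and connection details are all hypothetical):

```ruby
require 'mongo'

# Hypothetical names throughout; the idea is just "only pull new records."
client = Mongo::Client.new('mongodb://localhost:27017/source_db')

# Stand-in value; in practice, read the newest translated timestamp
# from the destination MySQL database.
last_translated_at = Time.now - 86_400

client[:events].find(created_at: { '$gt' => last_translated_at }).each do |event|
  # Flatten the document and write it to the destination SQL table here.
end
```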
Over the next few months, I spent a great deal of time reading about ETL and data warehouse solutions. I came into this project without any knowledge of these things, and with programming experience largely limited to typical Ruby on Rails applications and other small-scale school projects. After evaluating other ETL solutions in the wild, I settled on a second iteration that used Pentaho’s Kettle. Kettle is an open-source Java ETL engine provided by a company that’s been doing ETL for over a decade.
My thought process behind using Kettle was simple: why reinvent the wheel when people who are smarter than I am have already figured out these problems? I spent the next few weeks implementing version 2 of my ETL solution using Kettle. Working with Kettle involves a kind of visual programming in a drag-and-drop GUI. Transformation components (read from a database, match against a string, etc.) are arranged on a canvas, connected together, and then configured using step-specific menus. In some cases, you can script custom solutions using JavaScript, Java, and, with help from a third-party plugin, even Ruby. There are literally hundreds of pre-defined steps for everything from reading a CSV file to interacting with a Salesforce module.
By the end of this second iteration, I had a working solution that was reliable and performant. On each run, only new data was processed from the source collection and subsequently written to the destination database. From the end users’ perspective (our analysts), it was a success.
Unfortunately, it was not a success from a development perspective. Kettle was designed for non-programmers. The development environment is mouse-click heavy with tons of windows and menus for every minute detail. It’s possible to dive into the Java source in order to manage, create, or extend any component of the system, but that’s not the Kettle Way. They’ve gone above and beyond in order to make sure that anything and everything you could possibly want to do to your data is available in a pre-baked configurable step from inside the GUI. This is great for ensuring that Kettle can handle any conceivable ETL problem, but it also results in a lot of complexity. My Kettle project was massive and completely unmaintainable. Making a change required at minimum a full day of refreshing myself on how things worked. Debugging anything was a nightmare. It was completely untestable, and to make matters worse I was the only developer on our team who had any idea how to work with Kettle.
A few weeks ago, we made some changes to our Mongo infrastructure that required refactoring how my ETL project works. I dreaded even opening the project files. Then I had an epiphany: why not do it in Ruby? It’s been a year since my first failed attempt. I’ve spent that year reading about, working with, and generally absorbing a great deal of knowledge about ETL solutions. As the only ETL developer on the team, I could say with certainty that re-implementing this entire thing in Ruby would make developer happiness increase by 1000%.
Enter Rodimus.
One major thing I’ve learned about ETL in the past year or so is that solutions tend to be very targeted to their specific domain. When you try to generalize too much, you end up with a conglomerate of options and a barrier to entry that is far too high. Kettle is, by design, the one-stop shop for ETL solutions. Entire books have been written on it because it takes an entire book to even begin to understand everything it has to offer. That’s all well and good if you’re a non-programmer working with ETL, but I wanted something simple, clean, and easily maintainable. I wanted something that would fit nicely into our Ruby ecosystem.
I approached Rodimus with the goal of simplicity. I wanted something very lightweight with minimal dependencies upon which targeted ETL solutions could be implemented. Its forking process approach to concurrency is actually inspired by Kettle’s design. Check out the README for more details.
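To give a flavor of the forking idea in general terms (this is a generic sketch, not Rodimus’s actual API):

```ruby
# Generic fork-per-stage pipeline: each stage runs in its own process,
# connected to the next by a pipe. Not Rodimus's actual API.
reader, writer = IO.pipe

extractor = fork do
  reader.close
  %w[row1 row2 row3].each { |row| writer.puts(row) } # stand-in extraction
  writer.close
end

loader = fork do
  writer.close
  reader.each_line { |row| puts "loading #{row.chomp}" } # stand-in load step
  reader.close
end

# The parent closes both ends so the loader sees EOF when extraction finishes.
[reader, writer].each(&:close)
Process.wait(extractor)
Process.wait(loader)
```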
It only took me a couple of nights to produce the first version of Rodimus. A few days later, I had rewritten our entire ETL stack on top of it. When I look at the code now, its simplicity still surprises me when compared to the monstrosity that I had previously implemented in Kettle. Approaching the ETL project is no longer an exercise in frustration. I am more confident in my (now testable) solution, and I can easily share an understanding of the project with my coworkers. In all, I am a much happier developer.
I have future plans for Rodimus, but I think I will continue to strive for simplicity at its core. ETL can be complex, but that complexity should live solely in the specific application. It shouldn’t be the concern of the ETL engine.
If you enjoyed this post, please consider subscribing.
After this happened a few times, I realized that I needed to examine the running process for more information. My limited C/C++ exposure told me that gdb would be a great tool for this, but I wasn’t sure how I could get useful Ruby information after attaching to the running process. A little googling led me to an excellent blog post by Thoughtbot that included some helpful gdbinit definitions.
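They look roughly like this (reconstructed from memory, so treat the exact strings as approximate):

```
define redirect_stdout
  call rb_eval_string("$_old_stdout = $stdout.dup; $stdout.reopen('/tmp/ruby-debug.' + Process.pid.to_s)")
end

define ruby_eval
  call (rb_p(rb_eval_string_protect($arg0, (int*)0)))
end
```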
The first function redirects the process’s standard out to a temporary file, which you can then tail -f in order to see what’s going on. The second function allows you to execute arbitrary Ruby code inside the process. Assume you have the above definitions in your ~/.gdbinit file and a running irb process with pid 12345.
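The session goes roughly like this (the Ruby path is illustrative):

```
$ gdb /usr/local/bin/ruby 12345
(gdb) redirect_stdout
(gdb) ruby_eval "puts caller"
(gdb) detach
```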
Obviously, substitute your own path to the appropriate Ruby binary. In a separate window, you can tail -f /tmp/ruby-debug.12345 after the redirect_stdout command. The ruby_eval command will then output the current execution stack for the running process. Your ruby_eval calls run in the context of the currently executing code, so, for instance, you could see a list of currently defined local variables with Kernel#local_variables. You could then examine each of those variables in turn in order to get an idea of what was going on at that particular point in your code.
If you’re curious, as I was, about the call to rb_eval_string_protect() (an internal C function in the Ruby source): the first argument is the string of Ruby code to execute, and the second is a pointer to an int that receives the error status. A status of 0 means the code executed successfully; a non-zero status indicates an error, in which case the function returns nil.
These little gdb tricks have changed my world when it comes to debugging. I use this technique all the time.
If you enjoyed this post, please consider subscribing.
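First, a quick refresher. Ruby class methods are defined on the class itself rather than on instances (names here are illustrative):

```ruby
class Widget
  # A class method: called on Widget itself, not on Widget instances.
  def self.table_name
    'widgets'
  end
end

Widget.table_name # => "widgets"
```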
Fairly straightforward. Plenty of libraries use class methods like this. You can easily group related methods together in your own modules and then have them defined on the class when the module is later included. This is accomplished with the Module#included hook.
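A sketch of the pattern (module and method names are illustrative):

```ruby
module Greetable
  module ClassMethods
    def greeting
      "Hello from #{name}"
    end
  end

  # Runs when the module is included; extends the host class
  # so everything in ClassMethods becomes a class-level method.
  def self.included(base)
    base.extend(ClassMethods)
  end
end
```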
Then, include the module in one of your other classes, and the class methods come along for free.
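Continuing the sketch:

```ruby
class Gadget
  include Greetable
end

Gadget.greeting # => "Hello from Gadget"
```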
A friend of mine has been working on a Ruby MUD server that I help with when time permits. One thing we want to do is persist game objects to disk using any number of backends (Mongo, SQLite, etc.). We want an API that’s storage-agnostic and talks to the appropriate mechanism by way of an adapter. The adapter should be hidden from the main application. We want to be able to write object classes along these lines.
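For example (a sketch; the field names are invented):

```ruby
class Player
  include Persistable

  field :name
  field :location
end
```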
This is very similar to the Mongoid syntax (a gem my friend and I are both fans of). Using the above pattern in our Persistable module lets us accomplish exactly what we’re going for. The shape of the code after our initial refactor is below.
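Here is a simplified sketch of that refactor; the in-memory adapter stands in for the real Mongo/SQLite backends:

```ruby
# Trivial adapter standing in for Mongo, SQLite, etc.
class MemoryAdapter
  def initialize
    @store = Hash.new { |hash, key| hash[key] = [] }
  end

  def write(collection, attrs)
    @store[collection] << attrs
  end
end

module Persistable
  module ClassMethods
    # Mongoid-style macro: declare a persisted attribute.
    def field(name)
      fields << name
      attr_accessor name
    end

    def fields
      @fields ||= []
    end

    # Whatever backend adapter is assigned here handles actual storage.
    attr_accessor :adapter
  end

  def self.included(base)
    base.extend(ClassMethods)
  end

  # Collect the declared fields into a hash the adapter can store.
  def attributes
    self.class.fields.each_with_object({}) do |name, attrs|
      attrs[name] = public_send(name)
    end
  end

  def save
    self.class.adapter.write(self.class.name, attributes)
  end
end

# Hypothetical usage with the Player class above:
Player.adapter = MemoryAdapter.new
hero = Player.new
hero.name = 'Zed'
hero.save # writes { name: "Zed", location: nil } via the adapter
```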
If you enjoyed this post, please consider subscribing.
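Rails lets you pull validation logic into a dedicated class with validates_with. Wired up, it looks roughly like this (class names are illustrative):

```ruby
class PersonValidator < ActiveModel::Validator
  def validate(record)
    record.errors.add(:base, 'This person is not valid') if record.name.blank?
  end
end

class Person < ActiveRecord::Base
  validates_with PersonValidator
end
```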
This allows you to define your validation methods in a custom class. It can be useful for extracting validation behavior out of your model. However, what you might not know is how this class is instantiated by Rails. I assumed that a new instance of my validator would be created each time validation was performed on an instance of the model during a web request. I also assumed that the validator instance would stick around if multiple validation attempts were made. Based on those assumptions, I attempted to memoize the model instance inside my validator.
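The broken version looked more or less like this:

```ruby
class PersonValidator < ActiveModel::Validator
  def validate(record)
    @record ||= record # the bug: cached across every future validation run

    validate_name
    validate_email
  end

  private

  def validate_name
    @record.errors.add(:name, 'is required') if @record.name.blank?
  end

  def validate_email
    @record.errors.add(:email, 'is required') if @record.email.blank?
  end
end
```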
This won’t work, and leads to some unexpected behavior. Everything seemed fine while submitting my form via the browser, but my tests were failing. Things that should have been valid weren’t, and vice versa. This eventually led me to do a little research on how Rails uses validators.
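Condensed from the Rails source of the time, validates_with boils down to this:

```ruby
def validates_with(*args, &block)
  options = args.extract_options!
  options[:class] = self

  args.each do |klass|
    validator = klass.new(options, &block) # one instance, created right here
    validate(validator, options)           # registered for every future run
  end
end
```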
The moral of the story is that validates_with is a class method that creates a single instance of the validator when the model class is first loaded. If you memoize an instance variable inside the validator, it will not be replaced on successive calls to the validate method. In other words, the validator might be trying to validate the wrong object.
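The fix was simply to stop caching and pass the record around explicitly. For example:

```ruby
class PersonValidator < ActiveModel::Validator
  def validate(record)
    validate_name(record)
    validate_email(record)
  end

  private

  # No instance state: each run operates on the record it was given.
  def validate_name(record)
    record.errors.add(:name, 'is required') if record.name.blank?
  end

  def validate_email(record)
    record.errors.add(:email, 'is required') if record.email.blank?
  end
end
```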
If you enjoyed this post, please consider subscribing.
After surveying the landscape, I decided that Clojure would be my language of choice. Why Clojure? For one, it runs on the JVM and is able to take advantage of all the default Java/JVM tools and libraries. Other than that, it was mostly a random pick. Functional programming seems to be the new (old?) hotness these days, and there are a ton of options, many of which bring similar things to the table. In the end, I chose Clojure because it’s an evolution of Lisp, which is itself very similar to the MUSH/MUX code I spent much of my spare time writing in younger days.
I took a stab at Clojure by writing a simple telnet chat server, which is usually my go-to app for learning a new language.
Documentation for Clojure isn’t bad. I mainly used two sites: the Clojure Documentation Site and Mark Volkmann’s tutorial. I found a healthy mix of guides and examples, ranging from easy to advanced, to help me on my way.
Overall, programming in Clojure was a pretty pleasant experience. I occasionally struggled to find alternatives to defining variables and maintaining application state, but I think that has less to do with Clojure itself and more to do with my inexperience with functional programming. Clojure syntax is fairly intuitive and easy to use when it comes to native Clojure concepts, but I found it a little less so when dealing with Java libraries.
I ended up relying on those Java libraries for a lot of input and output in my application, mainly because I couldn’t find a native Clojure socket library. I also found myself falling back on Java threads for concurrency. Clojure offers some very nifty concurrency tools in the form of delays, futures, and promises, but none of them solved my need for long-running loops in a separate thread. I’m not sure if the use of native Java threads is intended when using Clojure, or if that was a result of my inexperience with the language. One thing I really enjoyed about concurrency in Clojure was the ready-to-use thread-safe reference types, such as atoms and agents.
At the end of the day, I can definitely see the advantage of using Clojure (or any functional language) for certain purposes, specifically math-heavy computations and tasks that require lots of concurrency. I don’t know if Clojure will be a go-to for me outside of those two applications. I’ll have to take a stab at other functional languages in order to form a better opinion.