Intention-Revealing Methods Can Save Your App
Read on if: your methods are huge, buggy, hard to read/fix/change or painful to test.
We've all written code like this:
def process_data
raw_data = load_data()
clean_data = raw_data.map do |row|
%w{ name address phone email }.each do |field|
if (row[field].nil? || row[field] == "")
row[field] = "Unknown"
end
end
row
end
persist(clean_data)
end
We've got some data, we know we need to clean it up before we can safely persist it, so we just clean it at the point we need it cleaned. It's the simplest thing to do.
The problem is that in order to realize that we're cleaning (the variable name does give a hint), or to understand what "cleaning" means (when we want to change it or fix a bug) we have to analyze the code to discover that "this is the bit that guarantees all required fields have a reasonable value in them."
Also, cleaning in-place obscures the overall behavior of
process_data
, which is: load, clean, persist.
What if we did this instead?
def process_data
raw_data = load_data()
clean_data = raw_data.map { |row| fill_in_required_fields(row) }
persist(clean_data)
end
def fill_in_required_fields row
%w{ name address phone email }.each do |field|
if (row[field].nil? || row[field] == "")
row[field] = "Unknown"
end
end
row
end
Wrapping the ugly bit in a method whose name explains its purpose makes the code self-documenting and cleans-up the calling method. We call this pattern "Intention-Revealing Method".
Let's not kid ourselves: all the complexity of filling-in
fields is still there, but it's been moved to a method
that only fills-in fields for a single row, which is easier
to think about because we can ignore everything else
that process_data
is doing.
It gets even better when there are multiple steps to get a "clean" row.
def process_data
raw_data = load_data()
clean_data = raw_data.
select{ |row| not_in_system(row) }.
select{ |row| is_active(row) }.
map{ |row| fill_in_required_fields(row) }.
map{ |row| only_relevant_fields(row) }
persist(clean_data)
end
# ...all support methods
Imagine how large and difficult-to-read the last
example would be if all the logic was implemented at the
point of use! Now we can see the "big picture" view of
the process_data
behavior and if we're
interested in any of the specific aspects, we can jump
to the method that implements it.
We've also created new testing opportunities: we can pass a sample row object into any of those methods and ask "does this particular bit work?" without worrying about any of the other bits messing up our results.
Let's take the final step.
def process_data
raw_data = load_data()
clean_data = DataCleaner.clean(raw_data)
persist(clean_data)
end
module DataCleaner
def self.clean data
data.
select{ |row| not_in_system(row) }.
select{ |row| is_active(row) }.
map{ |row| fill_in_required_fields(row) }.
map{ |row| only_relevant_fields(row) }
end
# ...all the supporting methods
end
Now process_data
is distilled down to its
essence (load, clean, persist) and the details of what it
means to clean data is neatly gathered in a dedicated module
called DataCleaner
which can be extended and tested
independently of process_data
.
DataCleaner
knows nothing about getting or persisting
data and the only thing process_data
knows about "cleaning"
is that DataCleaner.clean
knows how to do
it. DataCleaner
can be tested with any data we
can fake up and doesn't require it to be loaded or persisted,
greatly-simplifying our setup and results-checking.
Choosing good names
Of course, it's still possible to ruin all our good work by choosing names that are brittle, misleading or not helpful.
Good names describe what we need done, not how to do it. Keeping this in mind leads us to names that are "future-proofed": they won't have to change if the underlying code changes.
Good name: not_in_system
Bad name: cant_find_id_in_mysql
The bad name breaks if we change how we determine a row is missing, or if we switch from mysql to postgres or redis.
Good name: fill_in_required_fields
Bad name: fill_in_name_and_address
The bad name breaks if we add/remove required fields.
Takeaways
-
When you realize you need a bit of behavior at a certain spot, write something to do it for you (a method or even a class/module) with a name explaining the behavior, then call that thing from the spot you need it.
-
When choosing a name, explain the behavior in terms of "what" and not "how" in order to future-proof your code.