Sunday, December 16, 2012

Oh noes, another post on naming conventions?

Hi, today I'd like to write about a naming convention that I picked up some time ago and which helps me write more understandable and better-focused code.

Just to make things clear - I'm not going to try to convince anyone that what I suggest is "the best" or "the right" thing to do. I just want to share what works for me.

OK, now that I've explained myself, it's time for the rule, and the rule is simple:

Name your classes and methods after domain concepts

Did I just hear you laughing and shouting "You don't say!"? Just wait a little before leaving, I want to explain how I understand this rule. All I want to say is: when you name your classes and methods, use names that would be understandable to a domain expert, not only to a programmer. There are a couple of things to keep in mind when following this rule:

Watch out for "er" or "or"

I don't know why, but we software engineers suffer from a cognitive bias that tells us each class we create must be a kind of entity - a material, physical thing. If there's no such "thing" to represent a domain concept, we make one up. For example, we tend to call classes that perform some kind of processing "XYZProcessor". Also, when we want to validate anything, we create "validators". When we need a special way to run part of the logic, we create "runners".

In my opinion, this kind of naming has some flaws that I'd like to list below:

  1. I mean, come on! What are these? One thing is sure - they're not domain concepts. If you told your user that you're "passing the frame through a validator", they'd be like "What?". The language used for communication between programmers and other stakeholders suffers and requires translation between the "design language" and the "domain language".
  2. This kind of approach tends to create a lot of repetition when using classes and methods. These repetitions make the code harder to read. The worst thing that we can get is something like:
    frameValidator.ValidateFrame(frame);
    
    Just read it aloud and ask yourself whether it forms a text that is easy to memorize. Even if we work a little on this example, the best we can come up with is:
    validator.Validate(frame);
    
    Which still has one repetition. While this is just one statement, when you run into code that's full of such constructs, it's very hard to rebuild a domain model in your mind.
  3. Also, as I learned, such a naming convention makes it easier to cross the boundaries of class responsibilities. For example, let's take a "MessageProcessor". The name suggests that it has something to do with processing messages, but the relationship is indirect. So, one day, someone thinks "this processor is an actor, so maybe it can do more than one thing" and puts in a method for persisting the messages. I've seen countless times that if something is not directly related to a domain concept, it eventually gets used for some additional tasks.

Of course, there are well-established names like "Builder" which clearly point to the design pattern used. These are sometimes appropriate. Also, it's fine to name your class "Processor" if it really models a processor (e.g. in some closer-to-hardware systems). Anyway, as a general rule, when I encounter a name ending with "er" or "or", I ask myself: is it named after a thing or a process? If it's named after a thing, it's okay. If it's named after a process (e.g. validation, encryption, sending are all processes), there is almost certainly a better name that fits the domain better.

Let me offer some examples of how things usually get named and what naming I use instead (the examples use variables, but I usually name variables more or less after their classes):

I don't use...                          Instead, I use...
validator.Validate(frame)               validation.ApplyTo(frame)
httpSessionTimerRunner.Run()            httpSessionExpiry.Schedule()
parallelExecutor.Execute(task)          parallelExecution.PerformFor(task)
serviceManager.DoWork()                 serviceWorkflow.Start()
requestProcessor.Process(request)       requestProcessing.ExecuteWith(request)
requestProcessor.CanProcess(request)    requestProcessing.IsApplicableTo(request)
fileNameFormatter.Format(fileName)      fileNameFormat.ApplyTo(fileName)

I learned that such names improve communication, help sharpen the focus of a class, and make it easier to build a conceptual domain model just from looking at the code.
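To make the first pair above a bit more concrete, here is a minimal sketch of a class named after the process rather than after a made-up actor. The Frame type, its Payload property, and the maximum-length rule are assumptions invented for illustration; only the name of the ApplyTo method comes from the examples above.

```csharp
using System;

// Hypothetical domain type, invented for this sketch.
public sealed class Frame
{
    public Frame(byte[] payload) { Payload = payload; }
    public byte[] Payload { get; }
}

// Named after the process ("validation"), not after an actor ("validator").
public sealed class FrameValidation
{
    private readonly int _maxPayloadLength;

    public FrameValidation(int maxPayloadLength)
    {
        _maxPayloadLength = maxPayloadLength;
    }

    // Throws when the frame breaks the assumed size rule; otherwise does nothing.
    public void ApplyTo(Frame frame)
    {
        if (frame.Payload.Length > _maxPayloadLength)
        {
            throw new ArgumentException("Frame payload is too long.", nameof(frame));
        }
    }
}
```

With a variable named after the class, the call site reads validation.ApplyTo(frame); without repeating any word.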

Avoid putting implementation details inside higher-level abstractions

This most often applies to abstractions created around collections. For example, let's take a tree-like structure, where each tree node has its children. Sooner or later, we discover that we need a separate abstraction over the set of child nodes, because we need some customized search criteria etc. for the child nodes.

How would you name such an abstraction? ChildrenSet? ChildrenList? Why not just call it Children - after all, that's the concept from the domain that this abstraction models, isn't it? Why the need for "List" or "Set" or anything else? Why reveal the internal implementation detail, which is how the elements are stored? What if one day you decide to split the set of children into two sets internally - one for terminal and one for non-terminal children - to speed up searching (if you know upfront that a node is terminal, you don't have to check its children at all)?
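Here is a rough sketch of what such a Children class could look like once the storage has been split in two. The Node type and the Add, Count, Named and Terminal members are made up for this example; the point is only that nothing in the public surface says "List" or "Set", so the split stays invisible to callers.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical node type, invented for this sketch.
public sealed class Node
{
    public Node(string name, bool isTerminal)
    {
        Name = name;
        IsTerminal = isTerminal;
    }

    public string Name { get; }
    public bool IsTerminal { get; }
}

// Named after the domain concept; the name says nothing about how nodes are stored.
public sealed class Children
{
    // Internal detail: terminal and non-terminal children are kept separately.
    private readonly List<Node> _terminal = new List<Node>();
    private readonly List<Node> _nonTerminal = new List<Node>();

    public void Add(Node child)
    {
        (child.IsTerminal ? _terminal : _nonTerminal).Add(child);
    }

    public int Count => _terminal.Count + _nonTerminal.Count;

    // Returns the first child with the given name, or null when there is none.
    public Node Named(string name)
    {
        return _terminal.Concat(_nonTerminal).FirstOrDefault(child => child.Name == name);
    }

    // Terminal children only; callers never learn that they live in their own collection.
    public IEnumerable<Node> Terminal => _terminal;
}
```

If the two lists were later merged back into one, or replaced with a dictionary, neither the class name nor any caller would have to change.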

In rare cases, name objects and classes after verbs

Do you know the Assert class from xUnit-type frameworks? Ever wondered why it's called "Assert" and not "Asserter" or "Assertions"? That's because it is a part of a mini-language and helps form statements that read more like sentences, which improves readability. The Assert class is focused so much on a single task that it makes perfect sense to name it after a verb, not a noun as is usually suggested. In the long run, it made it possible to form expressions like this (in NUnit):

Assert.That(X, Has.Property("Y"));

Personally, I tend to apply this convention to small factories and other small objects that have a very strong focus. Let's say that I want to create credentials from a User object and this object has either the user name or the account name filled in. I prefer to create the credentials from the account name because it's globally unique, but if the account name is not supplied, the creation method returns null and I create less-privileged credentials based on the user name, which is only unique locally. Thus, I may declare the credentials factory like this:

CredentialsFactory createCredentials;

And then use it like this:

var credentials 
  = createCredentials.BasedOn(user.Account) 
    ?? createCredentials.BasedOn(user.Name);

Again, I don't use this too often, but I sometimes do.
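For completeness, here is one way the factory behind that snippet might look. This is only a sketch under assumptions: the Credentials type is invented, both user.Account and user.Name are assumed to be plain strings, and the difference in privilege levels is left out entirely.

```csharp
// Hypothetical credentials type, invented for this sketch.
public sealed class Credentials
{
    public Credentials(string identity) { Identity = identity; }
    public string Identity { get; }
}

public sealed class CredentialsFactory
{
    // Returns null when no name was supplied, so callers can chain fallbacks with ??.
    public Credentials BasedOn(string name)
    {
        return string.IsNullOrEmpty(name) ? null : new Credentials(name);
    }
}
```

Returning null for a missing name is what lets the ?? operator fall through to the second BasedOn call.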

OK, that's all for today. Feel free to leave a comment and share your tips for naming abstractions, variables and methods.
