The power of terminology in a codebase
A few months ago, there was a controversial article named Goodbye, Clean Code written by the famous Dan Abramov.
The article received multiple angry responses. Some claim that duplicate code is wrong and that we should create an abstraction to reduce duplication whenever possible. Others claim that duplicate code is fine.
Sandi Metz, the famous author of Practical Object-Oriented Design, even claimed that duplication is far cheaper than the wrong abstraction.
There will always be an argument on whether we should leave code duplicates or create an abstraction. In this article, I invite you to take a look at one particular perspective on the cost of abstraction — the term.
Consider this scenario. A company sells goods through both offline and online channels. First, customers receive the price quotation through an offline channel, and then later on they put products they wish to purchase into a shopping cart on an e-commerce system, which is an online channel.
The company has a tiered loyalty program in which customers enjoy different discounts depending on their rank. This is what the price calculation in the shopping cart looks like:
class ShoppingCart
  def net_price
    # Grade A: 7% off at any time
    return price * 0.93 if customer.level == :grade_a
    # Grade B: 5% off during summer
    return price * 0.95 if customer.level == :grade_b && summer?
    # Grade C: flat 2000 off on Sundays
    return price - 2000 if customer.level == :grade_c && Time.now.sunday?
    price
  end
end
Then in the quotation, the exact same thing needs to be done.
class Quotation
  def net_price
    # Grade A: 7% off at any time
    return price * 0.93 if customer.level == :grade_a
    # Grade B: 5% off during summer
    return price * 0.95 if customer.level == :grade_b && summer?
    # Grade C: flat 2000 off on Sundays
    return price - 2000 if customer.level == :grade_c && Time.now.sunday?
    price
  end
end
So, you see the duplication, and you abstract it into a module.
module IamNameless
  def net_price
    # Grade A: 7% off at any time
    return price * 0.93 if customer.level == :grade_a
    # Grade B: 5% off during summer
    return price * 0.95 if customer.level == :grade_b && summer?
    # Grade C: flat 2000 off on Sundays
    return price - 2000 if customer.level == :grade_c && Time.now.sunday?
    price
  end
end
The classes then become:
class Quotation
  include IamNameless
end

class ShoppingCart
  include IamNameless
end
The code duplication is gone. Hooray!!
I invite you to compare the above code to the duplicated version. Is this new code easy to reason about, or easy to maintain?
Clearly not, since IamNameless does not convey any meaning.
Anyone who reads the Quotation class would ask: what does that module mean? Why is it here? There will be a lot of questions and a lot of swearing if we leave the code in this state.
I would argue that the duplicate version is better than this version.
That is, until we properly name the module:
module Pricing
  def net_price
    # Grade A: 7% off at any time
    return price * 0.93 if customer.level == :grade_a
    # Grade B: 5% off during summer
    return price * 0.95 if customer.level == :grade_b && summer?
    # Grade C: flat 2000 off on Sundays
    return price - 2000 if customer.level == :grade_c && Time.now.sunday?
    price
  end
end
And the classes become:
class Quotation
  include Pricing
end

class ShoppingCart
  include Pricing
end
Now, this code is easy to reason about and easy to maintain.
The only difference between the IamNameless version and the Pricing version is the term.
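To see the named version run end to end, here is a minimal sketch. The Customer struct, the today attribute (used instead of Time.now so the result is repeatable), the definition of summer as June through August, and the exact discount rules are all assumptions made for illustration; the real system would supply its own.

```ruby
require "date"

# Hypothetical customer record; only the loyalty level matters here.
Customer = Struct.new(:level)

# Shared pricing rules, named after the business concept they implement.
module Pricing
  def net_price
    return price * 0.93 if customer.level == :grade_a
    return price * 0.95 if customer.level == :grade_b && summer?
    return price - 2000 if customer.level == :grade_c && today.sunday?
    price
  end

  private

  # Assumption: "summer" means June through August.
  def summer?
    (6..8).cover?(today.month)
  end
end

# Both classes expose price, customer and today, which Pricing relies on.
Quotation    = Struct.new(:price, :customer, :today) { include Pricing }
ShoppingCart = Struct.new(:price, :customer, :today) { include Pricing }

winter_sunday = Date.new(2024, 1, 7) # a Sunday outside summer
puts ShoppingCart.new(10_000, Customer.new(:grade_c), winter_sunday).net_price # => 8000
puts Quotation.new(10_000, Customer.new(:grade_b), winter_sunday).net_price    # => 10000
```

A reader who opens either class sees include Pricing and knows where the discount rules live without reading the module body.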
The cost of abstraction
When you introduce any abstraction, you need to name it. You need to create a term for it.
As I demonstrated earlier, if you use a terrible term, the duplicated code is easier to read and reason about.
Conversely, if you use a proper term, the abstracted code becomes easier to read and reason about.
As the saying attributed to Phil Karlton goes, there are only two hard things in computer science: cache invalidation and naming things.
For every abstraction, you introduce a new term into the codebase. That means every contributor needs to learn a new term. That takes time. That is a price you need to pay.
In a large team, you need to ensure that everyone has the same idea of what each term means. It could be by documenting terms or by explaining them to every contributor.
That is what I’ve found most software engineers and software architects overlook. We sometimes focus on how to abstract things the right way, with the correct approach or design pattern.
But just like the IamNameless example above, the right abstraction with the wrong name can be harder to maintain.
And we don’t abstract or engineer the code just to be right. The “right way” does not matter. We care about these practices because they provide actual benefits to actual humans who are contributing to the system.
If we focus on being right, and we define duplication as wrong, then it is very easy to be the “right guy” by mindlessly removing duplication. You might end up with a problematic codebase.
We need to focus on the cost-benefit of each abstraction. For example:
- If there are 2 duplicated instances, and you need to introduce a long or unusual name such as AbstractBeanFactoryAwareThing, then I would argue that the cost of understanding this term outweighs the benefit gained from the code reduction.
- If there are 10 duplicated instances, even a very abstract term makes sense. The benefit gained outweighs the cost.
I encourage everyone to be mindful of these trade-offs.
The price of the term
Have you ever thought about how long it takes for every web developer to understand what controller in MVC architecture means?
Each term has a different price. Some are cheap, while some can be very expensive.
The cheapest type of terms is domain-specific terms. These are terms that are widely used in your business domain. For example, in the payment gateway business we have terms such as refund, void, settlement, etc.
Every developer needs to understand the business requirements before making any contribution to the codebase. It does not matter if they are junior, mid-level, senior, or John Carmack. If you use terms from the business requirements, chances are that they already know them. As a bonus, you make it harder for any contributor to make a random change without understanding what the business goal is. A huge bonus here.
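As a rough illustration, here is how the payment gateway terms mentioned above could appear directly in code. The Charge class, its fields, and its simplified state transitions are hypothetical; the point is only that the method names match the words the business already uses.

```ruby
# Hypothetical charge in a payment gateway; states are simplified assumptions.
class Charge
  attr_reader :amount, :status

  def initialize(amount)
    @amount = amount
    @status = :authorized
  end

  # Domain term: cancel an authorized charge before it is settled.
  def void
    raise "only authorized charges can be voided" unless status == :authorized
    @status = :voided
  end

  # Domain term: the funds for an authorized charge are actually collected.
  def settle
    raise "only authorized charges can be settled" unless status == :authorized
    @status = :settled
  end

  # Domain term: return money to the customer after settlement.
  def refund
    raise "only settled charges can be refunded" unless status == :settled
    @status = :refunded
  end
end

charge = Charge.new(1_000)
charge.settle
charge.refund
puts charge.status # => refunded
```

Compare this with a generic method such as update_state(3): the reader would have to learn a private vocabulary before connecting the code to the requirement.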
Next, we have industry-standard terms: Factory, Controller, Model, View, Facade, Adapter, Component, etc. These are pricier than domain-specific terms, as contributors first need to learn what each of them refers to. Some of these terms are taught in university.
Here comes the tricky part. Many developers naively assume that the term known to them is a common industry-standard term. When you create a class name like AbstractBeanFactoryFacade, you cannot assume that others would interpret it in the same way.
Personally, I always try my best to find domain-specific terms that can describe the same abstraction before falling back on industry-standard terms. However, I see a lot of developers do the opposite.
You can immediately tell that developers have a detachment from the business side when there is heavy use of technical terms in the codebase. I have even thought of creating a metric that measures the ratio between business-related terms and technical terms in a codebase. That metric can give you a sense of how well the developer understands the business impact of what they build.
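One hypothetical way to approximate such a metric is to split identifiers into words and count them against two vocabularies. In the sketch below, the word lists, the tokenization, and the names identifier_words and term_ratio are all assumptions; a real team would curate the vocabularies from its own glossary, and the number would be a conversation starter rather than a precise measurement.

```ruby
# Hypothetical vocabularies; a real team would curate these from its glossary.
BUSINESS_TERMS  = %w[refund void settlement quotation cart customer price].freeze
TECHNICAL_TERMS = %w[factory facade adapter controller manager helper bean].freeze

# Split source text into lowercase words, breaking snake_case and CamelCase.
def identifier_words(source)
  source.scan(/[A-Za-z_][A-Za-z0-9_]*/)
        .flat_map { |id| id.split(/_|(?<=[a-z0-9])(?=[A-Z])/) }
        .reject(&:empty?)
        .map(&:downcase)
end

# Business-term occurrences divided by technical-term occurrences,
# or nil when no technical term appears at all.
def term_ratio(source)
  words = identifier_words(source)
  business  = words.count { |word| BUSINESS_TERMS.include?(word) }
  technical = words.count { |word| TECHNICAL_TERMS.include?(word) }
  return nil if technical.zero?

  business.fdiv(technical)
end

sample = <<~SOURCE
  class RefundFactory
    def build_refund(customer); end
  end
SOURCE

p term_ratio(sample) # => 3.0 (refund x2 and customer x1, against factory x1)
```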
Next, we have codebase-specific terms. So you have an abstraction that is very specific to your system. You need to invent a new name. Some maintenance cost comes with those names. You need to explain it to the team. You need to document it. You need to make sure new joiners don’t misunderstand it. My advice is to be mindful before you introduce this type of term.
The most expensive type of all is inconsistent terms. When one word has multiple meanings, things can get complicated. This usually happens as a result of a change of context. For example, when building an organization management system, the term “promotion” in the human resources module would refer to an employee stepping into a new position. In the sales module, the same term would refer to a product discount plan. For this type of term, you need to draw a clear context boundary. Domain-driven design has a concept called context mapping that addresses this issue.
Treat these as general guidelines.
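In Ruby, one lightweight way to draw such a boundary is to give each context its own module namespace, so that each Promotion carries only the meaning of its context. This is a sketch of the idea, not a full treatment of context mapping; the module names, fields, and describe method are assumptions.

```ruby
# Each context gets its own namespace, so "Promotion" is never ambiguous.
module HumanResources
  # An employee stepping into a new position.
  Promotion = Struct.new(:employee, :new_position) do
    def describe
      "#{employee} is promoted to #{new_position}"
    end
  end
end

module Sales
  # A product discount plan.
  Promotion = Struct.new(:product, :discount_percent) do
    def describe
      "#{product} is #{discount_percent}% off"
    end
  end
end

puts HumanResources::Promotion.new("Alice", "Team Lead").describe # => Alice is promoted to Team Lead
puts Sales::Promotion.new("Keyboard", 15).describe                # => Keyboard is 15% off
```

Code inside each module can simply say Promotion, while code that crosses the boundary has to spell out which one it means.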
The power of terms
Now, you might think that it is always a bad idea to introduce codebase-specific terms, but I can tell you another story.
At Omise, we have a CordeliaService.
What is it???
That’s another story. What it actually does, does not matter in this article.
When you use an unfamiliar term like this, it naturally causes contributors to think twice before making a change. They need to be very careful and look into all the details before taking action.
That can be either good or bad.
A term like the one above can be used to emphasize fragility or to force contributors to pay extra attention in certain areas.
What if your codebase consists of many unusual terms? Will it have the same effect? Will it cause people to think twice before writing every single line of code?
Well, you can try. All I can say is good luck with that approach. When everything is important, then nothing is.
Terms can be a tool to emphasize and de-emphasize.
When you use business-related terms, you force yourself and other contributors to understand the business requirements deeply.
When you use industry-related terms, you force yourself and other contributors to know a specific framework or engineering practice.
Each term you use in a codebase gives you the power to attract or deter a specific type of contributor. Each term you use in a codebase incentivizes a certain kind of learning, expertise, and behavior.
And that, my friend, is the power of the term.
Takeaways
In this article, I want to highlight two points.
First, code duplication can be either cheaper or more expensive than an abstraction. One of the main factors here is not just how you abstract it, but how you name it. Bad naming makes even the right abstraction, according to the best practice, harder to maintain. We often talk a lot about how to abstract things in the right way. But naming can be equally or more important than how you abstract. This factor is often overlooked when we talk about writing maintainable and well-designed code.
Second, I demonstrated the power of terms. You can attract a particular type of developer, a certain kind of expertise, or a certain kind of learning just by controlling which terms are acceptable in your codebase. My general rule is that business-related terms always come first. I want to work with people who understand the business impact of every line of code they work on, rather than with developers who turn business goals into an engineering playground and pile on engineering best practices just to prove their ability as engineers.
Thanks for reading!