Posted on March 7, 2016 by Gabe Parmer
Composite is a relatively large code base, and is our play-pen for research. It has a number of large external libraries (including a libc
), and has a reasonable amount of kernel code, as well as distinct user-level component implementations. It isn’t huge. For example, it isn’t on the scale of the Linux kernel, or even close to a modern browser. However, it is much larger than most hobby projects (> 100K Lines of Code). Importantly, since it is our vector for research, there is a fairly large amount of code churn in the foundations of the system. Our goal is not maximal code stability, or to asymptotically approach the “best” design. Instead, it is to investigate the function and benefit of changes within the system, especially with respect to existing systems (e.g. Linux).
Due to its research nature, and the cruftiness of some of the (often quite old) code, I think that there is a natural feeling that a developer can create throw-away code. Throw-away code is that which is not well designed, or not well crafted. We have a fairly elaborate Composite Style Guide (CSG) that defines what “well crafted” means in this context. Most projects of this size have a comparable document, though they are often less totalitarian. The goal of our CSG is to maximize for the readability of the code. It focuses on style and general code structure, rather than design, as design is best addressed in discussions, and should not be codified. It is common for me to receive a pull request (PR) that includes code that does not abide by the CSG. Often the justification is that the code is throw-away code, or “isn’t important”.
This post is a quick commentary on the importance of code craftsmanship which, most generally, is an emphasis on the quality of the code we produce. It is an emphasis on adhering to rules of style, and “good taste”. It is always thinking about “is it easier to understand the logic this way, or that”? It is a realization that just getting the code to functionally work is not sufficient. It is an emphasis on the proper use of abstraction to avoid redundancy, and clarify intent. So why do we care about code craftsmanship in a research project that a. has a lot of churn in core subsystems, b. has a lot of experimental code, and c. has built in modularization to hide ugly code? In the end, the alternative is just not tenable.
The idea is generally that some code that we generate does not have to be good because we’ll use it now, and never see it again. System administrators often love perl because it enables this style to shine. And it generates completely intractable code. That is often both OK, and beneficial when you really are making “one-offs” for system administration. It is less OK when working on system-level code.
I don’t think that the notion of throw-away code applies in systems. Either, you’re making a system-level component that can be used (thus read) by multiple clients, or you’re generating a client, which will likely be treated as example code for someone trying to write their own client. Either way, your code will be read, and likely copied. Microbenchmarks and unit tests feel like they are throw-away. Write them once, run them when you build, but don’t worry about the quality. Unfortunately, they are often used as examples of how to use APIs, so in some sense, they are some of the most important code to get right.
The ugly, intermediate versions of your code should never see the light of day. They do exist, but once your code is immortalized in a commit, it should be clean.
It is often natural to think that the code you’re writing right now is an intermediate state, and later you’ll clean it up. There will certainly be intermediate stages where the code gets messy. However, when you want to commit code, share code, or generally start to build off of code, it should be cleaned up.
Implicit in the idea that some code is “good” and some can be “bad” is the notion that when coding, we are writing one of the two types of code. Good code might be read by someone else, and possibly used by them in future development. Bad code will not. When we sit down to code, we choose which style we’re writing in.
I’ve always found the mental separation of these two types of code as post-hoc. When sitting down to code, I don’t think “this code is going to be good”; I sit down and try and solve a problem.
Code exhibits good craftsmanship because we take the extra time after finding a working solution to make it clean.
My suggestion is that we should always try and make our code exhibit high degrees of craftsmanship. It shouldn’t be a question, or a thought. It should be a given. Taking time on the craftsmanship of your code is not a serious commitment. However, it does take a slight change in your focus when writing code.
All code that enters into the repo has to pass code review. Code reviews are important as they make sure that everyone who is committing to the repo is on the same page, use good design, has the same goals, and adheres to standards. How does bad code relate to the code review process?
When someone submits a pull request with messy code, they are prioritizing their own time over the reviewer’s time. This is almost never the correct trade-off to make.
Simple as that. If you’re asking someone to commit their time to reviewing and improving your code, you should be making their job as easy as possible.
Were all code to be committed without any focus on code craftsmanship, the system would surely devolve into an unproductive and unapproachable mass. Opening the door to some code being of lower quality opens up the subjective assessment of if my code needs to be high quality. Laziness and expediency corrupt the integrity of this decision. Thus, if long-term stability of the overall code-base is a goal, the law must be that we each pervasively foster and encourage our own code craftsmanship.