Posted on June 8, 2017 by Gabe Parmer
Quite a bit of information exists on the internet about how you should perform Pull-Requests (PRs), and how to conduct a code review. This post discusses one vision of how these software engineering practices fit into a research group that is building a significant body of code. I’m not an expert in any of these areas, but I have seen what works, and what doesn’t, in support of sustained research.
It is rarely obvious why software engineering practices are necessary or beneficial to developers who haven’t yet worked in large teams. These practices provide “processes” that restrict the methods and means of software development in a team. Let me say clearly: they are necessary, and they are beneficial. When someone is maturing as a software developer, they often realize that simply writing code isn’t the hard part. The hard part is writing code that is understandable, debuggable, and maintainable in the longer term. The goal is often for this code to be part of our shared infrastructure. Importantly, the code you write will not be used only by yourself. Thus, you’re writing code that needs to be understandable, debuggable, and maintainable by others. That is a pretty high bar.
A big part of software engineering can be seen as a set of conventions and processes overlaid onto our normal process of coding. They are designed to produce code, documentation, and development histories that are high quality, and that can be productively used by every member of the development team. At least that’s the goal.
In academia, we have a set of constraints that work against code quality (races for paper deadlines, and lots of perceived throw-away evaluation code), documentation (it doesn’t feed into someone’s PhD), and development histories (no mandated version-control usage patterns, and little training in how to use version control). So I’m going to discuss a process that we apply to our group’s development, and how it attempts to enable a long-term, sustainable infrastructure. We don’t do this perfectly, but this post is meant to give the rationale and expectations behind this process.
Many development teams in industry have extensive processes for development aside from the actual code. These include a fairly strict regime of commit, PR, and code-review guidelines; continuous integration and unit testing; automated bug tracking and lifetime management; scrum and structured meetings to set the short- and long-term goals of the group and of individuals; etc. Though some of these have been adopted by academia (for example, scrum for academia), we are not necessarily creating a production-stable API as research requires agility, and we are a small team whose members often work on separate projects, so we don’t necessarily need as much cross-team coordination. However, the notions of curating our code through structured PRs, code reviews, and unit tests/continuous integration are necessary for the code-base to have any longevity. This document discusses the first two of these.
For a software infrastructure to have any longevity beyond a single student, a group needs to be involved in the shared development of the system. When you have implemented a feature that improves the infrastructure, it is important to get some buy-in and quality control from the group before the code is pushed to mainline. The main goal of producing infrastructural code is to generate maintainable and readable code. Thus, it is naturally quite useful to get others from the group involved in the process of generating your code. This is a way to get a sanity check that the code you wrote to be readable actually is readable.
The main goals of our code reviews are to check that code written to be read by others actually is readable, to maintain the conventions and craftsmanship of the shared code-base, and to spread understanding of the code beyond its author.
Given this, how does one actually conduct a code review? I typically think of it as requiring two passes.
Code review first pass. When doing a code review, what should you focus on? What are you looking for, and what should you spend your effort on when doing a review? The first pass should focus on the surface of the code: adherence to the style and conventions of the CSG, error checking, test coverage, and the assumptions the code makes. Are assertions added to check those assumptions? Try to understand these, and ask the author questions about them.
Code review second pass. It is often surprising for people to learn that the core of a review actually revolves around the first pass. Any additional effort often focuses on the structure of the code. Should files be broken up? Should functions be broken up? Should an abstraction be refactored? This list is obviously much smaller than the first, but it requires that you go through all of the code before you can build a reasonable mental model of both the code’s structure and what other potential structures might be. I’m going to avoid going into what might constitute a better structure here, and leave that for the CSG and for a future post.
Put simply, reviews provide neither testing, nor thorough evaluation of logic. It is expected that every time someone does a PR, it is for well-tested code that follows the guidelines for code craftsmanship and the conventions in the CSG. Solely reading code will not yield a deep enough understanding of it to ferret out complex logic problems, or to ensure thorough testing. Instead, a code review is a process that focuses on issues (style, error checking, test coverage, and complexity) that are a proxy for refined, well-tested code. In this way, reviews are complementary to proper testing.
Before a code review is performed, a PR must be generated. It is very important to understand the responsibilities of a researcher submitting a PR, and the goals of the PR. To understand many of the goals and conventions around PRs, you have to understand how to use version control. See the Appendix for tips on using git.
PR intentions and responsibilities. Using git “correctly” is one thing, but submitting code to be included in the main repository is another. You should submit a PR on your work when you reach a point of tested completion. You should plan any high-level feature that you’re adding as a set of successive, more manageably sized PRs. It does take some thought to figure out how to break your task into smaller constituent tasks, but this is something you really have to be thinking about anyway. It isn’t feasible to do a reasonable code review on a large PR because it requires too much accumulated mental state. This is similar to why we can’t just write thousands of lines of code at once; we must break the work into smaller, sequential problems. So if we take code reviews into account, it becomes clear that we aren’t just writing code; we’re writing code in a digestible manner.
Studies have found that defects are caught most successfully when code reviews are performed on 400 lines of code or fewer. It is not a good idea to send a huge PR and expect a good result. The likely result will be “please break this up into separate PRs that can be reviewed separately”.
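As a rough illustration of keeping PRs reviewable (the branch and remote names here are hypothetical), you can estimate the size of a prospective PR before submitting it:

```sh
git fetch origin
# Summarize what the PR would change relative to mainline:
git diff --stat origin/master...my-feature       # per-file change summary
git diff --shortstat origin/master...my-feature  # one-line total; aim well under ~400 changed lines
```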
It is very important to understand that when a researcher submits a PR, they are asking other people to spend their time on the author’s behalf (i.e. to help improve the code). Given this, any perceived laziness on the part of the author is often not received well. Not spending the time to go through their own diffs before submission to ensure that obvious errors are purged is disrespectful of the time of the people doing the review. It is inherently saying that the reviewer’s time is valued less than the author’s. This is, quite obviously, not OK. Don’t do this.
Given this perspective on code reviews and PRs, our research group has a number of policies:
- @-mention your fellow researchers. A good general rule is to @-mention at least two colleagues.
- If you are @-mentioned, and no one else has done a review, strongly consider doing it. If someone is working in your area (an area where you’re familiar with the code, or want to learn about it), then ask to be @-mentioned.

What tools should you always consider when making code modifications and additions, and when integrating with others’ changes? With git, the obvious commands that you must integrate into your development cycle include:
- git diff, git diff --stat, and git status, which all let you understand, at different granularities, what changes have been made. Before you commit code, you should likely execute all of these commands. You do this for two reasons. First, you want to make sure that only the code modifications that you want in the commit are in it. Each commit should be relatively focused, so it is important to catch this. Second, it is a good way to remind yourself what the commit contains so that you can better summarize it in the commit message. It is also useful to know that you can execute diff on ranges of commits to see what changed (see git log to see the list of commits). A sketch of this pre-commit routine follows this list.
- git commit -a, obviously. Do not, do not, use the -m flag. You should write a “full form” message, not a single line. The focus of your message should be on 1. summarizing the commit in a single line (which is displayed along with the commit in GitHub), and 2. an explanation of the details of the commit. The latter should explain the intention, and often includes an itemized list of the main aspects of the commit. #s should be used to reference any issues it addresses. A hypothetical example message follows this list.
- git branch <b> and git co <b> to use and manage your branches. Get used to using branches. Branches are useful when you’re trying to debug a problem, or implement a new feature. To see how they’re useful, it is important to realize that it is not uncommon to get distracted, and have to redirect your attention to another feature or bug. Without branches (and the ability to roll back to before your work on the feature), you end up with a commit history of half-finished features intertwined with bug fixes and other features. This makes your PRs schizophrenic and difficult to review. So by default, you should always be developing on a non-mainline branch. When you go fix a bug, or work on a new feature, ask yourself if it deserves its own branch (see the branching sketch after this list).
- git stash is an acknowledgment that we mess up with our branches. If you find that you’re somewhat accidentally developing a new feature, or fixing a new bug, but have a bunch of unstaged changes on the current branch, you can stash them, create a new branch, and stash pop the changes into the new branch (sketched after this list). I’m sure there are other ways to do this, but this example at the very least demonstrates how stash can be used to set aside a current set of changes for later.
- git rebase enables you to update a branch to an updated master, to “squash” multiple commits, and to clean up your commit history. When the master progresses beyond the point where your branch split from it, you generally want to fast-forward merge your changes onto the new master. git rebase master is your friend here. It is generally superior to git merge as it maintains a cleaner (linear) history. git rebase -i allows you to rewrite history by combining different commits. This is very useful as it enables you to have a set of commits that tell a story, and don’t have a number of “in progress” commits (see the rebase sketch after this list). My suggestion is that you label commits that you’ll likely want to get rid of in the future with TODO as a header on the title line. More information about this can be found here and here.
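To make the list above concrete, here is a rough sketch of the pre-commit inspection routine; the commit range and counts are only illustrative.

```sh
git status             # which files are modified, staged, or untracked?
git diff --stat        # per-file summary of the pending changes
git diff               # the full diff: is everything here really meant for this commit?
git log --oneline -5   # recent commits, handy for picking a range
git diff HEAD~2..HEAD  # what changed across the last two commits?
```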
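Next, a hypothetical full-form commit message of the suggested shape; the bullet contents and issue number NN are placeholders, not a prescribed format.

```sh
# git commit -a opens your editor; the message might be shaped like:
#
#   Summarize the change in a single line
#
#   - First main aspect of the change, and the intention behind it
#   - Second main aspect, e.g. error checks and assertions that were added
#
#   Addresses #NN.
git commit -a
```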
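A minimal branching sketch; the branch name is made up, and git co assumes you have aliased co to checkout (otherwise use git checkout).

```sh
git checkout -b fix-timer-race  # create and switch to a branch for this one task
# ... edit, test, commit on this branch ...
git branch                      # list branches; '*' marks the one you are on
git checkout master             # switch back to mainline when the work is PR-ready
```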
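A sketch of using stash to move half-finished work onto its own branch (branch name again hypothetical):

```sh
# You notice you've started a new feature on top of unrelated, unstaged changes:
git stash                    # set the uncommitted changes aside
git checkout -b new-feature  # create the branch this work should live on
git stash pop                # re-apply the stashed changes there and continue
```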
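And a rebase sketch for keeping a feature branch current and squashing “in progress” commits before a PR (assuming your local master tracks the mainline, and a hypothetical branch name):

```sh
git checkout my-feature
git rebase master     # replay your commits on top of the updated master (linear history)
git rebase -i master  # interactively squash or drop the commits labeled TODO
```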