April 21, 2021•blog
Calling some code “research-grade” is a euphemism for “it’s a mess”. It happens because we generally expect to write this code once, run our experiments, and then throw it away. But that’s not how research works: we’re often working out the exact hypothesis to be tested while simultaneously developing the techniques needed to verify it.
In pursuing a single hypothesis, we often have to develop and discard a series of sub-hypotheses and techniques before we find the full set that works. In that sense, we’re writing code to reach a target that constantly changes, sometimes even before we’ve reached it. Parts of the code become outdated, and the functions you are writing change what they need to do, sometimes before the previous rewrite is even complete.
It’s crucial that we can trust our code as we run experiments. A false-negative experiment at the wrong time will cost you time, and could cost you a paper, an entire discovery, or more. Testing is the only good way we have to justify believing what our code tells us.
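To make that concrete, here is a minimal sketch of one kind of test that tends to be useful in research code: generate data from parameters you chose yourself, then check that the code recovers them. The function names (`fit_linear`, `test_recovers_known_parameters`), the toy linear model, and the tolerances are all made up for illustration; they are not taken from any particular project.

```python
import random


def fit_linear(xs, ys, lr=0.1, steps=500):
    """Fit y = w * x + b by plain gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def test_recovers_known_parameters():
    # Hypothetical ground truth: we pick the answer, so we know what to expect.
    rng = random.Random(0)  # fixed seed keeps the test deterministic
    true_w, true_b = 3.0, -1.0
    xs = [rng.uniform(-1, 1) for _ in range(50)]
    ys = [true_w * x + true_b for x in xs]  # noise-free synthetic data

    w, b = fit_linear(xs, ys)

    # If training is wired up correctly, the fit should land near the truth.
    assert abs(w - true_w) < 1e-3
    assert abs(b - true_b) < 1e-3


if __name__ == "__main__":
    test_recovers_known_parameters()
    print("ok")
```

If a test like this fails, the bug is in the code and not in the hypothesis, which is exactly the distinction a failed experiment on real data cannot give you.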
While stuck with some incredibly complicated research code, I wrote a short guide to testing in research and to getting neural networks to actually learn something. You can read it as a PDF. If you’d like to make changes, there’s also an editable Google Doc version. If you use it, I’d appreciate a link back!