Credit: Creative Images Lab/Getty One evening, my partner Boyan Li sat at the kitchen table marking student submissions for a coding course he was teaching as part of his PhD at Harvard Medical School in Boston, Massachusetts. The assignment required students to implement a computational-biology algorithm on a given data set. Each submission demanded more than a quick check. He ran the code, examined the output and traced the logic line by line. Some submissions were clearly correct; others were clearly wrong. But many fell into a grey zone: they were partly right, but uneven in their execution or reasoning. These were the hardest to assess, and the most time-consuming. As a higher-education researcher, I watched this process with professional interest. What seemed to be a purely technical task — running code and checking outputs — was revealed to be deeply interpretative. Assessing coding assignments involves deciding what counts as understanding, what counts as error and how much variation is acceptable.…