For a long time, programming has been a “means to an end” kind of thing for me. It was the same with math: I was never really interested in the full derivation of the proofs of the theorems we were doing in class. I just did the math and I was damn good at it, if I might say so, but I don’t think I would be happy being a pure mathematician. Similarly, I enjoyed my programming classes in college very much. It was sort of a game for me to see who could write the best (efficient, working) programs in the fewest lines. I always enjoyed that challenge.
As such, when it finally came to research, I’m happy to say that I did have the tools to tackle the daily programming tasks presented to me. I had the algorithmic techniques in me to take on large databases, for example. It was only a few years later that I started to realize that my programs were pretty bad. Oh, they worked perfectly and fast, but nobody but me could actually use them (and maybe not even me 5 years from now). They were horribly documented, unreadable, and required the input data to be in an extremely narrow format that I had specified. I realized this when I started to publish and people started e-mailing me to ask how I had calculated the Gini coefficient, BALnicities, or similar quantities. It dawned on me that I would be a lousy programmer at a software company, or on anything that involved large collaborations.
This was very present in my mind when I came to Germany to work on the Data Reduction Pipeline for MUSE. I was (and sometimes still am) scared that people were going to dismiss my work because it was documented so lousily. I’m actually quite happy that I have been tasked with the documentation of the pipeline, because that allows me to approach it very rigorously, keeping in mind that everybody in our collaboration should be able to use the software. In German, we have the acronym DAU, “Dümmster anzunehmender User” (dumbest possible user), as a wordplay on GAU, “Größter anzunehmender Unfall” (greatest possible accident), from nuclear power stations. It relates to program “usability” in general.
I was then quite excited when a paper on “Best Practices for Scientific Computing” by Aruliah et al. was published recently. It describes exactly the problems that we astronomers are facing more and more. In a world where author lists are growing, survey data often trump single observations, and observing instruments are becoming ever larger and more complex, we are increasingly faced with programming, and not easy-peasy programming at that.
A lot of my colleagues think that every astronomer nowadays should take at least a one-year sabbatical to *really* get into the world of computer programming: not just the quick shell script to modify your ASCII table, but a really rigorous, scientific approach to it. I don’t quite share that extreme a view, but if you check out the “whatmyfriendsthinkido” meme for astronomy that was going around at the beginning of the year (see above), it is quite an accurate depiction of my laptop screen most of the time. There are variations of this meme, but the main point remains the same.
Anyway, back to the article. I had sort of an epiphany last week. I was reducing actual MUSE lab data: lamp tests, pinhole mask tests, geometry issues and the like. If you look at the image below, you’ll see that we still have a long way to go… uff. The important thing is that there were on the order of 50 files and they were not ordered particularly well. You had to look in the headers to find the type of observation and to see which BIAS you had to use. Very chaotic. Part of me was screaming to just go through them by hand and run the pipeline on them one by one, since I would have to use different biases, arcs, flats and the like, and on top of that they were named with completely different identifiers.
But I remembered that article and its advice that I should not keep on doing repetitive tasks. So I soldiered on, and 3 days and many Google (Stack Overflow and similar) searches later I had a working program on which I could just hit “Enter” and let it run over the weekend. Now, running it “by hand” might have taken the same 3 days it took me to write that automation, but I had so much fun doing it. My office mate even learned something new about bash. I went home that weekend so satisfied. And I was proud of my little program, ready to tackle the next round of testing!
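For the curious, the core of such an automation can be sketched in a few lines. This is only a minimal Python illustration of the idea (sort the files by what their headers say, then pick a BIAS for each one), not the actual program. The keyword names `TYPE` and `TIME`, the file names, and the rule “use the BIAS taken closest in time” are all assumptions made up for this example; headers are represented as plain dictionaries so the sketch runs without any FITS library.

```python
from collections import defaultdict


def group_by_type(headers, key="TYPE"):
    """Group file names by the observation type found in each header.

    `headers` maps a file name to a dict of header keywords. The keyword
    name "TYPE" is a made-up placeholder, not an actual MUSE keyword.
    """
    groups = defaultdict(list)
    for name, header in sorted(headers.items()):
        # Files without a type keyword land in "UNKNOWN" instead of vanishing.
        groups[header.get(key, "UNKNOWN")].append(name)
    return dict(groups)


def nearest_bias(headers, name, key="TYPE", time_key="TIME"):
    """Return the BIAS file taken closest in time to `name`, or None.

    "Closest in time" is an assumed selection rule for illustration only.
    """
    biases = [f for f, h in headers.items() if h.get(key) == "BIAS"]
    if not biases:
        return None
    t0 = headers[name][time_key]
    # Sorting first makes the choice deterministic if two biases tie.
    return min(sorted(biases), key=lambda f: abs(headers[f][time_key] - t0))


# Invented example headers; names, keywords and times are all hypothetical.
headers = {
    "a17.fits": {"TYPE": "BIAS", "TIME": 0.0},
    "x03.fits": {"TYPE": "FLAT", "TIME": 1.5},
    "q42.fits": {"TYPE": "BIAS", "TIME": 4.0},
    "m08.fits": {"TYPE": "ARC", "TIME": 3.5},
}

groups = group_by_type(headers)
for obs_type, files in groups.items():
    print(obs_type, files)

for name in groups.get("ARC", []) + groups.get("FLAT", []):
    print(name, "-> bias:", nearest_bias(headers, name))
```

With real data, the dictionaries would be filled from the file headers, and the final loop would call the pipeline instead of printing.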
And then I remembered the legend that “DVD Jon” once said (I think it was him, but I’m not entirely sure) that he had way more fun developing DeCSS than he later had watching the movies that program let him play. In a way, I learned that way of thinking from my father, too. He builds molds for the plastics industry, mostly complicated thermoforming ones. That challenging work is way more fun than the final product that just gets stamped out by the millions.
I don’t know, it was such a well-written and easy-to-read article that I just wanted to endorse it again and urge you to read it!