Why Science Students Should Learn Some Computer Science

Rahul Simha

While science departments already require their undergraduate students to take courses in other science disciplines, hardly any of those are computer science courses. Let's first consider some of the more obvious, practical reasons why you should take computer science courses. Some exposure to computer science (and I don't mean an intellectually insipid "Word, Excel and the Internet" literacy course) helps you understand how software and hardware work, and leaves you comfortable with the kinds of software tools increasingly essential to research in any science discipline. You will also be able to make better use of those tools, adapt to new interfaces, understand the tools' limitations, and even customize open-source software to your needs.

I want to focus on the deeper, intellectual reasons for learning computer science. The first of these is to arm yourself with a major new problem-solving approach: computational problem-solving. Let me illustrate with an example. A biology colleague once approached me with the following problem:

"I've devised a probe to test for the existence of certain genes. Each time the probe is applied, the output shows the occurence of one gene from among a set of 20. Sometimes it's gene 1, sometimes it's gene 5 ... etc. The chances of getting gene 1 are quite high, about a 20% chance. The chances of seeing gene 2 are about 5% ... I know these percentages approximately for each of the 20 genes. How many probes do I have to apply in an experiment to be sure that I've seen every one of the 20 genes at least twice?"

This problem is known to mathematicians as the coupon-collector problem, one of a large class of discrete statistical problems known as urn models [1]. A simplified version can be solved analytically, but the solution is not pretty: it involves a lot of gory math. What's important is not that the problem is hard to solve mathematically, but that it's hard to recognize, even for a mathematician. And even once it has been identified, the problem is not easy to find in books; one has to know where to look.
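To give a sense of what an analytic answer looks like, here is the simplest textbook version of the coupon-collector problem. It assumes, unlike the biologist's problem, that all n genes are equally likely and that each needs to be seen only once; whether this is the simplification meant above is an assumption. With T the number of probes needed, and k-1 genes already seen, the next new gene appears with probability (n-k+1)/n, so the expected waits add up to:

```latex
E[T] \;=\; \sum_{k=1}^{n} \frac{n}{n-k+1} \;=\; n \sum_{k=1}^{n} \frac{1}{k} \;=\; n H_n,
\qquad \text{so for } n = 20:\quad E[T] = 20\, H_{20} \approx 72 .
```

Unequal probabilities and the "at least twice" requirement are what make the real problem so much messier.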

What does this have to do with computer science? A couple of CS students, taking their third CS course, were able to solve this problem in about 45 minutes with a program written in the Java programming language. The example nicely illustrates the power of computational problem-solving: many problems that are hard mathematically are actually quite easy to solve numerically on a computer. In many cases, the programs tend to be small and easy to modify, and they do not require a degree in CS to create. There is a large class of problems amenable to computational problem-solving in this manner, including those solved by simulation, Monte Carlo estimation, numerical integration and optimization. A few years ago, one of my students simulated the spread of an epidemic and studied the effect of vaccinations on the rate of spread. This is another example of a problem that can be solved analytically with heavy-duty math, provided simplifying assumptions are made, but that was solved computationally, without those assumptions, by a student who'd taken just three CS courses.
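The students' Java program is not reproduced here. As a rough illustration of how small such a simulation can be, here is a minimal Python sketch of a Monte Carlo approach to the gene-probe problem; it is not the students' code. The text gives only two of the 20 probabilities (about 20% for gene 1 and about 5% for gene 2), so the other 18 values in GENE_PROBS are hypothetical, chosen only so that the list sums to 1. The function names are likewise made up for illustration. Because no finite number of probes makes success strictly certain, the sketch reports the average number of probes and the number that sufficed in 95% of simulated experiments.

```python
import random

# Gene 1 (~20%) and gene 2 (~5%) come from the text; the remaining
# 18 probabilities are HYPOTHETICAL, chosen so the list sums to 1.
GENE_PROBS = [0.20, 0.05] + [0.75 / 18] * 18


def probes_needed(probs, copies=2, rng=random):
    """Simulate one experiment: apply probes until every gene
    has been seen at least `copies` times; return the probe count."""
    counts = [0] * len(probs)
    remaining = len(probs)  # genes not yet seen `copies` times
    genes = range(len(probs))
    probes = 0
    while remaining > 0:
        # Each probe reports one gene, drawn with the given probabilities.
        gene = rng.choices(genes, weights=probs)[0]
        probes += 1
        counts[gene] += 1
        if counts[gene] == copies:
            remaining -= 1
    return probes


def estimate(probs, copies=2, trials=2000, seed=0):
    """Repeat the experiment `trials` times; return the mean probe count
    and the count that was enough in about 95% of the trials."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    results = sorted(probes_needed(probs, copies, rng) for _ in range(trials))
    mean = sum(results) / trials
    p95 = results[min(trials - 1, int(0.95 * trials))]
    return mean, p95


if __name__ == "__main__":
    mean, p95 = estimate(GENE_PROBS)
    print(f"average probes needed: {mean:.1f}")
    print(f"probes sufficient in ~95% of experiments: {p95}")
```

Changing the question (seeing each gene three times, or using a different set of probabilities) means editing one argument or one list, which is the sense in which such programs are easy to modify.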

Beyond computational problem-solving, a second intellectual reason to learn computer science is to get some exposure to algorithmic thinking [2]. While taking a few CS courses may open the doors to computational problem-solving, a course in algorithms teaches you how computer scientists think about problems: how they formulate problems, how they've solved the classic problems in their field, and how they've abstracted principles from computational problem-solving into general problem-solving paradigms.

So how much CS is enough? For the sake of discussion, let us identify three levels of CS skill: (1) Freshman: the now-standard CS1 and CS2 intro courses; (2) Minor: CS1, CS2 and 3-4 additional courses, including one in Algorithms and one in Scientific Computing; and (3) Dual-major: a combination of a science major with one in CS. It probably makes sense for students in Biology, Chemistry, Geology and Experimental Psychology to take the two Freshman-level courses and then decide whether they wish to pursue more quantitative academic career tracks. Students in the more quantitative disciplines, such as Physics, Mathematics and Statistics, ought to seriously consider taking courses up to the level of a minor. Such a student should have no trouble solving the above coupon-collector problem computationally. You should also consider pursuing, if available, a dual major with computer science, to become the kind of interdisciplinary scientist described in the recent high-profile reports by the NAS and NIH [3]. Keep in mind that programming and computational problem-solving are slowly acquired mental skills, comparable to mastering a musical instrument or chess. So expect some "pain" initially, and don't expect to become an expert with just a couple of courses. But, as with any hard-won skill, there's a long-term payoff: in this case, it'll help you become more competitive in your own discipline.

References

[1] L. Holst. Extreme Value Distributions for Random Coupon Collector and Birthday Problems. Extremes, 4, 129-145, 2001.
[2] F. Olsen. Computer Scientist Says All Students Should Learn to Think 'Algorithmically'. Chronicle of Higher Education, March 22, 2000.
[3] See these two key national reports. (1) BIO2010: Transforming Undergraduate Education for Future Research Biologists. http://books.nap.edu/catalog/10497.html (2) NIH Roadmap: http://nihroadmap.nih.gov/. See also: W. Bialek and D. Botstein. Introductory Science and Mathematics Education for 21st-Century Biologists. Science, 303(5659), pp. 788-790, 2004.