
What makes a complete computer scientist?
Advice for CS students
What skills and experiences make a strong foundation in CS? How do you
know what's needed, what courses to choose? I'm going
to address what can be acquired in the educational system, as opposed
to the experience one gains developing large software projects in
industry (which we can't teach in academia).
I believe you aren't a strong graduating computer scientist
(BS/MS/PhD) unless:
 You are comfortable programming in a high-level language.
You don't need to be familiar with particular APIs, but you ought
to be confident writing tens of thousands of lines of robust code.
 You have a strong understanding of how hardware works.
That is,
you've written assembly code at least once, you understand how
a CPU works, the basic instruction fetch-execute cycle, what
code might look like in binary. You also have a sense of
how it works at the low level (gates, combinational and sequential
logic, a basic state machine, microprogrammed control)
and at the high level (virtual memory, caching, TLBs, disks,
DMA, bus protocols).
 You understand how compilers and language runtimes work.
You don't need to have
written a compiler, but it certainly helps. You should have written
a parser for something, preferably using a parser generator,
understanding both lexical analysis (and its relationship to
regular expressions), and parsing (and its relationship to grammars).
And you should have seen how code is generated, and optimized.
And you need to understand the runtime support, such as garbage
collection, needed by modern languages.
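To make the link between lexical analysis and regular expressions concrete, here is a minimal sketch of a hand-rolled lexer for arithmetic expressions. The class name `ToyLexer`, the token classes, and the output format are all assumptions chosen for illustration; a real project would more likely use a lexer or parser generator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy lexer; names and token classes are illustrative only.
class ToyLexer {
    // Each token class is described by a regular expression.
    // The alternation tries NUM, then ID, then OP, after skipping whitespace.
    private static final Pattern TOKEN = Pattern.compile(
        "\\s*(?:(?<NUM>\\d+)|(?<ID>[A-Za-z_]\\w*)|(?<OP>[-+*/()]))");

    static List<String> tokenize(String input) {
        String s = input.trim();
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        int pos = 0;
        while (pos < s.length()) {
            // Anchor the next match at the current position.
            m.region(pos, s.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at " + pos);
            }
            if (m.group("NUM") != null) {
                tokens.add("NUM(" + m.group("NUM") + ")");
            } else if (m.group("ID") != null) {
                tokens.add("ID(" + m.group("ID") + ")");
            } else {
                tokens.add("OP(" + m.group("OP") + ")");
            }
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x1 + 42*(y)"));
    }
}
```

The lexer only groups characters into tokens. Deciding whether the token sequence is a well-formed expression is the parser's job, and that is where grammars come in.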
 You have a strong understanding of memory as it's used
in higher-level languages. For example, you should understand
exactly what's global, on the stack, and on the heap at any
time during the execution of:
int foo (SomeObj s, int n)
{
    if (n == 0) return 1;
    int x = s.getX ();
    return x * foo (s, n - 1);
}
public static void main (String[] argv)
{
    SomeObj obj = new SomeObj (10);
    print (foo (obj, 5));
}
It might seem odd to single out memory from a general
understanding of languages and architecture, but it is fundamental
to efficient programming and successful debugging, at least in
today's popular high-level languages.
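One way to check your own answer is to annotate a runnable version of the snippet. The sketch below assumes a `SomeObj` class that simply stores the integer passed to its constructor (the original does not define it), and adds a static counter, `calls`, purely to have something global to point at. It describes the conceptual Java memory model; an optimizing JIT may place things differently.

```java
// Hypothetical stand-in for SomeObj, which the original snippet leaves undefined.
class SomeObj {
    private final int x;            // instance field: stored inside the object, on the heap
    SomeObj(int x) { this.x = x; }
    int getX() { return x; }
}

class MemoryDemo {
    // Added for illustration: a static field has one copy per class, effectively global.
    static int calls = 0;

    static int foo(SomeObj s, int n) {
        calls++;
        // Every invocation gets its own stack frame holding s (a copy of the
        // reference, not of the object), n, and, once assigned, x.
        if (n == 0) return 1;
        int x = s.getX();
        return x * foo(s, n - 1);
    }

    public static void main(String[] argv) {
        // The reference obj lives in main's stack frame; the object it points to is on the heap.
        SomeObj obj = new SomeObj(10);
        // At the deepest point there are six foo frames (n = 5 down to 0) above main's frame,
        // all sharing the single heap object.
        System.out.println(foo(obj, 5));   // prints 100000
    }
}
```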
 You can think algorithmically.
You are familiar with order notation, and you are able to quickly
assess the running time of most snippets of code. You know the
core data structures well, know how they are built, and how
to use them in problems. You've been exposed to the classic
algorithms in a standard course, the theory of combinatorial
problems, and know how to write pseudocode with just the right
amount of detail so that someone else can code up your
algorithmic idea.
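As a small illustration of assessing running time at a glance, the sketch below solves the same problem (does an array contain a duplicate?) in two ways. The class and method names are made up for this example.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: two solutions to one problem with different running times.
class RunningTimeDemo {
    // Nested loops over all pairs: about n(n-1)/2 comparisons in the worst case, so O(n^2).
    static boolean hasDuplicateQuadratic(int[] a) {
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if (a[i] == a[j]) return true;
            }
        }
        return false;
    }

    // One pass with a hash set: O(n) expected time, at the cost of O(n) extra space.
    static boolean hasDuplicateLinear(int[] a) {
        Set<Integer> seen = new HashSet<>();
        for (int v : a) {
            if (!seen.add(v)) return true;   // add returns false if v was already present
        }
        return false;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5};
        System.out.println(hasDuplicateQuadratic(data));
        System.out.println(hasDuplicateLinear(data));
    }
}
```

Recognizing the nested-loop shape as quadratic, and knowing that a hash-based set is the data structure that removes the inner loop, is the kind of quick assessment meant here.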
 You have a reasonable understanding of our corner of math.
The CS corner of the mathematical world, beyond the
basic discrete math needed as background, is the stretch
of material ranging from finite automata to Turing machines.
That is, you should be able to:
(1) describe a finite (deterministic or nondeterministic) automaton,
its connection to regular expressions and regular grammars/languages,
and its use in hardware;
(2) do the same for pushdown automata and context-free grammars;
(3) describe the language hierarchy;
(4) explain how a Turing machine works, and what it has to do
with the halting problem, and what it means for a problem
to be undecidable.
No, you do not need to be a math whiz and be able to prove theorems
on your own, but you ought to be able to decipher notation,
and explain the key ideas.
I haven't listed the math topics needed in some CS specializations,
such as number theory for crypto, or
parts of the calc-diffeq sequence for robotics. These can be
acquired based on interest.
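For item (1) above, a deterministic finite automaton can be sketched in a few lines. The example below is an assumption chosen for illustration: a two-state machine over the alphabet {0, 1} that accepts strings containing an even number of 1s, the same language described by the regular expression `0*(10*10*)*`.

```java
// Illustrative DFA: accepts binary strings with an even number of 1s.
class EvenOnesDfa {
    // States: 0 = even number of 1s seen so far (start state, accepting), 1 = odd.
    // TRANSITION[state][symbol] is the next state on input symbol 0 or 1.
    private static final int[][] TRANSITION = {
        {0, 1},   // from "even": a 0 stays in even, a 1 moves to odd
        {1, 0},   // from "odd":  a 0 stays in odd,  a 1 moves to even
    };

    static boolean accepts(String input) {
        int state = 0;
        for (char c : input.toCharArray()) {
            if (c != '0' && c != '1') return false;   // symbol outside the alphabet: reject
            state = TRANSITION[state][c - '0'];
        }
        return state == 0;
    }

    public static void main(String[] args) {
        System.out.println(accepts("0110"));    // two 1s
        System.out.println(accepts("10101"));   // three 1s
    }
}
```

The machine needs only a fixed, finite amount of state no matter how long the input is, which is why the same design maps directly onto a hardware state machine.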
 You've had good courses in at least one (and ideally
two) of these three
CS-related areas of math (in order of importance):
 Probability. Here I don't mean the insipid
intro stats course taught across the university, but a real
probability course where you will work with random variables,
distributions, joint distributions, the CLT, and ideally,
also see Markov chains. Core probability shows up in so many
places, for example: machine learning, stats, algorithms, computer systems.
 Linear algebra. The not-ideal course is the standard
math department course that uses this material to introduce
students to proofs. This makes the linear algebra course
abstract; many students who get an A in such a course say
they still can't tell me what the course was about. Far better is a
course that shows how linear algebra is applied, either
through computational or engineering examples.
 Logic. Logic courses often come in one of three
flavors: (1) the classic logic course that traces a path
from propositional logic to undecidability;
(2) the very computer sciencey course that combines
logic, semantics and programming languages, often with
supporting theorem proving tools; (3) an algorithmic
course that includes fast SAT solvers, verification,
LTL/CTL, theorem proving search strategies, and the like.
It would be nice to have a single course combining elements
of all three.
 You know how computer systems work and have
done some systems programming. For example, you could
program an embedded device, or develop a distributed
application on top of sockets, or write a piece of an OS.
 You understand and are
comfortable programming in the dominant paradigm
of the day. Today's dominant paradigm is the web front-end,
database back-end. This requires understanding in detail the whole
sequence from a browser click, to retrieving tables, computing
joins, and returning results aggregated into a page. It also
requires being able to build such a simple web application using
a modern framework.
 You appreciate how difficult it is to build
robust complex software.
This means you've completed at least one big project on your
own or a team project in which you've had a major role, and in
which you've been through several stages of design (high-level,
UI, specs, details, testing) and gotten complex packages to work
together. But it also means you've developed a healthy respect
for the work ethic needed to develop good software: good design
principles, an attention to detail, pride in craftsmanship,
plenty of defensive code, strong use of powerful dev tools,
and the humility to know that your code does have bugs.
 You understand our profession and your professional
obligations.
This means you know not just your little corner of web development,
but what other areas of CS are like, are familiar with
other career paths through CS, and the interactions between
CS and other disciplines. This means being familiar with
our rich history, our
professional societies and standards, the ethics of our
discipline, understanding that software development is a team
effort, and that mere coding mastery is only one piece of
the large puzzle of software creation, and only one piece
of being a complete computer scientist.
