We tend not to emphasize stylistic details in CSE 341 because (a) we are focused on the semantics and (b) we provide a lot of sample code you can emulate. Still, a style guide can help answer some common questions and motivate some stylistic choices. It can complement the more semantic focus on keeping your code straightforward and concise and on making good use of the features we study in class.
Code is how we communicate our intentions to a computer, but it is also a written artifact we use to communicate with other humans. In this course, you are communicating with the course staff who are grading your work. In other situations, your code will be read by classmates, collaborators, coworkers, and (maybe most importantly) yourself in the future, after the problem is no longer fresh in your mind. The underlying rationale for any style guide is to help the person reading your code to understand it more easily.
The syntax and semantics of your code say what it does, but good style will also help you communicate why it does what it does. Choosing good variable and function names is crucial for this: a name that communicates something about the purpose or the type of a value tells the reader what to expect from it.
Another way of clarifying your intentions is to comment your code well. Note that comments should only communicate information that isn't explicitly available in your code -- don't just summarize what the code does, but add information about why a function is needed or why a particular design choice was made. Keep in mind that under-commented code won't be clear to your readers, but over-commenting can make code cluttered and hard to read.
Important note about comments: In your homework assignments, always start each problem with a comment containing the problem number. This helps us when we are grading your work!
Redundant code is harder to read and harder to debug. Remove any variables or function arguments that are never used. If you find yourself writing the same code twice, think about how you can refactor in order to reuse a single piece of code instead. This could mean streamlining a recursive function, or binding an expression or function to a variable for reuse. (You have probably noticed that the homework problems sometimes build on top of each other.)
Also, redundant calculation can incur performance costs. If an expression is evaluated and its value is needed elsewhere, bind the value to a variable rather than evaluating the expression twice.
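As a minimal sketch of this advice, compare two ways of finding the largest element of a non-empty list. The names `max_twice` and `max_once` are hypothetical and chosen only for illustration. The first version may evaluate the recursive call twice at every step, which can take exponential time; the second binds the result once and reuses it.

```ocaml
(* Redundant: the recursive call may be evaluated twice per element. *)
let rec max_twice (xs : int list) : int =
  match xs with
  | [] -> failwith "max_twice: empty list"
  | [x] -> x
  | x :: rest ->
    if x > max_twice rest then x else max_twice rest

(* Better: evaluate the recursive call once and bind its value. *)
let rec max_once (xs : int list) : int =
  match xs with
  | [] -> failwith "max_once: empty list"
  | [x] -> x
  | x :: rest ->
    let rest_max = max_once rest in
    if x > rest_max then x else rest_max
```

Both functions return the same results; only the amount of repeated work differs.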
A secondary rationale of style guides is to short-circuit pointless arguments; it doesn't particularly matter if you indent using two spaces or four, but you shouldn't do both. Choosing one of two equally good alternatives means you can free up the mental cycles you would spend making that choice every time the situation arises. It also can sometimes save you from mistakes: for example, if you arbitrarily switch between camelCase and snake_case for variable names, at some point you'll probably choose wrong and end up with a bug in your code.
Learning how to write good tests for your code is an important skill in software development. There are all sorts of approaches and fancy testing frameworks out there, but all a test needs to be is some evaluation of a piece of your code, the value that you expect that evaluation to have, and a check that those two things are actually the same. In pseudocode, that might look like:
if foo arg = val1 then
  print_endline "Test 1 passed"
else
  print_endline "Test 1 failed"
It's a good idea to start writing tests as you work on your solutions rather than after your code is complete, and to run your tests periodically while you work. Not only will you catch errors early, you will also be able to see if you've made any changes that break previously passing tests.
Deciding how much testing is enough is something of an art. Think about one or more "normal" inputs to your functions, then think of a few inputs that could cause problems (empty lists? negative integers?). You could also look at your code and make sure that each important part of it will be evaluated when running your tests. For example, when testing a recursive function, you should make sure your tests exercise the base case and each way in which the function calls itself.
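One possible way to organize such tests is sketched below. The function under test (`sum_list`) and the helper `check` are hypothetical names invented for this example, not part of any course-provided code. The tests cover the base case, the recursive case, and an input that might cause trouble (negative integers).

```ocaml
(* Hypothetical function under test: adds up the elements of a list. *)
let rec sum_list (xs : int list) : int =
  match xs with
  | [] -> 0
  | x :: rest -> x + sum_list rest

(* Hypothetical helper: compares the actual value with the expected one,
   prints a message, and returns whether the test passed. *)
let check (name : string) (actual : int) (expected : int) : bool =
  let passed = (actual = expected) in
  print_endline (name ^ (if passed then " passed" else " failed"));
  passed

let test1 = check "empty list (base case)" (sum_list []) 0
let test2 = check "single element" (sum_list [7]) 7
let test3 = check "several elements" (sum_list [1; 2; 3]) 6
let test4 = check "negative integers" (sum_list [-1; -2]) (-3)

(* True only if every test above passed. *)
let all_passed = test1 && test2 && test3 && test4
```

Keeping each test as its own binding makes it easy to rerun the whole file and see at a glance which test broke after a change.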
Keep your boolean expressions concise. Below we show some "bad vs. good" examples (historically this has been known as "Boolean Zen" within the Allen School):
⛔️ Bad:
if e then true else false
if e then false else true
if b then b else false
if not e then x else y
if x then true else y
if x then y else false
if x then false else y
if x then y else true
✅ Good:
e
not e
b
if e then y else x
x || y
x && y
not x && y
not x || y
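If you are unsure whether a rewrite like the ones above preserves meaning, you can check it over every combination of boolean inputs. The sketch below does this for the last four pairs and then applies the same idea to a small function; the names `rewrites_agree`, `in_range_verbose`, and `in_range` are hypothetical.

```ocaml
let bools = [true; false]

(* Checks each "bad" form against its "good" form for all values of x and y. *)
let rewrites_agree : bool =
  List.for_all (fun x ->
    List.for_all (fun y ->
      (if x then true else y) = (x || y)
      && (if x then y else false) = (x && y)
      && (if x then false else y) = (not x && y)
      && (if x then y else true) = (not x || y))
      bools)
    bools

(* Verbose: nested ifs that only ever produce true or false. *)
let in_range_verbose (lo : int) (hi : int) (x : int) : bool =
  if lo <= x then (if x <= hi then true else false) else false

(* Concise: the condition itself is the answer. *)
let in_range (lo : int) (hi : int) (x : int) : bool =
  lo <= x && x <= hi
```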
Using unnecessary parentheses makes your code harder to read; try to only use parentheses that are syntactically required or that help to clarify expressions for the reader. For consistency, stick to one style.
Prefer if expressions when there are only two possible branches that can be distinguished by a boolean condition. Prefer match expressions over nested if expressions.
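A small illustration of both cases, using hypothetical function names: `abs_val` has two branches separated by one boolean test, so an if expression reads well, while `describe` distinguishes three shapes of list, which a match expression states more directly than nested ifs.

```ocaml
(* Two branches, one boolean condition: an if expression is a good fit. *)
let abs_val (n : int) : int =
  if n < 0 then -n else n

(* Discouraged: nested ifs that inspect the structure of the list. *)
let describe_nested (xs : int list) : string =
  if xs = [] then "empty"
  else if List.tl xs = [] then "singleton"
  else "longer"

(* Preferred: one match expression with a case per shape. *)
let describe (xs : int list) : string =
  match xs with
  | [] -> "empty"
  | [_] -> "singleton"
  | _ :: _ :: _ -> "longer"
```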
The cases in a match expression should concisely communicate your intentions. This means:
Use wildcards for bindings you don't need.
Don't match on an unused expression.
Don't repeat code between cases.
Avoid unreachable match cases.
Don't use wildcards to match a single constructor. For example, do not use a wildcard for a case that could only be None. Using a wildcard in place of a constructor makes it harder for a reader to understand what values the case matches, and gives the compiler less information, making it less likely to catch your mistakes.
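As a brief sketch of the first and last points (the function `first_or` is hypothetical), the match below uses a wildcard only for the component it never reads and names the `None` constructor explicitly.

```ocaml
(* Returns the integer in an optional pair, or a default if there is none. *)
let first_or (default : int) (p : (int * string) option) : int =
  match p with
  | Some (n, _) -> n   (* wildcard: the string component is not needed *)
  | None -> default    (* named constructor rather than a catch-all _ *)
```

Because the second case names `None`, the compiler can warn about a missing case if the matched type later gains another constructor; a catch-all wildcard would silently absorb it.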
Nested match expressions should be avoided where possible (although they are sometimes necessary). Prefer nesting within the pattern instead.
For example, prefer
match (option1, option2) with
| Some x1, Some x2 -> ...
| Some x1, None -> ...
| None, _ -> ...
over
match option1 with
| Some x1 ->
    (match option2 with
     | Some x2 -> ...
     | None -> ...)
| None -> ...
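To make the preferred form concrete, here is one way the elided branches might be filled in. The function `add_options` and its behavior (add the two values when both are present, otherwise keep whichever one exists) are assumptions made up for this example.

```ocaml
(* Matches on both options at once instead of nesting one match in another. *)
let add_options (option1 : int option) (option2 : int option) : int option =
  match (option1, option2) with
  | Some x1, Some x2 -> Some (x1 + x2)
  | Some x1, None -> Some x1
  | None, _ -> option2
```

The pattern on the pair keeps all three cases at the same level, so there is no need for the parentheses that a nested match requires.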
Don't wrap functions unnecessarily. Also, when defining helper functions within a let, don't pass an argument to the helper for a value that the helper already has access to in the scope of the main function. Don't use an accumulator in a recursive helper function if the function is not tail-recursive.
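The three points above might look like this in practice. All function names here are hypothetical and exist only to illustrate the advice.

```ocaml
(* Unnecessary wrapping: the anonymous function only forwards its argument. *)
let lengths_wrapped (xss : int list list) : int list =
  List.map (fun xs -> List.length xs) xss

(* Better: pass the function itself. *)
let lengths (xss : int list list) : int list =
  List.map List.length xss

(* The helper reads n from the enclosing scope instead of taking it
   as an extra argument. *)
let count_up_to (n : int) : int list =
  let rec from (i : int) : int list =
    if i > n then [] else i :: from (i + 1)
  in
  from 1

(* An accumulator is justified here because the helper is tail-recursive:
   the recursive call is the last thing each step does. *)
let sum (xs : int list) : int =
  let rec go (acc : int) (rest : int list) : int =
    match rest with
    | [] -> acc
    | x :: tl -> go (acc + x) tl
  in
  go 0 xs
```

Note that `from` in `count_up_to` is not tail-recursive, so it does not carry an accumulator; adding one there would only complicate the code.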
If a comment needs to span multiple lines, adding an asterisk at the start of all subsequent lines aids readability.
(* this is a
 * multiline spanning
 * comment *)
Think about which scope a value or function is needed in -- if a helper function will only be invoked within a main function, it makes sense to define it with a let inside that function. If a value will be used by multiple functions, it may make more sense to declare it in the global scope.
As in other languages, when catching exceptions, catch the most specific type of exception that you expect.
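For instance, `List.assoc` from the standard library raises `Not_found` when the key is absent, so a lookup with a default value only needs to handle that one exception. The function name `lookup_or` is hypothetical.

```ocaml
(* Looks up key in an association list, falling back to default.
   Only Not_found is handled; any other exception still propagates. *)
let lookup_or (default : int) (key : string) (table : (string * int) list) : int =
  try List.assoc key table with
  | Not_found -> default
```

Writing `| _ -> default` instead would also swallow exceptions you did not anticipate, which tends to hide real bugs.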
For more, please see Ilya Sergey's excellent advice on OCaml style.