Biostatistics
(BSc Biol, 1st year, Spring 2021)
Welcome, students and any other curious people that have somehow decided to drop by! For some of the people taking this course, these will be the first and last statistics lessons you'll ever take, so my colleagues and I will try to make the experience as painless, pleasant and formative as possible.
Statistics is of the utmost importance in all of the natural sciences, and particularly in any biologist's thought process. Due to time constraints and teacher-to-student ratios, it will unfortunately be impossible to teach you everything you should know, or to teach it as well as you deserve. Still, we hope you'll get a basic understanding of why the topic matters and that, by the time we part ways, you'll feel like learning more about stats at some point in the future.
My lessons might also be the first and last some of you ever take in computer programming. Being able to speak a programming language will be extremely useful to each of you in today's world, in your profession and elsewhere. The basic principles you'll learn would justify a course by themselves, and R (the specific language we'll try to speak, listen to, write and read) is widely used in genetics and in some sectors of molecular biology, ecology, plant sciences and zoology, to name a few. For example, if you ever enrol in the MXX and you manage to review your notes from this course quickly at the beginning of that master's degree, you'll save yourself lots of trouble (believe me, I've witnessed it!). But producing good notes requires understanding the contents of the course!
Whereas most of the courses you're taking this year will probably be useless in the medium and long term, we strongly recommend that you try to make the most of Biostatistics! Or else you'll regret it!
Syllabus
Profs. P Faraldo, A Saavedra, P Saavedra and I are in charge of the students in the first half of the alphabet. We cover the following topics:
Descriptive statistics
Basics of probability
Random variables. Notable examples of discrete and continuous distributions. Statistical independence
Parameter estimation and confidence intervals in Bernoulli and Gaussian populations
Hypothesis testing in Bernoulli and Gaussian populations
Simple linear regression. Parameter estimation and confidence intervals. Analysis of variance. Prediction
Course materials
The USC online campus for the course contains detailed notes for the theoretical classes, written by my colleagues, who have been kind enough to also include my name on them. My classes will be devoted to practical exercises focused on programming with R, while reinforcing the theoretical contents of the course. There are some useful materials below (mostly in Galician) and I may well add more in the future, but in any case the main communication channel will be the USC online campus, so there's no need to check this site often for updates.
In the very abnormal 2020–2021 academic year, all my classes are being livestreamed and the recordings are available on the course's channel (for students only). Attending on-site has also been possible since the beginning of March.
Please don't...
Please don't postpone installing:
R, from: https://ftp.cixug.es/CRAN
The desktop version of RStudio (after R is installed), from: https://rstudio.com/products/rstudio/download/#download . If you're asked to choose a license type, go for the free (as in "free beer") open-source version.
(Only for Windows users; the rest of you are already done.) Notepad++, from: https://notepad-plus-plus.org/downloads/ , which is an improvement on the awful, by-no-means-good Notepad (in the end, we didn't have much time to delve into this, but the TL;DR version would be that you should never ever use Notepad for anything). To figure out whether you need the 32-bit or the 64-bit version of NPP, go to "Control Panel", use the search bar to look for "System" and then click on "System". YMMV depending on the specific Windows version. Anyway, the chances are that you have a 64-bit system. Linux users are more than fine with the default editors; macOS has TextEdit by default, which is enough for this course.
I'm begging you on my knees to at least try, as soon as you can. Otherwise, we'll waste a lot of time during the classes, which we need for learning interesting stuff.
My goal; your goal
So is this all about you getting good grades? No, no, no. The practical lessons with R will be assessed via two 1-hour exams during class time, on dates that we'll agree on with you. They'll be remarkably similar to the practical examples and/or exercises in Prof Casares' notes (see USC online campus), which we'll also deal with during our classes. It's expected that anyone who makes a minimum of effort gets ≥1.0/1.5 and that anyone who genuinely understands everything and isn't extremely unlucky/nervous on exam days gets 1.5/1.5... Or at least that's what I've been told (I have no control over, and no information on, what the exams will be like).
Your goal should be to:
Get some basic understanding of computer programming because, once you speak a programming language, you can:
Perform everyday tasks on your computer that you couldn't perform (easily) before.
Learn how to speak any other computer language very easily.
Have lots of fun! (I'll do my best to prove it, but first we need to start with some slightly boring stuff.)
Learn a few essential notions of statistical thinking.
Produce nice, short handwritten summaries of what you learn, to help you remember it when you need it some day, maybe years from now (e.g., during your master's degree or in your job).
Be able to learn both R and statistics independently in the future. Learning stuff by heart is so 20th century! The best you can learn from me is how to learn.
My goal is to persuade you that the above is your goal, and to help you achieve it. Forget about the grades: they're going to be good if you focus on the actual goal and, if they aren't, it means that we (or whichever professor grades you) are terribly unfair and you shouldn't care about how we assess your work.
Homework
I'd like you to devote some time to the following exercise before our March 26th class. Click here to download the data file.
First exam
The document below can be very interrrrrrrrrrrresting when it comes to preparing for the April 9th exam.
Below you have the exam we did on April 9th:
Second exam
First of all, make sure to download the Galton dataset. The verrrrry interesting document for the exam is available below, since Sunday. Sorry for the delay! I am doing the best I can!
Handwritten summaries
Feel free to hand in your handwritten summary of what you studied with me, using this form, before the second exam. For historical interest, here are the notes I wrote when I was much younger:
Student survey
Please help me be less awful to next year's students by filling in this form. The goal of that form is to increase how much you impact me; in order to increase how much I impact you, here is a summary of what we discussed at the end of the last class:
We know it happens...
<⌒/ヽ-、___ ZZZZ...
/<_/____/
∧_∧
( 0ω0) The workbook!
_| ⊃/(___
/ └-(____/
<⌒/ヽ-、___ Oh wait, I'm a uni student now.
/<_/____/ ZZZZ...
∧_∧
( ÔωÔ) HOLY CRAP, THE BIOSTATISTICS
_| ⊃/(___ PRACTICALS!
/ └-(____/
FAQs
Why is this site in English? My website is all in English because it's aimed at a general audience, including international colleagues and friends, and exchange students. Plus, you really need to understand statistics and computer programming texts in English if you want to learn independently in today's world. So don't complain so much!
I have questions about statistics and/or the course. Can I ask you? Although I'm not being paid for these classes and they take time away from my actual and poorly-paid job, I'll be happy to help you learn about any topic that is (in)directly related to the course. That said, I don't know what the final exams will be like, what the assessment criteria are, or what happens in the lessons I don't teach. For anything related to studying for the course per se and the final exam, or if your goal is simply getting good grades, please contact the course coordinator (mariadelosangeles.casares.decal@usc.es) or the most appropriate of your real professors (pedro.faraldo@usc.es, alejandro.saavedra.nieves@usc.es, paula.saavedra@usc.es), depending on who taught what you want to ask about.
Can I contact you regarding other issues? Can I contact you once the course is over? Yes and yes. I've received valuable help from university staff in the past, with various personal and academic problems. If you don't know whom else to ask or if you feel that you trust me for whatever reason, please don't hesitate to contact me, regardless of whether we're in 2021 or 2023.