Course schedule

Join us on Slack!

Announcements

  • Final course grades are out! Here are the results of your course participation according to the course grading formula (Grade = [A]*0.10 + [B]*0.25 + [C]*0.30 + [D]*0.35 + [E]) on OSIRIS. Some success statistics: 40 students passed. 2 gave up. 5 need a second chance. Some grade statistics: Highest grade: 9.9. Average grade: 6.9; Standard deviation: 1.66. Hardest assignment: endterm/final, with an average grade of 6.49. We are pretty sure that we have now incorporated and graded all aspects of the course in all fairness, but should we have missed a course component, then just email Marco. I will post the second chance assignment for the Unfortunate Five later this week on the Assignments page (forwarded ...
    Posted 20 Dec 2017, 05:34 by Marco Spruit
  • Grades: Endterm assignment After quite a delay (sorry about that) we now have the grades for your endterm assignment, published on Slack. Please check ASAP that we have not accidentally forgotten to grade your work (we did expect a few more submissions). It turns out that the RStudio with Spark environment got somewhat buggy iff you didn't use it exactly as intended... Luckily PySpark was always an option. NB: Tomorrow I will update the post and publish the complete course grades, how to repair, etc. NB2: Don't worry about the unfortunate Jan 4 date for the second chance exam, we are reasonable people... More tomorrow.
    Posted 19 Dec 2017, 13:18 by Marco Spruit
  • S3: Some Slack Stats Here are some statistics on your Slack usage during the course. It was used pretty intensively! And lots of direct messages...
    Posted 13 Nov 2017, 02:35 by Marco Spruit
  • Extra time (update) Based on the current status of a number of you with respect to the Mid-term Assignment, we have decided to give you an extra week to complete the assignment. The new DEADLINE for submitting the Mid-term assignment is now Tuesday Oct 17 at 15:00 CET. Furthermore, we will provide more step-by-step guidance in the workshop on Tue Oct 10 before the regular lecture, for those of you who have trouble with Python, the command line, and so on. Finally, some tips that emerged from the questions you had this morning: Use Debugging tip #1 in the lecture slides: test each mapper script first from the local machine's command line to see the actual output it results in, which will ...
    Posted 14 Oct 2017, 08:04 by Marco Spruit
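The grading formula quoted in the first announcement can be written out as a small function. This is only a sketch: the components [A] to [E] are not defined on this page. The weights suggest that [B] is the mid-term (25%) and [C] the end-term (30%), and that [E] is an unweighted bonus term, but that reading is an assumption. The function name and the sample scores are hypothetical.

```python
def course_grade(a, b, c, d, e=0.0):
    """Weighted course grade, following the formula in the announcement.

    What each component stands for is an assumption; only the weights
    (0.10, 0.25, 0.30, 0.35, plus an unweighted [E]) come from the page.
    """
    return a * 0.10 + b * 0.25 + c * 0.30 + d * 0.35 + e


# Hypothetical component scores, for illustration only.
example = course_grade(8.0, 7.0, 6.5, 7.5)
print(round(example, 2))
```

Because the four weights sum to 1.0, equal scores on [A] to [D] give that same score back, and [E] is simply added on top.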

Assignments

  • Tutorial 5: PySpark data preparation To help you get started with the final assignment, I have prepared an example using PySpark on how to read and preprocess part of its tab-separated values data files. Together with the walkthrough given in the latest lecture, this should take care of some of the technical grunt work, so that you can focus faster on learning to experiment with Spark. Turn to the #random Slack channel if you can't seem to get through this tutorial. Note that this is an optional tutorial that you need not submit; but if you are unable to get through this one, you definitely won't be able to do anything related to the final assignment, right?! In addition, I can post ...
    Posted 19 Oct 2017, 04:14 by Marco Spruit
  • End-term assignment: Epidemiology The goal of this final assignment is to demonstrate your familiarity and hands-on experience with big data technology tools, in casu Spark, either using Python or R, to the extent that you can apply these tools to answer real-life questions from domain experts. This end-term assignment of the INFOMDSS course constitutes 30% of your final grade. Note that this assignment is an individual endeavour. We do check your scientific integrity… This assignment has been kindly proposed by the Epidemiology staff of the UMCU/Julius Center. The assignment consists of four parts. The first part “Preprocess data” requires several basic, generic and essential Spark data processing skills to complete. Upon completion (in a correct and well-described ...
    Posted 17 Oct 2017, 11:27 by Marco Spruit
  • Mid-term Assignment: Neonatology Below you'll find the mid-term assignment on processing and analysing Neonatology data within the Hadoop distributed computing environment. This is your chance to demonstrate your familiarity and hands-on experience with big data technology tools, in casu the Hadoop ecosystem, to the extent that you can apply these tools to answer real-life questions from domain experts! This mid-term assignment of the course constitutes 25% of your final grade. Note that this assignment is an individual endeavour. This assignment has been kindly proposed by the Neonatology staff of the UMCU. See the attachment for all details and tasks. We are aware that some of the tasks are probably much easier to perform within other environments. However ...
    Posted 2 Oct 2017, 13:24 by Marco Spruit
  • Tutorial 4: Spark This is the fourth in our series of tutorial exercises -- Spark: again the ubiquitous example -- to help you get started and acquainted with the Hadoop big data technologies environment. As mentioned on the last slide, and as communicated earlier as well, it is required to go through these tutorials, even though you won't be graded for doing them. Below (and on the final slide in the attachment) is our lightweight solution to communicate to us that you have completed the tutorial: Upload a screendump of the VM in a non-maximised window, displaying the command line window after executing the ls -l command in the main user directory, made with the PrintScreen (PS) button on your keyboard (thus showing ...
    Posted 21 Sep 2017, 06:44 by Marco Spruit
  • Tutorial 3: MapReduce This is the third in a series of tutorial exercises -- Map/Reduce: its ubiquitous example -- to help you get started and acquainted with the Hadoop big data technologies environment. As mentioned on the last slide, and as communicated earlier as well, it is required to go through these tutorials, even though you won't be graded for doing them. Below (and on the final slide in the attachment) is our lightweight solution to communicate to us that you have completed the tutorial: Upload a screendump of the VM in a non-maximised window, displaying the command line window after executing the ls -l command in the main user directory, made with the PrintScreen (PS) button on your keyboard (thus showing ...
    Posted 21 Sep 2017, 06:46 by Marco Spruit
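The "ubiquitous example" that Tutorials 3 and 4 refer to is not named on this page; the sketch below assumes it is the classic word count. It is not the tutorial's own code and does not use Hadoop or Spark: it runs the map and reduce steps in plain Python on a couple of sample lines, in the spirit of checking a mapper's output locally before submitting it to the cluster. The function names and sample data are hypothetical.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reducer(pairs):
    """Reduce step: sum the counts of all pairs that share the same word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def word_count(lines):
    """Run the mapper over every line, then feed all pairs to the reducer."""
    pairs = (pair for line in lines for pair in mapper(line))
    return reducer(pairs)


# Hypothetical sample input, standing in for a real data file.
sample = ["big data big tools", "data tools"]
print(list(mapper(sample[0])))  # inspect the mapper output on its own first
print(word_count(sample))
```

Keeping the mapper as a separate function makes it easy to inspect its raw output for a few lines before the reduce step is involved.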