For assessment, we will use a combination of timely assessments to complement a final exam that covers the scope of the course.
For Days 1 and 2, students will be required to turn in answers to the end-of-section quizzes in Chapters 1, 2, 3, and 4 of Loss Data Analytics.
The answers to the questions are in the reading, making this a non-threatening assignment.
To check that you understand these questions, a random selection of them will appear on the final exam.
For Day 3, students will be required to attest that they have completed the online tutorial.
As with Days 1 and 2, a random selection of questions from the tutorial will appear on the final exam.
As another check, each student will be required to write a DataCamp-style question on Chapters 2-4 of Loss Data Analytics.
I will assign a question to each person; see below for a few sample questions.
You will turn in an .Rmd file summarizing your work. Moreover, you will present your work to classmates on the morning of Day 5.
The material from Days 4 and 5 will be covered on the final exam. In addition to the random selections described above, the final exam will focus on generalized linear models and survival models.
Instructions for Writing Your DataCamp-Style Question
Start with a short assignment text. State the source of the data, the type of model, and what this Question is about.
Break the question into 4-6 small segments; typically, each segment represents a line or two of code. Write the solution first, then go back to the code and remove selected pieces that should be filled in by the person solving your Question.
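The "solution first, then remove pieces" workflow can be sketched as follows. This is only an illustration under stated assumptions: the helper make_scaffold is hypothetical, it is written in Python for convenience, and the four R segments (an exponential model with made-up loss values) are placeholders rather than one of the assigned questions. Your actual submission remains an .Rmd file.

```python
# Hypothetical helper: turn a finished solution into a fill-in-the-blank scaffold.
def make_scaffold(solution_lines, pieces_to_remove, blank="___"):
    """Replace the selected piece of each segment with a blank (first match only)."""
    scaffold = []
    for line, piece in zip(solution_lines, pieces_to_remove):
        if piece and piece in line:
            line = line.replace(piece, blank, 1)
        scaffold.append(line)
    return scaffold

# Step 1: write the full solution, one short R segment per entry.
# The loss values are invented for illustration only.
solution = [
    "losses <- c(120, 300, 450, 900)",
    "mean_loss <- mean(losses)",
    "theta_hat <- mean_loss",
    "1 - pexp(500, rate = 1 / theta_hat)",
]

# Step 2: choose one piece per segment to remove; None leaves a segment intact.
removed = [None, "mean(losses)", "mean_loss", "1 / theta_hat"]

scaffold = make_scaffold(solution, removed)
for line in scaffold:
    print(line)
```

Leaving the first segment intact gives the solver a starting point, while each later blank asks for one small, well-defined piece of code.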
The following are questions to be addressed:
1. Use the claims-level data. Fit a Pareto distribution to the data. Use the likelihood ratio test to assess the null hypothesis that alpha = 1.0 and theta = 2000.
2. Use the claims-level data. You wish to compare the fit of the Pareto distribution to that of the gamma distribution. For each distribution, determine the AIC and BIC statistics. Use these statistics to say which model is preferred.
3. Consider the following sample of 10 payments:
400 400 500+ 500+ 500 800 1000+ 1000+ 1200 1500
Here, the symbol + indicates that a loss exceeded the policy limit. Determine the Kaplan-Meier product-limit estimator of the survival function.
4. Consider the following sample of 5 observations:
0.15 0.25 0.4 0.7 0.9
Assume that this is a sample from a population with cdf F(x)=x^p, 0<x<1.
Determine the estimate of p by the method of moments.
5. For the data and population assumption in Question 4, determine the estimate of p by percentile matching, equating the model median to the smoothed empirical estimate of the median.
6. For the data and population assumption in Question 4, determine the estimate of p by the method of maximum likelihood.
7. Consider the following sample of 5 observations:
1, 2, 3, 5, and 40.
Determine the value of the Kolmogorov-Smirnov test statistic for the null hypothesis that the distribution function is F(x)=exp(-2/x), x>0.
8. For the data in Question 7, consider the gamma and Pareto distributions, fitted using maximum likelihood estimation. Calculate the Kolmogorov-Smirnov test statistic for each of these two alternative models. Based on these two test statistics, state which model you prefer and why.
9. Develop a question based on Section 2.3. See the sample code in R for Loss Data Analytics.
10. Develop a question based on Section 2.4. See the sample code in R for Loss Data Analytics.
And so on; the number of questions depends on the number of students in the class.
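When you write the solution for your Question, it helps to verify the numbers independently. As one possible check for Questions 4 to 6, the sketch below computes the three estimates of p from their closed forms. It is written in Python only as an illustration (your submission should still be R code in an .Rmd file), the variable names are illustrative, and it assumes the smoothed empirical percentile sits at position (n + 1) times the percentile level in the ordered sample.

```python
import math

# Sample from Question 4; the model is F(x) = x^p on 0 < x < 1.
sample = [0.15, 0.25, 0.4, 0.7, 0.9]
n = len(sample)

# Method of moments: the density is p * x^(p - 1), so E[X] = p / (p + 1).
# Setting E[X] equal to the sample mean gives p = mean / (1 - mean).
mean = sum(sample) / n
p_mom = mean / (1 - mean)

# Percentile matching at the median. Assumption: the smoothed empirical
# median is at position (n + 1) * 0.5 in the ordered sample, with linear
# interpolation when that position is not a whole number.
ordered = sorted(sample)
position = (n + 1) * 0.5
lower = int(math.floor(position))
weight = position - lower
median = ordered[lower - 1]
if weight > 0:
    median += weight * (ordered[lower] - ordered[lower - 1])
# Solve median^p = 0.5 for p.
p_pm = math.log(0.5) / math.log(median)

# Maximum likelihood: the log-likelihood is n*log(p) + (p - 1)*sum(log x),
# which is maximized at p = -n / sum(log x).
p_mle = -n / sum(math.log(x) for x in sample)

print("method of moments:  ", round(p_mom, 4))
print("percentile matching:", round(p_pm, 4))
print("maximum likelihood: ", round(p_mle, 4))
```

The three methods give noticeably different estimates on a sample this small, which is a useful point to draw out in the assignment text of a Question.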