Basic Scala Problems

Data Set:

The files in /users/mlewis/CSCI3354-F20/InClass/data/BasicScala/ were pulled from the following sources.

The first one has information on educational attainment for various countries. The second one has information on the GDP of countries. The third one contains just the latitude and longitude of those countries, which can be used for plotting.

In-Class Questions (done in groups):

All the code that you write to answer these questions should be put in a package called basicscala in the assignment repository. (Note that a package is just a subdirectory. Since all of your code is going in src/main/scala for sbt, the code for this should be in src/main/scala/basicscala.) You should also make a file called BasicScala.md in the top level of your repository that includes a write-up with your answers to the questions and any requested plots. Tell me who you were partnered with for these questions in the Markdown file.

The goal of this assignment is to make sure that you are familiar with the higher-order collection methods. As such, you shouldn't use any loops to do this.
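As a rough illustration of what "higher-order collection methods" means here, the sketch below runs a few of them on made-up data. The `Reading` case class, its fields, and the sample values are hypothetical and are not the format of the actual data files; only the collection methods themselves (`filter`, `map`, `maxBy`, `sortBy`, `take`, `groupBy`) are standard Scala.

```scala
// Hypothetical record type; the columns in the real files may differ.
case class Reading(country: String, year: Int, value: Double)

object HigherOrderSketch {
  // Made-up sample values, purely for illustration.
  val readings: Seq[Reading] = Seq(
    Reading("A", 1970, 1.0),
    Reading("A", 2015, 4.0),
    Reading("B", 1970, 2.0),
    Reading("B", 2015, 3.0)
  )

  // filter + map replaces a loop that appends to an accumulator.
  val values2015: Seq[Double] = readings.filter(_.year == 2015).map(_.value)

  // maxBy finds the record with the largest value.
  val largest: Reading = readings.maxBy(_.value)

  // sortBy (descending via negation) + take gives a top-N list.
  val topTwo: Seq[Reading] = readings.sortBy(-_.value).take(2)

  // groupBy collects records per key; here each group is reduced to the
  // change between its earliest and latest year.
  val changeByCountry: Map[String, Double] =
    readings.groupBy(_.country).map { case (country, rs) =>
      val sorted = rs.sortBy(_.year)
      country -> (sorted.last.value - sorted.head.value)
    }
}
```

Each step is a single expression that produces a new value, so there is no mutable counter or index to manage.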

1. How many different types of values are reported in the education file under the "metric" column? What are they?

2. List the five entries with the highest value for "Education Per Capita". (Use "Education Per Capita" for all following education questions.)

3. Which country has had the largest increase in education per capita over the time of the dataset? How big is the difference? You may get a different country for each age/gender combination, so present your answer as a table.

Questions (done individually):

4. Which country had the largest GDP per capita in 1970? What was it? Give me the same information for the smallest value.

5. Which country had the largest GDP per capita in 2015? What was it? Give me the same information for the smallest value.

6. Which country had the largest increase in GDP per capita from 1970 to 2015? What were the starting and ending values? (Note that you can't assume no data means 0. It just means that it wasn't reported.)

7. Pick three countries and make a scatter plot with year on the X-axis and educational attainment of females ages 25-34 on the Y-axis. Your three countries should have good data going back to at least 1970.

8. For those same three countries you picked for #7, make a scatter plot of GDP over time.

9. Make a scatter plot with one point per country (for all countries) with GDP on the X-axis and education level of males ages 25-34 on the Y-axis. Make a similar plot for females. Do this for both 1970 and 2015.

10. Make a scatter plot with longitude and latitude on the X and Y axes. Color the points by educational attainment of females ages 25-34. Have the size of the points indicate the per capita GDP. Do this for both 1970 and 2015.
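The note in question 6 about missing data can be handled without loops by representing each value as an `Option`. The following is only a sketch under assumptions: the `GdpRow` type, its field names, and the sample rows are hypothetical, and it assumes a missing value shows up as a blank or non-numeric cell, which may not match the real file.

```scala
import scala.util.Try

object MissingDataSketch {
  // Hypothetical row: a country name plus the raw text cells for two years.
  case class GdpRow(country: String, y1970: String, y2015: String)

  // Parse a cell, giving None for blank or non-numeric text instead of 0.
  def parse(cell: String): Option[Double] = Try(cell.trim.toDouble).toOption

  // Made-up sample rows; country "B" has no reported 1970 value.
  val rows: Seq[GdpRow] = Seq(
    GdpRow("A", "100.0", "250.0"),
    GdpRow("B", "", "900.0"),
    GdpRow("C", "300.0", "350.0")
  )

  // Keep only the countries where both years were reported.
  val reported: Seq[(String, Double, Double)] = rows.flatMap { r =>
    parse(r.y1970).flatMap { start =>
      parse(r.y2015).map(end => (r.country, start, end))
    }
  }

  // Largest increase among countries with complete data.
  val largestIncrease: (String, Double, Double) =
    reported.maxBy { case (_, start, end) => end - start }
}
```

If the blank cell for "B" were read as 0, "B" would appear to have the largest increase; dropping unreported values avoids that mistake.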