We plan to use surveys and interviews as our needfinding techniques to collect data for the project. Our ideal survey and interview participants would be students spanning a range of education levels and machine learning backgrounds, from undergraduates to Ph.D. students, and from those mildly interested in machine learning to those pursuing machine learning as a career path.
The first tool is an online survey, which provides a flexible and comfortable environment for users to answer questions without being supervised. In addition to our guarantee that the survey results will be used only to understand how to build a better benchmark database platform, the survey is anonymous, so users can give honest responses without worrying that their answers could have negative consequences if exposed. A survey also lets us collect many responses with little time or expense, and the results can easily be turned into charts for further analysis. Given our limited time to gather and interpret information, this technique is valuable for its conciseness and the breadth of respondents it can reach. The survey aims to collect as much information as possible about what people expect from a benchmark database platform and whether there are common patterns in how they look for or contribute datasets, which may reveal features we should provide on our website.
We plan to create the survey with Google Forms and distribute it through social media, including Facebook, GroupMe, and Slack groups, to reach respondents who are interested in machine learning and have beginner to intermediate backgrounds. To get feedback from advanced students and experts, we will survey students in machine learning and AI courses at our university (for example, CSC 246 and CSC 242), as well as Ph.D. students and professors in our network. The survey consists mostly of multiple-choice questions, primarily yes-or-no checkboxes and numeric ratings, to allow quick responses, with a few open-ended questions and free-text boxes at the end to capture additional thoughts and opinions.
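Because Google Forms can export responses as a CSV file, turning the raw tallies into charts takes only a short script. Below is a minimal sketch, assuming a hypothetical export named responses.csv and a hypothetical multiple-choice column; the actual file and column names would come from our survey questions.

```python
# Minimal sketch of summarizing survey responses exported from Google Forms.
# Both "responses.csv" and the question string below are placeholders; they
# would be replaced by our real export and column names.
import pandas as pd
import matplotlib.pyplot as plt

responses = pd.read_csv("responses.csv")

# Tally the answers to one hypothetical multiple-choice question.
question = "How often do you search for public datasets?"
counts = responses[question].value_counts()

# Bar chart of the answer distribution for quick visual analysis.
counts.plot(kind="bar", rot=45, title=question)
plt.ylabel("Number of respondents")
plt.tight_layout()
plt.savefig("dataset_search_frequency.png")
```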
The second tool we will use is in-person interviews. We intend to interview researchers and students working in machine learning to better understand the challenges they face when looking for and downloading datasets. Unlike surveys, which deliver the same questions to everyone, interviews let us ask follow-up questions tailored to each response. Since face-to-face interviews are better for building rapport, the interviewees will be chosen from the University of Rochester, and we will pick people with different backgrounds and levels of machine learning knowledge in order to gather a range of perspectives. We will also conduct expert interviews with professionals who use machine learning models to analyze natural disasters; these may surface user needs that student interviews would not.