This has been a fun year and now it's time for some reflection on the work I have done. I think the freedom provided by EMC did not pose too much of a challenge for me. Generally, I was able to figure out what I wanted to do for assignments and journals without a lot of indecisiveness. I was able to get a pretty clear picture of what I needed to do by talking to my coordinator. The advice I got was solid and allowed me to find direction in where I wanted to take my research.
The most difficult part of EMC for me was tying all my research together. My research seemed to take some broad jumps at points which made the whole thing lack focus. I think I could've prevented this by focusing more on questioning and making sure my questions were leading me down a logical path of research.
I think my research experience in EMC will help me with my future endeavors. I want to get involved with academic research so reading academic journals and investigating a single topic in depth are certainly useful experiences. Throughout my research I think I did take a few risks by trying out new mediums and asking new questions. I certainly felt comfortable taking these risks. Of the five C's I think my research this year most lacked collaboration. It was difficult to coordinate as I was mostly doing my own thing with my research.
The assignment I am most proud of from this year is my experiment on human-computer trust. Although it didn't go quite as planned, I learned a lot and was able to get some real results. If I were to change one thing about EMC it might be to allow more freedom towards the beginning of the year, though I'm not sure if that would work out how I think.
The experience of being able to tell an audience about my research at the Symposium was certainly fun. I think the delivery of my speech was mostly effective. However, the content could've been better if my research throughout the year was a bit more focused. Most of my work from the very beginning of the year didn't fit in at all. Overall, I am happy about my speech and my research this year.
My top choice for my symposium speech is the idea about how automating jobs currently done by humans is for the greater good. I chose this because it has the strongest connection to both the symposium theme and my own research. It will not be difficult to relate this topic to the theme because it addresses it directly through my assertion that automation is for the greater good.
The research that I have done so far is somewhat related to this idea as I have gathered plenty of information on humans working with automation. I will also need to do a lot of new research on the actual effect of automation on human society.
I am a little worried about the public speaking aspect of this symposium. I am very confident with research and writing so I don't think I will have trouble creating my speech. Delivering it will be another story. I will definitely need plenty of practice to pull this off.
The most satisfying and fulfilling thing about my March SDA was being able to create a coherent piece of research that demonstrated positive results, like the many papers I have cited leading up to this. This type of SDA worked well with open inquiry because I had total control over it. However, it did create difficulty, as I had to make the call regarding when my project was completed. There were no requirements or guidelines. Because of this, I had to approach this SDA a little differently. My normal approach is to use the directions to flesh out my rough idea, but this time I had to complete the entire concept myself. The nature of my SDA made this a little easier as it was modeled after existing experiments.
This SDA has taught me many things about the field I am researching. Normally in studies involving human participants, everything is conducted in person and monetary compensation is provided. My lack of time and funds made this impractical, so I decided to conduct this experiment online. This, however, had its own issues. When experiments are done in person, detailed instructions can be given, and questions can be answered. I needed to communicate my instructions online. My initial approach was just a few paragraphs describing what the participants needed to do. Unfortunately, through observation of some select participants I found that people were likely to skim or even skip the instructions as a large section of text can be intimidating. I then implemented an interactive tutorial but found again that my paragraphs were too long. I finally found success after splitting up my tutorial into a series of very simple units of information. This instruction issue taught me that communicating very specific instructions takes serious thought.
I also learned just how sensitive these types of experiments are. In the original experiment, they told participants the exact accuracy of the aids. I made the change of telling them instead that the aids were not 100% accurate (but did not disclose the exact accuracy value). I thought that this change would be insignificant; however, I found that many people were likely to not consider the gauges at all when told that they were not 100% accurate. I think that people are very unlikely to trust automated aids when given no information at all.
Clarifying the instructions and providing the exact accuracy of the gauges were two changes I needed to make to the experiment on the fly. This invalidated many results which is what led to my small and skewed sample sizes. I now realize that it would've been best to either do a test run of the online experiment prior to the actual one or just do the entire experiment in person.
Despite these pitfalls, or rather because of them, open inquiry provides one with the greatest opportunity to learn. In the future I would make the choice to do open inquiry again.
Symposium Speech Ideas:
People often villainize automation because it has the potential to take jobs, but it is beneficial in the long run as we will see people working alongside machines in new types of jobs.
For the greater good, we should decrease our reliance on machines.
I came across a study that looked into one of the issues I mentioned in my previous two journals: the question of how the effects of automation misses differ from those of automation false alarms. Misses are cases when a simulated aid fails to report on some occurrence, while false alarms are when an aid incorrectly reports that something occurred. I wasn't sure about the exact effects of each, and I had simply chosen to use misses to model another study and be consistent. This paper shed more light on the topic.
Examining Single- and Multiple-Process Theories of Trust in Automation is a paper published in the Journal of General Psychology by Dr. Stephen Rice in 2009. It looks into the effects of different types of automation error, in this case misses and false alarms. They wanted to find out whether the relationships between operators and systems could be described with a single trust concept or whether two different ideas of trust are necessary. To investigate this, they had participants watch an image flashed onto a screen and identify whether or not the image contained a tank. The participants had an aid system to assist them and could choose whether or not to follow its advice. Different groups of participants had aids with different reliabilities and with different types of error. What they found was that "participants in the false-alarm-prone conditions were less likely than were those in the miss-prone conditions to agree with the automation when it judged that a target was present." This makes logical sense. It's the automation that cried wolf. If your aid always raises false alarms, you will be less likely to trust its future alerts. This was true to a lesser extent for miss-prone automation. Conversely, "participants in the miss-prone conditions were less likely than were those in the false-alarm-prone conditions to agree with the automation when it judged that a target was absent." This may all seem perfectly logical, but it proves that humans treat false alarms and misses differently when it comes to trust. Humans differentiate between trust in alerts (compliance) and trust in a lack of alert (reliance), affirming the two-systems-of-trust theory.
What does this mean for my study? I think the important takeaway for me is that it doesn't matter which I use between misses and false alarms as long as I stay consistent with one. My study is about aid similarity, so having different types of errors would complicate things by introducing two different systems of trust.
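To make the compliance/reliance distinction concrete for my own analysis, here is a small sketch (a hypothetical illustration in Python; the trial format and function name are my own, not from Rice's paper):

```python
def compliance_and_reliance(trials):
    """trials: list of (aid_alerted, operator_agreed) pairs.
    Compliance is the rate of agreeing with the aid when it alerts;
    reliance is the rate of agreeing with it when it stays quiet."""
    alert_agreements = [agreed for alerted, agreed in trials if alerted]
    quiet_agreements = [agreed for alerted, agreed in trials if not alerted]
    compliance = sum(alert_agreements) / len(alert_agreements)
    reliance = sum(quiet_agreements) / len(quiet_agreements)
    return compliance, reliance

# A false-alarm-prone aid should show low compliance; a miss-prone aid
# should show low reliance, even at the same overall error rate.
```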
Works Cited
Rice, Stephen. "Examining Single- and Multiple-Process Theories of Trust in Automation." The Journal of General Psychology, vol. 136, no. 3, 2009, pp. 303-19. eLibrary, https://explore.proquest.com/elibrary/document/213638707?accountid=51266.
In this journal I will provide a more in-depth explanation of how I am modeling my experiment and my hypothesis along with my reasoning for it. I will create two systems of aids: one where the aids are visually similar and another where the aids are visually distinct¹. Half the participants will be assigned to the first system, and the other half will be assigned to the second system. They will be referred to as the similar group and the distinct group respectively. Participants will have to monitor two gauges, each with its own ideal value and safe range. If the gauge falls outside of the safe range, an error must be reported. The gauges will be intentionally difficult to read, so the participants will have an aid to assist them for each gauge. One of these gauges will be 100% reliable while the other is 70% reliable², and participants will be told before starting that the aids are not necessarily 100% accurate, so they should use their best judgment. The second gauge will only miss, meaning that it will incorrectly report that the gauge is safe but will not incorrectly report that there's an error³. After each diagnosis, the participant will be told whether or not they were correct. The idea is that the participants in the similar group will form a stronger correlation between the accuracies of the two gauges and will therefore feel that the perfect gauge is unreliable, more so than the distinct group, who will not make such a correlation.
What the distinct group would see
What the similar group would see
The experiment will be composed of a series of trials. In each trial, the participant will have eight seconds to examine two gauges and two aid recommendations that will appear before them. They will then be asked to report whether or not there was an error for each gauge. At this point, they may agree or disagree with the aid. A screen telling the participant if they were correct or not will then display for two seconds. At the end of all the trials, participants will be asked to estimate the reliability of each aid. I plan on using this along with their disagreement rate to determine how much each aid was trusted. A total of 32 trials will be conducted for each participant. The total experiment will take about six minutes.
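As a sanity check on this design, the aid logic can be mocked up in a few lines (a hypothetical sketch, not the actual experiment software; the function names and the 50% error rate are my own assumptions):

```python
import random

def aid_diagnosis(true_error: bool, reliability: float) -> bool:
    """Return the aid's report (True = error reported). A miss-only aid
    never falsely alarms: on safe trials it always reports safe, and on
    error trials it catches the error only `reliability` of the time."""
    if not true_error:
        return False  # no false alarms by design
    return random.random() < reliability

def run_session(n_trials: int = 32, error_rate: float = 0.5):
    """Simulate one participant's session and count how often each aid
    correctly flags a real error."""
    errors = perfect_hits = faulty_hits = 0
    for _ in range(n_trials):
        true_error = random.random() < error_rate
        if aid_diagnosis(true_error, 1.0):   # the 100% reliable aid
            perfect_hits += 1
        if aid_diagnosis(true_error, 0.7):   # the 70% reliable, miss-only aid
            faulty_hits += 1
        errors += true_error
    return errors, perfect_hits, faulty_hits
```

Running this confirms the asymmetry I want participants to experience: the perfect aid flags every real error, while the faulty aid silently misses roughly 30% of them.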
I hypothesize that the similar group will report much lower reliability for the perfect aid than the distinct group. They will also demonstrate higher disagreement rates. This is because gauges that look more similar will seem to be more a part of the same system. This will promote system-wide trust and result in more spillover of unreliability. One could argue that because Rice and Geels showed in Using System-Wide Trust Theory to Make Predictions About Dependence on Four Diagnostic Aids that distance has no effect on the level of system-oriented thinking, visual similarity should also have no effect. Both are, after all, visual indications of the relationship between two things. I believe, however, that visual similarity is a much stronger indication and will not be as ineffectual as distance.
¹ I've been using "aid" and "gauge" interchangeably because they are strongly linked but to be clear: the gauge is the meter reporting information while the aid is the system offering a diagnosis based on the information.
² The two gauges will be assigned accuracies randomly.
³ I settled on this and the 70% value because those are what Rice and Geels used in Using System-Wide Trust Theory to Make Predictions About Dependence on Four Diagnostic Aids. I don't have a reason to diverge from their approach, so it's best to stick to what is shown to work.
Works Cited
Rice, Stephen, and Kasha Geels. "Using System-Wide Trust Theory to make Predictions about Dependence on Four Diagnostic Aids." The Journal of General Psychology, vol. 137, no. 4, 2010, pp. 362-75. eLibrary, https://explore.proquest.com/elibrary/document/855993988?accountid=51266.
In this journal I will outline my research for the design of my experiment. I got inspiration from Rice and Geels in Using System-Wide Trust Theory to Make Predictions About Dependence on Four Diagnostic Aids. This is a study published in the Journal of General Psychology that is a follow-up to System-Wide Versus Component-Specific Trust using Multiple Aids by Rice and Keller, a study in the same journal which I have previously mentioned.
In the original study they had operators monitor two gauges to diagnose system failures. One gauge was 100% accurate while the other one was not. They demonstrated that trust in the perfect gauge was pulled down by the inaccuracy of the other one. They used this result to support the idea that operators trust aids as a system and not on an individual component level. This study was incomplete, however, leaving some questions unanswered.
This new study by Rice and Geels is very similar to the previous one: operators diagnose system failures by looking at gauges. There are some key differences, however:
They manipulated whether or not the operators received feedback on their diagnoses
They manipulated whether or not operators received prior information about the accuracy of the aids
There were four gauges, allowing them to test the effects of physical distance between the perfect and faulty gauge
This study only had operators reading gauges; they did not need to perform a second cognitive task like in the first study
They used automation misses (aid fails to alert when it should've) instead of false alarms (aid alerts when it shouldn't have)
These new factors were tested to see how they affect the spillover effect, and the results were conclusive. In all cases, system-wide trust was prevalent. In addition, physical distance between the aids was found to have no effect on this.
This study is valuable to me because it showcases an experiment that I can base mine on and provides results that will be helpful in my endeavor. Showing that all the above factors do not affect system-wide trust means that I can incorporate any of these into my design and not worry about invalidating the experiment. In addition, the fact that distance is not a strong factor means that I do not have to test it.
For my experiment I plan on having two gauges, one 100% accurate and the other 75%¹ accurate. Operators will not be provided with prior information on the accuracy of these gauges (besides the fact that they are not necessarily 100% accurate). They will, however, receive feedback on their diagnoses. For one group, the two gauges will be visually identical, and for the other, the two will be distinct. In addition, there will be no second cognitive task and automation misses will be exclusively used². My hope is that this will create an effect where the two identical gauges are more thought of as the same system while the two distinct gauges are thought of more as separate. If this is the case, I should observe that operators disagree more consistently with the 100% accurate gauge that is paired with an identical, faulty counterpart than the other one.
This paper has given me many strong ideas for my experiment, and I am looking forward to beginning with data collection.
¹ This is strongly subject to change as it is not clear to me right now what all the effects of a given accuracy value are.
² This is also not set in stone as I need to do more research on the different effects of automation misses and false alarms.
Works Cited
Rice, Stephen, and Kasha Geels. "Using System-Wide Trust Theory to make Predictions about Dependence on Four Diagnostic Aids." The Journal of General Psychology, vol. 137, no. 4, 2010, pp. 362-75. eLibrary, https://explore.proquest.com/elibrary/document/855993988?accountid=51266.
Keller, David, and Stephen Rice. "System-Wide Versus Component-Specific Trust using Multiple Aids." The Journal of General Psychology, vol. 137, no. 1, 2010, pp. 114-28. eLibrary, https://explore.proquest.com/elibrary/document/213638805?accountid=51266.
How and to what extent is simulated aid distrust spillover affected by the similarity of the aids to each other across different metrics?
For this question, I would like to pursue an open inquiry path. This is because I plan on conducting a study similar to the ones I've researched for previous questions. I think producing something new like this is more conducive to an open approach, and I want to be able to control the journals/SDA this month with my planned study in mind.
In the past my most successful SDAs have been those which explained topics I researched and contained interactive elements such as my October and November ones. This time I will be taking a slightly different approach. I plan on writing up a paper of scholarly nature that presents the results that I have found on my own.
I think open inquiry will set me up for success for this month because I will be trying to produce a new result and will need a lot of control to do so. My plan is to write journals about my process for designing the experiment and about my hypotheses then present what I found in an SDA at the end of the month. I plan on starting the data collection for my SDA soon as it should be going on concurrently with my journals.
Prior to this assignment I had been using databases and scholarly articles, so it wasn't too difficult for me to incorporate them. Reading scholarly articles in general is a bit more difficult than other research-based texts because the language is complex and takes real effort to understand fully. In addition, they often go into very detailed analysis which may not be easily understood. The articles I read contained a lot of statistics, and I mostly had to skim over those parts to focus on what was relevant to my research. I think being limited to these types of texts was helpful to my progress because it forced me to think critically about the issue and use my own words when explaining it. I wasn't spoon-fed any information. It also allowed me to take research from more reputable sources and made me more curious about the topic through links to other tangentially related papers. As I used more scholarly papers, I refined my process for finding them. I would usually use some keyword search that I knew was related to what I was researching—sometimes discovered in previous papers—then check the introduction, abstract, and conclusion of the top few results to get an idea of what they were about. If nothing was really what I was looking for, I would try again with different keywords. This generally worked well for me, and I was able to find most of my sources fairly quickly. Something interesting I noticed while reading scholarly articles was that the same authors' names would pop up in the papers and citations of the field I was reading about. It shows how much work some people put into studying their fields deeply.
The most challenging part about this SDA was figuring out how to communicate the information I had gained. It was useful to have an essential question because it made it easier to figure out what was truly relevant. The essential question I started with did mostly get answered, as I found examples of problems with unreliable simulations and the factors that affect them. However, I didn't really focus on the part about how prevalent the issue is, though I do think the large volume of research I found on the topic speaks to that pretty well. I think this SDA did a good job of meeting my goals. I was able to effectively communicate a complex topic in a fairly simple way. Out of the five C's I think my strongest is communication because I can efficiently filter a lot of complex information and convey the most important points. The weakest is probably collaboration as most of my work has been on my own up to this point. I think what I've learned from this SDA will help me next month to find proper sources to investigate my idea.
My proudest moment from this midterm is being able to tie together the three different articles I read to paint a cohesive picture. It brought a framing to the first video that would not have existed otherwise. In terms of coordination, I think things have been going pretty well, and I do not have any major concerns.
For March, my question will be: How and to what extent is simulated aid distrust spillover affected by the similarity of the aids to each other across different metrics? This was a question that came up in a paper I read, and I think it would be interesting to delve into a specific topic like this and conduct my own study, though it will take quite a bit of work. I am looking forward to the March SDA.
I realized that my last three articles were largely focused on psychology (a crossover I was not expecting), so I decided to choose a paper more focused on the technical aspects of simulations. This new article was published in the Bulletin of the American Meteorological Society, which is the journal of the American Meteorological Society, dedicated to the study of atmospheric, oceanic, and hydrological processes. The paper was authored by Tim Palmer and Antje Weisheimer. Both Palmer and Weisheimer conduct research at Oxford University in the Predictability of Weather and Climate research group.
Palmer and Weisheimer addressed the problem of assessing the reliability of simulations in situations where present predictions can only be checked by future outcomes. Weather predictions of events happening in the future cannot be verified until that future arrives. This is problematic because "we build trust in climate models by critically evaluating their performance in present-day or past climate conditions" (Palmer & Weisheimer). It is difficult to develop trust in a model that cannot be verified quickly. This is especially bad in the case of forced responses: the ways a system responds to outside forces acting on it. This paper mainly mentioned anthropogenic forcing: external forces caused by humans. In this paper, the authors propose a method of assessing the unreliability of a model due to external forces without having to observe the effects of the forces in reality.
The solution proposed is the use of initial-value ensemble forecasts. These are wide ranges of predictions made by a model with a multitude of different input states. To demonstrate the problem and the solution, the paper uses a simple analogy (see Fig. 1). This illustration shows two different systems of a ball being dropped onto a ramp and probabilistically going into either the left or right cup. The top system is the simpler case and can be thought of as the model. The bottom case can be thought of as reality. First, we must consider the case in which the fan is off, having no effect on the system. The model will predict that each cup has a 50% chance of being the outcome because when the ball falls it can roll either way. Reality concurs with this prediction¹. Now, when the fan is on, the model will predict that the left cup has a much higher chance of being the outcome because the fan blows left. However, reality is much different in this case; the fan blowing left causes the ball to fall into the right cup with a much higher probability. "Compared with reality, the model responds incorrectly to the applied forcing" (Palmer & Weisheimer). This demonstrates how a simplistic model can seem accurate but fail when an external force is applied.
To understand why an initial-value ensemble allows for the assessment of the reliability of a simulation under unknown forces, consider Fig. 2. Here a multitude of drop locations are labeled on the two systems. First consider A, B, and C. In these cases, the model would agree with reality. In situation D, however, the model would predict that the marble always falls into the left cup while reality would say that it could fall into either one with equal probability. "Some of the time the initial conditions will lie on the top channel; on other occasions, the initial conditions will lie on the bottom channel. Our imperfect model, on the other hand, is unable to discriminate between these situations" (Palmer & Weisheimer). Situations E and F are worse: the model will predict the opposite of reality. This shows that analysis across the input space can help determine the reliability of a model in the face of external forces.
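The logic of Fig. 2 can be sketched numerically (the probabilities below are my own illustrative inventions, not values from the paper):

```python
# Probability of the ball landing in the LEFT cup for six initial drop
# positions, mirroring Fig. 2 qualitatively: the model matches reality at
# A, B, and C, is overconfident at D, and predicts the opposite at E and F.
model   = {"A": 1.0, "B": 0.5, "C": 0.0, "D": 1.0, "E": 1.0, "F": 0.0}
reality = {"A": 1.0, "B": 0.5, "C": 0.0, "D": 0.5, "E": 0.0, "F": 1.0}

def ensemble_discrepancy(model: dict, reality: dict) -> float:
    """Mean absolute difference in predicted probabilities across the
    ensemble of initial conditions. A nonzero value flags model
    unreliability without ever having to observe the forced response."""
    return sum(abs(model[k] - reality[k]) for k in model) / len(model)
```

A single drop from position A or B would make the model look perfect; only sweeping the whole ensemble of initial conditions exposes the disagreement at D, E, and F.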
From this paper, I have learned more about why simulations can be unreliable and about the nature of potential unreliability². In addition, it has brought to light a method of understanding the level of unreliability of a simulation.
One real world problem that this paper is highly relevant to is climate change predictions. They often fall into the category of models which cannot be easily verified in the present and which have strong external forces, often those which are anthropogenic.
A potential limitation with this paper is the simplicity of the model. I am wondering if this is really how real-world systems act, and I have a couple questions about this. What force, model, and real system exist for which an ensemble of input values cannot distinguish the model and reality, but the force can (if such a thing exists)? What would be required to prove that what was mentioned earlier exists (would a simple analogy be sufficient)? I do see how ensemble forecasts can test the entire input space and find out where the discrepancies are, but I wonder if this is guaranteed to work even in theory.
Fig. 1
Fig. 2
To find this paper I used the ProQuest database with the search term "simulation unreliability." I had tried searching with the term "operator" (from a previous paper) but it failed to bring up relevant results.
New Terms
forced response
anthropogenic forcing
initial-value ensemble forecasts
¹ What does it mean for reality to be probabilistic? There is only one outcome, right? This simplified model is actually saying that with some initial conditions, reality cannot be predicted with certainty. It is important for simulations to be able to model this. "Reliability diagrams do not only test the ability of models to capture the predictable features of the event under consideration, they also test the ability of the model to predict reliably the situations where there is no predictable signal, that is, by producing an ensemble probability equal" (Palmer & Weisheimer). A reliable simulation has the ability to predict its own unreliability.
² Simulations essentially should be honest about how reliable they are. If they cannot actually predict reality, they should not give an absolutely certain result.
Works Cited
Palmer, T. N., and A. Weisheimer. "A Simple Pedagogical Model Linking Initial-Value Reliability with Trustworthiness in the Forced Climate Response." Bulletin of the American Meteorological Society, vol. 99, no. 3, 2018, pp. 605-614. eLibrary, https://explore.proquest.com/elibrary/document/2117848457?accountid=51266, doi:http://dx.doi.org/10.1175/BAMS-D-16-0240.1.
The previous two articles focused on the relationship between human operators and automated aid systems in decision making environments. The first looked at the effects of unreliable aids while the second looked at the effects of operator bias. This paper studies the effect of multiple automated aid systems and unreliability across them. It tries to answer the question of whether operators see each aid separately or see all of them as one system.
To find this paper I used the ProQuest database search. The article was published in The Journal of General Psychology, which publishes papers on experimental psychology. One author of this paper is Dr. Stephen Rice, a professor in the Human Factors and Behavioral Neurobiology departments at Embry-Riddle Aeronautical University. Some of his areas of expertise include autonomous systems and consumer perceptions. He has won awards for influential papers in the field of human factors. David Keller is a Human Systems Engineer at the Naval Surface Warfare Center.
This article is different from the last because it introduces multiple aids. In the real world, operators are often presented with multiple aids for different related systems. "Examples include aircraft cockpits and nuclear power plant control rooms" (Keller & Rice). In these cases, the different effects of the aids cannot necessarily be treated independently of each other. This paper looks at how reliability in individual component aids impacts the perception of each aid and the system as a whole.
In this experiment, participants were told to interpret readings on a screen to diagnose a "system failure." To do this they were given two gauges (designed intentionally to be difficult to read), each with an audio cue that sounded if the gauge was likely reading a system failure. One gauge was always 100% accurate while the other was variably accurate (100%, 85%, or 70% for different participants). What they found is that sensitivity to system failure was much more impacted by the reliability of the possibly faulty gauge (p < .0001) than by the difference between the gauges themselves (p = 0.644). Essentially, participants' trust in the gauges was much more dependent on the reliability of the faulty gauge than on which gauge they were using. This supports the idea that operators trust in terms of an entire system as opposed to individual components. This means that when understanding the reliability of simulations, the entire system needs to be taken into account. Components are not independent.
One limitation of the study is the treatment of operator trust as binary (system-wide vs. component-specific) when in reality it is likely to lie on a spectrum. The nature of the simulation itself may have contributed to the black-and-white results. The two gauges were identical and placed next to each other, which may have made the operator more inclined to believe that they were part of the same system. I think an interesting study could be conducted on the separation of two information automation aids and their treatment as part of the same system. Another limitation is the fact that the study only investigated automation false alarms (when the aid gives a false alert) and not automation misses (when it fails to raise an alert). "It is well-known that [automation false alarms] generally have a more detrimental effect on operator trust than do automation misses" (Keller & Rice).
The problem presented in this paper was that it was unclear whether or not multiple aid systems had their trust perceived as a whole or on a component-basis by operators. It has made a strong case that systems are seen as a whole, so it has contributed to solving this problem. There still must be further research to find in what situations this is true and to what extent.
New Terms:
system-wide trust theory
component-specific trust theory
automation false alarms
automation misses
Works Cited
Keller, David, and Stephen Rice. "System-Wide Versus Component-Specific Trust using Multiple Aids." The Journal of General Psychology, vol. 137, no. 1, 2010, pp. 114-28. eLibrary, https://explore.proquest.com/elibrary/document/213638805?accountid=51266.
What factors affect reliance on simulations, what problems can they cause, and how prevalent is the issue?
My article for this journal post was authored by Poornima Madhavan and Douglas A. Wiegmann. It was published in Human Factors, the journal of the Human Factors and Ergonomics Society. I found it in the ProQuest eLibrary database using the search terms "simulations" and "reliance."
My last article was about overreliance on simulations. This one is about the opposite: undue distrust of simulations. It explores an effect called "cognitive anchoring," in which a person becomes attached to their first hypothesis and rejects others that come after it. The human mind anchors to what it thinks of first and is reluctant to consider other possibilities. This is relevant in the field of automated diagnostic aids because human operators can anchor to their own hypothesis and reject that of the simulation. The example given in the paper was luggage inspectors "[formulating] an opinion concerning the likelihood that a particular passenger might be dangerous and then [basing] their interpretation of the automated alarm based on this hypothesis"¹ (Madhavan & Wiegmann). The problem is especially bad in this case because of how little information is given to the operators: they have to fully trust the detector, as it gives them no reasons for its alarm (this is like the strong decision automation from the last paper).
In this study, two different groups used automated aids to diagnose pump failures. The first group would give a diagnosis only after consulting the aid (non-forced anchor) while the other would give a diagnosis both before and after consulting the aid (forced anchor). Within the non-forced anchor group, "an increase in the tendency to prediagnose system failures was associated with a significant decrease in agreements with the aid" (Madhavan & Wiegmann). This means that when participants formed their own opinion before consulting the aid, they were more likely to disagree with it, demonstrating cognitive anchoring. Within the forced anchor group, a subset would make their diagnosis public to the group before consulting the aid (high self-anchor group). It was found that "overall agreement probabilities did not differ significantly between the forced anchor and high self-anchor groups" (Madhavan & Wiegmann). This showed that cognitive anchoring was not affected by public commitment.
This paper is relevant to my question because it provides a new perspective on it. The last paper dealt with unreliability in simulations while this one deals with cognitive bias in humans. It also contains evidence for the inverse: harm due to under-reliance on simulations. In some cases, people are too distrusting of simulations while in others they are not trusting enough. I had not considered this until reading the paper, so I slightly modified my question.
The field of simulations can benefit from this paper because it demonstrates the cases in which distrust of models occurs in operators who are making use of them as aids. Even the best simulation is useless if it is ignored.
New Terms:
Cognitive anchoring
Opaque systems
¹ This brings to light another benefit of using simulations: they are not subject to the natural biases of the human mind.
Works Cited
Madhavan, Poornima, and Douglas A. Wiegmann. "Cognitive Anchoring on Self-Generated Decisions Reduces Operator Reliance on Automated Diagnostic Aids." Human Factors, vol. 47, no. 2, 2005, pp. 332-41. eLibrary, https://explore.proquest.com/elibrary/document/216463125?accountid=51266.
In what ways can unreliability in computer simulations cause harm, and how prevalent is this issue? This is a "need to know" question formulated from the topic of reliability of simulations discussed in my previous post. I chose this issue because I thought it was ironic how a tool used to give us insight into our world can potentially cause us to misunderstand it greatly. As I explained in the previous post, simulations can create overconfidence and can be used to manipulate. This question is important to the general public because it brings to light the use of computer simulations to gain undue trust. Those who are aware of this are less likely to fall victim to it. Developers of and people working with simulations can also gain value from this question. In cases where simulations create overconfidence, how can we understand the true limits of the tools at our disposal? I think this topic is preferable to the other two I mentioned in my post because it appeals to a wider audience and is strictly about simulation. Optimization is a subject that is really only pertinent to experts. Talking about the universe being simulated can quickly become a metaphysical discussion.
Over winter break I am looking forward to relaxing and enjoying the holidays with my family as I have been doing a lot of work this school year. I also look forward to finishing the last of my college applications.
Reliability of computer simulations.
Optimization of computer simulations.
Living in a simulation.
Although computer simulations have proven to be a valuable resource for predicting outcomes in our environment, they have limitations which prevent many use cases. In addition, the false sense of security provided by some simulations can prove extremely dangerous. Simulations can also be intentionally misleading in order to win trust from others: highly accurate computer simulations can seem very impressive, but this impression can be twisted. This problem has great relevance because over-trusted simulations can lead to damage when something goes wrong in the real world that was not predicted in the simulations. This problem affects everyone; all people living in a modern society are subject to the results of simulations, such as on plane rides and in medicine. In addition, we are living in an era where misinformation is a massive problem, and simulations are sometimes used manipulatively to exaggerate the facts. For example, Zachary Lim explains how a specific investment company advertises its advanced stock market simulations in a way that does not fully reflect the true accuracy of the model. They convince people that their stock market simulation will predict gains over time with an extreme level of accuracy, which is simply not the case.
A problem faced by some types of simulations is time constraints. Simulations can operate on large amounts of data and so can consume very large amounts of computing resources. Often, many simulations need to be run in order to thoroughly test a process, so it is most convenient for them to be as fast as possible. This problem is important because a simulation that takes too long to execute is not feasible to use. The issue is mainly pertinent to people who research the development of performance-intensive simulations. At the University of California San Diego, engineers have developed special computers to run simulations (such as fluid simulations) that rely on trillions of parameters.
As we have developed more complex simulations which mirror reality to an ever-greater extent, a thought has begun to manifest in the minds of many: is our world just another simulation? Some claim that this conclusion is completely logical while others dismiss the view as pseudoscience. What logic supports this idea, and is it sound? Is the question of whether or not we live in a simulation even useful? The main relevance of this problem is general discovery, which makes it really only pertinent to the people who discuss it. Many prominent figures in science, such as Neil deGrasse Tyson and Elon Musk, have claimed that we are likely to be living in a simulation. Physicist Sabine Hossenfelder argues that these claims are based on weak evidence and are unscientific.
Works Cited
Hossenfelder, Sabine. “Why the Simulation Hypothesis Is Pseudoscience.” Big Think, 30 Sept. 2021, https://bigthink.com/thinking/why-the-simulation-hypothesis-is-pseudoscience/.
Lim, Zachary. “A False Sense of Security.” Medium, Towards Data Science, 27 Oct. 2020, https://towardsdatascience.com/a-false-sense-of-security-when-investment-firms-tell-you-they-ran-1000-simulations-11673ffc9572.
University of California San Diego. "Engineers Develop New Methods to Speed up Simulations in Computational Grand Challenge." ScienceDaily, 26 Mar. 2015, www.sciencedaily.com/releases/2015/03/150326152236.htm.
I think that my biggest strength in EMC is being able to find complex topics and simplify them for a more general audience. Many of the papers I read contain extremely complex subject matter, and I try to explain it in a simple way. My biggest weakness is probably my procrastination problem. I do research throughout the process, but I feel like I am never motivated to actually start putting something together until the deadline is staring me in the face.
I spent about four days working on this assignment, and I was able to use the time I carved out effectively. This, I believe, was partially due to the fact that I was allowed to choose my own due date, which made me feel more accountable: if the SDA was not finished on time, it would be entirely my fault because I said I would be able to do it by a specific date. I chose to do a HyperDoc because I felt it was a good idea to allow the information to present itself through an interactive medium. My SDA was a showcase of different types of simulations developed in academia, so allowing the reader to explore these simulations through external links was useful. Picking my own format was meaningful because it forced me to ponder the nature of the information I was presenting and decide on the most ideal format for it. If I were to do an SDA like this again, I would probably link more external and internal resources to improve it; a web of links is the core of a HyperDoc.
Finding resources for my SDA was surprisingly easy. I was initially worried that I would have to look very deep to find any academic sources for my assignment. However, using the ProQuest database and searching with keywords allowed me to find relevant studies quickly. My searches specifically queried sources that were about the use of either stochastic or deterministic approaches to model some process. Incorporating my sources into my SDA was not difficult at all because they were the core of the assignment. My SDA was entirely focused on examples of the work done with different types of simulations and explaining their relevance. I think I communicated my sources effectively because I included direct quotes and explained what they meant and why it was important. I think the directions on this assignment and the rubric were clear. I had both open while checking my work and was able to easily tell if I had met the requirements.
What examples can you find to exemplify the differences between types of simulations and their benefits in specific cases? This is my essential question for the month. I chose this question because I think it would be valuable for me to take a step back from the theory and look at some real-world applications of what I have described. Looking at actual examples of a concept can help frame it and make it easier to understand. I landed on this question after thinking about the fact that my SDA was purely theoretical. The two example simulations I provided were my own contrived cases. Looking into simulations in actual use would prove useful.
This question is important to me because learning about simulations means learning about practical application as well as theory. To get a full understanding of the subject, I think I will need to look into real-world examples. This question is important to others developing and working with simulations because references to other simulations could help them with their own projects. Examples of how the difference between stochastic and deterministic simulations plays out in actual development could inform others on what they should be doing. I ended up choosing the ProQuest eLibrary database to find my sources. I chose the ones that I did because they are examples of a certain type of simulation being chosen with some justification for it. This will help exemplify the benefits I named for the types of simulations.
For this month's SDA I plan to create a HyperDoc. It will be an interactive document that showcases the answer to my question in a non-linear fashion. I plan on using my sources as examples of how and why different types of simulations are used.
Works Cited
Györgyi, L., Field, R. A three-variable model of deterministic chaos in the Belousov–Zhabotinsky reaction. Nature 355, 808–810 (1992). https://doi.org/10.1038/355808a0
Juricke, Stephan, Tim N. Palmer, and Laure Zanna. "Stochastic Subgrid-Scale Ocean Mixing: Impacts on Low-Frequency Variability." Journal of Climate, vol. 30, no. 13, 2017, pp. 4997-5019. eLibrary, https://explore.proquest.com/elibrary/document/1924739130?accountid=51266, doi:10.1175/JCLI-D-160539.1.
Miina, Jari, and Jaakko Heinonen. "Stochastic Simulation of Forest Regeneration Establishment using a Multilevel Multivariate Model." Forest Science, vol. 54, no. 2, 2008, pp. 206-219. eLibrary, https://explore.proquest.com/elibrary/document/197731856?accountid=51266.
The question for my last assignment was: To what extent is the classification of simulations as stochastic and deterministic dichotomous? I learned that this categorization is completely black and white, with an unambiguous definition for each. Now that I know this is the case, I can look for specific examples in the real world. Drawing from the Application section of the Higher Order Thinking Questions, I think the best question for this month's research is: What examples can you find to exemplify the differences between types of simulations and their benefits in specific cases? This is an important question because I have already established the different types of simulation that exist and the rough uses of each. Now I can look at specific cases where one was chosen over the other and explain specifically why. I have found some such examples and cited them below.
Works Cited
Juricke, Stephan, Tim N. Palmer, and Laure Zanna. "Stochastic Subgrid-Scale Ocean Mixing: Impacts on Low-Frequency Variability." Journal of Climate, vol. 30, no. 13, 2017, pp. 4997-5019. eLibrary, https://explore.proquest.com/elibrary/document/1924739130?accountid=51266, doi:10.1175/JCLI-D-160539.1.
Miina, Jari, and Jaakko Heinonen. "Stochastic Simulation of Forest Regeneration Establishment using a Multilevel Multivariate Model." Forest Science, vol. 54, no. 2, 2008, pp. 206-219. eLibrary, https://explore.proquest.com/elibrary/document/197731856?accountid=51266.
Györgyi, L., Field, R. A three-variable model of deterministic chaos in the Belousov–Zhabotinsky reaction. Nature 355, 808–810 (1992). https://doi.org/10.1038/355808a0
I spent two days working on this assignment. I made the mistake of underestimating the amount of work I would need to do on my college applications, so I asked for an extension. As I continued work on my college applications, I realized that I had made a severe underestimation, so I only had two days to work on the assignment instead of the planned four. During the time I was able to carve out for this assignment, I remained highly focused and didn't get distracted by anything, partially because I had so little time.
I chose to make a video for this assignment because I felt it was the most effective medium to get my point across. It allowed for visual animation and demonstration of my simulations as well as my commentary on them. I do not think technology was a limiting factor because I had all the needed resources available to me. However, my familiarity with the technology definitely held me back. The best time to learn new technologies is definitely not while completing a large project under time pressure, yet I was using sound, animation, and video editing software that were all new to me. I later realized that I messed up in the sound department due to a setting being off.
What I would do differently next time is plan out my time more wisely. This would allow me to spend more time on parts of the assignment and polish it up. Despite this, I am quite happy with my final product because the overall quality is good. I am inspired to think that I could pull off something even better by planning the assignment out more carefully.
The September Assignment helped me figure out how to convey information in an interesting way. I think it was useful to think of it somewhat like a story. The "know you know," "think you know," and "don't know" assignments helped me narrow down specifically what I wanted to look into. I think they were very useful for this assignment. The directions of the assignment and rubric were clear to me as well.
What I don't know:
I don't know to what extent the characterization of simulations as stochastic and deterministic is non-dichotomous.
I don't know the exact relationship between chaos and randomness in the context of simulations.
I don't know how interactive simulations, like my simulation of TianliBot, are classified.
I don't know how extensively different simulations are verified, as a small edge case has the potential to break an entire system.
I don't know why stochastic simulations are good at modeling processes that are deterministic in the real world.
I don't know how differential equations¹ relate to simulations, as I have heard of them a lot during my research.
I don't know to what extent and how artificial intelligence is used in different simulations.
I don't know what other types of simulations exist, either as subcategories of or alongside stochastic and deterministic methods.
I don't know where physical simulations² fit into this.
I don't know what the main causes are for seemingly correct models breaking down in the real world.
What I need to know for my October assignment is the extent to which stochastic and deterministic simulations are non-dichotomous. I think this idea is closely related to two other "don't knows" I listed: the relationship between chaos and randomness as well as why stochastic simulations model deterministic processes. There seems to be a very nuanced relationship between randomness, chaos, and determinism. I think these questions are important to answer first because some of the other ideas I listed build off of them. I keep wondering about the categorization of simulations and the different attributes of these groups. I think it is most important to first determine the nature of the classification itself. Is it black and white? What is the relationship between different categories?
¹ Not knowing what a differential equation is certainly doesn't help.
² By physical simulation I mean a predictive model of a real-world process by something other than a computer. A famous example, from before computers were widely usable, is William Phillips' hydraulic model of macroeconomics: he invented what was essentially a simulation of the economy using water flow dynamics.
I am intending to investigate the differences between stochastic and deterministic simulations and when they are most effective.
I think I know that computer simulations can be broadly classified into two types of processes: stochastic and deterministic. Being stochastic means that a simulation depends on random inputs, so knowing the initial state does not allow you to predict the final state. Deterministic simulations, on the other hand, do not involve any randomness, so their end state can be predicted from the initial state. I think that deterministic simulations can still be chaotic, where small changes in input can produce radically different changes in output, making them seem random. I think deterministic simulations are best employed when the process being simulated has a known set of concrete rules, like in a classical physics simulation. If only a rough set of heuristics are known for a process—like in a simulation of natural selection—I think a stochastic approach is more effective. I think there is a large distinction between simulations of natural vs. manmade processes. My simulation of Discord and TianliBot, for example, was a simulation of a manmade process. I think these non-natural simulations, called emulations, are always deterministic because if something is manmade it must have a known ruleset. Emulations, I believe, are considered a subset of simulations.
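The distinction I'm describing can be sketched in a few lines of Python. This is a toy example of my own (not drawn from any source): the logistic map is fully deterministic yet chaotic, while a simple random walk is stochastic.

```python
import random

def logistic_map(x0, r=4.0, steps=50):
    """Deterministic simulation: the same x0 always yields the same result,
    yet at r = 4 the map is chaotic, so tiny changes in x0 diverge rapidly."""
    x = x0
    for _ in range(steps):
        x = r * x * (1 - x)
    return x

def random_walk(steps=50, seed=None):
    """Stochastic simulation: the outcome depends on random draws, so the
    final state cannot be predicted from the initial state alone."""
    rng = random.Random(seed)
    position = 0
    for _ in range(steps):
        position += rng.choice([-1, 1])  # random step up or down
    return position

# Determinism: identical inputs always give identical outputs...
assert logistic_map(0.3) == logistic_map(0.3)
# ...but chaos makes nearby inputs end up far apart.
print(logistic_map(0.3), logistic_map(0.3000001))

# Stochastic runs differ from each other unless the random seed is fixed.
print(random_walk(), random_walk())
```

This also illustrates the point about chaos making a deterministic simulation "seem random": repeating a run reproduces it exactly, but a microscopic change to the starting value produces an unrelated-looking result.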
I know I know that computer simulations are frequently used in finance to test the reliability of simpler predictive models, like equations. One such example is the Capital Asset Pricing Model equation, which predicts returns on an investment based on its risk (French). Simulations of the market can be used to test changes to a model and determine if they improve it. Stock market simulations often employ a stochastic approach because concrete rules are hard to come by (Krishna). Simulations can also be used to predict outcomes of sports games (AccuScore); many factors can play into these outcomes, so simulating them accurately is difficult. Game predictions are frequently used in fantasy sports. One everyday use of simulations is in weather prediction (Zwieflhofer). Weather models work by using partial differential equations to simulate changes in attributes of the air like temperature, pressure, and density. These simulations require vast amounts of computing resources because of the chaotic nature of our atmosphere.
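The stochastic approach to stock prices mentioned above can be sketched in Python using geometric Brownian motion, the technique from the Krishna source. This is my own minimal sketch, and the parameters (starting price, drift, volatility) are made-up values for illustration, not from any real model.

```python
import math
import random

def simulate_gbm(s0, mu, sigma, days, seed=None):
    """Simulate one stock-price path with geometric Brownian motion.
    Each day the price is multiplied by a random factor, so every run
    of the simulation produces a different path."""
    rng = random.Random(seed)
    dt = 1 / 252  # one trading day as a fraction of a year
    price = s0
    path = [price]
    for _ in range(days):
        z = rng.gauss(0, 1)  # random "shock" for this step
        price *= math.exp((mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z)
        path.append(price)
    return path

# Because the model is stochastic, we run it many times and look at the
# distribution of outcomes (made-up parameters: $100 start, 8% drift,
# 20% volatility, one trading year).
finals = [simulate_gbm(100, 0.08, 0.20, 252)[-1] for _ in range(1000)]
print(sum(finals) / len(finals))  # average final price across all runs
```

The key design point is that no single run means anything on its own; it is the spread of many runs that approximates the range of possible market outcomes.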
Works Cited
"Home Page." AccuScore, 19 May 2016, https://www.accuscore.com/.
French, Jordan. “The One: A Simulation of CAPM Market Returns.” The Journal of Wealth Management, vol. 20, no. 1, 2017, pp. 126–147., https://doi.org/10.3905/jwm.2017.20.1.126.
Krishna, Reddy, and Clinton Vaughan. “Simulating Stock Prices Using Geometric Brownian Motion: Evidence from Australian Companies.” Australasian Accounting, Business and Finance Journal, vol. 10, no. 3, 2016, https://doi.org/10.14453/aabfj.v10i3.3.
Zwieflhofer, Walter, and Norbert Kreitz. Developments in Teracomputing: Proceedings of the Ninth ECMWF Workshop on the Use of High Performance Computing in Meteorology. World Scientific, 2001.
I spent three days working on my September Assignment, which was due Wednesday, September 29th. I got the idea while talking to my fellow EMCer, Abhi. That Sunday he said that he was going to be writing a story around a Discord bot he created called Mr. Bott. I was instantly reminded of another Discord bot I had created not long before: TianliBot. We thought of an interesting idea where the story was told through an interface mimicking Discord's. The next day I brainstormed ideas for exactly how that would work. On Tuesday I wrote out the script, and on Wednesday I programmed the whole thing. Overall, I think I planned out my time well and was not forced to rush any aspect of it.
I think I put a good amount of effort into this project. It took me a while to plan it out and write the story elements in a manner that made sense. The programming aspect also posed some technical challenges that took some working out. In terms of creativity, I think the idea was unique and worked well with the story being told. The concept itself was difficult to understand, so I spent a while figuring out the best way to communicate it. I think I did fairly well because everyone I showed it to was able to understand it quickly. One specific thing I wish I had done was to also include some simulation of the original behavior of the bot. More generally, I could've tried to finish the assignment a day or more before it was due so that I had time to work out any technical issues that came up.
I think the directions on the assignment were clear enough, and the rubric did help me understand what I needed to include. The idea originally came to me while speaking with Abhi, and I also got feedback on the project from him and Ram, another fellow EMCer. I also showed my assignment to a non-EMC friend.
One of my favorite stories is that of the manga Death Note. It is the story of an academically excelling but bored high school student named Light Yagami. One day he finds a notebook which has fallen on the ground. It is titled "Death Note" and says that any human whose name is written in it will die. At first he is skeptical, but after some testing (on horrible people he considers deserving of death) he determines that the notebook is very real. Light decides to secretly write the names of nasty criminals in the notebook to rid the world of evil. He assumes the role of a god of sorts, deciding alone who is righteous enough to live and who deserves death. In his purge of wrongdoing, he attracts the attention of international police organizations, which begin to suspect that one person is behind the mass murder of criminals. At the head of these organizations is a mysterious figure known only as L, who leads an investigation to find the killer and bring a stop to his massacre.
This is a complex story with many elements that make it stand out. One major aspect is the plot. The story contains many complex mind games played out between L and Light. Instead of giving the reader some contrived exposition, the story lets its plot explain these more naturally: it allows a complex scene to play out, then explains what happened afterwards. The reader gets hooked on an interesting scene that they do not completely understand, then gets blown away by the explanation of what happened¹. This explanation is often cleverly hidden inside the plot, such as in conversations between Light and Ryuk (a god of death who originally owned the notebook).
Another captivating aspect of the story is the character development. Light starts with good intentions, but his delusions lead him down a path of evil. It is honorable that he starts out wanting to make the world a better place, but he goes about it in the wrong way and ends up turning evil himself. He even kills innocent people who get in his way, such as law enforcement. This leads to an interesting development where many police officers want to quit the case because their lives are in danger: their names and faces are publicly available information, putting their lives at extreme risk for going against Light. The chief of the police force is Light's father, and he is determined to catch the killer at all costs; he is put under constant pressure whenever his son comes under suspicion. Some police officers are not even sure that going after the killer is the right thing to do at all, since global crime rates decrease as people fear the supreme judgement that can come upon them, so perhaps the killings should be allowed to continue. The characters are presented with moral quandaries like this, and their reactions are very interesting.
A single-sentence hook for this story: "An ordinary student drops global crime rates, is being hunted by international police, and forces the world to answer difficult questions of morality all by killing with a notebook."
¹ Unfortunately I am unable to do any of these scenes justice so I will not attempt to describe them here. The true experience can only really come from actually reading the books.
One remarkable and important application of computers today is the simulation of processes in the real world. Simulation is extremely valuable for predicting outcomes in our environment, such as the weather or protein shapes. For a very long time I have been fascinated by these programs, especially because they often contain some captivating visual component. But more than that, the idea of a computer playing out reality—as opposed to using an equation¹—is very interesting. Other worlds being simulated on a computer is what led to the idea of our universe being a simulation. Obviously the idea is quite extreme, but this feeling of a loss of reality is what hooked me. Diving deep into the topic, I learned of the many applications of computer simulations and how many things rely on them.
From the little research I've done in the past, I know that computer simulations are used daily in a wide variety of fields to predict the future. The rise of machine learning has allowed this to be applied where it was practically impossible previously.² Just as there are systems that cannot be modeled with an equation, there are systems which cannot be easily modeled with traditional algorithms, making a machine learning approach necessary. I intend to focus my research on simulations using traditional algorithms, a few of which I have created in the past.
I can think of a few important questions on this topic. What different types of computer simulations are there and what are their applications? It is clear that not all of these simulations fall in the same category, and they are used for different types of systems. Classification can be helpful in researching any topic. How reliable are computer simulations in the real world and what factors affect this? These models can only be reliable to a certain degree, and I would assume there are a variety of variables affecting this. How are different types of computer simulations implemented? I want to dive a little bit into implementation because I am planning on creating a few myself.
¹ I think it would be valid to point out that an equation is another type of computer simulation, but there is a concrete distinction I can make for the types of simulations I am talking about. Automata theory essentially proves that Turing machines (traditional computer programs) can solve an entire class of problems that combinational logic (equations) cannot and therefore can simulate an entirely different class of systems. So when I use the term computer simulation, I am only referring to those which employ a Turing machine model of computation.
² Anything possible with a machine learning model is technically possible with traditional algorithms but the implementation may be impractical.
For as long as I have been earning grades, my parents have expected excellence. Luckily for me, a combination of nature and nurture has allowed me to meet that expectation quite easily, so I have never had much conflict with my parents in that regard. Their expectations are a large part of my motivation for school. My parents immigrated to this country when I was two years old so that my brothers and I could get a good education. I feel somewhat obligated to take advantage of the opportunity given to me.
Some of my favorite school memories are doing practical experiments in science class and making projects in technology classes. This type of learning is especially fun because the results of your efforts are more tangible and the effort itself is more enjoyable. One of my favorite memories outside of class is winning second place in the Siena High School Programming Contest with my team. We did quite well, solving all but one problem out of seven. The excitement and energy in the room after winning was quite memorable.
One aspect of school that I enjoy quite a bit is learning new things—inside and outside the classroom. I often have moments when I am working on homework and some part of it leads me down a random research tangent; I'll only realize I need to get back to work after a while down the rabbit hole. This is not very productive at times, but it's still fun to learn these new things. An aspect of school that I do not like as much is the rigid structure and schedule. I get mentally drained after sitting through hours of classes, so I am usually very distracted. My productivity comes in bursts throughout the day, which I find is not very efficient for school.
The main thing that attracted me to E=mc2 was the ability to research any topic through your own means. I felt like this self-defined structure would be perfect for me. In addition, I feel the topics I want to research the most—in the domains of math and computer science—are not well represented in the high school curriculum. I do not expect them to be, as they dive deep into specific areas, but I love that E=mc2 will allow me to delve into them.
Outside of school I enjoy talking to and hanging out with my friends. I am lucky enough to be friends with quite a few people who have interests and values similar to mine. I also spend a lot of my time working on personal programming projects. I start many of them but few are seen through to the end. The more difficult a project is, the more interesting it has to be in order for me to complete it. Sometimes interest in an old project sparks again, and I go back to complete something I left months ago. I have worked on a large variety of topics from computer graphics to programming language creation. This makes it difficult for me to choose one topic to focus deeper study on, though I do have a general idea.
Some activities that I enjoy include biking and, more recently, rock climbing. I enjoy these physical activities, and they bring balance to my life because my passion is mostly sedentary. This summer I also started an online class with my friends called HackStudio where we teach students how to program, one of my long-time hobbies. Teaching a class like this has been a new experience for me, and I have really enjoyed sharing the knowledge that I've gained.
One surprising fact about myself is that my programming is entirely self-taught. I have never taken any class, online or otherwise. All of my skills came through creating increasingly advanced projects and online research. I think this experience will help me dive into the research and exploration ahead of me.