Took RCR training. Topics included research misconduct, authorship, human and animal experimentation, conflicts of interest, collaboration, data management, peer review, and mentoring.
Attended lecture by Dr. Ming about procedures in research. Learned about the legal, moral, and technical aspects of research, as well as ideas about how problem statements are developed and the sequence of research.
Attended lecture by Dr. Uthurusamy about having a career in research. Learned about the personal characteristics needed for research, an interpretation of the purpose of research, and what we are expected to learn about research from this REU.
Attended lecture by Dr. Sen about reading and writing research papers. Learned about the different types of reading (scanning, broad understanding, and thorough), the order to read a paper, and the order to write a paper.
Attended lectures by Dr. Fu, Dr. Lu, Dr. Ming, and Dr. Sen about their respective research projects and was chosen for Dr. Sen’s complex probability risk assessment project. Met with Dr. Sen and Ivan Mo and spoke about the structure of the next few weeks and received papers to read.
We presented our week 1 presentation to the group and it seemed to be well received. The only thing I would change would be to be slightly more brief, as we were very close to the 25-minute suggested limit.
We met with Dr. Sen and our mentor Feyi and developed our plan for this week which consists of: reading research papers related to risk assessment for CAVs, setting up a time with Feyi to have a weekly technical meeting, working on our daily logs, possibly reading Quantum Computing for Computer Scientists, and possibly working on developing the web application prototype.
Dr. Sen also gave us a method for reading papers this week and next, where Ivan and I each read 3 different papers, and at the end of the week we will discuss our readings.
This week I will be reading: "Autonomous Vehicle Security: A Taxonomy of Attacks and Defenses", "Connected and Autonomous Vehicles: A Cyber-Risk Classification Framework", and "Cybersecurity of Connected Autonomous Vehicles". Ivan and I will both read "Risk Assessment for Cooperative Automated Driving".
I read "Risk Assessment for Cooperative Automated Driving" and took notes. My biggest takeaway from this paper was the identification of CAV specific attack surfaces and attack methods, which are areas integral to our project.
We attended a talk by a former UnCoRe participant. He talked about his experiences with the research itself, and also about presenting this research and applying for graduate school. My main takeaway was his advice regarding presentations, that you need to present your research in a way that is interesting, unconfusing, and accurate.
I read "Autonomous Vehicle Security: A Taxonomy of Attacks and Defenses" today. The paper identifies multiple attack methods, targets, and motivations as well as defenses against each of these attacks.
From my readings from today and yesterday, I have made a Google Sheets document indexing identified attacks on CAVs, a description of these attacks, possible targeted systems, entry points, and a possible defense against the given attack. This is a preliminary list and is subject to refinement, but this will serve as a good base for our final CAV attack repository.
I read "Cybersecurity of Connected Autonomous Vehicles". Largely I don't think this paper is that relevant to our research. The ranking system they developed, in my opinion, seems too high-level and general to be useful to our project.
I read "Connected and Autonomous Vehicles: A Cyber-Risk Classification Framework". This paper is highly relevant to our project, as the framework they develop is similar to our own. Their framework is a Bayesian network, learned and verified using data from the National Vulnerability Database. Their results were very impressive, and their structure and variables could serve as a good inspiration or base for our own framework.
I downloaded a template research paper LaTeX file and filled in some information, mainly our tentative abstract, authors, and the citations of the papers I have read so far. I plan on making a GitHub repository to upload the TeX file to, so Ivan can also insert his citations.
Dr. Sen suggested that we use Overleaf for collaborating on editing the LaTeX document, so I transferred the files there and shared it with Ivan.
I added two more parameter columns to the attack index, one being pre-conditions, and the other post-conditions. I added some information in the pre-condition column, but I'm unsure if they are accurate. More work will be done to fill out these parameters.
I started reading "Quantum Computing for Computer Scientists" and read through Dr. Sen's notes that he provided on some of the chapters. In his notes he poses several questions about how our framework will utilize complex probabilities and what various interpretations of their properties will be. For easy referencing I aggregated these questions into a Google doc that I shared with Ivan.
Ivan and I met and discussed the papers we each individually read. Just based on the way we divided the papers, the ones I read were more on the technical side, so I suggested that Ivan read "Connected and Autonomous Vehicles: A Cyber-Risk Classification Framework" and "Autonomous Vehicle Security: A Taxonomy of Attacks and Defenses".
We also talked about and worked on our presentation. We plan to meet next Monday to review and revise our slides.
I read more from Chapter 1 of Quantum Computing.
Ivan and I presented our work from last week to the group. We met with Dr. Sen afterwards and he gave us some critique. Things to work on for future presentations: state where specifically the papers were published, emphasize what we are doing for the project, relate the paper to our project, give a recap of the project outline and goals, showcase unique parts of the results, and utilize graphics and tables.
We also met with Feyi and she gave similar critiques, and emphasized making the presentation with the audience in mind.
I read more of Quantum Computing and I'm now on chapter 2, which is about complex vector spaces.
I started reading "A Risk-Based Optimization Model for Electric Vehicle Infrastructure Response to Cyber Attacks". So far it has made me aware of one more attack surface, which is electric vehicle charging stations.
I finished reading "A Risk-Based Optimization Model for Electric Vehicle Infrastructure Response to Cyber Attacks". I didn't really understand much of the math behind their threat model or method, but I don't think it's entirely relevant. My main takeaway was the idea behind the threats they modeled for electric vehicles and EVSEs.
Part of our goal this week was to create a draft of the network scenario for our project. The network scenario is a diagram or description of the network that a security framework is built for. I went back and reviewed some of the papers I read last week and this week, and started creating a diagram in Google Drawings.
I started reading "A review on safety failures, security attacks, and available countermeasures for autonomous vehicles". So far this paper seems very relevant, as it could help build the network scenario and give more information on attacks on CAVs to add to the attack index.
I finished reading "A review on safety failures, security attacks, and available countermeasures for autonomous vehicles". From this paper I was able to add a few more attacks to the index. Some attacks were similar to attacks already recorded but gave more insights into targets, pre-conditions, and post-conditions.
I re-structured the attack index, categorizing attacks according to the STRIDE risk classification.
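As a rough illustration of the restructured index (a sketch only: the real index lives in Google Sheets, and the field names and sample rows below are hypothetical), each row can be thought of as a record tagged with one of the six STRIDE categories:

```python
from dataclasses import dataclass, field
from enum import Enum


class Stride(Enum):
    """The six STRIDE threat categories."""
    SPOOFING = "Spoofing"
    TAMPERING = "Tampering"
    REPUDIATION = "Repudiation"
    INFORMATION_DISCLOSURE = "Information disclosure"
    DENIAL_OF_SERVICE = "Denial of service"
    ELEVATION_OF_PRIVILEGE = "Elevation of privilege"


@dataclass
class Attack:
    """One row of the attack index (hypothetical field names)."""
    name: str
    category: Stride
    targets: list = field(default_factory=list)
    pre_conditions: list = field(default_factory=list)
    post_conditions: list = field(default_factory=list)


def group_by_category(attacks):
    """Map each STRIDE category to the names of the attacks filed under it."""
    grouped = {}
    for attack in attacks:
        grouped.setdefault(attack.category, []).append(attack.name)
    return grouped


# Hypothetical sample rows, for illustration only.
index = [
    Attack("GPS spoofing", Stride.SPOOFING, targets=["navigation"]),
    Attack("Signal jamming", Stride.DENIAL_OF_SERVICE, targets=["V2V link"]),
]
grouped = group_by_category(index)
```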
I worked more on developing the network scenario, expanding the connections.
I read "Cyber Threats Facing Autonomous and Connected Vehicles: Future Challenges". This paper was helpful in identifying some additional attacks, but more importantly it gave some additional information on attacks I had already documented.
I did some additional work on the network scenario and the attack index. For the network scenario, I replaced the nodes, which were formerly boxes, with icons to make it more visually appealing and understandable. I also added another connection, the connection to navigation satellites. For the attack index, I filled in the pre- and post-conditions for some of the attacks, based on the paper I mentioned.
Ivan and I met and discussed our presentation for next week.
Ivan and I presented our work from week 3 to the cohort. We ran slightly overtime, but overall it was better than last week. We felt that we put more emphasis on relating what we read with our work on the project.
I read Chapter 1 of "The Car Hacker's Handbook - A Guide for the Penetration Tester". This is a book that was recommended to us by Dr. Fu. The chapter I read was mostly concerned with attack surfaces and threat modeling for vehicles.
Ivan and I met with Dr. Sen, where he talked about our plan for this week. We are largely going to be refining our attack repository, adding more parameters based on those found in the Common Vulnerability Scoring System (CVSS).
Ivan and I also met with Feyi about our presentation. Her critiques included preparing information ahead of time for diagrams and work that we've done, to prepare the audience for what they should expect.
I read back through some of the papers I had read these last few weeks to find more specific attacks. In my original pass through, I had grouped similar attacks together. Dr. Sen recommended that we include as many attacks specific to CAVs as possible. From my re-reading I was able to break several attacks into module specific attacks.
I started reading through the CVSS 3.0 Specification Document. As I read about the different parameters, I would add them to the attack repository, and assign levels for attacks that were obvious based on the descriptions of the attack and the level.
I started reading the NIST Information Security document. I've only gotten through part of chapter 1, which is an overview of what risk assessment is and its purpose.
I continued working through the CVSS Specification document, adding the level I thought was appropriate for each attack in the repository.
I continued reading the NIST Risk Assessment document. Chapter 2 is so far mostly about the different aspects of a risk assessment and related things, like threat models.
I continued reading the quantum computing textbook. I finished the sections for chapter 2 that Dr. Sen suggested.
One of our tasks for this week was to think about how we want to graphically represent our attack graph, the two main options being nodes representing the attacks themselves or nodes representing the different states of the system. Because so much of my work this week has been related to the attack repository, I am inclined toward nodes representing attacks. To get an idea of what this graph would look like, I sketched out the main relationships between the attacks, mostly based on the pre- and post-conditions.
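The idea of linking attacks through their conditions can be sketched in a few lines. This is only an illustration under an assumed rule (draw an edge from attack A to attack B whenever one of A's post-conditions matches one of B's pre-conditions); the attack names and condition strings are hypothetical and not taken from the repository.

```python
def link_attacks(attacks):
    """Return sorted (source, destination) edges between attacks.

    attacks maps an attack name to {"pre": set, "post": set}.
    Assumed rule: A -> B if any post-condition of A is a pre-condition of B.
    """
    edges = []
    for a, info_a in attacks.items():
        for b, info_b in attacks.items():
            if a != b and info_a["post"] & info_b["pre"]:
                edges.append((a, b))
    return sorted(edges)


# Hypothetical entries, for illustration only.
attacks = {
    "sensor spoofing": {
        "pre": {"physical proximity"},
        "post": {"false sensor data"},
    },
    "message injection": {
        "pre": {"network access"},
        "post": {"false V2V messages"},
    },
    "trajectory manipulation": {
        "pre": {"false sensor data", "false V2V messages"},
        "post": {"unsafe maneuver"},
    },
}
edges = link_attacks(attacks)
```

A matching rule like this only works if the condition strings are spelled consistently across rows.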
I continued reading the CVSS document and added metrics to the attack repository.
I read "Bayesian Attack Graphs for Security Risk Assessment". This paper was helpful in introducing what a Bayesian Attack Graph is and some of the math behind how it's made. One thing that would've been helpful for understanding the paper is if they had included a graphical representation of a sample network. One takeaway from this paper was how they used CVSS scores, specifically the exploitability score, to estimate the probability of successful exploitation of a vulnerability.
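To make the exploitability idea concrete, here is a minimal sketch using the CVSS v3.0 exploitability subscore (8.22 x AV x AC x PR x UI, with the scope-unchanged metric values from the specification). Dividing by the largest possible subscore to land in [0, 1] is an assumption made for illustration and is not necessarily the scaling the paper uses.

```python
# CVSS v3.0 exploitability metric values (scope unchanged).
ATTACK_VECTOR = {"Network": 0.85, "Adjacent": 0.62, "Local": 0.55, "Physical": 0.20}
ATTACK_COMPLEXITY = {"Low": 0.77, "High": 0.44}
PRIVILEGES_REQUIRED = {"None": 0.85, "Low": 0.62, "High": 0.27}
USER_INTERACTION = {"None": 0.85, "Required": 0.62}


def exploitability(av, ac, pr, ui):
    """CVSS v3.0 exploitability subscore: 8.22 * AV * AC * PR * UI."""
    return (8.22 * ATTACK_VECTOR[av] * ATTACK_COMPLEXITY[ac]
            * PRIVILEGES_REQUIRED[pr] * USER_INTERACTION[ui])


# Largest achievable subscore (about 3.89), used to rescale into [0, 1].
# The rescaling is an assumption of this sketch.
MAX_EXPLOITABILITY = exploitability("Network", "Low", "None", "None")


def exploit_probability(av, ac, pr, ui):
    """Estimate a probability of successful exploitation from the subscore."""
    return exploitability(av, ac, pr, ui) / MAX_EXPLOITABILITY


easy = exploit_probability("Network", "Low", "None", "None")
hard = exploit_probability("Physical", "High", "High", "Required")
```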
Ivan and I met together to discuss what we had been doing and reading this week, and to answer some of each other's questions. He gave me some input about determining certain metrics for attacks. We created the presentation for next Monday, and plan on meeting up tomorrow to work more on it.
I finished filling out all the base metrics for the attack repository from the CVSS descriptions.
Using the web application Lucidchart, I made a graph of all of the attacks in the attack repository, showing some of their causal relationships. I'm not very confident in all of the relationships I identified, but it could serve as a starting point for future work regarding our attack graph, assuming we use the attacks-as-nodes representation.
Ivan and I met again today, primarily to work on the presentation for next week.
I read more of the NIST document.
Ivan and I had our weekly presentation. I personally think we did well, though like in other weeks we ran slightly longer than intended. Feyi was unable to come to our presentation, so she won't be able to give feedback until tomorrow when the video is released. Similarly, Dr. Sen won't be able to meet with us until Friday. In the meantime he said our main focus this week was to draft our risk assessment procedure. That is to say, we need to tentatively decide what representation we want our BAG to take. The other focus is on the interpretation and computation of complex probabilities.
I read more of the Quantum Computing textbook, and have almost finished chapter 3. Section 3.4 deals with complex probabilities and their use in adjacency matrices for graphs. This section has helped me gain an understanding of how complex numbers can be used to represent probabilities. The last piece of the puzzle is how we interpret them in the context of risk in CAVs and how we quantize our metrics.
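A toy example (not taken from the textbook, and using a made-up two-node graph) shows what changes when edge weights are complex: the squared modulus of each weight still behaves like an ordinary probability after one step, but over two steps the complex weights can cancel, which real probabilities never do.

```python
import math

s = 1 / math.sqrt(2)

# Hypothetical 2-node graph: U[r][c] is the complex weight on the edge c -> r.
U = [[s, s * 1j],
     [s * 1j, s]]


def apply(matrix, vector):
    """Multiply a square matrix by a column vector."""
    return [sum(matrix[r][c] * vector[c] for c in range(len(vector)))
            for r in range(len(matrix))]


def probabilities(vector):
    """Collapse complex amplitudes to probabilities via the squared modulus."""
    return [abs(z) ** 2 for z in vector]


start = [1 + 0j, 0j]              # begin at node 0 with certainty
one_step = apply(U, start)        # probabilities are about [0.5, 0.5]
two_steps = apply(U, one_step)    # probabilities are about [0.0, 1.0]
```

With real weights of 0.5 on every edge, two steps would still give [0.5, 0.5]; here the two paths back to node 0 interfere and cancel.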
I read "Dynamic Security Risk Management Using Bayesian Attack Graphs". It was good getting insight into how a Bayesian Attack Graph is actually built and how the conditional and unconditional probabilities can be calculated. I would assume that we will emulate their method of probability calculation, with the twist of complex numbers.
I read the other papers assigned by Dr. Sen regarding traditional risk assessment. Both papers used Bayesian Attack Graphs, but each utilized a different representation, one using nodes as attacks and the other nodes as states. The attack node paper, written by Dr. Sen, was similar in some ways to our project, as he also built a list of attacks on the target network.
Based on the papers and textbook I read this week, I've done a lot of thinking regarding what BAG representation we should go with. An attack node representation has the benefit of being a fixed size, as the list of attacks is fixed. Because we already have the list of attacks made with all of the base metrics filled out, I am leaning much more towards this representation. Additionally, I was able to determine some of the relationships between attacks last week, so some of the work of connecting nodes has already been done.
I was also thinking about how to interpret and assign complex probabilities. Many of the BAG papers used the Exploitability subscore as their probability, which I think is a good starting point for the real component. As far as an imaginary component is concerned, I think we could use the impact metrics. An idea I had was to assign Confidentiality and Integrity a positive imaginary number, and Availability a negative imaginary number. My reasoning is that for C and I attacks the operation of the network is necessary for exploitation, but if an A attack is ongoing or succeeds, that will interfere with the C and I attacks.
I spent some time reviewing the math in Quantum Computing to try making some quantizations for our attack metrics. My naive approach was to derive a real and imaginary component from the CVSS metric levels by assuming they are equal, e.g. a metric level normally corresponding to 0.85 becomes 0.652 + 0.652i by solving the equation 0.85 = (sqrt(x^2 + x^2))^2, which reduces to 2x^2 = 0.85.
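The equal-components approach amounts to solving 2x^2 = level for x. A minimal sketch of that arithmetic, with 0.85 used as the example level and a hypothetical function name:

```python
import math


def equal_components(level):
    """Return x + xi such that |x + xi|^2 == level, i.e. 2 * x**2 == level."""
    x = math.sqrt(level / 2)
    return complex(x, x)


z = equal_components(0.85)    # roughly 0.652 + 0.652i
collapsed = abs(z) ** 2       # recovers the original level, 0.85
```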
I met with Ivan to discuss what representation we wanted to use for our BAG. We were both of the mind that nodes as attacks would be the easier implementation.
I continued to try different interpretations of what "i" could mean in a quantization and created a possible equation for determining the probability of exploitation. Running an extreme case through, the squared magnitude of the result was less than 1, so the equation could be valid. My concern is that this equation isn't really representative of the actual probability of the attack's success, but that probability is unknowable. I plan on asking Dr. Sen for some guidance relating to the quantization.
Ivan and I met again to discuss and work on a threat model draft. We had a few different ideas, so we decided to work on both of them and decide on which one we thought was best. We also discussed more about how we could do quantizations. We realized we were both confused in this area, so we created a list of questions we wanted to ask Dr. Sen on Friday.
Ivan and I both continued to work on our draft threat models. Additionally I worked on adding some attack instances to the network scenario.
Ivan and I met with Dr. Sen to talk about the progress of the project and questions that we had. I asked about the interpretation of complex probabilities, and we made a tentative decision to investigate a phase-as-state interpretation.
Ivan and I presented our mid-project presentation to the cohort. Unfortunately Feyi was unable to be there, so our meeting was moved to Tuesday so she could watch the video. Overall I felt we did a good job of explaining our project in a clear and concise way. We also only just went over time this week, which is a marked improvement from being 5 minutes over.
Our main goal for this week is working on interpreting and deriving complex probabilities. Today I spent a reasonable amount of time investigating what's known as percolation centrality, which is a way to rank a node's importance when there is an infection in the network. The idea sounds similar to possible interpretations of the phase; however, there are parameters required that we simply won't know about our system.
I also spent time writing up ideas about what the phase of the complex number could represent. This is a problem for us to solve because the magnitude has a clear interpretation, as its square is the "collapsed" probability. However, the phase is less clear. I had thought about having the phase represent the proportion of network resources captured by the attack, but I still need to find a way of estimating this quantity.
I spent most of the day investigating other ways of capturing the "state". One possible line I looked into was an equation used to describe the spread of malware through a network. A big part of this equation comes from a parameter beta, which represents the probability of infection. I'm not sure if this parameter is different from the probability of exploitation, but this might be something that can be estimated from user inputs regarding the security measures in the network.
Ivan and I met with Feyi to talk about our presentation, and she seemed to think it was good. Her only comments were that the titles of some of the slides were kind of misnomers.
I spent some time marking down possible inputs that we ultimately want to get from the user, and thought about how they could be integrated into the computation of the probabilities. Things like increased security measures would obviously decrease the probability of success. Thinking more about security measures has made me think that maybe instead of thinking about attack interference, the property of interference could be derived from security measures.
I had meant to meet up with Ivan, but we both forgot that we had set it. We decided to reschedule until after Dr. Sen had answered some of our questions.
In preparation for construction of the Bayesian attack graphs, I've started standardizing the pre and post conditions, as right now they are inconsistent.
Dr. Sen got back to us today, and explained how we could represent the likelihood of an attack spreading to a different state of the system with the imaginary part of a number. This method of construction makes a lot more sense to me, but we still need to come up with an interpretation of the phase and the imaginary unit.
From this comment, I gathered a few equations to build our complex probabilities. One equation is the CVSS exploitability subscore, which with some modifications could give the real component. Another is an equation describing the probability of malware infecting a network node, given the infection rate and number of already infected nodes. This equation could possibly represent the imaginary component, though we would need to estimate both the infection rate and the number of infected nodes. The sign of the imaginary component could then be determined by the sum of the impact metrics, where confidentiality and integrity are positive and availability is negative.
Ivan and I met up and discussed our progress this week, and things that need to be worked on.
I mostly worked on trying to refine our equations for deriving a complex probability. The biggest thing that needed refining was the equation for determining the likelihood of an attack spreading. For this I figured that some useful metrics about the security measures of the system could be used to compute the likelihood that an attack will not infect another node. The next variable we need to derive is how many infected nodes exist in the network.
Ivan and I met up to work on the presentation for Monday.
Ivan and I gave our weekly presentation to the cohort. I think we did very well as Dr. Fu said that our presentation was great and she didn't have any questions. Feyi will meet with us tomorrow to give us critiques.
I spent today implementing the equations for calculating the real component, imaginary component, and the imaginary sign in a Google Sheets document.
The first implementation I made was creating the security score. The security score is the average of 4 security parameters: V2V security, V2I security, V2X security, and authentication. These metrics range in level from none to low to high. The numerical values for these levels are based on similar CVSS levels. These parameters will be determined by the end user. The security score is used to represent 1-beta in the equation to determine the likelihood of spreading to other machines.
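A minimal sketch of the security score as described: the average of four user-chosen levels, interpreted as 1 - beta. The numeric values assigned to None/Low/High below are placeholders, since the log only says they mirror similar CVSS levels, and the parameter names are hypothetical.

```python
from statistics import mean

# Placeholder numeric levels (assumed for this sketch, not the real values).
SECURITY_LEVEL = {"None": 0.1, "Low": 0.5, "High": 0.9}


def security_score(v2v, v2i, v2x, authentication):
    """Average the four security parameters; the result stands in for 1 - beta."""
    levels = (v2v, v2i, v2x, authentication)
    return mean(SECURITY_LEVEL[level] for level in levels)


def infection_rate(score):
    """Recover beta from a security score that represents 1 - beta."""
    return 1 - score


score = security_score("Low", "Low", "None", "High")
beta = infection_rate(score)
```

A stronger configuration raises the score and lowers beta, so better security reduces the likelihood of spreading.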
I also changed the equation for determining the real component to instead be a modified average of the exploitability metrics, so that the real component isn't dwarfed by the imaginary.
I tested the equation on an extreme case, where the attack is as strong as possible and the security is as weak as possible. I used the resulting magnitude to tune the normalizing coefficients so that the magnitude would be around 0.95. This number is arbitrary, but it feels like an appropriate probability for the worst possible scenario.
One parameter that might make the imaginary component blow up is |Pa|, or the cardinality of the set of parent nodes. This is an exponential factor, so when it starts increasing the imaginary component quickly approaches the asymptote of 1, which ends up making the magnitude more than 1. I might have to tweak this component so that the magnitude stays under 1.
Feyi met with Ivan and me and gave us our critiques for the presentation. She said that before jumping into the more dense topics that we worked on, we should preface with a reminder as to what it is we worked on, and how it fits into our overall project.
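The saturation can be seen in a short sketch. It assumes the spreading term has the common epidemic-model form 1 - (1 - beta)^n, with the security score standing in for 1 - beta and n = |Pa|; the exact equation and normalizing coefficients in the spreadsheet may differ, and the real component of 0.6 is an arbitrary example value.

```python
def spread_likelihood(security_score, n_parents):
    """Assumed form: 1 - (1 - beta)^n, where (1 - beta) is the security score."""
    return 1 - security_score ** n_parents


def squared_magnitude(real_part, imag_part):
    """Collapsed probability of the complex value real_part + imag_part * i."""
    return abs(complex(real_part, imag_part)) ** 2


REAL_PART = 0.6   # arbitrary example value for the real component
SECURITY = 0.5    # arbitrary example security score

# The imaginary component climbs toward 1 as the number of parents grows.
imag_by_parents = {n: spread_likelihood(SECURITY, n) for n in range(1, 6)}

# With one parent the squared magnitude is valid; with three it exceeds 1.
ok = squared_magnitude(REAL_PART, imag_by_parents[1])
too_big = squared_magnitude(REAL_PART, imag_by_parents[3])
```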
I worked on finishing up the main equations. For the parent node parameter, I decided to get rid of it. The effect of the exponential was too difficult to work with, as the probability would shoot up as soon as there was more than 1 parent attack node. Additionally, I created equations to calculate the "opposite" probability, or the probability of an attack failing. In terms of vectors, the opposite probability is the vector that when added to the regular probability has a magnitude of 1, and the phase is unchanged.
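One way to read the "opposite" probability is as a complex number q that points in the same direction as p and brings the sum p + q out to the unit circle. The sketch below implements that reading; it is an interpretation of the description above, not necessarily the exact equations used, and the sample value of p is arbitrary.

```python
import cmath


def opposite(p):
    """Return q with the same phase as p such that |p + q| == 1.

    Assumes 0 < |p| <= 1; the phase is undefined for p == 0.
    """
    r = abs(p)
    if r == 0:
        raise ValueError("phase is undefined for a zero amplitude")
    return p * (1 - r) / r


p = complex(0.6, 0.3)          # arbitrary example probability
q = opposite(p)
total_magnitude = abs(p + q)   # equals 1 up to floating-point error
same_phase = abs(cmath.phase(p) - cmath.phase(q)) < 1e-12
```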
I mostly worked on implementing our equations in Python. In addition, I created dictionaries in the same file so that we can easily convert string metric levels ("High") to their numeric values (0.85). In a different file I also worked on connecting to our database and extracting data from documents.
Ivan and I met with Dr. Sen to talk about our progress. One suggestion he gave regarding our implementation was that instead of having an equation based on impact scores to determine the sign of the imaginary component, we should determine it ourselves. He said we should base it on whether we believe the attacks would assist or interfere with each other, essentially creating another metric in our database.
I started looking over the documentation of PyMC3, which is a Python module that helps build Bayesian networks.
I continued looking around at different Python libraries that have functionality for Bayesian networks. Some of them seemed promising, like pgmpy and bnlearn. My issue with most of them, though, is that they don't have functionality for computing the conditional probability tables, or for calculating the unconditional probability based on join type (OR or AND). If I can't find a library to perform these functions, we might end up writing one or both of them ourselves.
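For reference, a hand-rolled conditional probability table for the two join types is short. This sketch follows the convention commonly used in the Bayesian attack graph literature (AND: product of the edge probabilities when every parent is compromised, otherwise 0; OR: noisy-OR over the compromised parents). It is real-valued only, and the function names and example probabilities are hypothetical.

```python
from itertools import product
from math import prod


def cpt_entry(parent_states, edge_probs, join):
    """P(node compromised | parent states) for an AND or OR join."""
    if join == "AND":
        # Every parent must be compromised and every exploit must succeed.
        return prod(edge_probs) if all(parent_states) else 0.0
    if join == "OR":
        # At least one exploit from a compromised parent must succeed.
        active = [p for state, p in zip(parent_states, edge_probs) if state]
        return (1.0 - prod(1.0 - p for p in active)) if active else 0.0
    raise ValueError(f"unknown join type: {join!r}")


def build_cpt(edge_probs, join):
    """Enumerate every combination of parent states (0 or 1)."""
    return {
        states: cpt_entry(states, edge_probs, join)
        for states in product((0, 1), repeat=len(edge_probs))
    }


or_table = build_cpt([0.5, 0.4], "OR")
and_table = build_cpt([0.5, 0.4], "AND")
```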
Ivan and I met and created a plan regarding who does what work in the context of the implementation. We also created a road plan of the next couple weeks to ensure that we stay on schedule.
For demonstration purposes we need to make some assumptions about our model network so that we can calculate the base probabilities of exploitation. I was tasked with this and decided that V2V and V2I communications would have low grade security, V2X would have no security, and authentication would have high security.
I started work on nailing down the actual graph structure for our three BAGs. Currently I have a tree with all of the root nodes, so I have to separate this tree into its main components.
I finished making the structures for the three BAGs. Each BAG has a goal state that indicates the impact (Confidentiality, Integrity, Availability). In each graph I cut out any attacks that don't lead to that graph's goal state. From this pruning I discovered that the timing attack node was a leaf, and so doesn't appear in any graph.
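The pruning step can be sketched as a backwards reachability search from the goal state: keep only the nodes that have some path to the goal, then drop every edge that touches a discarded node. The edge list and node names below are hypothetical stand-ins for the real graph.

```python
def prune_to_goal(edges, goal):
    """Keep only the edges whose endpoints can both reach `goal`.

    edges is a list of (source, destination) pairs.
    """
    predecessors = {}
    for src, dst in edges:
        predecessors.setdefault(dst, set()).add(src)

    # Walk backwards from the goal, collecting every node that leads to it.
    keep, stack = {goal}, [goal]
    while stack:
        node = stack.pop()
        for pred in predecessors.get(node, ()):
            if pred not in keep:
                keep.add(pred)
                stack.append(pred)

    return [(s, d) for s, d in edges if s in keep and d in keep]


# Hypothetical combined graph with two goal states.
edges = [
    ("spoofing", "message injection"),
    ("message injection", "integrity goal"),
    ("jamming", "availability goal"),
    ("spoofing", "timing attack"),   # a leaf that leads to no goal
]
integrity_graph = prune_to_goal(edges, "integrity goal")
```

In this toy graph the timing attack has no outgoing path to either goal, so it is absent from every pruned graph.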
I was looking around at how the conditional probability tables are constructed, and I'm unsure of how to move forward. I think I will ask Dr. Sen more about this.
Over the weekend, I worked on finalizing the structure of the 3 Bayesian Attack Graphs. This structure was outlined visually.
I worked on writing the script to calculate the unconditional probabilities from a given BAG structure. It seems like the script works, however Ivan and I will go back and calculate a case by hand to make sure that it is correct.
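For checking a script like this against a hand calculation, a brute-force version is useful: enumerate every joint state of the graph, multiply the conditional probabilities, and sum the joint probability into each node that is compromised in that state. This is a sketch for small, real-valued graphs only (it is exponential in the number of nodes and ignores the complex component); the node names and table values are hypothetical.

```python
from itertools import product


def unconditional(nodes, parents, cpt):
    """Marginal P(node = 1) for every node, by full enumeration.

    parents: node -> tuple of parent names (empty tuple for roots)
    cpt:     node -> {tuple of parent states: P(node = 1 | parents)}
    """
    marginals = {n: 0.0 for n in nodes}
    for assignment in product((0, 1), repeat=len(nodes)):
        state = dict(zip(nodes, assignment))
        joint = 1.0
        for n in nodes:
            p_true = cpt[n][tuple(state[p] for p in parents[n])]
            joint *= p_true if state[n] else 1.0 - p_true
        for n in nodes:
            if state[n]:
                marginals[n] += joint
    return marginals


# Hypothetical 3-node graph: two root attacks feeding one OR-joined node.
nodes = ["A", "B", "C"]
parents = {"A": (), "B": (), "C": ("A", "B")}
cpt = {
    "A": {(): 0.6},
    "B": {(): 0.5},
    "C": {(0, 0): 0.0, (1, 0): 0.5, (0, 1): 0.4, (1, 1): 0.7},
}
marginals = unconditional(nodes, parents, cpt)   # P(C = 1) comes out to 0.44
```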
Ivan and I met and discussed what we would be working on this week. Once the script is totally finished, I will begin working on the report, as well as the poster. If we have extra time we will try to get some work done on the Django web app.
I continued to work on the script to generate the unconditional probabilities. There were several bugs related to calculations that I fixed. I also re-formatted the script so that it was in the form of a function so that the process is more generalized.
Ivan and I met to work a little more on our presentation, as well as check in on each other's progress.
I worked on adding some citations to the LaTeX document. I also reworked the abstract and added some key points to the introduction.
Ivan and I gave our presentation. This was the last presentation before our final presentation, so next week we will be able to focus completely on the poster and report.
I worked on writing up some of the report. I have a pretty good idea of what is supposed to go in the intro, so it is mostly a matter of getting things written and refining it later.
Ivan and I had our last meeting with Feyi. Her only piece of advice in relation to the presentation was about the wording related to our attack metrics, as in a previous slide we had called them variables.
I worked on writing more of the introduction for the report.
Ivan and I met up with Dr. Sen to discuss what needs to be done in the next 2 weeks. The major priority is to finish the web app. Next would be to finish the report, giving it about 7-10 days of work. Then finish the poster and assemble deliverables.
I started working through a Django tutorial series to get the environment set up and a basic page built.
Over the weekend, both Ivan and I watched through the tutorials Dr. Sen recommended to us. Additionally over the weekend I set up my Python virtual environment, my Django project, and started making some pages. Today I worked on getting the functionality of the website built. When going to the demo page, there is a form that gives options for all of the security parameters and the expected risk level. The results are computed and displayed in the form of an HTML table alongside the images of the BAGs. I noticed that some of the numbers don't seem to be right, so tomorrow I'm going to go through and compute some test cases.
Ivan and I met and discussed progress and what still needed to be done in regard to the website. Most of the functionality is finished, so a lot of the remaining work is about formatting, cosmetics, and overall niceness of the web app.
I worked mostly on finishing the functionality. I had suspected that there was a flaw in the script that generated the unconditional probabilities. Indeed, after talking with Dr. Sen and working through some examples from his paper, I realized I had a fundamental misunderstanding of how the OR join operated. The new script works with the examples, and so I believe it is finished.
More work was done regarding the web app. The complex probabilities were rounded and the magnitude (collapsed) probability was added to the result table. Additionally, calculations regarding the imaginary component were configured to use a Django session, so that results are stored per-user-per-site. The script to calculate the conditional probabilities was changed so that it takes the imaginary component in as an argument, so that the attack database is not constantly being overwritten.
I worked on writing part of the procedure section on the report.
I worked on writing more of the report, specifically the portion regarding the process of developing our model. I also worked on getting some of the tables filled with information.
Ivan, Feyi, and I met with Dr. Sen. He thought the website looked good, just that it needed some explanatory text for the demo and results pages. He established a timeline.
I worked on writing the report all day.
Ivan and I continued to work on the paper, with the rough draft being finished tonight. We also worked on getting a rough draft of our presentation made and our poster, so when we have our work meeting with Dr. Sen we have a skeleton to work around.
Ivan and I had our work meeting, and from that meeting we were able to finish our poster, and make significant progress on our presentation.
Ivan and I met up separately from Dr. Sen to finish up the presentation and assign speaking roles.
Ivan and I had our presentation today, and it went over well. We stayed within our allotted time, and Dr. Fu even said that she might use our poster as an example for future REUs.
Ivan gathered up the deliverables that have already been finished and put them in the deliverables shared folders.
We began making edits to our report, working through the critiques that Feyi had left from earlier in the week.
Ivan and I finished going through and implementing Feyi's critiques of our report. Dr. Sen went through the report and made some suggestions, as well as a few presentability edits.
We put our Django project folder zip in the deliverables, along with our report, poster, and presentations.
I worked some on a script for the pitch video Ivan and I plan on recording tomorrow for Mid-SURE.
I filled out a few exit surveys for the program.