IceCube Research Project
1/25/2018
Today, I confirmed that I am assigned to the IceCube Neutrino Telescope for research this upcoming year.
Related Links:
https://en.wikipedia.org/wiki/Neutrino_detector
http://icecube.umd.edu/Home.html
https://sites.google.com/a/physics.umd.edu/honrxxx/logbook/268n-2016/juan-dupuy
https://sites.google.com/a/physics.umd.edu/honrxxx/logbook/268n-2016/matthew-kirby
https://sites.google.com/a/physics.umd.edu/honrxxx/logbook/268n-2015/anat-berday-sacks
1/29/2018
Today was our first meeting with Dr. Blaufuss. Our main topics of discussion were what previous groups have done and possible topics for our project.
The 2015 group created an all-sky search program designed to perform basic analysis of IceCube data. Essentially, it picked an area of sky and compared
it to the rest of the sky to see whether it was statistically significant. The 2016 groups expanded on that work. Dr. Blaufuss noted that while the previous
groups built further on the 2015 project, there is no pressure for us to do the same. He gave us brief summaries of possible paths we could take, such as
working on a program to track particles through the detector or on ways to reduce noise in the detector. He then sent us the 2012 data as well as links to
publications that he thought would be helpful in deciding on a topic.
2/5/2018
We did not meet this week because Dr. Blaufuss had jury duty. We continued to read the literature Dr. Blaufuss gave us, as well as the webpage of Paul
Neves, a previous student.
2/12/2018
We met today mainly to ask questions we had about the papers. Our main topic of discussion was stacking and how it is used within the collaboration.
Essentially, you pick a catalog of objects, stack all the data coming from their locations into one bin, and test that bin for significance against other,
random bins. There is more to it, of course, such as weighting the objects based on qualities that would affect their theorized neutrino flux, but for now,
we will wait until we better understand the mechanics to explain it further. Currently, our main objectives are to become more familiar with analyzing data
in Python and to settle on a catalog of objects that are potential neutrino sources.
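To make the idea concrete, here is a minimal sketch in Python of the stacking concept as I currently understand it; the bin radius, the toy event sample, and the toy catalog are all made up for illustration and are not the collaboration's actual method.

    import numpy as np

    def angular_sep_deg(ra1, dec1, ra2, dec2):
        # Great-circle separation in degrees between two sky positions (inputs in degrees).
        ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
        cos_sep = (np.sin(dec1) * np.sin(dec2) +
                   np.cos(dec1) * np.cos(dec2) * np.cos(ra1 - ra2))
        return np.degrees(np.arccos(np.clip(cos_sep, -1.0, 1.0)))

    def stacked_count(ev_ra, ev_dec, src_ra, src_dec, radius=3.0):
        # Count events that fall within `radius` degrees of any catalog source.
        return sum(int(np.sum(angular_sep_deg(ev_ra, ev_dec, r, d) < radius))
                   for r, d in zip(src_ra, src_dec))

    # Toy data: random "events" and a random "catalog" of source positions.
    rng = np.random.RandomState(0)
    ev_ra, ev_dec = rng.uniform(0, 360, 1000), rng.uniform(-85, 85, 1000)
    src_ra, src_dec = rng.uniform(0, 360, 10), rng.uniform(-85, 85, 10)

    on_source = stacked_count(ev_ra, ev_dec, src_ra, src_dec)
    random_bin = stacked_count(ev_ra, ev_dec, rng.uniform(0, 360, 10), src_dec)
    print(on_source, random_bin)

The real analysis compares the on-source count to many such randomized bins rather than just one.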
2/19/2018
While we had a scheduled meeting today, Dr. Blaufuss was unable to attend because his child was off school for President's Day. Kun, Rachel, and I still
met, however, and we discussed our own understanding of various topics to gauge how far along everyone was. We also decided on a "game plan"
going forward so that we had a clear idea of what needed to be accomplished by next week. We agreed that we all needed to have Python installed and
ready for analyzing data, to understand the literature as well as Paul's work, and to narrow down our candidates for the catalog. Currently, our
front-runner for the catalog is the Fermi Telescope 3FGL catalog of pulsars.
2/26/2018
We met with Dr. Blaufuss and discussed the specifics of catalog choice. Essentially, we need a justification for picking a class of objects, which generally
boils down to having a core underlying theory for why that class should emit neutrinos. This made us question our choice of the 3FGL catalog, so we
started looking again. We decided we wanted to look at active galactic nuclei (AGN), which are believed to be powered by super-massive black holes. The two
major candidates are a group of AGNs found within the WISE Catalog (http://irsa.ipac.caltech.edu/cgi-bin/Gator/nph-scan?submit=Select&projshort=WISE)
and the GSU AGN Black Hole Mass Catalog (http://www.astro.gsu.edu/AGNmass/).
3/5/2018
Today, we met with Dr. Blaufuss and discussed our catalog choices. Dr. Blaufuss gave us some excellent input. He noted that the WISE Catalog is probably
just too big for us to parse through effectively. Luckily, our other choice, the GSU AGN Catalog, has 62 AGNs in it, a very manageable number to work with.
The next major topic of discussion was how to effectively weight our bins for the stacking equation; more specifically, which variables to weight on. The
four major variables the weighting will likely consider are the proper distance of the AGN from us, the mass of the black hole, the energy of the detected
particle, and the declination of origin. The last two are very important because they are based on known effects of the detector's properties: the ratio of
signal to background changes with the declination of origin of the particle, and the higher the energy of the particle, the more likely it is signal rather than
background. The first two are based on predictions. If an object emits a constant flux of neutrinos, one would expect the observed rate to be inversely
proportional to the surface area of the sphere whose radius is the distance between the observer and the object. The relationship between flux and the mass
of the black hole is much more tenuous; there is no established model that predicts a relationship between mass and neutrino flux, so if we do choose to
weight based on the mass of the AGN, it would be guided by conjecture. Dr. Blaufuss did note that we should do a "dry run" where all theoretical weights
are 0. On the technical side, I encountered issues with my virtual machine: the VM file became corrupted, and I essentially had to start over, which is a
little aggravating, but there is not much to be done about that. I also switched from VMware to VirtualBox to ease file sharing between my host computer
and the VM. I was having issues mounting the shared file system in the VM with VMware, but I was able to do it with VirtualBox.
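As a quick note on the geometric argument above: assuming a source emits neutrinos at some constant (unknown) rate L_nu, the flux reaching the detector from a source at distance d goes as

    \phi_\nu \propto \frac{L_\nu}{4\pi d^{2}}

so, all else being equal, nearer sources should contribute more detected events, which is what a distance-based weight would try to capture.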
3/12/2018
We finalized our decision to use the GSU AGN Catalog. Other than that, there was not much content-wise. We discussed weighting further, but not much
progress was made. We set some goals for spring break, although I am unsure how much work I will actually get done. My goals, though, are to get GitHub
working and to get Paul's code running, just to gain a basic understanding of how it works, which could be helpful in the future.
3/26/2018
I ended up being more productive over break than I thought I would be. First off, my VM stopped working... again, so I needed to make a new VM and
install all the necessary packages again in order to run code. Once that was done, though, I got Paul's code running. The code is shown in the attached
picture and the graphs it produced are seen below. Thankfully, I did not run into any complicated issues getting the code running. I had to fix some
inconsistent indentation and ensure that some variable types agreed with each other; at one point, a float was being added to an int, which I resolved
with some type casting.
These graphs are pretty much exactly what I expected, which suggests the code is working. Dr. Blaufuss also sent us code that we can use to
determine the detector weighting. The code will be attached below as well under the name calc_effarea_convolve. Essentially, given a declination, it
looks at all events that occur within 5 degrees of that declination and determines how many events of a given energy would occur in a year.
An example graph of the results is attached.
By summing all the bins for a given declination, one can determine the expected number of events per year at that declination. By
comparing that result to the results at other declinations, one can find the relative detector weight: take the maximum over all declinations and
divide it by the result at the declination of interest. By this method, the lowest weight is 1, which occurs at the declination of the maximum.
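As a small illustration of the relative-weight calculation described above (the per-declination expected counts here are made-up placeholders, not actual output of calc_effarea_convolve):

    import numpy as np

    # Hypothetical expected event counts per year at a few declinations, standing in
    # for the summed energy bins produced by calc_effarea_convolve.
    declinations = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])
    expected_per_year = np.array([120.0, 300.0, 850.0, 640.0, 410.0])

    # Relative detector weight as described above: the maximum expected count divided
    # by the expected count at the declination of interest, so the lowest weight is 1
    # and it occurs at the declination of the maximum.
    detector_weight = expected_per_year.max() / expected_per_year

    for dec, w in zip(declinations, detector_weight):
        print("dec = %+5.1f deg -> weight = %.2f" % (dec, w))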
4/2/2018
This week, we discussed what exactly we want to weight the data with. We determined that, at the very least, we should weight based on the
redshift and the detector. We are also considering weighting based on the mass of the black hole; however, that is more dubious, as there
is no simple connection between mass and neutrino flux. In order to weight, a plausible connection must be established between detection rates
and the variable. For redshift, this is fairly straightforward and essentially boils down to the idea that the flux given off by an object is spread across
the surface area of a sphere; the farther away you are from the AGN, the larger that sphere. At low redshifts (z < 1), redshift converts
to distance in a roughly linear relationship; a derivation of this can be seen in the attached image below. Using this, we can find a relative weight for a
source by dividing the greatest redshift of all sources in the catalog by the redshift of the source in question. Another important note is that we threw
out two sources in the catalog because they had a much greater redshift than the others: while most had z below 0.3, these two AGNs were over 1.
Such a large difference could skew the results, so we decided to remove them.
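A minimal sketch of that relative redshift weight, using made-up redshift values rather than the real catalog entries:

    import numpy as np

    # Hypothetical catalog redshifts (with the two z > 1 outliers already removed).
    z = np.array([0.02, 0.05, 0.11, 0.23, 0.29])

    # Relative weight as described above: the largest catalog redshift divided by each
    # source's redshift, so nearer sources receive larger weights.
    redshift_weight = z.max() / z
    print(redshift_weight)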
4/9/2018
This week, we looked a bit more into weighting based on mass and found a pair of helpful papers (http://adsabs.harvard.edu/full/2003MNRAS.343L..59H
and https://arxiv.org/pdf/1510.06746.pdf). The first one gives a somewhat clearer answer; however, it is based on the spectral index of the black hole,
which we do not have. We could assume that every source has the same spectral index, but that would be a blatantly false assumption. For now, we will
ignore weighting based on mass, though we may add it later. We wrote some code for weighting, which is attached below. Essentially, it is just a set of
functions that can be called while running, which should hopefully keep things a bit more organized. One note about the code: mbh.csv is a file with the
details of the catalog, downloaded from the GSU AGN website, and all_mc.npy is a numpy file with events from the IceCube detector.
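Since the attachment itself is not reproduced here, below is a rough sketch of the kind of helper functions we mean. The file names are as described above, but multiplying the redshift and detector weights into a single per-source weight is just one plausible way to combine them, not necessarily what our final code does.

    import numpy as np

    def load_catalog(path="mbh.csv"):
        # GSU AGN catalog as downloaded; assumes a comma-separated file with a header
        # row (the actual column names may differ from whatever we use downstream).
        return np.genfromtxt(path, delimiter=",", names=True, dtype=None, encoding="utf-8")

    def load_events(path="all_mc.npy"):
        # numpy file of events from the IceCube detector.
        return np.load(path)

    def source_weights(redshift, expected_per_year):
        # Combine the redshift weight (4/2 entry) with the detector weight (3/26 entry)
        # into a single relative weight per catalog source.
        z = np.asarray(redshift, dtype=float)
        n = np.asarray(expected_per_year, dtype=float)
        return (z.max() / z) * (n.max() / n)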
4/16/2018
I looked more into healpy this week, which is a Python implementation of NASA's HEALPix algorithm. Essentially, it splits the surface of a sphere into
equal-area pixels that can be mapped onto a flat 2D grid. This is extremely helpful for us, as it allows us to bin the data in a consistent and
straightforward manner. As a group, we also looked into the statistical analysis of the data and how we will eventually interpret it. From what I understand,
we will stack the normal data and get a test statistic. We will then use the same stacking method on a set of data that has had all right ascensions
randomized to get another test statistic and then compare the two. This will give us a null hypothesis and a fair alternative hypothesis to test against one
another.
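A small example of how healpy can be used to bin event directions into pixels; the NSIDE value and the toy coordinates are arbitrary illustrative choices.

    import numpy as np
    import healpy as hp

    nside = 64                      # resolution parameter; the map has 12 * nside**2 pixels
    npix = hp.nside2npix(nside)

    # Hypothetical event directions in equatorial coordinates (degrees).
    ra = np.array([10.0, 10.2, 250.7])
    dec = np.array([5.0, 5.1, -30.4])

    # healpy expects colatitude theta = 90 - dec and longitude phi = ra, in radians.
    theta = np.radians(90.0 - dec)
    phi = np.radians(ra)
    pix = hp.ang2pix(nside, theta, phi)

    # Build a counts map with one entry per pixel, incremented once per event.
    counts = np.bincount(pix, minlength=npix)
    print(pix, counts[pix])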
4/23/2018
We worked on our code a good amount this past week, and we are almost there. The code is attached below (agn_pval_stacking.py). One may notice that the
detector weighting is different from above: we wanted to try not hardcoding the values, so we carried out the full calculation of how each value is obtained.
We are running into one major error, though. The arrays being convolved to get the expected counts keep ending up with two different sizes. If we cannot
solve that issue, we will go back to hardcoding the base values.
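For reference, this is the kind of size mismatch that numpy's convolution can produce; a toy example, not our actual arrays:

    import numpy as np

    effective_area = np.ones(50)        # toy stand-in for a per-energy-bin array
    kernel = np.ones(5) / 5.0           # toy smoothing kernel

    full = np.convolve(effective_area, kernel)               # default mode="full": length 50 + 5 - 1 = 54
    same = np.convolve(effective_area, kernel, mode="same")  # length 50, matches the input binning

    print(len(full), len(same))

If the convolved array has to line up bin-for-bin with the original energy binning, mode="same" (or explicitly trimming the "full" output) keeps the lengths equal.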
4/30/2018
We got the code up and running (see agn_pval_stacking (1).py). We ran into some issues with the actual and expected counts being drastically different, but
we realized that the wrong variables were being added in one of the loops. Once we fixed that, everything worked fine. We did run into another hitch,
however: it turns out my understanding of the statistical analysis was not correct, so we need to rewrite a good portion of the code. I am still not entirely
sure what the correct analysis is, but my current understanding is that we find an expected background, then do the random scramble many times, each
time comparing it to the background to get a test statistic. We then compare the actual data to the expected background to get another test statistic and
see how uncommon that value is. This is a little disheartening considering that we have to have our presentation ready by tomorrow. We will continue to
work until the poster presentation and hope we get the correct implementation.
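As a sketch of the scramble-and-compare logic as I currently understand it (a toy test statistic stands in for our actual stacking calculation and background comparison):

    import numpy as np

    def toy_test_statistic(ev_ra, src_ra, bin_width=5.0):
        # Stand-in for the real stacked test statistic: here, simply the number of
        # events whose right ascension lies within bin_width degrees of any source.
        sep = np.abs((ev_ra[:, None] - src_ra[None, :] + 180.0) % 360.0 - 180.0)
        return int(np.sum(sep.min(axis=1) < bin_width))

    rng = np.random.RandomState(1)
    ev_ra = rng.uniform(0.0, 360.0, 500)    # toy event right ascensions
    src_ra = rng.uniform(0.0, 360.0, 60)    # toy catalog right ascensions

    # Test statistic of the (toy) unscrambled data.
    ts_data = toy_test_statistic(ev_ra, src_ra)

    # Background distribution: scramble the right ascensions many times and
    # recompute the test statistic each time.
    ts_background = np.array([toy_test_statistic(rng.uniform(0.0, 360.0, ev_ra.size), src_ra)
                              for _ in range(1000)])

    # p-value: fraction of scrambled trials with a test statistic at least as large
    # as the one from the unscrambled data.
    p_value = np.mean(ts_background >= ts_data)
    print(ts_data, p_value)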